Introduction to CySA

Overview

CySA provides an interactive Shiny application for selecting and visualizing clusters from flow-cytometry data stored in SingleCellExperiment objects. It is designed to work with SOM-based clustering outputs such as those produced by FlowSOM and curated by the CATALYST workflow.

The main functions are:

  • prepClusterSelectorData() – subsample a SingleCellExperiment and build the inputs required by the app.
  • clusterSelector() – return a Shiny app object that can be launched with shiny::runApp().
  • plotSOMScatter() and plotScatterBJ() – static ggplot2 helpers for SOM and scatter visualizations.

Installation

Install the package from Bioconductor with:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("CySA")

Quick start

CySA expects a SingleCellExperiment object whose metadata(sce) contains the following items (the first two are required, the third is optional):

  • SOM_codes – a matrix of SOM node codes (one row per SOM node, one column per marker).
  • SOM_stats – a data frame of per-node statistics with an id column.
  • map$colsUsed – an optional character vector of markers used for SOM mapping.

The object should also contain a sample_id column and a cluster_id column in colData(sce).
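
Before calling the package functions, it can help to verify that an object meets these requirements. The following is a minimal sketch in base R and is not part of CySA: check_cysa_inputs() is a hypothetical helper that takes the metadata list and the colData as plain R objects and returns a description of every missing item. It assumes SOM_stats is stored as a base data.frame.

```r
# Hypothetical helper (not part of CySA): report which required items are missing.
# `md` stands in for S4Vectors::metadata(sce); `cd` stands in for colData(sce)
# converted to a data frame.
check_cysa_inputs <- function(md, cd) {
    missing_items <- character(0)
    if (!is.matrix(md$SOM_codes)) {
        missing_items <- c(missing_items, "metadata: SOM_codes (matrix)")
    }
    if (!is.data.frame(md$SOM_stats) || !"id" %in% colnames(md$SOM_stats)) {
        missing_items <- c(missing_items, "metadata: SOM_stats (data frame with id column)")
    }
    for (col in c("sample_id", "cluster_id")) {
        if (!col %in% colnames(cd)) {
            missing_items <- c(missing_items, paste0("colData: ", col))
        }
    }
    missing_items
}

# Toy stand-ins for the metadata and colData of a real object.
toy_md <- list(
    SOM_codes = matrix(seq_len(6) / 6, nrow = 3,
                       dimnames = list(NULL, c("marker1", "marker2"))),
    SOM_stats = data.frame(id = 1:3)
)
toy_cd <- data.frame(sample_id = c("s1", "s1", "s2"))

check_cysa_inputs(toy_md, toy_cd)
#> [1] "colData: cluster_id"
```

An empty result means all required items were found; map$colsUsed is not checked because it is optional.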

This vignette uses a small example data set shipped with the package:

library(CySA)
sce <- CySA_example_sce()
head(S4Vectors::metadata(sce)$SOM_codes)
#>       marker1    marker2   marker3   marker4   marker5   marker6   marker7
#> 1 0.001666667 0.08500000 0.1683333 0.2516667 0.3350000 0.4183333 0.5016667
#> 2 0.003333333 0.08666667 0.1700000 0.2533333 0.3366667 0.4200000 0.5033333
#> 3 0.005000000 0.08833333 0.1716667 0.2550000 0.3383333 0.4216667 0.5050000
#> 4 0.006666667 0.09000000 0.1733333 0.2566667 0.3400000 0.4233333 0.5066667
#> 5 0.008333333 0.09166667 0.1750000 0.2583333 0.3416667 0.4250000 0.5083333
#> 6 0.010000000 0.09333333 0.1766667 0.2600000 0.3433333 0.4266667 0.5100000
#>     marker8   marker9  marker10  marker11  marker12
#> 1 0.5850000 0.6683333 0.7516667 0.8350000 0.9183333
#> 2 0.5866667 0.6700000 0.7533333 0.8366667 0.9200000
#> 3 0.5883333 0.6716667 0.7550000 0.8383333 0.9216667
#> 4 0.5900000 0.6733333 0.7566667 0.8400000 0.9233333
#> 5 0.5916667 0.6750000 0.7583333 0.8416667 0.9250000
#> 6 0.5933333 0.6766667 0.7600000 0.8433333 0.9266667

Use prepClusterSelectorData() to subsample the data and generate a default list of marker pairs:

prepped <- prepClusterSelectorData(
    sce,
    total_cells_to_sample = 200,
    somCodesName = "SOM_codes"
)
names(prepped)
#> [1] "sce"            "sce_subsampled" "dList"
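
The subsampling strategy is not spelled out in this vignette. As a rough illustration only, the sketch below draws a fixed total number of cells in proportion to each sample's size, using base R on a toy sample_id vector. The function name subsample_by_sample() and the proportional allocation are assumptions made for this example, not a description of what prepClusterSelectorData() does internally.

```r
# Hypothetical helper: proportional per-sample subsampling of cell indices.
subsample_by_sample <- function(sample_id, total_cells_to_sample) {
    idx_by_sample <- split(seq_along(sample_id), sample_id)
    picked <- lapply(idx_by_sample, function(idx) {
        # Each sample contributes its share of the requested total,
        # at least one cell and never more cells than it has.
        n <- floor(length(idx) * total_cells_to_sample / length(sample_id))
        n <- min(length(idx), max(1, n))
        idx[sample.int(length(idx), n)]
    })
    sort(unlist(picked, use.names = FALSE))
}

set.seed(1)
toy_sample_id <- rep(c("s1", "s2", "s3"), times = c(600, 300, 100))
keep <- subsample_by_sample(toy_sample_id, total_cells_to_sample = 200)

# Number of retained cells per sample.
table(toy_sample_id[keep])
```

With these toy sizes the retained cells keep the 6:3:1 ratio of the original samples, so small samples are not lost from the subsample.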

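clusterSelector() needs a few additional inputs: a dendrogram of the SOM nodes (dend), a table describing those nodes (dendTable), a sample-by-cluster count table (clusterPatientTable), and two raster inputs (somRasterData and somRasterObj). Here we build minimal versions from the example data: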

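som_codes <- S4Vectors::metadata(sce)$SOM_codes
# map$colsUsed is optional; fall back to all SOM code columns if it is absent.
markers <- S4Vectors::metadata(sce)$map$colsUsed
if (is.null(markers)) markers <- colnames(som_codes)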

dend <- stats::as.dendrogram(stats::hclust(stats::dist(som_codes)))

dendTable <- data.frame(
    id = seq_len(nrow(som_codes)),
    label = rownames(som_codes),
    stringsAsFactors = FALSE
)

clusterPatientTable <- table(
    sample_id = sce$sample_id,
    cluster_id = sce$cluster_id
)

# Synthetic grid positions for the SOM nodes (assumes an even number of nodes).
somRasterData <- data.frame(
    x = rep(seq_len(5), length.out = nrow(som_codes)),
    y = rep(seq_len(2), each = nrow(som_codes) / 2),
    id = seq_len(nrow(som_codes))
)
for (m in markers) {
    somRasterData[[m]] <- seq_len(nrow(som_codes)) / nrow(som_codes)
}

arr <- array(
    data = seq_len(10 * 10 * length(markers)),
    dim = c(10, 10, length(markers))
)
somRasterObj <- raster::brick(arr)
names(somRasterObj) <- markers
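
The dend object built above is an ordinary hierarchical clustering of the SOM codes. To make that step concrete, here is a small base-R sketch on a toy matrix. The values and the choice of two groups are illustrative assumptions, not CySA defaults.

```r
# Toy SOM codes: 6 nodes x 2 markers, with two clearly separated profiles.
toy_codes <- matrix(
    c(0.10, 0.12, 0.11, 0.90, 0.88, 0.91,
      0.20, 0.22, 0.19, 0.80, 0.83, 0.79),
    ncol = 2,
    dimnames = list(paste0("node", 1:6), c("marker1", "marker2"))
)

# Same construction as above: Euclidean distances, then hclust()
# with its default complete linkage.
toy_hc <- stats::hclust(stats::dist(toy_codes))
toy_dend <- stats::as.dendrogram(toy_hc)

# Cut the tree into two groups to see which nodes cluster together.
toy_groups <- stats::cutree(toy_hc, k = 2)
toy_groups
```

Here the first three nodes fall into one group and the last three into the other, which mirrors how nodes with similar marker profiles share a branch in the dendrogram.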

Create the reusable Shiny app object:

app <- clusterSelector(
    sce = prepped$sce,
    sce_subsampled = prepped$sce_subsampled,
    dList = prepped$dList,
    dend = dend,
    dendTable = dendTable,
    clusterPatientTable = clusterPatientTable,
    somRasterData = somRasterData,
    somRasterObj = somRasterObj
)

Launch the app interactively:

if (interactive()) shiny::runApp(app)

During the interactive session, the selected cluster groupings are written back to the outputList object passed to clusterSelector(). Note that the minimal call above does not supply an outputList argument.

Statistical comparison

CySA can compare the relative abundance of selected SOM nodes between two sample groups. To use this feature, metadata(sce)$experiment_info must contain a grouping column (for example condition) and a sample_id column that matches colData(sce)$sample_id.
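
A minimal sketch of such a table follows, assuming a grouping column called condition with two made-up levels. The sample names and group labels are invented for illustration, and the consistency check is plain base R rather than a CySA function.

```r
# Hypothetical per-cell sample labels, standing in for colData(sce)$sample_id.
cell_sample_id <- rep(c("s1", "s2", "s3", "s4"), times = c(50, 40, 60, 30))

# One row per sample, with the grouping column used for the comparison.
experiment_info <- data.frame(
    sample_id = c("s1", "s2", "s3", "s4"),
    condition = c("control", "control", "treated", "treated"),
    stringsAsFactors = FALSE
)

# Every sample seen in the cells should have exactly one row in experiment_info.
stopifnot(
    !anyDuplicated(experiment_info$sample_id),
    setequal(unique(cell_sample_id), experiment_info$sample_id)
)

# For a real object, the table would then be stored with:
# S4Vectors::metadata(sce)$experiment_info <- experiment_info
```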

In the app, select:

  • groupsVar – the column in experiment_info that defines the groups.
  • group1 and group2 – the two groups to compare.
  • relativeTo – whether to report raw counts, normalize by a numeric column in experiment_info, or normalize by another cluster group.

For each selected SOM node, CySA performs a two-sample t-test on the relative cell counts between the two groups and displays the result in the Stats panel.
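
To make the comparison concrete, the sketch below reproduces the idea in base R for a single node. Cell counts per sample are divided by a per-sample total (in the spirit of normalizing by a numeric column) and the resulting fractions are compared with stats::t.test(). The counts are invented, and the use of the default Welch test is an assumption: this vignette does not say which t-test variant CySA applies.

```r
# Invented per-sample data for one SOM node.
node_counts <- data.frame(
    sample_id = paste0("s", 1:6),
    condition = rep(c("control", "treated"), each = 3),
    node_cells = c(12, 15, 11, 30, 27, 33),
    total_cells = c(1000, 1100, 950, 1050, 980, 1020)
)

# Relative abundance: cells in the node divided by all cells of the sample.
node_counts$rel_abundance <- node_counts$node_cells / node_counts$total_cells

group1 <- node_counts$rel_abundance[node_counts$condition == "control"]
group2 <- node_counts$rel_abundance[node_counts$condition == "treated"]

# Two-sample t-test (Welch by default; pass var.equal = TRUE for Student's test).
res <- stats::t.test(group1, group2)
res$p.value
```

With only three samples per group the test has little power, so results on small cohorts are best read as exploratory.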

Static plots

For non-interactive use, plotSOMScatter() produces a ggplot2 scatter plot of two SOM channels:

plotSOMScatter(sce, chs = c("marker1", "marker2"))
#> Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
#> ℹ Please use tidy evaluation idioms with `aes()`.
#> ℹ See also `vignette("ggplot2-in-packages")` for more information.
#> ℹ The deprecated feature was likely used in the CySA package.
#>   Please report the issue at <https://github.com/baj12/CySA/issues>.
#> This warning is displayed once per session.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.

plotScatterBJ() provides an alternative scatter visualization:

plotScatterBJ(sce, chs = c("marker1", "marker2"))

Session information

sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] CySA_0.99.7    rmarkdown_2.31
#> 
#> loaded via a namespace (and not attached):
#>   [1] splines_4.6.0               later_1.4.8                
#>   [3] tibble_3.3.1                polyclip_1.10-7            
#>   [5] XML_3.99-0.23               shinyjqui_0.4.1            
#>   [7] lifecycle_1.0.5             rstatix_0.7.3              
#>   [9] doParallel_1.0.17           lattice_0.22-9             
#>  [11] MASS_7.3-65                 crosstalk_1.2.2            
#>  [13] backports_1.5.1             magrittr_2.0.5             
#>  [15] plotly_4.12.0               sass_0.4.10                
#>  [17] jquerylib_0.1.4             yaml_2.3.12                
#>  [19] plotrix_3.8-14              httpuv_1.6.17              
#>  [21] otel_0.2.0                  askpass_1.2.1              
#>  [23] sp_2.2-1                    reticulate_1.46.0          
#>  [25] cowplot_1.2.0               buildtools_1.0.0           
#>  [27] RColorBrewer_1.1-3          ConsensusClusterPlus_1.77.0
#>  [29] multcomp_1.4-30             abind_1.4-8                
#>  [31] Rtsne_0.17                  GenomicRanges_1.65.0       
#>  [33] purrr_1.2.2                 BiocGenerics_0.59.7        
#>  [35] TH.data_1.1-5               tweenr_2.0.3               
#>  [37] sandwich_3.1-1              circlize_0.4.18            
#>  [39] IRanges_2.47.2              S4Vectors_0.51.3           
#>  [41] data.tree_1.2.0             ggrepel_0.9.8              
#>  [43] irlba_2.3.7                 CATALYST_1.37.0            
#>  [45] maketools_1.3.2             terra_1.9-34               
#>  [47] umap_0.2.10.0               RSpectra_0.16-2            
#>  [49] codetools_0.2-20            DelayedArray_0.39.3        
#>  [51] DT_0.34.0                   scuttle_1.23.1             
#>  [53] ggforce_0.5.0               tidyselect_1.2.1           
#>  [55] shape_1.4.6.1               raster_3.6-32              
#>  [57] farver_2.1.2                ScaledMatrix_1.21.0        
#>  [59] viridis_0.6.5               matrixStats_1.5.0          
#>  [61] stats4_4.6.0                Seqinfo_1.3.0              
#>  [63] jsonlite_2.0.0              GetoptLong_1.1.1           
#>  [65] BiocNeighbors_2.7.2         Formula_1.2-5              
#>  [67] ggridges_0.5.7              survival_3.8-6             
#>  [69] scater_1.41.1               iterators_1.0.14           
#>  [71] foreach_1.5.2               tools_4.6.0                
#>  [73] ggnewscale_0.5.2            Rcpp_1.1.1-1.1             
#>  [75] glue_1.8.1                  gridExtra_2.3              
#>  [77] SparseArray_1.13.2          xfun_0.59                  
#>  [79] MatrixGenerics_1.25.0       dplyr_1.2.1                
#>  [81] shinydashboard_0.7.3        withr_3.0.3                
#>  [83] fastmap_1.2.0               shinyjs_2.1.1              
#>  [85] openssl_2.4.2               digest_0.6.39              
#>  [87] rsvd_1.0.5                  R6_2.6.1                   
#>  [89] mime_0.13                   colorspace_2.1-2           
#>  [91] gtools_3.9.5                tidyr_1.3.2                
#>  [93] generics_0.1.4              data.table_1.18.4          
#>  [95] httr_1.4.8                  htmlwidgets_1.6.4          
#>  [97] S4Arrays_1.13.0             pkgconfig_2.0.3            
#>  [99] gtable_0.3.6                ComplexHeatmap_2.29.0      
#> [101] RProtoBufLib_2.25.0         S7_0.2.2                   
#> [103] SingleCellExperiment_1.35.1 XVector_0.53.0             
#> [105] sys_3.4.3                   htmltools_0.5.9            
#> [107] carData_3.0-6               clue_0.3-68                
#> [109] scales_1.4.0                Biobase_2.73.1             
#> [111] png_0.1-9                   colorRamps_2.3.4           
#> [113] knitr_1.51                  reshape2_1.4.5             
#> [115] rjson_0.2.23                shinydashboardPlus_2.0.6   
#> [117] cachem_1.1.0                zoo_1.8-15                 
#> [119] GlobalOptions_0.1.4         stringr_1.6.0              
#> [121] parallel_4.6.0              vipor_0.4.7                
#> [123] pillar_1.11.1               grid_4.6.0                 
#> [125] vctrs_0.7.3                 promises_1.5.0             
#> [127] ggpubr_0.6.3                car_3.1-5                  
#> [129] BiocSingular_1.29.0         cytolib_2.25.0             
#> [131] beachmat_2.29.0             xtable_1.8-8               
#> [133] cluster_2.1.8.2             beeswarm_0.4.0             
#> [135] evaluate_1.0.5              mvtnorm_1.4-1              
#> [137] cli_3.6.6                   compiler_4.6.0             
#> [139] rlang_1.2.0                 crayon_1.5.3               
#> [141] ggsignif_0.6.4              labeling_0.4.3             
#> [143] FlowSOM_2.21.0              flowCore_2.25.1            
#> [145] plyr_1.8.9                  ggbeeswarm_0.7.3           
#> [147] stringi_1.8.7               viridisLite_0.4.3          
#> [149] BiocParallel_1.47.0         nnls_1.6                   
#> [151] lazyeval_0.2.3              Matrix_1.7-5               
#> [153] ggplot2_4.0.3               shiny_1.13.0               
#> [155] SummarizedExperiment_1.43.0 drc_3.0-1                  
#> [157] fontawesome_0.5.3           igraph_2.3.2               
#> [159] broom_1.0.13                memoise_2.0.1              
#> [161] bslib_0.11.0                collapsibleTree_0.1.8