This vignettes describes a protocol for analysing flow cytometry intracellular cytokine staining (ICS) data with the polyICSFlow protocol. polyICSFlow is designed to assign single cells cytokine positivity based on gates and subsequently find cytokine combinations observed in the data, and lastly, quantify the peptide-specific T cells within a user-defined cell subset. The package requires pre-processed flow cytometry data from an ICS dataset with cells stimulated with a peptide of interest (e.g. viral peptide or a tumor antigen) and a negative control stimulation to determine background activation.
In this vignette, we analyze two FCS files from the same participant: one stimulated with HIV Gag peptide and one unstimulated control. The samples were stained with a 27-color panel and acquired using spectral flow cytometry. This panel includes cytokine and degranulation markers (IFNγ, TNF, IL-2, CD107a), as well as markers of differentiation, proliferation, transcriptional activity, effector function, and immune checkpoints.
The FCS files were transformed using estimateLogicle and pre-gated to exclude debris, dead cells, doublets, and non-lymphocytes. Low-quality events were further removed using PeacoQC. CD8 T cells were then identified by FlowSOM clustering of lymphocytes based on CD3, CD4, CD8, CD16, and CD19 expression, selecting the CD3⁺CD8⁺CD4⁻CD16⁻CD19⁻ FlowSOM metacluster.
The examples below focus on the cytokines IFNγ, TNF, IL-2, and the degranulation marker CD107a; however, polyICSFlow can be applied to any combination of cytokines of interest in your flow panel. It is also compatible with any peptide stimulation enabling analyses of antigen specificity and is agnostic to the acquisition platform, supporting data from both spectral and conventional flow cytometry, provided appropriate preprocessing (see below) has been performed.
# if (!requireNamespace("BiocManager", quietly = TRUE))
# install.packages("BiocManager")
# BiocManager::install("flowCore")
# BiocManager::install("flowWorkspace")
# BiocManager::install("openCyto")
# BiocManager::install("FlowSOM")
# BiocManager::install("ComplexHeatmap")
# BiocManager::install("SummarizedExperiment")
# BiocManager::install("polyICSFlow")
#
# remotes::install_github("RGLab/cytoinstaller")
#
#
# install.packages("cowplot")
# install.packages("dplyr")
# install.packages("forcats")
# install.packages("ggtext")
# install.packages("rlang")
# install.packages("scales")
# install.packages("stringr")
# install.packages("tidyselect")
# install.packages("tidyr")
# install.packages("magrittr")
# install.packages("ggplot2")
# install.packages("ggridges")
# install.packages("ggh4x")polyICSFlow requires a gatingHierarchy or gatingSet as input with an applied gatingTemplate to gate on cytokines of interest.
To get our input for polyICSFlow, raw FCS files need to go through the following steps, which are described in detail in the sections below:
Data needs to be pre-processed including unmixing (spectral) or compensation (conventional), data transformation, and cleaning. Pre-processing steps have been described in detail elsewhere and will therefore not be shown here. We refer to resources such as:
We load our pre-processed data as a flowSet.
# define path to data
path_data <- system.file("extdata", package = "polyICSFlow")
# load pre-processed fcs files as flowSet
fs <- flowCore::read.flowSet(path = path_data,
pattern = ".fcs",
truncate_max_range = FALSE,
transformation = FALSE)
flowWorkspace::pData(fs)
#> name
#> ID001_gag.fcs ID001_gag.fcs
#> ID001_unstim.fcs ID001_unstim.fcsThen create a gatingSet to prepare for gating:
From the 2D scatter plots we see that the FCS files indeed only contain CD3+CD8+ cells:
# check marker expression on cells
CytoExploreR::cyto_plot_custom(matrix(c(1:2),ncol=2))
for(channel_plot in c("CD8","CD4")){
CytoExploreR::cyto_plot(gs[[1]],
parent = "root",
channels = c("CD3",channel_plot),
ylim = c(0,4))
}Because our gatingSet only contains CD8 T cells, we gate on the root population as specified in the gatingTemplate.
# apply GatingTemplate to flowSet
gt <- openCyto::gatingTemplate(file.path(path_data, "gatingTemp_cytokines.csv"))
#> Adding population:IFN+
#> Adding population:TNFa+
#> Adding population:IL2+
#> Adding population:CD107a+
openCyto::gt_gating(x = gt, y = gs)
#> Preprocessing for 'CD107a+'
#> Gating for 'CD107a+'
#> Loading required package: openCyto
#> done!
#> done.
#> Preprocessing for 'IL2+'
#> Gating for 'IL2+'
#> done!
#> done.
#> Preprocessing for 'TNFa+'
#> Gating for 'TNFa+'
#> done!
#> done.
#> Preprocessing for 'IFN+'
#> Gating for 'IFN+'
#> done!
#> done.
#> finished.
# see nodes in gatingSet
names(gt_get_nodes(gt))
#> [1] "root" "/IFN+" "/TNFa+" "/IL2+" "/CD107a+"We can check our cytokine gates:
Cell population labels need to be provided as a string and can be obtained from any source, e.g., annotated metacluster_ids from FlowSOM clustering or cell labels obtained from manual gating.
FlowSOM clustering of the CD8 T cells:
markers_clust <- c("CD45RA","CCR7","CD27","CD95","TCF1","Tbet","TIGIT","CD39","CCR4","CD16")
channels_clust <- flowCore::pData(flowCore::parameters(fs[[1]])) %>% dplyr::filter(desc %in% markers_clust)%>%dplyr::pull(name) %>% unname
fsom <- FlowSOM::FlowSOM(fs,
compensate = FALSE,
transform = FALSE,
scale = FALSE,
colsToUse = channels_clust,
xdim = 8,
ydim = 8,
nClus = 10)
cell_labels <- fsom$metaclustering[fsom$map$mapping[,1]]Loaded FlowSOM cluster labels from elsewhere:
# Get marker positivity per cell
df_markerPos <- getMarkerPositivity(x = gs,
gate_names = c("IFN+","TNFa+","IL2+","CD107a+"))
head(df_markerPos)
#> IFN TNFa IL2 CD107a
#> 1 FALSE FALSE FALSE FALSE
#> 2 FALSE FALSE FALSE FALSE
#> 3 FALSE FALSE FALSE FALSE
#> 4 FALSE FALSE FALSE FALSE
#> 5 FALSE FALSE FALSE FALSE
#> 6 FALSE FALSE FALSE FALSEFrom the output of the previous function, we then assign each cell cytokine combinations.
Either in simple mode, where we get information on:
MarkerComb: A factor describing
which marker combination a specific cell has. It represents all mutually
exclusive combinations (e.g., 4 markers → 2⁴ = 16
combinations).
MarkerCount: An integer describing
the number of functions of the cell (i.e., the count of positive
markers).
Functionality: A factor grouping
MarkerCount into three categories:
MarkerCount == 0)MarkerCount == 1)MarkerCount > 1)# Assign marker combinations to each cell based on marker positivity
# Simple mode
df_markerComb <- assignMarkerCombinations(data_markerPos = df_markerPos,
mode = "simple")
#> Assigning cells to marker combinations (16 total from 4 markers)...
#> Done!
head(df_markerComb)
#> IFN TNFa IL2 CD107a MarkerComb MarkerCount Functionality
#> 1 FALSE FALSE FALSE FALSE IFN-TNFa-IL2-CD107a- 0 None
#> 2 FALSE FALSE FALSE FALSE IFN-TNFa-IL2-CD107a- 0 None
#> 3 FALSE FALSE FALSE FALSE IFN-TNFa-IL2-CD107a- 0 None
#> 4 FALSE FALSE FALSE FALSE IFN-TNFa-IL2-CD107a- 0 None
#> 5 FALSE FALSE FALSE FALSE IFN-TNFa-IL2-CD107a- 0 None
#> 6 FALSE FALSE FALSE FALSE IFN-TNFa-IL2-CD107a- 0 NoneOr exhaustive mode, which additionally adds atleast_n
columns, which are logical variables indicating whether cells express at
least 1, 2, 3, …, n markers in any combination.
# Exhaustive mode
df_markerComb <- assignMarkerCombinations(data_markerPos = df_markerPos,
mode = "exhaustive")
#> Assigning cells to marker combinations (16 total from 4 markers)...
#> Done!
#> Adding logical columns for at least 1, 2, 3 and 4 markers and Polyfunctional, at least IFN, TNFa, IL2, CD107a...
#> Done!
names(df_markerComb)
#> [1] "IFN" "TNFa"
#> [3] "IL2" "CD107a"
#> [5] "MarkerComb" "MarkerCount"
#> [7] "Functionality" "atleast_1_IFN+"
#> [9] "atleast_1_TNFa+" "atleast_1_IL2+"
#> [11] "atleast_1_CD107a+" "atleast_2_IFN+TNFa+"
#> [13] "atleast_2_IFN+IL2+" "atleast_2_IFN+CD107a+"
#> [15] "atleast_2_TNFa+IL2+" "atleast_2_TNFa+CD107a+"
#> [17] "atleast_2_IL2+CD107a+" "atleast_3_IFN+TNFa+IL2+"
#> [19] "atleast_3_IFN+TNFa+CD107a+" "atleast_3_IFN+IL2+CD107a+"
#> [21] "atleast_3_TNFa+IL2+CD107a+" "atleast_4_IFN+TNFa+IL2+CD107a+"
#> [23] "Polyfunctional_atleast_IFN+" "Polyfunctional_atleast_TNFa+"
#> [25] "Polyfunctional_atleast_IL2+" "Polyfunctional_atleast_CD107a+"The two modes are described in the figure below:
To calculate peptide-specificity, we need to add sample metadata to define cell population levels and stimulation conditions:
# Read metadata
df_md <- read.csv(file.path(path_data, "metadata.csv"))
# Add metadata to dataframe
df <- df_markerComb %>%
dplyr::mutate(ID = df_md$ID,
Stimulation = df_md$Stimulation,
cell_type = cell_labels_external)
# Alternatively use the cell labels obtained from FlowSOM clustering in step 2.3
# df <- df_markerComb %>%
# dplyr::mutate(ID = df_md$ID,
# Stimulation = df_md$Stimulation,
# cell_type = cell_labels)In the first example, we want to quantify the number of functions, so
we choose resolution = "MarkerCount"
# Calculate by Marker Count
result <- calcPolyfunctionality(x = df,
md_cols = c("ID"),
pop_col = "cell_type",
resolution = "MarkerCount",
condition_col = "Stimulation",
background_val = "Unstim")
head(result)
#> # A tibble: 6 × 8
#> ID cell_type Stimulation MarkerCount n_cells_pos n_cells_tot perc_pos
#> <int> <fct> <chr> <int> <int> <int> <dbl>
#> 1 1 1 Gag 1 16 1027 1.56
#> 2 1 1 Gag 2 2 1027 0.195
#> 3 1 1 Gag 3 0 1027 0
#> 4 1 1 Gag 4 0 1027 0
#> 5 1 1 Unstim 1 9 1135 0.793
#> 6 1 1 Unstim 2 1 1135 0.0881
#> # ℹ 1 more variable: perc_pos_backgroundSubtracted <dbl>The results can be plotted using ggplot:
palette_MarkerCounts <- scales::brewer_pal(type = "qual")(4)
result %>%
dplyr::filter(Stimulation != "Unstim")%>%
ggplot2::ggplot(ggplot2::aes(x = cell_type,
y = perc_pos_backgroundSubtracted,
fill = as.factor(MarkerCount)))+
ggh4x::facet_grid2(Stimulation~ID)+
ggplot2::geom_bar(stat="identity")+
ggplot2::theme_bw()+
ggplot2::ylab("Frequency of HIV-1-Gag specific CD8 T cells")+
ggplot2::xlab("CD8 metacluster")+
ggplot2::scale_fill_manual(values = palette_MarkerCounts,
name = "Number of functions")In the second example, we choose
resolution = "MarkerComb" to quantify at individual marker
combination levels:
# Calculate by Marker Combination
result <- calcPolyfunctionality(x = df,
md_cols = c("ID"),
pop_col = "cell_type",
resolution = "MarkerComb",
condition_col = "Stimulation",
background_val = "Unstim")
head(result)
#> # A tibble: 6 × 8
#> ID cell_type Stimulation MarkerComb n_cells_pos n_cells_tot perc_pos
#> <int> <fct> <chr> <fct> <int> <int> <dbl>
#> 1 1 1 Gag IFN-TNFa-IL2-CD1… 6 1027 0.584
#> 2 1 1 Gag IFN-TNFa-IL2+CD1… 2 1027 0.195
#> 3 1 1 Gag IFN-TNFa-IL2+CD1… 0 1027 0
#> 4 1 1 Gag IFN-TNFa+IL2-CD1… 2 1027 0.195
#> 5 1 1 Gag IFN-TNFa+IL2-CD1… 1 1027 0.0974
#> 6 1 1 Gag IFN-TNFa+IL2+CD1… 0 1027 0
#> # ℹ 1 more variable: perc_pos_backgroundSubtracted <dbl>The results can again be plottet using ggplot. From this, we see that metaclusters 2 and 4 are the most HIV-1 specific clusters, and that metacluster 4 has a high frequency of triple positive IFNg+TNFa+CD107a+ cells:
palette_markerCombs <- c("#DC050C","#FB8072","#1965B0","#7BAFDE",
"#882E72","#B17BA6","#FF7F00","#FDB462",
"#E7298A","#E78AC3","#33A02C", "#B2DF8A",
"#55A1B1","#8DD3C7","#A6761D")
result %>%
dplyr::filter(Stimulation != "Unstim")%>%
ggplot2::ggplot(ggplot2::aes(x = cell_type,
y = perc_pos_backgroundSubtracted,
fill = MarkerComb))+
ggh4x::facet_grid2(Stimulation~ID)+
ggplot2::geom_bar(stat="identity")+
ggplot2::theme_bw()+
ggplot2::ylab("Frequency of HIV-1-Gag specific CD8 T cells")+
ggplot2::xlab("CD8 metacluster")+
ggplot2::scale_fill_manual(values = palette_markerCombs,
name = "Cytokine combinations")In the third example, we’re interested in quantifying polyfunctional
IFN+ responses, so we choose
resolution = "Polyfunctional_atleastIFN+":
# Calculate by Polyfunctional_atleast_IFN+
result <- calcPolyfunctionality(x = df,
md_cols = c("ID"),
pop_col = "cell_type",
resolution = "Polyfunctional_atleast_IFN+",
condition_col = "Stimulation",
background_val = "Unstim")
head(result)
#> # A tibble: 6 × 8
#> ID cell_type Stimulation Polyfunctional_atleast_I…¹ n_cells_pos n_cells_tot
#> <int> <fct> <chr> <lgl> <int> <int>
#> 1 1 1 Gag TRUE 1 1027
#> 2 1 1 Unstim TRUE 0 1135
#> 3 1 2 Gag TRUE 466 5348
#> 4 1 2 Unstim TRUE 3 4886
#> 5 1 3 Gag TRUE 153 3416
#> 6 1 3 Unstim TRUE 48 3334
#> # ℹ abbreviated name: ¹`Polyfunctional_atleast_IFN+`
#> # ℹ 2 more variables: perc_pos <dbl>, perc_pos_backgroundSubtracted <dbl>The results can again be plottet using ggplot:
result %>%
dplyr::filter(Stimulation != "Unstim")%>%
ggplot2::ggplot(ggplot2::aes(x = cell_type,
y = perc_pos_backgroundSubtracted,
fill = cell_type))+
ggplot2::geom_bar(stat="identity")+
ggplot2::theme_bw()+
ggplot2::ylab("Frequency of polyfunctional IFN+ HIV-1-Gag specific CD8 T cells")+
ggplot2::xlab("CD8 metacluster")(Note that the FCS files were downsampled to 100,000 cells that were negative for all markers to reduce file size and run times; as a result, the apparent frequency of HIV-1–specific cells is artificially inflated and does not reflect the true biological frequency.)
polyICSFlow offers different plotting options to visualise your data and cytokine combinations.
The first is a simple data-independent heatmap visualizing all possible combinations and their default color scheme:
The remaining plotting options plot your data to give an idea of which combinations are most abundant.
plotCellCountsExactly(data_markerComb = df_markerComb)
#> Found litedown! Enabling r-universe templatelibrary(patchwork)
p1 <- plotCellCountsExactly(data_markerComb = df_markerComb)
p2 <- plotCellCountsAtLeast(data_markerComb = df_markerComb)
p1+p2+plot_layout(guides = "collect",widths = c(1,4))set.seed(1) # setting a seed is recommended to get reproducible visualizations
plotMarkerComb2DScatters(input = fs,
assigned_MarkerCombs = df_markerComb$MarkerComb)
#> Aggregating data
#> Done!It is also possible to extract the cell counts for each marker combination:
extractCellCounts(df_markerComb)
#> # A tibble: 15 × 4
#> # Groups: MarkerComb, MarkerCount, Functionality [15]
#> MarkerComb MarkerCount Functionality n
#> <fct> <int> <fct> <int>
#> 1 IFN-TNFa-IL2-CD107a- 0 None 19942
#> 2 IFN+TNFa-IL2-CD107a- 1 Monofunctional 47
#> 3 IFN-TNFa+IL2-CD107a- 1 Monofunctional 41
#> 4 IFN-TNFa-IL2+CD107a- 1 Monofunctional 11
#> 5 IFN-TNFa-IL2-CD107a+ 1 Monofunctional 696
#> 6 IFN+TNFa+IL2-CD107a- 2 Polyfunctional 40
#> 7 IFN+TNFa-IL2+CD107a- 2 Polyfunctional 1
#> 8 IFN+TNFa-IL2-CD107a+ 2 Polyfunctional 218
#> 9 IFN-TNFa+IL2+CD107a- 2 Polyfunctional 4
#> 10 IFN-TNFa+IL2-CD107a+ 2 Polyfunctional 155
#> 11 IFN-TNFa-IL2+CD107a+ 2 Polyfunctional 1
#> 12 IFN+TNFa+IL2+CD107a- 3 Polyfunctional 2
#> 13 IFN+TNFa+IL2-CD107a+ 3 Polyfunctional 549
#> 14 IFN-TNFa+IL2+CD107a+ 3 Polyfunctional 1
#> 15 IFN+TNFa+IL2+CD107a+ 4 Polyfunctional 2sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] patchwork_1.3.2 openCyto_2.25.0 polyICSFlow_0.99.3 rmarkdown_2.31
#>
#> loaded via a namespace (and not attached):
#> [1] RBGL_1.89.0 rlang_1.2.0
#> [3] magrittr_2.0.5 clue_0.3-68
#> [5] GetoptLong_1.1.1 otel_0.2.0
#> [7] matrixStats_1.5.0 compiler_4.6.0
#> [9] png_0.1-9 vctrs_0.7.3
#> [11] stringr_1.6.0 shape_1.4.6.1
#> [13] crayon_1.5.3 pkgconfig_2.0.3
#> [15] fastmap_1.2.0 backports_1.5.1
#> [17] XVector_0.53.0 labeling_0.4.3
#> [19] utf8_1.2.6 ncdfFlow_2.59.1
#> [21] markdown_2.0 graph_1.91.0
#> [23] purrr_1.2.2 xfun_0.58
#> [25] cachem_1.1.0 litedown_0.9
#> [27] jsonlite_2.0.0 flowWorkspace_4.25.1
#> [29] DelayedArray_0.39.3 tweenr_2.0.3
#> [31] broom_1.0.13 parallel_4.6.0
#> [33] cluster_2.1.8.2 R6_2.6.1
#> [35] stringi_1.8.7 bslib_0.11.0
#> [37] RColorBrewer_1.1-3 car_3.1-5
#> [39] GenomicRanges_1.65.0 jquerylib_0.1.4
#> [41] Rcpp_1.1.1-1.1 Seqinfo_1.3.0
#> [43] SummarizedExperiment_1.43.0 iterators_1.0.14
#> [45] knitr_1.51 IRanges_2.47.2
#> [47] flowCore_2.25.1 Matrix_1.7-5
#> [49] igraph_2.3.2 tidyselect_1.2.1
#> [51] abind_1.4-8 yaml_2.3.12
#> [53] ggtext_0.1.2 doParallel_1.0.17
#> [55] codetools_0.2-20 lattice_0.22-9
#> [57] tibble_3.3.1 Biobase_2.73.1
#> [59] withr_3.0.2 S7_0.2.2
#> [61] evaluate_1.0.5 Rtsne_0.17
#> [63] polyclip_1.10-7 ConsensusClusterPlus_1.77.0
#> [65] xml2_1.5.2 circlize_0.4.18
#> [67] pillar_1.11.1 ggpubr_0.6.3
#> [69] MatrixGenerics_1.25.0 carData_3.0-6
#> [71] foreach_1.5.2 stats4_4.6.0
#> [73] generics_0.1.4 colorRamps_2.3.4
#> [75] S4Vectors_0.51.3 ggplot2_4.0.3
#> [77] commonmark_2.0.0 scales_1.4.0
#> [79] glue_1.8.1 maketools_1.3.2
#> [81] tools_4.6.0 ggnewscale_0.5.2
#> [83] sys_3.4.3 data.table_1.18.4
#> [85] ggsignif_0.6.4 forcats_1.0.1
#> [87] buildtools_1.0.0 XML_3.99-0.23
#> [89] cowplot_1.2.0 grid_4.6.0
#> [91] flowClust_3.51.0 tidyr_1.3.2
#> [93] RProtoBufLib_2.25.0 colorspace_2.1-2
#> [95] ggforce_0.5.0 Formula_1.2-5
#> [97] cli_3.6.6 cytolib_2.25.0
#> [99] S4Arrays_1.13.0 ComplexHeatmap_2.29.0
#> [101] dplyr_1.2.1 Rgraphviz_2.57.0
#> [103] gtable_0.3.6 ggh4x_0.3.1
#> [105] rstatix_0.7.3 sass_0.4.10
#> [107] digest_0.6.39 BiocGenerics_0.59.7
#> [109] FlowSOM_2.21.0 SparseArray_1.13.2
#> [111] rjson_0.2.23 farver_2.1.2
#> [113] htmltools_0.5.9 lifecycle_1.0.5
#> [115] GlobalOptions_0.1.4 gridtext_0.1.6
#> [117] MASS_7.3-65