---
title: "Multi-Omics Integration in sciNOME"
author: "Shitao Zhou"
date: "`r Sys.Date()`"
output:
  BiocStyle::html_document:
    toc: true
    toc_depth: 3
vignette: >
  %\VignetteIndexEntry{Multi-Omics Integration in sciNOME}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = TRUE, # CRITICAL: Ensures code is executed during BiocCheck
  warning = FALSE,
  message = FALSE
)
```

## Introduction

The `sciNOME` package performs region-centric integration of single-cell multi-omics data. Its core function, `Integrate_MultiOmics`, merges RNA expression, DNA methylation (CpG), and chromatin accessibility (GpC) matrices using a global metadata mapping table.

In this vignette, we simulate a very small multi-omics dataset so that each integration workflow runs quickly.

```{r load-pkg}
library(sciNOME)
library(dplyr)
```

## 1. Prepare Mock Multi-Omics Data

To integrate multiple omics layers, three components must align:

1. The **Global Metadata** (linking sample names across omics).
2. The **Region Dictionary** (linking genomic coordinates to Gene IDs).
3. The **Expression/Methylation Matrices**.

### 1.1 Global Metadata

We simulate four cells (samples) belonging to two biological conditions.

```{r mock-meta}
meta_df <- data.frame(
  Condition  = c("Tumor", "Tumor", "Normal", "Normal"),
  RNA_Sample = c("rna_T1", "rna_T2", "rna_N1", "rna_N2"),
  CpG_Sample = c("cpg_T1", "cpg_T2", "cpg_N1", "cpg_N2"),
  GpC_Sample = c("gpc_T1", "gpc_T2", "gpc_N1", "gpc_N2"),
  stringsAsFactors = FALSE
)
```

### 1.2 Region Dictionary

We simulate three genes and their corresponding promoter regions.

```{r mock-region}
region_df <- data.frame(
  chr       = c("chr1", "chr2", "chr3"),
  start     = c(1000, 3000, 5000),
  end       = c(2000, 4000, 6000),
  gene_id   = c("GENE_A", "GENE_B", "GENE_C"),
  gene_name = c("Sym_A", "Sym_B", "Sym_C"),
  stringsAsFactors = FALSE
)

# The function expects 'chrdata' format
region_df$chrdata <- paste0(region_df$chr, ":", region_df$start, "-", region_df$end)
```

### 1.3 Simulate Omics Matrices

```{r mock-matrices}
set.seed(42)

# 1. RNA Object (using the package's native Build_RNAObject)
rna_counts <- matrix(runif(12, 10, 50), nrow = 3, ncol = 4)
rownames(rna_counts) <- c("GENE_A", "GENE_B", "GENE_C") # Matches region_df$gene_id
colnames(rna_counts) <- meta_df$RNA_Sample              # Matches meta_df$RNA_Sample
rna_obj <- Build_RNAObject(rna_counts, min_cells = 0, min_features = 0)
rna_obj$assays$RNA$data <- rna_obj$assays$RNA$counts    # Mock normalized data

# 2. CpG Matrix (Values 0-1)
cpg_mat <- matrix(runif(12, 0.2, 0.8), nrow = 3, ncol = 4)
rownames(cpg_mat) <- region_df$chrdata   # Matches region_df$chrdata
colnames(cpg_mat) <- meta_df$CpG_Sample  # Matches meta_df$CpG_Sample

# 3. GpC Matrix (Values 0-1)
gpc_mat <- matrix(runif(12, 0.1, 0.9), nrow = 3, ncol = 4)
rownames(gpc_mat) <- region_df$chrdata   # Matches region_df$chrdata
colnames(gpc_mat) <- meta_df$GpC_Sample  # Matches meta_df$GpC_Sample
```

## 2. Multi-Omics Integration

Now that our data is prepared and strictly aligned, we can demonstrate the various integration modes.

### Mode A: Tri-Omics Integration (RNA + CpG + GpC)

Integrate all three layers for the "Tumor" group.

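Before running the call, it can help to confirm that the identifiers really do line up, since the row and column names built above are the only link between the layers. The chunk below is a minimal sketch of such a check. `find_unmatched()` is a hypothetical helper written for this vignette, not part of `sciNOME`, and it assumes the matching keys are exactly those noted in the comments of section 1.3: the sample columns of `meta_df`, `gene_id` for RNA rows, and `chrdata` for CpG and GpC rows.

```{r check-alignment}
# Hypothetical helper (NOT a sciNOME function): lists every identifier that
# the metadata or region dictionary expects but a matrix does not carry.
find_unmatched <- function(meta_df, region_df, rna_counts, cpg_mat, gpc_mat) {
  list(
    # Sample names declared in the metadata but absent from matrix columns
    rna_samples = setdiff(meta_df$RNA_Sample, colnames(rna_counts)),
    cpg_samples = setdiff(meta_df$CpG_Sample, colnames(cpg_mat)),
    gpc_samples = setdiff(meta_df$GpC_Sample, colnames(gpc_mat)),
    # Genes / regions declared in the dictionary but absent from matrix rows
    rna_genes   = setdiff(region_df$gene_id, rownames(rna_counts)),
    cpg_regions = setdiff(region_df$chrdata, rownames(cpg_mat)),
    gpc_regions = setdiff(region_df$chrdata, rownames(gpc_mat))
  )
}
```

Calling `find_unmatched(meta_df, region_df, rna_counts, cpg_mat, gpc_mat)` on the objects built above returns one character vector per key, and an empty vector in every slot means nothing is missing. With the alignment confirmed, the tri-omics call is:
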
```{r int-tri}
tri_merged <- Integrate_MultiOmics(
  mode = "tri",
  target_group = "Tumor",
  meta_df = meta_df,
  group_col = "Condition",
  region_df = region_df,
  rna_obj = rna_obj,
  rna_id_col = "RNA_Sample",
  cpg_mat = cpg_mat,
  cpg_id_col = "CpG_Sample",
  gpc_mat = gpc_mat,
  gpc_id_col = "GpC_Sample"
)

# View results (genes mapped to their average RNA expression, CpG level, and GpC level)
knitr::kable(tri_merged, digits = 3)
```

### Mode B: Dual Integration (RNA + CpG)

If you only have RNA and DNA methylation data, use `mode = "rna_cpg"`. Here we compute the result for the "Normal" group.

```{r int-rna-cpg}
rna_cpg_merged <- Integrate_MultiOmics(
  mode = "rna_cpg",
  target_group = "Normal",
  meta_df = meta_df,
  group_col = "Condition",
  region_df = region_df,
  rna_obj = rna_obj,
  rna_id_col = "RNA_Sample",
  cpg_mat = cpg_mat,
  cpg_id_col = "CpG_Sample"
)

knitr::kable(rna_cpg_merged, digits = 3)
```

### Mode C: Epigenetics Only Integration (CpG + GpC)

To compare methylation and accessibility at the region level without RNA, use `mode = "cpg_gpc"`.

```{r int-cpg-gpc}
epi_merged <- Integrate_MultiOmics(
  mode = "cpg_gpc",
  target_group = "Tumor",
  meta_df = meta_df,
  group_col = "Condition",
  region_df = region_df,
  cpg_mat = cpg_mat,
  cpg_id_col = "CpG_Sample",
  gpc_mat = gpc_mat,
  gpc_id_col = "GpC_Sample"
)

knitr::kable(epi_merged, digits = 3)
```

## Session Information

```{r session-info}
sessionInfo()
```