| Title: | An advanced R-based package to quantify and visualize translation efficiency from sequencing data |
|---|---|
| Description: | Translation elongation is dependent on codon-anticodon interactions, with suitable nucleotide pairing being essential for efficient translation. To quantify this relationship, we previously developed a computational pipeline (GitHub - wgao688/sc_tRNA_mRNA) that uses mRNA codon usage relative to tRNA anticodon availability as a proxy for theoretical translation efficiency (tTE). Here, we introduce tTEscanR, a powerful and user-friendly R-based package that extends this approach to quantify translation efficiency from both bulk and single-cell sequencing data. tTEscanR is a versatile tool for exploring translation efficiency in diverse cellular processes, disease mechanisms, and therapeutic development. It also features an advanced visualization module to generate high-quality plots, enhancing result interpretation and communication. |
| Authors: | Ana Varas-Sánchez [aut, cre] (ORCID: <https://orcid.org/0009-0006-0187-8968>) |
| Maintainer: | Ana Varas-Sánchez <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.99.0 |
| Built: | 2026-06-29 23:01:25 UTC |
| Source: | https://github.com/BiocStaging/tTEscanR |
This function calculates the amino acid (AA) demand and/or supply from the
codon and/or anticodon usage matrices of a tTEscanR_Object. It
aggregates the contributions of their features based on the genetic code
(the standard code by default), mapping codons/anticodons to AA. The resulting
values reflect the total usage (demand) or availability (supply) of each amino
acid, depending on the input type.
computeAAUsage( object, level, genetic_code = "Standard", overwrite = FALSE, verbose = TRUE )
object |
A |
level |
Either |
genetic_code |
A |
overwrite |
Logical; if |
verbose |
Logical; if |
To generalize the analysis to any organism available in Ensembl, tTEscanR can be used with the genetic codes described by NCBI Taxonomy (translation tables 1 to 33; the 27 names are listed here):
"Standard", "Vertebrate Mitochondrial", "Yeast Mitochondrial", "Mold Mitochondrial; Protozoan Mitochondrial; Coelenterate Mitochondrial; Mycoplasma; Spiroplasma", "Invertebrate Mitochondrial", "Ciliate Nuclear; Dasycladacean Nuclear; Hexamita Nuclear", "Echinoderm Mitochondrial; Flatworm Mitochondrial", "Euplotid Nuclear", "Bacterial, Archaeal and Plant Plastid", "Alternative Yeast Nuclear", "Ascidian Mitochondrial", "Alternative Flatworm Mitochondrial", "Blepharisma Macronuclear", "Chlorophycean Mitochondrial", "Trematode Mitochondrial", "Scenedesmus obliquus mitochondrial", "Thraustochytrium mitochondrial code", "Rhabdopleuridae Mitochondrial", "Candidate Division SR1 and Gracilibacteria", "Pachysolen tannophilus Nuclear", "Karyorelict Nuclear", "Condylostoma Nuclear", "Mesodinium Nuclear", "Peritrich Nuclear", "Blastocrithidia Nuclear", "Balanophoraceae Plastid", "Cephalodiscidae Mitochondrial"
An updated tTEscanR_Object containing a new layer of
information in the assays slot representing the AA demand and/or
supply.
    data(default_tTEscanR_tRNA_data)
    tTEscanR_obj <- createObject(
        counts = default_tTEscanR_tRNA_data,
        assay = "tRNA"
    )
    tTEscanR_obj <- computeAnticodonUsage(object = tTEscanR_obj)
    tTEscanR_obj <- computeAAUsage(object = tTEscanR_obj, level = "supply")
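As a rough illustration of the aggregation described for computeAAUsage, the following base R sketch sums codon-level counts into amino acid demand. It does not call tTEscanR: the toy matrix `codon_usage` and the four-codon excerpt of the standard genetic code in `codon_to_aa` are assumptions made for this example, and the package's internal implementation may differ.

```r
# Hypothetical toy codon usage matrix: codons as rows, samples as columns.
codon_usage <- matrix(
  c(10, 4,
     5, 6,
     2, 8,
     3, 1),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("GCT", "GCC", "AAA", "AAG"), c("s1", "s2"))
)

# Excerpt of the standard genetic code (assumed lookup, four codons only).
codon_to_aa <- c(GCT = "Ala", GCC = "Ala", AAA = "Lys", AAG = "Lys")

# rowsum() adds up all rows that share the same group label,
# giving one row per amino acid.
aa_demand <- rowsum(codon_usage, group = codon_to_aa[rownames(codon_usage)])
print(aa_demand)
```

The same pattern would apply to anticodon counts to obtain a supply table, given an anticodon-to-amino-acid lookup.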
This function calculates anticodon usage profiles from tRNA gene
expression data stored in a tTEscanR_Object. It summarizes the
expression of tRNAs by their anticodon identity, which can be used to
estimate the tRNA supply landscape. The tRNA gene names must follow the
expected annotation format to be recognized, e.g. tRNA-Asn-GTT-5-1.
computeAnticodonUsage(object, overwrite = FALSE, verbose = TRUE)
object |
A |
overwrite |
Logical; if |
verbose |
Logical; if |
An updated tTEscanR_Object containing a new layer of
information "AnticodonUsage" in the assays slot
representing the anticodon usage.
    data(default_tTEscanR_tRNA_data)
    tTEscanR_obj <- createObject(
        counts = default_tTEscanR_tRNA_data,
        assay = "tRNA"
    )
    tTEscanR_obj <- computeAnticodonUsage(object = tTEscanR_obj)
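To make the expected gene name format concrete, here is a minimal base R sketch that extracts the anticodon from names shaped like tRNA-Asn-GTT-5-1 and sums expression per anticodon. The count matrix is made up, and taking the third dash-separated field as the anticodon is an assumption based on the format shown above, not a description of the package's parser.

```r
# Hypothetical tRNA gene expression matrix (genes as rows, samples as columns).
trna_counts <- matrix(
  c(5, 1,
    3, 2,
    7, 4),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("tRNA-Asn-GTT-5-1", "tRNA-Asn-GTT-2-1", "tRNA-Ala-AGC-1-1"),
    c("s1", "s2")
  )
)

# Assumed layout: tRNA-<amino acid>-<anticodon>-<family>-<copy>.
name_fields <- strsplit(rownames(trna_counts), "-", fixed = TRUE)
anticodon <- vapply(name_fields, function(fields) fields[3], character(1))

# Sum the expression of all tRNA genes that share an anticodon.
anticodon_usage <- rowsum(trna_counts, group = anticodon)
print(anticodon_usage)
```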
This function estimates codon usage profiles based on gene-level
mRNA expression data stored in a tTEscanR_Object. It optionally
accepts pre-computed codon frequency tables or uses internally generated
default tables when not provided. When enabled, it can evaluate the
correlation between background codon composition and observed mean codon
usage. If the additional metrics are to be computed, the input
tTEscanR_Object needs to
The default codon_freq tables were built
using the canonical filter, which selects one transcript when several are
available for the same gene.
computeCodonUsage( object, codon_freq = NULL, species = NULL, additional_metrics = TRUE, reduce = 100, corr_method = "spearman", overwrite = FALSE, verbose = TRUE )
object |
A |
codon_freq |
Optional; a user-provided codon frequency-per-gene table.
If necessary, it can be computed using |
species |
Optional; either |
additional_metrics |
Logical; if |
reduce |
Numeric; a scaling factor used to normalize large expression values that exceed R's handling capacity. Defaults to 100. |
corr_method |
A correlation method accepted by |
overwrite |
Logical; if |
verbose |
Logical; if |
An updated tTEscanR_Object containing a new layer of
information "CodonUsage" in the assays slot representing
the codon usage. Additional computations will be stored in the
meta.data slot as "CodonUsage_AdditionalMetrics".
    data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)
    tTEscanR_obj <- createObject(
        counts = default_tTEscanR_mRNA_data,
        assay = "mRNA",
        meta.data = list(default_tTEscanR_metadata, "tissue"),
        meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
    )
    tTEscanR_obj <- computeCodonUsage(
        object = tTEscanR_obj,
        species = "hg38",
        additional_metrics = FALSE,
        reduce = 1000
    )
This function calculates the correlation between observed mean usage and the exonic background. It provides a metric for evaluating how much usage is driven by underlying sequence composition versus condition-specific expression.
computeCorrelationBackground( mean, background, corr_method = "spearman", verbose = TRUE )
mean |
A |
background |
A |
corr_method |
A correlation method accepted by |
verbose |
Logical; if |
Integer; correlation information between mean and
background.
    data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)
    tTEscanR_obj <- createObject(
        counts = default_tTEscanR_mRNA_data,
        assay = "mRNA"
    )
    tTEscanR_obj <- computeCodonUsage(
        object = tTEscanR_obj,
        species = "hg38",
        additional_metrics = FALSE,
        reduce = 1000
    )
    codon_usage <- getAssay(tTEscanR_obj, "CodonUsage")
    exonic_background <- computeExonicBackground(data = codon_usage)
    # Input: expression count matrix, need to provide metadata & batch parameters
    mean_codon_usage <- computeMeanUsage(
        data = codon_usage,
        mode = "raw",
        metadata = default_tTEscanR_metadata,
        batch = "tissue"
    )
    corr_back <- computeCorrelationBackground(
        mean = mean_codon_usage,
        background = exonic_background
    )
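The comparison described for computeCorrelationBackground can be pictured with base R's `cor()`. The sketch below uses made-up frequencies for four codons (both vectors are hypothetical) and assumes the metric is a plain rank correlation between mean usage and background; the function's actual inputs and output structure may differ.

```r
# Hypothetical codon frequencies; each vector sums to 1.
mean_usage <- c(AAA = 0.30, AAG = 0.26, GCT = 0.20, GCC = 0.24)
background <- c(AAA = 0.28, AAG = 0.27, GCT = 0.24, GCC = 0.21)

# Align the two vectors by codon name before correlating them.
rho <- cor(mean_usage, background[names(mean_usage)], method = "spearman")
print(rho)
```

A value near 1 would suggest that usage largely follows sequence composition, while lower values point to expression-driven differences.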
Compute the DESeq2 Analysis
computeDEResults( list_data, metadata, target = NULL, batch = NULL, reference = NULL, reduce = 100, padj_threshold = 0.05, verbose = TRUE, compute_pairwise = TRUE )
list_data |
A |
metadata |
A |
target |
Optional; a factor based on |
batch |
Optional; name of the categorical variable in |
reference |
Optional; factor from the |
reduce |
Numeric; a scaling factor used to normalize large expression values that exceed R's handling capacity. Defaults to 100. |
padj_threshold |
Numeric; p-value threshold used for highlighting significant features in the volcano plot. Defaults to 0.05. |
verbose |
Logical; if |
compute_pairwise |
Logical; if |
A DESeq2 object with the normalized and vst counts.
    data(default_tTEscanR_tRNA_data, default_tTEscanR_metadata)
    DE_analysis <- computeDEResults(
        list_data = list(tRNA = default_tTEscanR_tRNA_data),
        metadata = default_tTEscanR_metadata,
        batch = "tissue"
    )
This function calculates the codon/anticodon usage background based solely on exonic sequence composition, independent of expression levels. It provides a reference distribution of codon/anticodon frequencies across conditions, used to normalize/compare against observed usage patterns derived from expression data.
computeExonicBackground(data)
data |
A codon usage matrix with codons as rows and conditions or samples as columns. |
A matrix with the codon/anticodon background.
    data(default_tTEscanR_mRNA_data)
    tTEscanR_obj <- createObject(
        counts = default_tTEscanR_mRNA_data,
        assay = "mRNA"
    )
    tTEscanR_obj <- computeCodonUsage(
        object = tTEscanR_obj,
        species = "hg38",
        additional_metrics = FALSE,
        reduce = 1000
    )
    exonic_background <- computeExonicBackground(
        data = getAssay(tTEscanR_obj, "CodonUsage")
    )
This function computes the average usage of codons, anticodons, or
amino acids across conditions, useful for summarizing feature usage trends
across sample groups. It supports direct input of a count matrix and
extraction from a tTEscanR_Object. When the input data is a
tTEscanR_Object the parameters metadata and batch will
be extracted from the object, and ignored if specified as input parameters.
Therefore, variables assay and metadata need to be coherent
with the rules described in createObject.
computeMeanUsage( data, assay = NULL, metadata = NULL, id_col = NULL, batch = NULL, mode = c("raw", "size-corrected", "long_format"), verbose = TRUE )
data |
A |
assay |
Optional; a character string specifying the name of the assay
to retrieve from the |
metadata |
Optional; a |
id_col |
Optional; a factor based on |
batch |
Optional; a factor based on |
mode |
Either |
verbose |
Logical; if |
An updated tTEscanR_Object if data is a
tTEscanR_Object. A data.frame containing a new layer of
information representing the mean codon usage if data is an
expression count matrix.
    data(default_tTEscanR_tRNA_data, default_tTEscanR_metadata)
    tTEscanR_obj <- createObject(
        counts = default_tTEscanR_tRNA_data,
        assay = "tRNA",
        meta.data = list(default_tTEscanR_metadata, "tissue"),
        meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
    )
    # Input: tTEscanR object containing metadata and batch parameters
    tTEscanR_obj <- computeAnticodonUsage(object = tTEscanR_obj)
    anticodon_mean_usage <- computeMeanUsage(
        data = tTEscanR_obj,
        assay = "AnticodonUsage"
    )
This function calculates the theoretical translation efficiency (tTE) score by integrating codon-anticodon usage and/or amino acid demand-supply across conditions.
computeTheoreticalTE( object, level = c("codon", "aa", "both"), genetic_code = "Standard", corr_method = c("spearman", "pearson", "kendall"), compute_significance = TRUE, overwrite = FALSE, verbose = TRUE )
object |
A |
level |
Either |
genetic_code |
A |
corr_method |
A correlation method accepted by |
compute_significance |
Logical; if |
overwrite |
Logical; if |
verbose |
Logical; if |
An updated tTEscanR_Object containing a new layer of
information representing the translation efficiency table for the
matching conditions in the mRNA and tRNA data.
    data(
        default_tTEscanR_mRNA_data,
        default_tTEscanR_tRNA_data,
        default_tTEscanR_metadata
    )
    tTEscanR_obj <- createObject(
        counts = list(
            mRNA = default_tTEscanR_mRNA_data,
            tRNA = default_tTEscanR_tRNA_data
        ),
        meta.data = list(default_tTEscanR_metadata, "tissue"),
        meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
    )
    tTEscanR_obj <- computeCodonUsage(
        object = tTEscanR_obj,
        species = "hg38",
        additional_metrics = FALSE,
        reduce = 10000
    )
    tTEscanR_obj <- computeAnticodonUsage(object = tTEscanR_obj)
    tTEscanR_obj <- computeTheoreticalTE(
        object = tTEscanR_obj,
        level = "codon",
        compute_significance = FALSE
    )
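One way to picture a codon-level tTE score is as a per-condition correlation between codon demand and the supply of the matching anticodon. The base R sketch below works under that assumption only. The two toy matrices, the codon-to-anticodon pairing in `codon_anticodon`, and the use of Spearman correlation as the score are illustrative choices, and computeTheoreticalTE may integrate the data differently.

```r
# Hypothetical codon demand and anticodon supply (features x conditions).
codon_usage <- matrix(
  c(10, 30, 20, 40,   # cond_A
    40, 10, 30, 20),  # cond_B
  ncol = 2,
  dimnames = list(c("AAC", "GCT", "AAA", "GAA"), c("cond_A", "cond_B"))
)
anticodon_usage <- matrix(
  c(1, 3, 2, 4,       # cond_A
    1, 2, 3, 4),      # cond_B
  ncol = 2,
  dimnames = list(c("GTT", "AGC", "TTT", "TTC"), c("cond_A", "cond_B"))
)

# Assumed pairing: each codon with its reverse-complement anticodon.
codon_anticodon <- c(AAC = "GTT", GCT = "AGC", AAA = "TTT", GAA = "TTC")

# Reorder the supply rows so that row i matches codon i.
supply <- anticodon_usage[codon_anticodon[rownames(codon_usage)], , drop = FALSE]

# One score per condition: rank correlation between demand and supply.
tte <- vapply(
  colnames(codon_usage),
  function(cond) cor(codon_usage[, cond], supply[, cond], method = "spearman"),
  numeric(1)
)
print(tte)
```

In this toy data, demand and supply are perfectly matched in cond_A and poorly matched in cond_B.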
This function initializes a tTEscanR_Object, a structured container
designed to hold translational efficiency-related data. The object consists
of two main components: assays, which stores data matrices
(e.g. expression, codon usage), and meta.data, which stores associated
information and additional intermediate calculations. A
tTEscanR_Object can be created using a single dataset, or initialized
with a list of datasets provided at once. Additional assays and metadata
layers can be appended later using the updateObject
function.
createObject( counts, assay = NULL, meta.data = NULL, meta.data.ids = NULL, verbose = TRUE )
counts |
A count |
assay |
Optional; a |
meta.data |
Optional; a variable (or |
meta.data.ids |
Optional; a |
verbose |
Logical; if |
To ensure robustness throughout the pipeline, specific ids
have been assigned and should be respected by the user.
assays slot:
mRNA and tRNA count matrices as "mRNA" and "tRNA"
Codon and anticodon usage count matrices as "CodonUsage" and
"AnticodonUsage"
Amino acid demand and supply count matrices as "AADemand" and
"AASupply"
Size corrected count matrices contain the prefix "SizeCorrected"
added to the raw count matrices names (e.g. "SizeCorrected_mRNA" or
"SizeCorrectedCodonUsage")
meta.data slot:
Table with the conditions of the mRNA and tRNA data as
"ConditionsLabels"
Active correction factor to use when running differential expression
analyses as "CorrectionFactor"
Optional; Identifier DataMetadataIndex to indicate the column in
"ConditionsLabels" that contains the labels of the conditions of the
assay.
A tTEscanR_Object.
    data(
        default_tTEscanR_mRNA_data,
        default_tTEscanR_tRNA_data,
        default_tTEscanR_metadata
    )
    tTEscanR_obj <- createObject(
        counts = list(
            mRNA = default_tTEscanR_mRNA_data,
            tRNA = default_tTEscanR_tRNA_data
        ),
        meta.data = default_tTEscanR_metadata,
        meta.data.ids = "ConditionsLabels"
    )
This dataset contains the extra information that indicates the conditions of the data.
data(default_tTEscanR_metadata)
A data frame with samples as rows and annotation columns such as:
The sequencing batch or sample origin
The experimental group or cell type
The combination of the previous items, as it appears in the column names of the count matrices
This dataset contains an example of mRNA gene expression data with protein-coding genes as rows and cell types as columns.
data(default_tTEscanR_mRNA_data)
A matrix where rows represent genes and columns represent individual samples or cell types.
This dataset contains an example of tRNA gene expression data with tRNA genes as rows and cell types as columns.
data(default_tTEscanR_tRNA_data)
A matrix or data frame where rows represent tRNA gene names and columns represent individual samples or cell types.
This function analyzes a given set of nucleotide sequences and computes the count of each codon present.
extractCodons(sequences, verbose = TRUE)
sequences |
A |
verbose |
Logical; if |
Codon frequency per gene table of the sequences.
    codon_composition <- extractCodons(
        sequences = list("ATGCGTACG", "TTAAGGCCG")
    )
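For readers who want to see what counting codons involves, here is a small base R sketch that splits in-frame sequences into triplets and tabulates them per gene. The helper `split_codons()` and the two named sequences are hypothetical, and the sketch assumes each sequence starts in frame; it does not reproduce extractCodons, whose output layout may differ.

```r
# Split one nucleotide sequence into consecutive, non-overlapping triplets.
split_codons <- function(nt) {
  n <- nchar(nt) - nchar(nt) %% 3   # ignore a trailing incomplete codon
  if (n < 3) return(character(0))
  starts <- seq.int(1, n, by = 3)
  substring(nt, starts, starts + 2)
}

# Hypothetical in-frame coding sequences, named by gene.
sequences <- c(gene_1 = "ATGCGTACG", gene_2 = "ATGATGCCG")

codons_per_gene <- lapply(sequences, split_codons)
codon_levels <- sort(unique(unlist(codons_per_gene, use.names = FALSE)))

# One row per gene, one column per codon observed in any sequence.
codon_counts <- t(vapply(
  codons_per_gene,
  function(x) tabulate(factor(x, levels = codon_levels),
                       nbins = length(codon_levels)),
  integer(length(codon_levels))
))
colnames(codon_counts) <- codon_levels
print(codon_counts)
```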
This function converts codons and anticodons into anticodons, codons or amino acids based on the genetic code.
featuresToAA( data, position = NULL, genetic_code = "Standard", verbose = TRUE, notation_from = c("codon", "anticodon"), notation_to = c("aa", "anticodon", "codon") )
data |
A |
position |
Optional; either |
genetic_code |
A |
verbose |
Logical; if |
notation_from |
Either |
notation_to |
Either |
Translated features (codons, anticodons or amino acids) from
data.
    data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)
    tTEscanR_obj <- createObject(
        counts = default_tTEscanR_mRNA_data,
        assay = "mRNA",
        meta.data = list(default_tTEscanR_metadata, "tissue"),
        meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
    )
    tTEscanR_obj <- computeCodonUsage(
        object = tTEscanR_obj,
        codon_freq = NULL,
        species = "hg38",
        additional_metrics = FALSE
    )
    codons <- rownames(getAssay(tTEscanR_obj, "CodonUsage"))
    codons_to_AA <- featuresToAA(
        data = codons,
        notation_from = "codon",
        notation_to = "aa"
    )
    codons_to_anticodons <- featuresToAA(
        data = codons,
        notation_from = "codon",
        notation_to = "anticodon"
    )
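The codon-to-anticodon direction can be illustrated in base R by taking the reverse complement of each codon. This is a sketch under a simplifying assumption (strict Watson-Crick pairing in the DNA alphabet, no wobble), and `reverse_complement()` is a hypothetical helper, not part of tTEscanR; featuresToAA may apply additional rules.

```r
# Reverse complement of DNA triplets (vectorized over a character vector).
reverse_complement <- function(x) {
  complemented <- chartr("ACGT", "TGCA", x)
  vapply(
    strsplit(complemented, "", fixed = TRUE),
    function(bases) paste(rev(bases), collapse = ""),
    character(1)
  )
}

codons <- c("AAC", "GCT", "AAA")
anticodons <- reverse_complement(codons)
print(anticodons)
```

Under this pairing the Asn codon AAC maps to GTT, which matches the anticodon field in the tRNA-Asn-GTT-5-1 naming format.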
This function safely retrieves the specified data from a
tTEscanR_Object.
getAssay(object, name) ## S4 method for signature 'tTEscanR_Object' getAssay(object, name)
object |
A |
name |
A character string specifying the name of the assay to retrieve (e.g. "mRNA", "tRNA"). |
The requested assay data (typically a matrix or data.frame).
    data(default_tTEscanR_mRNA_data)
    tTEscanR_obj <- createObject(
        counts = default_tTEscanR_mRNA_data,
        assay = "mRNA"
    )
    mRNA_data <- getAssay(tTEscanR_obj, "mRNA")
This function computes codon usage frequencies for each gene based on a
provided set of gene sequences. It can optionally subset the analysis to
specific transcripts and apply filtering criteria when multiple transcripts
are available for the same gene. Consequently, no filter parameter
will be considered if parameters transcripts and genes_file
are given.
getCodonFreq( dataset_name = NULL, genes_file = NULL, verbose = TRUE, transcripts = NULL, retain_mitochondrial = FALSE, filter = c("canonical", "length"), retain_unannotated = FALSE, retain_geneversion = TRUE, out_format = c("external_gene_name", "ensembl_transcript_id", "ensembl_gene_id") )
dataset_name |
A character string specifying the Ensembl species
dataset name (e.g. |
genes_file |
Optional; a path to a FASTA file. |
verbose |
Logical; if |
transcripts |
Optional; a character vector of transcripts or gene IDs to subset the analysis. |
retain_mitochondrial |
Logical; if |
filter |
Either |
retain_unannotated |
Logical; if |
retain_geneversion |
Logical; if |
out_format |
Either |
Codon frequency-per-gene table and a translator gene annotation table (if available).
This function safely retrieves the specified metadata from a
tTEscanR_Object.
getMetadata(object, name) ## S4 method for signature 'tTEscanR_Object' getMetadata(object, name)
object |
A |
name |
Optional; A character string specifying the name of the metadata to retrieve (e.g. "ConditionsLabels", "CorrectionFactor"). |
A data.frame or a vector, depending on the name
parameter.
    data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)
    tTEscanR_obj <- createObject(
        counts = default_tTEscanR_mRNA_data,
        assay = "mRNA",
        meta.data = default_tTEscanR_metadata,
        meta.data.ids = "ConditionsLabels"
    )
    conditions <- getMetadata(tTEscanR_obj, "ConditionsLabels")
This function performs a permutation test to compare an mRNA dataset to the reference codon frequency-per-gene matrix.
getPermutationDist( n_permut = 1000, n_features = 100, target_data = NULL, codon_freq = NULL, species = NULL, verbose = TRUE )
n_permut |
Numeric; number of permutations to perform. Defaults to 1000. |
n_features |
Numeric; number of features to select in each permutation.
Defaults to 100. If |
target_data |
Optional; an mRNA expression count matrix with features as rows and conditions as columns. |
codon_freq |
Optional; a user-provided codon frequency per gene table.
If necessary, it can be computed using |
species |
Optional, a |
verbose |
Logical; if |
A table with the codons and their frequencies after computing all the permutations.
    data(default_tTEscanR_mRNA_data)
    genes <- default_tTEscanR_mRNA_data[1:20, ]
    permut <- getPermutationDist(
        n_permut = 100,
        target_data = genes,
        species = "hg38"
    )
This function calculates the row sums of a given matrix to combine columns that share the same group.
groupConditions(data, group_labels)
data |
A |
group_labels |
A |
A matrix with the conditions merged based on the metadata.
    data <- data.frame(
        sample_1 = c(10, 5, 20),
        sample_2 = c(15, 8, 25),
        sample_3 = c(12, 6, 22),
        sample_4 = c(1, 2, 3),
        sample_5 = c(4, 5, 6),
        sample_6 = c(7, 8, 9)
    )
    rownames(data) <- c("gene_1", "gene_2", "gene_3")
    groups <- c("cond_A", "cond_A", "cond_A", "cond_B", "cond_B", "cond_B")
    data_combined <- groupConditions(data = data, group_labels = groups)
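The grouping step can also be pictured in plain base R, which may help when checking results by hand. The sketch below rebuilds the same toy counts as a matrix and sums the columns that share a label; it assumes groupConditions returns per-group sums with features as rows, as the description suggests, and does not call the package.

```r
# Same toy counts as in the example above (genes as rows, samples as columns).
counts <- matrix(
  c(10, 5, 20,  15, 8, 25,  12, 6, 22,
     1, 2,  3,   4, 5,  6,   7, 8,  9),
  nrow = 3,
  dimnames = list(
    c("gene_1", "gene_2", "gene_3"),
    paste0("sample_", 1:6)
  )
)
groups <- c("cond_A", "cond_A", "cond_A", "cond_B", "cond_B", "cond_B")

# rowsum() groups rows, so transpose, sum per group, and transpose back.
grouped <- t(rowsum(t(counts), group = groups))
print(grouped)
```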
This function efficiently combines individual matrices.
mergeMatrices(...)
... |
A variable number of |
A single sparse matrix combining all the input matrices.
    df1 <- matrix(
        c(1, 0, 0, 2),
        nrow = 2,
        dimnames = list(c("geneA", "geneB"), c("s1", "s2"))
    )
    df2 <- matrix(
        c(3, 0, 0, 4),
        nrow = 2,
        dimnames = list(c("geneB", "geneC"), c("s2", "s3"))
    )
    merged_matrix <- mergeMatrices(df1, df2)
Assess significance (p-value) and correct for multiple hypothesis testing
obtainSignificance(dist, value, padj_threshold = 0.05, verbose = TRUE)
dist |
A table with the codons and their frequencies after completing a
permutation test. Output from |
value |
A |
padj_threshold |
Numeric; p-value threshold used for highlighting significant features in the volcano plot. Defaults to 0.05. |
verbose |
Logical; if |
A table with the codon exonic background and their significance level before (p-value) and after the correction (p-adjusted value).
    data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)
    selected_genes <- default_tTEscanR_mRNA_data[1:20, ]
    permutation_test <- getPermutationDist(
        n_permut = 100,
        target_data = selected_genes,
        species = "hg38"
    )
    tTEscanR_obj <- createObject(
        counts = default_tTEscanR_mRNA_data,
        assay = "mRNA",
        meta.data = list(default_tTEscanR_metadata, "tissue"),
        meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
    )
    tTEscanR_obj <- computeCodonUsage(
        object = tTEscanR_obj,
        species = "hg38",
        additional_metrics = FALSE,
        reduce = 1000
    )
    codon_usage <- getAssay(tTEscanR_obj, "CodonUsage")
    codon_background <- rowSums(codon_usage) / sum(rowSums(codon_usage))
    codons_to_AA <- featuresToAA(
        data = names(codon_background),
        notation_from = "codon",
        notation_to = "aa"
    )
    codon_background <- data.frame(
        group = codons_to_AA,
        codon = names(codon_background),
        freq = as.numeric(codon_background),
        row.names = NULL
    )
    significance <- obtainSignificance(
        dist = permutation_test,
        value = codon_background
    )
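The general idea of turning a permutation distribution into raw and adjusted p-values can be sketched in base R. The null values and observed frequencies below are invented, the two-sided empirical p-value with a +1 correction is one common convention, and Benjamini-Hochberg is assumed as the correction method; none of these choices is confirmed to be what obtainSignificance uses.

```r
# Hypothetical permutation (null) frequencies per codon and observed values.
null_dist <- list(
  AAA = c(0.10, 0.12, 0.11, 0.13, 0.09),
  GCT = c(0.20, 0.22, 0.21, 0.19, 0.23)
)
observed <- c(AAA = 0.14, GCT = 0.21)

# Two-sided empirical p-value: share of null values at least as far from
# the null mean as the observed value, with a +1 correction so p is never 0.
empirical_p <- function(obs, null) {
  centre <- mean(null)
  (sum(abs(null - centre) >= abs(obs - centre)) + 1) / (length(null) + 1)
}

p_raw <- mapply(empirical_p, observed, null_dist[names(observed)])

# Benjamini-Hochberg adjustment for multiple testing (assumed method).
p_adj <- p.adjust(p_raw, method = "BH")
print(data.frame(codon = names(observed), p_value = p_raw, p_adjusted = p_adj))
```

With only five permutations the smallest attainable p-value is 1/6, which is why real analyses use many more permutations.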
This function generates a visualization to correlate different parameters of
the data. The input data (data) is expected to be in long
format and to contain at least the features, the
conditions they belong to and their usage counts. A common use of this
plot is to show the correlation between the codon frequencies of the
exonic background and the mean codon usage across conditions. The
plot parameter selects the most convenient layout
to display the data. The column names in
data must match the values passed to x_axis_col,
y_axis_col and condition_col.
plotCorrelation( data, plot = "MeanCodonUsage", x_axis_col, y_axis_col, condition_col, extra_val = NULL, label_col = NULL, color_palette = NULL, out_name = NULL, show_legend = "none", targeted_arg = NULL, save_format = NULL, out_directory = NULL, add_titles = TRUE, verbose = TRUE )
data |
A long format table. This format can be obtained using
|
plot |
Either |
x_axis_col |
Name of the categorical variable to reflect in the plot. |
y_axis_col |
Name of the numerical variable to reflect in the plot. |
condition_col |
Name of the categorical variable to group the data points. |
extra_val |
Optional; variable with additional information to include in the plot (e.g. correlation value). |
label_col |
Name of the categorical variable to label the data points. |
color_palette |
Optional; a vector of color codes to customize plot appearance. |
out_name |
Optional; name for the saved plot (if |
show_legend |
Either |
targeted_arg |
Optional; a vector defining key feature clusters to highlight or label. |
save_format |
Optional; either |
out_directory |
Optional; path to the directory where the plot will
be saved (if |
add_titles |
Logical; if |
verbose |
Logical; if |
A ggplot object representing the correlation. If
save_format is provided, the plot will also be saved to the
specified location.
    data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)

    # Define the object and compute the codon usage
    tTEscanR_obj <- createObject(
      counts = default_tTEscanR_mRNA_data,
      assay = "mRNA",
      meta.data = list(default_tTEscanR_metadata, "tissue"),
      meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
    )
    tTEscanR_obj <- computeCodonUsage(
      object = tTEscanR_obj,
      species = "hg38",
      additional_metrics = TRUE,
      reduce = 1000
    )

    # Compute and extract the mean codon usage
    additional_metrics <- getMetadata(
      tTEscanR_obj,
      "CodonUsage_AdditionalMetrics"
    )
    mean_codon_usage <- additional_metrics$MeanCodonUsage
    exonic_background <- additional_metrics$CodonExonicBackground
    exonic_background <- as.data.frame(exonic_background)
    correlation_mean_background <- cbind(mean_codon_usage, exonic_background)

    plotCorrelation(
      data = correlation_mean_background,
      plot = "MeanCodonUsage",
      x_axis_col = "mean_usage_across_conditions",
      y_axis_col = "exonic_background",
      condition_col = "feature",
      extra_val = additional_metrics$MeanCodonCorr,
      add_titles = TRUE,
      show_legend = "none"
    )
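As a rough illustration of the kind of quantity such a correlation plot summarizes, the sketch below computes a rank correlation between two numeric columns of a toy table using base R only. The values are invented, the column names simply mirror the example for this function, and the use of Spearman correlation here is an assumption rather than a statement about the package internals.

```r
# Toy long-format table; all values are invented for illustration only.
toy <- data.frame(
  feature = c("AAA", "AAC", "AAG", "AAT", "ACA"),
  mean_usage_across_conditions = c(0.024, 0.019, 0.032, 0.017, 0.015),
  exonic_background = c(0.025, 0.018, 0.031, 0.016, 0.017),
  stringsAsFactors = FALSE
)

# Check that the columns that would be passed as x_axis_col, y_axis_col
# and condition_col are actually present in the table.
required_cols <- c(
  "mean_usage_across_conditions", "exonic_background", "feature"
)
missing_cols <- setdiff(required_cols, colnames(toy))
stopifnot(length(missing_cols) == 0)

# Rank-based (Spearman) correlation between the two numeric columns.
rho <- cor(
  toy$mean_usage_across_conditions,
  toy$exonic_background,
  method = "spearman"
)
print(rho)
```

Checking the column names up front is a cheap way to catch the mismatch between data and the column arguments that the description warns about.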
Generates visualizations from the differential expression analysis (DEA) results.
plotDEResults( DE_results_list, dataset_name = NULL, heatmap = TRUE, dim_reduct = NULL, numPC = 2, target = NULL, verbose = TRUE, color_factor = NULL, shape_factor = NULL, label_factor = NULL, highlight_median = FALSE, scale_pca = TRUE, color_palette = NULL, fc_threshold = 1, padj_threshold = 0.05, label_significant = TRUE, show_legend = "none" )
DE_results_list |
A DESeq2 object with the normalized and variance-stabilized (VST) counts.
Can be obtained by running |
dataset_name |
String to specify the assay in |
heatmap |
Logical; if |
dim_reduct |
Either |
numPC |
Numeric; number of principal components to include in the PCA
analysis. Required if |
target |
Optional; a factor based on |
verbose |
Logical; if |
color_factor |
Optional; name of the categorical variable in
|
shape_factor |
Optional; name of the categorical variable that sets the shape of the
data points. Used if |
label_factor |
Optional; name of the categorical variable to label the
data points. Used if |
highlight_median |
Logical; if |
scale_pca |
Logical; if |
color_palette |
Optional; a |
fc_threshold |
Numeric; fold change threshold used for highlighting
significant features in the volcano plot (if |
padj_threshold |
Numeric; adjusted p-value threshold used for highlighting significant features in the volcano plot. Defaults to 0.05. |
label_significant |
Logical; if |
show_legend |
Either |
Visualization of the DEA results.
    data(default_tTEscanR_tRNA_data, default_tTEscanR_metadata)

    DE_analysis <- computeDEResults(
      list_data = list(tRNA = default_tTEscanR_tRNA_data),
      metadata = default_tTEscanR_metadata,
      batch = "tissue"
    )
    DE_plots <- plotDEResults(
      DE_results_list = DE_analysis,
      dataset_name = "tRNA",
      dim_reduct = "PCA",
      color_factor = "tissue",
      heatmap = FALSE
    )
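To make the roles of numPC and scale_pca more concrete, the following minimal sketch runs a PCA on a toy matrix with base R's prcomp(). It is not the package's implementation: the matrix is random, and mapping scale_pca to the scale. argument and numPC to the number of retained components is an assumption made for illustration.

```r
# Toy expression-like matrix: 6 features (rows) x 4 conditions (columns).
set.seed(1)
toy_counts <- matrix(
  rnorm(24, mean = 10, sd = 2),
  nrow = 6,
  dimnames = list(paste0("tRNA_", 1:6), paste0("cond_", 1:4))
)

# PCA on conditions: prcomp() expects observations in rows, hence t().
# scale. = TRUE standardizes each feature before the decomposition.
pca <- prcomp(t(toy_counts), center = TRUE, scale. = TRUE)

# Keep the first two components (assumed analogue of numPC = 2).
num_pc <- 2
scores <- pca$x[, seq_len(num_pc), drop = FALSE]

# Proportion of variance explained by each retained component.
var_explained <- (pca$sdev^2 / sum(pca$sdev^2))[seq_len(num_pc)]
print(scores)
print(var_explained)
```

The scores matrix has one row per condition, which is the table a PCA scatter plot colored by a metadata factor would be drawn from.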
This function generates a visualization of codon-anticodon usage or amino
acid demand-supply distributions across conditions. The input data
(data) is expected to be in long format and to contain, at a
minimum, the features, the conditions they belong to, and their
usage counts. The plot parameter selects the layout used to
display the data. The column names in data must match the
values passed to the parameters x_axis_col, y_axis_col and
condition_col.
plotDistribution( data, plot = "jitter", bar_position = "dodge", x_axis_col, y_axis_col, condition_col, color_palette = NULL, ncols = 1, facet_col = NULL, add_stats = FALSE, targeted_arg = NULL, save_format = NULL, out_name = NULL, out_directory = NULL, show_legend = "none", add_titles = TRUE, verbose = TRUE )
data |
A long format table. This format can be obtained using
|
plot |
Either |
bar_position |
Either |
x_axis_col |
Name of the categorical variable to reflect in the plot. |
y_axis_col |
Name of the numerical variable to reflect in the plot. |
condition_col |
Name of the categorical variable to group the data points. |
color_palette |
Optional; a vector of color codes to customize plot appearance. |
ncols |
Numeric; number of columns for arranging panels. Defaults to 1. |
facet_col |
Optional; name of the categorical variable to divide the
plot into different panels. Required if |
add_stats |
Logical; if |
targeted_arg |
Optional; a vector defining key feature clusters to highlight or label. |
save_format |
Optional; either |
out_name |
Optional; name for the saved plot (if |
out_directory |
Optional; path to the directory where the plot will
be saved (if |
show_legend |
Either |
add_titles |
Logical; if |
verbose |
Logical; if |
A ggplot object representing the requested distribution plot.
If save_format is provided, the plot will also be saved to the
specified location.
    data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)

    # Define the object and compute the codon usage
    tTEscanR_obj <- createObject(
      counts = default_tTEscanR_mRNA_data,
      assay = "mRNA",
      meta.data = list(default_tTEscanR_metadata, "tissue"),
      meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
    )
    tTEscanR_obj <- computeCodonUsage(
      object = tTEscanR_obj,
      species = "hg38",
      additional_metrics = FALSE,
      reduce = 1000
    )

    # Transform the data
    long_cu <- transformFormat(
      data = getAssay(tTEscanR_obj, "CodonUsage"),
      normalize = TRUE,
      rownames_to_column = "codon",
      names_to = "condition",
      values_to = "usage"
    )
    long_cu <- long_cu |>
      tidyr::separate(condition, into = c("tissue", "cell_type"), sep = "-")

    # Generate the plot
    codon_usage_plot <- plotDistribution(
      data = long_cu,
      plot = "jitter",
      x_axis_col = "codon",
      y_axis_col = "usage",
      condition_col = "tissue",
      show_legend = "right",
      add_titles = FALSE
    )
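The example for this function splits combined condition labels into separate grouping columns before plotting. A base-R sketch of the same idea is shown below on a toy table; the labels and values are invented, and the assumption is that every label contains exactly one "-" separator.

```r
# Toy long-format table with combined "tissue-cell_type" condition labels.
long_toy <- data.frame(
  codon = c("AAA", "AAA", "AAC", "AAC"),
  condition = c("liver-hepatocyte", "lung-fibroblast",
                "liver-hepatocyte", "lung-fibroblast"),
  usage = c(0.60, 0.55, 0.40, 0.45),
  stringsAsFactors = FALSE
)

# Split each label on "-" (the toy labels contain exactly one separator).
parts <- strsplit(long_toy$condition, "-", fixed = TRUE)
long_toy$tissue <- vapply(parts, `[`, character(1), 1)
long_toy$cell_type <- vapply(parts, `[`, character(1), 2)

# Mean usage per codon and tissue: the kind of grouping that a
# condition column drives in a distribution plot.
usage_by_tissue <- aggregate(usage ~ codon + tissue, data = long_toy, FUN = mean)
print(usage_by_tissue)
```

Labels that contain the separator more than once would need an explicit rule for where to split.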
This function generates a plot that compares the baseline codon exonic background with the current codon usage. For easier interpretation, the codons are colored by amino acid.
plotPermutation( permut_data, sig_data, color_palette = NULL, save_format = NULL, out_name = NULL, out_directory = NULL, show_legend = "none", add_titles = TRUE, verbose = TRUE )
permut_data |
A table with the codons and their frequencies after
computing all the permutations. Output from
|
sig_data |
A table with the codon exonic background and their
significance level before (p-value) and after the correction (p-adjusted
value). Output from |
color_palette |
Optional; a vector of color codes to customize plot appearance. |
save_format |
Optional; either |
out_name |
Optional; name for the saved plot (if |
out_directory |
Optional; path to the directory where the plot will be
saved (if |
show_legend |
Either |
add_titles |
Logical; if |
verbose |
Logical; if |
Permutation plot.
    data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)

    selected_genes <- default_tTEscanR_mRNA_data[1:20, ]
    permutation_test <- getPermutationDist(
      n_permut = 100,
      target_data = selected_genes,
      species = "hg38"
    )

    # Generate table with codon and freq
    tTEscanR_obj <- createObject(
      counts = default_tTEscanR_mRNA_data,
      assay = "mRNA",
      meta.data = list(default_tTEscanR_metadata, "tissue"),
      meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
    )
    tTEscanR_obj <- computeCodonUsage(
      object = tTEscanR_obj,
      species = "hg38",
      additional_metrics = FALSE,
      reduce = 1000
    )
    codon_usage <- getAssay(tTEscanR_obj, "CodonUsage")
    codon_background <- rowSums(codon_usage) / sum(rowSums(codon_usage))
    codons_to_AA <- featuresToAA(
      data = names(codon_background),
      notation_from = "codon",
      notation_to = "aa"
    )
    codon_background <- data.frame(
      group = codons_to_AA,
      codon = names(codon_background),
      freq = as.numeric(codon_background),
      row.names = NULL
    )
    significance <- obtainSignificance(
      dist = permutation_test,
      value = codon_background
    )

    plotPermutation(permut_data = permutation_test, sig_data = significance)
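The p-value and adjusted p-value mentioned for sig_data can be pictured with a generic permutation test. The sketch below is a textbook version in base R, not the package's own procedure: the null distribution is simulated, the two-sided definition and the +1 correction are assumptions, and the Benjamini-Hochberg method is only one possible correction.

```r
set.seed(42)

# Toy null distribution: 100 permuted frequencies for one codon (invented).
permuted_freq <- rnorm(100, mean = 0.020, sd = 0.002)
observed_freq <- 0.026

# Two-sided empirical p-value with a +1 correction, so it is never exactly 0.
null_center <- mean(permuted_freq)
n_extreme <- sum(
  abs(permuted_freq - null_center) >= abs(observed_freq - null_center)
)
p_value <- (n_extreme + 1) / (length(permuted_freq) + 1)

# Multiple-testing correction across codons (three toy p-values here).
raw_p <- c(p_value, 0.20, 0.04)
p_adjusted <- p.adjust(raw_p, method = "BH")
print(p_value)
print(p_adjusted)
```

With 100 permutations the smallest attainable p-value under this definition is 1/101, which is one reason to increase the number of permutations when many codons are tested.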
This function generates a proportion plot to compare codon-anticodon usage
or amino acid demand-supply frequencies across conditions. The input data
(data) is expected to be in long format and to contain, at a
minimum, the features, the conditions they belong to, and their
usage counts. The plot parameter selects the layout used to
display the data. The column names in data must match the
values passed to the parameters var_numerical, var_categorical and
var_color.
plotProportion( data, plot = "bar", var_numerical, var_categorical, var_color = NULL, facet_col = NULL, color_palette = NULL, num_limits = NULL, num_rings = 5, save_format = NULL, out_name = NULL, out_directory = NULL, zoom = FALSE, show_legend = "none", add_titles = TRUE, order = NULL, normalize = TRUE, verbose = TRUE )
data |
A long format table. This format can be obtained using
|
plot |
Either |
var_numerical |
Name of the numerical variable to reflect in the plot. |
var_categorical |
Name of the categorical variable to reflect in the plot. |
var_color |
Optional; name of the categorical variable to group the
data points when coloring. Required if |
facet_col |
Optional; name of the categorical variable to divide the plot into different panels. |
color_palette |
Optional; a vector of color codes to customize plot appearance. |
num_limits |
Optional; a vector with the upper and lower ranges of the
values in |
num_rings |
Optional; the number of rings to
display if |
save_format |
Optional; either |
out_name |
Optional; name for the saved plot (if |
out_directory |
Optional; path to the directory where the plot will be
saved (if |
zoom |
Logical; if |
show_legend |
Either |
add_titles |
Logical; if |
order |
Optional; a vector of the levels to organize the data, based
on the |
normalize |
Logical; if |
verbose |
Logical; if |
A ggplot object representing the requested proportion plot.
If save_format is provided, the plot will also be saved to the
specified location.
    data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)

    # Define the object and compute the codon usage
    tTEscanR_obj <- createObject(
      counts = default_tTEscanR_mRNA_data,
      assay = "mRNA",
      meta.data = list(default_tTEscanR_metadata, "tissue"),
      meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
    )
    tTEscanR_obj <- computeCodonUsage(
      object = tTEscanR_obj,
      species = "hg38",
      additional_metrics = TRUE,
      reduce = 1000
    )

    # Compute and extract the mean codon usage
    additional_metrics <- getMetadata(
      tTEscanR_obj,
      "CodonUsage_AdditionalMetrics"
    )
    mean_codon_usage <- additional_metrics$MeanCodonUsage
    mean_codon_usage$codon <- mean_codon_usage$feature

    # Translate the codons to amino acids
    mean_codon_usage <- featuresToAA(
      data = mean_codon_usage,
      position = "feature",
      notation_from = "codon",
      notation_to = "aa",
      verbose = FALSE
    )

    # Generate the plot
    plotProportion(
      data = mean_codon_usage,
      plot = "bar",
      var_numerical = "mean_usage_across_conditions",
      var_categorical = "codon",
      var_color = "feature",
      show_legend = "none"
    )
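One way to think about a proportion plot is that each codon is expressed as a fraction of its group before drawing. The sketch below shows that step in base R on a toy table. The values are invented, and treating normalization as "proportion within the coloring group" is an assumption, since the source does not spell out what normalize does.

```r
# Toy table: usage per codon, with the amino acid each codon encodes.
toy <- data.frame(
  codon = c("GCA", "GCC", "GCG", "GCT", "AAA", "AAG"),
  amino_acid = c("Ala", "Ala", "Ala", "Ala", "Lys", "Lys"),
  usage = c(2, 4, 1, 3, 3, 7),
  stringsAsFactors = FALSE
)

# Proportion of each codon within its amino acid group.
toy$proportion <- ave(
  toy$usage, toy$amino_acid,
  FUN = function(x) x / sum(x)
)
print(toy)
```

After this step the proportions within each amino acid sum to 1, which makes synonymous codons directly comparable across groups of different size.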
This function generates a distribution plot that compares the usage
distribution of a targeted set of conditions against the rest. A common application
of this plot is to compare mean usages across conditions. Both input data
sources (target_data and overall_data) are expected to be in
long format and to contain, at a minimum, the features, the
conditions they belong to, and their usage counts. The column names
must be the same in target_data and
overall_data, and must match the parameters x_axis_col and
y_axis_col.
plotTargetComparison( target_data, overall_data, x_axis_col, y_axis_col, color_palette = NULL, show_difference = TRUE, save_format = NULL, out_name = NULL, add_titles = TRUE, out_directory = NULL, show_legend = "none", verbose = TRUE )
target_data |
A long format table of a condition of interest. This
format can be obtained using |
overall_data |
A long format table of the whole set of conditions
(with or without the data in |
x_axis_col |
Name of the categorical variable to reflect in the plot. |
y_axis_col |
Name of the numerical variable to reflect in the plot. |
color_palette |
Optional; a vector of two or three color codes to customize plot appearance, in this order: (i) data, (ii) target value, and (iii) difference bar. |
show_difference |
Logical; if |
save_format |
Optional; either |
out_name |
Optional; name for the saved plot (if |
add_titles |
Logical; if |
out_directory |
Optional; path to the directory where the plot will
be saved (if |
show_legend |
Either |
verbose |
Logical; if |
A ggplot object representing the requested distribution plot.
If save_format is provided, the plot will also be saved to the
specified location.
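The comparison that this plot displays can be sketched numerically: for each feature, the target value is set against the mean of the overall distribution. The base-R example below uses invented tables with hypothetical column names (codon, usage), and the "target minus overall mean" difference is an assumed reading of what the difference bar represents.

```r
# Toy long-format tables (invented values): all conditions vs one target.
overall_data <- data.frame(
  codon = rep(c("AAA", "AAG"), each = 3),
  usage = c(0.40, 0.50, 0.60, 0.60, 0.50, 0.40),
  stringsAsFactors = FALSE
)
target_data <- data.frame(
  codon = c("AAA", "AAG"),
  usage = c(0.70, 0.30),
  stringsAsFactors = FALSE
)

# Mean usage per feature across the whole set of conditions.
overall_mean <- aggregate(usage ~ codon, data = overall_data, FUN = mean)

# Join on the shared categorical column and compute target minus overall.
comparison <- merge(
  target_data, overall_mean,
  by = "codon", suffixes = c("_target", "_overall")
)
comparison$difference <- comparison$usage_target - comparison$usage_overall
print(comparison)
```

The join only works because both tables use the same column names, which is the consistency requirement stated in the description.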
This function generates a violin plot to visualize the distribution of tTE
scores across different conditions. The tTE scores should be obtained using
computeTheoreticalTE. The plot allows the user to easily
compare the tTE distribution between different conditions, with the option
to highlight specific feature clusters based on the provided targets.
plotTEscore( data, metadata, class_col, index_col, target_col = NULL, score_col = "tTE", cond_col = "condition", pval_col = "p_value", facet_col = NULL, color_palette = NULL, save_format = NULL, out_name = NULL, add_stats = TRUE, out_directory = NULL, verbose = TRUE, show_legend = "none", add_titles = TRUE, show_outliers = FALSE )
data |
A tTE results table obtained from
|
metadata |
A table with additional information regarding the conditions
in |
class_col |
Name of the categorical variable to reflect in the plot. |
index_col |
Name of the categorical variable that links the conditions
in |
target_col |
Optional; name of the categorical variable to perform the
statistical comparison (the most specific level). Used if
|
score_col |
Name of the numerical variable that contains the tTE scores
in the |
cond_col |
Name of the categorical variable that contains the conditions
in the |
pval_col |
Name of the numerical variable that contains the significance
scores in |
facet_col |
Optional; name of the categorical variable used to separate the plot into different panels. |
color_palette |
Optional; a vector of color codes to customize plot appearance. |
save_format |
Optional; either |
out_name |
Optional; name for the saved plot (if |
add_stats |
Logical; if |
out_directory |
Optional; path to the directory where the plot will be
saved (if |
verbose |
Logical; if |
show_legend |
Either |
add_titles |
Logical; if |
show_outliers |
Logical; if |
A ggplot object representing the tTE scores. If
save_format is provided, the plot will also be saved to the
specified location. If add_stats is enabled, a table summarizing the
statistical measures is also reported.

    data(
      default_tTEscanR_mRNA_data,
      default_tTEscanR_tRNA_data,
      default_tTEscanR_metadata
    )

    # Define the tTEscanR object
    tTEscanR_obj <- createObject(
      counts = list(
        mRNA = default_tTEscanR_mRNA_data,
        tRNA = default_tTEscanR_tRNA_data
      ),
      meta.data = list(default_tTEscanR_metadata, "tissue"),
      meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
    )

    # Compute the codon and anticodon usage
    tTEscanR_obj <- computeCodonUsage(
      object = tTEscanR_obj,
      species = "hg38",
      additional_metrics = FALSE,
      reduce = 10000
    )
    tTEscanR_obj <- computeAnticodonUsage(object = tTEscanR_obj)

    # Compute the theoretical translation efficiency (tTE scores)
    tTEscanR_obj <- computeTheoreticalTE(
      object = tTEscanR_obj,
      level = "codon",
      compute_significance = TRUE
    )
    tTEresults_codon <- getMetadata(tTEscanR_obj, "tTEresults_codon")
    conditions_metadata <- getMetadata(tTEscanR_obj, "ConditionsLabels")

    # Visualize the tTE scores
    plotTEscore(
      data = tTEresults_codon,
      metadata = conditions_metadata,
      index_col = "conditions",
      class_col = "tissue",
      add_stats = TRUE
    )
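The relationship between cond_col, index_col and class_col can be illustrated with a plain join: the scores table is linked to the metadata through the condition identifiers, and the scores are then summarized per class. The sketch below uses base R on invented tables. The column names follow the defaults and the example for this function, and the median is just one possible summary statistic.

```r
# Toy tTE results and condition metadata (invented values and labels).
tte <- data.frame(
  condition = c("c1", "c2", "c3", "c4"),
  tTE = c(0.82, 0.78, 0.65, 0.61),
  stringsAsFactors = FALSE
)
meta <- data.frame(
  conditions = c("c1", "c2", "c3", "c4"),
  tissue = c("liver", "liver", "lung", "lung"),
  stringsAsFactors = FALSE
)

# Link scores to metadata: "condition" plays the role of cond_col in the
# results, "conditions" the role of index_col in the metadata.
merged <- merge(tte, meta, by.x = "condition", by.y = "conditions")

# Median tTE per class ("tissue" plays the role of class_col).
median_by_tissue <- aggregate(tTE ~ tissue, data = merged, FUN = median)
print(median_by_tissue)
```

If the identifiers in the two tables do not match exactly, the join silently drops rows, so it is worth comparing nrow(merged) with nrow(tte).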
This function applies differential expression analysis using the DESeq2 framework on a matrix of expression values. It supports both exploratory visualizations (heatmap and PCA) and targeted comparisons using a custom contrast table.
runDEAnalysis( list_data, metadata, batch = NULL, reference = NULL, reduce = 100, dim_reduct = NULL, color_factor = batch, heatmap = TRUE, shape_factor = NULL, label_factor = NULL, target = NULL, highlight_median = FALSE, numPC = 2, color_palette = NULL, fc_threshold = 1, show_legend = "none", padj_threshold = 0.05, label_significant = TRUE, compute_pairwise = TRUE, verbose = TRUE )
list_data |
A |
metadata |
A |
batch |
Optional; name of the categorical variable in |
reference |
Optional; factor from the |
reduce |
Numeric; a scaling factor used to normalize large expression values that exceed R's handling capacity. Defaults to 100. |
dim_reduct |
Either |
color_factor |
Optional; name of the categorical variable in
|
heatmap |
Logical; if |
shape_factor |
Optional; name of the categorical variable in
|
label_factor |
Optional; name of the categorical variable to label the
data points. Used if |
target |
Optional; a factor based on |
highlight_median |
Logical; if |
numPC |
Numeric; number of principal components to include in the PCA
analysis. Required if |
color_palette |
Optional; a |
fc_threshold |
Numeric; fold change threshold used for highlighting significant features in the volcano plot. Defaults to 1. |
show_legend |
Either |
padj_threshold |
Numeric; adjusted p-value threshold used for highlighting significant features in the volcano plot. Defaults to 0.05. |
label_significant |
Logical; if |
compute_pairwise |
Logical; if |
verbose |
Logical; if |
A list of outputs for each matrix in list_data,
based on the enabled parameters: (i) exploratory plots (heatmap and/or
PCA), (ii) targeted plot (volcano), and (iii) size-corrected
data.

    data(default_tTEscanR_tRNA_data, default_tTEscanR_metadata)

    DE_analysis <- runDEAnalysis(
      list_data = list(tRNA = default_tTEscanR_tRNA_data),
      metadata = default_tTEscanR_metadata,
      batch = "tissue",
      color_factor = "tissue"
    )
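The two volcano-plot thresholds can be read as a simple filter on a results table. The sketch below applies them to an invented table in base R. It assumes that the fold change is on the log2 scale, that the threshold applies to its absolute value, and that a feature must pass both thresholds to be highlighted; none of these details is confirmed by the source.

```r
# Toy differential-expression table (invented values).
de <- data.frame(
  feature = c("tRNA-Ala-AGC", "tRNA-Gly-GCC", "tRNA-Lys-CTT", "tRNA-Ser-AGA"),
  log2FoldChange = c(2.1, -1.4, 0.3, -0.8),
  padj = c(0.001, 0.030, 0.020, 0.200),
  stringsAsFactors = FALSE
)

fc_threshold <- 1       # same value as the default in the usage
padj_threshold <- 0.05  # same value as the default in the usage

# Assumption: a feature is highlighted only when it passes both thresholds.
de$significant <- abs(de$log2FoldChange) >= fc_threshold &
  de$padj < padj_threshold

# Direction label for the highlighted features, "ns" otherwise.
de$direction <- ifelse(
  !de$significant, "ns",
  ifelse(de$log2FoldChange > 0, "up", "down")
)
print(de)
```

In this toy table the third feature has a small adjusted p-value but a small effect, so it stays unhighlighted, which is the usual reason for combining both criteria.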
This function wraps all the independent functions of the theoretical translation efficiency pipeline. It requires mRNA and tRNA count matrices to compute the codon and anticodon usage and to derive the amino acid demand and supply. When the two matrices share matching conditions, the theoretical translation efficiency is also computed.
runPipeline( mRNA_data, tRNA_data, metadata, batch, corr_method = "spearman", additional_metrics = TRUE, runDESeq = TRUE, compute_significance = TRUE, codon_freq = NULL, species = NULL, genetic_code = "Standard", dim_reduct = NULL, reduce = 100, color_factor = NULL, verbose = TRUE )
mRNA_data |
A count matrix with mRNA genes as rows and conditions as columns. |
tRNA_data |
A count matrix with tRNA genes as rows and conditions as columns. |
metadata |
A |
batch |
A factor based on |
corr_method |
A correlation method accepted by |
additional_metrics |
Logical; if |
runDESeq |
Logical; if |
compute_significance |
Logical; if |
codon_freq |
Optional; a user-provided codon frequency-per-gene table.
If necessary, it can be computed using |
species |
Either |
genetic_code |
A |
dim_reduct |
Either |
reduce |
Numeric; a scaling factor used to normalize large expression values that exceed R's handling capacity. Defaults to 100. |
color_factor |
A factor based on |
verbose |
Logical; if |
A tTEscanR_Object with the assays and metadata computed
through the tTE pipeline.
    data(
      default_tTEscanR_mRNA_data,
      default_tTEscanR_tRNA_data,
      default_tTEscanR_metadata
    )

    tTEscanR_obj <- runPipeline(
      mRNA_data = default_tTEscanR_mRNA_data,
      tRNA_data = default_tTEscanR_tRNA_data,
      metadata = default_tTEscanR_metadata,
      species = "hg38",
      batch = "tissue",
      additional_metrics = FALSE,
      compute_significance = FALSE,
      runDESeq = FALSE
    )
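Because the pipeline relies on conditions that are present in both count matrices, it can help to check the overlap before running it. The following base-R sketch does this on two invented matrices. It is a pre-flight check written for illustration, under the assumption that conditions are identified by column names, and it is not part of the package.

```r
# Toy count matrices (invented values); column names are condition labels.
mRNA_toy <- matrix(
  1:6, nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("c1", "c2", "c3"))
)
tRNA_toy <- matrix(
  1:6, nrow = 2,
  dimnames = list(c("tRNA1", "tRNA2"), c("c2", "c3", "c4"))
)

# Conditions present in both matrices, in the order of the mRNA matrix.
shared <- intersect(colnames(mRNA_toy), colnames(tRNA_toy))
if (length(shared) == 0) {
  stop("No matching conditions between the mRNA and tRNA matrices.")
}

# Restrict both matrices to the shared conditions, in the same order.
mRNA_matched <- mRNA_toy[, shared, drop = FALSE]
tRNA_matched <- tRNA_toy[, shared, drop = FALSE]
print(shared)
```

Using drop = FALSE keeps the result a matrix even when only a single condition is shared.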
This function analyzes the contribution of the most highly expressed genes
to the overall codon pool across conditions. It is particularly useful for
evaluating codon bias in highly expressed genes and how it varies across
conditions. If needed, gene annotations can be translated for consistency;
built-in species-specific references for human ("hg38") and mouse
("mm39") are supported.
showPoolContribution( object, codon_freq = NULL, species = NULL, N = 10, corr_method = "spearman", overwrite = FALSE, verbose = TRUE )
object |
A |
codon_freq |
Optional; a user-provided codon frequency per gene table.
If necessary, it can be computed using |
species |
A character string specifying the species reference genome
version (used if |
N |
Numeric; number of top genes to consider in the codon pool contribution. Defaults to 10. |
corr_method |
A correlation method accepted by |
overwrite |
Logical; if |
verbose |
Logical; if |
An updated tTEscanR_Object containing new layers of
information in the meta.data slot, representing the codon pool
contribution.
    data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)

    tTEscanR_obj <- createObject(
      counts = list(mRNA = default_tTEscanR_mRNA_data),
      meta.data = list(default_tTEscanR_metadata, "tissue"),
      meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
    )
    tTEscanR_obj <- computeCodonUsage(
      object = tTEscanR_obj,
      species = "hg38",
      additional_metrics = FALSE
    )
    tTEscanR_obj <- showPoolContribution(
      object = tTEscanR_obj,
      species = "hg38"
    )
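The intuition behind the top-N analysis is that a few highly expressed genes can dominate the pool. The simplified base-R sketch below measures, for each condition, the fraction of total counts contributed by the top N genes of an invented matrix. It works on raw counts only and does not weight genes by their codon frequencies, so it is a rough stand-in for the idea rather than the package's computation.

```r
# Toy mRNA count matrix (invented values): 5 genes x 2 conditions.
counts <- matrix(
  c(500, 300, 100, 60, 40,
    100, 100, 400, 300, 100),
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), c("c1", "c2"))
)

top_n <- 2

# Fraction of each condition's total counts contributed by its top-N genes.
top_fraction <- apply(counts, 2, function(x) {
  sum(sort(x, decreasing = TRUE)[seq_len(top_n)]) / sum(x)
})
print(top_fraction)
```

The top genes are selected separately in each condition here, so different conditions may be dominated by different genes.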
This function converts a matrix or data.frame into a tidy,
long-format tibble. Optionally normalizes the values.
transformFormat(data, normalize, rownames_to_column, names_to, values_to)
data |
A table to be converted. Supported formats: |
normalize |
Logical; if |
rownames_to_column |
A character string specifying the name of the new
column that will hold the former row names in |
names_to |
A character string specifying the name of the new column
that will hold the former column names in |
values_to |
A character string specifying the name of the new column
that will hold the corresponding values from the pivoted columns in
|
A long-format tibble of the input data.

    data(default_tTEscanR_tRNA_data)

    tRNA_long_format <- transformFormat(
      data = default_tTEscanR_tRNA_data,
      normalize = FALSE,
      rownames_to_column = "tRNA_genes",
      names_to = "condition",
      values_to = "abundance"
    )

    tTEobj <- createObject(
      counts = default_tTEscanR_tRNA_data,
      assay = "tRNA"
    )
    tTEobj <- computeAnticodonUsage(object = tTEobj)
    anticodon_long_format <- transformFormat(
      data = getAssay(tTEobj, "AnticodonUsage"),
      normalize = TRUE,
      rownames_to_column = "anticodons",
      names_to = "condition",
      values_to = "usage"
    )
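For readers unfamiliar with the wide-to-long conversion, the sketch below reproduces the general idea in base R on a small invented matrix. The output column names mirror the example for this function. Interpreting normalization as converting each condition (column) to proportions is an assumption, and the result is a plain data.frame rather than a tibble.

```r
# Toy usage matrix (invented values): anticodons x conditions.
usage <- matrix(
  c(30, 70, 20, 80),
  nrow = 2,
  dimnames = list(c("AGC", "CTT"), c("c1", "c2"))
)

# Assumed normalization: convert each condition (column) to proportions.
usage_norm <- prop.table(usage, margin = 2)

# Pivot to long format: one row per (anticodon, condition) pair.
# as.vector() reads the matrix column by column, matching the rep() calls.
long <- data.frame(
  anticodons = rep(rownames(usage_norm), times = ncol(usage_norm)),
  condition = rep(colnames(usage_norm), each = nrow(usage_norm)),
  usage = as.vector(usage_norm),
  stringsAsFactors = FALSE
)
print(long)
```

The long table has one row per matrix cell, which is the shape the plotting functions in this manual expect for their data argument.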
This function filters a tRNA expression matrix by removing conditions
(columns) that fall below a specific total read count (cutoff). It
is useful for eliminating low-quality or poorly sequenced conditions that
may bias downstream analyses.
tRNAFilterCuts(data, cutoff = 5000, verbose = TRUE)
data |
A |
cutoff |
Numeric; minimum total number of tRNA cuts required to retain
a condition in |
verbose |
Logical; if |
A filtered matrix or data.frame with the conditions (columns) whose
total tRNA cut count falls below the cutoff removed.
data(default_tTEscanR_tRNA_data)
tRNA_data_filtered <- tRNAFilterCuts(
  data = default_tTEscanR_tRNA_data,
  cutoff = 5000
)
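The filtering rule described above (drop conditions whose total count is below a cutoff) can be sketched in base R. This is an illustration on a toy matrix with hypothetical names and values, not the code of tRNAFilterCuts; in particular, treating the cutoff as inclusive is an assumption.

```r
# Toy count matrix: rows are tRNA genes, columns are conditions (hypothetical values)
counts <- matrix(
  c(4000, 3000, 100, 200, 6000, 1000),
  nrow = 2,
  dimnames = list(c("tRNA_1", "tRNA_2"), c("condA", "condB", "condC"))
)

cutoff <- 5000

# Keep only the columns whose total count reaches the cutoff (assumed inclusive)
keep <- colSums(counts) >= cutoff
filtered <- counts[, keep, drop = FALSE]
```

Using `drop = FALSE` keeps the result a matrix even when a single condition survives the filter.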
Generate a tRNA expression matrix
tRNAGetMatrix(
  data,
  assay = "peaks",
  confidence_set = NULL,
  tRNA_name_map = NULL,
  species = NULL,
  flanking_region = 100,
  name_sep = c("-", "-"),
  save = TRUE,
  out_name = NULL,
  out_directory = NULL,
  verbose = TRUE
)
data |
|
assay |
Optional; a character string specifying the name of the assay to
retrieve from |
confidence_set |
Either a file path to the tRNA annotations (confidence set file from GtRNAdb) or a GRanges object containing the set of high-confidence tRNA genes. |
tRNA_name_map |
Optional; a |
species |
Optional; either |
flanking_region |
Integer; number of nucleotides that form the flanking region of each tRNA. Defaults to 100. |
name_sep |
A string delimiter to format the tRNA gene names in the
output matrix. Defaults to |
save |
Logical; if |
out_name |
Optional; name for the saved plot (if |
out_directory |
Optional; path to the directory where the plot will be
saved (if |
verbose |
Logical; if |
Sparse matrix of tRNA counts (tRNAs x cells)
Selection of the Optimal tRNA Cut Cutoff
tRNASetCutoff(
  data,
  num_iter = 1000,
  cutoffs_limits = c(50, 10000),
  generate_plot = TRUE,
  slope_threshold = 0.001,
  rho_threshold = 0.95,
  compute_aa = FALSE,
  verbose = TRUE
)
data |
tRNA gene expression count |
num_iter |
Numeric; number of iterations to perform when determining the optimal cutoff. Defaults to 1000. |
cutoffs_limits |
Numeric vector of length two; minimum and maximum values tested when searching for the optimal tRNA cuts threshold. Defaults to c(50, 10000). |
generate_plot |
Logical; if |
slope_threshold |
Numeric; threshold used to assess correlation stability. Defaults to 0.001. |
rho_threshold |
Numeric; threshold used to assess correlation strength. Defaults to 0.95. |
compute_aa |
Logical; if |
verbose |
Logical; if |
Table with the optimal cutoff at the anticodon isoacceptor and amino acid isotype levels.
data(default_tTEscanR_tRNA_data)
optimal_tRNA_cutoffs <- tRNASetCutoff(
  data = default_tTEscanR_tRNA_data,
  generate_plot = FALSE,
  num_iter = 50,
  cutoffs_limits = c(3500, 4000)
)
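The arguments above (iterations, a range of candidate cutoffs, and a correlation threshold) suggest a resampling-style search. The base R sketch below shows one generic way such a search could look: reads are repeatedly down-sampled to a candidate depth and the resulting profile is compared with the full-depth profile by Spearman correlation. This is an assumption made for illustration, not the documented algorithm of tRNASetCutoff; the function name `mean_rho_at_depth`, the simulated counts, and the tested depths are all hypothetical.

```r
set.seed(42)

# Hypothetical full-depth profile: read counts for 20 tRNA genes in one condition
full_counts <- as.vector(rmultinom(1, size = 20000, prob = runif(20)))

# Mean Spearman correlation between down-sampled and full-depth profiles
mean_rho_at_depth <- function(counts, depth, num_iter = 50) {
  pool <- rep(seq_along(counts), times = counts)  # one entry per read
  rhos <- replicate(num_iter, {
    sub <- tabulate(sample(pool, depth), nbins = length(counts))
    cor(sub, counts, method = "spearman")
  })
  mean(rhos)
}

depths <- c(50, 500, 5000)
rho_by_depth <- vapply(
  depths,
  function(d) mean_rho_at_depth(full_counts, d),
  numeric(1)
)

# Smallest tested depth whose mean correlation reaches the strength threshold
# (NA if no tested depth reaches it)
rho_threshold <- 0.95
candidate <- depths[rho_by_depth >= rho_threshold][1]
```

In this toy setting the correlation rises with depth, so a threshold on its value, or on how slowly it still changes, gives a natural stopping point.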
Annotate the tRNA genes from tRNA tags
tRNASetGenes(data, tRNA_bed, flanking_region = 100, name_sep = c("-", "-"))
data |
A |
tRNA_bed |
Path to the directory that contains the .bed file |
flanking_region |
Numeric; number of bases by which to expand the interrogated region. Defaults to 100. |
name_sep |
A string delimiter to format the tRNA gene names in the
output matrix. Defaults to |
The input data with the translated tRNA gene names.
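The role of the flanking region can be illustrated with a base R sketch that assigns positions to annotated tRNA genes after widening each gene interval on both sides. It is not the implementation of tRNASetGenes: the helper `annotate_tags`, the coordinates, and the GtRNAdb-style gene names are hypothetical, and BED coordinate conventions (0-based, half-open) are ignored for simplicity.

```r
# Hypothetical tRNA annotation, in the spirit of a BED file
tRNA_bed <- data.frame(
  chrom = c("chr1", "chr1"),
  start = c(1000, 5000),
  end   = c(1072, 5073),
  name  = c("tRNA-Ala-AGC-1-1", "tRNA-Gly-GCC-2-1"),
  stringsAsFactors = FALSE
)

# Hypothetical tag positions to annotate
tags <- data.frame(
  chrom = c("chr1", "chr1", "chr1"),
  pos   = c(950, 5100, 9000),
  stringsAsFactors = FALSE
)

# Return the first gene whose interval, widened by `flank` bases, contains the tag
annotate_tags <- function(tags, bed, flank = 100) {
  vapply(seq_len(nrow(tags)), function(i) {
    hit <- bed$chrom == tags$chrom[i] &
      tags$pos[i] >= bed$start - flank &
      tags$pos[i] <= bed$end + flank
    if (any(hit)) bed$name[which(hit)[1]] else NA_character_
  }, character(1))
}

tag_genes <- annotate_tags(tags, tRNA_bed, flank = 100)
```

With a flank of 100, the tags at 950 and 5100 fall inside the widened intervals even though they lie outside the gene bodies, while the tag at 9000 stays unannotated.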
The tTEscanR object is dynamically updated to store assays
and meta.data at each analysis step. This ensures efficient tracking and
organization of inputs and outputs throughout the pipeline. To keep the
pipeline robust, specific ids have been assigned to these elements, and
users should respect them.
assays |
A list of assays. |
meta.data |
A list of meta-information associated with the assays. |
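A container with these two slots can be pictured with a minimal S4 sketch. The class name `SketchObject` and the helper `add_assay` are hypothetical and are not part of tTEscanR; the overwrite behaviour shown (refusing to replace an existing assay unless asked) is an assumption made for illustration.

```r
library(methods)

# Hypothetical container with the two slots named above
setClass("SketchObject", slots = c(assays = "list", meta.data = "list"))

# Hypothetical helper: add or replace a named assay, refusing silent overwrites
add_assay <- function(object, name, value, overwrite = FALSE) {
  if (name %in% names(object@assays) && !overwrite) {
    stop("Assay '", name, "' already exists; set overwrite = TRUE to replace it.")
  }
  object@assays[[name]] <- value
  object
}

# Each step returns an updated object, so results accumulate in one place
obj <- new("SketchObject", assays = list(), meta.data = list())
obj <- add_assay(obj, "mRNA", matrix(1:4, nrow = 2))
obj <- add_assay(obj, "tRNA", matrix(5:8, nrow = 2))
```

Keeping every intermediate result under a fixed name is what lets later steps look up their inputs without the user passing them explicitly.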
Updates an existing tTEscanR_Object using
createObject. For more details, refer to
createObject.
updateObject(
  object,
  counts = NULL,
  assay = NULL,
  main_name = NULL,
  meta.data = NULL,
  meta.data.ids = NULL,
  overwrite = FALSE,
  verbose = TRUE
)
object |
An existing |
counts |
Optional; a count matrix (or |
assay |
Optional; a |
main_name |
Optional; a |
meta.data |
Optional; a |
meta.data.ids |
Optional; a |
overwrite |
Logical; if |
verbose |
Logical; if |
An updated tTEscanR_Object.
data(default_tTEscanR_mRNA_data, default_tTEscanR_tRNA_data)
tTEscanR_obj <- createObject(
  counts = default_tTEscanR_mRNA_data,
  assay = "mRNA"
)
tTEscanR_obj <- updateObject(
  object = tTEscanR_obj,
  counts = default_tTEscanR_tRNA_data,
  assay = "tRNA"
)