Package 'tTEscanR'

Title: An advanced R-based package to quantify and visualize translation efficiency from sequencing data
Description: Translation elongation is dependent on codon-anticodon interactions, with suitable nucleotide pairing being essential for efficient translation. To quantify this relationship, we previously developed a computational pipeline (GitHub - wgao688/sc_tRNA_mRNA) that uses mRNA codon usage relative to tRNA anticodon availability as a proxy for theoretical translation efficiency (tTE). Here, we introduce tTEscanR, a powerful and user-friendly R-based package that extends this approach to quantify translation efficiency from both bulk and single-cell sequencing data. tTEscanR is a versatile tool for exploring translation efficiency in diverse cellular processes, disease mechanisms, and therapeutic development. It also features an advanced visualization module to generate high-quality plots, enhancing result interpretation and communication.
Authors: Ana Varas-Sánchez [aut, cre] (ORCID: <https://orcid.org/0009-0006-0187-8968>)
Maintainer: Ana Varas-Sánchez <[email protected]>
License: MIT + file LICENSE
Version: 0.99.0
Built: 2026-06-29 23:01:25 UTC
Source: https://github.com/BiocStaging/tTEscanR

Help Index


Compute Amino Acid (AA) Demand or Supply

Description

This function calculates the amino acid (AA) demand and/or supply from a codon and/or anticodon usage matrices of a tTEscanR_Object. It aggregates the contribution of their features based on the standard genetic code (mapping codons/anticodons to AA), The resulting values reflect total usage (demand) or availability (supply) of each amino acid, depending on the input type.

Usage

computeAAUsage(
  object,
  level,
  genetic_code = "Standard",
  overwrite = FALSE,
  verbose = TRUE
)

Arguments

object

A tTEscanR_Object containing codon and/or anticodon usage assays to be analyzed.

level

Either "demand", "supply" or "both" to indicate which analysis to perform.

genetic_code

A character string to specify the genetic code to be used. Defaults to "Standard".

overwrite

Logical; if TRUE, overwrites any existing assay in the object. Defaults to FALSE

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE

Details

In order to generalize the analysis to any organism available in Ensembl, tTEscanR can be used with the 33 different genetic codes described by NCBI Taxonomy:

“Standard”, "Vertebrate Mitochondrial”, "Yeast Mitochondrial”, ”Mold Mitochondrial; Protozoan Mitochondrial; Coelenterate Mitochondrial; Mycoplasma; Spiroplasma”, "Invertebrate Mitochondrial”, "Ciliate Nuclear; Dasycladacean Nuclear; Hexamita Nuclear”, "Echinoderm Mitochondrial; Flatworm Mitochondrial”, ”Euplotid Nuclear”, "Bacterial, Archaeal and Plant Plastid”, "Alternative Yeast Nuclear”, "Ascidian Mitochondrial”, "Alternative Flatworm Mitochondrial”, "Blepharisma Macronuclear”, "Chlorophycean Mitochondrial”, "Trematode Mitochondrial”, "Scenedesmus obliquus mitochondrial”, "Thraustochytrium mitochondrial code”, "Rhabdopleuridae Mitochondrial”, "Candidate Division SR1 and Gracilibacteria”, "Pachysolen tannophilus Nuclear”, "Karyorelict Nuclear”, "Condylostoma Nuclear”, "Mesodinium Nuclear”, "Peritrich Nuclear”, "Blastocrithidia Nuclear”, "Balanophoraceae Plastid”, "Cephalodiscidae Mitochondrial"

Value

An updated tTEscanR_Object containing a new layer of information in the assays slot representing the AA demand and/or supply.

Examples

data(default_tTEscanR_tRNA_data)
tTEscanR_obj <-createObject(
    counts = default_tTEscanR_tRNA_data,
    assay = "tRNA"
)
tTEscanR_obj <- computeAnticodonUsage(object = tTEscanR_obj)
tTEscanR_obj <- computeAAUsage(object = tTEscanR_obj, level = "supply")

Compute Anticodon Usage from tRNA Gene Expression Data

Description

This function calculates anticodon usage profiles from tRNA gene expression data stored in a tTEscanR_Object. It summarizes the expression of tRNAs by their anticodon identity, which can be used to estimate the tRNA supply landscape. The tRNA gene names need to be properly annotated for proper recognition. Expected format: tRNA-Asn-GTT-5-1.

Usage

computeAnticodonUsage(object, overwrite = FALSE, verbose = TRUE)

Arguments

object

A tTEscanR_Object containing a tRNA assay.

overwrite

Logical; if TRUE, overwrites any existing assay and metadata in the object. Defaults to FALSE.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

An updated tTEscanR_Object containing a new layer of information "AnticodonUsage" in the assays slot representing the anticodon usage.

Examples

data(default_tTEscanR_tRNA_data)
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_tRNA_data,
    assay = "tRNA"
)
tTEscanR_obj <- computeAnticodonUsage(object = tTEscanR_obj)

Compute Codon Usage from mRNA Gene Expression Data

Description

This function estimates codon usage profiles based on gene-level mRNA expression data stored in a tTEscanR_Object. It optionally accepts pre-computed codon frequency tables or uses internally generated default tables when not provided. When enabled, it can evaluate the correlation between background codon composition and observed mean codon usage. If the additional metrics are to be computed the input tTEscanR_Object needs to The default codon_freq were built using the canonical filter to select one transcript if several were available for the same gene.

Usage

computeCodonUsage(
  object,
  codon_freq = NULL,
  species = NULL,
  additional_metrics = TRUE,
  reduce = 100,
  corr_method = "spearman",
  overwrite = FALSE,
  verbose = TRUE
)

Arguments

object

A tTEscanR_Object containing a mRNA assay.

codon_freq

Optional; a user-provided codon frequency-per-gene table. If necessary, it can be computed using getCodonFreq.

species

Optional; either "hg38" (human) or "mm39" (mouse) to load the default settings. Required if codon_freq is not provided.

additional_metrics

Logical; if TRUE, computes: (i) codon exonic background, (ii) mean codon usage, and (iii) correlation between the previous metrics. Defaults to TRUE.

reduce

Numeric; a scaling factor used to normalize large expression values that exceed R's handling capacity. Defaults to 100.

corr_method

A correlation method accepted by cor. Required if additional_metrics is TRUE. Defaults to "spearman".

overwrite

Logical; if TRUE, overwrites any existing assay and metadata in the object. Defaults to FALSE.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

An updated tTEscanR_Object containing a new layer of information "CodonUsage" in the assays slot representing the codon usage. Additional computations will be stored in the meta.data slot as "CodonUsage_AdditionalMetrics".

Examples

data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_mRNA_data,
    assay = "mRNA",
    meta.data = list(default_tTEscanR_metadata, "tissue"),
    meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
)
tTEscanR_obj <- computeCodonUsage(
    object = tTEscanR_obj, species = "hg38",
    additional_metrics = FALSE, reduce = 1000
)

Compute the Correlation Between Mean Usage and Exonic Background

Description

This function calculates the correlation between observed mean usage and the exonic background. It provides a metric for evaluating how much usage is driven by underlying sequence composition versus condition-specific expression.

Usage

computeCorrelationBackground(
  mean,
  background,
  corr_method = "spearman",
  verbose = TRUE
)

Arguments

mean

A matrix containing the mean usage across conditions. Can be computed using computeMeanUsage.

background

A matrix or table containing the frequencies in exonic regions. Can be computed using computeExonicBackground.

corr_method

A correlation method accepted by cor. Defaults to "spearman".

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

Integer; correlation information between mean and background.

Examples

data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_mRNA_data,
    assay = "mRNA"
)
tTEscanR_obj <- computeCodonUsage(
    object = tTEscanR_obj, species = "hg38",
    additional_metrics = FALSE, reduce = 1000
)
codon_usage <- getAssay(tTEscanR_obj, "CodonUsage")
exonic_background <- computeExonicBackground(data = codon_usage)
# Input: expression count matrix, need to provide metadata & batch parameters
mean_codon_usage <- computeMeanUsage(
    data = codon_usage,
    mode = "raw",
    metadata = default_tTEscanR_metadata,
    batch = "tissue"
)
corr_back <- computeCorrelationBackground(
    mean = mean_codon_usage,
    background = exonic_background
)

Compute the DESeq2 Analysis

Description

Compute the DESeq2 Analysis

Usage

computeDEResults(
  list_data,
  metadata,
  target = NULL,
  batch = NULL,
  reference = NULL,
  reduce = 100,
  padj_threshold = 0.05,
  verbose = TRUE,
  compute_pairwise = TRUE
)

Arguments

list_data

A list of matrices with features (i.e. genes, codons, anticodons or amino acids) as rows and samples or conditions as columns. The list can be named or unnamed.

metadata

A data.frame with the data associated to the conditions in the matrices of list_data. There has to be one column with the same labels as the column names.

target

Optional; a factor based on metadata columns to define the comparisons to perform in a targeted analysis. Defaults to NULL.

batch

Optional; name of the categorical variable in metadata to correct the data. Required if dim_reduct specified.

reference

Optional; factor from the batch variable to use as reference for the corrections. If not specified, the 1st factor that appears will be used instead.

reduce

Numeric; a scaling factor used to normalize large expression values that exceed R's handling capacity. Defaults to 100.

padj_threshold

Numeric; p-value threshold used for highlighting significant features in the volcano plot. Defaults to 0.05.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

compute_pairwise

Logical; if TRUE, computes all the pairwise comparisons based on the conditions included in the input data.

Value

A DESeq2 object with the normalized and vst counts.

Examples

data(default_tTEscanR_tRNA_data, default_tTEscanR_metadata)
DE_analysis <- computeDEResults(
    list_data = list(tRNA = default_tTEscanR_tRNA_data),
    metadata = default_tTEscanR_metadata, batch = "tissue"
)

Compute the Exonic Background of the Codon/Anticodon Usage

Description

This function calculates the codon/anticodon usage background based solely on exonic sequence composition, independent of expression levels. It provides a reference distribution of codon/anticodon frequencies across conditions, used to normalize/compare against observed usage patterns derived from expression data.

Usage

computeExonicBackground(data)

Arguments

data

A codon usage matrix with codons as rows and conditions or samples as columns.

Value

A matrix with the codon/anticodon background.

Examples

data(default_tTEscanR_mRNA_data)
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_mRNA_data,
    assay = "mRNA"
)
tTEscanR_obj <- computeCodonUsage(
    object = tTEscanR_obj, species = "hg38",
    additional_metrics = FALSE, reduce = 1000
)
exonic_background <- computeExonicBackground(data = getAssay(
    tTEscanR_obj,
    "CodonUsage"
))

Compute Usage Across Conditions (Mean Usage)

Description

This function computes the average usage of codons, anticodons, or amino acids across conditions, useful for summarizing feature usage trends across sample groups. It supports direct input of a count matrix and extraction from a tTEscanR_Object. When the input data is a tTEscanR_Object the parameters metadata and batch will be extracted from the object, and ignored if specified as input parameters. Therefore, variables assay and metadata need to be coherent with the rules described in createObject.

Usage

computeMeanUsage(
  data,
  assay = NULL,
  metadata = NULL,
  id_col = NULL,
  batch = NULL,
  mode = c("raw", "size-corrected", "long_format"),
  verbose = TRUE
)

Arguments

data

A tTEscanR_Object or expression count matrix (with codons, anticodons or amino acids as features).

assay

Optional; a character string specifying the name of the assay to retrieve from the tTEscanR_Object.

metadata

Optional; a data.frame with the meta-information related with the conditions in data. There has to be one column with the same labels as the column names.

id_col

Optional; a factor based on metadata columns to define the variable to use to link it with the data. If NULL the column with the highest agreement will be automatically selected.

batch

Optional; a factor based on metadata columns to define the variable to correct for. Required if mode is "raw" and data is not a tTEscanR_Object.

mode

Either "raw", "size-corrected" or "long-format" to specify the format that the input data belongs to. Defaults to "raw".

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

An updated tTEscanR_Object if data is a tTEscanR_Object. A data.frame containing a new layer of information representing the mean codon usage if data is an expression count matrix.

Examples

data(default_tTEscanR_tRNA_data, default_tTEscanR_metadata)
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_tRNA_data,
    assay = "tRNA",
    meta.data = list(default_tTEscanR_metadata, "tissue"),
    meta.data.ids = list(
        "ConditionsLabels",
        "CorrectionFactor"
    )
)
# Input: tTEscanR object containing metadata and batch parameters
tTEscanR_obj <- computeAnticodonUsage(object = tTEscanR_obj)
anticodon_mean_usage <- computeMeanUsage(
    data = tTEscanR_obj,
    assay = "AnticodonUsage"
)

Compute the Theoretical Translation Efficiency (tTE) score

Description

This function calculates the theoretical translation efficiency (tTE) score by integrating codon-anticodon usage and/or amino acid demand-supply across conditions.

Usage

computeTheoreticalTE(
  object,
  level = c("codon", "aa", "both"),
  genetic_code = "Standard",
  corr_method = c("spearman", "pearson", "kendall"),
  compute_significance = TRUE,
  overwrite = FALSE,
  verbose = TRUE
)

Arguments

object

A tTEscanR_Object containing a mRNA and a codon usage assay and/or a tRNA and an anticodon usage assay.

level

Either "codon", "aa" or "both" to indicate which analysis to perform.

genetic_code

A character string to specify the genetic code to be used. Defaults to "Standard".

corr_method

A correlation method accepted by cor. Defaults to "spearman".

compute_significance

Logical; if TRUE, computes the statistical significance (p-value) of the tTE scores. Defaults to TRUE.

overwrite

Logical; if TRUE, overwrites any existing tTE table in the tTEscanR_Object. Defaults to FALSE.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

An updated tTEscanR_Object containing a new layer of information representing the translation efficiency table for the matching conditions in the mRNA and tRNA data.

Examples

data(
    default_tTEscanR_mRNA_data, default_tTEscanR_tRNA_data,
    default_tTEscanR_metadata
)
tTEscanR_obj <- createObject(
    counts = list(
        mRNA = default_tTEscanR_mRNA_data, tRNA = default_tTEscanR_tRNA_data
    ),
    meta.data = list(default_tTEscanR_metadata, "tissue"),
    meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
)
tTEscanR_obj <- computeCodonUsage(
    object = tTEscanR_obj, species = "hg38",
    additional_metrics = FALSE, reduce = 10000
)
tTEscanR_obj <- computeAnticodonUsage(object = tTEscanR_obj)
tTEscanR_obj <- computeTheoreticalTE(
    object = tTEscanR_obj, level = "codon", compute_significance = FALSE
)

Create a tTEscanR Object

Description

This function initializes a tTEscanR_Object, a structured container designed to hold translational efficiency-related data. The object consists of two main components; assays, which stores data matrices (e.g. expression, codon usage), and meta.data, which stores associated information and additional intermediate calculations. A tTEscanR_Object can be created using a single dataset, or initialized with a list of datasets provided at once. Additional assays and metadata layers can be appended later using the updateObject function.

Usage

createObject(
  counts,
  assay = NULL,
  meta.data = NULL,
  meta.data.ids = NULL,
  verbose = TRUE
)

Arguments

counts

A count matrix (or list of matrices) that will be stored in the assays slot.

assay

Optional; a character string (or list of characters) to identify the counts. Required if counts is not a named list.

meta.data

Optional; a variable (or list of variables) with additional information that will be stored in meta.data slot.

meta.data.ids

Optional; a character string (or list with the labels) to identify the meta.data. Required if meta.data is not a named list.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Details

In order to ensure robustness throughout the pipeline specific ids have been assigned and should be respected by the user. assays slot:

  • mRNA and tRNA count matrices as "mRNA" and "tRNA"

  • Codon and anticodon usage count matrices as "CodonUsage" and "AnticodonUsage"

  • Amino acid demand and supply count matrices as "AADemand" and "AASupply"

  • Size corrected count matrices contain the prefix "SizeCorrected" added to the raw count matrices names (e.g. "SizeCorrected_mRNA} or \code{"SizeCorrectedCodonUsage) meta.data slot:

  • Table with the conditions of the mRNA and tRNA data as "ConditionsLabels"

  • Active correction factor to use when running differential expression analyses as "CorrectionFactor"

  • Optional; Identifier DataMetadataIndex to indicate the column in "ConditionsLabels" that contains the labels of the conditions of the assay.

Value

A tTEscanR_Object.

Examples

data(
    default_tTEscanR_mRNA_data, default_tTEscanR_tRNA_data,
    default_tTEscanR_metadata
)
tTEscanR_obj <- createObject(
    counts = list(
        mRNA = default_tTEscanR_mRNA_data,
        tRNA = default_tTEscanR_tRNA_data
    ),
    meta.data = default_tTEscanR_metadata,
    meta.data.ids = "ConditionsLabels"
)

Metadata of mRNA and tRNA data

Description

This dataset contains the extra information to indicate the conditions of the data

Usage

data(default_tTEscanR_metadata)

Format

A data frame with samples as rows and annotation columns such as:

tissue

The sequencing batch or sample origin

cell.type

The experimental group or cell type

conditions

The combination of the previous items as will be referred in the columns of the count matrices


mRNA Expression Data Subset

Description

This dataset contains an example of mRNA gene expression data with protein-coding genes as rows and cell types as columns.

Usage

data(default_tTEscanR_mRNA_data)

Format

A matrix where rows represent genes and columns represent individual samples or cell types.


tRNA Expression Data Subset

Description

This dataset contains an example of tRNA gene expression data with tRNA genes as rows and cell types as columns.

Usage

data(default_tTEscanR_tRNA_data)

Format

A matrix or data frame where rows represent tRNA gene names and columns represent individual samples or cell types.


Extract Codon Composition of Sequences

Description

This function analyzes a given set of nucleotide sequences and computes the count of each codon present.

Usage

extractCodons(sequences, verbose = TRUE)

Arguments

sequences

A list of nucleotide sequences (character strings) from which to extract the codon composition.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

Codon frequency per gene table of the sequences.

Examples

codon_composition <- extractCodons(sequences = list(
    "ATGCGTACG",
    "TTAAGGCCG"
))

Relates Codons, Anticodons and their corresponding Amino Acids

Description

This function converts: codons and anticodons into anticodons, codons or amino acids based on the genetic code.

Usage

featuresToAA(
  data,
  position = NULL,
  genetic_code = "Standard",
  verbose = TRUE,
  notation_from = c("codon", "anticodon"),
  notation_to = c("aa", "anticodon", "codon")
)

Arguments

data

A character vector or data.frame containing gene names or features to be translated.

position

Optional; either "row", "column" or <column_name> to specify the location of the genes or features in data. Required if data_to_transalte is a data.frame.

genetic_code

A character string to specify the genetic code to be used. Defaults to "Standard".

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

notation_from

Either "codon" or "anticodon" to select the input format of the features in data_to_translate.

notation_to

Either "codon", "anticodon" or "aa" to select the output format of the features in data.

Value

Translated features (codons, anticodons or amino acids) from data_to_translate.

Examples

data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_mRNA_data,
    assay = "mRNA",
    meta.data = list(default_tTEscanR_metadata, "tissue"),
    meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
)
tTEscanR_obj <- computeCodonUsage(
    object = tTEscanR_obj, codon_freq = NULL,
    species = "hg38", additional_metrics = FALSE
)
codons <- rownames(getAssay(tTEscanR_obj, "CodonUsage"))
codons_to_AA <- featuresToAA(
    data = codons, notation_from = "codon", notation_to = "aa"
)
codons_to_anticodons <- featuresToAA(
    data = codons,
    notation_from = "codon", notation_to = "anticodon"
)

Get Assay Data from a tTEscanR Object

Description

This function safely retrieves the specified data from a tTEscanR_Object.

Usage

getAssay(object, name)

## S4 method for signature 'tTEscanR_Object'
getAssay(object, name)

Arguments

object

A tTEscanR_Object.

name

A character string specifying the name of the assay to retrieve (e.g. "mRNA", "tRNA").

Value

The requested assay data (typically a matrix or data.frame).

Examples

data(default_tTEscanR_mRNA_data)
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_mRNA_data,
    assay = "mRNA"
)
mRNA_data <- getAssay(tTEscanR_obj, "mRNA")

Compute Codon Frequency-per-Gene Table

Description

This function computes codon usage frequencies for each gene based on a provided set of gene sequences. It can optionally subset the analysis to specific transcripts and apply filtering criteria when multiple transcripts are available for the same gene. Consequently, no filter parameter will be considered if parameters transcripts and genes_file are given.

Usage

getCodonFreq(
  dataset_name = NULL,
  genes_file = NULL,
  verbose = TRUE,
  transcripts = NULL,
  retain_mitochondrial = FALSE,
  filter = c("canonical", "length"),
  retain_unannotated = FALSE,
  retain_geneversion = TRUE,
  out_format = c("external_gene_name", "ensembl_transcript_id", "ensembl_gene_id")
)

Arguments

dataset_name

A character string specifying the Ensembl species dataset name (e.g. "hsapiens_gene_ensembl").

genes_file

Optional; a path to a FASTA file.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

transcripts

Optional; a character vector of transcripts or gene IDs to subset the analysis.

retain_mitochondrial

Logical; if FALSE filters out the mitochondrial genes. Defaults to FALSE.

filter

Either "canonical" or "length" (longest transcript) to specify which transcript to choose if several are available for the same gene.

retain_unannotated

Logical; if FALSE filters out the gene names that do not have an "external_gene_name" identifier. Defaults to FALSE.

retain_geneversion

Logical; if FALSE retains the gene versions from the "ensembl_gene_id" identifier. Defaults to TRUE.

out_format

Either "external_gene_name", "ensembl_gene_id" or "ensembl_transcript_id" to specify annotation to use in the output codon frequency-per-gene table.

Value

Codon frequency-per-gene table and a translator gene annotation table (if available).


Get Metadata from a tTEscanR Object

Description

This function safely retrieves the specified metadata from a tTEscanR_Object.

Usage

getMetadata(object, name)

## S4 method for signature 'tTEscanR_Object'
getMetadata(object, name)

Arguments

object

A tTEscanR_Object.

name

Optional; A character string specifying the name of the metadata to retrieve (e.g. "ConditionsLabels", "CorrectionFactor").

Value

A data.frame or a vector depending of the name parameter.

Examples

data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_mRNA_data,
    assay = "mRNA",
    meta.data = default_tTEscanR_metadata,
    meta.data.ids = "ConditionsLabels"
)
conditions <- getMetadata(tTEscanR_obj, "ConditionsLabels")

Run Codon Usage Permutation Test (Background)

Description

This function performs a permutation test to compare a mRNA dataset to the reference codon frequency-per-gene matrix.

Usage

getPermutationDist(
  n_permut = 1000,
  n_features = 100,
  target_data = NULL,
  codon_freq = NULL,
  species = NULL,
  verbose = TRUE
)

Arguments

n_permut

Numeric; number of permutations to perform. Defaults to 1000.

n_features

Numeric; number of features to select in each permutation. Defaults to 100. If target_data is given this parameter will take its length.

target_data

Optional; a mRNA expression count matrix with features as rows and conditions as columns.

codon_freq

Optional; a user-provided codon frequency per gene table. If necessary, it can be computed using getCodonFreq.

species

Optional, a character string specifying the species reference genome version (used if codon_freq is not provided or translate is TRUE). Supported values include "hg38" (human) and "mm39" (mouse).

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

A table with the codons and their frequencies after computing all the permutations.

Examples

data(default_tTEscanR_mRNA_data)
genes <- default_tTEscanR_mRNA_data[1:20, ]
permut <- getPermutationDist(
    n_permut = 100, target_data = genes,
    species = "hg38"
)

Aggregates data by group

Description

This function calculates the row sums of a given matrix to combine columns that share the same group.

Usage

groupConditions(data, group_labels)

Arguments

data

A matrix with the features to group for as columns.

group_labels

A vector with the metadata features to group the columns in data

Value

A matrix with the conditions merged based on the metadata.

Examples

data <- data.frame(
    sample_1 = c(10, 5, 20), sample_2 = c(15, 8, 25),
    sample_3 = c(12, 6, 22), sample_4 = c(1, 2, 3),
    sample_5 = c(4, 5, 6), sample_6 = c(7, 8, 9)
)
rownames(data) <- c("gene_1", "gene_2", "gene_3")
groups <- c("cond_A", "cond_A", "cond_A", "cond_B", "cond_B", "cond_B")
data_combined <- groupConditions(data = data, group_labels = groups)

Combine large matrices

Description

This function efficiently combines individual matrices.

Usage

mergeMatrices(...)

Arguments

...

A variable number of matrix.

Value

A single sparse matrix with as a combination of all the input matrices.

Examples

df1 <- matrix(c(1, 0, 0, 2), nrow = 2, dimnames = list(
    c("geneA", "geneB"),
    c("s1", "s2")
))
df2 <- matrix(c(3, 0, 0, 4), nrow = 2, dimnames = list(
    c("geneB", "geneC"),
    c("s2", "s3")
))
merged_matrix <- mergeMatrices(df1, df2)

Assess Significance (P-value) & Corrects for Multiple Hypothesis Testing

Description

Assess Significance (P-value) & Corrects for Multiple Hypothesis Testing

Usage

obtainSignificance(dist, value, padj_threshold = 0.05, verbose = TRUE)

Arguments

dist

A table with the codons and their frequencies after completing a permutation test. Output from getPermutationDist.

value

A list of data.frame of the codon exonic background of a mRNA gene expression matrix. The codon exonic background can be computed in computeCodonUsage using the additional_metrics parameter or directly running computeExonicBackground.

padj_threshold

Numeric; p-value threshold used for highlighting significant features in the volcano plot. Defaults to 0.05.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

A table with the codon exonic background and their significance level before (p-value) and after the correction (p-adjusted value).

Examples

data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)
selected_genes <- default_tTEscanR_mRNA_data[1:20, ]
permutation_test <- getPermutationDist(
    n_permut = 100, target_data = selected_genes, species = "hg38"
)
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_mRNA_data,
    assay = "mRNA",
    meta.data = list(default_tTEscanR_metadata, "tissue"),
    meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
)
tTEscanR_obj <- computeCodonUsage(
    object = tTEscanR_obj, species = "hg38",
    additional_metrics = FALSE, reduce = 1000
)

codon_usage <- getAssay(tTEscanR_obj, "CodonUsage")
codon_background <- rowSums(codon_usage) / sum(rowSums(codon_usage))
codons_to_AA <- featuresToAA(
    data = names(codon_background),
    notation_from = "codon", notation_to = "aa"
)
codon_background <- data.frame(
    group = codons_to_AA, codon = names(codon_background),
    freq = as.numeric(codon_background), row.names = NULL
)
significance <- obtainSignificance(
    dist = permutation_test, value = codon_background
)

Correlation Plot: Exonic Codon Background - Mean Codon Usage

Description

This function generates a visualization to correlate different parameters of the data. The input data (data) is expected to be in long format and contain a minimum information regarding the features, the conditions they belong to and their usage counts. A common usage of this kind of plot is to represent the codon frequencies correlations of the exonic background and the mean codon usage across conditions. Based on the selected plot parameter the user can select the most convenient layout to display the data. It is crucial to ensure consistency between the name of the columns in data and the parameters x_axis_col, y_axis_col and condition_col.

Usage

plotCorrelation(
  data,
  plot = "MeanCodonUsage",
  x_axis_col,
  y_axis_col,
  condition_col,
  extra_val = NULL,
  label_col = NULL,
  color_palette = NULL,
  out_name = NULL,
  show_legend = "none",
  targeted_arg = NULL,
  save_format = NULL,
  out_directory = NULL,
  add_titles = TRUE,
  verbose = TRUE
)

Arguments

data

A long format table. This format can be obtained using transformFormat.

plot

Either "MeanCodonUsage" (default) or "PoolDiversity" to indicate the data source.

x_axis_col

Name of the categorical variable to reflect in the plot.

y_axis_col

Name of the numerical variable to reflect in the plot.

condition_col

Name of the categorical variable to group the data points.

extra_val

Optional; variable with additional information to include in the plot (e.g. correlation value).

label_col

Name of the categorical variable to label the data points.

color_palette

Optional; a vector of color codes to customize plot appearance.

out_name

Optional; name for the saved plot (if save_format specified).

show_legend

Either "none" (default), "top", "bottom", "right" or "left" to specify the position of the legend.

targeted_arg

Optional; a vector defining key feature clusters to highlight or label.

save_format

Optional; either "png" or "pdf" to specify the format to save the plot.

out_directory

Optional; path to the directory where the plot will be saved (if save_format specified).

add_titles

Logical; if TRUE, includes titles in the plot. Defaults to TRUE.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

A ggplot object representing the correlation. If save_format is provided, the plot will also be saved to the specified location.

Examples

data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)

# Define the object and compute the codon usage
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_mRNA_data, assay = "mRNA",
    meta.data = list(default_tTEscanR_metadata, "tissue"),
    meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
)
tTEscanR_obj <- computeCodonUsage(
    object = tTEscanR_obj, species = "hg38",
    additional_metrics = TRUE, reduce = 1000
)

# Compute and extract the mean codon usage
additional_metrics <- getMetadata(
    tTEscanR_obj,
    "CodonUsage_AdditionalMetrics"
)
mean_codon_usage <- additional_metrics$MeanCodonUsage
exonic_background <- additional_metrics$CodonExonicBackground
exonic_background <- as.data.frame(exonic_background)
correlation_mean_background <- cbind(mean_codon_usage, exonic_background)

plotCorrelation(
    data = correlation_mean_background, plot = "MeanCodonUsage",
    x_axis_col = "mean_usage_across_conditions",
    y_axis_col = "exonic_background", condition_col = "feature",
    extra_val = additional_metrics$MeanCodonCorr, add_titles = TRUE,
    show_legend = "none"
)

Generates Visualizations from the DEA data

Description

Generates Visualizations from the DEA data

Usage

plotDEResults(
  DE_results_list,
  dataset_name = NULL,
  heatmap = TRUE,
  dim_reduct = NULL,
  numPC = 2,
  target = NULL,
  verbose = TRUE,
  color_factor = NULL,
  shape_factor = NULL,
  label_factor = NULL,
  highlight_median = FALSE,
  scale_pca = TRUE,
  color_palette = NULL,
  fc_threshold = 1,
  padj_threshold = 0.05,
  label_significant = TRUE,
  show_legend = "none"
)

Arguments

DE_results_list

A DESeq2 object with the normalized and vst counts. Can be obtained by running computeDEResults

dataset_name

String to specify the assay in DE_results_list to extract.

heatmap

Logical; if TRUE, generates a heatmap for exploratory analysis. Defaults to TRUE.

dim_reduct

Either "PCA", "UMAP" or "tSNE" to specify the dimensionality reduction approach to be executed. Defaults to "PCA".

numPC

Numeric; number of principal components to include in the PCA analysis. Required if dim_reduct is "PCA". Defaults to 2.

target

Optional; a factor based on metadata columns to define the comparisons to perform in a targeted analysis. Defaults to NULL.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

color_factor

Optional; name of the categorical variable in metadata to group by color the data points. Used if dim_reduct specified.

shape_factor

Optional; name of the categorical variable to label the data points. Used if dim_reduct specified.

label_factor

Optional; name of the categorical variable to label the data points. Used if dim_reduct specified.

highlight_median

Logical; if TRUE the data points of each cluster will be summarized into the median. Defaults to FALSE.

scale_pca

Logical; if TRUE, scales the data if dim_reduct is "PCA". Defaults to TRUE.

color_palette

Optional; a vector of color codes to customize the plot appearance.

fc_threshold

Numeric; fold change threshold used for highlighting significant features in the volcano plot (if targets is specified). Defaults to 1.

padj_threshold

Numeric; p-value threshold used for highlighting significant features in the volcano plot. Defaults to 0.05.

label_significant

Logical; if TRUE displays the axis of the plots based on fc_threshold and padj_threshold. Defaults to TRUE.

show_legend

Either "none" (default), "top", "bottom", "right" or "left" to specify the position of the legend in the plot.

Value

Visualization of the DEA results.

Examples

data(default_tTEscanR_tRNA_data, default_tTEscanR_metadata)
DE_analysis <- computeDEResults(
    list_data = list(tRNA = default_tTEscanR_tRNA_data),
    metadata = default_tTEscanR_metadata, batch = "tissue"
)
DE_plots <- plotDEResults(
    DE_results_list = DE_analysis, dataset_name = "tRNA",
    dim_reduct = "PCA", color_factor = "tissue", heatmap = FALSE
)

Distribution Plot of Codon/Anticodon Usage or Amino Acid Demand/Supply

Description

This function generates a visualization of codon-anticodon usage or amino acid demand-supply distributions across conditions. The input data (data) is expected to be in long format and contain a minimum information regarding the features, the conditions they belong to and their usage counts. Based on the selected plot parameter the user can select the most convenient layout to display the data. It is crucial to ensure consistency between the name of the columns in data and the parameters x_axis_col, y_axis_col and condition_col.

Usage

plotDistribution(
  data,
  plot = "jitter",
  bar_position = "dodge",
  x_axis_col,
  y_axis_col,
  condition_col,
  color_palette = NULL,
  ncols = 1,
  facet_col = NULL,
  add_stats = FALSE,
  targeted_arg = NULL,
  save_format = NULL,
  out_name = NULL,
  out_directory = NULL,
  show_legend = "none",
  add_titles = TRUE,
  verbose = TRUE
)

Arguments

data

A long format table. This format can be obtained using transformFormat.

plot

Either "jitter" (default), "barplot", "boxplot" or "dot" to indicate the type of plot to generate.

bar_position

Either "dodge" (default), "stack" or "fill" to indicate the display of the plot if plot is "barplot".

x_axis_col

Name of the categorical variable to reflect in the plot.

y_axis_col

Name of the numerical variable to reflect in the plot.

condition_col

Name of the categorical variable to group the data points.

color_palette

Optional; a vector of color codes to customize plot appearance.

ncols

Numeric; number of columns for arranging panels. Defaults to 1.

facet_col

Optional; name of the categorical variable to divide the plot into different panels. Required if ncols bigger than 1.

add_stats

Logical; if TRUE, performs a statistical analysis based on the available parameters. Defaults to FALSE.

targeted_arg

Optional; a vector defining key feature clusters to highlight or label.

save_format

Optional; either "png" or "pdf" to specify the format to save the plot.

out_name

Optional; name for the saved plot (if save_format specified).

out_directory

Optional; path to the directory where the plot will be saved (if save_format specified).

show_legend

Either "none" (default), "top", "bottom", "right" or "left" to specify the position of the legend.

add_titles

Logical; if TRUE, includes titles in the plot. Defaults to TRUE.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

A ggplot object representing the requested distribution plot. If save_format is provided, the plot will also be saved to the specified location.

Examples

data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)

# Define the object and compute the codon usage
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_mRNA_data,
    assay = "mRNA",
    meta.data = list(default_tTEscanR_metadata, "tissue"),
    meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
)
tTEscanR_obj <- computeCodonUsage(
    object = tTEscanR_obj, species = "hg38",
    additional_metrics = FALSE, reduce = 1000
)

# Transform the data
long_cu <- transformFormat(
    data = getAssay(tTEscanR_obj, "CodonUsage"), normalize = TRUE,
    rownames_to_column = "codon", names_to = "condition", values_to = "usage"
)
long_cu <- long_cu |>
    tidyr::separate(condition, into = c("tissue", "cell_type"), sep = "-")

# Generate the plot
codon_usage_plot <- plotDistribution(
    data = long_cu, plot = "jitter", x_axis_col = "codon",
    y_axis_col = "usage", condition_col = "tissue", show_legend = "right",
    add_titles = FALSE
)

Permutation Plot

Description

This function generates a plot to compare the baseline codon exonic background against the current codon usage. For a better interpretation, the codons are colored by amino acid.

Usage

plotPermutation(
  permut_data,
  sig_data,
  color_palette = NULL,
  save_format = NULL,
  out_name = NULL,
  out_directory = NULL,
  show_legend = "none",
  add_titles = TRUE,
  verbose = TRUE
)

Arguments

permut_data

A table with the codons and their frequencies after computing all the permutations. Output from getPermutationDist.

sig_data

A table with the codon exonic background and their significance level before (p-value) and after the correction (p-adjusted value). Output from obtainSignificance

color_palette

Optional; a vector of color codes to customize plot appearance.

save_format

Optional; either "png" or "pdf" to specify the format to save the plot.

out_name

Optional; name for the saved plot (if save_format specified).

out_directory

Optional; path to the directory where the plot will be saved (if save_format specified).

show_legend

Either "none" (default), "top", "bottom", "right" or "left" to specify the position of the legend in the plot.

add_titles

Logical; if TRUE, includes titles in the plot. Defaults to TRUE.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

Permutation plot.

Examples

data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)
selected_genes <- default_tTEscanR_mRNA_data[1:20, ]
permutation_test <- getPermutationDist(
    n_permut = 100, target_data = selected_genes, species = "hg38"
) # Generate table with codon and freq
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_mRNA_data, assay = "mRNA",
    meta.data = list(default_tTEscanR_metadata, "tissue"),
    meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
)
tTEscanR_obj <- computeCodonUsage(
    object = tTEscanR_obj, species = "hg38",
    additional_metrics = FALSE, reduce = 1000
)

codon_usage <- getAssay(tTEscanR_obj, "CodonUsage")
codon_background <- rowSums(codon_usage) / sum(rowSums(codon_usage))
codons_to_AA <- featuresToAA(
    data = names(codon_background),
    notation_from = "codon", notation_to = "aa"
)
codon_background <- data.frame(
    group = codons_to_AA, codon = names(codon_background),
    freq = as.numeric(codon_background), row.names = NULL
)
significance <- obtainSignificance(
    dist = permutation_test, value = codon_background
)

plotPermutation(permut_data = permutation_test, sig_data = significance)

Proportion Plot of Codon/Anticodon Usage or Amino Acid Demand/Supply

Description

This function generates a proportion plot to compare codon-anticodon usage or amino acid demand-supply frequencies across conditions. The input data (data) is expected to be in long format and contain a minimum information regarding the features, the conditions they belong to and their usage counts. Based on the selected plot parameter the user can select the most convenient layout to display the data. It is crucial to ensure consistency between the name of the columns in data and the parameters var_numerical, var_categorical and var_color.

Usage

plotProportion(
  data,
  plot = "bar",
  var_numerical,
  var_categorical,
  var_color = NULL,
  facet_col = NULL,
  color_palette = NULL,
  num_limits = NULL,
  num_rings = 5,
  save_format = NULL,
  out_name = NULL,
  out_directory = NULL,
  zoom = FALSE,
  show_legend = "none",
  add_titles = TRUE,
  order = NULL,
  normalize = TRUE,
  verbose = TRUE
)

Arguments

data

A long format table. This format can be obtained using transformFormat.

plot

Either "bar" (default), "donut" or "radar" to indicate the type of plot to generate.

var_numerical

Name of the numerical variable to reflect in the plot.

var_categorical

Name of the categorical variable to reflect in the plot.

var_color

Optional; name of the categorical variable to group the data points when coloring. Required if plot is "bar".

facet_col

Optional; name of the categorical variable to divide the plot into different panels.

color_palette

Optional; a vector of color codes to customize plot appearance.

num_limits

Optional; a vector with the upper and lower ranges of the values in var_numerical.

num_rings

Optional; a number specifying the amount of rings to display if plot is "radar". Defaults to 5.

save_format

Optional; either "png" or "pdf" to specify the format to save the plot.

out_name

Optional; name for the saved plot (if save_format specified).

out_directory

Optional; path to the directory where the plot will be saved (if save_format specified).

zoom

Logical; if TRUE, centers the plot display to the values in data.Defaults to TRUE.

show_legend

Either "none" (default), "top", "bottom", "right" or "left" to specify the position of the legend.

add_titles

Logical; if TRUE, includes titles in the plot. Defaults to TRUE.

order

Optional; a vector of the levels to organize the data, based on the var_categorical.

normalize

Logical; if TRUE, normalizes the data in order to display the relative contribution of the var_categorical. Defaults to TRUE.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

A ggplot object representing the requested proportion plot. If save_format is provided, the plot will also be saved to the specified location.

Examples

data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)

# Define the object and compute the codon usage
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_mRNA_data, assay = "mRNA",
    meta.data = list(default_tTEscanR_metadata, "tissue"),
    meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
)
tTEscanR_obj <- computeCodonUsage(
    object = tTEscanR_obj, species = "hg38",
    additional_metrics = TRUE, reduce = 1000
)

# Compute and extract the mean codon usage
additional_metrics <- getMetadata(
    tTEscanR_obj, "CodonUsage_AdditionalMetrics"
)
mean_codon_usage <- additional_metrics$MeanCodonUsage
mean_codon_usage$codon <- mean_codon_usage$feature

# Translate the codons to amino acids
mean_codon_usage <- featuresToAA(
    data = mean_codon_usage, position = "feature",
    notation_from = "codon", notation_to = "aa", verbose = FALSE
)

# Generate the plot
plotProportion(
    data = mean_codon_usage, plot = "bar",
    var_numerical = "mean_usage_across_conditions",
    var_categorical = "codon", var_color = "feature", show_legend = "none"
)

Distribution Plot of Codon/Anticodon Usage or Amino Acid Demand/Supply Compared to a Target

Description

This function generates a distribution plot to compare distribution usages between a targeted set of conditions against the rest. A common application of this plot is to compare mean usages across conditions. Both input data sources (target_data and overall_data) are expected to be in long format and contain a minimum information regarding the features, the conditions they belong to and their usage counts. It is crucial to ensure consistency between the name of the columns in target_data and overall_data, together with the parameters x_axis_col and y_axis_col.

Usage

plotTargetComparison(
  target_data,
  overall_data,
  x_axis_col,
  y_axis_col,
  color_palette = NULL,
  show_difference = TRUE,
  save_format = NULL,
  out_name = NULL,
  add_titles = TRUE,
  out_directory = NULL,
  show_legend = "none",
  verbose = TRUE
)

Arguments

target_data

A long format table of a condition of interest. This format can be obtained using transformFormat.

overall_data

A long format table of the whole set of conditions (with or without the data in target_data. This format can be obtained using transformFormat.

x_axis_col

Name of the categorical variable to reflect in the plot.

y_axis_col

Name of the numerical variable to reflect in the plot.

color_palette

Optional; a vector of color codes (min. 2 codes and max. 3 codes) to customize plot appearance. Colors for (i) data, (ii) target value, and (iii) difference bar.

show_difference

Logical; if TRUE, displays the difference between overall_data and target_data.

save_format

Optional; either "png" or "pdf" to specify the format to save the plot.

out_name

Optional; name for the saved plot (if save_format specified).

add_titles

Logical; if TRUE, includes titles in the plot. Defaults to TRUE.

out_directory

Optional; path to the directory where the plot will be saved (if save_format specified).

show_legend

Either "none" (default), "top", "bottom", "right" or "left" to specify the position of the legend.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

A ggplot object representing the requested distribution plot. If save_format is provided, the plot will also be saved to the specified location.


Violin Plot Displaying tTE Scores

Description

This function generates a violin plot to visualize the distribution of tTE scores across different condition. The tTE scores should be obtained using computeTheoreticalTE. The plot allows the user to easily compare the tTE distribution between different conditions, with the option to highlight specific feature clusters based on the provided targets.

Usage

plotTEscore(
  data,
  metadata,
  class_col,
  index_col,
  target_col = NULL,
  score_col = "tTE",
  cond_col = "condition",
  pval_col = "p_value",
  facet_col = NULL,
  color_palette = NULL,
  save_format = NULL,
  out_name = NULL,
  add_stats = TRUE,
  out_directory = NULL,
  verbose = TRUE,
  show_legend = "none",
  add_titles = TRUE,
  show_outliers = FALSE
)

Arguments

data

A tTE results table obtained from computeTheoreticalTE.

metadata

A table with additional information regarding the conditions in data.

class_col

Name of the categorical variable to reflect in the plot.

index_col

Name of the categorical variable that links the conditions in data with the metadata.

target_col

Optional; name of the categorical variable to perform the statistical comparison (the most specific level). Used if add_stats is TRUE.

score_col

Name of the numerical variable that contains the tTE scores in the data. Defaults to "tTE".

cond_col

Name of the categorical variable that contains the conditions in the data. Defaults to "condition".

pval_col

Name of the numerical variable that contains the significance scores in data. Defaults to "p_value".

facet_col

Optional; name of the categorical variable separate the plot into different panels.

color_palette

Optional; a vector of color codes to customize plot appearance.

save_format

Optional; either "png" or "pdf" to specify the format to save the plot.

out_name

Optional; name for the saved plot (if save_format specified).

add_stats

Logical; if TRUE, performs a statistical analysis based on the available parameters. Defaults to TRUE.

out_directory

Optional; path to the directory where the plot will be saved (if save_format specified).

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

show_legend

Either "none" (default), "top", "bottom", "right" or "left" to specify the position of the legend in the plot.

add_titles

Logical; if TRUE, includes titles in the plot. Defaults to TRUE.

show_outliers

Logical; if TRUE, labels the outlier data points based on index_col.

Value

A ggplot object representing the tTE scores. If save_format is provided, the plot will also be saved to the specified location. If add_stats reports a table with the statitical measures summarized.

Examples

data(
    default_tTEscanR_mRNA_data, default_tTEscanR_tRNA_data,
    default_tTEscanR_metadata
)

# Define the tTEscanR object
tTEscanR_obj <- createObject(
    counts = list(
        mRNA = default_tTEscanR_mRNA_data, tRNA = default_tTEscanR_tRNA_data
    ),
    meta.data = list(default_tTEscanR_metadata, "tissue"),
    meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
)

# Compute the codon and anticodon usage
tTEscanR_obj <- computeCodonUsage(
    object = tTEscanR_obj, species = "hg38",
    additional_metrics = FALSE, reduce = 10000
)
tTEscanR_obj <- computeAnticodonUsage(object = tTEscanR_obj)

# Compute the theoretical translation efficiency (tTE scores)
tTEscanR_obj <- computeTheoreticalTE(
    object = tTEscanR_obj, level = "codon", compute_significance = TRUE
)
tTEresults_codon <- getMetadata(tTEscanR_obj, "tTEresults_codon")
conditions_metadata <- getMetadata(tTEscanR_obj, "ConditionsLabels")

# Visualize the tTE scores
plotTEscore(
    data = tTEresults_codon, metadata = conditions_metadata,
    index_col = "conditions", class_col = "tissue", add_stats = TRUE
)

Perform Differential Expression Analysis Using DESeq2

Description

This function applies differential expression analysis using the DESeq2 framework on a matrix of expression values. It supports both exploratory visualizations (heatmap and PCA) and targeted comparisons using a custom contrast table.

Usage

runDEAnalysis(
  list_data,
  metadata,
  batch = NULL,
  reference = NULL,
  reduce = 100,
  dim_reduct = NULL,
  color_factor = batch,
  heatmap = TRUE,
  shape_factor = NULL,
  label_factor = NULL,
  target = NULL,
  highlight_median = FALSE,
  numPC = 2,
  color_palette = NULL,
  fc_threshold = 1,
  show_legend = "none",
  padj_threshold = 0.05,
  label_significant = TRUE,
  compute_pairwise = TRUE,
  verbose = TRUE
)

Arguments

list_data

A list of matrices with features (i.e. genes, codons, anticodons or amino acids) as rows and samples or conditions as columns. The list can be named or unnamed.

metadata

A data.frame with the data associated to the conditions in the matrices of list_data. There has to be one column with the same labels as the column names.

batch

Optional; name of the categorical variable in metadata to correct the data.

reference

Optional; factor from the batch variable to use as reference for the corrections. If not specified, the 1st factor that appears will be used instead.

reduce

Numeric; a scaling factor used to normalize large expression values that exceed R's handling capacity. Defaults to 100.

dim_reduct

Either "PCA", "UMAP" or "tSNE" to specify the dimensionality reduction approach to be executed. Defaults to "PCA".

color_factor

Optional; name of the categorical variable in metadata to group by color the data points. Used if dim_reduct specified.

heatmap

Logical; if TRUE, generates a heatmap for exploratory analysis. Defaults to TRUE.

shape_factor

Optional; name of the categorical variable in metadata to group by shape the data points. Used if dim_reduct specified.

label_factor

Optional; name of the categorical variable to label the data points. Used if dim_reduct specified.

target

Optional; a factor based on metadata columns to define the comparisons to perform in a targeted analysis. Defaults to NULL.

highlight_median

Logical; if TRUE the data points of each cluster will be summarized into the median. Defaults to FALSE.

numPC

Numeric; number of principal components to include in the PCA analysis. Required if dim_reduct is "PCA". Defaults to 2.

color_palette

Optional; a vector of color codes to customize the plot appearance.

fc_threshold

Numeric; fold change threshold used for highlighting significant features in the volcano plot. Defaults to 1.

show_legend

Either "none" (default), "top", "bottom", "right" or "left" to specify the position of the legend in the plot.

padj_threshold

Numeric; p-value threshold used for highlighting significant features in the volcano plot. Defaults to 0.05.

label_significant

Logical; if TRUE displays the axis of the plots based on fc_threshold and padj_threshold. Defaults to TRUE.

compute_pairwise

Logical; if TRUE, computes all the pairwise comparisons based on the conditions included in the input data.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

A list of outputs per each matrix in list_data, based on the enabled parameters: (i) exploratory plots (heatmap and/or PCA), (ii) targeted plot (volcano), and (iii) size-corrected data.

Examples

data(default_tTEscanR_tRNA_data, default_tTEscanR_metadata)
DE_analysis <- runDEAnalysis(
    list_data = list(tRNA = default_tTEscanR_tRNA_data),
    metadata = default_tTEscanR_metadata, batch = "tissue",
    color_factor = "tissue"
)

Runs the Theoretical Translation Efficiency (tTE) Pipeline

Description

This function wraps up all the independent functions of theoretical translation efficiency pipeline. Requires an mRNA and tRNA count matrices to compute the codon and anticodon usage and further derive the amino acid demand and supply. With matching condition in the former matrices the theoretical translation efficiency would be computed.

Usage

runPipeline(
  mRNA_data,
  tRNA_data,
  metadata,
  batch,
  corr_method = "spearman",
  additional_metrics = TRUE,
  runDESeq = TRUE,
  compute_significance = TRUE,
  codon_freq = NULL,
  species = NULL,
  genetic_code = "Standard",
  dim_reduct = NULL,
  reduce = 100,
  color_factor = NULL,
  verbose = TRUE
)

Arguments

mRNA_data

A count matrix of mRNA genes (rows) per conditions (columns).

tRNA_data

A count matrix of tRNA genes (rows) per conditions (columns).

metadata

A data.frame with the meta-information related with the conditions in mRNA_data or tRNA_data. Just the intersecting conditions will be considered.

batch

A factor based on metadata columns to define the variable to correct for when size correcting the count matrices.

corr_method

A correlation method accepted by cor. Defaults to "spearman".

additional_metrics

Logical; if TRUE, computes: (i) codon exonic background, (ii) mean codon usage, and (iii) correlation between the previous metrics. Defaults to TRUE.

runDESeq

Logical; if TRUE, performs differential expression analysis to each assay in the tTEscanR_Object. Defaults to TRUE

compute_significance

Logical; if TRUE, computes the statistical significance (p-value) of the tTE scores. Defaults to TRUE.

codon_freq

Optional; a user-provided codon frequency-per-gene table. If necessary, it can be computed using getCodonFreq.

species

Either "hg38" (human) or "mm39" (mouse) to specify which default codon frequency-per-gene table to use. Required if codon_freq not provided or if the gene annotation is inconsistent between both inputs.

genetic_code

A character string to specify the genetic code to be used. Defaults to "Standard".

dim_reduct

Either "PCA", "UMAP" or "tSNE" to specify the dimensionality reduction approach to be executed if runDESeq is TRUE. Defaults to "PCA".

reduce

Numeric; a scaling factor used to normalize large expression values that exceed R's handling capacity. Defaults to 100.

color_factor

A factor based on metadata columns to define the colors in the plots. Required if runDESeq is TRUE.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

A tTEscanR_Object with the assays and metadata computed through the tTE pipeline.

Examples

data(
    default_tTEscanR_mRNA_data, default_tTEscanR_tRNA_data,
    default_tTEscanR_metadata
)
tTEscanR_obj <- runPipeline(
    mRNA_data = default_tTEscanR_mRNA_data,
    tRNA_data = default_tTEscanR_tRNA_data,
    metadata = default_tTEscanR_metadata,
    species = "hg38", batch = "tissue", additional_metrics = FALSE,
    compute_significance = FALSE, runDESeq = FALSE
)

Examine the Codon Pool Contribution

Description

This function analyzes the contribution of the most highly expressed genes to the overall codon pool across conditions. It is particularly useful for evaluating codon bias in highly expressed genes and how it varies across conditions. If needed, gene annotations can be translated for consistency, and internal species-specific for human ("hg38") and mouse ("mm39") are supported.

Usage

showPoolContribution(
  object,
  codon_freq = NULL,
  species = NULL,
  N = 10,
  corr_method = "spearman",
  overwrite = FALSE,
  verbose = TRUE
)

Arguments

object

A tTEscanR_Object containing a mRNA and CodonUsage assay.

codon_freq

Optional; a user-provided codon frequency per gene table. If necessary, it can be computed using getCodonFreq.

species

A character string specifying the species reference genome version (used if codon_freq is not provided or translate is TRUE). Supported values include "hg38" (human) and "mm39" (mouse).

N

Numeric; number of top genes to consider in the codon pool contribution. Defaults to 10.

corr_method

A correlation method accepted by cor. Defaults to "spearman".

overwrite

Logical; if TRUE, overwrites any existing anticodon usage assay and metadata in the tTEscanR_Object. Defaults to FALSE.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

An updated tTEscanR_Object containing new layers of information in the meta.data slot, representing the codon pool contribution.

Examples

data(default_tTEscanR_mRNA_data, default_tTEscanR_metadata)
tTEscanR_obj <- createObject(
    counts = list(mRNA = default_tTEscanR_mRNA_data),
    meta.data = list(default_tTEscanR_metadata, "tissue"),
    meta.data.ids = list("ConditionsLabels", "CorrectionFactor")
)
tTEscanR_obj <- computeCodonUsage(
    object = tTEscanR_obj, species = "hg38", additional_metrics = FALSE
)
tTEscanR_obj <- showPoolContribution(
    object = tTEscanR_obj, species = "hg38"
)

Transform the format of a table

Description

This function converts a matrix or data.frame into a tidy, long-format tibble. Optionally normalizes the values.

Usage

transformFormat(data, normalize, rownames_to_column, names_to, values_to)

Arguments

data

A table to be converted. Supported formats: matrix or data.frame.

normalize

Logical; if TRUE, values are converted to relative frequencies. Defaults to FALSE.

rownames_to_column

A character string specifying the name of the new column that will hold the former row names in data.

names_to

A character string specifying the name of the new column that will hold the former column names in data.

values_to

A character string specifying the name of the new column that will hold the corresponding values from the pivoted columns in data.

Value

A tibble of the input data

Examples

data(default_tTEscanR_tRNA_data)
tRNA_long_format <- transformFormat(
    data = default_tTEscanR_tRNA_data,
    normalize = FALSE,
    rownames_to_column = "tRNA_genes",
    names_to = "condition",
    values_to = "abundance"
)
tTEobj <- createObject(
    counts = default_tTEscanR_tRNA_data,
    assay = "tRNA"
)
tTEobj <- computeAnticodonUsage(object = tTEobj)
anticodon_long_format <- transformFormat(
    data = getAssay(tTEobj, "AnticodonUsage"),
    normalize = TRUE,
    rownames_to_column = "anticodons",
    names_to = "condition", values_to = "usage"
)

Filter Out Conditions With Low tRNA Cuts

Description

This function filters a tRNA expression matrix by removing conditions (columns) that fall below a specific total read count (cutoff). It is useful for eliminating low-quality or poorly sequenced conditions that may bias downstream analyses.

Usage

tRNAFilterCuts(data, cutoff = 5000, verbose = TRUE)

Arguments

data

A matrix or data.frame of tRNA gene expression data, with tRNA genes as rows and conditions as columns.

cutoff

Numeric; minimum total number of tRNA cuts required to retain a condition in data. Defaults to 5000.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

A filtered matrix or data.frame with tRNAs below the cutoff removed.

Examples

data(default_tTEscanR_tRNA_data)
tRNA_data_filtered <- tRNAFilterCuts(
    data = default_tTEscanR_tRNA_data,
    cutoff = 5000
)

Generate a tRNA expression matrix

Description

Generate a tRNA expression matrix

Usage

tRNAGetMatrix(
  data,
  assay = "peaks",
  confidence_set = NULL,
  tRNA_name_map = NULL,
  species = NULL,
  flanking_region = 100,
  name_sep = c("-", "-"),
  save = TRUE,
  out_name = NULL,
  out_directory = NULL,
  verbose = TRUE
)

Arguments

data

SummarizedExperiment, ChromatinAssay, or SeuratObject.

assay

Optional; a character string specifying the name of the assay to retrieve from data if it is a SeuratObject. Defaults to "peaks".

confidence_set

Either a file path to the tRNA annotations (confidence set file from gtRNAdb), or a GRanges object. Contains the set of high confidence tRNA genes

tRNA_name_map

Optional; a data.frame with the tRNA gene names linked to confidence_set. Contains two columns: tRNAscan-SE and GtRNAdb gene ids.

species

Optional; either "hg38" (human) or "mm39" (mouse) to load the default confidence_set

flanking_region

Integer; number of nucleotides that form the flanking region of each tRNA. Defaults to 100.

name_sep

A string delimiter to format the tRNA gene names in the output matrix. Defaults to c("-", "-").

save

Logical; if TRUE stores the generated tRNA matrix into a file.

out_name

Optional; name for the saved plot (if save specified).

out_directory

Optional; path to the directory where the plot will be saved (if save specified).

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

Sparse matrix of tRNA counts (tRNAs x cells)


Selection of the Optimal tRNA Cut Cutoff

Description

Selection of the Optimal tRNA Cut Cutoff

Usage

tRNASetCutoff(
  data,
  num_iter = 1000,
  cutoffs_limits = c(50, 10000),
  generate_plot = TRUE,
  slope_threshold = 0.001,
  rho_threshold = 0.95,
  compute_aa = FALSE,
  verbose = TRUE
)

Arguments

data

tRNA gene expression count matrix with tRNA genes as rows and conditions as columns.

num_iter

Numeric; value to select the number of iterations to perform in order to determine the optimal cutoff. Defaults to 1000.

cutoffs_limits

Minimum and maximum values to test to search for the optimal tRNA cuts threshold. Defaults to c(50, 10000).

generate_plot

Logic; if TRUE, generates a correlation plot. Defaults to TRUE.

slope_threshold

Numeric; value to consider for the determination of the correlation stability. Defaults to 0.001.

rho_threshold

Numeric; value to consider for the determination of the correlation strength. Defaults to 0.95.

compute_aa

Logic; if TRUE, computes the amino acid supply, otherwise only considers the anticodon usage. Defaults to FALSE.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

Table with the optimal cutoff at the anticodon isoacceptor and amino acid isotype.

Examples

data(default_tTEscanR_tRNA_data)
optimal_tRNA_cutoffs <- tRNASetCutoff(
    data = default_tTEscanR_tRNA_data,
    generate_plot = FALSE, num_iter = 50,
    cutoffs_limits = c(3500, 4000)
)

Annotate the tRNA genes from tRNA tags

Description

Annotate the tRNA genes from tRNA tags

Usage

tRNASetGenes(data, tRNA_bed, flanking_region = 100, name_sep = c("-", "-"))

Arguments

data

A matrix with tRNA tags as rows.

tRNA_bed

Path to the directory that contains the .bed file

flanking_region

Numeric; number of bases to include expand the region interrogated. Defaults to 100.

name_sep

A string delimiter to format the tRNA gene names in the output matrix. Defaults to c("-", "-").

Value

A data with the translated tRNA gene names.


The tTEscanR Class

Description

The tTEscanR object is dynamically updated to store assays and meta.data at each analysis step. Ensures efficient tracking and organization of inputs and outputs throughout the pipeline. In order to ensure robustness throughout the pipeline, specific ids have been assigned and should be respected by the user.

Slots

assays

A list of assays.

meta.data

A list of meta-information associated with the assays.


tTEscanR Object Update

Description

Updates an existing tTEscanR_object using createObject. For more details, refer to createObject.

Usage

updateObject(
  object,
  counts = NULL,
  assay = NULL,
  main_name = NULL,
  meta.data = NULL,
  meta.data.ids = NULL,
  overwrite = FALSE,
  verbose = TRUE
)

Arguments

object

An existing tTEscanR_Object.

counts

Optional; a count matrix (or list of matrices) that will be stored in the assays slot of the input object. Supported formats: matrix, data.frame and list.

assay

Optional; a character string (or list of strings) specifying the name of the counts. Supported formats: character and list.

main_name

Optional; a character string specifying the name of the meta.data if dealing with a list that needs to be added as a single element.

meta.data

Optional; a list with additional information that will be stored in meta.data slot of the input object.

meta.data.ids

Optional; a list with the labels to identify the meta.data (if meta.data is given).

overwrite

Logical; if TRUE, overwrites any existing assay or metadata in the object if the assay or meta.data.ids label coincides. Defaults to FALSE.

verbose

Logical; if TRUE, displays information messages. Defaults to TRUE.

Value

An updated tTEscanR_Object.

Examples

data(default_tTEscanR_mRNA_data, default_tTEscanR_tRNA_data)
tTEscanR_obj <- createObject(
    counts = default_tTEscanR_mRNA_data,
    assay = "mRNA"
)
tTEscanR_obj <- updateObject(
    object = tTEscanR_obj,
    counts = default_tTEscanR_tRNA_data,
    assay = "tRNA"
)