tTEscanR User Guide


Overview

tTEscanR is an R package designed to quantify and analyze the relationship between codon usage in mRNA and the availability of the corresponding anticodons in tRNA. The package computes a theoretical translation efficiency (tTE) score as a proxy for translation elongation efficiency, hereafter referred to as translation efficiency.

In this document, we present a case example to demonstrate the potential of tTEscanR.

# install.packages("/avarassanchez/tTEscanR")
library(tTEscanR)
library(dplyr)

Workflow

tTEscanR features a modular structure that enables running specific components independently or as part of a comprehensive pipeline. This design provides flexibility to enhance and complement the analysis of codon-anticodon dynamics across diverse biological contexts.

1. Loading the data

tTEscanR supports both gene expression and chromatin accessibility profiling data. The accepted mRNA and tRNA inputs are pre-processed gene expression count matrices, with features (e.g. genes or transcripts) as rows and conditions (e.g. samples, replicates, or individual cells) as columns. The package is optimized for bulk and single-cell datasets. Each dataset should be loaded with a reader appropriate to its file format.
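As a rough illustration of the expected layout, the sketch below builds a small count matrix by hand and runs a few sanity checks on it. The values, the gene and condition names, and the helper check_input_matrix() are all made up for this example and are not part of tTEscanR.

```r
# Toy count matrix: features as rows, conditions as columns (values are made up)
toy_counts <- matrix(
    c(10, 0, 3,
      5, 2, 8),
    nrow = 2, byrow = TRUE,
    dimnames = list(
        c("geneA", "geneB"),
        c("Tissue1-Cell type A", "Tissue1-Cell type B", "Tissue2-Cell type A")
    )
)

# Hypothetical helper: basic checks one might run before building the object
check_input_matrix <- function(counts) {
    is.matrix(counts) &&
        is.numeric(counts) &&
        !is.null(rownames(counts)) &&
        !is.null(colnames(counts)) &&
        anyDuplicated(colnames(counts)) == 0 &&
        all(counts >= 0)
}

check_input_matrix(toy_counts)
```

Checks of this kind are cheap and catch transposed or unnamed matrices before they reach the pipeline.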

In this tutorial, we analyze a single-cell fetal human atlas described in (Cao et al. 2020) and (Domcke et al. 2020), and previously examined by (Gao et al. 2022). This dataset and a subset of it are included in tTEscanR and can be loaded directly.

data(default_tTEscanR_mRNA_data)

Dimensions: 9900 genes (rows) x 172 cell types (columns)

Rows: The genes are expressed in the gene name format (e.g. GATSL1)

Columns: The cell type labels are composed of two parts: tissue - cell type (e.g. Adrenal-Adrenocortical cells)

data(default_tTEscanR_tRNA_data)

Dimensions: 377 tRNA genes (rows) x 89 cell types (columns)

Rows: The tRNA gene labels follow the pattern tRNA - Amino acid - Anticodon - Identifier number (e.g. tRNA-Asn-GTT-5-1)

Columns: The cell type labels have the same format as described for the mRNA data
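The tRNA label convention described above can be unpacked with base R. The following is a minimal sketch; parse_trna_labels() is a hypothetical helper written for this example, not a tTEscanR function, and the second label is included only as an additional toy input.

```r
# Hypothetical helper (not part of tTEscanR): split labels of the form
# tRNA-<amino acid>-<anticodon>-<identifier>
parse_trna_labels <- function(labels) {
    parts <- strsplit(labels, "-", fixed = TRUE)
    data.frame(
        label = labels,
        amino_acid = vapply(parts, `[`, character(1), 2),
        anticodon = vapply(parts, `[`, character(1), 3),
        stringsAsFactors = FALSE
    )
}

parsed <- parse_trna_labels(c("tRNA-Asn-GTT-5-1", "tRNA-Ala-AGC-1-1"))
parsed
```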

2. Set up the tTEscanR object

2.1. Pre-processing

The pre-processing module formats and standardizes input matrices to ensure they are structured correctly for reliable analysis through the pipeline.

The tRNAFilterCuts() function filters out samples or conditions with low total tRNA expression, helping to ensure overall data quality.

filtered_tRNA_data <- tRNAFilterCuts(
    data = default_tTEscanR_tRNA_data, cutoff = 5000
)
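To make the filtering idea concrete, here is a minimal base R sketch. It assumes that the cutoff is compared against the column sums of the tRNA matrix and that conditions at or above the cutoff are kept; the actual rule used by the package may differ. filter_low_total() and the toy values are hypothetical.

```r
# Hypothetical stand-in for the filtering idea; not the tTEscanR implementation
filter_low_total <- function(counts, cutoff) {
    totals <- colSums(counts)
    # Assumption: keep conditions whose total tRNA count reaches the cutoff
    counts[, totals >= cutoff, drop = FALSE]
}

toy_tRNA <- matrix(
    c(4000, 100, 3000,
      2000, 200, 1500),
    nrow = 2, byrow = TRUE,
    dimnames = list(
        c("tRNA-Asn-GTT-5-1", "tRNA-Ala-AGC-1-1"),
        c("cond1", "cond2", "cond3")
    )
)

kept <- filter_low_total(toy_tRNA, cutoff = 5000)
colnames(kept)
```

Using drop = FALSE keeps the result a matrix even when a single condition survives the filter.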

2.2. Defining the tTEscanR object

The tTEscanR object is a centralized data structure that stores input matrices, metadata, and results, and is updated at each step to keep the pipeline consistent. To keep the pipeline robust, specific ids are reserved for certain slots and should be respected by the user (see the documentation for more details).

The createObject() function initializes a new tTEscanR object to store and organize analysis data. The input can be either a single matrix or a list of matrices (to support multiple datasets), and may optionally include metadata. For proper functionality, all input matrices must be appropriately named. To modify, extend, or update an existing tTEscanR object with new data or metadata, use updateObject().

data(default_tTEscanR_metadata)
# Adding the mRNA and tRNA datasets to the object
tTEobject <- createObject(
    counts = list(default_tTEscanR_mRNA_data, filtered_tRNA_data),
    assay = list("mRNA", "tRNA"),
    meta.data = default_tTEscanR_metadata, meta.data.ids = "ConditionsLabels"
)
# Updating the object created above with some metadata for reference
matching_celltypes <- intersect(
    colnames(default_tTEscanR_mRNA_data), colnames(filtered_tRNA_data)
)

tTEobject <- updateObject(
    object = tTEobject, meta.data = matching_celltypes,
    meta.data.ids = "matching_celltypes", overwrite = TRUE
)

Each component of a tTEscanR object can be accessed with getAssay() or getMetadata(), each of which takes the object and the name of the slot to retrieve.

3. Standard workflow

The analysis can be carried out across three hierarchical layers of information: gene expression, codon and anticodon pool, and amino acid level. This multi-layered approach provides a comprehensive view of translation efficiency.

3.1. Codon usage assessment

Codon usage is computed by performing a matrix multiplication between the mRNA expression data and a codon frequency-per-gene reference matrix. This reference matrix can be generated using obtainCodonComposition() or alternatively, a user-defined codon frequency matrix can be supplied directly, providing flexibility for custom analyses.

The reference codon frequency-per-gene matrix represents the codon distribution of each protein-coding gene in a reference genome.

For more details, please refer to the dedicated codon frequency vignette.

The computeCodonUsage() function calculates codon usage by multiplying an mRNA expression matrix with a codon frequency-per-gene table. The resulting matrix contains codons as rows and samples or conditions as columns.

The codon frequency table can either be: (i) provided directly (e.g. computed previously using obtainCodonComposition()), or (ii) loaded from the built-in defaults available for human and mouse.
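The matrix multiplication described above can be sketched with toy data. This example assumes the reference matrix has genes as rows and codons as columns, so it is transposed before multiplying; the orientation used internally by the package is not stated here, and all names and values are made up.

```r
# Toy codon frequency-per-gene matrix: genes x codons (counts per gene)
codon_freq <- matrix(
    c(2, 1, 0,
      0, 3, 1),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("geneA", "geneB"), c("AAA", "GCC", "TTT"))
)

# Toy mRNA expression matrix: genes x conditions
expr <- matrix(
    c(10, 1,
      0, 4),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("geneA", "geneB"), c("cond1", "cond2"))
)

# Align on shared genes, then multiply: (codons x genes) %*% (genes x conditions)
shared <- intersect(rownames(codon_freq), rownames(expr))
codon_usage <- t(codon_freq[shared, , drop = FALSE]) %*%
    expr[shared, , drop = FALSE]

codon_usage # codons as rows, conditions as columns
```

Each entry is the sum over genes of codon count times expression, so highly expressed genes weigh more heavily in the codon pool.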

In addition to generating the codon usage matrix, computeCodonUsage() can optionally compute the following:

  • Codon exonic background: genome-wide codon composition calculated across all genes.

  • Mean codon usage: average codon usage across all conditions or samples.

  • Exonic background and mean usage correlation: metric used to assess bias in codon usage relative to the underlying genomic codon composition.

# We first need to add the correction factor to the tTEscanR object
# It has to be stored as CorrectionFactor
tTEobject <- updateObject(
    object = tTEobject, meta.data = "tissue", meta.data.ids = "CorrectionFactor"
)

tTEobject <- computeCodonUsage(
    object = tTEobject, codon_freq = NULL, species = "hg38",
    additional_metrics = TRUE, overwrite = TRUE
)
# Transforming the data
additional_metrics <- getMetadata(tTEobject, "CodonUsage_AdditionalMetrics")
mean_codon_usage <- additional_metrics$MeanCodonUsage
exonic_background <- additional_metrics$CodonExonicBackground
exonic_background <- as.data.frame(exonic_background)
correlation_mean_background <- cbind(mean_codon_usage, exonic_background)

plotCorrelation(
    data = correlation_mean_background, plot = "MeanCodonUsage",
    x_axis_col = "mean_usage_across_conditions",
    y_axis_col = "exonic_background",
    extra_val = additional_metrics$MeanCodonCorr,
    condition_col = "feature", # Here feature = codons
    add_titles = TRUE, show_legend = "none"
)
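The additional metrics can be approximated on toy data as follows. This is a sketch under stated assumptions: the background is taken as each codon's share of all codons in the reference matrix, usage is converted to per-condition fractions before averaging, and Spearman correlation is used. None of these choices is confirmed by the text above, and all names and values are made up.

```r
# Toy reference matrix (genes x codons) and expression matrix (genes x conditions)
codon_freq <- matrix(
    c(2, 1, 0, 3,
      0, 3, 1, 1),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("geneA", "geneB"), c("AAA", "GCC", "TTT", "GGA"))
)
expr <- matrix(
    c(10, 1,
      0, 4),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("geneA", "geneB"), c("cond1", "cond2"))
)
codon_usage <- t(codon_freq) %*% expr

# Codon exonic background: share of each codon across all genes (assumption)
exonic_background <- colSums(codon_freq) / sum(codon_freq)

# Mean codon usage: average of per-condition codon fractions (assumption)
usage_fraction <- sweep(codon_usage, 2, colSums(codon_usage), "/")
mean_codon_usage <- rowMeans(usage_fraction)

# Correlation between mean usage and background (method is an assumption)
mean_background_corr <- cor(
    mean_codon_usage, exonic_background, method = "spearman"
)
```

A correlation close to 1 would suggest that the expressed codon pool largely mirrors the genomic codon composition.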

You can further evaluate the codon usage output using showPoolContribution(), which quantifies the contribution of the most highly expressed genes to the overall codon pool across different conditions. This analysis helps identify whether codon usage is dominated by a small subset of highly expressed transcripts or is broadly distributed across the transcriptome.

tTEobject <- showPoolContribution(
    object = tTEobject, N = 10, species = "hg38", overwrite = TRUE
)
# Transforming the data
codon_pool_contr <- getMetadata(tTEobject, "CodonPoolContribution_Results")
codon_pool_diversity <- codon_pool_contr$top10GenesCodonPoolDiversity
colnames(codon_pool_diversity) <- c(
    "condition", "original_top_contribution", "baseline_correlation"
)
codon_pool_diversity <- codon_pool_diversity %>%
    tidyr::separate(
        col = "condition",
        into = c("tissue", "cell_type"), sep = "-",
        extra = "merge" # split on the first hyphen only; keep the rest in cell_type
    )

plotCorrelation(
    data = codon_pool_diversity, plot = "PoolDiversity",
    x_axis_col = "original_top_contribution",
    y_axis_col = "baseline_correlation",
    condition_col = "tissue", label_col = "cell_type", show_legend = "right"
)
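The notion of top-gene contribution used above can be illustrated with a small base R sketch. It assumes that a gene's contribution to the codon pool is its expression multiplied by its number of codons, and that the top genes are ranked by expression within each condition; the definition used by showPoolContribution() may differ. The helper and all values are hypothetical.

```r
# Hypothetical helper: fraction of the codon pool contributed by the n_top most
# highly expressed genes in each condition
top_gene_contribution <- function(expr, codons_per_gene, n_top) {
    # Codons demanded by each gene in each condition (expression x gene length)
    pool <- expr * codons_per_gene[rownames(expr)]
    n_top <- min(n_top, nrow(expr))
    vapply(colnames(expr), function(cond) {
        top <- order(expr[, cond], decreasing = TRUE)[seq_len(n_top)]
        sum(pool[top, cond]) / sum(pool[, cond])
    }, numeric(1))
}

codons_per_gene <- c(geneA = 100, geneB = 200, geneC = 300, geneD = 400)
expr <- matrix(
    c(90, 25,
      5, 25,
      3, 25,
      2, 25),
    nrow = 4, byrow = TRUE,
    dimnames = list(names(codons_per_gene), c("cond1", "cond2"))
)

contribution <- top_gene_contribution(expr, codons_per_gene, n_top = 1)
contribution
```

In this toy case the first condition is dominated by a single gene, while the second spreads its codon pool evenly across genes.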

The outputs generated during the execution of tTEscanR can be transformed into comprehensive visualizations to support data interpretation and exploration. A variety of plotting functions are available in tTEscanR to represent codon usage patterns, gene contribution, and other key metrics.

For more details, please refer to the dedicated visualization vignette.

3.2. Anticodon usage assessment

The computeAnticodonUsage() function calculates anticodon usage by aggregating tRNA expression data at the anticodon level. Analogous to computeCodonUsage(), the resulting matrix contains anticodons as rows and samples or conditions as columns.

tTEobject <- computeAnticodonUsage(object = tTEobject)
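Aggregating at the anticodon level can be sketched with base R. The example assumes that tRNA gene counts sharing an anticodon are summed and that the anticodon is the third hyphen-separated field of the gene label; the aggregation rule used by the package is an assumption here, and the values are made up.

```r
# Toy tRNA expression matrix: tRNA genes x conditions
toy_tRNA <- matrix(
    c(5, 1,
      3, 2,
      4, 4),
    nrow = 3, byrow = TRUE,
    dimnames = list(
        c("tRNA-Asn-GTT-5-1", "tRNA-Asn-GTT-2-1", "tRNA-Ala-AGC-1-1"),
        c("cond1", "cond2")
    )
)

# Extract the anticodon (third field) from each gene label
anticodon <- vapply(
    strsplit(rownames(toy_tRNA), "-", fixed = TRUE), `[`, character(1), 3
)

# Sum the rows that share an anticodon: anticodons x conditions
anticodon_usage <- rowsum(toy_tRNA, group = anticodon)
anticodon_usage
```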

3.3. Amino acid level assessment

The computeAAUsage() function computes amino acid demand and supply by integrating codon and anticodon usage data, respectively. Users can choose to calculate demand and supply either separately or together.

# Computing AA demand
tTEobject <- computeAAUsage(object = tTEobject, level = "demand")
# Computing AA supply
tTEobject <- computeAAUsage(object = tTEobject, level = "supply")
# Computing simultaneously AA demand and supply
tTEobject <- computeAAUsage(
    object = tTEobject, level = "both",
    overwrite = TRUE
)
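The aggregation to the amino acid level can be sketched in the same spirit. The example below assumes that demand is the sum of codon usage over the codons of each amino acid and that supply is the sum of anticodon usage over the anticodons of each amino acid; the lookup tables cover only two amino acids and the values are made up.

```r
# Minimal lookup tables for two amino acids (standard genetic code)
codon_to_aa <- c(AAA = "Lys", AAG = "Lys", GCC = "Ala", GCT = "Ala")
anticodon_to_aa <- c(TTT = "Lys", CTT = "Lys", AGC = "Ala", TGC = "Ala")

# Toy codon usage (codons x conditions) and anticodon usage (anticodons x conditions)
codon_usage <- matrix(
    c(20, 2, 5, 8, 10, 13, 1, 1),
    nrow = 4, byrow = TRUE,
    dimnames = list(names(codon_to_aa), c("cond1", "cond2"))
)
anticodon_usage <- matrix(
    c(6, 3, 4, 1, 7, 7, 2, 5),
    nrow = 4, byrow = TRUE,
    dimnames = list(names(anticodon_to_aa), c("cond1", "cond2"))
)

# Demand: codon usage summed per amino acid
aa_demand <- rowsum(
    codon_usage, group = unname(codon_to_aa[rownames(codon_usage)])
)
# Supply: anticodon usage summed per amino acid
aa_supply <- rowsum(
    anticodon_usage, group = unname(anticodon_to_aa[rownames(anticodon_usage)])
)
```

Both results have amino acids as rows and conditions as columns, which makes them directly comparable.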

3.4. Theoretical Translation Efficiency (tTE) computation

The computeTheoreticalTE() function calculates the Theoretical Translation Efficiency (tTE) by measuring the correlation between: (i) codon usage and anticodon availability, or (ii) amino acid demand and amino acid supply. Users can compute these correlations separately or in combination. To ensure accurate correlation between these data sources, it is crucial that the mRNA and tRNA datasets share matching conditions (i.e. identical column names representing the same samples or groups).
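The correlation idea and the need for matching conditions can be illustrated at the amino acid level with toy data. The sketch assumes a Spearman correlation computed per condition over shared amino acids; the correlation method used by computeTheoreticalTE() is not stated above, and compute_te_sketch() is a hypothetical helper.

```r
# Hypothetical helper: per-condition correlation between demand and supply
compute_te_sketch <- function(demand, supply, method = "spearman") {
    # Only conditions and features present in both matrices can be compared
    conds <- intersect(colnames(demand), colnames(supply))
    feats <- intersect(rownames(demand), rownames(supply))
    vapply(conds, function(cond) {
        cor(demand[feats, cond], supply[feats, cond], method = method)
    }, numeric(1))
}

aa <- c("Ala", "Asn", "Gly", "Lys")
demand <- matrix(
    c(1, 4, 7,
      2, 3, 1,
      3, 2, 5,
      4, 1, 2),
    nrow = 4, byrow = TRUE,
    dimnames = list(aa, c("cond1", "cond2", "cond3"))
)
supply <- matrix(
    c(2, 1,
      4, 2,
      6, 3,
      9, 5),
    nrow = 4, byrow = TRUE,
    dimnames = list(aa, c("cond1", "cond2"))
)

te_scores <- compute_te_sketch(demand, supply)
te_scores # cond3 is dropped because it has no tRNA counterpart
```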

# Computing tTE at the codon-anticodon level
tTEobject <- computeTheoreticalTE(object = tTEobject, level = "codon")
# Computing tTE at the AA demand-supply level
tTEobject <- computeTheoreticalTE(object = tTEobject, level = "aa")
# Computing simultaneously tTE at codon-anticodon and AA demand-supply levels
tTEobject <- computeTheoreticalTE(
    object = tTEobject, level = "both", overwrite = TRUE
)
conditions_metadata <- getMetadata(tTEobject, "ConditionsLabels")
tTEresults_codon <- getMetadata(tTEobject, "tTEresults_codon")
tTEresults_AA <- getMetadata(tTEobject, "tTEresults_AA")

plotTEscore(
    data = tTEresults_codon, metadata = conditions_metadata,
    index_col = "conditions", class_col = "tissue", add_stats = FALSE
)
#> Warning in defineMergedData(data = data, meta = metadata, index = index_col, :
#> One or more groups have fewer than 2 samples. Statistics (p-values) will return
#> NA.
#> $plot
#> Warning: Groups with fewer than two datapoints have been dropped.
#> ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.
#> Warning: Groups with fewer than two datapoints have been dropped.
#> ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.

#> 
#> $stats
#> NULL

plotTEscore(
    data = tTEresults_AA, metadata = conditions_metadata,
    index_col = "conditions", class_col = "tissue", add_stats = FALSE
)
#> Warning in defineMergedData(data = data, meta = metadata, index = index_col, :
#> One or more groups have fewer than 2 samples. Statistics (p-values) will return
#> NA.
#> $plot
#> Warning: Groups with fewer than two datapoints have been dropped.
#> ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.
#> Groups with fewer than two datapoints have been dropped.
#> ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.

#> 
#> $stats
#> NULL

For visualization purposes, a set of target conditions (e.g. a specific group of cells) can be defined, allowing comparison of their tTE scores against those of all other conditions in the dataset. In this example, we focus on neurons as the target group but exclude the ENS neurons from the selection to refine the analysis.

conditions_metadata$group <- "other"
conditions_metadata$group[grep(
    "neuron", conditions_metadata$conditions
)] <- "neurons"
conditions_metadata$group[grep(
    "ENS neuron", conditions_metadata$conditions
)] <- "other"
# Use tTEresults_codon to assess the codon-anticodon level
plotTEscore(
    data = tTEresults_AA, metadata = conditions_metadata,
    index_col = "conditions", class_col = "group", add_stats = TRUE
)
#> $plot

#> 
#> $stats
#>    group1 group2   p_value       comparison p_signif class
#> 1 neurons  other 0.2705441 neurons_vs_other       ns other
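The group assignment above depends on the order of the pattern matches, which a toy example makes visible. The condition labels and scores below are made up, and the Wilcoxon rank-sum test is used only as an assumption; the test applied by plotTEscore() is not stated here.

```r
# Toy metadata with made-up condition labels and scores
toy_meta <- data.frame(
    conditions = c(
        "Cerebrum-Excitatory neurons", "Intestine-ENS neurons",
        "Liver-Hepatoblasts", "Cerebellum-Inhibitory interneurons"
    ),
    score = c(0.8, 0.5, 0.4, 0.7),
    stringsAsFactors = FALSE
)

# Start from "other", flag every label containing "neuron", then revert ENS neurons
toy_meta$group <- "other"
toy_meta$group[grepl("neuron", toy_meta$conditions, fixed = TRUE)] <- "neurons"
toy_meta$group[grepl("ENS neuron", toy_meta$conditions, fixed = TRUE)] <- "other"
toy_meta[, c("conditions", "group")]

# Two-group comparison of the scores (choice of test is an assumption)
test_result <- wilcox.test(score ~ group, data = toy_meta)
test_result$p.value
```

Note that the pattern "neuron" also matches "interneurons", so it is worth inspecting the resulting groups before comparing scores.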

4. Differential expression analysis

The runDEAnalysis() function performs differential expression analysis with DESeq2 and generates multiple plots to display the results. When datasets share the same conditions and name_sep settings, they can be processed together in a single run. The input to this function must be a list of matrices.

# Other outputs that could be analyzed:
# mRNA <- getAssay(tTEobject, "mRNA")
# CodonUsage <- getAssay(tTEobject, "CodonUsage")
# tRNA <- getAssay(tTEobject, "tRNA")
# AnticodonUsage <- getAssay(tTEobject, "AnticodonUsage")

AA_results <- list(
    AADemand = getAssay(tTEobject, "AADemand"),
    AASupply = getAssay(tTEobject, "AASupply")
)

The outputs of the runDEAnalysis() function vary depending on the parameters enabled. In this example, the results include: (i) a heatmap, (ii) PCA plots (based on the selected number of principal components), and (iii) the size-corrected input matrix. A separate list of outputs is returned for each matrix included in the input list.

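all_DESeq2_results <- runDEAnalysis(
    list_data = AA_results,
    # Assumption: the conditions metadata retrieved in section 3.4 is the
    # intended table; `metadata` was not defined earlier in this guide
    metadata = conditions_metadata, heatmap = TRUE,
    dim_reduct = "PCA", numPC = 2, batch = "tissue",
    color_factor = "tissue", show_legend = "right", label_factor = "cell.type"
)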

grid::grid.draw(all_DESeq2_results$plots$AADemand$heatmap) # Visualize heatmap plot
all_DESeq2_results$plots$AADemand$exploratory$ElbowPlot # Visualize elbow plot
all_DESeq2_results$plots$AADemand$exploratory$PC1_vs_PC2 # Visualize PCA plot
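For readers unfamiliar with size correction, the sketch below shows the median-of-ratios approach that DESeq2 uses by default to estimate size factors, written in base R on made-up counts. Whether runDEAnalysis() applies exactly this normalization is an assumption, and estimate_size_factors() is a hypothetical helper.

```r
# Hypothetical helper: median-of-ratios size factors in base R
estimate_size_factors <- function(counts) {
    log_counts <- log(counts)
    # Log geometric mean per feature; -Inf when a feature has a zero count
    log_geo_means <- rowMeans(log_counts)
    usable <- is.finite(log_geo_means)
    apply(log_counts, 2, function(lc) {
        exp(median(lc[usable] - log_geo_means[usable]))
    })
}

# Toy counts: the same profile sequenced at three different depths
base_profile <- c(Ala = 10, Asn = 20, Gly = 40, Lys = 80)
counts <- cbind(
    cond1 = base_profile, cond2 = 2 * base_profile, cond3 = 4 * base_profile
)

size_factors <- estimate_size_factors(counts)
# Divide each column by its size factor to obtain size-corrected values
normalized <- sweep(counts, 2, size_factors, "/")
normalized
```

After correction the three toy conditions become identical, which is the expected behavior when they differ only in depth.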

5. References

#> R version 4.6.1 (2026-06-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 26.04 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.32.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] dplyr_1.2.1      biomaRt_2.69.0   tTEscanR_0.99.0  BiocStyle_2.41.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.2.1            viridisLite_0.4.3          
#>  [3] farver_2.1.2                blob_1.3.0                 
#>  [5] filelock_1.0.3              Biostrings_2.81.3          
#>  [7] S7_0.2.2                    fastmap_1.2.0              
#>  [9] BiocFileCache_3.3.0         digest_0.6.39              
#> [11] lifecycle_1.0.5             KEGGREST_1.53.4            
#> [13] RSQLite_3.53.2              magrittr_2.0.5             
#> [15] compiler_4.6.1              rlang_1.2.0                
#> [17] sass_0.4.10                 progress_1.2.3             
#> [19] tools_4.6.1                 yaml_2.3.12                
#> [21] ggsignif_0.6.4              knitr_1.51                 
#> [23] labeling_0.4.3              prettyunits_1.2.0          
#> [25] S4Arrays_1.13.0             bit_4.6.0                  
#> [27] curl_7.1.0                  DelayedArray_0.39.3        
#> [29] RColorBrewer_1.1-3          abind_1.4-8                
#> [31] BiocParallel_1.47.0         purrr_1.2.2                
#> [33] withr_3.0.3                 BiocGenerics_0.59.8        
#> [35] sys_3.4.3                   grid_4.6.1                 
#> [37] stats4_4.6.1                ggpubr_0.6.3               
#> [39] ggplot2_4.0.3               scales_1.4.0               
#> [41] SummarizedExperiment_1.43.0 cli_3.6.6                  
#> [43] rmarkdown_2.31              crayon_1.5.3               
#> [45] generics_0.1.4              otel_0.2.0                 
#> [47] httr_1.4.8                  DBI_1.3.0                  
#> [49] cachem_1.1.0                stringr_1.6.0              
#> [51] parallel_4.6.1              AnnotationDbi_1.75.0       
#> [53] BiocManager_1.30.27         XVector_0.53.0             
#> [55] matrixStats_1.5.0           vctrs_0.7.3                
#> [57] Matrix_1.7-5                carData_3.0-6              
#> [59] jsonlite_2.0.0              car_3.1-5                  
#> [61] IRanges_2.47.2              hms_1.1.4                  
#> [63] S4Vectors_0.51.5            rstatix_0.7.3              
#> [65] ggrepel_0.9.8               bit64_4.8.2                
#> [67] Formula_1.2-5               maketools_1.3.2            
#> [69] locfit_1.5-9.12             tidyr_1.3.2                
#> [71] jquerylib_0.1.4             glue_1.8.1                 
#> [73] codetools_0.2-20            stringi_1.8.7              
#> [75] gtable_0.3.6                GenomicRanges_1.65.0       
#> [77] tibble_3.3.1                pillar_1.11.1              
#> [79] rappdirs_0.3.4              htmltools_0.5.9            
#> [81] Seqinfo_1.3.0               R6_2.6.1                   
#> [83] dbplyr_2.6.0                httr2_1.2.3                
#> [85] evaluate_1.0.5              lattice_0.22-9             
#> [87] Biobase_2.73.1              backports_1.5.1            
#> [89] png_0.1-9                   broom_1.0.13               
#> [91] memoise_2.0.1               bslib_0.11.0               
#> [93] Rcpp_1.1.1-1.1              SparseArray_1.13.2         
#> [95] DESeq2_1.53.0               xfun_0.59                  
#> [97] MatrixGenerics_1.25.0       buildtools_1.0.0           
#> [99] pkgconfig_2.0.3
Cao, J., D. R. O’Day, H. A. Pliner, et al. 2020. “A Human Cell Atlas of Fetal Gene Expression.” Science 370 (6518). https://doi.org/10.1126/science.aba7721.
Domcke, S., A. J. Hill, R. M. Daza, et al. 2020. “A Human Cell Atlas of Fetal Chromatin Accessibility.” Science 370 (6518). https://doi.org/10.1126/science.aba7612.
Gao, W., C. J. Gallardo-Dodd, and C. Kutter. 2022. “Cell Type-Specific Analysis by Single-Cell Profiling Identifies a Stable Mammalian tRNA-mRNA Interface and Increased Translation Efficiency in Neurons.” Genome Research 32 (1): 97–110. https://doi.org/10.1101/gr.275944.121.