---
title: "Co-culture screen"
output:
  rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Co-culture screen}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{css echo=FALSE}
img {
    border: 0px !important;
    margin: 2em 2em 2em 2em !important;
}
code {
    border: 0px !important;
}
```

```{r echo=FALSE, results="hide"}
knitr::opts_chunk$set(
    cache = FALSE,
    echo = TRUE,
    collapse = TRUE,
    comment = "#>"
)
suppressPackageStartupMessages(library(pepitope))
```

This guide shows how to calculate differential construct abundance and visualize
the results. To calculate changes in barcode abundance, we use the `DESeq2` package,
which is commonly used for differential expression analysis. We also provide a
utility function to visualize the results as an MA-plot (base abundance on the
x axis, log2 fold change on the y axis).
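
To make the two MA-plot axes concrete, the following minimal sketch computes them
by hand in base R. The barcode names and counts are made up for illustration, and
this is the naive calculation only: `DESeq2` additionally normalizes for library
size and models dispersion, so its values will differ.

```r
# Toy counts for three hypothetical barcodes, two repeats per condition
counts <- matrix(
    c(100, 120,  90, 110,
       50,  60, 200, 240,
      400, 380, 100,  90),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("bc1", "bc2", "bc3"),
                    c("Mock_1", "Mock_2", "Sample_1", "Sample_2"))
)

# Mean count per condition
mock_mean   <- rowMeans(counts[, c("Mock_1", "Mock_2")])
sample_mean <- rowMeans(counts[, c("Sample_1", "Sample_2")])

ma <- data.frame(
    base_mean = rowMeans(counts),             # x axis: mean over all samples
    log2_fc   = log2(sample_mean / mock_mean) # y axis: Sample over Mock
)
ma
```

In this toy table, `bc2` is four times more abundant in `Sample` than in `Mock`,
which corresponds to a log2 fold change of 2.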

Preparation
-----------

In order to perform this analysis, we need the `SummarizedExperiment` (`dset`)
object from the *Quality Control* results, and we need to know which comparisons
we want to perform.

From the *Quality Control* step, we know that we have two repeats each of `Mock`
and `Sample` for a patient-specific (`pat1`) and a common library (`common`).
These two repeats are important, as we need to estimate within-condition
variability as well as between-condition variability.
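
As a rough illustration of why repeats matter, the sketch below uses made-up
counts for a single barcode. It is not how `DESeq2` estimates dispersion; it only
shows that a spread within a condition cannot be computed from one repeat.

```r
# Made-up counts for one barcode, two repeats per condition
mock_reps   <- c(rep1 = 100, rep2 = 120)
sample_reps <- c(rep1 = 400, rep2 = 360)

# Within-condition variability: spread between repeats of the same condition
within <- c(Mock = sd(mock_reps), Sample = sd(sample_reps))

# Between-condition difference: distance between the condition means
between <- mean(sample_reps) - mean(mock_reps)

# With a single repeat there is no spread to estimate
single_rep_sd <- sd(100)  # NA
```

A difference between conditions can only be judged against the spread within
them, which is why at least two repeats per condition are needed.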

```{r echo=FALSE, message=FALSE, results="hide"}
valid_barcodes = example_barcodes(1000)
all_constructs = example_peptides(valid_barcodes)
sample_sheet = system.file("my_samples.tsv", package="pepitope")
fastq_file = example_fastq(sample_sheet, all_constructs)
dset = count_fastq(fastq_file, sample_sheet, all_constructs, valid_barcodes,
                   read_structure="7B12M+T")
```

However, we need to make sure to perform the screen analysis only on
relevant samples, otherwise the model may mis-estimate the variability:

```{r}
dset = dset[,grepl("pat1", dset$patient)]
colData(dset)
```

Calculating differential abundance
----------------------------------

When comparing two conditions, we refer to the `origin` column of the sample
sheet. In our case, the origins are `Mock` and `Sample`, which describe
mock-transfected, co-cultured cells and cells transfected with the actual
construct library, respectively.

Here, we want to see the change of `Sample` relative to the `Mock` condition,
which we specify as a character vector `c(sample, reference)` or a list of such
vectors.

For a single comparison given as a character vector, a `data.frame` is returned.
If comparisons are supplied as a list of character vectors (as in the call
below), the result is a list of `data.frame`s:

```{r}
res = screen_calc(dset, list(c("Sample", "Mock")))
```

The result will look like the following:

```{r}
res[[1]]
```

Plotting the screen results
---------------------------

We can plot the result of this differential abundance analysis using the
`plot_screen` function. The only argument we need to supply is a results table,
but we can fine-tune the plot with the following additional arguments:

* `sample` -- which library (or libraries) to plot (default: all)
* `links` -- whether to draw arrows between ref and significant alt peptides
* `labs` -- whether to label constructs in less dense areas of the plot
* `min_reads` -- a minimum read count to consider a data point for plotting
* `cap_fc` -- a maximum (and negative minimum) fold change value for data points
* `p_sig` -- a p-value threshold to consider a data point significant for labeling and linking

```{r}
plot_screen(res$`Sample vs Mock`, min_reads=100)
```
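
To illustrate what the `min_reads`, `cap_fc` and `p_sig` arguments describe, here
is a base R sketch on a toy table. The column names (`baseMean`,
`log2FoldChange`, `padj`) follow `DESeq2` conventions and are an assumption about
the results table, as is the choice to filter on `baseMean` and to test the
adjusted p-value. The sketch is not the implementation of `plot_screen`.

```r
# Toy results table with DESeq2-style column names (assumed, not from pepitope)
res_toy <- data.frame(
    baseMean       = c(50, 500, 1500, 800),
    log2FoldChange = c(4.0, -7.5, 2.5, 0.1),
    padj           = c(0.001, 0.0001, 0.03, 0.9),
    row.names      = c("bc1", "bc2", "bc3", "bc4")
)

min_reads <- 100   # drop points with fewer reads than this
cap_fc    <- 5     # clamp fold changes to [-cap_fc, cap_fc]
p_sig     <- 0.05  # significance threshold for labeling and linking

# Keep only data points with enough reads
shown <- res_toy[res_toy$baseMean >= min_reads, ]

# Cap extreme fold changes symmetrically so they stay inside the plot range
shown$log2FoldChange <- pmax(pmin(shown$log2FoldChange, cap_fc), -cap_fc)

# Flag points that pass the significance threshold
shown$significant <- shown$padj < p_sig
shown
```

In this toy example, `bc1` is dropped for having too few reads, and the fold
change of `bc2` is capped at -5.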

As with the *Quality Control* plots, we can also create an interactive plot.
This will not display links between `ref` and `alt` peptides, but instead highlight
a construct group (by `mut_id`) when interacting with a data point.