Differential Gene Expression
Last updated
This module enables users to run differential gene expression (DGE) analysis.
The command form captures the information required to run the DGE analysis.
Algorithm (required) - choice between limma (microarray), limma voom, DESeq2, EdgeR GLM (bulk RNA-seq), wilcoxon (single-cell RNA-seq)
Selection Type (required) - observation or data slice
If Observation:
Group By (required) - the observation used to separate the case and control groups (e.g.: disease)
Group (required) - value for the observation that represents the case (e.g.: lupus)
Reference (required) - value for the observation that represents the control (e.g.: healthy)
If Data Slice:
Case Group (required) - an existing data slice that represents the case group
Control Group (required) - an existing data slice that represents the control group
p-value cutoff (required) - the adjusted p-value threshold to use for heatmap and volcano plotting
Min abs log fold change (required) - the minimum absolute log fold change threshold to use for heatmap and volcano plotting
Run GSEA with pre-ranked correlation - whether to chain a GSEA command after the DGE analysis
Gene Set Collections (conditionally required) - the gene set collections to use to perform GSEA (e.g.: KEGG, Reactome)
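The required and conditionally required fields listed above can be illustrated with a small validation sketch. The `DgeForm` class, its field names, and the `validate` function are hypothetical and are not part of any Panomics API; the threshold values are arbitrary. The sketch only shows which fields must be filled in for each selection type.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Algorithm names as listed in the form description.
ALGORITHMS = {"limma", "limma voom", "DESeq2", "EdgeR GLM", "wilcoxon"}


@dataclass
class DgeForm:
    """Hypothetical container for the DGE command form fields."""
    algorithm: str
    selection_type: str  # "observation" or "data slice"
    p_value_cutoff: float
    min_abs_log_fc: float
    group_by: Optional[str] = None
    group: Optional[str] = None
    reference: Optional[str] = None
    case_group: Optional[str] = None
    control_group: Optional[str] = None
    run_gsea: bool = False
    gene_set_collections: List[str] = field(default_factory=list)


def validate(form: DgeForm) -> List[str]:
    """Return a list of problems; an empty list means the form is complete."""
    errors = []
    if form.algorithm not in ALGORITHMS:
        errors.append(f"unknown algorithm: {form.algorithm}")
    if form.selection_type == "observation":
        required = ("group_by", "group", "reference")
    elif form.selection_type == "data slice":
        required = ("case_group", "control_group")
    else:
        required = ()
        errors.append(f"unknown selection type: {form.selection_type}")
    for name in required:
        if not getattr(form, name):
            errors.append(f"{name} is required for the {form.selection_type} selection type")
    # Gene set collections are only required when GSEA is chained.
    if form.run_gsea and not form.gene_set_collections:
        errors.append("gene_set_collections is required when GSEA is enabled")
    return errors


form = DgeForm(
    algorithm="DESeq2",
    selection_type="observation",
    p_value_cutoff=0.05,   # arbitrary example value
    min_abs_log_fc=1.0,    # arbitrary example value
    group_by="disease",
    group="lupus",
    reference="healthy",
    run_gsea=True,
)
problems = validate(form)
print(problems)
```

Here the form is flagged because GSEA is enabled without any gene set collection; adding one (for example KEGG) clears the problem list.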
When choosing the Observation selection type, the user needs to define the case and control groups using an observation field (Group By) and 2 values, one for the case (Group) and one for the control (Reference).
Example:
Group By: disease
Group: lupus
Reference: healthy
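As a rough illustration of what this selection means, the sketch below splits samples into case and control groups by the value of a single observation. The function name `split_case_control`, the sample IDs, and the data are made up. It also assumes that samples with any other value are left out of the comparison, which this page does not state explicitly.

```python
from typing import Dict, List, Tuple


def split_case_control(
    observations: Dict[str, str], group: str, reference: str
) -> Tuple[List[str], List[str]]:
    """Split sample IDs into (case, control) by the value of one observation."""
    case = [sample for sample, value in observations.items() if value == group]
    control = [sample for sample, value in observations.items() if value == reference]
    return case, control


# Sample ID -> value of the "disease" observation (made-up data).
disease = {
    "s1": "lupus",
    "s2": "healthy",
    "s3": "lupus",
    "s4": "arthritis",  # assumed to be excluded: neither Group nor Reference
    "s5": "healthy",
}

case, control = split_case_control(disease, group="lupus", reference="healthy")
print("case:", case)
print("control:", control)
```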
In Panomics, it's possible to submit multiple DGE computations with one form submission by using the "each" value. For single-cell RNA-seq analyses, the Reference field can also be set to "rest", which results in a comparison between the case, usually a cluster, and all the other cells.
This form submission produces one comparison per unique value of the Predicted Cell Type observation. Concretely, if there are 15 cell types, it yields 15 comparison results.
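The way one submission fans out into several comparisons can be sketched as follows. This is only an illustration: the function name `expand_comparisons` is hypothetical, and the sketch assumes that "each" is entered in the Group field and that a value is never compared against itself.

```python
from typing import List, Tuple


def expand_comparisons(values: List[str], group: str, reference: str) -> List[Tuple[str, str]]:
    """Expand the special "each" value into one (case, control) pair per unique value."""
    unique = sorted(set(values))
    cases = unique if group == "each" else [group]
    # "rest" is kept as a label meaning "all cells outside the case group".
    return [(case, reference) for case in cases if case != reference]


# Values of a cell type observation, one entry per cell (made-up data).
cell_types = ["B cell", "T cell", "NK cell", "T cell", "B cell"]

pairs = expand_comparisons(cell_types, group="each", reference="rest")
for case, control in pairs:
    print(f"{case} vs {control}")
```

With three unique cell types this produces three comparisons, mirroring the 15-cell-type example above.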
Each result in the tile view is a volcano plot that uses the command's p-value and logFC thresholds.
The top left corner shows the case N vs. control N, and the top right corner shows the number of upregulated and downregulated genes. To view a result, click on a tile.
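The upregulated and downregulated counts can be thought of as a tally over genes that pass both thresholds. The sketch below is an assumption about how such a tally might be computed, not the Panomics implementation: the record keys `adj_p` and `log_fc`, the inclusive comparisons, and the gene names are all made up.

```python
from typing import Dict, List, Tuple


def count_regulated(
    genes: List[Dict], p_cutoff: float, min_abs_log_fc: float
) -> Tuple[int, int]:
    """Count (upregulated, downregulated) genes passing both thresholds."""
    up = down = 0
    for gene in genes:
        significant = gene["adj_p"] <= p_cutoff
        large_change = abs(gene["log_fc"]) >= min_abs_log_fc
        if significant and large_change:
            if gene["log_fc"] > 0:
                up += 1
            elif gene["log_fc"] < 0:
                down += 1
    return up, down


# Made-up result rows for illustration.
results = [
    {"gene": "GENE_A", "adj_p": 0.001, "log_fc": 2.3},   # passes, upregulated
    {"gene": "GENE_B", "adj_p": 0.20, "log_fc": 3.0},    # fails the p-value cutoff
    {"gene": "GENE_C", "adj_p": 0.01, "log_fc": -1.5},   # passes, downregulated
    {"gene": "GENE_D", "adj_p": 0.03, "log_fc": 0.4},    # fails the fold change cutoff
]

up, down = count_regulated(results, p_cutoff=0.05, min_abs_log_fc=1.0)
print(f"up: {up}, down: {down}")
```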
Each row in the grid represents a result. To view a result, click on a row.
The differential gene expression result page contains 5 tabs and includes an extra filtering option in the header.
By using the Hide unannotated switch control, the user can opt to remove LOC or Rik genes from the result.
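A filter of this kind can be sketched with a regular expression. The exact rule Panomics applies is not documented here, so the pattern below is an assumption: it treats symbols made of "LOC" followed by digits, or symbols ending in "Rik", as unannotated. The function name and the example symbols are illustrative only.

```python
import re
from typing import List

# Assumed naming patterns: "LOC" + digits, or any symbol ending in "Rik".
UNANNOTATED = re.compile(r"^(LOC\d+|.+Rik)$")


def hide_unannotated(genes: List[str]) -> List[str]:
    """Return the gene symbols that do not match the assumed unannotated patterns."""
    return [gene for gene in genes if not UNANNOTATED.match(gene)]


# Illustrative symbols only.
symbols = ["Gapdh", "LOC100129", "1700001A05Rik", "Actb", "Lock1"]
visible = hide_unannotated(symbols)
print(visible)
```

Anchoring the pattern at both ends keeps symbols that merely start with "LOC" followed by letters (such as the made-up "Lock1") in the result.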
The heatmap tab displays an interactive heatmap plot of the differentially expressed genes matching the desired adjusted p-value and log fold change thresholds. Users can quickly filter the plot by changing the thresholds or by entering genes (manually or by selecting a gene list).
The volcano plot tab displays an interactive volcano plot of the differentially expressed genes matching the desired adjusted p-value and log fold change thresholds. Users can quickly filter the plot by changing the thresholds or by entering genes (manually or by selecting a gene list). Using the Save Genes button in the bottom left corner, users can create gene lists with the genes present in the viewport.
The violin plot tab enables the user to request the plotting of the desired genes. As these plots are generated on-demand, the plotting may take around 20-30 seconds to finish.
The data tab displays the unfiltered differential gene expression result data. Users can use the filtering mechanisms offered by the grid to isolate genes based on p-value, log fold change, or other columns. The Save Genes toolbar button enables users to quickly create gene lists.
TODO
When choosing the Data Slice selection type, the user needs to define the case and control groups by creating data slices.
Results can be displayed in either a tile view or a grid view. Use the buttons next to the search bar to toggle the mode.
The quick search bar is visible only for and it offers an easy way to get to any result. The name of the result is used for searching.
The similarity tab displays the relationship of the current comparison, also known as a gene signature in Panomics, with respect to others in the data lake.