Differential Gene Expression

This module enables users to run differential gene expression (DGE) analysis.

Command Form

The command form captures the information required to run the DGE analysis.

  • Algorithm (required) - choice between limma (microarray); limma voom, DESeq2, EdgeR GLM (bulk RNA-seq); wilcoxon (single-cell RNA-seq)

  • Selection Type (required) - observation or data slice

    • If Observation:

      • Group By (required) - the observation used to separate the case and control groups (e.g., disease)

      • Group (required) - value for the observation that represents the case (e.g., lupus)

      • Reference (required) - value for the observation that represents the control (e.g., healthy)

    • If Data Slice:

      • Case Group (required) - an existing data slice that represents the case group

      • Control Group (required) - an existing data slice that represents the control group

  • p-value cutoff (required) - the adjusted p-value threshold to use for heatmap and volcano plotting

  • Min abs log fold change (required) - the minimum absolute log fold change threshold for heatmap and volcano plotting

  • Run GSEA with pre-ranked correlation - whether to chain a GSEA command

  • Gene Set Collections (conditionally required) - the gene set collections to use to perform GSEA (e.g.: KEGG, Reactome)
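The required and conditionally required fields above can be sketched as a small validation routine. This is only an illustration: the class name, field names, and selection type strings are hypothetical and are not part of any Panomics API, and the rule that Gene Set Collections becomes required once the GSEA option is enabled is an assumption based on the field descriptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Hypothetical model of the command form; all names are illustrative.
@dataclass
class DgeCommand:
    algorithm: str
    selection_type: str  # "observation" or "data_slice"
    p_value_cutoff: float
    min_abs_log_fc: float
    group_by: Optional[str] = None
    group: Optional[str] = None
    reference: Optional[str] = None
    case_slice: Optional[str] = None
    control_slice: Optional[str] = None
    run_gsea: bool = False
    gene_set_collections: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of required fields that are still empty."""
        if self.selection_type == "observation":
            required = ["group_by", "group", "reference"]
        elif self.selection_type == "data_slice":
            required = ["case_slice", "control_slice"]
        else:
            raise ValueError(f"unknown selection type: {self.selection_type}")

        missing = [name for name in required if not getattr(self, name)]
        # Assumption: gene set collections are required only when GSEA is chained.
        if self.run_gsea and not self.gene_set_collections:
            missing.append("gene_set_collections")
        return missing
```

For instance, an observation based command with Group By, Group, and Reference filled in reports no missing fields, while a data slice command without a control group reports `control_slice`.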

Observation based group selection

When choosing the Observation selection type, the user defines the case and control groups using an observation field (Group By) and two values: one for the case (Group) and one for the control (Reference).

Example:

  • Group By: disease

  • Group: lupus

  • Reference: healthy

In Panomics, multiple DGE computations can be submitted with a single form submission by using the each value. For single-cell RNA-seq analyses, the Reference field can also be set to rest, which compares the case, usually a cluster, against all the other cells.

For example, a form submission that uses each with the Predicted Cell Type observation results in one comparison per unique value of that observation. Concretely, if there are 15 cell types, this yields 15 comparison results.
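To make the comparison count concrete, here is a minimal sketch of how an each submission could expand into individual comparisons. The function name and the expansion logic are assumptions for illustration only; the sketch assumes each is entered in the Group field and that a case is never compared against itself.

```python
from typing import List, Tuple


def expand_comparisons(values: List[str], group: str, reference: str) -> List[Tuple[str, str]]:
    """Expand a form submission into (case, reference) pairs.

    `values` holds the observation value of every sample or cell.
    Hypothetical helper: "each" and "rest" mirror the special form values.
    """
    unique_values = sorted(set(values))
    # "each" stands for every unique value of the observation, one case per value.
    cases = unique_values if group == "each" else [group]
    # "rest" is kept as a label meaning "all cells not in the case group".
    return [(case, reference) for case in cases if case != reference]


cell_types = ["B cell", "T cell", "NK cell", "T cell", "B cell"]
for case, ref in expand_comparisons(cell_types, group="each", reference="rest"):
    print(f"{case} vs {ref}")
```

With three unique cell types, the loop prints three comparisons, matching the "one comparison per unique value" rule described above.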

Data slice based group selection

When choosing the Data Slice selection type, the user needs to define the case and control groups by creating data slices. Learn more about managing data slices here.

Results

Results can be displayed in either a tile view or a grid view. Use the buttons next to the search bar to toggle the mode.

The quick search bar is visible only in tile view and offers an easy way to find any result. Results are searched by name.

Tile View

Each result in the tile view is a volcano plot that uses the command's p-value and logFC thresholds.

The top left corner shows the case N vs control N. The top right corner shows the number of upregulated and downregulated genes. To view a result, click on a tile.
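The upregulated and downregulated counts on a tile can be reproduced from a result table with a sketch like the one below. The function name and the input shape are hypothetical, and the boundary handling (adjusted p-value at or below the cutoff, absolute log fold change at or above the minimum) is an assumption, since the exact comparison used by the application is not stated here.

```python
from typing import Iterable, Tuple


def count_regulated(
    rows: Iterable[Tuple[float, float]],
    p_value_cutoff: float,
    min_abs_log_fc: float,
) -> Tuple[int, int]:
    """Count (upregulated, downregulated) genes.

    Each row is (adjusted_p_value, log_fold_change) for one gene.
    """
    up = down = 0
    for adj_p, log_fc in rows:
        # Skip genes that fail either threshold (boundaries are an assumption).
        if adj_p > p_value_cutoff or abs(log_fc) < min_abs_log_fc:
            continue
        if log_fc > 0:
            up += 1
        elif log_fc < 0:
            down += 1
    return up, down


rows = [(0.01, 2.0), (0.20, 3.0), (0.001, -1.5), (0.03, 0.2)]
up, down = count_regulated(rows, p_value_cutoff=0.05, min_abs_log_fc=1.0)
print(up, down)
```

In this toy input, only the first and third genes pass both thresholds, giving one upregulated and one downregulated gene.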

Grid View

Each row in the grid represents a result. To view a result, click on a row.

Result Details View

The differential gene expression result page contains five tabs and includes an extra filtering option in the header.

With the Hide unannotated switch control, the user can opt to remove LOC and Rik genes from the result.
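A filter of this kind can be sketched as follows. The pattern is an assumption: it treats symbols made of LOC followed by digits, and symbols ending in Rik, as unannotated. The actual matching rule used by the switch is not documented here and may differ.

```python
import re
from typing import List

# Assumed naming patterns: "LOC" + digits, or a symbol ending in "Rik".
UNANNOTATED = re.compile(r"^LOC\d+$|Rik$")


def hide_unannotated(genes: List[str]) -> List[str]:
    """Return the gene symbols that do not match the unannotated patterns."""
    return [gene for gene in genes if not UNANNOTATED.search(gene)]


print(hide_unannotated(["Gapdh", "LOC100129", "1700001C02Rik", "TP53"]))
```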

Heatmap

The heatmap tab displays an interactive heatmap plot of the differentially expressed genes matching the desired adjusted p-value and log fold change thresholds. Users can quickly filter the plot by changing the thresholds or by entering genes (manually or by selecting a gene list).

Volcano plot

The volcano plot tab displays an interactive volcano plot of the differentially expressed genes matching the desired adjusted p-value and log fold change thresholds. Users can quickly filter the plot by changing the thresholds or by entering genes (manually or by selecting a gene list). Using the Save Genes button in the bottom left corner, users can create gene lists with the genes present in the viewport.

Violin plot

The violin plot tab lets the user request violin plots for the desired genes. As these plots are generated on demand, plotting may take around 20-30 seconds to finish.

Data

The data tab displays the unfiltered differential gene expression result data. Users can use the filtering mechanisms offered by the grid to isolate genes based on p-value, log fold change, or other columns. The Save Genes toolbar button enables users to quickly create gene lists.

Similarity

The similarity tab displays the relationship of the current comparison, also known as a gene signature in Panomics, with respect to others in the data lake. Learn more about gene signature similarity here.

Video demonstration

TODO
