# Sample Ingestion

Panomics supports the ingestion of transcriptomics samples from local and remote sources. The ingestion process requires a [Standard Compute](https://documentation.panomics.bio/documentation/compute-runtimes/codeless-compute) runtime.

## Ingestion Workflow

Sample ingestion is a straightforward, multi-step process.

{% hint style="info" %}
We recommend uploading samples within a project, which makes tracking samples and setting metadata simpler.
{% endhint %}

{% stepper %}
{% step %}

### Create a project for your study

Once the project is created, go to the `Samples` tab and click on the `Import Samples` button in the grid toolbar.
{% endstep %}

{% step %}

### Select the sample type

Choose one of the supported modalities.
{% endstep %}

{% step %}

### Select the sample file mode

The file mode tells Panomics how samples are laid out across files: one file per dataset (a gene-by-sample or gene-by-cell matrix) or one file per sample (for example, the output of Salmon quantification).

`single` - each sample will be uploaded as an individual file

`combined` - all samples are represented in a single file
{% endstep %}

{% step %}

### Select the sample source

Users can import samples from their local disk, Panomics assets, AWS S3, or DNAnexus. For AWS S3 and DNAnexus, an [external API key](https://documentation.panomics.bio/documentation/api-keys#external) must be configured.
{% endstep %}

{% step %}

### Select the files

Depending on the sample source, this step will differ.

1. **Local disk** - drop your files in the upload area.
2. **Panomics Assets** - select the Panomics project and then select the files you wish to import.
3. **AWS S3** - input the name of the S3 bucket, the prefix, and a regex to select only the files you intend to import.
4. **DNAnexus** - select the DNAnexus project and a regex to select only the files you intend to import.
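
Because a regex that is too broad can pull in unintended files, it can help to try the pattern locally before entering it in the import dialog. The sketch below is illustrative only: the file names are hypothetical, and it assumes the pattern is applied with "match anywhere" semantics (`re.search`) against the full object path, which this page does not specify.

```python
import re

# Hypothetical object keys; replace with names from your own bucket or project.
candidate_files = [
    "study1/quant/SRR3184305.transcriptome.genes.results.gz",
    "study1/quant/SRR3184306.transcriptome.genes.results.gz",
    "study1/quant/multiqc_report.html",
    "study1/raw/SRR3184305_1.fastq.gz",
]

# Pattern under test: keep only per-sample gene-level result files.
pattern = re.compile(r"SRR\d+\.transcriptome\.genes\.results\.gz$")


def select_files(names, regex):
    """Return the names in which the regex matches anywhere (re.search semantics)."""
    return [name for name in names if regex.search(name)]


selected = select_files(candidate_files, pattern)
for name in selected:
    print(name)
```

If the selection on the platform differs from what this prints, the platform may anchor the pattern differently (for example, matching the file name only), so treat the local result as a first check rather than a guarantee.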
   {% endstep %}

{% step %}

### Input sample attributes

1. Batch ID - auto-generated ID to help you identify the sample batch easily.
2. Organism - the organism of the sample donors.
3. Gene column - the name of the column that refers to the gene ID or gene symbol. Supported IDs are Ensembl and Entrez. Applies to CSV file inputs.
4. Normalized counts - whether you are uploading normalized counts.
5. Strip prefix characters - whether you would like the platform to strip one or more characters from the beginning of the sample names. This is useful when your sample names are purely numeric but the file was generated in R, which prepends an `X` to column names that start with a digit.
6. First sample column index - for combined samples, indicate which column index (zero-based) holds the first sample.
   {% endstep %}
   {% endstepper %}
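
The sample attributes in the final step can be sanity-checked against a combined file before import. The following is a minimal local sketch, not the platform's implementation: the CSV content, the column names, and the `preview_samples` helper are hypothetical, and it assumes that prefix stripping removes a fixed number of leading characters.

```python
import csv
import io

# Hypothetical combined file: a gene ID column, an annotation column, then samples.
# The "X" prefix mimics what R adds to column names that start with a digit.
combined_csv = """gene_id,gene_name,X1001,X1002,X1003
ENSG00000141510,TP53,120,98,143
ENSG00000012048,BRCA1,45,51,39
"""

gene_column = "gene_id"     # value you would enter as "Gene column"
first_sample_index = 2      # zero-based "First sample column index"
strip_prefix_chars = 1      # assumed: number of leading characters to strip


def preview_samples(text, gene_col, first_idx, n_strip):
    """Return the sample names from the header after stripping n_strip characters."""
    header = next(csv.reader(io.StringIO(text)))
    if gene_col not in header:
        raise ValueError(f"gene column {gene_col!r} not found in header")
    return [name[n_strip:] for name in header[first_idx:]]


sample_names = preview_samples(
    combined_csv, gene_column, first_sample_index, strip_prefix_chars
)
print(sample_names)  # ['1001', '1002', '1003']
```

Counting columns from zero, `gene_id` is index 0, `gene_name` is index 1, and the first sample sits at index 2, which is the value to enter for the first sample column index.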

## Supported file inputs

### Microarray

Panomics supports the ingestion of normalized intensity files that have been mapped to gene IDs or gene symbols. Only the `combined` sample file mode is accepted.

Accepted formats: .txt, .csv, .tsv, .gz

**Example file:**

<figure><img src="https://3894776587-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUZMYHWMJyvkxhr9s6Lm5%2Fuploads%2FSofOWIA2agE1IF4BNLWG%2FScreenshot%202024-10-22%20at%2017.43.00.png?alt=media&#x26;token=1983446c-b704-4350-bc05-48ad4fb96e93" alt=""><figcaption></figcaption></figure>

{% file src="<https://3894776587-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUZMYHWMJyvkxhr9s6Lm5%2Fuploads%2Fe3UlUQejoqcb1iJ81pTi%2FGSE133715_expression.csv?alt=media&token=38aa4156-85fe-4fdc-b241-2b0a51ebe1ce>" %}
Microarray input file
{% endfile %}

### RNA-seq and sRNA-seq

Panomics supports the ingestion of raw or normalized bulk RNA-seq samples. Both `single` and `combined` file modes are accepted.

Accepted formats: .txt, .csv, .tsv, .gz

#### Single file mode sample data:

<figure><img src="https://3894776587-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUZMYHWMJyvkxhr9s6Lm5%2Fuploads%2F6pdYetJYVoPusoPM4NNd%2FScreenshot%202024-10-22%20at%2017.46.23.png?alt=media&#x26;token=a732b9aa-2926-4688-9757-b009909da3c5" alt=""><figcaption></figcaption></figure>

{% file src="<https://3894776587-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUZMYHWMJyvkxhr9s6Lm5%2Fuploads%2F50hFBM8oMnmAZdAFav7M%2FSRR3184306.transcriptome.genes.results.gz?alt=media&token=76fb664b-d654-44bf-a115-d8a6a7b44e2d>" %}

{% file src="<https://3894776587-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUZMYHWMJyvkxhr9s6Lm5%2Fuploads%2Fz4HqxwLiuUIm5EpE2jHY%2FSRR3184305.transcriptome.genes.results.gz?alt=media&token=237336c7-203a-4023-a6c6-b00dcec3b8a3>" %}

#### Combined file mode sample data:

<figure><img src="https://3894776587-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUZMYHWMJyvkxhr9s6Lm5%2Fuploads%2FVBKy4VDZ5fYboBl4KfSJ%2FScreenshot%202024-10-22%20at%2017.43.57.png?alt=media&#x26;token=62623700-f07c-418a-81ae-45d8fb5d97e7" alt=""><figcaption></figcaption></figure>

{% file src="<https://3894776587-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUZMYHWMJyvkxhr9s6Lm5%2Fuploads%2FDsI0A389WcVntoc10jDJ%2FGSE136961_raw_count_clean.csv?alt=media&token=d8ac3a0f-a64b-46b8-9837-f21bb6bf3f0c>" %}

### scRNA-seq and snRNA-seq

For single-cell or single-nucleus RNA-seq samples, only the `single` file mode is currently supported.

Accepted formats: .txt, .csv, .tsv, .h5, .h5ad, .gz, .tar.gz

`.tar.gz` archives must contain three files that follow this naming convention:

* GSM5474339\_AD5\_barcodes.tsv.gz
* GSM5474339\_AD5\_features.tsv.gz
* GSM5474339\_AD5\_matrix.mtx.gz
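
An archive can be checked against this convention before upload. The sketch below uses only the Python standard library; the `missing_suffixes` and `build_archive` helpers are hypothetical, the archive is built in memory with placeholder content purely for demonstration, and the check assumes that only the three file name suffixes matter.

```python
import gzip
import io
import tarfile

REQUIRED_SUFFIXES = ("_barcodes.tsv.gz", "_features.tsv.gz", "_matrix.mtx.gz")


def missing_suffixes(archive_bytes):
    """Return the required suffixes with no matching member in the .tar.gz archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]
    return [s for s in REQUIRED_SUFFIXES if not any(n.endswith(s) for n in names)]


def build_archive(prefix, suffixes):
    """Build an in-memory .tar.gz with placeholder members (demonstration only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for suffix in suffixes:
            payload = gzip.compress(b"placeholder\n")
            info = tarfile.TarInfo(name=prefix + suffix)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


complete = build_archive("GSM5474339_AD5", REQUIRED_SUFFIXES)
incomplete = build_archive("GSM5474339_AD5", REQUIRED_SUFFIXES[:2])

print(missing_suffixes(complete))    # []
print(missing_suffixes(incomplete))  # ['_matrix.mtx.gz']
```

To check a real archive, pass the bytes read from disk (opened in binary mode) to `missing_suffixes`; an empty list means all three expected members were found.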

#### Sample data:

{% file src="<https://3894776587-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUZMYHWMJyvkxhr9s6Lm5%2Fuploads%2FsJO52Hcgv67OYifGp1b4%2FGSM5474339.zip?alt=media&token=16cfe605-4808-400e-8a0a-25e07995e3ed>" %}

## Sample ingestion demo videos

### Bulk RNA-seq using the combined samples file mode

{% embed url="<https://www.loom.com/share/d12a69319a1046c889cc1a9b1a6d474e?sid=0195c519-6ed4-4de7-b7d2-fd6c0da420fb>" %}

### Single-cell RNA-seq using zipped barcodes, features, and matrix files

{% embed url="<https://www.loom.com/share/63cd2fb6acfb47d5ae48d3368207111c?sid=1edcb441-a906-4132-8cda-6a2ef05afb68>" %}
