Index of /public/dataset/Onso/10X_3p_single_cell_data

| Name | Last modified | Size |
|---|---|---|
| cellranger_FB0032883/ | 2024-10-14 18:20 | - |
| cellranger_FB0032967/ | 2024-10-13 20:00 | - |
| FB0032883-BCC_L01_R1_Sample_Library.fastq.gz | 2022-12-22 23:02 | 19G |
| FB0032883-BCC_L01_R3_Sample_Library.fastq.gz | 2022-12-22 20:39 | 9.1G |
| FB0032883-BCC_L02_R1_Sample_Library.fastq.gz | 2022-12-22 23:39 | 19G |
| FB0032883-BCC_L02_R3_Sample_Library.fastq.gz | 2022-12-22 20:48 | 9.6G |
| FB0032967-BCC_L01_R1_Sample_Library.fastq.gz | 2022-12-22 20:11 | 15G |
| FB0032967-BCC_L01_R3_Sample_Library.fastq.gz | 2022-12-22 21:15 | 7.6G |
| README.txt | 2024-10-14 18:22 | 2.8K |

### PacBio Onso Sequencing of 1k and 10k PBMC cells using 10X 3' single cell gene expression libraries 

### Legal disclaimer

All trademarks, trade names, or logos mentioned or used are the property of their respective owners.

### Samples

All cDNA libraries were generated per the manufacturer's instructions using the 10x Chromium Next GEM 
Single Cell 3’ Gene Expression Kit (v3.1) with dual index chemistry and a 10x Chromium 
Next GEM Chip G on a 10x Chromium X system.

### Library prep

Dual-indexed P5/P7 libraries from the 1k and 10k samples were converted into Onso-compatible libraries using 
the Onso library conversion kit, per the manufacturer's recommendations, for subsequent cluster generation and 
sequencing on the Onso system.

### Data trimming and filtering

The libraries were sequenced in an asymmetric paired-end 28x90 bp configuration, per 10x recommendations, and demultiplexed using the Onso obc2fastq software, available at https://www.pacb.com/onso/software-downloads/. The "10X_3p_single_cell_data" folder contains the FASTQ files for the 1k and 10k samples, together with the 10x Cell Ranger output in the folders "cellranger_FB0032967" and "cellranger_FB0032883". The FASTQ files have been adapter trimmed and filtered, as described below, so that only read pairs with a full 28 bp barcode/UMI read are retained, which keeps them compatible with Cell Ranger.
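For the v3.1 chemistry, the 28 bp read holds a 16 bp cell barcode followed by a 12 bp UMI, so any shorter read lacks part of that information. A minimal sketch for checking the length distribution of the barcode/UMI read in a gzipped FASTQ file is shown here. The helper name `r1_length_histogram` is hypothetical and is not part of obc2fastq, seqkit, or Cell Ranger.

```bash
# Print "length<TAB>count" for every read length found in a gzipped FASTQ file.
# r1_length_histogram is an illustrative helper name, not a tool from this pipeline.
r1_length_histogram() {
    local fastq_gz="$1"
    # The sequence is line 2 of every 4-line FASTQ record.
    gzip -dc "$fastq_gz" \
        | awk 'NR % 4 == 2 { n[length($0)]++ }
               END { for (len in n) print len "\t" n[len] }' \
        | sort -n
}

# Example call (filename as used in the seqkit step of this README):
# r1_length_histogram FB0032883_S1_L001_R1_001.fastq.gz
```

After filtering, every reported length should be 28 or more.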

Illumina P5/P7 adapters were trimmed using `cutadapt` with the following parameters:

```bash
cutadapt \
    -a AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT \
    -A AGATCGGAAGAGCACACGTCTGAACTCCAGTCA \
    --overlap 3 \
    -j 10 \
    -o {output.fastq1} \
    -p {output.fastq2} \
    {input.fastq1} \
    {input.fastq2}
```
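To illustrate what `-a` together with `--overlap 3` means, the sketch below trims a 3' adapter from a single read: a full adapter occurrence is removed along with everything after it, and a partial adapter at the read end is removed only if at least 3 bases overlap. This is a simplified, exact-match illustration written for this README. It is not how `cutadapt` is implemented, and unlike `cutadapt` it tolerates no mismatches. The function name `trim_3prime_adapter` is hypothetical.

```bash
# Simplified exact-match model of 3' adapter trimming with a minimum overlap.
# Usage: trim_3prime_adapter READ ADAPTER [MIN_OVERLAP]   (MIN_OVERLAP defaults to 3)
trim_3prime_adapter() {
    local read_seq="$1"
    local adapter="$2"
    local min_overlap="${3:-3}"
    local prefix="$adapter"

    # Case 1: the full adapter occurs in the read; cut at its first occurrence.
    case "$read_seq" in
        *"$adapter"*)
            printf '%s\n' "${read_seq%%"$adapter"*}"
            return 0
            ;;
    esac

    # Case 2: the read ends with a prefix of the adapter; try the longest prefix first
    # and stop once the prefix would be shorter than the minimum overlap.
    while [ "${#prefix}" -ge "$min_overlap" ]; do
        case "$read_seq" in
            *"$prefix")
                printf '%s\n' "${read_seq%"$prefix"}"
                return 0
                ;;
        esac
        prefix="${prefix%?}"   # drop the last base of the candidate prefix
    done

    # No adapter found: return the read unchanged.
    printf '%s\n' "$read_seq"
}

# Example: the read ends with the first 6 bases of the adapter.
# trim_3prime_adapter ACGTACGTAGATCG AGATCGGAAGAGC   # prints ACGTACGT
```

A 2-base match at the read end (for example a trailing `AG`) is left untouched, because it falls below the minimum overlap.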

Reads shorter than 28 bp were then removed using `seqkit` with the following parameters:

```bash
# Example with FB0032883 (10k cells)
sample=FB0032883_S1_L001

# Note: the filenames below have been changed to the conventional naming
# expected by Cell Ranger, i.e. R1 is the 28 bp read and R2 is the 90 bp read

# Keep only R1 reads that are at least 28 bp long (-m 28); -g removes gap characters
seqkit seq \
    -m 28 \
    -g "${sample}_R1_001.fastq.gz" > "${sample}_R1_001.fastq.gz.tmp"

# Pair the filtered R1 reads with R2; -u also saves any unpaired reads
seqkit pair \
    -1 "${sample}_R1_001.fastq.gz.tmp" \
    -2 "${sample}_R2_001.fastq.gz" \
    -O "${sample}_minR1Len28" \
    -u
```
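After pairing, R1 and R2 should contain the same read IDs in the same order. The following sketch checks this for two gzipped FASTQ files. It assumes the read ID is the first whitespace-delimited field of each header line, optionally ending in `/1` or `/2`; the function names are hypothetical and the actual header format of these files has not been verified here.

```bash
# Print the read IDs of a gzipped FASTQ file, one per line.
# Assumes the ID is the first field of the header, minus "@" and any /1 or /2 suffix.
fastq_ids() {
    gzip -dc "$1" \
        | awk 'NR % 4 == 1 { id = substr($1, 2); sub(/\/[12]$/, "", id); print id }'
}

# Return 0 and report the pair count if both files list the same IDs in the same order.
check_fastq_pairing() {
    local ids1 ids2 status
    ids1=$(mktemp)
    ids2=$(mktemp)
    fastq_ids "$1" > "$ids1"
    fastq_ids "$2" > "$ids2"
    if cmp -s "$ids1" "$ids2"; then
        echo "OK: $(awk 'END { print NR }' "$ids1") read pairs"
        status=0
    else
        echo "MISMATCH: read IDs differ between $1 and $2" >&2
        status=1
    fi
    rm -f "$ids1" "$ids2"
    return "$status"
}

# Example call (R1 and R2 stand for the paired files written by seqkit pair):
# check_fastq_pairing "$R1" "$R2"
```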

Cell Ranger 7.1.0 was run on the filtered FASTQ files with the following command-line parameters:

```bash
# Run cellranger on FB0032883
cellranger count \
	--transcriptome=cellranger/refdata-gex-GRCh38-2020-A \
	--fastqs=FB0032883_S1_L001_minR1Len28,FB0032883_S1_L002_minR1Len28 \
	--localcores=64 \
	--localmem=256 \
	--id=cellranger_FB0032883 \
	--sample=FB0032883 \
	--jobmode=local
```

The `outs` folders from Cell Ranger, containing the final analysis results, are provided in the two folders `cellranger_FB0032967` and `cellranger_FB0032883`.
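As a quick sanity check on the provided results, the number of cell-associated barcodes can be read from the filtered matrix. This sketch assumes the standard `cellranger count` output layout, in which `filtered_feature_bc_matrix/barcodes.tsv.gz` lists one barcode per line. The function name `count_called_cells` is hypothetical, and the exact location of the `outs` contents within the provided folders is an assumption to adjust as needed.

```bash
# Count the barcodes in the filtered feature-barcode matrix of a Cell Ranger run.
# Argument: path to a directory with the standard "outs" layout (assumed).
count_called_cells() {
    local outs_dir="$1"
    gzip -dc "$outs_dir/filtered_feature_bc_matrix/barcodes.tsv.gz" \
        | awk 'END { print NR }'
}

# Example call (path is an assumption about how the folder was unpacked):
# count_called_cells cellranger_FB0032883
```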

*Rev 2024-10-14*