README (Last Updated 02/20/2024)

******************** INTRODUCTION ********************

This README file describes the contents of this directory. The dataset contains single-cell RNA-Seq data generated using the MAS-Seq for 10x Single Cell 3' kit ("MAS") [1] and the Kinnex™ single-cell RNA kit ("Kinnex") [2].

The MAS-Seq libraries were sequenced on the Sequel® II/IIe and Revio systems and processed using SMRT® Link v11.1 [3] or BioConda [4].
The Kinnex libraries were sequenced on the Revio system and processed using SMRT Link v13.1 [5].

To learn more about Kinnex, visit:

******************** SAMPLE ********************

All PBMC samples were purchased from BioIVT, either fresh or cryopreserved.
All HG002/GM24385 10k cells were purchased from Coriell.
All cDNA libraries were generated using the 10x Chromium Next GEM Single Cell 3' kit (v3.1) or Single Cell 5' kit (v2) with a 10x Chromium Next GEM Chip G on a 10x Chromium X system.

Below is a description of the kits, systems, and samples used for each directory.

DATA-Revio-Kinnex-HG002-10x5p : Kinnex kit, Revio, HG002, 10x 5' kit
DATA-Revio-Kinnex-PBMC-10x3p  : Kinnex kit, Revio, PBMC, 10x 3' kit
DATA-Revio-Kinnex-PBMC-10x5p  : Kinnex kit, Revio, PBMC, 10x 5' kit
DATA-MAS-Revio-PBMC-1         : MAS-Seq kit, Revio, PBMC, 10x 3' kit
DATA-MAS-Revio-PBMC-2         : MAS-Seq kit, Revio, PBMC, 10x 3' kit
DATA-MAS-SQ2-PBMC_10kcells    : MAS-Seq kit, Sequel IIe, PBMC, 10x 3' kit
DATA-MAS-SQ2-PBMC_5kcells     : MAS-Seq kit, Sequel IIe, PBMC, 10x 3' kit
DATA-SQ2_HG002_10kcells       : MAS-Seq kit, Sequel IIe, HG002, 10x 3' kit

******************** METHODS ********************

Library Preparation:
    Procedure & Checklist - Preparing MAS-Seq libraries using MAS-Seq for 10x Single Cell 3' kit
    or
    Procedure & checklist - Preparing Kinnex libraries using Kinnex single-cell RNA kit

Sequencing:
    Sequel IIe system with Sequel II binding kit 3.2 and Sequel II sequencing kit 2.0 (4 rxn)
    or
    Revio system with Revio polymerase kit and Revio sequencing plate

Run time:
    Sequel II/IIe: 30 hr movie + 2 hr pre-extension + adaptive loading
    Revio: 24 hr movie

Analysis:
    Read Segmentation and Single-cell Iso-Seq workflow (SL v11.1 and v13.1)

******************** FILE DESCRIPTION ********************

Each sample contains the following folders:

======================== 0-CCS ========================

This directory contains HiFi reads that were either produced directly on-instrument or generated by CCS analysis in SMRT Link.

0-CCS/
|---- .hifi_reads.bam
|---- .hifi_reads.bam.pbi

======================== 1-Sreads ========================

This directory contains segmented reads that have been processed by Read Segmentation (or skera [6]) to produce S-reads that represent the original cDNA molecules.
segmented.bam contains the S-reads that have the expected order of MAS/Kinnex primers and is the file used for the subsequent analyses.

1-Sreads/
|---- segmented.bam
|---- segmented.non_passing.bam

======================== 2-DeduplicatedReads ========================

This directory contains deduplicated reads that have been through barcode correction (using a barcode whitelist) and UMI deduplication.
The dedup reads are then used for the subsequent mapping and transcript analyses.

2-DeduplicatedReads/
├── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.bam
├── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.bam.bai
├── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.bam.pbi
└── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.fasta

======================== 3-CollapsedTranscripts ========================

This directory lists the full set of unique transcripts obtained by mapping the dedup reads to the genome, collapsing them into transcripts, and classifying and filtering them against Gencode using pigeon. Read about pigeon at [3].

The classification.txt and junctions.txt files are the output from pigeon and show the per-isoform and per-junction-per-isoform classification results against the Gencode annotation.
The GFF3 file shows the exonic structures of the transcript isoforms.
The group.txt file is an intermediate file required for generating the Seurat-compatible matrices in the next step, and is kept here for those who wish to regenerate the matrices.

3-CollapsedTranscripts/
├── scisoseq_classification.filtered_lite_classification.txt
├── scisoseq_classification.filtered_lite_junctions.txt
├──
└── scisoseq_transcripts.sorted.filtered_lite.gff

======================== 4-SeuratMatrix ========================

This directory contains the gene- and isoform-level count matrices, which are compatible with common tertiary analysis tools such as Seurat.

The NoNovelGenesIsoforms/ subdirectory contains only known genes (for genes_seurat/) and known+novel isoforms from known genes (for isoforms_seurat/). Ribo/mito genes are excluded.

The WithNovelGenesIsoforms/ subdirectory contains both known and novel genes.
Ribo/mito genes are excluded.

4-SeuratMatrix/
├── NoNovelGenesIsoforms
│   ├──
│   ├── genes_seurat
│   │   ├── barcodes.tsv
│   │   ├── genes.tsv
│   │   └── matrix.mtx
│   └── isoforms_seurat
│       ├── barcodes.tsv
│       ├── genes.tsv
│       └── matrix.mtx
└── WithNovelGenesIsoforms
    ├──
    ├── genes_seurat
    │   ├── barcodes.tsv
    │   ├── genes.tsv
    │   └── matrix.mtx
    └── isoforms_seurat
        ├── barcodes.tsv
        ├── genes.tsv
        └── matrix.mtx

4. REFERENCES

[1] Procedure & Checklist - Preparing MAS-Seq libraries using MAS-Seq for 10x Single Cell 3' kit
[2] Procedure & checklist - Preparing Kinnex libraries using Kinnex single-cell RNA kit
[3] SMRT Link v11.1 User Guide
[4] SMRT Link v13.1 User Guide (coming soon)
[5]
[6]

Research use only. Not for use in diagnostic procedures. © 2024 Pacific Biosciences of California, Inc. ("PacBio"). All rights reserved.
The data provided in these files and the information in this document are subject to change without notice. PacBio assumes no responsibility for any errors or omissions in the files or this document.
Certain notices, terms, conditions and/or use restrictions may pertain to your use of PacBio products and/or third-party products. Refer to the applicable PacBio terms and conditions of sale and to the applicable license terms at
Pacific Biosciences, the PacBio logo, PacBio, Circulomics, Omniome, SMRT, SMRTbell, Iso-Seq, Sequel, Nanobind, SBB, Revio, Onso, Apton, Kinnex, and PureTarget are trademarks of PacBio.
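
Example: reading a 4-SeuratMatrix/ folder without Seurat (illustrative sketch only, not part of the released workflow)

Each genes_seurat/ and isoforms_seurat/ folder described under FILE DESCRIPTION holds a barcodes.tsv, genes.tsv and matrix.mtx triplet. The sketch below reads one such folder using only the Python standard library. It assumes that matrix.mtx is a MatrixMarket "coordinate" file, that rows correspond to the lines of genes.tsv and columns to the lines of barcodes.tsv (the usual 10x/Seurat layout), and that the first tab-separated column of each .tsv is the identifier. The function names are hypothetical and were chosen for this example; verify these assumptions against your own copy of the files.

```python
from pathlib import Path


def read_first_column(path):
    """Return the first tab-separated field of every non-empty line."""
    with open(path) as fh:
        return [line.rstrip("\n").split("\t")[0] for line in fh if line.strip()]


def read_mtx(path):
    """Parse a MatrixMarket coordinate file.

    Returns (n_rows, n_cols, entries), where entries maps
    0-based (row, col) tuples to float values.
    """
    entries = {}
    with open(path) as fh:
        header = fh.readline()
        if not header.startswith("%%MatrixMarket") or "coordinate" not in header:
            raise ValueError("expected a MatrixMarket coordinate file")
        line = fh.readline()
        while line.startswith("%"):  # skip comment lines
            line = fh.readline()
        n_rows, n_cols, n_entries = (int(x) for x in line.split())
        for line in fh:
            if not line.strip():
                continue
            row, col, value = line.split()
            # MatrixMarket indices are 1-based; convert to 0-based.
            entries[(int(row) - 1, int(col) - 1)] = float(value)
    if len(entries) != n_entries:
        raise ValueError("entry count does not match the size line")
    return n_rows, n_cols, entries


def load_matrix_dir(directory):
    """Load genes.tsv, barcodes.tsv and matrix.mtx from one folder."""
    d = Path(directory)
    features = read_first_column(d / "genes.tsv")
    barcodes = read_first_column(d / "barcodes.tsv")
    n_rows, n_cols, entries = read_mtx(d / "matrix.mtx")
    # Assumed orientation: rows = features, columns = cell barcodes.
    if (n_rows, n_cols) != (len(features), len(barcodes)):
        raise ValueError("matrix shape does not match genes.tsv/barcodes.tsv")
    return features, barcodes, entries


def counts_per_barcode(barcodes, entries):
    """Sum all counts in each column, i.e. per cell barcode."""
    totals = {barcode: 0.0 for barcode in barcodes}
    for (_, col), value in entries.items():
        totals[barcodes[col]] += value
    return totals
```

A typical call would be load_matrix_dir("4-SeuratMatrix/NoNovelGenesIsoforms/genes_seurat"), followed by counts_per_barcode(barcodes, entries) to get a per-cell total. For real analyses, an established reader (for example Seurat's own import functions) is the better choice; this sketch only makes the file layout concrete.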