README (Last Updated 11/04/2025)

******************** INTRODUCTION ********************

This README file describes the contents of this directory.

The dataset contains single-cell RNA-Seq data generated using the MAS-Seq for 10x Single Cell 3' kit ("MAS") [1] and the Kinnex™ single-cell RNA kit ("Kinnex") [2].

The MAS-Seq libraries were sequenced on the Sequel® II/IIe and Revio® systems and processed using SMRT® Link v11.1 [3] or BioConda [4].
The Kinnex libraries were sequenced on the Revio and Vega™ systems and processed using SMRT Link v13.1 and above [5].

To learn more about Kinnex, visit: https://pacb.com/kinnex

******************** SINGLE CELL SAMPLE ********************

All PBMC samples, either fresh or cryopreserved, were purchased from BioIVT.
All HG002/GM24385 10k cells were purchased from Coriell.

All cDNA libraries were generated using the 10x Chromium Next GEM Single Cell 3' kit (v3.1, v4) or Single Cell 5' kit (v2, v3) with a 10x Chromium Next GEM Chip G on a 10x Chromium X system.

Below is a description of the kits, systems, and samples used for each directory.

DATA-Revio-Kinnex-HG002-10x5p: Kinnex kit, Revio, HG002, 10x 5' kit
DATA-Revio-Kinnex-PBMC-10x3p: Kinnex kit, Revio, PBMC, 10x 3' kit
DATA-Revio-Kinnex-PBMC-10x5p: Kinnex kit, Revio, PBMC, 10x 5' kit
DATA-Revio-Kinnex-PBMC-10kcells-10xGEMX3p: Kinnex kit, Revio, PBMC, 10x 3' (v4) kit
DATA-Revio-Kinnex-PBMC-20kcells-10xGEMX3p-rep1: Kinnex kit, Revio, PBMC, 10x 3' (v4) kit
DATA-Revio-Kinnex-PBMC-20kcells-10xGEMX3p-rep2: Kinnex kit, Revio, PBMC, 10x 3' (v4) kit
DATA-Revio-Kinnex-PBMC-20kcells-10xGEMX5p: Kinnex kit, Revio, PBMC, 10x 5' (v3) kit
DATA-Vega-Kinnex-PBMC-10kcells-10xGEMX3p: Kinnex kit, Vega, PBMC, 10x 3' (v4) kit
DATA-Revio-Kinnex-PBMC-Parse: Kinnex kit, Revio, PBMC, Parse WT kit

DATA-MAS-Revio-PBMC-1: MAS-Seq kit, Revio, PBMC, 10x 3' kit
DATA-MAS-Revio-PBMC-2: MAS-Seq kit, Revio, PBMC, 10x 3' kit
DATA-MAS-SQ2-PBMC_10kcells: MAS-Seq kit, Sequel IIe, PBMC, 10x 3' kit
DATA-MAS-SQ2-PBMC_5kcells: MAS-Seq kit, Sequel IIe, PBMC, 10x 3' kit
DATA-SQ2_HG002_10kcells: MAS-Seq kit, Sequel IIe, HG002, 10x 3' kit

******************** SPATIAL SAMPLE ********************

DATA-RevioSPRQ-Kinnex-VisiumHD-humanBreast: Kinnex kit, Revio SPRQ
    Vendor: BioIVT
    Organism: Human
    Anatomical Entity: Breast
    Disease State: Infiltrating Ductal Carcinoma
    Sex: Female
    Age: 70
    Preservation method: Fresh Frozen
    Quality: RIN 9.1
    Thickness: 10 um

DATA-RevioSPRQ-Kinnex-VisiumHD-humanColon
    Vendor: BioIVT
    Organism: Human
    Anatomical Entity: Large intestine colon
    Disease State: Adenocarcinoma, Invasive
    Sex: Female
    Age: 69
    Preservation method: Fresh Frozen
    Quality: RIN 8
    Thickness: 10 um

DATA-RevioSPRQ-Kinnex-VisiumHD-humanTonsil
    Vendor: BioIVT
    Organism: Human
    Anatomical Entity: Tonsil
    Disease State: Tonsillitis, Chronic Hyperplastic
    Sex: Male
    Age: 25
    Preservation method: Fresh Frozen
    Quality: RIN 8.5
    Thickness: 10 um

DATA-RevioSPRQ-Kinnex-VisiumHD-mouseBrain
    Vendor: Charles River Laboratories
    Organism: Mouse
    Anatomical Entity: Brain
    Disease State: Healthy
    Sex: Male
    Age: 8 weeks
    Preservation method: Fresh Frozen
    Quality: RIN unknown
    Thickness: 10 um

******************** METHODS ********************

Library Preparation:
    Procedure & Checklist - Preparing MAS-Seq libraries using MAS-Seq for 10x Single Cell 3' kit
    or
    Procedure & checklist - Preparing Kinnex libraries using Kinnex single-cell RNA kit

Sequencing:
    Sequel IIe system with Sequel II binding kit 3.2 and Sequel II sequencing kit 2.0 (4 rxn)
    or
    Revio system with Revio polymerase kit and Revio sequencing plate
    or
    Revio system with Revio SPRQ™ polymerase kit and Revio SPRQ sequencing plate
    or
    Vega system with Vega polymerase kit and Vega sequencing plate

Run time:
    Sequel II/IIe - 30 hr movie + 2 hr pre-extension + adaptive loading
    Revio - 24 hr movie
    Vega - 24 hr movie

Analysis:
    Read Segmentation and Single-cell Iso-Seq workflow (SL v11.1, SL v13.1 and above)
    For VisiumHD samples, analysis followed the tutorial at https://github.com/Magdoll/Visium-HD-support

******************** FILE DESCRIPTION ********************

Each sample contains the following folders:

======================== 0-CCS ========================

This directory contains HiFi reads produced either directly on-instrument or through CCS analysis in SMRT Link.

0-CCS/
|---- .hifi_reads.bam
|---- .hifi_reads.bam.pbi

======================== 1-Sreads ========================

This directory contains segmented reads that have been processed by Read segmentation (or skera [6]) to produce S-reads that represent the original cDNA molecules.
segmented.bam contains the S-reads that have the expected order of MAS/Kinnex primers and is the file used for the subsequent analyses.

1-Sreads/
|---- segmented.bam
|---- segmented.non_passing.bam

======================== 2-DeduplicatedReads ========================

This directory contains deduplicated reads that have been through barcode correction (using a barcode whitelist) and UMI deduplication.
The dedup reads are then used for subsequent mapping and transcript analyses.

2-DeduplicatedReads/
├── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.bam
├── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.bam.bai
├── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.bam.pbi
└── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.fasta

======================== 3-CollapsedTranscripts ========================

This directory lists the unique transcripts obtained by mapping the dedup reads to the genome, collapsing them into transcripts, and classifying and filtering them against Gencode using pigeon. Read about pigeon at [3].

The classification.txt and junctions.txt files are pigeon output showing the per-isoform and per-junction-per-isoform classification results against the Gencode annotation.
The GFF3 file shows the exonic structures of the transcript isoforms.
The group.txt file is an intermediate file required for generating the Seurat-compatible matrices in the next step, and is kept here for those who wish to regenerate the matrices.

3-CollapsedTranscripts/
├── scisoseq_classification.filtered_lite_classification.txt
├── scisoseq_classification.filtered_lite_junctions.txt
├── scisoseq.mapped_transcripts.collapse.group.txt
└── scisoseq_transcripts.sorted.filtered_lite.gff

======================== 4-SeuratMatrix ========================

This directory contains the gene- and isoform-level count matrices compatible with common tertiary analysis tools such as Seurat.

The NoNovelGenesIsoforms/ subdirectory contains only known genes (for genes_seurat/) and known+novel isoforms from known genes (for isoforms_seurat/). Ribo/mito genes are excluded.
The WithNovelGenesIsoforms/ subdirectory contains both known and novel genes. Ribo/mito genes are excluded.

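For a quick look at one of these matrices without Seurat, the sketch below reads a barcodes.tsv / genes.tsv / matrix.mtx triplet using only the Python standard library and totals the counts per cell barcode. It is an illustration, not part of the PacBio workflow. It assumes that matrix.mtx is in Matrix Market coordinate format with features as rows and barcodes as columns (the usual 10x-style layout), that the first column of genes.tsv holds the feature identifier, and that the files are uncompressed as listed. The function names are hypothetical.

```python
import csv
from pathlib import Path


def read_tsv_first_column(path):
    """Return the first tab-separated field of every non-empty line."""
    with open(path, newline="") as handle:
        return [row[0] for row in csv.reader(handle, delimiter="\t") if row]


def load_counts(matrix_dir):
    """Load a barcodes.tsv / genes.tsv / matrix.mtx triplet.

    Returns (features, barcodes, entries), where entries maps
    (feature_index, barcode_index) -> count, with 0-based indices.
    """
    matrix_dir = Path(matrix_dir)
    features = read_tsv_first_column(matrix_dir / "genes.tsv")
    barcodes = read_tsv_first_column(matrix_dir / "barcodes.tsv")

    entries = {}
    with open(matrix_dir / "matrix.mtx") as handle:
        # Skip the "%%MatrixMarket" banner, "%" comment lines, and blank lines.
        lines = (ln for ln in handle if ln.strip() and not ln.startswith("%"))
        # First data line holds: number of rows, columns, stored entries.
        n_rows, n_cols, _n_entries = (int(x) for x in next(lines).split())
        # Assumption: rows are features and columns are barcodes.
        if (n_rows, n_cols) != (len(features), len(barcodes)):
            raise ValueError("matrix shape does not match genes.tsv/barcodes.tsv")
        for line in lines:
            row, col, value = line.split()
            # Matrix Market indices are 1-based; convert to 0-based.
            entries[(int(row) - 1, int(col) - 1)] = float(value)
    return features, barcodes, entries


def counts_per_barcode(barcodes, entries):
    """Sum the counts of all features for each cell barcode."""
    totals = dict.fromkeys(barcodes, 0.0)
    for (_feature_idx, barcode_idx), value in entries.items():
        totals[barcodes[barcode_idx]] += value
    return totals
```

For example, load_counts("4-SeuratMatrix/NoNovelGenesIsoforms/genes_seurat") followed by counts_per_barcode(barcodes, entries) gives a per-cell total that can be compared against the totals reported by a downstream tool. If the dimension check fails, the row/column assumption above probably does not hold for that matrix.
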
4-SeuratMatrix/
├── NoNovelGenesIsoforms
│   ├── cmd.sh
│   ├── genes_seurat
│   │   ├── barcodes.tsv
│   │   ├── genes.tsv
│   │   └── matrix.mtx
│   └── isoforms_seurat
│       ├── barcodes.tsv
│       ├── genes.tsv
│       └── matrix.mtx
└── WithNovelGenesIsoforms
    ├── cmd.sh
    ├── genes_seurat
    │   ├── barcodes.tsv
    │   ├── genes.tsv
    │   └── matrix.mtx
    └── isoforms_seurat
        ├── barcodes.tsv
        ├── genes.tsv
        └── matrix.mtx

******************** REFERENCES ********************

[1] Procedure & Checklist - Preparing MAS-Seq libraries using MAS-Seq for 10x Single Cell 3' kit
    https://www.pacb.com/wp-content/uploads/Procedure-checklist-preparing-MAS-Seq-libraries-using-MAS-Seq-for-10x-single-cell-3-kit.pdf
[2] Procedure & checklist - Preparing Kinnex libraries using Kinnex single-cell RNA kit
    https://www.pacb.com/wp-content/uploads/Procedure-checklist-Preparing-Kinnex-libraries-using-Kinnex-single-cell-RNA-kit.pdf
[3] SMRT Link v11.1 User Guide
    https://www.pacb.com/wp-content/uploads/SMRT_Link_User_Guide_v11.1.pdf
[4] SMRT Link Single-cell Iso-Seq troubleshooting guide
    https://www.pacb.com/wp-content/uploads/SMRT-Link-Kinnex-single-cell-troubleshooting-guide-v13.1.pdf
[5] isoseq.how
    https://isoseq.how/
[6] skera.how
    https://skera.how/
[7] Visium HD processing pipeline
    https://github.com/Magdoll/Visium-HD-support

Research use only. Not for use in diagnostic procedures.

© 2025 Pacific Biosciences of California, Inc. (“PacBio”). All rights reserved.

The data provided in these files and the information in this document are subject to change without notice. PacBio assumes no responsibility for any errors or omissions in the files or this document.

Certain notices, terms, conditions and/or use restrictions may pertain to your use of PacBio products and/or third-party products. Refer to the applicable PacBio terms and conditions of sale and to the applicable license terms at pacb.com/license.

Pacific Biosciences, the PacBio logo, PacBio, Circulomics, Omniome, SMRT, SMRTbell, Iso-Seq, Sequel, Nanobind, SBB, Revio, Onso, Apton, Kinnex, PureTarget, SPRQ, and Vega are trademarks of PacBio.