Index of /public/dataset/MAS-Seq

Icon         Name                       Last modified      Size  Description
[PARENTDIR]  Parent Directory                              -
[DIR]        DATA-Revio-PBMC-1/         2023-01-04 07:36   -
[DIR]        DATA-Revio-PBMC-2/         2023-01-04 07:37   -
[DIR]        DATA-SQ2-PBMC_10kcells/    2022-12-23 12:46   -
[DIR]        DATA-SQ2-PBMC_5kcells/     2022-12-09 14:27   -
[DIR]        DATA-SQ2_HG002_10kcells/   2023-05-01 19:56   -
[DIR]        PLOT-scripts/              2022-11-02 16:02   -
[DIR]        REF-10x_barcodes/          2023-02-07 14:01   -
[DIR]        REF-10x_primers/           2022-08-29 10:43   -
[DIR]        REF-MAS_adapters/          2022-08-29 08:49   -
[DIR]        REF-pigeon_ref_sets/       2022-08-23 08:58   -
[TXT]        README.txt                 2023-05-02 08:30   6.4K

README  (Last Updated 05/02/2023)


This README file describes the contents in this directory.

The MAS-Seq libraries [1] were sequenced on the Sequel® IIe System and processed using SMRT® Link v11.1 [2] or command line/BioConda version [3].

To learn more about the MAS-Seq method from PacBio for single-cell isoform sequencing, visit:


SQ2-PBMC 5k cells
Vendor – BioIVT
Lot No – HUMANPBMC-0107696
Storage Matrix – 50% HBSS / 50% FBS from LEUKOMAX Unit ACD; Fresh 
Sample Prep Matrix – PBS + 1.5% bovine serum albumin
Targeted Cell Input – 5000 cells

SQ2-PBMC 10k cells
Vendor – BioIVT
Lot No – HUMANPBMC-0106775
Storage Matrix – CryoStor CS10 from LEUKOMAX Unit ACD; Cryopreserved
Sample Prep Matrix – PBS + 1.5% bovine serum albumin
Targeted Cell Input – 10000 cells

SQ2-HG002/GM24385 10k cells
Vendor – Coriell
Passage 12
Sample Prep Matrix – PBS + 1.5% bovine serum albumin
Targeted Cell Input – 10000 cells

All cDNA libraries were generated using the 10x Chromium Next GEM Single Cell 3’ kit (v3.1) with a 10x Chromium Next GEM Chip G on a 10x Chromium X system.


Library Preparation: 

Procedure & Checklist - Preparing MAS-Seq libraries using MAS-Seq for 10x Single Cell 3’ kit


Sequel IIe system with Sequel II binding kit 3.2 and Sequel II sequencing kit 2.0 (4 rxn)
Revio system with Revio polymerase kit and Revio sequencing plate

Run time: 

Sequel II/IIe – 30 hr movie + 2 hr pre-extension + adaptive loading
Revio – 24 hr movie


SMRT Link v11.1 Read segmentation and Single-cell Iso-Seq workflow


Each sample will contain the following folders:


This directory contains HiFi reads that were either produced directly on-instrument or generated by CCS analysis in SMRT Link.

|---- <movie>.hifi_reads.bam
|---- <movie>.hifi_reads.bam.pbi


This directory contains segmented reads that have been processed by Read segmentation (or skera [4]) to produce S-reads that represent the original cDNA molecules. segmented.bam contains the S-reads that have the expected order of MAS adapters and is the file used for the subsequent analyses.

|---- segmented.bam
|---- segmented.non_passing.bam


This directory contains deduplicated reads that have been through barcode correction (using a barcode whitelist) and UMI deduplication. The dedup reads are then used for the subsequent mapping and transcript analyses.

├── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.bam 
├── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.bam.bai 
├── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.bam.pbi 
└── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.fasta 
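
As a quick sanity check on the dedup FASTA, the number of deduplicated reads and their length distribution can be summarized with a few lines of Python. This is a minimal sketch that assumes only a standard FASTA layout (header lines starting with ">"). It does not interpret the header fields, and the toy records in the demo are made up for illustration.

```python
import io
import statistics
from typing import Dict, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA stream."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize_fasta(handle: TextIO) -> Dict[str, float]:
    """Return the record count, total bases and median read length."""
    lengths = [len(seq) for _, seq in read_fasta(handle)]
    if not lengths:
        return {"records": 0, "total_bases": 0, "median_length": 0}
    return {
        "records": len(lengths),
        "total_bases": sum(lengths),
        "median_length": statistics.median(lengths),
    }


if __name__ == "__main__":
    # Toy input (hypothetical records); for real data, open the file instead:
    # with open("scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.fasta") as fh:
    #     print(summarize_fasta(fh))
    toy = io.StringIO(">read1\nACGTAC\n>read2\nGG\n")
    print(summarize_fasta(toy))
```

The reader streams the file one line at a time, so only the list of read lengths is held in memory.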


This directory lists the total set of unique transcripts obtained by mapping the dedup reads to the genome, collapsing them into transcripts, and classifying and filtering them against Gencode using pigeon. Read about pigeon at [3].

The classification.txt and junctions.txt files are the pigeon output, showing the per-isoform and per-junction-per-isoform classification results against the Gencode annotation. The GFF3 file shows the exonic structures of the transcript isoforms. The group.txt file is an intermediate file required for generating the Seurat-compatible matrix in the next step, and is kept here for those who wish to regenerate the matrices.

├── scisoseq_classification.filtered_lite_classification.txt 
├── scisoseq_classification.filtered_lite_junctions.txt 
└── scisoseq_transcripts.sorted.filtered_lite.gff
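
The per-isoform classification file can be tallied by category to get an overview of the filtered transcript set. The sketch below assumes the file is tab-delimited with a header row and a column named structural_category, as in SQANTI-style classification output. That column name is an assumption and should be checked against the actual header. The rows in the demo are made up for illustration.

```python
import csv
import io
from collections import Counter
from typing import TextIO


def count_by_category(handle: TextIO, column: str = "structural_category") -> Counter:
    """Count isoforms per value of `column` in a tab-delimited table."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Fail early if the assumed column name is not in the header.
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in header: {reader.fieldnames}")
    return Counter(row[column] for row in reader)


if __name__ == "__main__":
    # Toy table (hypothetical rows); for real data, open the file instead:
    # with open("scisoseq_classification.filtered_lite_classification.txt") as fh:
    #     print(count_by_category(fh))
    toy = io.StringIO(
        "isoform\tstructural_category\tassociated_gene\n"
        "PB.1.1\tfull-splice_match\tGENE1\n"
        "PB.1.2\tnovel_in_catalog\tGENE1\n"
        "PB.2.1\tfull-splice_match\tGENE2\n"
    )
    for category, n in count_by_category(toy).most_common():
        print(category, n)
```

Passing a different column name (for example a gene column) gives the number of isoforms per gene in the same way.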


This directory contains the gene- and isoform-level count matrices, which are compatible with common tertiary analysis tools such as Seurat.

The NoNovelGenesIsoforms/ subdirectory contains only known genes (for genes_seurat/) and known+novel isoforms from known genes (for isoforms_seurat/). Ribosomal and mitochondrial genes are excluded.

The WithNovelGenesIsoforms/ subdirectory contains both known and novel genes. Ribosomal and mitochondrial genes are excluded.

├── NoNovelGenesIsoforms
│   ├──
│   ├── genes_seurat
│   │   ├── barcodes.tsv
│   │   ├── genes.tsv
│   │   └── matrix.mtx
│   └── isoforms_seurat
│       ├── barcodes.tsv
│       ├── genes.tsv
│       └── matrix.mtx
└── WithNovelGenesIsoforms
    ├── genes_seurat
    │   ├── barcodes.tsv
    │   ├── genes.tsv
    │   └── matrix.mtx
    └── isoforms_seurat
        ├── barcodes.tsv
        ├── genes.tsv
        └── matrix.mtx
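
For those who want to inspect a matrix outside of Seurat, the three files in each *_seurat directory can be read with the Python standard library. This is a minimal sketch, assuming matrix.mtx is in Matrix Market coordinate format with features (genes.tsv) as rows and cell barcodes (barcodes.tsv) as columns, which is the usual 10x-style convention. The function names and the toy files in the demo are illustrative only.

```python
import os
import tempfile
from typing import Dict, List, Tuple


def read_lines(path: str) -> List[str]:
    """Return the non-empty lines of a text file, without newlines."""
    with open(path) as fh:
        return [line.rstrip("\n") for line in fh if line.strip()]


def load_count_matrix(directory: str):
    """Load barcodes, features and sparse counts keyed by (row, col)."""
    barcodes = read_lines(os.path.join(directory, "barcodes.tsv"))
    features = [l.split("\t") for l in read_lines(os.path.join(directory, "genes.tsv"))]
    counts: Dict[Tuple[int, int], float] = {}
    with open(os.path.join(directory, "matrix.mtx")) as fh:
        if not fh.readline().startswith("%%MatrixMarket matrix coordinate"):
            raise ValueError("not a Matrix Market coordinate file")
        line = fh.readline()
        while line.startswith("%"):  # skip comment lines
            line = fh.readline()
        n_rows, n_cols, _n_entries = (int(x) for x in line.split())
        # Assumption: rows are features and columns are barcodes.
        if n_rows != len(features) or n_cols != len(barcodes):
            raise ValueError("matrix dimensions do not match genes/barcodes")
        for line in fh:
            if line.strip():
                row, col, value = line.split()
                # Matrix Market indices are 1-based; convert to 0-based.
                counts[(int(row) - 1, int(col) - 1)] = float(value)
    return barcodes, features, counts


def counts_per_barcode(barcodes: List[str], counts) -> Dict[str, float]:
    """Sum the counts over all features for each cell barcode."""
    totals = {bc: 0.0 for bc in barcodes}
    for (_, col), value in counts.items():
        totals[barcodes[col]] += value
    return totals


if __name__ == "__main__":
    # Toy directory (hypothetical content); for real data, pass e.g.
    # "NoNovelGenesIsoforms/genes_seurat" to load_count_matrix().
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "barcodes.tsv"), "w") as fh:
            fh.write("AAAC\nTTTG\n")
        with open(os.path.join(tmp, "genes.tsv"), "w") as fh:
            fh.write("G1\tGeneA\nG2\tGeneB\n")
        with open(os.path.join(tmp, "matrix.mtx"), "w") as fh:
            fh.write("%%MatrixMarket matrix coordinate integer general\n2 2 2\n1 1 3\n2 2 4\n")
        barcodes, features, counts = load_count_matrix(tmp)
        print(counts_per_barcode(barcodes, counts))
```

The dictionary of (row, column) entries keeps the example dependency-free. For full-size matrices a sparse-matrix library is likely a better fit.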


[1] Procedure & Checklist - Preparing MAS-Seq libraries using MAS-Seq for 10x Single Cell 3’ kit

[2] SMRT Link v11.1 User Guide



Research use only. Not for use in diagnostic procedures. © 2022 Pacific Biosciences of California, Inc. (“PacBio”). All rights reserved. The data provided in these files and the information in this document are subject to change without notice. PacBio assumes no responsibility for any errors or omissions in the files or this document. Certain notices, terms, conditions and/or use restrictions may pertain to your use of PacBio products and/or third-party products. Refer to the applicable PacBio terms and conditions of sale and to the applicable license terms.

Pacific Biosciences, the PacBio logo, PacBio, Circulomics, Omniome, SMRT, SMRTbell, Iso-Seq, Sequel, Nanobind, and SBB are trademarks of PacBio. All other trademarks are the sole property of their respective owners.