Index of /public/dataset/MAS-Seq

Name                    Last modified      Size
PLOT-scripts/           2022-11-02 16:02   -
REF-10x_barcodes/       2022-10-18 14:36   -
REF-10x_primers/        2022-08-29 10:43   -
REF-MAS_adapters/       2022-08-29 08:49   -
REF-pigeon_ref_sets/    2022-08-23 08:58   -
README.txt              2022-10-18 13:17   6.1K

README  (Last Updated 10/18/2022)

********************
INTRODUCTION
********************

This README file describes the contents in this directory.

The MAS-Seq libraries [1] were sequenced on the Sequel® IIe System and processed using SMRT® Link v11.1 [2] or the command-line (Bioconda) version of the same tools [3].

To learn more about the MAS-Seq method from PacBio for single-cell isoform sequencing, visit: https://www.pacb.com/products-and-services/applications/rna-sequencing/single-cell-rna-sequencing/



********************
SAMPLE
********************

PBMC 5k cells
Vendor – BioIVT
Lot No – HUMANPBMC-0107696
Storage Matrix – 50% HBSS / 50% FBS from LEUKOMAX Unit ACD; Fresh 
Sample Prep Matrix – PBS + 1.5% bovine serum albumin
Targeted Cell Input – 5000 cells


PBMC 10k cells
Vendor – BioIVT
Lot No – HUMANPBMC-0106775
Storage Matrix – CryoStor CS10 from LEUKOMAX Unit ACD; Cryopreserved
Sample Prep Matrix – PBS + 1.5% bovine serum albumin
Targeted Cell Input – 10000 cells

Both cDNA libraries were generated using the 10x Chromium Next GEM Single Cell 3’ kit (v3.1) with a 10x Chromium Next GEM Chip G on a 10x Chromium X system.


********************
METHODS
********************

Library Preparation: 

Procedure & Checklist - Preparing MAS-Seq libraries using MAS-Seq for 10x Single Cell 3’ kit

Sequencing: 

Sequel IIe system with Sequel II binding kit 3.2 and Sequel II sequencing kit 2.0 (4 rxn)


Run time: 
30 hr movie + 2 hr pre-extension + adaptive loading

Analysis: 

SMRT Link v11.1 Read segmentation and Single-cell Iso-Seq workflow

   
********************
FILE DESCRIPTION
********************

Each sample contains the following folders:

========================
0-CCS
========================

This directory contains HiFi reads that were produced either directly on-instrument or through CCS analysis in SMRT Link.

0-CCS/
|---- <movie>.hifi_reads.bam
|---- <movie>.hifi_reads.bam.pbi



========================
1-Sreads
========================

This directory contains reads that have been processed by Read segmentation (or skera [4]) to produce S-reads, which represent the original cDNA molecules. segmented.bam contains the S-reads that have the expected order of MAS adapters and is the file used in the subsequent analyses.

1-Sreads/
|---- segmented.bam
|---- segmented.non_passing.bam


========================
2-DeduplicatedReads
========================

This directory contains deduplicated reads that have been through barcode correction (using a barcode whitelist) and UMI deduplication. The deduplicated (dedup) reads are then used for the subsequent mapping and transcript analyses.


2-DeduplicatedReads/
├── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.bam 
├── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.bam.bai 
├── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.bam.pbi 
└── scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.fasta 
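
As a quick sanity check on the deduplicated FASTA file listed above, the following minimal Python sketch counts the records and reports their mean length. It assumes only the generic FASTA layout (a ">" header line followed by one or more sequence lines); the function names read_fasta and summarize_fasta are illustrative and are not part of any PacBio tool.

```python
from pathlib import Path
from typing import Iterator, Tuple


def read_fasta(path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a plain-text FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize_fasta(path: Path) -> Tuple[int, float]:
    """Return (number of records, mean sequence length)."""
    lengths = [len(seq) for _, seq in read_fasta(path)]
    mean = sum(lengths) / len(lengths) if lengths else 0.0
    return len(lengths), mean
```

For example, summarize_fasta(Path("2-DeduplicatedReads/scisoseq.5p--3p.tagged.refined.corrected.sorted.dedup.fasta")) would return the read count and mean read length for one sample, assuming the script is run from that sample's folder.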





========================
3-CollapsedTranscripts
========================

This directory contains the total set of unique transcripts obtained by mapping the dedup reads to the genome, collapsing them into transcripts, and then classifying and filtering the transcripts against Gencode using pigeon. For more on pigeon, see [3].

The classification.txt and junctions.txt files are the output of pigeon and show the per-isoform and per-junction-per-isoform classification results against the Gencode annotation. The GFF file shows the exonic structures of the transcript isoforms. The group.txt file is an intermediate file required for generating the Seurat-compatible matrices in the next step, and is kept here for those who wish to regenerate the matrices.


3-CollapsedTranscripts/
├── scisoseq_classification.filtered_lite_classification.txt 
├── scisoseq_classification.filtered_lite_junctions.txt 
├── scisoseq.mapped_transcripts.collapse.group.txt 
└── scisoseq_transcripts.sorted.filtered_lite.gff
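
The per-isoform classification file can be summarized with a few lines of Python. The sketch below assumes that the file is tab-delimited with a header row and that it has a column named structural_category, as in SQANTI-style classification tables; that column name is an assumption and should be checked against the actual header. The function name count_categories is illustrative.

```python
import csv
from collections import Counter
from pathlib import Path


def count_categories(path: Path, column: str = "structural_category") -> Counter:
    """Count isoforms per value of `column` in a tab-delimited table.

    The default column name is an assumption about the pigeon output;
    a KeyError is raised if the header does not contain it.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found in {path}")
        # Counter consumes the rows while the file is still open.
        return Counter(row[column] for row in reader)
```

For example, count_categories(Path("3-CollapsedTranscripts/scisoseq_classification.filtered_lite_classification.txt")).most_common() would list the categories from most to least frequent.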


========================
4-SeuratMatrix
========================

This directory contains the gene- and isoform-level count matrices, which are compatible with common tertiary analysis tools such as Seurat.

The NoNovelGenesIsoforms/ subdirectory contains only known genes (for genes_seurat/) and known+novel isoforms from known genes (for isoforms_seurat/). Ribo/mito genes are excluded.

The WithNovelGenesIsoforms/ subdirectory contains both known and novel genes. Ribo/mito genes are excluded.


4-SeuratMatrix/
├── NoNovelGenesIsoforms
│   ├── cmd.sh
│   ├── genes_seurat
│   │   ├── barcodes.tsv
│   │   ├── genes.tsv
│   │   └── matrix.mtx
│   └── isoforms_seurat
│       ├── barcodes.tsv
│       ├── genes.tsv
│       └── matrix.mtx
└── WithNovelGenesIsoforms
    ├── cmd.sh
    ├── genes_seurat
    │   ├── barcodes.tsv
    │   ├── genes.tsv
    │   └── matrix.mtx
    └── isoforms_seurat
        ├── barcodes.tsv
        ├── genes.tsv
        └── matrix.mtx
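
Each genes_seurat/ or isoforms_seurat/ folder can also be read without Seurat. The sketch below is a minimal pure-Python reader and makes several assumptions that should be verified against the actual files: matrix.mtx is in MatrixMarket coordinate format with 1-based indices, rows correspond to the lines of genes.tsv, and columns correspond to the lines of barcodes.tsv. The function names are illustrative. Lines of genes.tsv are kept whole, even if they hold more than one tab-separated field.

```python
from pathlib import Path
from typing import Dict, List, Tuple

Entries = Dict[Tuple[int, int], float]


def read_lines(path: Path) -> List[str]:
    """Return the non-empty lines of a text file, without newlines."""
    with open(path) as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


def read_mtx(path: Path) -> Tuple[Tuple[int, int], Entries]:
    """Parse a MatrixMarket coordinate file into ((rows, cols), entries).

    Entries are keyed by 0-based (row, col); '%' lines are comments.
    """
    shape = None
    entries: Entries = {}
    with open(path) as handle:
        for line in handle:
            if line.startswith("%") or not line.strip():
                continue
            fields = line.split()
            if shape is None:
                # First data line: number of rows, columns, stored entries.
                shape = (int(fields[0]), int(fields[1]))
                continue
            key = (int(fields[0]) - 1, int(fields[1]) - 1)
            entries[key] = entries.get(key, 0.0) + float(fields[2])
    if shape is None:
        raise ValueError(f"no size line found in {path}")
    return shape, entries


def load_matrix_dir(directory: Path) -> Tuple[List[str], List[str], Entries]:
    """Load genes.tsv, barcodes.tsv and matrix.mtx from one folder."""
    directory = Path(directory)
    features = read_lines(directory / "genes.tsv")
    barcodes = read_lines(directory / "barcodes.tsv")
    shape, entries = read_mtx(directory / "matrix.mtx")
    # Assumed orientation: rows are features, columns are barcodes.
    if shape != (len(features), len(barcodes)):
        raise ValueError(f"matrix shape {shape} does not match the tsv files")
    return features, barcodes, entries


def counts_per_barcode(barcodes: List[str], entries: Entries) -> Dict[str, float]:
    """Sum the counts in each column, i.e. per cell barcode."""
    totals = {barcode: 0.0 for barcode in barcodes}
    for (_, col), value in entries.items():
        totals[barcodes[col]] += value
    return totals
```

For example, load_matrix_dir(Path("4-SeuratMatrix/NoNovelGenesIsoforms/genes_seurat")) returns the feature list, the barcode list and the sparse counts for the known-gene matrix. If the shape check fails, the orientation assumption above is the first thing to revisit.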


********************
REFERENCES
********************

[1] Procedure & Checklist - Preparing MAS-Seq libraries using MAS-Seq for 10x Single Cell 3’ kit
https://www.pacb.com/wp-content/uploads/Procedure-checklist-preparing-MAS-Seq-libraries-using-MAS-Seq-for-10x-single-cell-3-kit.pdf

[2] SMRT Link v11.1 User Guide 
https://www.pacb.com/wp-content/uploads/SMRT_Link_User_Guide_v11.1.pdf

[3] isoseq.how
https://isoseq.how/

[4] skera.how
https://skera.how/






Research use only. Not for use in diagnostic procedures. © 2022 Pacific Biosciences of California, Inc. (“PacBio”). All rights reserved. The data provided in these files and the information in this document are subject to change without notice. PacBio assumes no responsibility for any errors or omissions in the files or this document. Certain notices, terms, conditions and/or use restrictions may pertain to your use of PacBio products and/or third-party products. Refer to the applicable PacBio terms and conditions of sale and to the applicable license terms at pacb.com/license. Pacific Biosciences, the PacBio logo, PacBio, Circulomics, Omniome, SMRT, SMRTbell, Iso-Seq, Sequel, Nanobind, and SBB are trademarks of PacBio. All other trademarks are the sole property of their respective owners.