Index of /public/dataset/Melanoma2019_IsoSeq

Name                          Last modified     Size
Parent Directory                                -
subreads/                     2019-06-12 12:46  -
PolishedMappedTranscripts/    2019-06-10 08:03  -
FullLengthReads/              2019-06-10 07:28  -
README.txt                    2019-06-12 16:12  5.5K
Galvin-AACR-2019-Full-length-transcriptome-sequencing-of-melanoma-cell-line-complements-long-read-assessment-of-genomic-rearrangements.pdf
                              2019-04-09 22:20  1.0M

README  (Last Updated 06/15/2019)

********************
INTRODUCTION
********************

   This README file describes the contents in this directory.

   This dataset contains raw, intermediate, and processed files for a Melanoma 
Cancer Cell Line Iso-Seq dataset. The library was sequenced on the Sequel System
and processed using SMRT Link 7.0, followed by analysis with community tools. For 
more information on Iso-Seq® methods and bioinformatics analysis, see the PacBio 
Iso-Seq GitHub (https://github.com/PacificBiosciences/IsoSeq_SA3nUP) and the 
additional references below.


********************
SAMPLE
********************
   
The COLO829T melanoma cell line (ATCC CRL-1974, http://www.lgcstandards-atcc.org/products/all/CRL-1974) 
and COLO829BL peripheral blood cell line (ATCC CRL-1980, http://www.lgcstandards-atcc.org/products/all/CRL-1980) 
were obtained from ATCC and cultured as recommended. 

********************
METHODS
********************

Library Preparation: 
Iso-Seq® Express Template Preparation for Sequel® and Sequel® II Systems (no part number yet)

Sequencing: 
Sequel System with Sequel Binding Kit 3.0 (101-500-400) and Sequel Sequencing Kit 3.0 (101-597-900)

Run time: 
20 hrs pre-extension, 4 movie time per SMRT Cell

Analysis: 
SMRT Link 7.0 "IsoSeq" protocol, followed by mapping to the hg38 reference genome and collapsing into a 
non-redundant transcript set. 
Post-mapping filtering was performed using SQANTI2 software (v2.7, https://github.com/Magdoll/SQANTI2).
   
********************
FILE DESCRIPTION
********************

========================
WHAT FILES SHOULD I USE? 
========================
Users wishing to make immediate use of the processed, mapped, and filtered results should use the 
following GFF files:

PolishedMappedTranscripts/after-SQANTI2filter/final_filtered.COLO829BL.gff
PolishedMappedTranscripts/after-SQANTI2filter/final_filtered.COLO829T.gff

We do not recommend that most users re-analyze from the raw (subreads.bam) or intermediate (FLNC) data.
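
As a starting point for working with the recommended GFF files, the following minimal Python 
sketch counts feature types and distinct gene and transcript IDs. It assumes the files use a 
GTF-like layout (nine tab-separated columns, the feature type in column 3, and gene_id and 
transcript_id attributes written as key "value"; in column 9). That layout is an assumption 
and is not stated in this README, so check it against the actual files. The function name 
summarize_gff is illustrative only.

```python
import re
from collections import Counter

# Assumed attribute syntax: key "value"; (GTF style). Verify against the real files.
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def summarize_gff(lines):
    """Count feature types and distinct gene/transcript IDs in GTF-like lines."""
    features = Counter()
    genes, transcripts = set(), set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # skip rows that do not match the assumed layout
        features[fields[2]] += 1
        attrs = dict(ATTR_RE.findall(fields[8]))
        if "gene_id" in attrs:
            genes.add(attrs["gene_id"])
        if "transcript_id" in attrs:
            transcripts.add(attrs["transcript_id"])
    return {
        "features": dict(features),
        "genes": len(genes),
        "transcripts": len(transcripts),
    }
```

To use it, open one of the two files listed above and pass the file handle to 
summarize_gff, for example: with open(path) as handle: print(summarize_gff(handle)).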

========================
Raw Subreads
========================
The subreads/ folder contains two subfolders, one for each of the two cell lines (COLO829BL and COLO829T). 
Each subfolder contains three movie BAM files, each with its .pbi index file.

subreads/
|__ COLO829T
|   |__ m54026_190120_000756.subreads.bam 
|   |__ m54026_190120_000756.subreads.bam.pbi 
|   |__ m54119_190202_095143.subreads.bam 
|   |__ m54119_190202_095143.subreads.bam.pbi 
|   |__ m54119_190203_061153.subreads.bam 
|   |__ m54119_190203_061153.subreads.bam.pbi 
|__ COLO829BL
|   |__ m54019_190120_021709.subreads.bam 
|   |__ m54019_190120_021709.subreads.bam.pbi 
|   |__ m54119_190131_171128.subreads.bam 
|   |__ m54119_190131_171128.subreads.bam.pbi
|   |__ m54119_190201_133141.subreads.bam 
|   |__ m54119_190201_133141.subreads.bam.pbi 
|__ md5sum.txt
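
After downloading, the BAM files can be checked against md5sum.txt. The sketch below is one 
way to do this in Python. It assumes md5sum.txt uses the standard md5sum output layout (one 
"<checksum>  <relative path>" pair per line) and that the paths are relative to the directory 
holding md5sum.txt; neither is confirmed by this README. The function names are illustrative.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sum_lines(lines):
    """Parse lines in the assumed '<checksum>  <name>' layout into {name: checksum}."""
    expected = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        checksum, name = line.split(None, 1)
        # md5sum marks binary-mode entries with a leading '*' on the file name
        expected[name.lstrip("*")] = checksum.lower()
    return expected


def verify(md5_file, base_dir="."):
    """Return {name: True/False}; False means missing file or checksum mismatch."""
    with open(md5_file) as handle:
        expected = parse_md5sum_lines(handle)
    results = {}
    for name, checksum in expected.items():
        target = Path(base_dir) / name
        results[name] = target.exists() and md5_of_file(target) == checksum
    return results
```

For example, verify("subreads/md5sum.txt", "subreads") returns a dictionary that flags any 
file that is missing or whose checksum does not match.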



========================
Intermediate FLNC Reads
========================

The FullLengthReads/ directory contains the full-length, non-concatemer (FLNC) reads in both BAM and 
FASTQ formats. Note that the two cell lines are combined. To distinguish which reads come from which 
sample, use the movie name and match it against the movie names listed under subreads/. The FLNC 
reads have the sequence ID format of <movie>/<zmw>/ccs.


FullLengthReads/
|__ flnc.bam 
|__ flnc.fastq 
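
The movie-name lookup described above can be sketched in Python as follows. The movie-to-sample 
table is copied from the subreads/ listing in this README. The sketch assumes flnc.fastq uses 
plain four-line FASTQ records (no wrapped sequence lines), which is not stated here; the 
function names are illustrative.

```python
from collections import Counter

# Movie-to-sample mapping taken from the subreads/ listing in this README.
MOVIE_TO_SAMPLE = {
    "m54026_190120_000756": "COLO829T",
    "m54119_190202_095143": "COLO829T",
    "m54119_190203_061153": "COLO829T",
    "m54019_190120_021709": "COLO829BL",
    "m54119_190131_171128": "COLO829BL",
    "m54119_190201_133141": "COLO829BL",
}


def sample_for_read(read_id):
    """Map a read ID of the form <movie>/<zmw>/ccs to its sample name."""
    movie = read_id.lstrip("@").split("/", 1)[0]
    return MOVIE_TO_SAMPLE.get(movie, "unknown")


def count_reads_per_sample(fastq_lines):
    """Count FLNC reads per sample, assuming four lines per FASTQ record."""
    counts = Counter()
    for index, line in enumerate(fastq_lines):
        if index % 4 != 0:
            continue  # only the first line of each record holds the read ID
        header = line.strip()
        if not header:
            continue  # tolerate a trailing blank line
        counts[sample_for_read(header.split()[0])] += 1
    return counts
```

Passing an open handle on FullLengthReads/flnc.fastq to count_reads_per_sample gives the 
number of FLNC reads per cell line; reads from a movie not in the table are counted as "unknown".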


========================
Mapped and Filtered Transcripts
========================

The PolishedMappedTranscripts/ directory contains two subdirectories. The "before-SQANTI2filter" 
subdirectory contains the results of mapping the Iso-Seq output (full-length, high-quality isoform 
sequences) to the hg38 reference genome, then collapsing the result using Cupcake [3] scripts with 
cutoffs of 99% coverage and 95% identity. The collapsed results are split into the two samples based 
on the associated FLNC read count. The SQANTI2 results (_classification.txt, _junctions.txt, and 
_sqanti_report.pdf) are also included. The "after-SQANTI2filter" subdirectory contains the same 
files, but after running the SQANTI2 filtering script to remove library artifacts. For more 
information on the SQANTI2 filtering, see [4].


PolishedMappedTranscripts/
|__ after-SQANTI2filter
|   |__ final_filtered_classification.txt 
|   |__ final_filtered.collapsed.gff 
|   |__ final_filtered.COLO829BL.gff 
|   |__ final_filtered.COLO829T.gff 
|   |__ final_filtered.fasta 
|   |__ final_filtered_junctions.txt 
|   |__ final_filtered.mapped_fl_count.txt 
|   |__ final_filtered_sqanti_report.pdf 
|__ before-SQANTI2filter
    |__ final.COLO829BL.gff 
    |__ final.COLO829T.gff 
    |__ final.gff 
    |__ final.mapped_fl_count.txt
    |__ final.rep_classification.txt 
    |__ final.rep.fq 
    |__ final.rep_junctions.txt 
    |__ final.rep_sqanti_report.pdf 
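
To get a quick overview of a SQANTI2 classification file, the following Python sketch tallies 
transcripts per structural category. It assumes the file is tab-delimited with a header row 
that includes a structural_category column, as in SQANTI2 classification output; check the 
header of the actual file before relying on it. The function name tally_categories is 
illustrative.

```python
import csv
from collections import Counter


def tally_categories(handle, column="structural_category"):
    """Count rows per value of `column` in a tab-delimited file with a header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames is None or column not in reader.fieldnames:
        # Fail loudly if the assumed column is absent instead of guessing
        raise KeyError(f"column {column!r} not found; check the file header")
    return Counter(row[column] for row in reader)
```

For example, opening after-SQANTI2filter/final_filtered_classification.txt with 
open(path, newline="") and passing the handle to tally_categories returns the count of 
isoforms in each category.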

  
********************
REFERENCES
********************

[1] PacBio Iso-Seq Landing Page: https://www.pacb.com/applications/rna-sequencing/
[2] PacBio Iso-Seq GitHub Wiki: https://github.com/PacificBiosciences/IsoSeq_SA3nUP
[3] Community Tool Cupcake: https://github.com/Magdoll/cDNA_Cupcake
[4] Community Tool SQANTI2: https://github.com/Magdoll/SQANTI2/


For Research Use Only. Not for use in diagnostic procedures.  Copyright 2019, Pacific Biosciences of California, Inc. 
All rights reserved. The data provided in these files is subject to change without notice and Pacific Biosciences 
assumes no responsibility for any errors or omissions. Certain notices, terms, conditions and/or use restrictions 
may pertain to your use of Pacific Biosciences data, products and/or third party products. Please refer to the 
applicable Pacific Biosciences Terms and Conditions of Sale and to the applicable license terms at 
http://www.pacificbiosciences.com/licenses.html.