Index of /public/dataset/SarsCov2-Eden-ATCC
Name Last modified Size Description
subsample_20/ 2020-05-04 11:12 -
subsample_100/ 2020-05-04 11:12 -
subsample_1000/ 2020-05-04 11:11 -
README.txt 2020-05-09 07:54 3.9K
eden.primers.plus_M13constant.fasta 2020-05-04 11:25 1.6K
eden.primers.fasta 2020-05-04 11:25 1.2K
run_juliet_per_sample.sh 2020-05-04 11:17 1.5K
sarscov2.json 2020-04-09 06:27 31K
NC_045512.2.fasta 2020-03-24 09:27 30K
README (Last Updated 05/09/2020)
********************
INTRODUCTION
********************
This README file describes the contents of this directory.
This dataset contains processed data from SARS-CoV-2 sequencing on PacBio
systems [1] using the Eden primer set [2] on ATCC full-length controls [3].
Bioinformatics processing is described in the CoSA tutorial [4], using the
2020-05-01 version of the workflow.
For issues or questions regarding this dataset,
file a "bug" at https://github.com/Magdoll/CoSA/issues.
********************
SAMPLE
********************
ATCC VR-1986D Lot# 70034826
(https://www.atcc.org/en/Global/Products/VR-1986D.aspx)
********************
METHODS
********************
Library Preparation & Sequencing:
The library was constructed using the SMRTbell Express Template Prep Kit 2.0.
Sequencing was done on one SMRT Cell 8M on the Sequel II system for 15 hr
with a 0.6 hr pre-extension time, using the Sequel II Binding Kit 2.0.
Analysis:
Detailed bioinformatics processing is described in the CoSA tutorial [4],
using the 2020-05-01 version of the workflow. Briefly, CCS reads were
generated using SMRT Link, then demultiplexed by M13 barcode. A second round
of demux (using lima) was performed to identify the Eden primers, allowing
only adjacent pairs (e.g., A3F--A3R) and filtering out invalid pairs
(e.g., A1F--A3R).
The demuxed, trimmed, and filtered CCS reads were then pooled together and
downsampled to 1000, 100, and 20 reads per amplicon using the CoSA script
`subsample_amplicons.py`.
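The pair filtering and per-amplicon downsampling described above can be
sketched as follows. This is only an illustration, not the code in
`subsample_amplicons.py` or lima. It assumes that a valid pair is the forward
and reverse primer of the same amplicon (as in the A3F--A3R example), that
primer names end in "F" or "R", and that each read arrives as a
(read_id, primer_pair) tuple. The function names and the regular expression
are hypothetical.

```python
import random
import re
from collections import defaultdict

# Assumed primer-name pattern: amplicon ID followed by F or R (e.g. "A3F").
PRIMER_RE = re.compile(r"^(?P<amplicon>.+)(?P<strand>[FR])$")


def amplicon_of_pair(pair):
    """Return the amplicon ID for a valid pair such as "A3F--A3R", else None."""
    parts = pair.split("--")
    if len(parts) != 2:
        return None
    left, right = (PRIMER_RE.match(p) for p in parts)
    if left is None or right is None:
        return None
    same_amplicon = left["amplicon"] == right["amplicon"]
    opposite_ends = {left["strand"], right["strand"]} == {"F", "R"}
    return left["amplicon"] if same_amplicon and opposite_ends else None


def subsample_per_amplicon(reads, max_reads, seed=0):
    """Keep at most max_reads randomly chosen reads for each amplicon.

    reads is an iterable of (read_id, primer_pair) tuples.
    """
    by_amplicon = defaultdict(list)
    for read_id, pair in reads:
        amplicon = amplicon_of_pair(pair)
        if amplicon is not None:  # drops invalid pairs such as "A1F--A3R"
            by_amplicon[amplicon].append(read_id)

    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return {
        amplicon: sorted(rng.sample(ids, min(max_reads, len(ids))))
        for amplicon, ids in by_amplicon.items()
    }
```

Calling `subsample_per_amplicon(reads, 20)` on such tuples would return a
dictionary of amplicon ID to at most 20 read IDs. Amplicons with fewer reads
than the limit keep all of their reads.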
Reads were mapped to the reference genome using pbmm2 (a minimap2 wrapper),
and variants were called using juliet (MinorSeq) with a --min-perc 10
frequency cutoff.
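To illustrate what a 10 percent frequency cutoff means at different subsample
depths, here is a minimal sketch. It does not reproduce juliet's variant
calling, which is more involved. It only applies a percentage threshold to the
bases observed at a single position. The function name is hypothetical, and
treating the cutoff as inclusive (at or above 10 percent) is an assumption.

```python
from collections import Counter


def variants_above_cutoff(column, ref_base, min_perc=10.0):
    """Return {base: percent} for non-reference bases at or above min_perc.

    column is the list of bases observed at one reference position.
    The inclusive comparison (>=) is an assumption of this sketch.
    """
    depth = len(column)
    if depth == 0:
        return {}
    counts = Counter(column)
    return {
        base: 100.0 * n / depth
        for base, n in counts.items()
        if base != ref_base and 100.0 * n / depth >= min_perc
    }
```

Under these assumptions, at a depth of 20 reads a variant needs at least 2
supporting reads to reach the cutoff, while at 1000 reads it needs at least
100.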
Analysis tool versions:
ccs v5.0.0 (using SMRT Link v9.1.0.94448)
lima v1.11.0
pbmm2 v1.2.1
juliet v1.12.0
********************
FILE DESCRIPTION
********************
NC_045512.2.fasta - the reference genome FASTA file. Note that the sequence ID is
"NC_045512v2", to be consistent with the UCSC Genome Browser convention.
eden.primers.fasta - the Eden primers
eden.primers.plus_M13constant.fasta - the Eden primers, with the M13 constant sequence added
sarscov2.json - the SARS-CoV-2 config file used by Juliet (MinorSeq) for variant calling
run_juliet_per_sample.sh - template command file for mapping and variant calling
subsampled.ccs.Q20.fastq - CCS (HiFi) amplicon reads. Barcodes and Eden primers have been trimmed.
subsampled.mapped.bam - mapping of "subsampled.ccs.Q20.fastq" to the reference genome.
subsampled.minperc10.juliet.* - variant calling output from Juliet (MinorSeq).
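A quick way to sanity-check the files described above is to read the sequence
ID from the reference FASTA and count the records in a FASTQ file. The sketch
below uses only the Python standard library. The function names are
hypothetical, and the FASTQ counter assumes the common four-lines-per-record
layout.

```python
def fasta_ids(path):
    """Return the sequence IDs (first word of each '>' header) in a FASTA file."""
    ids = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                ids.append(fields[0] if fields else "")
    return ids


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming four lines per record."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: line count {n_lines} is not a multiple of 4")
    return n_lines // 4
```

For example, `fasta_ids("NC_045512.2.fasta")` would be expected to include
"NC_045512v2", per the note on the reference ID above.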
********************
FILE LIST
********************
├── NC_045512.2.fasta
├── run_juliet_per_sample.sh
├── sarscov2.json
├── eden.primers.fasta
├── eden.primers.plus_M13constant.fasta
├── subsample_1000
│ ├── subsampled.ccs.Q20.fastq
│ ├── subsampled.mapped.bam
│ ├── subsampled.mapped.bam.bai
│ ├── subsampled.minperc10.juliet.html
│ ├── subsampled.minperc10.juliet.json
│ └── subsampled.minperc10.juliet.vcf
├── subsample_100
│ ├── subsampled.ccs.Q20.fastq
│ ├── subsampled.mapped.bam
│ ├── subsampled.mapped.bam.bai
│ ├── subsampled.minperc10.juliet.html
│ ├── subsampled.minperc10.juliet.json
│ └── subsampled.minperc10.juliet.vcf
└── subsample_20
  ├── subsampled.ccs.Q20.fastq
  ├── subsampled.mapped.bam
  ├── subsampled.mapped.bam.bai
  ├── subsampled.minperc10.juliet.html
  ├── subsampled.minperc10.juliet.json
  └── subsampled.minperc10.juliet.vcf
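After downloading, the layout above can be checked with a short script. This
is a minimal sketch: the file names are taken from the list above, while the
function name and the idea of passing a local download directory are
assumptions.

```python
from pathlib import Path

TOP_LEVEL = [
    "NC_045512.2.fasta",
    "run_juliet_per_sample.sh",
    "sarscov2.json",
    "eden.primers.fasta",
    "eden.primers.plus_M13constant.fasta",
]
SUBSAMPLE_DIRS = ["subsample_1000", "subsample_100", "subsample_20"]
PER_SUBSAMPLE = [
    "subsampled.ccs.Q20.fastq",
    "subsampled.mapped.bam",
    "subsampled.mapped.bam.bai",
    "subsampled.minperc10.juliet.html",
    "subsampled.minperc10.juliet.json",
    "subsampled.minperc10.juliet.vcf",
]


def missing_files(root):
    """Return the expected files (relative to root) that are not present."""
    root = Path(root)
    expected = [root / name for name in TOP_LEVEL]
    expected += [root / d / name for d in SUBSAMPLE_DIRS for name in PER_SUBSAMPLE]
    return [str(p.relative_to(root)) for p in expected if not p.is_file()]
```

An empty list from `missing_files(...)` means every file named in the list
above was found under the given directory.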
********************
REFERENCES
********************
[1] https://www.pacb.com/covid-19
[2] https://www.pacb.com/wp-content/uploads/Customer-Collaboration-PacBio-Compatible-Eden-Protocol-for-SARS-CoV-2-Sequencing.pdf
[3] https://www.atcc.org/en/Global/Products/VR-1986HK.aspx
[4] https://github.com/Magdoll/CoSA