Index of /public/dataset/redwood2020

Name                          Last modified      Size
README.txt                    2020-04-08 20:21   2.2K
redwood_a_ctg.fasta.gz        2020-03-02 10:33   767M
redwood_a_ctg_33fold.fa.gz    2020-04-08 20:07   745M
redwood_p_ctg.fasta.gz        2020-03-02 10:40    13G
redwood_p_ctg_33fold.fa.gz    2020-04-08 20:09    14G
redwood_p_utg.fasta.gz        2020-03-02 10:48    14G
redwood_p_utg_33fold.fa.gz    2020-04-08 20:11    14G
redwood_r_utg.fasta.gz        2020-03-02 10:55    15G

The Sequoia sempervirens genome was sequenced and assembled at PacBio in
February 2020 and provided as a gift to the community. A 24 kb PacBio HiFi
library was prepared and sequenced, and the raw HiFi data were deposited
in NCBI under BioProject PRJNA606797.
https://www.ncbi.nlm.nih.gov/bioproject/PRJNA606797

HiFi reads used to generate the assembly can be found in the SRA:
https://trace.ncbi.nlm.nih.gov/Traces/sra/?study=SRP251156

The raw data were assembled in just under 6 days using Hifiasm on a
computer with 64 cores and 512 GB of RAM.

This directory contains the fasta files generated from the assembly 
graphs output by Hifiasm:

# 22-fold Assembly:

redwood_p_ctg.fasta.gz - Primary contigs
redwood_a_ctg.fasta.gz - Alternate contigs
redwood_p_utg.fasta.gz - Haplotype-resolved unitigs
redwood_r_utg.fasta.gz - Haplotype-resolved raw unitigs
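
The files above are large gzipped FASTA files, so it can help to stream
them instead of loading them into memory. The following is a minimal
sketch, not part of the original assembly workflow, that yields the name
and length of each record. The function name contig_lengths is
hypothetical, and the code assumes standard FASTA layout (a ">" header
line followed by one or more sequence lines).

```python
import gzip


def contig_lengths(path):
    """Yield (name, length) for each record in a gzipped FASTA file.

    Illustrative helper only; assumes '>' header lines followed by
    sequence lines, as in standard FASTA.
    """
    name, length = None, 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # Emit the previous record before starting a new one.
                if name is not None:
                    yield name, length
                # Use the first whitespace-delimited token as the name.
                parts = line[1:].split()
                name, length = (parts[0] if parts else ""), 0
            else:
                length += len(line)
    # Emit the final record, if any.
    if name is not None:
        yield name, length
```

For example, list(contig_lengths("redwood_a_ctg.fasta.gz")) would collect
every alternate contig length from the smallest file in this directory.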

Assembly stats - 22-fold dataset:
                   p_ctg         a_ctg          p_utg          r_utg
 contigs          46,991        15,632         63,719        148,757
 max          16,853,612     5,155,247     16,190,392      9,078,036
 e-size        2,398,063       938,485      2,142,272      1,268,402
 n50           1,924,117       626,829      1,711,235      1,011,530
 n90             523,527        46,566        420,917        208,553
 n95             327,215        36,998        246,219         78,330
 total_bp 47,745,475,711 2,704,558,856 50,330,097,415 52,229,135,952
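
The rows of the table above can be recomputed from a list of contig
lengths. The sketch below is illustrative and is not the tool used to
produce this README. It assumes the common definitions: Nx is the length
of the contig at which the running total of lengths, sorted longest
first, reaches x% of total_bp, and e-size is the sum of squared lengths
divided by total_bp. Whether the e-size reported here uses exactly that
formula is an assumption. The function names nx, e_size and summarize
are hypothetical.

```python
def nx(lengths, percent):
    """Return the Nx value (e.g. percent=50 gives N50) for contig lengths."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return 0  # empty input


def e_size(lengths):
    """Return sum(L^2) / sum(L), the assumed definition of e-size."""
    total = sum(lengths)
    return sum(length * length for length in lengths) / total if total else 0.0


def summarize(lengths):
    """Collect the same rows as the assembly stats table."""
    lengths = list(lengths)
    return {
        "contigs": len(lengths),
        "max": max(lengths, default=0),
        "e-size": e_size(lengths),
        "n50": nx(lengths, 50),
        "n90": nx(lengths, 90),
        "n95": nx(lengths, 95),
        "total_bp": sum(lengths),
    }
```

Passing the contig lengths of one FASTA file to summarize() gives one
column of the table; small differences from the reported values are
possible if the original tool defines a statistic differently.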

# 33-fold Assembly:

redwood_p_ctg_33fold.fa.gz - Primary contigs
redwood_a_ctg_33fold.fa.gz - Alternate contigs
redwood_p_utg_33fold.fa.gz - Haplotype-resolved unitigs

Assembly stats - 33-fold dataset:
                   p_ctg         a_ctg          p_utg          r_utg
 contigs          28,209        22,242         43,495         97,178
 max          32,786,047     6,132,762     25,744,027     14,449,578
 e-size        4,670,661       910,652      3,764,267      2,231,694
 n50           3,757,823       609,054      3,031,506      1,765,225
 n90           1,037,637        31,534        714,935        380,301
 n95             629,349        26,348        393,729        157,091
 total_bp 48,470,328,197 2,612,008,845 50,732,322,698 52,040,435,767


More information about Hifiasm: https://github.com/chhylp123/hifiasm