Index of /public/onso/2024Q1/WGS/NIST_HG_Trio

Name                Last modified      Size
Parent Directory                       -
hg003/              2024-02-28 14:43   -
hg002/              2024-02-21 18:37   -
hg004/              2024-02-21 18:24   -
README.txt          2024-02-29 08:43   2.7K

# PacBio Onso Whole Genome Sequencing of HG002, HG003, HG004 trio

## Legal disclaimer
All trademarks, trade names, or logos mentioned or used are the property of their respective owners.


## Sample provenance
The HG002, HG003, and HG004 samples were obtained from NIST (https://shop.nist.gov/ccrz__ProductDetails?sku=8392).


## Library prep
For PCR-free library preparation, genomic DNA from the HG002, HG003, and HG004 samples was enzymatically fragmented, end-repaired, and A-tailed using the Onso Fragmentation DNA Library Prep kit per the manufacturer's instructions.


## Data
The HG002 trio data set contains both full-length (untrimmed) and adapter-trimmed reads sequenced on a PacBio Onso instrument in San Diego, CA.
The libraries were sequenced with paired-end 2x150 bp chemistry to the following approximate depths:
HG002 - 50X
HG003 - 30X
HG003.2 (replicate) - 45X
HG004 - 45X
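
As a rough sanity check on these depths, the expected number of read pairs can be estimated as depth x genome size / bases per pair. The sketch below assumes a 3.1 Gb genome and that every pair contributes the full 2x150 bases; both values are assumptions for illustration and are not stated for this data set, and `pairs_for_depth` is a hypothetical helper name. Adapter-trimmed reads will contribute somewhat fewer bases per pair.

```shell
# Estimate the number of read pairs implied by a target depth.
# Assumptions (not from the data set): 3.1 Gb genome, 300 bases per pair.
GENOME_SIZE=3100000000
BASES_PER_PAIR=300

pairs_for_depth() {
    # integer arithmetic: depth * genome size / bases per pair
    echo $(( $1 * GENOME_SIZE / BASES_PER_PAIR ))
}

for d in 30 45 50; do
    printf '%sX ~ %s read pairs\n' "$d" "$(pairs_for_depth "$d")"
done
```

Under these assumptions, 30X corresponds to roughly 310 million read pairs.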


Below is a brief description of the file contents:
HG002_Onso_R1.fastq.gz             -  read1 fastq containing untrimmed reads
HG002_Onso_R1_trimmed.fastq.gz     -  read1 fastq containing adapter trimmed reads
HG002_Onso_R2.fastq.gz             -  read2 fastq containing untrimmed reads
HG002_Onso_R2_trimmed.fastq.gz     -  read2 fastq containing adapter trimmed reads
HG003.2_Onso_R1_trimmed.fastq.gz   -  read1 fastq containing adapter trimmed reads
HG003.2_Onso_R2_trimmed.fastq.gz   -  read2 fastq containing adapter trimmed reads
HG003_Onso_R1.fastq.gz             -  read1 fastq containing untrimmed reads
HG003_Onso_R1_trimmed.fastq.gz     -  read1 fastq containing adapter trimmed reads
HG003_Onso_R2.fastq.gz             -  read2 fastq containing untrimmed reads
HG003_Onso_R2_trimmed.fastq.gz     -  read2 fastq containing adapter trimmed reads
HG004_Onso_R1.fastq.gz             -  read1 fastq containing untrimmed reads
HG004_Onso_R1_trimmed.fastq.gz     -  read1 fastq containing adapter trimmed reads
HG004_Onso_R2.fastq.gz             -  read2 fastq containing untrimmed reads
HG004_Onso_R2_trimmed.fastq.gz     -  read2 fastq containing adapter trimmed reads
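
After downloading, it may be worth confirming that each R1/R2 pair holds the same number of records. The following is a minimal sketch, assuming standard four-line FASTQ records and that `gzip` is available; `count_reads` and `check_pair` are hypothetical helper names, not part of any distributed tooling.

```shell
# Count FASTQ records in a gzip-compressed file.
# Assumes each record occupies exactly 4 lines.
count_reads() {
    lines=$(gzip -cd "$1" | wc -l)
    echo $(( lines / 4 ))
}

# Compare the record counts of an R1/R2 pair; returns non-zero on mismatch.
check_pair() {
    n1=$(count_reads "$1")
    n2=$(count_reads "$2")
    if [ "$n1" -eq "$n2" ]; then
        echo "OK: $n1 read pairs"
    else
        echo "MISMATCH: R1=$n1 R2=$n2" >&2
        return 1
    fi
}

# Example usage with file names from the list above:
# check_pair HG002_Onso_R1_trimmed.fastq.gz HG002_Onso_R2_trimmed.fastq.gz
```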


### Adapter Trimming
The adapter-trimmed fastqs were generated with cutadapt (https://cutadapt.readthedocs.io/en/stable/) using the following command:
```
cutadapt \
    -a ATCGATTCGTGCTTGTCCGTGGTACTCGGCA \
    -A ATCGATTCGTGCTCGATGAACCGGGCGCTTA \
    --overlap 8 \
    -j 10 \
    -o {output.fastq1} \
    -p {output.fastq2} \
    {input.fastq1} \
    {input.fastq2}
```
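
For illustration, the placeholders in the command above could be filled in per sample as sketched below. This assumes the published untrimmed files are the inputs and that outputs follow the `_trimmed` naming in the file list; `trim_sample` and the `DRY_RUN` variable are hypothetical names introduced here. By default the function only prints the command, so it can be inspected before anything is run.

```shell
# Build the cutadapt command for one sample, deriving file names from the
# naming scheme in the file list (assumption: untrimmed files are the inputs).
# With DRY_RUN=1 (the default here) the command is printed, not executed.
trim_sample() {
    sample=$1
    set -- cutadapt \
        -a ATCGATTCGTGCTTGTCCGTGGTACTCGGCA \
        -A ATCGATTCGTGCTCGATGAACCGGGCGCTTA \
        --overlap 8 \
        -j 10 \
        -o "${sample}_Onso_R1_trimmed.fastq.gz" \
        -p "${sample}_Onso_R2_trimmed.fastq.gz" \
        "${sample}_Onso_R1.fastq.gz" \
        "${sample}_Onso_R2.fastq.gz"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

trim_sample HG002
```

Setting `DRY_RUN=0` runs cutadapt, which must then be installed and on the `PATH`.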


### Alignment
Reads can be aligned to a reference fasta (e.g. hg38 without alt contigs) using bwa mem, then coordinate-sorted and indexed with samtools.
```
bwa mem -t24 -R {RG_TAG} {REFERENCE_FASTA} {input.fastq1} {input.fastq2} | \
    samtools sort -@4 -o {output.bam}
samtools index {output.bam}
```
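
The `{RG_TAG}` placeholder above is a complete read-group header line. A minimal sketch of how such a string might be built follows, assuming only the `ID` and `SM` fields are set and that both take the sample name; these field choices and the `make_rg_tag` helper are illustrative assumptions, not the values used for this data set. bwa mem converts the literal `\t` sequences to tabs, so the string should be quoted when passed to `-R`.

```shell
# Build a minimal read-group string for `bwa mem -R`.
# Assumption: ID and SM are both set to the sample name; add LB/PL/PU as needed.
make_rg_tag() {
    # Emit literal "\t" separators; bwa mem converts them to tabs.
    printf '@RG\\tID:%s\\tSM:%s\n' "$1" "$1"
}

RG_TAG=$(make_rg_tag HG002)
printf '%s\n' "$RG_TAG"

# Example usage (the reference path is a placeholder):
# bwa mem -t24 -R "$RG_TAG" reference.fasta \
#     HG002_Onso_R1_trimmed.fastq.gz HG002_Onso_R2_trimmed.fastq.gz | \
#     samtools sort -@4 -o HG002.bam
# samtools index HG002.bam
```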

 
*Rev 2024-02-27*