Index of /public/onso/2023Q3/WGS/hg002_30x_WGS

| Name | Last modified | Size |
|---|---|---|
| bam_trimmed/ | 2023-09-22 15:11 | - |
| deepvariant_v1.4_SBS/ | 2023-09-22 14:24 | - |
| md5sums.txt | 2023-09-18 16:16 | 312 |
| README.txt | 2023-09-21 11:04 | 1.6K |
| Onso_hg002_PCR_free_WGS_OSQ_R1.fastq.gz | 2023-09-18 16:50 | 45G |
| Onso_hg002_PCR_free_WGS_OSQ_trimmed_R1.fastq.gz | 2023-09-18 17:44 | 45G |
| Onso_hg002_PCR_free_WGS_OSQ_R2.fastq.gz | 2023-09-18 17:18 | 46G |
| Onso_hg002_PCR_free_WGS_OSQ_trimmed_R2.fastq.gz | 2023-09-18 18:12 | 46G |

# PacBio Onso Whole Genome Sequencing of HG002
## Legal disclaimer
All trademarks, trade names, or logos mentioned or used are the property of their respective owners.
## Data
The Onso_hg002_PCR_free_WGS_OSQ folder contains untrimmed and adapter-trimmed reads sequenced on a PacBio Onso instrument in San Diego, CA. The libraries were sequenced with paired-end 2x150 bp sequencing chemistry.
Below is a brief description of the files:
Onso_hg002_PCR_free_WGS_OSQ_R1.fastq.gz - read1 fastq containing untrimmed reads
Onso_hg002_PCR_free_WGS_OSQ_R2.fastq.gz - read2 fastq containing untrimmed reads
Onso_hg002_PCR_free_WGS_OSQ_trimmed_R1.fastq.gz - read1 fastq containing adapter-trimmed reads
Onso_hg002_PCR_free_WGS_OSQ_trimmed_R2.fastq.gz - read2 fastq containing adapter-trimmed reads
md5sums.txt - md5 checksums of the fastq files
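
After downloading, the fastq files can be checked against md5sums.txt. The sketch below assumes that md5sums.txt is in the standard `md5sum` output format (checksum, two spaces, file name) and that it sits in the same directory as the fastqs; neither is stated in this README. `verify_md5` is a hypothetical helper name.

```shell
# Check every file listed in md5sums.txt inside the given directory.
# Assumption: md5sums.txt uses the standard `md5sum` output format.
verify_md5() {
    # $1 = directory holding the fastqs and md5sums.txt (defaults to ".")
    dir="${1:-.}"
    # Run in a subshell so the caller's working directory is unchanged.
    # md5sum -c exits non-zero if any file is missing or does not match.
    ( cd "$dir" && md5sum -c md5sums.txt )
}
```

Calling `verify_md5` with the download directory prints one `OK` or `FAILED` line per file, and the exit status can be used to stop a pipeline before trimming or alignment starts.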
### Adapter Trimming
The trimmed fastqs were produced with the cutadapt application (https://cutadapt.readthedocs.io/en/stable/) using the following command:
```
cutadapt \
-a ATCGATTCGTGCTTGTCCGTGGTACTCGGCA \
-A ATCGATTCGTGCTCGATGAACCGGGCGCTTA \
--overlap 8 \
-j 10 \
-o {output.fastq1} \
-p {output.fastq2} \
{input.fastq1} \
{input.fastq2}
```
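
The `{input.*}` and `{output.*}` placeholders above are left for the reader to fill in. As an illustration only, the sketch below substitutes the file names listed in the Data section; the helper name `trim_adapters` and its two directory arguments are hypothetical and not part of the original workflow.

```shell
# Re-run the trimming step on the untrimmed fastqs from this release.
trim_adapters() {
    # $1 = directory with the untrimmed fastqs, $2 = output directory
    in_dir="$1"
    out_dir="$2"
    prefix="Onso_hg002_PCR_free_WGS_OSQ"
    # -a / -A : 3' adapter sequences for read 1 / read 2
    # --overlap 8 : require at least 8 bases of overlap with the adapter
    # -j 10 : use 10 CPU cores
    cutadapt \
        -a ATCGATTCGTGCTTGTCCGTGGTACTCGGCA \
        -A ATCGATTCGTGCTCGATGAACCGGGCGCTTA \
        --overlap 8 \
        -j 10 \
        -o "$out_dir/${prefix}_trimmed_R1.fastq.gz" \
        -p "$out_dir/${prefix}_trimmed_R2.fastq.gz" \
        "$in_dir/${prefix}_R1.fastq.gz" \
        "$in_dir/${prefix}_R2.fastq.gz"
}
```

Writing to a separate output directory avoids overwriting the trimmed fastqs that ship with this release, which makes it possible to compare the two sets afterwards.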
### Alignment
Reads can be aligned to a reference fasta (e.g. hg38 without alt contigs) using bwa mem; the resulting alignments are then sorted and indexed with samtools.
```
bwa mem -t24 -R {RG_TAG} {REFERENCE_FASTA} {input.fastq1} {input.fastq2} | \
samtools sort -@4 -o {output.bam}
samtools index {output.bam}
```
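
The command above leaves `{RG_TAG}` and the file paths as placeholders. The sketch below wraps the same steps in a function with one possible read group string; the `ID` and `SM` values and the name `align_reads` are hypothetical, chosen only for illustration. Note that bwa mem expects the reference to have been indexed with `bwa index` beforehand.

```shell
# Align paired fastqs, then sort and index the resulting BAM.
align_reads() {
    # $1 = reference fasta (already indexed with `bwa index`)
    # $2 / $3 = read1 / read2 fastq, $4 = output BAM path
    ref="$1"
    r1="$2"
    r2="$3"
    out_bam="$4"
    # Hypothetical read group. The "\t" sequences are kept literal here;
    # bwa mem converts them to tabs in the SAM header.
    rg='@RG\tID:onso_hg002\tSM:HG002'
    # The trailing "-" tells samtools sort to read from standard input.
    bwa mem -t 24 -R "$rg" "$ref" "$r1" "$r2" \
        | samtools sort -@ 4 -o "$out_bam" - \
        && samtools index "$out_bam"
}
```

The index step runs only if the sort succeeds. In bash, adding `set -o pipefail` before the call would also surface a failure of bwa mem itself rather than only of the sort.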
*Rev 2023-09-15*