# PacBio Onso Whole Genome Sequencing of HG002

## Legal disclaimer

All trademarks, trade names, or logos mentioned or used are the property of their respective owners.

## Data

The Onso_hg002_PCR_free_WGS_OSQ folder contains untrimmed and adapter-trimmed reads sequenced on a PacBio Onso instrument in San Diego, CA. The libraries were sequenced with paired-end 2x150bp sequencing chemistry. Below is a brief description of the files:

- Onso_hg002_PCR_free_WGS_OSQ_R1.fastq.gz - read1 fastq containing untrimmed reads
- Onso_hg002_PCR_free_WGS_OSQ_R2.fastq.gz - read2 fastq containing untrimmed reads
- Onso_hg002_PCR_free_WGS_OSQ_trimmed_R1.fastq.gz - read1 fastq containing adapter-trimmed reads
- Onso_hg002_PCR_free_WGS_OSQ_trimmed_R2.fastq.gz - read2 fastq containing adapter-trimmed reads
- md5sums.txt - md5 checksums of the fastq files

### Adapter Trimming

The trimmed fastqs were produced from the untrimmed fastqs with cutadapt (https://cutadapt.readthedocs.io/en/stable/) using the following command:

```
cutadapt \
    -a ATCGATTCGTGCTTGTCCGTGGTACTCGGCA \
    -A ATCGATTCGTGCTCGATGAACCGGGCGCTTA \
    --overlap 8 \
    -j 10 \
    -o {output.fastq1} \
    -p {output.fastq2} \
    {input.fastq1} \
    {input.fastq2}
```

### Alignment

Reads can be aligned to a reference fasta (e.g. hg38 without alt contigs) using bwa mem. The alignments are then coordinate-sorted and indexed with samtools.

```
bwa mem -t24 -R {RG_TAG} {REFERENCE_FASTA} {input.fastq1} {input.fastq2} | \
    samtools sort -@4 -o {output.bam}
samtools index {output.bam}
```

*Rev 2023-09-15*
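
### Checksum verification

The md5sums.txt file listed under Data can be used to confirm that the fastq files downloaded intact. The snippet below is a minimal sketch, not part of the original workflow. It assumes that md5sums.txt uses the standard `md5sum` output format (`<hash>  <filename>`), that it sits in the same directory as the fastq files, and that GNU coreutils `md5sum` is installed. The function name `verify_md5` is hypothetical.

```shell
# Hypothetical helper: check every file listed in md5sums.txt.
# Assumes md5sums.txt is in standard `md5sum` format and lives
# alongside the fastq files in the given directory (default: cwd).
verify_md5() {
    dir="${1:-.}"
    # Run in a subshell so the caller's working directory is unchanged.
    # md5sum -c prints "<file>: OK" or "<file>: FAILED" per entry and
    # exits non-zero if any file fails the check.
    (cd "$dir" && md5sum -c md5sums.txt)
}
```

For example, `verify_md5 Onso_hg002_PCR_free_WGS_OSQ` would check the files in that folder, provided the assumptions above hold.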