Index of /public/dataset/Onso/Zymo_wastewater

 Name                    Last modified      Size  Description
 Parent Directory                             -   
 checksums/              2024-06-25 17:15    -   
 fastqs/                 2024-06-25 17:15    -   
 README.txt              2024-06-25 17:16  1.7K

# PacBio Onso Sequencing of Wastewater Samples for Zymo Research

## Legal disclaimer

All trademarks, trade names, or logos mentioned or used are the property of their respective owners.

## Sample / library prep

HMW gDNA was extracted from three individual wastewater samples using the Zymo Research Quick DNA/RNA Water Kit per the manufacturer's instructions, then processed with PacBio's Onso Fragmentation DNA Library Prep Kit to generate libraries suitable for short-read shotgun metagenomic sequencing.

## Paired-end sequence data

Three individual wastewater samples were combined into three sample pools and sequenced on a PacBio Onso system in paired-end mode (PE 2x150) using one lane of one flow cell.

Because this was a paired-end run, there are two FASTQ files per sample.

Each file name contains the sample name, followed by the lane number (L01), then R1 or R2 to indicate Read 1 or Read 2, respectively.
For example, "Raw_influent_L01_R2_Sample_Library.fastq.gz" refers to sample "Raw influent", lane 1, Read 2.
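
The naming convention above can be unpacked with plain shell parameter expansion. The following is a minimal sketch, not part of the original dataset tooling: the function name `parse_fastq_name` is hypothetical, and the pattern `<sample>_L<lane>_R<1|2>_Sample_Library.fastq.gz` is an assumption inferred from the single example file name given here.

```bash
# Split a FASTQ file name into sample, lane, and read fields.
# Assumed pattern (inferred from the README example):
#   <sample>_L<lane>_R<1|2>_Sample_Library.fastq.gz
parse_fastq_name() {
    pf_name=$(basename "$1")
    # Loose sanity check on the overall shape of the name.
    case "$pf_name" in
        *_L[0-9]*_R[12]_Sample_Library.fastq.gz) ;;
        *) echo "unrecognized file name: $pf_name" >&2; return 1 ;;
    esac
    pf_stem=${pf_name%_Sample_Library.fastq.gz}  # e.g. Raw_influent_L01_R2
    pf_read=${pf_stem##*_}                       # last field:  R2
    pf_stem=${pf_stem%_*}                        # e.g. Raw_influent_L01
    pf_lane=${pf_stem##*_}                       # last field:  L01
    pf_sample=${pf_stem%_*}                      # remainder:   Raw_influent
    printf '%s\t%s\t%s\n' "$pf_sample" "$pf_lane" "$pf_read"
}

# Prints three tab-separated fields: Raw_influent, L01, R2
parse_fastq_name "Raw_influent_L01_R2_Sample_Library.fastq.gz"
```

Splitting from the right keeps sample names that themselves contain underscores (such as `Raw_influent`) intact.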

## Data processing

These are raw reads straight off the instrument; no QC has been performed. Please refer to the next section for notes on adapter trimming.

### Trimming adapters

We did not trim adapters before posting this dataset, but we recommend doing so. Below is an example command using the adapters specific to this library:

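```bash
# The {input_filt.*} and {output.*} tokens are placeholders: replace them with
# the paths to your R1/R2 input files and the trimmed R1/R2 output files.
#   -a / -A : 3' adapter to remove from Read 1 / Read 2
#   -j 8    : use 8 CPU cores
#   -m 2    : discard reads shorter than 2 bases after trimming
#   -o / -p : trimmed output for Read 1 / Read 2
cutadapt \
    -a ATCGATTCGTGCTTGTCCGTGGTACTCGGCA \
    -A ATCGATTCGTGCTCGATGAACCGGGCGCTTA \
    -j 8 \
    -m 2 \
    -o {output.fastq1} \
    -p {output.fastq2} \
    {input_filt.fastq1} \
    {input_filt.fastq2}
```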

Adapter sequences to trim, for reference:

- Trim R1: ATCGATTCGTGCTTGTCCGTGGTACTCGGCA
- Trim R2: ATCGATTCGTGCTCGATGAACCGGGCGCTTA
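
To apply the same trimming to every read pair in a directory, something like the loop below may be useful. It is a rough sketch under stated assumptions rather than a tested pipeline: the `trim_pair` function, the `FASTQ_DIR`, `OUT_DIR`, and `DRY_RUN` variables, and the `_trimmed` output naming are all illustrative, and the R1 file name is assumed to mirror the R2 example given earlier. Only the adapter sequences and the cutadapt options come from the command above.

```bash
# Sketch: run cutadapt on every R1/R2 pair found in FASTQ_DIR.
# FASTQ_DIR, OUT_DIR, DRY_RUN, and the *_trimmed names are assumptions.
R1_ADAPTER=ATCGATTCGTGCTTGTCCGTGGTACTCGGCA
R2_ADAPTER=ATCGATTCGTGCTCGATGAACCGGGCGCTTA
FASTQ_DIR=${FASTQ_DIR:-fastqs}
OUT_DIR=${OUT_DIR:-trimmed}

trim_pair() {
    tp_r1=$1
    tp_out=$2
    # Derive the R2 path and output names from the R1 path.
    tp_prefix=${tp_r1%_R1_Sample_Library.fastq.gz}
    tp_r2=${tp_prefix}_R2_Sample_Library.fastq.gz
    tp_base=$(basename "$tp_prefix")
    # Create the output directory unless this is a dry run.
    [ -n "${DRY_RUN:-}" ] || mkdir -p "$tp_out"
    # With DRY_RUN set, the command is printed instead of executed.
    ${DRY_RUN:+echo} cutadapt \
        -a "$R1_ADAPTER" \
        -A "$R2_ADAPTER" \
        -j 8 \
        -m 2 \
        -o "$tp_out/${tp_base}_R1_trimmed.fastq.gz" \
        -p "$tp_out/${tp_base}_R2_trimmed.fastq.gz" \
        "$tp_r1" \
        "$tp_r2"
}

for r1 in "$FASTQ_DIR"/*_R1_Sample_Library.fastq.gz; do
    [ -e "$r1" ] || continue   # skip if the glob matched nothing
    trim_pair "$r1" "$OUT_DIR"
done
```

Setting `DRY_RUN=1` before running the loop prints each cutadapt command without executing it, which is a cheap way to check the derived file names first.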

*Rev 2024-06-25*