Read type classification

Definitions

Assigned Type Definition
scAAV Self-complementary AAV where one half of the payload region is a reverse complement of the other, resulting in an intra-molecular double-stranded DNA template. A sequencing read is inferred as scAAV if it has both a primary and supplementary alignment to the vector genome.
ssAAV AAV where the resulting DNA template is expected to be single-stranded, as opposed to self-complementary. A sequencing read is inferred as ssAAV if it has a single alignment to the AAV genome and no complementary secondary alignment.
other Read consists of a fragment mapping to the vector but with unexpected polarities (e.g. +,-,- or +,+,+), so it cannot currently be assigned to a well-defined type. The algorithm was not able to classify the read as either ssAAV or scAAV by the definitions above. Usually this means there are multiple supplementary alignments on the vector region and/or the molecule has an atypical structure.
host Read originates from the host genome that is given (e.g. hg38, CHM13).
repcap Read originates from the repcap plasmid. The Rep gene encodes four proteins (Rep78, Rep68, Rep52, and Rep40), which are required for viral genome replication and packaging, while Cap expression gives rise to the viral capsid proteins (VP; VP1/VP2/VP3), which form the outer capsid shell that protects the viral genome, as well as being actively involved in cell binding and internalization.
helper Read originates from the helper plasmid. In addition to Rep and Cap, AAV requires a helper plasmid containing genes from adenovirus. These genes (E4, E2a and VA) mediate AAV replication.
chimeric Read consists of fragments that map to one or more “genomes” (e.g. vector and host; helper and repcap).
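
The definitions above can be read as a small decision procedure. The following is a minimal sketch of that reading, not the pipeline's actual implementation. The `Alignment` record, the reference labels ("vector", "host", "repcap", "helper"), and the function name are hypothetical. The rule that an scAAV read has exactly one primary and one supplementary alignment with opposite polarities is an assumption inferred from the "other" definition, and the "unmapped" label is taken from the results table rather than from the definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """Hypothetical, simplified alignment record for one read."""
    reference: str                  # assumed labels: "vector", "host", "repcap", "helper"
    strand: str                     # "+" or "-"
    is_supplementary: bool = False  # False means this is the primary alignment


def assign_read_type(alignments):
    """Assign a read type from all alignments of a single read (sketch only)."""
    if not alignments:
        return "unmapped"
    references = {a.reference for a in alignments}
    if len(references) > 1:
        # Fragments map to more than one reference, e.g. vector and host.
        return "chimeric"
    (reference,) = references
    if reference != "vector":
        return reference  # "host", "repcap" or "helper"
    if len(alignments) == 1:
        return "ssAAV"
    strands = sorted(a.strand for a in alignments)
    n_supplementary = sum(a.is_supplementary for a in alignments)
    # Assumption: scAAV = one primary plus one supplementary alignment
    # with opposite polarities; anything else on the vector is "other".
    if len(alignments) == 2 and n_supplementary == 1 and strands == ["+", "-"]:
        return "scAAV"
    return "other"
```

For example, a read with alignments of polarity +,-,- on the vector falls through to "other" under these assumptions, matching the example given in the definition.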

Note: Even though scAAV distinguishes one ITR as the wild-type (wtITR) and the other as the mutated ITR (mITR), we still refer to them as “left ITR” and “right ITR”. For example, “left-partial” is equivalent to “mITR-partial” in the case where the mITR is the left ITR based on the given genomic coordinates.


Single-stranded AAV types (ssAAV). ssAAV reads must map fully within the vector genome (which can include the backbone, beyond the ITR region) as a single (primary) alignment. “ssAAV-full” reads must cover the vector from left ITR to right ITR. “ssAAV-left-partial” reads contain the left ITR but are missing the right ITR. “ssAAV-vector+backbone” reads map partially within the ITR-to-ITR region and partially to the backbone.


Self-complementary AAV types (scAAV). scAAV reads must fully map within the vector genome (which can contain the backbone, beyond the ITR region) consisting of both primary and supplementary alignments. The definitions of full, partial, and backbone reads are the same as in ssAAV.

Assigned types by read alignment characteristics

Assigned Type    Count        Frequency (%)
scAAV            1,387,444    51.07
ssAAV            1,127,606    41.50
other            149,756      5.51
chimeric         16,879       0.62
repcap           15,697       0.58
host             10,477       0.39
helper           7,483        0.28
unmapped         1,512        0.06
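
As a worked check of the table above, each frequency is the count for that type divided by the total number of reads across all types. The sketch below assumes percentages are rounded to two decimals; the function name `frequency_table` is hypothetical, and the counts are copied from the table.

```python
def frequency_table(counts):
    """Return {type: percent of all reads}, rounded to two decimals."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


# Counts copied from the table above.
report_counts = {
    "scAAV": 1_387_444,
    "ssAAV": 1_127_606,
    "other": 149_756,
    "chimeric": 16_879,
    "repcap": 15_697,
    "host": 10_477,
    "helper": 7_483,
    "unmapped": 1_512,
}

frequencies = frequency_table(report_counts)
print(sum(report_counts.values()))  # total reads: 2716854
print(frequencies["scAAV"])         # 51.07
```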

Single-stranded vs self-complementary frequency

Assigned Type    Count        Frequency in AAV (%)    Total Frequency (%)
scAAV            1,387,444    55.17                   51.07
ssAAV            1,127,606    44.83                   41.50
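
The two frequency columns differ only in their denominator: "Frequency in AAV" divides by the number of reads assigned to scAAV or ssAAV, while "Total Frequency" divides by all reads. A minimal sketch of that arithmetic follows, assuming the AAV denominator is exactly the sum of the scAAV and ssAAV counts (which is consistent with the figures shown). The function name and the `total_reads` value, taken as the sum of the counts in the assigned-types table, are illustrative.

```python
def aav_frequencies(aav_counts, total_reads):
    """Return {type: (percent of AAV reads, percent of all reads)}."""
    aav_total = sum(aav_counts.values())
    return {
        name: (round(100 * n / aav_total, 2), round(100 * n / total_reads, 2))
        for name, n in aav_counts.items()
    }


# Counts from the table above; total_reads is the sum over all assigned types.
aav_counts = {"scAAV": 1_387_444, "ssAAV": 1_127_606}
total_reads = 2_716_854

result = aav_frequencies(aav_counts, total_reads)
print(result["scAAV"])  # (55.17, 51.07)
print(result["ssAAV"])  # (44.83, 41.5)
```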

Distribution of read lengths by assigned AAV types

Assigned AAV read types detailed analysis

Assigned AAV types (top 20)

Assigned Type    Assigned Subtype              Count      Freq. in AAV (%)    Total Freq. (%)
ssAAV            full                          553,634    22.01               20.38
ssAAV            left-partial                  188,741    7.50                6.95
ssAAV            right-partial                 316,265    12.57               11.64
ssAAV            partial                       64,645     2.57                2.38
ssAAV            backbone                      1,274      0.05                0.05
ssAAV            vector+backbone               3,047      0.12                0.11
scAAV            full                          370,134    14.72               13.62
scAAV            left-partial                  158,135    6.29                5.82
scAAV            right-partial                 431,999    17.18               15.90
scAAV            partial                       101,804    4.05                3.75
scAAV            backbone                      20,950     0.83                0.77
scAAV            vector+backbone               14,070     0.56                0.52
scAAV            full|right-partial            70,469     2.80                2.59
scAAV            right-partial|partial         57,682     2.29                2.12
scAAV            full|left-partial             44,441     1.77                1.64
scAAV            left-partial|partial          24,829     0.99                0.91
scAAV            partial|right-partial         20,383     0.81                0.75
scAAV            left-partial|full             16,566     0.66                0.61
scAAV            right-partial|full            15,334     0.61                0.56
scAAV            left-partial|right-partial    12,259     0.49                0.45

Definitions

Assigned Subtype Definition
full Read alignment includes the entire ITR-to-ITR target vector sequence.
left-partial Read aligns to a fragment of the vector originating from the left (upstream) ITR of the vector while not covering the right ITR.
right-partial Read aligns to a fragment of the vector originating from the right (downstream) ITR of the vector while not covering the left ITR.
partial Read aligns to a fragment of the vector originating from the region between the ITRs, covering neither the left nor the right ITR.
vector+backbone Read aligns to a fragment including the vector as well as plasmid backbone sequence.
backbone Read aligns to a fragment originating solely from the plasmid backbone sequence.
snapback Read consists of a double-stranded, sub-genomic fragment including only one ITR and read alignments in both (+) and (-) polarities. (ssAAV only)
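
The coordinate-based subtypes above can be sketched as a comparison of one alignment's interval against the annotated ITR positions. This is an illustration under stated assumptions, not the report's implementation: the `VectorAnnotation` fields and function name are hypothetical, coordinates are taken to be 0-based and half-open, a read is taken to "cover" an ITR if its alignment overlaps that ITR by at least one base, and no tolerance around the boundaries is applied. The snapback subtype depends on alignment polarity and is not handled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorAnnotation:
    """Hypothetical ITR annotation on the reference (0-based, half-open)."""
    left_itr_start: int
    left_itr_end: int
    right_itr_start: int
    right_itr_end: int


def assign_subtype(start, end, ann):
    """Assign a subtype to one alignment interval [start, end) (sketch only)."""
    # Entirely outside the ITR-to-ITR region: backbone only.
    if end <= ann.left_itr_start or start >= ann.right_itr_end:
        return "backbone"
    # Overlaps the vector but also extends beyond either ITR.
    if start < ann.left_itr_start or end > ann.right_itr_end:
        return "vector+backbone"
    # Assumption: overlapping an ITR by any amount counts as covering it.
    covers_left = start < ann.left_itr_end
    covers_right = end > ann.right_itr_start
    if covers_left and covers_right:
        return "full"
    if covers_left:
        return "left-partial"
    if covers_right:
        return "right-partial"
    return "partial"


# Made-up coordinates for illustration only.
example = VectorAnnotation(100, 245, 4000, 4145)
print(assign_subtype(100, 3000, example))  # left-partial
```

A real classifier would likely allow some slack around the ITR boundaries; that choice is left out here because the definitions above do not specify one.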

Flip/flop considerations

Term Definition
Flip/Flop One ITR is formed by two palindromic arms, called B–B’ and C–C’, embedded in a larger one, A–A’. The order of these palindromic sequences defines the flip or flop orientation of the ITR.

No flip/flop analysis results available to display.

Flip/flop configurations, scAAV only

No scAAV flip/flop analysis results available to display.

Flip/flop configurations, ssAAV only

No ssAAV flip/flop analysis results available to display.

Distribution of read length by subtype

AAV mapping to reference sequence

Gene therapy construct

Distribution of non-matches by reference position

Methods

This report was generated by an automated analysis of long-read sequencing data from adeno-associated virus (AAV) products. The sequencing data should be from the PacBio sequencer run in AAV mode, or equivalent circular consensus sequencing (CCS) reads (Travers et al., 2010). Reads are aligned to the AAV, packaging, and masked host reference sequences using Minimap2 (Li, 2018).

In this analysis, aligned sequencing reads were filtered for quality to include primary alignments and reads with mapping quality scores greater than 10. The alignment coordinates and orientation of reads passing these filters were then compared to the annotated vector region in the reference sequence, which comprises the left and right ITRs and the genomic region between them, to assign each read to a type and (for AAV reads) subtype classification according to the definitions above.
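
One possible reading of the quality filter described above is sketched below on simplified SAM-like records. The flag values (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM format; the `SamRecord` type and function name are hypothetical. The sketch assumes that secondary and unmapped records are dropped while supplementary records are kept, since the scAAV definition relies on supplementary alignments. The text does not state this explicitly, so treat it as an assumption.

```python
from typing import NamedTuple

# Bit flags defined by the SAM format.
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


class SamRecord(NamedTuple):
    """Hypothetical, minimal view of one alignment record."""
    read_name: str
    flag: int
    mapq: int


def passes_filter(record, mapq_threshold=10):
    """Keep mapped, non-secondary records with MAPQ strictly above the threshold."""
    if record.flag & (FLAG_UNMAPPED | FLAG_SECONDARY):
        return False
    # Supplementary records (0x800) are kept on purpose; see the assumption above.
    return record.mapq > mapq_threshold


records = [
    SamRecord("read1", 0, 60),                   # primary, high MAPQ
    SamRecord("read1", FLAG_SUPPLEMENTARY, 60),  # supplementary, kept
    SamRecord("read2", FLAG_SECONDARY, 60),      # secondary, dropped
    SamRecord("read3", 0, 10),                   # MAPQ not greater than 10, dropped
    SamRecord("read4", FLAG_UNMAPPED, 0),        # unmapped, dropped
]
kept = [r for r in records if passes_filter(r)]
print(len(kept))  # 2
```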

Citations

  1. Travers, K. J., Chin, C.-S., Rank, D. R., Eid, J. S. & Turner, S. W. A flexible and efficient template format for circular consensus sequencing and SNP detection. Nucleic Acids Research 38, e159–e159 (2010).
  2. Li, H. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).