=================
Sample
=================
MDA-MB-453 cell line from ATCC.
For more information: https://www.atcc.org/products/htb-131

==================================
Library preparation & sequencing
==================================
The Kinnex full-length RNA kit (8-fold concatenation) and the Kinnex 16S rRNA kit (12-fold concatenation) were each used to make one Kinnex library from the same ArgenTag cDNA. No TSO artifact removal steps were required. Each library was sequenced on one Revio SMRT Cell with SPRQ chemistry.

The bioinformatics processing was modified as follows: the S-reads are first processed with ArgenTag's taggy_demux script to tag the UMIs and cell barcodes (BCs), before proceeding with deduplication, mapping, collapse, classification, and count matrix generation as described at https://isoseq.how.

=================
0-Sreads
=================
Contains the segmented.bam (S-reads) produced by running ReadSegmentation (or skera).

=================
1-TaggedReads
=================
Contains the BAM files that have been processed by ArgenTag's taggy_demux script and have UMIs and cell barcodes annotated.
See: https://github.com/argentagsw/taggy_demux

=================
2-DeduplicatedReads
=================
This directory contains the deduplicated reads, i.e. reads that have been through barcode correction (against a barcode whitelist) and UMI deduplication. The deduplicated reads are used for the subsequent mapping and transcript analyses.

========================
3-CollapsedTranscripts
========================
This directory contains the full set of unique transcripts obtained by mapping the deduplicated reads to the genome, collapsing them into transcripts, and classifying and filtering them against GENCODE using pigeon.

========================
4-SeuratMatrix
========================
This directory contains the gene- and isoform-level count matrices, compatible with common tertiary analysis tools such as Seurat.
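
Note on 2-DeduplicatedReads: the idea behind barcode correction and UMI deduplication can be illustrated with the minimal Python sketch below. It is not the code used to produce these files. The function names, the one-mismatch rescue rule, and the exact-match UMI comparison are all assumptions made for illustration; the actual tools may apply different rules.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(barcode, whitelist, max_mismatches=1):
    """Return the whitelist barcode for `barcode`, or None if it cannot be rescued.

    Assumption: a barcode is rescued only when exactly one whitelist entry
    lies within `max_mismatches` substitutions.
    """
    if barcode in whitelist:
        return barcode
    candidates = [
        w for w in whitelist
        if len(w) == len(barcode) and hamming(w, barcode) <= max_mismatches
    ]
    return candidates[0] if len(candidates) == 1 else None


def deduplicate(reads, whitelist):
    """Keep the first read seen for each (corrected barcode, UMI) pair.

    `reads` is an iterable of (read_name, barcode, umi) tuples (hypothetical
    layout). Reads whose barcode cannot be corrected are dropped.
    Assumption: UMIs are compared by exact match only.
    """
    seen = set()
    kept = []
    for name, barcode, umi in reads:
        corrected = correct_barcode(barcode, whitelist)
        if corrected is None:
            continue
        key = (corrected, umi)
        if key not in seen:
            seen.add(key)
            kept.append((name, corrected, umi))
    return kept
```

For example, with the whitelist {"AAAA", "CCCC"}, a read carrying barcode AAAT is assigned to AAAA, and it is discarded if another read with barcode AAAA and the same UMI has already been kept.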