Extension of human lncRNA transcripts by RACE coupled with long-read high-throughput sequencing (RACE-Seq)

Birds are a wonderfully diverse and accessible clade with an exceptional range of ecologies and behaviors, making the study of the avian major histocompatibility complex (MHC) of great interest. In the last 20 years, particularly with the advent of high-throughput sequencing, the avian MHC has been explored in great depth in several dimensions: its ability to explain ecological patterns in nature, such as mating preferences; its correlation with parasite resistance; and its structural evolution across the avian tree of life. Here, we review the latest pulse of avian MHC studies spurred by high-throughput sequencing. Despite high-throughput approaches to MHC studies, substantial areas remain in need of improvement with regard to our understanding of MHC structure, diversity, and evolution. Recent studies of the avian MHC have nonetheless revealed intriguing connections between MHC structure and life history traits, and highlight the advantages of long-term ecological studies for understanding the patterns of MHC variation in the wild. Given the exceptional diversity of birds, their accessibility, and the ease of sequencing their genomes, studies of avian MHC promise to improve our understanding of the many dimensions and consequences of MHC variation in nature. However, significant improvements in assembling complete MHC regions with long-read sequencing will be required for truly transformative studies.


SNP identification and validation on genomic DNA for studying genetic diversity in Thunnus albacares and Scomberomorus brasiliensis by combining RADseq and long read high throughput sequencing

Fisheries Research ◽

10.1016/j.fishres.2017.09.002 ◽

2018 ◽

Vol 198 ◽

pp. 189-194 ◽

Cited By ~ 4

Author(s):

Zoila Raquel Siccha-Ramirez ◽

Francesco Maroso ◽

Belén G. Pardo ◽

Carlos Fernández ◽

Paulino Martínez ◽

...

Keyword(s):

Genetic Diversity ◽

High Throughput ◽

Genomic DNA ◽

High Throughput Sequencing ◽

Thunnus albacares ◽

Long Read


neoantigenR: An annotation based pipeline for tumor neoantigen identification from sequencing data

10.1101/171843 ◽

2017 ◽

Cited By ~ 4

Author(s):

Shaojun Tang ◽

Subha Madhavan

Keyword(s):

Alternative Splicing ◽

Cancer Immunotherapy ◽

High Throughput ◽

High Throughput Sequencing ◽

Sequencing Data ◽

Peptide Epitopes ◽

Cancer Antigens ◽

Long Read ◽

Personalized Cancer ◽

Specific Peptide

Abstract: Studies indicate that more than 90% of human genes are alternatively spliced, underscoring the complexity of transcriptome assembly and analysis. The splicing process is often disrupted in many cancers, resulting in both functional and non-functional end-products (Sveen et al. 2016). Harnessing the immune system to fight malignant cancers carrying aberrantly mutated or spliced products is becoming a promising approach to cancer therapy. Advances in immune checkpoint blockade have elicited adaptive immune responses with promising clinical responses to treatments against human malignancies (Tumor Neoantigens in Personalized Cancer Immunotherapy 2017). Emerging data suggest that recognition of patient-specific mutation-associated cancer antigens (e.g. from alternative splicing isoforms) may allow scientists to dissect the role of the immune response in the activity of clinical immunotherapies (Schumacher and Schreiber 2015). The advent of high-throughput sequencing technology has provided a comprehensive view of both splicing aberrations and somatic mutations across a range of human malignancies, allowing for a deeper understanding of the interplay of various disease mechanisms.

Meanwhile, studies show that the number of transcript isoforms reported to date may be limited by short-read sequencing, owing to the inherent limitations of transcriptome reconstruction algorithms, whereas long-read sequencing can significantly improve the detection of alternative splicing variants because there is no need to assemble full-length transcripts from short reads. The analysis of these high-throughput long-read sequencing data may permit a systematic view of tumor-specific peptide epitopes (also known as neoantigens) that could serve as targets for immunotherapy (Tumor Neoantigens in Personalized Cancer Immunotherapy 2017).

Currently, there is no software pipeline available that can efficiently produce mutation-associated cancer antigens from raw high-throughput sequencing data on patient tumor DNA (The Problem with Neoantigen Prediction 2017). To address this issue, we introduce an R package that enables the discovery of peptide epitope candidates, which are the tumor-specific peptide fragments containing potential functional neoantigens. These peptide epitopes consist of structural variants, including insertions, deletions, and alternative sequences, as well as peptides from nonsynonymous mutations. Analysis of these precursor candidates with widely used tools such as netMHC allows for the accurate in-silico prediction of neoantigens. The pipeline, named neoantigeR, is currently hosted at https://github.com/ICBI/neoantigeR.
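
The idea of deriving candidate peptides from a nonsynonymous mutation can be illustrated with a minimal Python sketch. It is not taken from the neoantigeR package (which is written in R); the function name `mutant_peptides`, the 0-based position convention, and the default window length of 9 residues are assumptions made only for illustration. The sketch enumerates every fixed-length window of the mutated protein that covers the substituted residue, which is the kind of candidate list that could then be passed to a binding predictor.

```python
def mutant_peptides(protein, position, new_residue, k=9):
    """Return all length-k windows of the mutated protein that cover `position`.

    `position` is 0-based (an assumption of this sketch); `new_residue` is the
    single amino acid that replaces the wild-type residue at that position.
    """
    if not 0 <= position < len(protein):
        raise ValueError("position outside protein")
    # Apply the single-residue substitution.
    mutated = protein[:position] + new_residue + protein[position + 1:]
    # Window start indices that keep the mutated residue inside the window
    # and keep the window inside the protein.
    first = max(0, position - k + 1)
    last = min(position, len(mutated) - k)
    return [mutated[s:s + k] for s in range(first, last + 1)]


if __name__ == "__main__":
    # Hypothetical 20-residue sequence, used only to exercise the function.
    for peptide in mutant_peptides("ACDEFGHIKLMNPQRSTVWY", 10, "A"):
        print(peptide)
```

Windows that would run past either end of the protein are skipped, so a mutation near a terminus yields fewer than `k` candidates, and a protein shorter than `k` yields none.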


GtTR: Bayesian estimation of absolute tandem repeat copy number using sequence capture and high throughput sequencing

10.1101/246108 ◽

2018 ◽

Cited By ~ 1

Author(s):

Devika Ganesamoorthy ◽

Minh Duc Cao ◽

Tania Duarte ◽

Wenhan Chen ◽

Lachlan Coin

Keyword(s):

High Throughput ◽

Tandem Repeat ◽

Copy Number ◽

Tandem Repeats ◽

High Throughput Sequencing ◽

Sequence Data ◽

Complex Diseases ◽

Sequencing Analysis ◽

Reference Dataset ◽

Long Read

Abstract

Background: Tandem repeats comprise a significant proportion of the human genome, including coding and regulatory regions. They are highly prone to repeat number variation and nucleotide mutation due to their repetitive and unstable nature, making them a major source of genomic variation between individuals. Despite recent advances in high-throughput sequencing, analysis of tandem repeats in the context of complex diseases is still hindered by technical limitations.

Methods: We report a novel targeted sequencing approach that allows simultaneous analysis of hundreds of repeats. We developed a Bayesian algorithm, GtTR, which combines information from a reference long-read dataset with a short-read counting approach to genotype tandem repeats at population scale. PCR sizing analysis was used for validation.

Results: We used a PacBio long-read sequenced sample to generate a reference tandem repeat genotype dataset with, on average, 13% absolute deviation from PCR sizing results. Using this reference dataset, GtTR generated estimates of VNTR copy number with accuracy within 95% high posterior density (HPD) intervals of 68% and 83% for capture sequence data and 200X WGS data respectively, improving to 87% and 94% with use of a PCR reference. We show that the genotype resolution increases as a function of depth, such that the median 95% HPD interval lies within 25%, 14%, 12% and 8% of its midpoint copy number value for 30X WGS, 200X WGS, 395X capture and 800X capture sequence data respectively. We validated nine targets by PCR sizing analysis, and genotype estimates from sequencing results correlated well with PCR results.

Conclusions: The novel genotyping approach described here presents a new cost-effective method to explore a previously unrecognized class of repeat variation in GWAS studies of complex diseases at the population level. Further improvements in accuracy can be obtained by improving the accuracy of the reference dataset.
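
To make the notion of a 95% HPD interval for copy number concrete, the following Python sketch computes one from a read count. It is not the GtTR model: the Poisson likelihood, the flat prior over integer copy numbers, the `reads_per_copy` calibration value (imagined here as coming from a reference sample), and both function names are assumptions chosen for illustration. It does show the general pattern the abstract reports, in which the interval tightens as sequencing depth increases.

```python
import math


def copy_number_posterior(read_count, reads_per_copy, max_copies=100):
    """Posterior over integer copy numbers 1..max_copies.

    Assumes a flat prior and a Poisson likelihood whose mean is
    copies * reads_per_copy (illustrative assumptions, not GtTR's model).
    """
    copies = range(1, max_copies + 1)
    log_post = []
    for c in copies:
        lam = c * reads_per_copy
        # Poisson log-likelihood: k*log(lam) - lam - log(k!)
        log_post.append(
            read_count * math.log(lam) - lam - math.lgamma(read_count + 1)
        )
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return {c: w / total for c, w in zip(copies, weights)}


def hpd_interval(posterior, mass=0.95):
    """Smallest set of copy numbers holding at least `mass`, as (low, high)."""
    ranked = sorted(posterior.items(), key=lambda item: item[1], reverse=True)
    chosen, accumulated = [], 0.0
    for copies, prob in ranked:
        chosen.append(copies)
        accumulated += prob
        if accumulated >= mass:
            break
    return min(chosen), max(chosen)


if __name__ == "__main__":
    # Same underlying copy number at two hypothetical depths.
    shallow = hpd_interval(copy_number_posterior(300, 30.0))
    deep = hpd_interval(copy_number_posterior(3000, 300.0))
    print("shallow depth HPD:", shallow)
    print("deep depth HPD:", deep)
```

Because the posterior under this simple model has a single peak, the highest-density set is a contiguous range, so reporting it as a (low, high) pair loses nothing.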


Perspectives and benefits of high-throughput long-read sequencing in microbial ecology

Applied and Environmental Microbiology ◽

10.1128/aem.00626-21 ◽

2021 ◽

Author(s):

Leho Tedersoo ◽

Mads Albertsen ◽

Sten Anslan ◽

Benjamin Callahan

Keyword(s):

Microbial Ecology ◽

High Throughput ◽

Single Molecule ◽

High Throughput Sequencing ◽

Environmental DNA ◽

Nanopore Sequencing ◽

High Quality ◽

Short Read ◽

Sequencing Technologies ◽

Long Read

Short-read, high-throughput sequencing (HTS) methods have yielded numerous important insights into microbial ecology and function. Yet in many instances short-read HTS techniques are suboptimal, for example by providing insufficient phylogenetic resolution or low integrity of assembled genomes. Single-molecule and synthetic long-read (SLR) HTS methods have successfully ameliorated these limitations. In addition, nanopore sequencing has generated a number of unique analysis opportunities, such as rapid molecular diagnostics and direct RNA sequencing, and both PacBio and nanopore sequencing support detection of epigenetic modifications. Although long-read sequencing technologies initially suffered from relatively low sequence quality, recent advances have greatly improved their accuracy. In spite of great technological progress in recent years, the long-read HTS methods (PacBio and nanopore sequencing) are still relatively costly, require large amounts of high-quality starting material, and commonly need specific solutions in various analysis steps. Despite these challenges, long-read sequencing technologies offer high-quality, cutting-edge alternatives for testing hypotheses about microbiome structure and functioning, as well as for assembling eukaryote genomes from complex environmental DNA samples.


Accurate Microbiome Sequencing with Synthetic Long Read Sequencing

10.1101/2020.10.02.324038 ◽

2020 ◽

Author(s):

Nico Chung ◽

Marc W. Van Goethem ◽

Melanie A. Preston ◽

Filip Lhota ◽

Leona Cerna ◽

...

Keyword(s):

16S rRNA ◽

16S rRNA Gene ◽

High Throughput ◽

High Throughput Sequencing ◽

Sequence Data ◽

rRNA Gene ◽

Microbial Composition ◽

Short Read ◽

Long Read ◽

Phylogenetic Resolution

Abstract: The microbiome plays a central role in biochemical cycling and nutrient turnover of most ecosystems. Because it can comprise myriad microbial prokaryotes, eukaryotes and viruses, microbiome characterization requires high-throughput sequencing to attain accurate identification and quantification of such co-existing microbial populations. Short-read next-generation sequencing (srNGS) revolutionized the study of microbiomes and remains the most widely used approach, yet read lengths spanning only a few of the nine hypervariable regions of the 16S rRNA gene limit phylogenetic resolution, leading to misclassification or failure to classify in a high percentage of cases. Here we evaluate a synthetic long-read (SLR) NGS approach for full-length 16S rRNA gene sequencing that is high-throughput, highly accurate and low-cost. The sequencing approach is amenable to highly multiplexed sequencing and provides microbiome sequence data that surpasses existing short- and long-read modalities in terms of accuracy and phylogenetic resolution. We validated this commercially available technology, termed LoopSeq, by characterizing the microbial composition of well-established mock microbiome communities and diverse real-world samples. SLR sequencing revealed differences in aquatic community complexity associated with environmental gradients, resolved species-level community composition of uterine lavage from subjects with histories of misconception, and accurately detected strain differences, multiple copies of the 16S rRNA gene in a single strain's genome, and low-level contamination in soil cyanobacterial cultures. This approach has implications for widespread adoption of high-resolution, accurate long-read microbiome sequencing, as the data are generated on popular short-read sequencing platforms without the need for additional infrastructure.
