Basics of high throughput sequencing

2025 ◽  
Vol 77 (11) ◽  
pp. 6589-2025
Author(s):  
ALEKSANDRA GIZA ◽  
EWELINA IWAN ◽  
ARKADIUSZ BOMBA ◽  
DARIUSZ WASYL

Sequencing can provide genomic characterisation of a specific organism, as well as of a whole environmental or clinical sample. High Throughput Sequencing (HTS) makes it possible to generate an enormous amount of genomic data at gradually decreasing costs and almost in real time. HTS is used in medicine, veterinary medicine, microbiology, virology and epidemiology, among other fields. The paper presents practical aspects of HTS technology. It describes the generations of sequencing, which vary in throughput, read length, accuracy and cost, and are therefore used for different applications. The stages of HTS, as well as their purposes and pitfalls, are presented: extraction of the genetic material, library preparation, sequencing and data processing. For the whole process to succeed, all stages need to follow strict quality control measures. Choosing the right sequencing platform, proper sample and library preparation procedures, and adequate bioinformatic tools is crucial for high-quality results.

Blood ◽  
2016 ◽  
Vol 128 (22) ◽  
pp. 3699-3699
Author(s):  
Stefano Vergani ◽  
Ilya Korsunsky ◽  
Nicholas Chiorazzi ◽  
Davide Bagnara

Abstract High-throughput DNA sequencing of the adaptive immune receptor repertoire is a relatively new and fast-growing technology used to study the immune response in health and disease. In B and T cell lymphoproliferative disorders, antigen receptor sequencing can be used to study clonal diversity and the evolution of the disease in treatment-free conditions and in response to treatment. Furthermore, it can be used for the detection of minimal residual disease (MRD), providing information on how the presence and number of pre-treatment clone(s) relate to, and are responsible for, a subsequent relapse. The characteristics and quality of the data generated by high-throughput DNA sequencing of immune receptor signatures result from three major components: library preparation, sequencing platform, and software tools. For both library preparation and software, there are no standard protocols or tools. Indeed, new approaches are continually being developed to accommodate new sequencing platform features and shortcomings, such as errors and read length restrictions. Two major technical challenges are: procuring an unbiased repertoire library that, for B lymphocytes, obtains and retains the full-length IGHV-D-J along with (sub)isotype information; and resolving data to the single-cell level, which is crucial for detecting MRD and rare clonal variants that exist in the early phase of the disease and might emerge and be involved in future relapse or progression. We describe here a library preparation method for use with the Illumina MiSeq platform that results in an exhaustive full-length repertoire where virtually every B cell is sequenced, thereby maximizing the likelihood of identifying and quantifying the “real” IGHV-D-J repertoire of the sample analyzed. The method also allows the detection of very infrequent rearrangements and maintains IG sub-isotype information without compromising data quality.
From 0.5 to 1 million human B cells can be sequenced in a single MiSeq 2x300 run with this approach. Key aspects of the technique are: 1) start from a well-defined number of B lymphocytes; 2) avoid V-gene-specific PCR amplification and genetic material dilution in the pre-amplification phases; 3) the specific depth of sequencing should depend on the starting B (or T) cell subset (i.e. naïve, memory or plasma cell), and should be proportional to the number of starting cells. High-quality sub-isotype information can be obtained with a second round of sequencing of shorter read length, e.g., with the Illumina 2x150 platform. We used 58 different CLL clones with known IGH sequences, mixed together with polyclonal B cells from donor PBMC (Figure 1). The mixed lysate was used to test the ability to detect the different clones. We then describe how the absence of genetic material dilution in the pre-amplification phases impacts the ability to obtain a comprehensive repertoire. This is crucial in MRD detection, since diluting the genetic material (RNA and/or cDNA) prior to PCR amplification compromises the ability to accurately and consistently detect the clonal variants, reducing the de facto sensitivity and reproducibility of the analysis. As a final example of the method's utility, we also demonstrate how different chronic lymphocytic leukemia clones present considerable variability in IG mRNA expression level, which correlates with the number of unique mRNA molecules sequenced (Figure 3) and which, with a method of sub-optimal efficiency, could reduce the clone-specific ability of detection by PCR-based techniques.
Figure 1.
Figure 2. Each dilution is performed in replicates. The cDNA is obtained from all the RNA extracted from the starting cells. Each slice represents a different CLL, and each slice size is the frequency at which it is detected. Comprehensive detection of each CLL depends on the absence of genetic material dilution.
Figure 3. qPCR IgH expression correlates with the number of unique mRNA molecules sequenced.
Disclosures: No relevant conflicts of interest to declare.
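The effect of dilution on rare-clone detection described in this abstract can be illustrated with a small calculation. The sketch below is not the authors' analysis: it assumes a simple binomial sampling model in which each mRNA or cDNA molecule of a clone is carried forward independently with a fixed probability, and the function name `detection_probability` is hypothetical.

```python
def detection_probability(molecules: int, fraction_kept: float) -> float:
    """Chance that at least one molecule of a clone survives a dilution step.

    Assumption (not from the abstract): each of the `molecules` copies is
    kept independently with probability `fraction_kept`.
    """
    if molecules < 0:
        raise ValueError("molecules must be non-negative")
    if not 0.0 <= fraction_kept <= 1.0:
        raise ValueError("fraction_kept must lie in [0, 1]")
    # P(at least one kept) = 1 - P(none kept)
    return 1.0 - (1.0 - fraction_kept) ** molecules


if __name__ == "__main__":
    # Compare undiluted material with two hypothetical dilution levels.
    for fraction_kept in (1.0, 0.5, 0.1):
        for molecules in (1, 5, 50):
            p = detection_probability(molecules, fraction_kept)
            print(f"kept={fraction_kept:.1f} molecules={molecules:>2} p={p:.3f}")
```

Under this model, a clone represented by only a few molecules is the first to be lost as the retained fraction falls, which is consistent with the abstract's point that undiluted material favours comprehensive detection.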


Viruses ◽  
2018 ◽  
Vol 10 (10) ◽  
pp. 566 ◽  
Author(s):  
Siemon Ng ◽  
Cassandra Braxton ◽  
Marc Eloit ◽  
Szi Feng ◽  
Romain Fragnoud ◽  
...  

A key step for broad viral detection using high-throughput sequencing (HTS) is optimizing the sample preparation strategy for extracting viral-specific nucleic acids, since viral genomes are diverse: they can be single-stranded or double-stranded RNA or DNA, and can vary from a few thousand bases to millions of bases, which might introduce biases during nucleic acid extraction. In addition, viral particles can be enveloped or non-enveloped with variable resistance to pre-treatment, which may influence their susceptibility to extraction procedures. Since the identity of the potential adventitious agents is unknown prior to their detection, efficient sample preparation should be unbiased toward all different viral types in order to maximize the probability of detecting any potential adventitious viruses using HTS. Furthermore, quality assessment of each sample processing step is a critical but challenging aspect. This paper presents our current perspectives on optimizing upstream sample processing and library preparation, as part of the discussion in the Advanced Virus Detection Technologies Interest Group (AVDTIG). The topics include: use of nuclease treatment to enrich for encapsidated nucleic acids, techniques for amplifying low amounts of virus nucleic acids, selection of different extraction methods, relevant controls, the use of spike recovery experiments, and quality control measures during library preparation.


2018 ◽  
Vol 24 (9_suppl) ◽  
pp. 94S-103S ◽  
Author(s):  
Qi Wang ◽  
Lijuan Cao ◽  
Guangying Sheng ◽  
Hongjie Shen ◽  
Jing Ling ◽  
...  

Inherited thrombocytopenia is a group of hereditary diseases with a reduction in platelet count as the main clinical manifestation. Clinically, there is an urgent need for a convenient and rapid diagnostic method. We introduced a high-throughput, next-generation sequencing (NGS) platform into the routine diagnosis of patients with unexplained thrombocytopenia and analyzed the gene sequencing results to evaluate the value of NGS technology in the screening and diagnosis of inherited thrombocytopenia. From a cohort of 112 patients with thrombocytopenia, we screened 43 patients with hereditary features. For the blood samples of these 43 patients, a gene sequencing platform for hemorrhagic and thrombotic diseases comprising 89 genes was used to perform gene detection using NGS technology. When we combined the screening results with clinical features and other findings, 15 (34.9%) of 43 patients were diagnosed with inherited thrombocytopenia. In addition, 19 pathogenic variants, including 8 previously unreported variants, were identified in these patients. Through the use of this detection platform, we expect to establish a more effective diagnostic approach to such disorders.


PLoS ONE ◽  
2021 ◽  
Vol 16 (11) ◽  
pp. e0254971
Author(s):  
Federico Rossi ◽  
Alessandro Crnjar ◽  
Federico Comitani ◽  
Rodrigo Feliciano ◽  
Leonie Jahn ◽  
...  

Tree ring features are affected by environmental factors and therefore are the basis for dendrochronological studies to reconstruct past environmental conditions. Oak wood often provides the data for these studies because of the durability of oak heartwood and hence the availability of samples spanning long time periods of the distant past. Wood formation is regulated in part by epigenetic mechanisms such as DNA methylation. Studies of the methylation state of DNA preserved in oak heartwood thus could identify epigenetic tree ring features informing on past environmental conditions. In this study, we aimed to establish protocols for the extraction of DNA, the high-throughput sequencing of whole-genome DNA libraries (WGS) and the profiling of DNA methylation by whole-genome bisulfite sequencing (WGBS) for oak (Quercus robur) heartwood drill cores taken from the trunks of living standing trees spanning the AD 1776-2014 time period. Heartwood contains little DNA, and large amounts of phenolic compounds known to hinder the preparation of high-throughput sequencing libraries. Whole-genome and DNA methylome library preparation and sequencing consistently failed for oak heartwood samples more than 100 and 50 years of age, respectively. DNA fragmentation increased with sample age and was exacerbated by the additional bisulfite treatment step during methylome library preparation. Relative coverage of the non-repetitive portion of the oak genome was sparse. These results suggest that quantitative methylome studies of oak hardwood will likely be limited to relatively recent samples and will require a high sequencing depth to achieve sufficient genome coverage.


Blood ◽  
2016 ◽  
Vol 127 (23) ◽  
pp. 2791-2803 ◽  
Author(s):  
Ilenia Simeoni ◽  
Jonathan C. Stephens ◽  
Fengyuan Hu ◽  
Sri V. V. Deevi ◽  
Karyn Megy ◽  
...  

Key Points Developed a targeted sequencing platform covering 63 genes linked to heritable bleeding, thrombotic, and platelet disorders. The ThromboGenomics platform provides a sensitive genetic test to obtain molecular diagnoses in patients with a suspected etiology.


2019 ◽  
Author(s):  
Lucas A. Nell

Abstract High-throughput sequencing (HTS) is central to the study of population genomics and has an increasingly important role in constructing phylogenies. Choices in research design for sequencing projects can include a wide range of factors, such as sequencing platform, depth of coverage, and bioinformatic tools. Simulating HTS data better informs these decisions. However, current standalone HTS simulators cannot generate genomic variants under even somewhat complex evolutionary scenarios, which greatly reduces their usefulness for fields such as population genomics and phylogenomics. Here I present the R package jackalope that simply and efficiently simulates (i) variants from reference genomes and (ii) reads from both Illumina and Pacific Biosciences (PacBio) platforms. Genomic variants can be simulated using phylogenies, gene trees, coalescent-simulation output, population-genomic summary statistics, and Variant Call Format (VCF) files. jackalope can simulate single, paired-end, or mate-pair Illumina reads, as well as reads from Pacific Biosciences. These simulations include sequencing errors, mapping qualities, multiplexing, and optical/PCR duplicates. It can read reference genomes from FASTA files and can simulate new ones, and all outputs can be written to standard file formats. jackalope is available for Mac, Windows, and Linux systems.
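To make the idea of simulating reads with sequencing errors concrete, here is a deliberately minimal Python sketch. It is not jackalope (which is an R package) and does not reproduce its API or error models: the function `simulate_reads`, its parameters, and the uniform substitution-only error model are all assumptions chosen for illustration.

```python
import random

BASES = "ACGT"


def simulate_reads(reference, n_reads, read_length, error_rate, seed=0):
    """Draw single-end reads uniformly from `reference` with substitution errors.

    Hypothetical toy model: every base is independently replaced by a
    different base with probability `error_rate`. Indels, quality scores,
    and duplicates are ignored.
    """
    if read_length > len(reference):
        raise ValueError("read_length exceeds reference length")
    rng = random.Random(seed)  # seeded for reproducible output
    reads = []
    for _ in range(n_reads):
        # Pick a start position so that the read fits inside the reference.
        start = rng.randrange(len(reference) - read_length + 1)
        fragment = reference[start:start + read_length]
        read = []
        for base in fragment:
            if rng.random() < error_rate:
                # Substitute with any base other than the true one.
                read.append(rng.choice([b for b in BASES if b != base]))
            else:
                read.append(base)
        reads.append((start, "".join(read)))
    return reads


if __name__ == "__main__":
    reference = "ACGTACGTAGCTAGCTAGGATCCGATTACA"
    for start, read in simulate_reads(reference, 5, 10, 0.05, seed=42):
        print(start, read)
```

Returning the start position alongside each read makes it easy to compare a simulated read with its true origin when evaluating a downstream tool.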


2017 ◽  
Author(s):  
Jean-Philippe Bürckert ◽  
William J. Faison ◽  
Axel R. S. X. Dubois ◽  
Regina Sinner ◽  
Oliver Hunewald ◽  
...  

Abstract With the advent of high-throughput sequencing (HTS), profiling immunoglobulin (IG) repertoires has become an essential part of immunological research. Advances in sequencing technology enable the IonTorrent Personal Genome Machine (PGM) to cover the full length of IG mRNA transcripts. Nucleotide insertions and deletions (indels) are the dominant errors of the PGM sequencing platform and can critically influence IG repertoire assessments. Here, we present a PGM-tailored IG repertoire sequencing approach combining error correction through unique molecular identifier (UID) barcoding and indel detection through ImMunoGeneTics (IMGT), the most commonly used sequence alignment database for IG sequences. Using artificially falsified sequences for benchmarking, we found that IMGT efficiently detects 98% of the introduced indels through gene-segment frameshifts. Undetected indels are either located at the ends of the sequences or produce masked frameshifts with an insertion and deletion in close proximity. IMGT’s indel correction algorithm resolves up to 87% of the tested insertions, but no deletions. The complementarity-determining regions 3 (CDR3s) are returned 100% correct for up to 3 insertions or 3 deletions through conservative culling. We further show that our PGM-tailored unique molecular identifiers result in highly accurate HTS datasets if combined with the presented data processing. In this regard, considering sequences with at least two copies from datasets with UID families of at least 3 reads results in correct sequences with over 99% confidence. The protocol and sample processing strategies described in this study will help to establish benchtop-scale sequencing of IG heavy chain transcripts in the field of IG repertoire research.
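The UID-based filtering mentioned above (families of at least 3 reads, sequences seen in at least two copies) can be sketched in a few lines of Python. This is one possible reading of the abstract, not the authors' pipeline: the function names are hypothetical, the consensus is a plain per-position majority vote, and reads whose length differs from the family's most common length are simply dropped, whereas the study relies on IMGT for indel detection.

```python
from collections import Counter, defaultdict


def consensus_by_uid(reads, min_family_size=3):
    """Build one consensus sequence per UID family.

    `reads` is an iterable of (uid, sequence) pairs. Families with fewer
    than `min_family_size` reads are discarded.
    """
    families = defaultdict(list)
    for uid, seq in reads:
        families[uid].append(seq)

    consensus = {}
    for uid, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        # Simplification: keep only reads of the most common length, so that
        # positions line up without performing an alignment.
        length = Counter(len(s) for s in seqs).most_common(1)[0][0]
        aligned = [s for s in seqs if len(s) == length]
        # Majority vote at every position.
        consensus[uid] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*aligned)
        )
    return consensus


def filter_by_copies(consensus, min_copies=2):
    """Keep consensus sequences supported by at least `min_copies` UIDs."""
    counts = Counter(consensus.values())
    return {seq: n for seq, n in counts.items() if n >= min_copies}
```

For example, passing the output of `consensus_by_uid(reads)` to `filter_by_copies` yields a mapping from each retained sequence to the number of UID families supporting it.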

