High-Throughput Sequencing of the Human Platelet Transcriptome

Blood ◽  
2010 ◽  
Vol 116 (21) ◽  
pp. 481-481
Author(s):  
Paul F Bray ◽  
Paolo M. Fortina ◽  
Srikanth Nagalla ◽  
Kathleen Delgrosso ◽  
Adam Ertel ◽  
...  

Abstract 481 Most successful DNA-based genome wide association studies identify genomic regions, not genes themselves, and the findings are often devoid of context or mechanism. To identify the genetic basis of disease and disease traits, it is imperative to characterize the quantity and forms of the genes that are expressed in the tissue of interest. It is not feasible to use primary megakaryocytes to profile mRNA from large numbers of subjects, but platelet RNA is easy to obtain. We and others have previously surveyed genome-wide platelet RNA expression using microarrays, an approach that has had a major impact on systems biology. However, microarrays have a number of limitations, including the use of probes only to known transcripts, a limited dynamic range for quantifying very low and high levels of transcripts, high background levels from cross-hybridization, and complicated normalization schemes to compare expression levels across experiments. Novel high-throughput sequencing approaches that overcome the limitations of microarrays have recently become available. RNA sequencing (RNAseq) has a remarkable ability to quantify mRNAs and provide information about transcript sequence variations, including single nucleotide changes and alternatively spliced exons. The goal of these studies was to apply RNAseq to capture platelet transcriptome complexity. Total RNA was prepared using leukocyte-depleted platelets (LDP; less than 1 WBC per 5 million platelets) from 4 donors; 2 were studied twice each. Analysis of this material showed that compared to nucleated cells (HeLa, Meg-01), platelets had 50%-90% less ribosomal RNA, and high levels of messenger and small RNAs (Agilent 2100). The major reduction in platelet rRNA was confirmed by RNA gel analysis. The platelet whole transcriptomes were analyzed via the Applied Biosystems (AB) SOLiD 3Plus next generation sequencing protocols and platform. A typical sequence run generated ∼250 million reads of 50 bp each.
We observed more than 30,000 independent platelet mRNA-coding transcripts from about 10,000 genes, demonstrating substantial numbers of variant isoforms. The increased sensitivity of RNAseq for low copy number is clear from these results, because prior platelet transcriptome studies using microarrays have identified only 1500–6000 expressed genes. As an example, the platelet-specific transcript, ITGA2B, showed very high copy number in platelets, but no expression in HeLa cells and modest expression in the megakaryocyte cell line, Meg-01. As is expected for RNAseq data, the density of mapped reads varies by exon and local sequence. We also provide examples of newly discovered SNPs that encode non-conservative amino acid changes (AKT2 1209A/T; PIK3CB 837C/G) and alter consensus exon/intron splice junction sites (P2RY12 nt 65 G/A). We have also identified a major difference in the ratio of two splice variants of the FcRg chain, 4:1 in one human platelet donor and 49:1 in another. In summary, we have demonstrated that RNAseq can accurately and sensitively determine the quantity and quality of variations in individual platelet transcriptomes. It appears that the platelet transcriptome is approximately 10 times more complex than previously thought. The major relative reduction in platelet rRNA may be an advantage for characterizing functional platelet transcripts. RNAseq should permit better understanding of the molecular mechanisms regulating platelet physiology and identify novel genetic variants that contribute to disorders of thrombosis and hemostasis. Disclosures: No relevant conflicts of interest to declare.
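The figures quoted in this abstract lend themselves to a quick arithmetic check. The sketch below is illustrative only: the function names are hypothetical and not from the study. It converts the reported run size (about 250 million reads of 50 bp) into total sequenced bases, and expresses the reported splice-variant ratios (4:1 and 49:1) as the fraction contributed by the major isoform.

```python
def run_yield_bases(n_reads: int, read_length_bp: int) -> int:
    """Total bases sequenced in one run: number of reads times read length."""
    return n_reads * read_length_bp


def major_isoform_fraction(major: float, minor: float) -> float:
    """Fraction of transcripts from the major isoform, given a major:minor ratio."""
    return major / (major + minor)


# Values taken from the abstract; everything else is an illustrative assumption.
bases_per_run = run_yield_bases(250_000_000, 50)   # about 12.5 gigabases
donor_a = major_isoform_fraction(4, 1)             # ratio 4:1
donor_b = major_isoform_fraction(49, 1)            # ratio 49:1

print(f"Approximate yield per run: {bases_per_run / 1e9:.1f} Gb")
print(f"Major isoform share: {donor_a:.0%} vs {donor_b:.0%}")
```

Expressing the ratios as fractions shows that the two donors differ from roughly four fifths to nearly all transcripts coming from the major variant.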

Genes ◽  
2019 ◽  
Vol 10 (4) ◽  
pp. 275 ◽  
Author(s):  
Tatiana Maroilley ◽  
Maja Tarailo-Graovac

The problem of ‘missing heritability’ affects both common and rare diseases, hindering discovery, diagnosis, and patient care. The ‘missing heritability’ concept has been mainly associated with common and complex diseases where promising modern technological advances, like genome-wide association studies (GWAS), were unable to uncover the complete genetic mechanism of the disease/trait. Although rare diseases (RDs) have low prevalence individually, collectively they are common. Furthermore, multi-level genetic and phenotypic complexity, when combined with the individual rarity of these conditions, poses an important challenge in the quest to identify causative genetic changes in RD patients. In recent years, high throughput sequencing has accelerated discovery and diagnosis in RDs. However, despite the several-fold increase (from ~10% using traditional to ~40% using genome-wide genetic testing) in finding genetic causes of these diseases in RD patients, the majority of RDs, like common diseases, still face the ‘missing heritability’ problem. This review outlines the key role of high throughput sequencing in uncovering genetics behind RDs, with a particular focus on genome sequencing. We review current advances and challenges of sequencing technologies, bioinformatics approaches, and resources.


2013 ◽  
Vol 20 (4) ◽  
pp. R171-R181 ◽  
Author(s):  
Hidewaki Nakagawa

Prostate cancer (PC) is the most common malignancy in males. It is evident that genetic factors at both germline and somatic levels play critical roles in prostate carcinogenesis. Recently, genome-wide association studies (GWAS) by high-throughput genotyping technology have identified more than 70 germline variants of various genes or chromosome loci that are significantly associated with PC susceptibility. They include multiple 8q24 loci, prostate-specific genes, and metabolism-related genes. Somatic alterations in PC genomes have been explored by high-throughput sequencing technologies such as whole-genome sequencing and RNA sequencing, which have identified a variety of androgen-responsive events and fusion transcripts represented by E26 transformation-specific (ETS) gene fusions. Recent innovations in high-throughput genomic technologies have enabled us to analyze PC genomics more comprehensively, more precisely, and on a larger scale in multiple ethnic groups to increase our understanding of PC genomics and biology in germline and somatic studies, which can ultimately lead to personalized medicine for PC diagnosis, prevention, and therapy. However, these data indicate that the PC genome is more complex and heterogeneous than we expected from GWAS and sequencing analyses.


Author(s):  
Gerald Mboowa ◽  
Ivan Sserwadda ◽  
Marion Amujal ◽  
Norah Namatovu

HIV/AIDS, tuberculosis (TB), and malaria are 3 major global public health threats that undermine development in many resource-poor settings. Recently, the notion that positive selection during epidemics or longer periods of exposure to common infectious diseases may have had a major effect in modifying the constitution of the human genome is being interrogated at a large scale in many populations around the world. This positive selection from infectious diseases increases power to detect associations in genome-wide association studies (GWASs). High-throughput sequencing (HTS) has transformed the management of infectious diseases and continues to enable large-scale functional characterization of host resistance/susceptibility alleles and loci, a paradigm shift from single candidate gene studies. Application of genome sequencing technologies and genomics has enabled us to interrogate the host-pathogen interface for improving human health. Human populations are constantly locked in evolutionary arms races with pathogens; therefore, identification of common infectious disease-associated genomic variants/markers is important for therapeutic and vaccine development and for screening susceptible individuals in a population. This review describes a range of host-pathogen genomic loci that have been associated with disease susceptibility and resistance patterns in the era of HTS. We further highlight potential opportunities for these genetic markers.


2020 ◽  
Vol 160 (11-12) ◽  
pp. 634-642
Author(s):  
Shiqiang Luo ◽  
Xingyuan Chen ◽  
Tizhen Yan ◽  
Jiaolian Ya ◽  
Zehui Xu ◽  
...  

High-throughput sequencing based on copy number variation (CNV-seq) is commonly used to detect chromosomal abnormalities. This study identifies chromosomal abnormalities in aborted embryos/fetuses in early and middle pregnancy and explores the application value of CNV-seq in determining the causes of pregnancy termination. High-throughput sequencing was used to detect chromosome copy number variations (CNVs) in 116 aborted embryos in early and middle pregnancy. The detection data were compared with the Database of Genomic Variants (DGV), the Database of Chromosomal Imbalance and Phenotype in Humans using Ensembl Resources (DECIPHER), and the Online Mendelian Inheritance in Man (OMIM) database to determine the CNV type and the clinical significance. High-throughput sequencing results were successfully obtained in 109 out of 116 specimens, a detection success rate of 93.97%. In brief, there were 64 cases with abnormal chromosome numbers and 23 cases with CNVs, of which 10 were pathogenic mutations and 13 were variants of uncertain significance. An abnormal chromosome number is the most important reason for embryo termination in early and middle pregnancy, followed by pathogenic chromosome CNVs. CNV-seq can quickly and accurately detect chromosome abnormalities and identify microdeletion and microduplication CNVs that cannot be detected by conventional chromosome analysis, which makes it convenient and efficient for genetic etiology diagnosis in miscarriage.
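The reported success rate can be reproduced from the counts given in this abstract. The following is a minimal sketch with a hypothetical helper name. It assumes that the 64 and 23 cases are counted among the 109 successfully sequenced specimens, which the abstract does not state explicitly.

```python
def percent(part: int, whole: int, digits: int = 2) -> float:
    """Return part/whole as a percentage rounded to the given number of digits."""
    return round(100 * part / whole, digits)


# Counts taken from the abstract.
specimens_total = 116
specimens_sequenced = 109
abnormal_chromosome_number = 64
cnv_cases = 23
cnv_pathogenic = 10
cnv_uncertain = 13

success_rate = percent(specimens_sequenced, specimens_total)

# Assumption: the denominator for the two findings is the 109 sequenced specimens.
aneuploidy_share = percent(abnormal_chromosome_number, specimens_sequenced)
cnv_share = percent(cnv_cases, specimens_sequenced)

print(f"Detection success rate: {success_rate}%")
print(f"Abnormal chromosome number: {aneuploidy_share}% of sequenced specimens")
print(f"CNV cases: {cnv_share}% of sequenced specimens")
```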


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 243-244
Author(s):  
Brittany N Diehl ◽  
Andres A Pech-Cervantes ◽  
Thomas H Terrill ◽  
Ibukun M Ogunade ◽  
Owen Rae ◽  
...  

Abstract Florida Native sheep is an indigenous breed from Florida and expresses superior parasite resistance. Previous candidate and genome wide association studies with Florida Native sheep have identified single nucleotide polymorphisms with additive and non-additive effects associated with parasite resistance. However, the role of other potential DNA variants, such as copy number variants (CNVs), in controlling this complex trait has not been evaluated. The objective of the present study was to investigate the importance of CNVs on resistance to natural Haemonchus contortus infections in Florida Native sheep. A total of 200 sheep were evaluated in the present study. Phenotypic records included fecal egg count (FEC, eggs/gram), FAMACHA score, and packed cell volume (PCV, %). Sheep were genotyped using the GGP Ovine 50K SNP chip. Copy number analysis with the univariate method was used to identify CNVs. A total of 170 animals with CNVs and phenotypic data were used for the association testing. Association tests were carried out using single linear regression and Principal Component Analysis (PCA) correction to identify CNVs associated with FEC, FAMACHA, and PCV. To confirm our results, a second association test using the correlation-trend test with PCA correction was performed. CNVs were considered significant when their adjusted p-value was < 0.05 after FDR correction. A deletion CNV in chromosome 21 was associated with FEC. This DNA variant was located in intron 2 of RAB3IL gene and overlapped a QTL associated with changes in eosinophil number. Our study demonstrated for the first time that CNVs could be potentially involved with parasite resistance in this heritage sheep breed.
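The abstract does not say which FDR procedure was applied. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure, adjusted p-values can be computed as follows. The function name and the example p-values are illustrative and not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the result.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four hypothetical CNV association tests.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

In this toy example only the first test remains below 0.05 after adjustment, which illustrates why a raw p-value under 0.05 does not by itself make a CNV significant.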


2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Juan Xie ◽  
Jinfang Zheng ◽  
Xu Hong ◽  
Xiaoxue Tong ◽  
Shiyong Liu

Abstract Protein–RNA interaction participates in many biological processes, so studying it can help us understand the function of proteins and RNA. Although the protein–RNA 3D3D model, like PRIME, was useful in building 3D structural complexes, it cannot be used genome-wide because RNA 3D structures are lacking. To take full advantage of RNA secondary structures revealed from high-throughput sequencing, we present PRIME-3D2D to predict binding sites of protein–RNA interaction. PRIME-3D2D is almost as good as PRIME at modeling protein–RNA complexes. PRIME-3D2D can be used to predict binding sites on PDB data (MCC = 0.75/0.70 for binding sites in protein/RNA) and transcriptome-wide (MCC = 0.285 for binding sites in RNA). Testing on PDB and yeast transcriptome-wide data shows that PRIME-3D2D performs better than other binding-site predictors. PRIME-3D2D can therefore be used to predict binding sites both on PDB data and genome-wide, and it is freely available.
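The MCC values quoted above refer to the Matthews correlation coefficient, a standard score for binary predictions such as "binding site" versus "not a binding site". The sketch below shows the standard formula computed from a confusion matrix. The function name and the example counts are illustrative and are not data from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Illustrative counts for residues predicted as binding or non-binding.
print(matthews_corrcoef(tp=5, tn=5, fp=0, fn=0))   # perfect prediction
print(matthews_corrcoef(tp=0, tn=0, fp=5, fn=5))   # fully inverted prediction
print(matthews_corrcoef(tp=6, tn=3, fp=1, fn=2))   # intermediate case
```

MCC ranges from -1 to 1 and stays informative when binding residues are much rarer than non-binding ones, which is the usual situation in binding-site prediction.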


Author(s):  
Hai Yang ◽  
Daming Zhu

Copy number variation (CNV) is a prevalent kind of genetic structural variation which leads to an abnormal number of copies of large genomic regions, such as gain or loss of DNA segments larger than 1 kb. CNV exists not only in the human genome but also in plant genomes. Current research has shown that CNV is associated with many complex diseases. In this paper, guanine-cytosine (GC) bias, mappability and their effect on read depth signals in sequencing data are discussed first. Subsequently, a new correction method for GC bias and an improved combinatorial detection algorithm for CNV using high-throughput sequencing reads based on a hidden Markov model (CNV-HMM) are proposed. The corrected read depth signals have lower correlation with GC content, mappability of reads and the width of the analysis window. Then we create a hidden Markov model which maps the reads onto the reference genome and records the unmapped reads. The unmapped reads are counted and normalized. The CNV-HMM detects the abnormal signal of read count and obtains the candidate CNVs using the expectation maximization (EM) algorithm. Finally, we filter the candidate CNVs using split reads to improve the performance of our algorithm. The experimental results indicate that the CNV-HMM algorithm has higher accuracy and sensitivity for CNV detection than most current detection algorithms.
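The abstract does not describe the new GC correction itself. For orientation only, the sketch below shows a commonly used baseline instead: median normalization of read depth within GC-content bins. It should not be read as the authors' method. The function name, bin width, and example values are assumptions.

```python
from collections import defaultdict
from statistics import median


def gc_median_correct(read_depths, gc_fractions, bin_width=0.01):
    """Scale each window's read depth by global median / median of its GC bin."""
    # Assign every window to a GC bin (for example 0.40 maps to bin 40 at width 0.01).
    keys = [round(gc / bin_width) for gc in gc_fractions]
    bins = defaultdict(list)
    for key, depth in zip(keys, read_depths):
        bins[key].append(depth)

    global_median = median(read_depths)
    bin_medians = {key: median(depths) for key, depths in bins.items()}

    corrected = []
    for key, depth in zip(keys, read_depths):
        bin_median = bin_medians[key]
        # Leave the depth unchanged if the bin median is zero, to avoid dividing by zero.
        if bin_median > 0:
            corrected.append(depth * global_median / bin_median)
        else:
            corrected.append(depth)
    return corrected


# Toy windows: depth is systematically lower in the GC-rich windows.
depths = [100, 110, 50, 55]
gc = [0.40, 0.40, 0.60, 0.60]
print(gc_median_correct(depths, gc))
```

After correction, windows that differed only because of their GC content end up with matching depths, so the remaining variation is more likely to reflect copy number.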


2020 ◽  
Vol 6 (6) ◽  
pp. FSO476
Author(s):  
Ofir Israeli ◽  
Efi Makdasi ◽  
Inbar Cohen-Gihon ◽  
Anat Zvi ◽  
Shirley Lazar ◽  
...  

High-throughput DNA sequencing (HTS) of pathogens in whole blood samples is hampered by the high host/pathogen nucleic acids ratio. We describe a novel and rapid bacterial enrichment procedure whose implementation is exemplified in simulated bacteremic human blood samples. The procedure involves depletion of the host DNA, rapid HTS and bioinformatic analyses. Following this procedure, Y. pestis, F. tularensis and B. anthracis spiked-in samples displayed an improved host/pathogen DNA ratio of 2.5–5.9 orders of magnitude, in samples with bacteria spiked-in at 10^3–10^5 CFU/ml. The procedure described in this study enables rapid and detailed metagenomic profiling of pathogens within 8–9 h, circumventing the challenges imposed by the high background present in the bacteremic blood and by the unknown nature of the sample.
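An improvement expressed in orders of magnitude is the base-10 logarithm of the fold change in the host/pathogen ratio. The sketch below makes that conversion explicit. The function names and the example ratios are hypothetical, and only the 2.5 and 5.9 figures come from the abstract.

```python
import math


def orders_of_magnitude(ratio_before: float, ratio_after: float) -> float:
    """log10 of the fold reduction in the host/pathogen DNA ratio."""
    return math.log10(ratio_before / ratio_after)


def fold_change(orders: float) -> float:
    """Convert an improvement in orders of magnitude back to a fold change."""
    return 10 ** orders


# Hypothetical ratios: host DNA outnumbers pathogen DNA 1e6:1 before and 1e3:1 after.
print(orders_of_magnitude(1e6, 1e3))

# Range reported in the abstract, expressed as fold changes.
print(f"{fold_change(2.5):,.0f}-fold to {fold_change(5.9):,.0f}-fold")
```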

