Systematic identification of intron retention associated variants from massive publicly available transcriptome sequencing data

2021 ◽  
Author(s):  
Yuichi Shiraishi ◽  
Ai Okada ◽  
Kenichi Chiba ◽  
Ikuko Omori ◽  
Raul Nicolas Mateos ◽  
...  

Many disease-associated genomic variants disrupt gene function through abnormal splicing. With the advancement of genomic medicine, identifying disease-associated variants that affect splicing has become more important than ever. Most bioinformatics approaches for detecting splicing-associated variants require both genomic and transcriptomic data. However, few datasets provide both. In this study, we developed a methodology to detect genomic variants that cause splicing changes (more specifically, intron retention) using transcriptome sequencing data alone. After demonstrating its high sensitivity and precision, we applied it to 230,988 transcriptome sequencing datasets from a publicly available repository and identified 27,937 intron retention associated variants (IRAVs). In addition, by exploring positional relationships with variants registered in existing disease databases, we extracted 3,077 putative disease-associated IRAVs, which range from cancer drivers to variants linked with autosomal recessive disorders. The in-silico screening framework proposed here provides a foundation for a platform that can automatically acquire medical knowledge by making the most of the massive body of publicly available sequencing data. Collections of IRAVs identified in this study are available through IRAVDB (https://iravdb.io/).

PLoS ONE ◽  
2019 ◽  
Vol 14 (9) ◽  
pp. e0216838 ◽  
Author(s):  
Modupeore O. Adetunji ◽  
Susan J. Lamont ◽  
Behnam Abasht ◽  
Carl J. Schmidt

2017 ◽  
Vol 18 (6) ◽  
pp. 1110 ◽  
Author(s):  
Wolfgang Kaisers ◽  
Johannes Ptok ◽  
Holger Schwender ◽  
Heiner Schaal

2022 ◽  
Vol 9 ◽  
Author(s):  
Han Wang ◽  
Bowen Cui ◽  
Huiying Sun ◽  
Fang Zhang ◽  
Jianan Rao ◽  
...  

GATA2 is a transcription factor that is critical for the generation and survival of hematopoietic stem cells (HSCs). It also plays an important role in the regulation of myeloid differentiation. Accordingly, GATA2 expression is restricted to HSCs and hematopoietic progenitors as well as early erythroid and megakaryocytic cells. Here we identified aberrant GATA2 expression in B-cell acute lymphoblastic leukemia (B-ALL) by analyzing transcriptome sequencing data obtained from St. Jude Cloud. Genes differentially expressed upon GATA2 activation showed a significantly myeloid-like transcription signature. Further analysis identified several tumor-associated genes, including BAG3 and EPOR, as targets of GATA2 activation. In addition, the correlation between the KMT2A-USP2 fusion and GATA2 activation not only indicates a potential trans-activating mechanism of GATA2 but also suggests that GATA2 is a target of KMT2A-USP2. Furthermore, by integrating whole-genome and transcriptome sequencing data, we showed that GATA2 is also cis-activated. A somatic focal deletion in the GATA2 neighborhood that disrupts the boundaries of topologically associating domains was identified in one B-ALL patient with GATA2 activation. This evidence supports the hypothesis that GATA2 could be involved in the leukemogenesis of B-ALL and can be transcriptionally activated through multiple mechanisms. The findings on aberrant GATA2 activation and its molecular function extend our understanding of transcription factor dysregulation in B-ALL.


2020 ◽  
Vol 21 (22) ◽  
pp. 8640
Author(s):  
Kijeong Lee ◽  
Mi-Ryung Han ◽  
Ji Woo Yeon ◽  
Byoungjae Kim ◽  
Tae Hoon Kim

Dendritic cells (DCs) play critical roles in atopic diseases, orchestrating both the innate and adaptive immune systems. Nevertheless, limited information is available regarding the mechanism through which DCs induce hyperresponsiveness in patients with allergies. This study aims to reveal novel genetic alterations and future therapeutic target molecules in DCs from patients with allergies using whole transcriptome sequencing. Transcriptome sequencing was conducted on human BDCA-3+/CD11c+ DCs sorted from peripheral blood monocytes obtained from six patients with allergies and four healthy controls. Gene expression profile data were analyzed, and an ingenuity pathway analysis was performed. A total of 1638 differentially expressed genes were identified at p-values < 0.05, with 11 genes showing a log2-fold change ≥1.5. The top gene network was associated with cell death/survival and organismal injury/abnormality. In validation experiments, amphiregulin (AREG) showed results consistent with the transcriptome sequencing data, with increased mRNA expression in THP-1-derived DCs after Der p 1 stimulation and higher protein expression in myeloid DCs obtained from patients with allergies. This study suggests that gene expression is altered in DCs from patients with allergies, and it proposes related changes in function and intracellular mechanisms. Notably, AREG might play a crucial role in DCs by inducing the Th2 immune response.


2020 ◽  
Vol 21 (18) ◽  
pp. 6950
Author(s):  
Anastasiya V. Snezhkina ◽  
Dmitry V. Kalinin ◽  
Vladislav S. Pavlov ◽  
Elena N. Lukyanova ◽  
Alexander L. Golovyuk ◽  
...  

Carotid paragangliomas (CPGLs) are rare neuroendocrine tumors often associated with mutations in SDHx genes. Immunohistochemistry of succinate dehydrogenase (SDH) subunits has been considered a useful instrument for predicting SDHx mutations in paragangliomas/pheochromocytomas. We compared the mutation status of SDHx genes with the immunohistochemical (IHC) staining of SDH subunits in CPGLs. To identify pathogenic/likely pathogenic variants in SDHx genes, exome sequencing data from 42 CPGL patients were analyzed. IHC staining of SDH subunits was carried out for all CPGLs studied. We found SDHx variants in 38% (16/42) of the cases. IHC showed negative (5/15) or weak diffuse (10/15) SDHB staining in most tumors with a variant in any SDHx gene (94%, 15/16). In the SDHA-mutated CPGL, SDHA expression was completely absent and weak diffuse SDHB staining was detected. Positive immunoreactivity for all SDH subunits was found in one case with a variant in SDHD. Notably, CPGL samples without variants in SDHx also demonstrated negative (2/11) or weak diffuse (9/11) SDHB staining (42%, 11/26). The results indicate that SDH immunohistochemistry does not fully reflect the presence of mutations in these genes; the diagnostic effectiveness of the method was 71%. However, given the high sensitivity of SDHB immunohistochemistry, it could be used for the initial identification of patients potentially carrying SDHx mutations, who could then be recommended for genetic testing.
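The percentages in this abstract can be tied together with a short calculation. The sketch below is illustrative only and rests on assumptions not stated in the abstract: that negative or weak diffuse SDHB staining counts as a positive test, that the 15 remaining variant-free tumors (26 minus 11) showed positive staining, and that "diagnostic effectiveness" means overall accuracy. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity and accuracy from a 2x2 confusion table."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),   # carriers flagged by IHC
        "specificity": tn / (tn + fp),   # non-carriers cleared by IHC
        "accuracy": (tp + tn) / total,   # all correct calls
    }


# Counts taken from the abstract (interpretation is an assumption):
#   16 tumors with SDHx variants, 15 with negative/weak SDHB staining
#   26 tumors without variants, 11 with negative/weak SDHB staining
metrics = diagnostic_metrics(tp=15, fn=1, fp=11, tn=15)

for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Under these assumptions the accuracy works out to 30/42, which matches the reported 71%, and the sensitivity of 15/16 matches the reported 94%. The specificity of 15/26 is not reported in the abstract and follows only from the inferred counts.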


BMC Genomics ◽  
2018 ◽  
Vol 19 (S1) ◽  
Author(s):  
Yu-Chen Liu ◽  
Yu-Jung Chiu ◽  
Jian-Rong Li ◽  
Chuan-Hu Sun ◽  
Chun-Chi Liu ◽  
...  

Author(s):  
Janet Piñero ◽  
Juan Manuel Ramírez-Anguita ◽  
Josep Saüch-Pitarch ◽  
Francesco Ronzano ◽  
Emilio Centeno ◽  
...  

One of the most pressing challenges in genomic medicine is to understand the role played by genetic variation in health and disease. Thanks to the large-scale exploration of genomic variants, hundreds of thousands of disease-associated loci have been uncovered. However, the identification of variants of clinical relevance is a significant challenge that requires comprehensive interrogation of previous knowledge and linkage to new experimental results. To assist in this complex task, we created DisGeNET (http://www.disgenet.org/), a knowledge management platform integrating and standardizing data about disease-associated genes and variants from multiple sources, including the scientific literature. DisGeNET covers the full spectrum of human diseases as well as normal and abnormal traits. The current release covers more than 24 000 diseases and traits, 17 000 genes and 117 000 genomic variants. The latest developments of DisGeNET include new sources of data, novel data attributes and prioritization metrics, a redesigned web interface and recently launched APIs. Thanks to data standardization, the combination of expert-curated information with data automatically mined from the scientific literature, and a suite of tools for accessing its publicly available data, DisGeNET is an interoperable resource supporting a variety of applications in genomic medicine and drug R&D.


2020 ◽  
Vol 15 (1) ◽  
pp. 2-16
Author(s):  
Yuwen Luo ◽  
Xingyu Liao ◽  
Fang-Xiang Wu ◽  
Jianxin Wang

Transcriptome assembly plays a critical role in studying biological properties and examining the expression levels of genomes in specific cells. It is also the basis of many downstream analyses. As sequencing speed increases and cost decreases, massive amounts of sequencing data continue to accumulate. A large number of assembly strategies based on different computational methods and experiments have been developed. How to perform transcriptome assembly efficiently, with high sensitivity and accuracy, has become a key issue. In this work, the issues with transcriptome assembly are explored for different sequencing technologies. Specifically, transcriptome assemblies with next-generation sequencing reads are divided into reference-based assemblies and de novo assemblies. Examples from different species are used to illustrate that long reads produced by third-generation sequencing technologies can cover full-length transcripts without assembly. In addition, transcriptome assemblies using Hybrid-seq methods and other tools are summarized. Finally, we discuss future directions for transcriptome assembly.

