A Review of Recent Advancement in Integrating Omics Data with Literature Mining towards Biomedical Discoveries

2017 ◽  
Vol 2017 ◽  
pp. 1-10 ◽  
Author(s):  
Kalpana Raja ◽  
Matthew Patrick ◽  
Yilin Gao ◽  
Desmond Madu ◽  
Yuyang Yang ◽  
...  

In the past decade, the volume of “omics” data generated by high-throughput technologies has expanded exponentially. Managing, storing, and analyzing these data has been a great challenge for researchers, especially when moving towards the goal of generating testable data-driven hypotheses, which has been the promise of high-throughput experimental techniques. Different bioinformatics approaches have been developed to streamline downstream analyses by providing independent information to interpret results and support biological inference. Text mining (also known as literature mining) is one of the commonly used approaches for automated generation of biological knowledge from the huge number of published articles. In this review paper, we discuss recent advances in approaches that integrate results from omics data with information generated by text mining to uncover novel biomedical information.

Genes ◽  
2020 ◽  
Vol 11 (3) ◽  
pp. 245 ◽  
Author(s):  
Gary Hardiman

A major technological shift in the research community in the past decade has been the adoption of high-throughput (HT) technologies to interrogate the genome, epigenome, transcriptome, and proteome in a massively parallel fashion [...]


2017 ◽  
Author(s):  
Genevieve L. Stein-O’Brien ◽  
Raman Arora ◽  
Aedin C. Culhane ◽  
Alexander V. Favorov ◽  
Lana X. Garmire ◽  
...  

Omics data contain signal from the molecular, physical, and kinetic inter- and intra-cellular interactions that control biological systems. Matrix factorization techniques can reveal low-dimensional structure from high-dimensional data that reflects these interactions. These techniques can uncover new biological knowledge from diverse high-throughput omics data in topics ranging from pathway discovery to time course analysis. We review exemplary applications of matrix factorization for systems-level analyses. We discuss appropriate application of these methods and their limitations, and focus on analysis of results to facilitate optimal biological interpretation. The inference of biologically relevant features with matrix factorization enables discovery from high-throughput data beyond the limits of current biological knowledge, answering questions from high-dimensional data that we have not yet thought to ask.
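
As an illustration of the idea of recovering low-dimensional structure, the following is a minimal sketch of one common matrix factorization, non-negative matrix factorization (NMF) with multiplicative updates. It is not taken from the reviewed paper: the toy "genes x samples" matrix, the function names (`nmf`, `relative_error`), and the choice of rank 2 are all assumptions made for the example.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def relative_error(X, W, H):
    """Relative Frobenius-norm reconstruction error ||X - WH|| / ||X||."""
    WH = matmul(W, H)
    num = sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))
    den = sum(x ** 2 for rx in X for x in rx)
    return (num / den) ** 0.5


def nmf(X, rank, n_iter=300, eps=1e-9, seed=0):
    """Approximate a non-negative matrix X (features x samples) as W @ H."""
    rng = random.Random(seed)
    W = [[rng.random() + eps for _ in range(rank)] for _ in X]
    H = [[rng.random() + eps for _ in X[0]] for _ in range(rank)]
    for _ in range(n_iter):
        # Multiplicative update for H (keeps every entry non-negative).
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(*rows)] for rows in zip(H, num, den)]
        # Multiplicative update for W.
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(*rows)] for rows in zip(W, num, den)]
    return W, H


# Hypothetical toy data: 12 "genes" x 8 "samples" driven by two hidden programs.
true_W = [[1.0 + 0.1 * i, 0.1] for i in range(6)] + [[0.1, 1.0 + 0.1 * i] for i in range(6)]
true_H = [[1.0] * 4 + [0.2] * 4, [0.2] * 4 + [1.0] * 4]
X = matmul(true_W, true_H)

W_init, H_init = nmf(X, rank=2, n_iter=0)  # random starting point (same seed)
W, H = nmf(X, rank=2)
initial_error = relative_error(X, W_init, H_init)
final_error = relative_error(X, W, H)
print(f"relative error: {initial_error:.3f} -> {final_error:.3f}")
```

In this sketch the rows of `H` play the role of sample-level activity patterns and the columns of `W` the gene loadings; in practice the rank and the factorization method would be chosen to suit the data.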


2010 ◽  
Vol 2010 ◽  
pp. 1-19 ◽  
Author(s):  
Chuming Chen ◽  
Peter B. McGarvey ◽  
Hongzhan Huang ◽  
Cathy H. Wu

High-throughput “omics” technologies bring new opportunities for biological and biomedical researchers to ask complex questions and gain new scientific insights. However, the voluminous, complex, and context-dependent data are maintained in heterogeneous and distributed environments, and well-defined data standards and standardized nomenclature are lacking. Together these pose a major challenge that requires advanced computational methods and bioinformatics infrastructures for integration, mining, visualization, and comparative analysis to facilitate data-driven hypothesis generation and biological knowledge discovery. In this paper, we present the challenges in high-throughput “omics” data integration and analysis, introduce a protein-centric approach for systems integration of large and heterogeneous high-throughput “omics” data including microarray, mass spectrometry, protein sequence, protein structure, and protein interaction data, and use a scientific case study to illustrate how one can use varied “omics” data from different laboratories to make useful connections that could lead to new biological knowledge.


2019 ◽  
Vol 26 (13) ◽  
pp. 2330-2355 ◽  
Author(s):  
Anutthaman Parthasarathy ◽  
Sasikala K. Anandamma ◽  
Karunakaran A. Kalesh

Peptide therapeutics have made tremendous progress in the past decade. Many of the inherent weaknesses of peptides that hampered their development as therapeutics are now more or less effectively tackled with recent scientific and technological advancements in integrated drug discovery settings. These include recent developments in synthetic organic chemistry, high-throughput recombinant production strategies, high-resolution analytical methods, high-throughput screening options, ingenious drug delivery strategies and novel formulation preparations. Here, we will briefly describe the key methodologies and strategies used in the therapeutic peptide development processes with selected examples of the most recent developments in the field. The aim of this review is to highlight the viable options a medicinal chemist may consider in order to improve a specific pharmacological property of interest in a peptide lead entity and thereby rationally assess the therapeutic potential of this class of molecules, which are traditionally (and incorrectly) considered ‘undruggable’.


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Chris Bauer ◽  
Ralf Herwig ◽  
Matthias Lienhard ◽  
Paul Prasse ◽  
Tobias Scheffer ◽  
...  

Background: There is a huge body of scientific literature describing the relation between tumor types and anti-cancer drugs. The vast amount of scientific literature makes it impossible for researchers and physicians to extract all relevant information manually. Methods: In order to cope with the large amount of literature we applied an automated text mining approach to assess the relations between the 30 most frequent cancer types and 270 anti-cancer drugs. We applied two different approaches: a classical text mining approach based on named entity recognition and an AI-based approach employing word embeddings. The consistency of literature mining results was validated with 3 independent methods: first, using data from FDA approvals; second, using experimentally measured IC-50 cell line data; and third, using clinical patient survival data. Results: We demonstrated that the automated text mining was able to successfully assess the relation between cancer types and anti-cancer drugs. All validation methods showed a good correspondence between the results from literature mining and independent confirmatory approaches. The relations between the most frequent cancer types and the drugs employed for their treatment were visualized in a large heatmap. All results are accessible in an interactive web-based knowledge base using the following link: https://knowledgebase.microdiscovery.de/heatmap. Conclusions: Our approach is able to assess the relations between compounds and cancer types in an automated manner. Both cancer types and compounds could be grouped into different clusters. Researchers can use the interactive knowledge base to inspect the presented results and follow their own research questions, for example the identification of novel indication areas for known drugs.
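
To make the "classical" route concrete, here is a minimal sketch of dictionary-based entity matching followed by sentence-level co-occurrence counting. It is not the authors' pipeline: the tiny vocabularies, the example sentences, and the function names (`find_terms`, `cooccurrence_counts`) are hypothetical, and a real system would use far larger dictionaries and a proper named entity recognizer.

```python
import re
from collections import Counter

# Hypothetical mini-dictionaries; the study above covers 30 cancer types and 270 drugs.
CANCER_TERMS = {"breast cancer", "melanoma", "leukemia"}
DRUG_TERMS = {"tamoxifen", "vemurafenib", "imatinib"}


def find_terms(sentence, vocabulary):
    """Return the vocabulary terms that occur as whole words in the sentence."""
    text = sentence.lower()
    return {t for t in vocabulary if re.search(r"\b" + re.escape(t) + r"\b", text)}


def cooccurrence_counts(sentences):
    """Count how often each (cancer, drug) pair is mentioned in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        cancers = find_terms(sentence, CANCER_TERMS)
        drugs = find_terms(sentence, DRUG_TERMS)
        for cancer in cancers:
            for drug in drugs:
                counts[(cancer, drug)] += 1
    return counts


# Made-up example sentences standing in for a literature corpus.
sentences = [
    "Tamoxifen is widely used in hormone receptor-positive breast cancer.",
    "Vemurafenib improved survival in BRAF-mutant melanoma.",
    "Imatinib transformed the treatment of chronic myeloid leukemia.",
    "Adjuvant tamoxifen reduced recurrence of breast cancer.",
    "Melanoma incidence is rising worldwide.",
]

counts = cooccurrence_counts(sentences)
for (cancer, drug), n in sorted(counts.items()):
    print(f"{cancer:15s} {drug:12s} {n}")
```

A matrix of such counts, one row per cancer type and one column per drug, is the kind of table that could then be normalized and shown as a heatmap.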


2021 ◽  
Vol 22 (6) ◽  
pp. 2822
Author(s):  
Efstathios Iason Vlachavas ◽  
Jonas Bohn ◽  
Frank Ückert ◽  
Sylvia Nürnberg

Recent advances in sequencing and biotechnological methodologies have led to the generation of large volumes of molecular data of different omics layers, such as genomics, transcriptomics, proteomics and metabolomics. Integration of these data with clinical information provides new opportunities to discover how perturbations in biological processes lead to disease. Using data-driven approaches for the integration and interpretation of multi-omics data could stably identify links between structural and functional information and propose causal molecular networks with potential impact on cancer pathophysiology. This knowledge can then be used to improve disease diagnosis, prognosis, prevention, and therapy. This review will summarize and categorize the most current computational methodologies and tools for integration of distinct molecular layers in the context of translational cancer research and personalized therapy. Additionally, the bioinformatics tools Multi-Omics Factor Analysis (MOFA) and netDX will be tested using omics data from public cancer resources, to assess their overall robustness, provide reproducible workflows for gaining biological knowledge from multi-omics data, and to comprehensively understand the significantly perturbed biological entities in distinct cancer types. We show that the performed supervised and unsupervised analyses result in meaningful and novel findings.


2018 ◽  
Vol 5 (3) ◽  
pp. 17
Author(s):  
Md. Julhas Miah ◽  
Md. Shahin Alam Khan ◽  
Omar Faruk Misto ◽  
Md. Rezaul Karim

The main purpose of this research is to identify the challenges and opportunities facing women, specifically women entrepreneurs, in the Sylhet area of Bangladesh. The report draws mainly on documents and practical observations. Women's entrepreneurship is an essential turning point for the betterment of women. Unlike in the past, women today are no longer confined to the kitchen. They have raised their voices against a conservative social outlook. Women are now entering the workforce, which gives them a self-identity and the right to participate in family decisions. In Sylhet, a large number of women run various types of business organizations. Almost 35% of the women entrepreneurs of Sylhet are engaged in boutique businesses. Others run businesses such as fashion houses and cloth stores, tailoring shops, parlors, and training centers. Most of them must keep up their family work in addition to running a business. They are not free from problems, however. The traditionalism of society, high interest rates on loans, and a lack of proper training facilities are the main barriers to the smooth running of their businesses. Every woman entrepreneur recommends that the Government take effective steps in this regard, such as providing training and low-interest loans, as this is a promising way to develop the country.


2014 ◽  
Vol 136 (11) ◽  
Author(s):  
Michael W. Glier ◽  
Daniel A. McAdams ◽  
Julie S. Linsey

Bioinspired design is the adaptation of methods, strategies, or principles found in nature to solve engineering problems. One formalized approach to bioinspired solution seeking is the abstraction of the engineering problem into a functional need and then seeking solutions to this function using a keyword-type search method on text-based biological knowledge. These function keyword search approaches have shown potential for success, but as with many text-based search methods, they produce a large number of results, many of little relevance to the problem in question. In this paper, we develop a method to train a computer to identify text passages more likely to suggest a solution to a human designer. The work presented examines the possibility of filtering biological keyword search results by using text mining algorithms to automatically identify which results are likely to be useful to a designer. The text mining algorithms are trained on a pair of surveys administered to human subjects to empirically identify a large number of sentences that are, or are not, helpful for idea generation. We develop and evaluate three text classification algorithms, namely, a Naïve Bayes (NB) classifier, a k nearest neighbors (kNN) classifier, and a support vector machine (SVM) classifier. Of these methods, the NB classifier generally had the best performance. Based on the analysis of 60 word stems, an NB classifier's precision is 0.87, recall is 0.52, and F score is 0.65. We find that word stem features that describe a physical action or process are correlated with helpful sentences. Similarly, we find biological jargon feature words are correlated with unhelpful sentences.
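
The reported figures can be checked with a short calculation. The sketch below assumes that the "F score" in the abstract is the balanced F1 score (the harmonic mean of precision and recall); the function name `f_score` and its `beta` parameter are introduced here for illustration only.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported for the NB classifier in the abstract.
f1 = f_score(0.87, 0.52)
print(round(f1, 2))  # 0.65
```

Under that assumption the result, about 0.651, agrees with the reported F score of 0.65. The comparatively low recall pulls the score well below the precision.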


2016 ◽  
Author(s):  
Marta R. Hidalgo ◽  
Cankut Cubuk ◽  
Alicia Amadoz ◽  
Francisco Salavert ◽  
José Carbonell-Caballero ◽  
...  

Understanding the aspects of the cell functionality that account for disease or drug action mechanisms is a main challenge for precision medicine. Here we propose a new method that models cell signaling using biological knowledge on signal transduction. The method recodes individual gene expression values (and/or gene mutations) into accurate measurements of changes in the activity of signaling circuits, which ultimately constitute high-throughput estimations of cell functionalities caused by gene activity within the pathway. Moreover, such estimations can be obtained either at cohort-level, in case/control comparisons, or personalized for individual patients. The accuracy of the method is demonstrated in an extensive analysis involving 5640 patients from 12 different cancer types. Circuit activity measurements not only have a high diagnostic value but also can be related to relevant disease outcomes such as survival, and can be used to assess therapeutic interventions.


2021 ◽  
Author(s):  
Félix Raimundo ◽  
Laetitia Papaxanthos ◽  
Céline Vallot ◽  
Jean-Philippe Vert

Single-cell omics technologies produce large quantities of data describing the genomic, transcriptomic or epigenomic profiles of many individual cells in parallel. In order to infer biological knowledge and develop predictive models from these data, machine learning (ML)-based models are increasingly used due to their flexibility, scalability, and impressive success in other fields. In recent years, we have seen a surge of new ML-based method development for low-dimensional representations of single-cell omics data, batch normalization, cell type classification, trajectory inference, gene regulatory network inference or multimodal data integration. To help readers navigate this fast-moving literature, we survey in this review recent advances in ML approaches developed to analyze single-cell omics data, focusing mainly on peer-reviewed publications published in the last two years (2019-2020).

