An integrated text mining framework for metabolic interaction network reconstruction

PeerJ ◽

10.7717/peerj.1811 ◽

2016 ◽

Vol 4 ◽

pp. e1811 ◽

Cited By ~ 8

Author(s):

Preecha Patumcharoenpol ◽

Narumol Doungpan ◽

Asawin Meechai ◽

Bairong Shen ◽

Jonathan H. Chan ◽

...

Keyword(s):

Text Mining ◽

Interaction Network ◽

Network Reconstruction ◽

Event Extraction ◽

Metabolic Interaction ◽

Test Corpus ◽

Metabolic Event ◽

Metabolic Interactions ◽

Complex Relationships ◽

Biological Entities

Text mining (TM) in the field of biology is fast becoming a routine analysis for the extraction and curation of biological entities (e.g., genes, proteins, simple chemicals) as well as their relationships. Due to the wide applicability of TM in situations involving complex relationships, it is valuable to apply TM to the extraction of metabolic interactions (i.e., enzyme and metabolite interactions) through metabolic events. Here we present an integrated TM framework containing two modules: one for the extraction of metabolic events (the Metabolic Event Extraction module, MEE) and one for the construction of a metabolic interaction network (the Metabolic Interaction Network Reconstruction module, MINR). The proposed integrated TM framework performed well based on standard measures of recall, precision and F-score. Evaluation of the MEE module using the constructed Metabolic Entities (ME) corpus yielded F-scores of 59.15% and 48.59% for the detection of metabolic events for production and consumption, respectively. In testing the entity tagger for Gene and Protein (GP) and metabolite on the test corpus, the obtained F-score was greater than 80% for the Superpathway of leucine, valine, and isoleucine biosynthesis. Mapping of enzyme and metabolite interactions through network reconstruction showed a fair performance for the MINR module on the test corpus, with an F-score >70%. Finally, an application of our integrated TM framework to large-scale data (i.e., EcoCyc extraction data) for reconstructing a metabolic interaction network showed reasonable precision of 69.93%, 70.63% and 46.71% for enzyme, metabolite and enzyme–metabolite interaction, respectively. This study presents the first open-source integrated TM framework for reconstructing a metabolic interaction network. This framework can be a powerful tool that helps biologists to extract metabolic events for further reconstruction of a metabolic interaction network.
The ME corpus, test corpus, source code, and virtual machine image with pre-configured software are available at www.sbi.kmutt.ac.th/ preecha/metrecon.
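
The abstract above reports results as recall, precision and F-score. As a reminder of how these standard measures relate, here is a minimal sketch in Python. The function name and the example counts are hypothetical and are not taken from the paper; the F-score shown is the usual balanced F1 (harmonic mean of precision and recall), which is an assumption about the variant the authors used.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Return (precision, recall, F1) from raw counts.

    Counts are non-negative integers; a zero denominator yields 0.0
    instead of raising, which is a convention chosen for this sketch.
    """
    predicted = true_pos + false_pos   # everything the system extracted
    relevant = true_pos + false_neg    # everything in the gold standard
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / relevant if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, for illustration only (not from the paper):
# 6 correctly extracted events, 2 spurious ones, 4 missed ones.
p, r, f = precision_recall_f1(6, 2, 4)
print(f"precision={p:.2%} recall={r:.2%} F1={f:.2%}")
```

Because F1 is a harmonic mean, it stays close to the lower of the two values, so a system cannot score well by maximizing only precision or only recall.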

Peer Review #1 of "An integrated text mining framework for metabolic interaction network reconstruction (v0.2)"

10.7287/peerj.1811v0.2/reviews/1 ◽

2016 ◽

Keyword(s):

Text Mining ◽

Peer Review ◽

Interaction Network ◽

Network Reconstruction ◽

Metabolic Interaction

Peer Review #1 of "An integrated text mining framework for metabolic interaction network reconstruction (v0.1)"

10.7287/peerj.1811v0.1/reviews/1 ◽

2016 ◽

Keyword(s):

Text Mining ◽

Peer Review ◽

Interaction Network ◽

Network Reconstruction ◽

Metabolic Interaction

Peer Review #2 of "An integrated text mining framework for metabolic interaction network reconstruction (v0.1)"

10.7287/peerj.1811v0.1/reviews/2 ◽

2016 ◽

Keyword(s):

Text Mining ◽

Peer Review ◽

Interaction Network ◽

Network Reconstruction ◽

Metabolic Interaction

Peer Review #2 of "An integrated text mining framework for metabolic interaction network reconstruction (v0.2)"

10.7287/peerj.1811v0.2/reviews/2 ◽

2016 ◽

Keyword(s):

Text Mining ◽

Peer Review ◽

Interaction Network ◽

Network Reconstruction ◽

Metabolic Interaction

Events Automatic Extraction from Arabic Texts

Natural Language Processing ◽

10.4018/978-1-7998-0951-7.ch078 ◽

2020 ◽

pp. 1686-1704

Author(s):

Emna Hkiri ◽

Souheyl Mallat ◽

Mounir Zrigui

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Text Mining ◽

Machine Translation ◽

Language Processing ◽

Question Answering ◽

Arabic Language ◽

Event Extraction ◽

Mining Machine ◽

Open Domain

The event extraction task consists of identifying and classifying events within open-domain text. It is still very new for the Arabic language, whereas it has reached maturity for some languages such as English and French. Event extraction has also been shown to improve the performance of Natural Language Processing tasks such as information retrieval, question answering, text mining, and machine translation. In this article, we present an ongoing effort to build a system for event extraction from Arabic texts using the GATE platform and other tools.

emiRIT: A text-mining based resource for microRNA information

10.1101/2020.11.05.370593 ◽

2020 ◽

Author(s):

Debarati Roychowdhury ◽

Samir Gupta ◽

Xihan Qin ◽

Cecilia N. Arighi ◽

K. Vijay-Shanker

Keyword(s):

Text Mining ◽

Information Needs ◽

Large Scale ◽

Biological Process ◽

Essential Gene ◽

Mirna Gene ◽

Easy Access ◽

Context Specific ◽

Biological Entities ◽

User Friendly

Motivation: microRNAs (miRNAs) are essential gene regulators and their dysregulation often leads to diseases. Easy access to miRNA information is crucial for interpreting generated experimental data, connecting facts across publications, and developing new hypotheses built on previous knowledge. Here, we present emiRIT, a text mining-based resource, which presents miRNA information mined from the literature through a user-friendly interface. Results: We collected 149,233 miRNA-PubMed ID pairs from Medline between January 1997 and May 2020. emiRIT currently contains miRNA-gene regulation (60,491 relations); miRNA-disease (cancer) (12,300 relations); miRNA-biological process and pathways (23,390 relations); and circulatory miRNAs in extracellular locations (3,782 relations). Biological entities and their relation to miRNAs were extracted from Medline abstracts using publicly available and in-house developed text mining tools, and the entities were normalized to facilitate querying and integration. We built a database and an interface to store and access the integrated data, respectively. Conclusion: We provide an up-to-date and user-friendly resource to facilitate access to comprehensive miRNA information from the literature on a large scale, enabling users to navigate through different roles of miRNA and examine them in a context specific to their information needs. To assess our resource’s information coverage, in the absence of gold standards, we have conducted two case studies focusing on the target and differential expression information of miRNAs in the context of diseases. Database URL: https://research.bioinformatics.udel.edu/emirit/

Species-wide Metabolic Interaction Network for Understanding Natural Lignocellulose Digestion in Termite Gut Microbiota

Scientific Reports ◽

10.1038/s41598-019-52843-w ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 5

Author(s):

Pritam Kundu ◽

Bharat Manna ◽

Subham Majumder ◽

Amit Ghosh

Keyword(s):

Gut Microbiota ◽

Structural Complexity ◽

Interaction Network ◽

Biofuel Production ◽

System Level ◽

Metagenomic Data ◽

Metabolic Interaction ◽

Nasutitermes Corniger ◽

Termite Gut ◽

Microbial Symbionts

The structural complexity of lignocellulosic biomass hinders the extraction of cellulose, and it has remained a challenge for decades in the biofuel production process. However, wood-feeding organisms like termites have developed an efficient natural lignocellulolytic system with the help of specialized gut microbial symbionts. Despite an enormous amount of high-throughput metagenomic data, the specific contributions of each individual microbe to this lignocellulolytic functionality remain unclear. The metabolic cross-communication and interdependence that drive the community structure inside the gut microbiota are yet to be explored. We have constructed a species-wide metabolic interaction network of the termite gut microbiome to gain a system-level understanding of metabolic communication. Metagenomic data of Nasutitermes corniger have been analyzed to identify microbial communities in different gut segments. A comprehensive metabolic cross-feeding network of 205 microbes and 265 metabolites was developed using published experimental data. Reconstruction of an inter-species influence network elucidated the role of 37 influential microbes in maintaining a stable and functional microbiota. Furthermore, in order to understand the natural lignocellulose digestion inside the N. corniger gut, the metabolic functionality of each influencer was assessed, which further elucidated 15 crucial hemicellulolytic microbes and their corresponding enzyme machinery.

The Potential role of Procyanidin as a Therapeutic Agent against SARS-CoV-2: A Text Mining, Molecular Docking and Molecular Dynamics Simulation Approach

10.26434/chemrxiv.12579599 ◽

2020 ◽

Author(s):

Nikhil Maroli ◽

Balu Bhasuran ◽

Jeyakumar Natarajan ◽

Ponmalai Kolandaivel

Keyword(s):

Molecular Dynamics ◽

Molecular Docking ◽

Molecular Dynamics Simulation ◽

Text Mining ◽

Van Der Waals ◽

Interaction Network ◽

Van Der Waals Interactions ◽

Entity Recognition ◽

Dynamics Simulation ◽

Inhibition Mechanism

A novel coronavirus (SARS-CoV-2) has caused a major outbreak in humans all over the world. Several proteins interact during the entry and replication of this virus in humans. Here, we have used text mining and a named entity recognition method to identify co-occurrence of the important COVID-19 genes/proteins in the interaction network based on the frequency of the interaction. Network analysis revealed a set of genes/proteins, highly dense gene/protein clusters and sub-networks of angiotensin-converting enzyme 2 (ACE2), helicase, spike (S) protein (trimeric), membrane (M) protein, envelope (E) protein, and the nucleocapsid (N) protein. The isolated proteins are screened against procyanidin, a flavonoid from plants, using molecular docking. Further, molecular dynamics simulations of critical proteins such as ACE2, Mpro and spike protein are performed to elucidate the inhibition mechanism. The strong network of hydrogen bonds and hydrophobic interactions, along with van der Waals interactions, inhibits receptors that are essential to the entry and replication of SARS-CoV-2. The binding energy, which arises largely from van der Waals interactions and is calculated through molecular mechanics Poisson-Boltzmann surface area (ACE2=-50.21 ± 6.3, Mpro=-89.50 ± 6.32 and spike=-23.06 ± 4.39), also confirms the affinity of procyanidin towards the critical receptors.
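
The co-occurrence idea mentioned in this abstract (linking entities by how often they appear together in text) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the toy sentences and the naive case-insensitive substring matching are all hypothetical, and the substring matching stands in for the named entity recognition step, whose actual tooling the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences, entities):
    """Count how often each pair of entities appears in the same sentence.

    Matching is a naive case-insensitive substring test, used here only
    as a placeholder for a real named entity recognizer.
    """
    counts = Counter()
    for sentence in sentences:
        lowered = sentence.lower()
        # Sort so that each pair has one canonical ordering.
        present = sorted(e for e in entities if e.lower() in lowered)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts


# Toy sentences written for this sketch (not taken from any corpus).
toy_sentences = [
    "ACE2 binds the spike protein.",
    "Mpro cleaves viral polyproteins.",
    "The spike protein engages ACE2 on host cells.",
    "Mpro and spike are both studied as drug targets.",
]
edges = cooccurrence_counts(toy_sentences, ["ACE2", "spike", "Mpro"])
for pair, weight in edges.most_common():
    print(pair, weight)
```

The resulting pair counts can serve as edge weights in an interaction network, with higher counts suggesting more frequently reported associations.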

Protein Interaction Network Reconstruction Through Ensemble Deep Learning With Attention Mechanism

Frontiers in Bioengineering and Biotechnology ◽

10.3389/fbioe.2020.00390 ◽

2020 ◽

Vol 8 ◽

Author(s):

Feifei Li ◽

Fei Zhu ◽

Xinghong Ling ◽

Quan Liu

Keyword(s):

Deep Learning ◽

Protein Interaction ◽

Protein Interaction Network ◽

Interaction Network ◽

Network Reconstruction ◽

Attention Mechanism

DDA: A Novel Network-Based Scoring Method to Identify Disease-Disease Associations

Bioinformatics and Biology Insights ◽

10.4137/bbi.s35237 ◽

2015 ◽

Vol 9 ◽

pp. BBI.S35237 ◽

Cited By ~ 8

Author(s):

Apichat Suratanee ◽

Kitiporn Plaimas

Keyword(s):

Large Scale ◽

Association Studies ◽

Area Under The Curve ◽

Interaction Network ◽

Disease Diagnosis ◽

Scoring Method ◽

Protein Protein Interaction ◽

Disease Associations ◽

Statistical Relationships ◽

Complex Relationships

Categorizing human diseases provides higher efficiency and accuracy for disease diagnosis, prognosis, and treatment. Disease-disease association (DDA) is valuable information that indicates the large-scale structure of complex relationships among diseases. However, the number of known and reliable associations is very small. Therefore, identification of DDAs is a challenging task in systems biology and medicine. Here, we developed a novel network-based scoring algorithm called DDA to identify the relationships between diseases in a large-scale study. Our method is based on a random walk prioritization in a protein-protein interaction network. This approach considers not only whether two diseases directly share associated genes but also the statistical relationships between two different diseases using known disease-related genes. Predicted associations were validated against known DDAs from a database and against supporting literature. The method yielded a good performance with an area under the curve of 71% and outperformed other standard association indices. Furthermore, novel DDAs and relationships among diseases from the cluster analysis were reported. This method is effective at identifying disease-disease relationships on an interaction network and can also be generalized to other association studies to further enhance knowledge in medical studies.
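
The abstract describes the method as based on random walk prioritization in a protein-protein interaction network. One common form of such prioritization is the random walk with restart, sketched below. This is not the authors' DDA algorithm: the function name, the restart probability of 0.3, the toy graph and the node names are all assumptions made for illustration, and the sketch assumes an undirected graph in which every node has at least one neighbor.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3,
                             tol=1e-10, max_iter=1000):
    """Score nodes by their proximity to a set of seed nodes.

    adjacency: dict mapping each node to a list of its neighbors
               (assumed undirected, every node with >= 1 neighbor).
    seeds:     nodes where the walker starts and restarts,
               e.g. genes known to be related to one disease.
    """
    nodes = list(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        updated = {n: restart * start[n] for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbor.
        for n in nodes:
            share = (1.0 - restart) * scores[n] / len(adjacency[n])
            for neighbor in adjacency[n]:
                updated[neighbor] += share
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical path-shaped network g1 - g2 - g3 - g4 with g1 as the seed.
toy_network = {
    "g1": ["g2"],
    "g2": ["g1", "g3"],
    "g3": ["g2", "g4"],
    "g4": ["g3"],
}
ranking = random_walk_with_restart(toy_network, seeds={"g1"})
for node in sorted(ranking, key=ranking.get, reverse=True):
    print(node, round(ranking[node], 4))
```

Nodes closer to the seed receive higher steady-state scores, which is the property that makes this kind of walk usable for ranking candidate genes or comparing the network neighborhoods of two diseases.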