BioLitMine: Advanced Mining of Biomedical and Biological Literature About Human Genes and Genes from Major Model Organisms

2020 ◽  
Vol 10 (12) ◽  
pp. 4531-4539
Author(s):  
Yanhui Hu ◽  
Verena Chung ◽  
Aram Comjean ◽  
Jonathan Rodiger ◽  
Fnu Nipun ◽  
...  

The accumulation of biological and biomedical literature outpaces the ability of most researchers and clinicians to stay abreast of their own immediate fields, let alone a broader range of topics. Although available search tools support identification of relevant literature, finding relevant and key publications is not always straightforward. For example, important publications might be missed in searches with an official gene name due to gene synonyms. Moreover, ambiguity of gene names can result in retrieval of a large number of irrelevant publications. To address these issues and help researchers and physicians quickly identify relevant publications, we developed BioLitMine, an advanced literature mining tool that takes advantage of the medical subject heading (MeSH) index and gene-to-publication annotations already available for PubMed literature. Using BioLitMine, a user can identify what MeSH terms are represented in the set of publications associated with a given gene of interest, or start with a term and identify relevant publications. Users can also use the tool to find co-cited genes and build a literature co-citation network. In addition, BioLitMine can help users build a gene list relevant to a MeSH term, such as a list of genes relevant to “stem cells” or “breast neoplasms.” Users can also start with a gene or pathway of interest and identify authors associated with that gene or pathway, a feature that makes it easier to identify experts who might serve as collaborators or reviewers. Altogether, BioLitMine extends the value of PubMed-indexed literature and its existing expert curation by providing a robust and gene-centric approach to retrieval of relevant information.
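The two lookups described in this abstract (MeSH terms represented for a gene, and a gene list for a MeSH term) can be illustrated with a minimal sketch. It is not BioLitMine's implementation: the gene names, publication IDs, annotations, and function names below are hypothetical toy values, and the sketch assumes only that gene-to-publication and publication-to-MeSH mappings are available as in-memory dictionaries.

```python
from collections import Counter

# Hypothetical toy annotations for illustration only; not real PubMed data.
GENE_TO_PMIDS = {
    "geneA": {1, 2, 3},
    "geneB": {2, 3},
    "geneC": {4},
}
PMID_TO_MESH = {
    1: {"Stem Cells"},
    2: {"Stem Cells", "Breast Neoplasms"},
    3: {"Breast Neoplasms"},
    4: {"Apoptosis"},
}


def mesh_terms_for_gene(gene):
    """Count how many of the gene's publications carry each MeSH term."""
    counts = Counter()
    for pmid in GENE_TO_PMIDS.get(gene, ()):
        counts.update(PMID_TO_MESH.get(pmid, ()))
    return counts


def genes_for_mesh_term(term):
    """Count, per gene, the publications annotated with the given MeSH term."""
    counts = Counter()
    for gene, pmids in GENE_TO_PMIDS.items():
        n = sum(1 for pmid in pmids if term in PMID_TO_MESH.get(pmid, ()))
        if n:
            counts[gene] = n
    return counts


if __name__ == "__main__":
    print(mesh_terms_for_gene("geneA").most_common())
    print(genes_for_mesh_term("Stem Cells").most_common())
```

Ranking by raw publication counts is the simplest possible choice here; a real tool would likely also need to resolve gene synonyms and weigh term specificity, which this sketch does not attempt.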


Database ◽  
2021 ◽  
Vol 2021 ◽  
Author(s):  
Valerio Arnaboldi ◽  
Jaehyoung Cho ◽  
Paul W Sternberg

Abstract Finding relevant information from newly published scientific papers is becoming increasingly difficult due to the pace at which articles are published every year as well as the increasing amount of information per paper. Biocuration and model organism databases provide a map for researchers to navigate through the complex structure of the biomedical literature by distilling knowledge into curated and standardized information. In addition, scientific search engines such as PubMed and text-mining tools such as Textpresso allow researchers to easily search for specific biological aspects from newly published papers, facilitating knowledge transfer. However, digesting the information returned by these systems—often a large number of documents—still requires considerable effort. In this paper, we present Wormicloud, a new tool that summarizes scientific articles in a graphical way through word clouds. This tool is aimed at facilitating the discovery of new experimental results not yet curated by model organism databases and is designed for both researchers and biocurators. Wormicloud is customized for the Caenorhabditis elegans literature and provides several advantages over existing solutions, including being able to perform full-text searches through Textpresso, which provides more accurate results than other existing literature search engines. Wormicloud is integrated through direct links from gene interaction pages in WormBase. Additionally, it allows analysis of the gene sets obtained from literature searches with other WormBase tools such as SimpleMine and Gene Set Enrichment. Database URL: https://wormicloud.textpressolab.com


Database ◽  
2019 ◽  
Vol 2019 ◽  
Author(s):  
Peter Brown ◽  
Aik-Choon Tan ◽  
Mohamed A El-Esawi ◽  
Thomas Liehr ◽  
Oliver Blanck ◽  
...  

Abstract Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.


2012 ◽  
Vol 6 ◽  
pp. BBI.S9902 ◽  
Author(s):  
Divya P. Syamaladevi ◽  
Margaret S Sunitha ◽  
S. Kalaimathy ◽  
Chandrashekar C. Reddy ◽  
Mohammed Iftekhar ◽  
...  

Myosins are one of the largest protein superfamilies with 24 classes. They have conserved structural features and catalytic domains yet show huge variation at different domains, resulting in a variety of functions. Myosins drive various kinds of cellular processes and motility up to the level of organisms. They are ATPases that utilize the chemical energy released by ATP hydrolysis to bring about conformational changes leading to a motor function. Myosins are important as they are involved in almost all cellular activities ranging from cell division to transcriptional regulation. They are crucial due to their involvement in many congenital diseases symptomatized by muscular malfunctions, cardiac diseases, deafness, neural and immunological dysfunction, and so on, many of which lead to death at an early age. We present Myosinome, a database of selected myosin classes (myosin II, V, and VI) from five model organisms. This knowledge base provides the sequences, phylogenetic clustering, domain architectures of myosins and molecular models, structural analyses, and relevant literature of their coiled-coil domains. In the current version of Myosinome, information is presented about 71 myosin sequences belonging to three myosin classes (myosin II, V, and VI) in five model organisms (Homo sapiens, Mus musculus, D. melanogaster, C. elegans and S. cerevisiae) identified using bioinformatics surveys; several of them are yet to be functionally characterized. As these proteins are involved in congenital diseases, such a database would be useful in short-listing candidates for gene therapy and drug development. The database can be accessed from http://caps.ncbs.res.in/myosinome .


2012 ◽  
Vol 2012 ◽  
pp. 1-12 ◽  
Author(s):  
Claudia P. Spampinato ◽  
Diego F. Gomez-Casati

Different model organisms, such as Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, mouse, cultured human cell lines, among others, were used to study the mechanisms of several human diseases. Since human genes and proteins have been structurally and functionally conserved in plant organisms, the use of plants, especially Arabidopsis thaliana, as a model system to relate molecular defects to clinical disorders has recently increased. Here, we briefly review our current knowledge of human diseases of nuclear and mitochondrial origin and summarize the experimental findings of plant homologs implicated in each process.


2021 ◽  
Author(s):  
Oliver Thorn-Seshold ◽  
Joyce Meiring

Microtubule dynamics can be inhibited with sub-second temporal resolution and cellular-scale spatial resolution, by using precise illuminations to optically pattern where and when photoswitchable microtubule-inhibiting chemical reagents exert their latent bioactivity. The recently-available reagents (SBTub, PST, STEpo, AzTax, PHTub) now enable researchers to use light to reversibly modulate microtubule-dependent processes in eukaryotes, in 2D and 3D cell culture as well as in vivo, across a variety of model organisms: with applications in fields from cargo transport to cell migration, cell division, and embryonic development.
However, a wide knowledge gap has remained in the literature, which has blocked further translation of these and many other classes of photopharmaceuticals. No generally-applicable procedures or workflows to establish biological assays using photopharmaceuticals have been published. Accordingly, the rate of adoption of photopharmaceutical tools in the broader chemical biology community (beyond the original chemical developers of the tools) has remained very low. Vital information about assay benchmarking for photoconversion, testing for isomer solubility, proving the retention of mechanism of action, estimating the limits of phototoxicity, etc. has either simply not been formalised in the literature, or has remained buried in diverse reports without being unified and codified for an audience beyond that of synthetic organic chemists.
Here we have developed a robust four-step assay establishment procedure to optimise assay parameters for achieving reliable photocontrol over microtubule dynamics, that is applicable to diverse families of photoswitchable inhibitors. This procedure also controls for these common sources of irreproducibility and includes numerous troubleshooting steps. We also collect together the relevant information for non-chemist "users" such as microscopists and biologists, to introduce the theory of small molecule photoswitching; the unique features, usage requirements, and limitations that photoswitchable chemical reagents have; and the specific performance features of the major classes of photoswitchable microtubule inhibitors that are currently available; to highlight their properties that suit them to different applications. The generally-applicable workflows that we present allow establishing cellular assays optically controlling microtubule dynamics in a temporally reversible fashion with spatial specificity down to a single selected cell within a field of view. These workflows and methods also equip the reader to tackle advanced uses of photoswitchable chemical reagents for general protein targets, in 3D culture and in vivo, and can represent an important bridge to reach the high-value biological applications that photopharmacology can promise.


2018 ◽  
Vol 19 (11) ◽  
pp. 3390 ◽  
Author(s):  
Sudip Paudel ◽  
Regan Sindelar ◽  
Margaret Saha

Accumulating evidence over the past three decades suggests that altered calcium signaling during development may be a major driving force for adult pathophysiological events. Well over a hundred human genes encode proteins that are specifically dedicated to calcium homeostasis and calcium signaling, and the majority of these are expressed during embryonic development. Recent advances in molecular techniques have identified impaired calcium signaling during development due to either mutations or dysregulation of these proteins. This impaired signaling has been implicated in various human diseases ranging from cardiac malformations to epilepsy. Although the molecular basis of these and other diseases have been well studied in adult systems, the potential developmental origins of such diseases are less well characterized. In this review, we will discuss the recent evidence that examines different patterns of calcium activity during early development, as well as potential medical conditions associated with its dysregulation. Studies performed using various model organisms, including zebrafish, Xenopus, and mouse, have underscored the critical role of calcium activity in infertility, abortive pregnancy, developmental defects, and a range of diseases which manifest later in life. Understanding the underlying mechanisms by which calcium regulates these diverse developmental processes remains a challenge; however, this knowledge will potentially enable calcium signaling to be used as a therapeutic target in regenerative and personalized medicine.


2020 ◽  
Vol 8 (2) ◽  
pp. 948-956
Author(s):  
Nur Syuhada Jasni ◽  
Haslinda Yusoff

Purpose: The purpose of this study is to investigate business practices in both accelerating digitalisation and addressing social issues among Malaysian companies. Methodology: This study uses a sample consisting of four top telecommunication companies listed on Bursa Malaysia. This study provides relevant literature on the social value creation concept from the corporate perspective. In addition, content analysis is used to extract relevant information from the sustainability reports of the companies. Results: Results indicate that three out of four companies in the sample are very proactive in embracing the social value creation concept that aligns with national objectives and the Sustainable Development Goals (SDGs). All companies, however, similarly reported providing digitalisation assistance to rural and urban poor communities as their social contribution. Implications: These results provide input on the integration of accelerating digitalisation and addressing social issues, with a focus on social value creation. Management should understand that financial implications have become an important component of social projects, and should therefore establish an effective business strategy towards Integrated Reporting (IR) 4.0 that has a significant real impact on society and the country.


2017 ◽  
Vol 4 (3) ◽  
pp. e337 ◽  
Author(s):  
Sundararajan Srinivasan ◽  
Marco Di Dario ◽  
Alessandra Russo ◽  
Ramesh Menon ◽  
Elena Brini ◽  
...  

Objective: To perform systematic transcriptomic analysis of multiple sclerosis (MS) risk genes in peripheral blood mononuclear cells (PBMCs) of subjects with distinct MS stages and describe the pathways characterized by dysregulated gene expressions. Methods: We monitored gene expression levels in PBMCs from 3 independent cohorts for a total of 297 cases (including clinically isolated syndromes (CIS), relapsing-remitting MS, primary and secondary progressive MS) and 96 healthy controls by distinct microarray platforms and quantitative PCR. Differential expression and pathway analyses for distinct MS stages were defined and validated by literature mining. Results: Genes located in the vicinity of MS risk variants displayed altered expression in peripheral blood at distinct stages of MS compared with the healthy population. The frequency of dysregulation was significantly higher than expected in CIS and progressive forms of MS. Pathway analysis for each MS stage–specific gene list showed that dysregulated genes contributed to pathogenic processes with scientific evidence in MS. Conclusions: Systematic gene expression analysis in PBMCs highlighted selective dysregulation of MS susceptibility genes playing a role in novel and well-known pathogenic pathways.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Maryam Yaghtin ◽  
Hajar Sotudeh ◽  
Alireza Nikseresht ◽  
Mahdieh Mirzabeigi

Purpose: Co-citation frequency, defined as the number of documents co-citing two articles, is considered a quantitative, and thus efficient, proxy of subject relatedness or prestige of the co-cited articles. Despite its quantitative nature, it is found effective in retrieving and evaluating documents, signifying its linkage with the related documents' contents. To better understand the dynamism of the citation network, the present study aims to investigate various content features giving rise to the measure. Design/methodology/approach: The present study examined the interaction of different co-citation features in explaining the co-citation frequency. The features include the co-cited works' similarities in their full-texts, Medical Subject Headings (MeSH) terms, co-citation proximity, opinions and co-citances. A test collection is built using the CITREC dataset. The data were analyzed using natural language processing (NLP) and opinion mining techniques. A linear model was developed to regress the objective and subjective content-based co-citation measures against the natural log of the co-citation frequency. Findings: The dimensions of co-citation similarity, either subjective or objective, play significant roles in predicting co-citation frequency. The model can predict about half of the co-citation variance. The interaction of co-opinionatedness and non-co-opinionatedness is the strongest factor in the model. Originality/value: This is the first study to reveal that both the objective and subjective similarities could significantly predict the co-citation frequency. The findings re-confirm the citation analysis assumption claiming the connection between the cognitive layers of cited documents and citation measures in general and the co-citation frequency in particular. Peer review: The peer review history for this article is available at https://publons.com/publon/10.1108/OIR-04-2020-0126.
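The definition of co-citation frequency given in this abstract (the number of documents that cite both articles of a pair) can be made concrete with a short sketch. The document and article labels below are hypothetical toy data, the function name is an assumption introduced for illustration, and the code is not taken from the study or from the CITREC dataset.

```python
from collections import Counter
from itertools import combinations


def cocitation_frequencies(reference_lists):
    """Map each unordered pair of cited articles to the number of citing
    documents whose reference list contains both articles."""
    counts = Counter()
    for refs in reference_lists.values():
        # Deduplicate and sort so each pair is counted once per citing
        # document and always appears in the same (a, b) order.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


# Hypothetical citing documents and their reference lists.
REFERENCE_LISTS = {
    "doc1": ["A", "B", "C"],
    "doc2": ["A", "B"],
    "doc3": ["B", "C"],
}

if __name__ == "__main__":
    for pair, n in sorted(cocitation_frequencies(REFERENCE_LISTS).items()):
        print(pair, n)
```

The study regresses content-based features against the natural log of this count; taking `math.log` of each frequency would be the corresponding transformation, though the features themselves are outside the scope of this sketch.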

