Deep context of citations using machine-learning models in scholarly full-text articles

2018 ◽  
Vol 117 (3) ◽  
pp. 1645-1662 ◽  
Author(s):  
Saeed-Ul Hassan ◽  
Mubashir Imran ◽  
Sehrish Iqbal ◽  
Naif Radi Aljohani ◽  
Raheel Nawaz


2020 ◽  
Author(s):  
Anna Price ◽  
Matthew Mort ◽  
David N. Cooper ◽  
Kevin E. Ashelford

Abstract Background: We have applied machine learning techniques to automate the screening of biomedical literature prior to the manual curation of clinical databases, such as that performed by the Human Gene Mutation Database (HGMD). Methods: We have developed two machine learning models, one based on title and abstract data only, the other on the full text of the article. The models were built using a Natural Language Processing (NLP) pipeline and a logistic regression classifier. Our pipelines are implemented in Python and can be run using Docker. They are made available to the wider community via GitHub (https://github.com/annacprice/nlp-bio-tools) and Docker Hub. Results: During testing, both models performed well, correctly predicting HGMD-relevant articles more than 93% of the time and correctly discarding irrelevant articles more than 96% of the time, with Matthews Correlation Coefficients (MCCs) of over 0.89. Evaluation of the finalised model using an unseen validation dataset demonstrated that the full text model correctly predicted HGMD-relevant articles more than 97% of the time, an accuracy 9.5% higher than that obtained with the title/abstract model. Conclusions: Through this work we have demonstrated that machine learning models can act as an effective pre-screen of biomedical literature, with the results indicating that a full text approach to screening biomedical literature is preferable to using just the title/abstract data.
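
The three figures quoted in the Results (the share of relevant articles correctly predicted, the share of irrelevant articles correctly discarded, and the MCC) can all be derived from a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' code: the function name `screening_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def screening_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts.

    tp: relevant articles predicted relevant
    fp: irrelevant articles predicted relevant
    tn: irrelevant articles predicted irrelevant
    fn: relevant articles predicted irrelevant
    """
    # Fraction of relevant articles that were correctly predicted.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Fraction of irrelevant articles that were correctly discarded.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews Correlation Coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts for illustration only; these are NOT the paper's data.
sens, spec, mcc = screening_metrics(tp=93, fp=4, tn=96, fn=7)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

Unlike plain accuracy, the MCC uses all four cells of the confusion matrix, which makes it a useful single summary when the relevant and irrelevant classes differ in size.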


2020 ◽  
Vol 2 (1) ◽  
pp. 3-6
Author(s):  
Eric Holloway

Imagination Sampling is the use of a person as an oracle for generating or improving machine learning models. Previous work demonstrated a general system for using Imagination Sampling to obtain multibox models. Here, the possibility of importing such models as the starting point for further automatic enhancement is explored.


2021 ◽  
Author(s):  
Norberto Sánchez-Cruz ◽  
Jose L. Medina-Franco

Epigenetic targets are a significant focus for drug discovery research, as demonstrated by the eight approved epigenetic drugs for the treatment of cancer and the increasing availability of chemogenomic data related to epigenetics. These data represent a large number of structure-activity relationships that have not been exploited thus far for the development of predictive models to support medicinal chemistry efforts. Herein, we report the first large-scale study of 26318 compounds with a quantitative measure of biological activity for 55 protein targets with epigenetic activity. Through a systematic comparison of machine learning models trained on molecular fingerprints of different design, we built predictive models with high accuracy for the epigenetic target profiling of small molecules. The models were thoroughly validated, showing mean precisions up to 0.952 for the epigenetic target prediction task. Our results indicate that the models reported herein have considerable potential to identify small molecules with epigenetic activity. Therefore, our results were implemented as a freely accessible and easy-to-use web application.


2020 ◽  
Author(s):  
Shreya Reddy ◽  
Lisa Ewen ◽  
Pankti Patel ◽  
Prerak Patel ◽  
Ankit Kundal ◽  
...  

As bots become more prevalent and smarter in the modern age of the internet, it becomes ever more important that they be identified and removed. Recent research has indicated that machine learning methods are accurate and are the gold standard for bot identification on social media. Unfortunately, machine learning models are not without drawbacks, such as lengthy training times, difficult feature selection, and overwhelming pre-processing tasks. To overcome these difficulties, we propose a blockchain framework for bot identification. At present, it is unknown how this method will perform, but it serves to demonstrate the existence of a substantial research gap in this area.

