BiDI: Using Machine Learning to collect and facilitate remote access to biomedical databases (Preprint)

2020 ◽  
Author(s):  
Eduardo Rosado ◽  
Miguel Garcia-Remesal Sr ◽  
Sergio Paraiso-Medina Sr ◽  
Alejandro Pazos Sr ◽  
Victor Maojo Sr

BACKGROUND Currently, existing biomedical literature repositories do not commonly provide users with specific means to locate and remotely access biomedical databases. OBJECTIVE To address this issue we developed BiDI (Biomedical Database Inventory), a repository linking to biomedical databases automatically extracted from the scientific literature. BiDI provides an index of data resources and a path to access them in a seamless manner. METHODS We designed an ensemble of deep learning methods to extract database mentions. To train the system we annotated a set of 1,242 articles that included mentions of database publications. This dataset was used along with transfer learning techniques to train an ensemble of deep learning NLP models for the task of database publication detection. RESULTS The system obtained an F1 score of 0.929 on database detection, showing high precision and recall values. Applying this model to the PubMed and PubMed Central databases we identified over 10,000 unique databases. The ensemble also extracts the web links to the reported databases, discarding the irrelevant links. For the extraction of web links the model achieved a cross-validated F1 score of 0.908. We show two use cases, related to “omics” and the COVID-19 pandemic. CONCLUSIONS BiDI enables access to biomedical resources over the Internet and facilitates data-driven research and other scientific initiatives. The repository is available at http://gib.fi.upm.es/bidi/ and will be regularly updated with an automatic text processing pipeline. The approach can be reused to create repositories of different types (biomedical and others).

10.2196/22976 ◽  
2021 ◽  
Vol 9 (2) ◽  
pp. e22976
Author(s):  
Eduardo Rosado ◽  
Miguel Garcia-Remesal ◽  
Sergio Paraiso-Medina ◽  
Alejandro Pazos ◽  
Victor Maojo

Background Currently, existing biomedical literature repositories do not commonly provide users with specific means to locate and remotely access biomedical databases. Objective To address this issue, we developed the Biomedical Database Inventory (BiDI), a repository linking to biomedical databases automatically extracted from the scientific literature. BiDI provides an index of data resources and a path to access them seamlessly. Methods We designed an ensemble of deep learning methods to extract database mentions. To train the system, we annotated a set of 1242 articles that included mentions of database publications. Such a data set was used along with transfer learning techniques to train an ensemble of deep learning natural language processing models targeted at database publication detection. Results The system obtained an F1 score of 0.929 on database detection, showing high precision and recall values. When applying this model to the PubMed and PubMed Central databases, we identified over 10,000 unique databases. The ensemble model also extracted the weblinks to the reported databases and discarded irrelevant links. For the extraction of weblinks, the model achieved a cross-validated F1 score of 0.908. We show two use cases: one related to “omics” and the other related to the COVID-19 pandemic. Conclusions BiDI enables access to biomedical resources over the internet and facilitates data-driven research and other scientific initiatives. The repository is openly available online and will be regularly updated with an automatic text processing pipeline. The approach can be reused to create repositories of different types (ie, biomedical and others).
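
As a reading aid, the F1 scores reported above are harmonic means of precision and recall. The sketch below shows that computation in Python, along with one simple way an ensemble can combine binary predictions. The abstract does not say how the BiDI ensemble merges its members, so majority voting is an assumption here, and all function names are illustrative rather than taken from the BiDI code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def prf_from_counts(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall and F1 from true-positive, false-positive
    and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


def majority_vote(predictions: list[list[int]]) -> list[int]:
    """Combine 0/1 predictions from several models (one inner list per
    model) by strict majority. Assumed combination rule, not BiDI's."""
    n_models = len(predictions)
    return [1 if sum(column) * 2 > n_models else 0
            for column in zip(*predictions)]
```

For example, `prf_from_counts(8, 2, 2)` gives precision, recall and F1 all equal to 0.8, because the harmonic mean of two equal values is that value.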


2021 ◽  
Author(s):  
Chun-chao Lo ◽  
Shubo Tian ◽  
Yuchuan Tao ◽  
Jie Hao ◽  
Jinfeng Zhang

Most queries submitted to a literature search engine can be more precisely written as sentences to give the search engine more specific information. Sentence queries should be more effective, in principle, than short queries with small numbers of keywords. Querying with full sentences is also a key step in question-answering and citation recommendation systems. Despite the considerable progress in natural language processing (NLP) in recent years, using sentence queries on current search engines does not yield satisfactory results. In this study, we developed a deep learning-based method for sentence queries, called DeepSenSe, using citation data available in full-text articles obtained from PubMed Central (PMC). A large amount of labeled data was generated from millions of matched citing sentences and cited articles, making it possible to train quality predictive models using modern deep learning techniques. A two-stage approach was designed: in the first stage we used a modified BM25 algorithm to obtain the top 1000 relevant articles; the second stage involved re-ranking the relevant articles using DeepSenSe. We tested our method using a large number of sentences extracted from real scientific articles in PMC. Our method performed substantially better than PubMed and Google Scholar for sentence queries.
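
The two-stage design described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses standard Okapi BM25 rather than the modified variant the abstract mentions (whose changes are not described), and `overlap_score` is a hypothetical stand-in for the DeepSenSe re-ranker. All class and function names are assumptions.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase whitespace tokenizer (deliberately simplistic)."""
    return text.lower().split()


class BM25:
    """Standard Okapi BM25 over a small in-memory corpus."""

    def __init__(self, docs: list[str], k1: float = 1.5, b: float = 0.75):
        self.docs = [tokenize(d) for d in docs]
        self.k1, self.b = k1, b
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        self.tf = [Counter(d) for d in self.docs]
        df = Counter(t for d in self.docs for t in set(d))
        n = len(self.docs)
        # The "+ 1" inside the log keeps every IDF value positive.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5))
                    for t, f in df.items()}

    def score(self, query_tokens: list[str], i: int) -> float:
        tf, dl = self.tf[i], len(self.docs[i])
        norm = self.k1 * (1 - self.b + self.b * dl / self.avgdl)
        return sum(self.idf[t] * tf[t] * (self.k1 + 1) / (tf[t] + norm)
                   for t in query_tokens if t in tf)

    def top_k(self, query: str, k: int) -> list[int]:
        q = tokenize(query)
        ranked = sorted(range(len(self.docs)),
                        key=lambda i: self.score(q, i), reverse=True)
        return ranked[:k]


def overlap_score(query: str, doc: str) -> float:
    """Jaccard token overlap; placeholder for a learned re-ranker."""
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q | d) if q | d else 0.0


def two_stage_search(query, docs, rerank_score, k=1000):
    """Stage 1: BM25 keeps the top-k candidates.
    Stage 2: candidates are re-ordered by rerank_score."""
    candidates = BM25(docs).top_k(query, k)
    return sorted(candidates,
                  key=lambda i: rerank_score(query, docs[i]), reverse=True)
```

The function returns document indices in final ranked order. The first stage keeps the expensive second-stage scorer from having to look at the whole corpus, which is the usual motivation for this kind of retrieve-then-re-rank pipeline.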


Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6494
Author(s):  
Jeremiah Abimbola ◽  
Daniel Kostrzewa ◽  
Pawel Kasprowski

This paper presents a thorough review of methods used in research articles published in the field of time signature estimation and detection from 2003 to the present. The purpose of this review is to investigate the effectiveness of these methods and how they perform on different types of input signals (audio and MIDI). The reviewed methods are divided into two categories, classical and deep learning techniques, and are summarized in order to make suggestions for future study. More than 110 publications from top journals and conferences written in English were reviewed, and each selected study was examined in full to assess the feasibility of the approach used, the dataset, and the accuracy obtained. The studies analyzed show that, in general, time signature estimation is a difficult task. However, progress in this research area could benefit the broader area of music genre classification using deep learning techniques. Suggestions for improved estimates and future research projects are also discussed.


2022 ◽  
Vol 2022 ◽  
pp. 1-12
Author(s):  
Syeda Fatima Aijaz ◽  
Saad Jawaid Khan ◽  
Fahad Azim ◽  
Choudhary Sobhan Shakeel ◽  
Umer Hassan

Psoriasis is a chronic inflammatory skin disorder mediated by the immune response that affects a large number of people. According to the latest worldwide statistics, 125 million individuals are suffering from psoriasis. Deep learning techniques have demonstrated success in the prediction of skin diseases and can also be used to classify different types of psoriasis. Hence, we propose a deep learning-based application for effective classification of five types of psoriasis, namely plaque, guttate, inverse, pustular, and erythrodermic, as well as the prediction of normal skin. We used 172 images of normal skin from the BFL NTU dataset and 301 images of psoriasis from the Dermnet dataset. The input sample images underwent image preprocessing, including data augmentation, enhancement, and segmentation, which was followed by color, texture, and shape feature extraction. Two deep learning algorithms, a convolutional neural network (CNN) and long short-term memory (LSTM), were applied, with the classification models trained on 80% of the images. The reported accuracies of CNN and LSTM are 84.2% and 72.3%, respectively. A paired sample t-test showed a significant difference between the accuracies of the two deep learning algorithms (p < 0.001). The accuracies reported in this study demonstrate the potential of this deep learning application to be applied to other areas of dermatology for better prediction.
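
The abstract states that 172 normal-skin and 301 psoriasis images were used and that 80% of the images went to training. A minimal sketch of such a split is shown below. Random shuffling with a fixed seed is an assumption (the abstract does not describe how the split was drawn), the image identifiers are placeholders, and `train_test_split` is an illustrative helper, not the authors' code.

```python
import random


def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and cut it at train_fraction."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder identifiers standing in for the image files.
images = ([("normal", i) for i in range(172)]
          + [("psoriasis", i) for i in range(301)])

train, test = train_test_split(images)
print(len(train), len(test))  # prints: 378 95
```

With 473 images in total, an 80% cut leaves 378 for training and 95 for evaluation. A stratified split that preserves class proportions would be a natural refinement for a dataset this small.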


Author(s):  
Liviu Pirvan ◽  
Shamith A. Samarajiwa

Motivation: Pangaea is a scalable and extensible command line interface (CLI) software that integrates gene-relationship detection features to extract context-dependent structured gene-gene and gene-term relationships from the biomedical literature. It provides computational methods to identify biological relationships between a collection of genes and can be used to search and extract different types of contextual relationships amongst genes. Results: We implemented a CLI-based software for downloading PubMed articles and extracting gene relationships from abstracts using natural language processing methods. In terms of scalability, the software was designed to support the retrieval and processing of millions of articles whilst minimising memory requirements and optimising for parallel processing on multiple CPU cores. To allow extensibility, the tool permits the use of contextual custom-made models for the text processing parts, and the output is serialised as JSON objects to allow flexible post-processing workflows. Availability: The software is available online at: https://github.com/ss-lab-cancerunit/pangaea
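
To make the idea of serialising extracted relationships as JSON objects concrete, here is a small sketch. It is not Pangaea's code or API: it assumes a naive rule in which two genes are related when they appear in the same sentence, whereas the abstract describes natural language processing models. The gene list, the record fields and both function names are hypothetical.

```python
import json
import re
from itertools import combinations


def extract_gene_pairs(pmid: str, abstract: str, genes: set[str]) -> list[dict]:
    """Emit one record per pair of known genes that share a sentence.
    Sentence co-occurrence is an assumed, simplified criterion."""
    records = []
    for sentence in re.split(r"(?<=[.!?])\s+", abstract):
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
        found = sorted(g for g in genes if g in tokens)
        for gene_a, gene_b in combinations(found, 2):
            records.append({"pmid": pmid, "gene_a": gene_a,
                            "gene_b": gene_b, "sentence": sentence})
    return records


def to_json_lines(records: list[dict]) -> str:
    """Serialise records as one JSON object per line."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)
```

One JSON object per line keeps memory use flat when results are streamed to disk, and downstream tools can filter or aggregate the records without re-running the extraction.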


Face recognition plays a vital role in security applications. In recent years, researchers have focused on pose, illumination, face recognition, and related problems. Traditional face recognition methods rely on OpenCV's Fisherfaces, which analyze facial expressions and attributes. The deep learning method used in the proposed system is a convolutional neural network (CNN). The proposed work includes the following modules: (1) face detection, (2) gender recognition, and (3) age prediction. The results obtained from this work show that real-time age and gender detection using a CNN provides better accuracy than other existing approaches.

