BiDI: Using Machine Learning to collect and facilitate remote access to biomedical databases (Preprint)

BACKGROUND Currently, existing biomedical literature repositories do not commonly provide users with specific means to locate and remotely access biomedical databases. OBJECTIVE To address this issue, we developed BiDI (Biomedical Database Inventory), a repository linking to biomedical databases automatically extracted from the scientific literature. BiDI provides an index of data resources and a path to access them in a seamless manner. METHODS We designed an ensemble of deep learning methods to extract database mentions. To train the system, we annotated a set of 1,242 articles that included mentions of database publications. This dataset was used along with transfer learning techniques to train an ensemble of deep learning NLP models for the task of database publication detection. RESULTS The system obtained an F1 score of 0.929 on database detection, showing high precision and recall values. Applying this model to the PubMed and PubMed Central databases, we identified over 10,000 unique databases. The ensemble also extracts the web links to the reported databases, discarding the irrelevant links. For the extraction of web links, the model achieved a cross-validated F1 score of 0.908. We show two use cases, related to “omics” and the COVID-19 pandemic. CONCLUSIONS BiDI enables access to biomedical resources over the Internet and facilitates data-driven research and other scientific initiatives. The repository is available at http://gib.fi.upm.es/bidi/ and will be regularly updated with an automatic text processing pipeline. The approach can be reused to create repositories of different types (biomedical and others).


Using Machine Learning to Collect and Facilitate Remote Access to Biomedical Databases: Development of the Biomedical Database Inventory

JMIR Medical Informatics ◽

10.2196/22976 ◽

2021 ◽

Vol 9 (2) ◽

pp. e22976

Author(s):

Eduardo Rosado ◽

Miguel Garcia-Remesal ◽

Sergio Paraiso-Medina ◽

Alejandro Pazos ◽

Victor Maojo

Keyword(s):

Deep Learning ◽

Language Processing ◽

Text Processing ◽

Biomedical Literature ◽

Remote Access ◽

Data Set ◽

Pubmed Central ◽

Learning Techniques ◽

Automatic Text ◽

Biomedical Databases

Background Currently, existing biomedical literature repositories do not commonly provide users with specific means to locate and remotely access biomedical databases. Objective To address this issue, we developed the Biomedical Database Inventory (BiDI), a repository linking to biomedical databases automatically extracted from the scientific literature. BiDI provides an index of data resources and a path to access them seamlessly. Methods We designed an ensemble of deep learning methods to extract database mentions. To train the system, we annotated a set of 1242 articles that included mentions of database publications. Such a data set was used along with transfer learning techniques to train an ensemble of deep learning natural language processing models targeted at database publication detection. Results The system obtained an F1 score of 0.929 on database detection, showing high precision and recall values. When applying this model to the PubMed and PubMed Central databases, we identified over 10,000 unique databases. The ensemble model also extracted the weblinks to the reported databases and discarded irrelevant links. For the extraction of weblinks, the model achieved a cross-validated F1 score of 0.908. We show two use cases: one related to “omics” and the other related to the COVID-19 pandemic. Conclusions BiDI enables access to biomedical resources over the internet and facilitates data-driven research and other scientific initiatives. The repository is openly available online and will be regularly updated with an automatic text processing pipeline. The approach can be reused to create repositories of different types (ie, biomedical and others).
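The F1 scores reported above combine precision and recall into a single number. As a minimal sketch of how that metric is computed from raw counts (the function name and the counts below are illustrative assumptions, not values from the paper):

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall, computed from raw counts."""
    if true_positives == 0:
        # No correct detections: precision and recall are both 0 (or undefined).
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts chosen for illustration only; they are not from the paper.
example = f1_score(true_positives=90, false_positives=10, false_negatives=5)
print(round(example, 3))
```

Because F1 is a harmonic mean, it stays high only when precision and recall are both high, which is consistent with the abstract's remark about the two values.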


Developing a More Accurate Biomedical Literature Retrieval Method using Deep Learning and Citations in PubMed Central Full-text Articles

10.1101/2021.10.21.465340 ◽

2021 ◽

Author(s):

Chun-chao Lo ◽

Shubo Tian ◽

Yuchuan Tao ◽

Jie Hao ◽

Jinfeng Zhang

Keyword(s):

Deep Learning ◽

Search Engine ◽

Language Processing ◽

Full Text ◽

Question Answering ◽

Biomedical Literature ◽

Specific Information ◽

Retrieval Method ◽

Pubmed Central ◽

Learning Techniques

Most queries submitted to a literature search engine can be more precisely written as sentences to give the search engine more specific information. Sentence queries should be more effective, in principle, than short queries with small numbers of keywords. Querying with full sentences is also a key step in question-answering and citation recommendation systems. Despite the considerable progress in natural language processing (NLP) in recent years, using sentence queries on current search engines does not yield satisfactory results. In this study, we developed a deep learning-based method for sentence queries, called DeepSenSe, using citation data available in full-text articles obtained from PubMed Central (PMC). A large amount of labeled data was generated from millions of matched citing sentences and cited articles, making it possible to train quality predictive models using modern deep learning techniques. A two-stage approach was designed: in the first stage we used a modified BM25 algorithm to obtain the top 1000 relevant articles; the second stage involved re-ranking the relevant articles using DeepSenSe. We tested our method using a large number of sentences extracted from real scientific articles in PMC. Our method performed substantially better than PubMed and Google Scholar for sentence queries.
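The two-stage design described above (lexical retrieval, then re-ranking) can be sketched as follows. This is an illustration under stated assumptions: it uses plain Okapi BM25 rather than the authors' modified variant, and the `overlap_reranker` function is a toy stand-in for a learned model such as DeepSenSe, whose architecture the abstract does not describe. All function names and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for the sketch)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with plain Okapi BM25.

    Assumes `docs` is non-empty and contains at least one token overall.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def two_stage_search(query, docs, rerank_fn, first_stage_k=1000, final_k=10):
    """Stage 1: keep the top BM25 candidates. Stage 2: re-order them with rerank_fn."""
    scores = bm25_scores(query, docs)
    candidates = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:first_stage_k]
    reranked = sorted(candidates, key=lambda i: rerank_fn(query, docs[i]), reverse=True)
    return reranked[:final_k]


def overlap_reranker(query, doc):
    """Toy stand-in for a learned re-ranker: Jaccard overlap of token sets."""
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q | d) if q | d else 0.0


# Hypothetical mini-corpus for illustration only.
docs = [
    "deep learning improves biomedical literature retrieval",
    "time signature detection in music audio",
    "citation recommendation with sentence queries and deep learning",
]
query = "sentence queries for literature retrieval with deep learning"
top = two_stage_search(query, docs, overlap_reranker, first_stage_k=2, final_k=1)
print(top)  # indices into docs, best match first
```

The first stage is cheap and narrows millions of articles to a manageable candidate set (1000 in the abstract), so the more expensive second-stage model only has to score that shortlist.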


Time Signature Detection: A Survey

Sensors ◽

10.3390/s21196494 ◽

2021 ◽

Vol 21 (19) ◽

pp. 6494

Author(s):

Jeremiah Abimbola ◽

Daniel Kostrzewa ◽

Pawel Kasprowski

Keyword(s):

Deep Learning ◽

Research Area ◽

Future Research ◽

Genre Classification ◽

Signature Detection ◽

Learning Techniques ◽

Input Signals ◽

Different Types ◽

Time Signature ◽

Music Genre Classification

This paper presents a thorough review of methods used in research articles published in the field of time signature estimation and detection from 2003 to the present. The purpose of this review is to investigate the effectiveness of these methods and how they perform on different types of input signals (audio and MIDI). The results have been divided into two categories, classical and deep learning techniques, and are summarized in order to make suggestions for future study. More than 110 publications from top journals and conferences written in English were reviewed, and each selected study was examined in full to assess the feasibility of the approach used, the dataset, and the accuracy obtained. Results of the studies analyzed show that, in general, the process of time signature estimation is a difficult one. However, the success of this research area could be an added advantage in the broader area of music genre classification using deep learning techniques. Suggestions for improved estimates and future research projects are also discussed.


Deep Learning Application for Effective Classification of Different Types of Psoriasis

Journal of Healthcare Engineering ◽

10.1155/2022/7541583 ◽

2022 ◽

Vol 2022 ◽

pp. 1-12

Author(s):

Syeda Fatima Aijaz ◽

Saad Jawaid Khan ◽

Fahad Azim ◽

Choudhary Sobhan Shakeel ◽

Umer Hassan

Keyword(s):

Deep Learning ◽

Data Augmentation ◽

Short Term Memory ◽

Skin Diseases ◽

Normal Skin ◽

Learning Algorithms ◽

Learning Techniques ◽

Different Types ◽

Input Sample

Psoriasis is a chronic inflammatory skin disorder mediated by the immune response that affects a large number of people. According to the latest worldwide statistics, 125 million individuals are suffering from psoriasis. Deep learning techniques have demonstrated success in the prediction of skin diseases and can also lead to the classification of different types of psoriasis. Hence, we propose a deep learning-based application for effective classification of five types of psoriasis, namely plaque, guttate, inverse, pustular, and erythrodermic, as well as the prediction of normal skin. We used 172 images of normal skin from the BFL NTU dataset and 301 images of psoriasis from the Dermnet dataset. The input sample images underwent image preprocessing, including data augmentation, enhancement, and segmentation, which was followed by color, texture, and shape feature extraction. Two deep learning algorithms, a convolutional neural network (CNN) and long short-term memory (LSTM), were applied, with the classification models being trained with 80% of the images. The reported accuracies of CNN and LSTM are 84.2% and 72.3%, respectively. A paired-sample t-test showed a significant difference between the accuracies generated by the two deep learning algorithms (p < 0.001). The accuracies reported from this study demonstrate the potential of this deep learning application to be applied to other areas of dermatology for better prediction.
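The paired-sample t-test mentioned above compares two models evaluated on the same samples or folds. A minimal sketch of the test statistic follows; the function name and the input numbers are hypothetical, and the abstract does not say which paired measurements the authors used. Turning the statistic into a p-value requires the Student t distribution, which this stdlib-only sketch leaves out.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b.

    t = mean(d) / (sd(d) / sqrt(n)), where d holds the pairwise differences.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical paired scores for illustration only; they are not from the study.
t_value, dof = paired_t_statistic([3, 5, 7], [1, 2, 3])
print(t_value, dof)
```

A large absolute t value relative to the degrees of freedom corresponds to a small p-value, which is the form of evidence the abstract reports.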


Pangaea: A modular and extensible collection of tools for mining context dependent gene relationships from the biomedical literature

10.1101/2020.04.02.022517 ◽

2020 ◽

Cited By ~ 1

Author(s):

Liviu Pirvan ◽

Shamith A. Samarajiwa

Keyword(s):

Natural Language Processing ◽

Language Processing ◽

Text Processing ◽

Biomedical Literature ◽

Command Line ◽

Post Processing ◽

Command Line Interface ◽

Custom Made ◽

Different Types ◽

Context Dependent

Motivation: Pangaea is a scalable and extensible command line interface (CLI) software that integrates gene-relationship detection features to extract context-dependent structured gene-gene and gene-term relationships from the biomedical literature. It provides computational methods to identify biological relationships between a collection of genes and can be used to search and extract different types of contextual relationships amongst genes. Results: We implemented a CLI-based software for downloading PubMed articles and extracting gene relationships from abstracts using natural language processing methods. In terms of scalability, the software was designed to support the retrieval and processing of millions of articles whilst minimising memory requirements and optimising for parallel processing on multiple CPU cores. To allow extensibility, the tool permits the use of contextual custom-made models for the text processing parts, and the output is serialised as JSON objects to allow flexible post-processing workflows. Availability: The software is available online at: https://github.com/ss-lab-cancerunit/pangaea


Anomaly Detection and Categorization in Cloud Environment using Deep Learning Techniques

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i5.211214 ◽

2019 ◽

Vol 7 (5) ◽

pp. 211-214

Author(s):

Nidhi Thakkar ◽

Miren Karamta ◽

Seema Joshi ◽

M. B. Potdar

Keyword(s):

Deep Learning ◽

Anomaly Detection ◽

Cloud Environment ◽

Learning Techniques


Deep Learning Techniques for Naskh and Nastalique Writing Style Text Recognition

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i4.7076 ◽

2019 ◽

Vol 7 (4) ◽

pp. 70-76

Author(s):

Shanky Goel ◽

Gurpreet Singh Lehal

Keyword(s):

Deep Learning ◽

Text Recognition ◽

Writing Style ◽

Learning Techniques


A Study on the Types of Classic Fiction Using Deep Learning Techniques - Focusing on hero novels and romantic novels -

Korean Language and Literature in International Context ◽

10.31147/iall.84.1 ◽

2020 ◽

Vol 84 ◽

pp. 9-35

Author(s):

Woo-kyu Kang ◽

Ba-ro Kim

Keyword(s):

Deep Learning ◽

Learning Techniques


Grammatic and semantic normativity of linguistic units and features as a factor of automatic text processing

Proceedings of the 9th conference on Computational linguistics ◽

10.3115/990100.990162 ◽

1982 ◽

Author(s):

Z. M. Shalyapina

Keyword(s):

Text Processing ◽

Semantic Normativity ◽

Linguistic Units ◽

Automatic Text


Real Time Gender and Age Prediction using Deep Learning Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.e2906.049620 ◽

2020 ◽

Vol 9 (6) ◽

pp. 797-801

Keyword(s):

Deep Learning ◽

Face Recognition ◽

Real Time ◽

Vital Role ◽

Learning Techniques ◽

Gender And Age ◽

The Face ◽

And Gender ◽

Gender Detection ◽

Age Prediction

Face recognition plays a vital role in security applications. In recent years, researchers have focused on pose, illumination, face recognition, and related problems. Traditional face recognition methods rely on OpenCV's Fisherfaces, which analyze facial expressions and attributes. The deep learning method used in the proposed system is a convolutional neural network (CNN). The proposed work includes the following modules: (1) face detection, (2) gender recognition, and (3) age prediction. The results obtained from this work show that real-time age and gender detection using a CNN provides better accuracy than other existing approaches.
