SIENA: Semi-automatic semantic enhancement of datasets using concept recognition

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Andreea Grigoriu ◽  
Amrapali Zaveri ◽  
Gerhard Weiss ◽  
Michel Dumontier

Abstract. Background: The amount of available data that can help answer scientific research questions is growing. However, the number of formats in which data are published is growing as well, creating a serious challenge when multiple datasets need to be integrated to answer a question. Results: This paper presents a semi-automated framework that provides semantic enhancement of biomedical data, specifically gene datasets. The framework involves a concept recognition task using machine learning, in combination with the BioPortal annotator. Compared with methods that rely only on the BioPortal annotator for semantic enhancement, the proposed framework achieves the best results. Conclusions: By combining concept recognition, machine learning techniques, and annotation with a biomedical ontology, the proposed framework can help datasets reach their full potential of providing meaningful information that can answer scientific research questions.

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Talal S. Qaid ◽  
Hussein Mazaar ◽  
Mohammad Yahya H. Al-Shamri ◽  
Mohammed S. Alqahtani ◽  
Abeer A. Raweh ◽  
...  

The COVID-19 pandemic has had a significant impact on public life and health worldwide, putting the world’s healthcare systems at risk. The first step in stopping this outbreak is to detect the infection in its early stages, which will relieve the risk, control the outbreak’s spread, and restore full functionality to the world’s healthcare systems. Currently, PCR is the most prevalent diagnosis tool for COVID-19. However, chest X-ray images may play an essential role in detecting this disease, as they are successful for many other viral pneumonia diseases. Unfortunately, there are common features between COVID-19 and other viral pneumonia, and hence manual differentiation between them seems to be a critical problem and needs the aid of artificial intelligence. This research employs deep- and transfer-learning techniques to develop accurate, general, and robust models for detecting COVID-19. The developed models utilize either convolutional neural networks or transfer-learning models or hybridize them with powerful machine-learning techniques to exploit their full potential. For experimentation, we applied the proposed models to two data sets: the COVID-19 Radiography Database from Kaggle and a local data set from Asir Hospital, Abha, Saudi Arabia. The proposed models achieved promising results in detecting COVID-19 cases and discriminating them from normal and other viral pneumonia with excellent accuracy. The hybrid models extracted features from the flatten layer or the first hidden layer of the neural network and then fed these features into a classification algorithm. This approach enhanced the results further to full accuracy for binary COVID-19 classification and 97.8% for multiclass classification.
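The hybrid idea in the last two sentences (take the activations at the flatten layer, then hand them to a separate classifier) can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration: the hand-written edge kernels stand in for a trained network, the nearest-centroid classifier stands in for the unspecified "classification algorithm", and the 3x3 toy images and their labels are invented. It is not the authors' implementation.

```python
def conv2d_valid(image, kernel):
    """Valid (no padding) 2-D cross-correlation on list-of-lists inputs."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def flatten_features(image, kernels):
    """Stand-in for a network up to its flatten layer: conv -> ReLU -> flatten."""
    features = []
    for kernel in kernels:
        for row in conv2d_valid(image, kernel):
            features.extend(max(0.0, value) for value in row)
    return features


class NearestCentroid:
    """Hypothetical downstream classifier fed with the extracted features."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids_ = {
            label: [v / counts[label] for v in total]
            for label, total in sums.items()
        }
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [
            min(self.centroids_, key=lambda lab: sq_dist(x, self.centroids_[lab]))
            for x in X
        ]


# Fixed, untrained kernels (assumption): vertical-edge and horizontal-edge detectors.
KERNELS = [[[-1, 1], [-1, 1]], [[-1, -1], [1, 1]]]

# Invented toy images; a real pipeline would use X-ray images and a trained CNN.
TRAIN_IMAGES = [
    [[0, 1, 1], [0, 1, 1], [0, 1, 1]],
    [[0, 0, 1], [0, 0, 1], [0, 0, 1]],
    [[0, 0, 0], [1, 1, 1], [1, 1, 1]],
    [[0, 0, 0], [0, 0, 0], [1, 1, 1]],
]
TRAIN_LABELS = ["vertical", "vertical", "horizontal", "horizontal"]

# Step 1: extract flatten-layer features. Step 2: fit the separate classifier.
train_features = [flatten_features(img, KERNELS) for img in TRAIN_IMAGES]
classifier = NearestCentroid().fit(train_features, TRAIN_LABELS)
```

To classify a new image, call `classifier.predict([flatten_features(image, KERNELS)])`. The point of the design is that the feature extractor and the final classifier are decoupled, so the classifier can be swapped without retraining the network.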


2020 ◽  
Author(s):  
Rory Bunker ◽  
Teo Sunsjak

Over the past two decades, Machine Learning (ML) techniques have been increasingly utilized for the purpose of predicting outcomes in sport. In this paper, we provide a review of studies that have used ML for predicting results in team sport, covering studies from 1996 to 2019. We sought to answer five key research questions while extensively surveying papers in this field. This paper offers insights into which ML algorithms have tended to be used in this field, as well as those that are beginning to emerge with successful outcomes. Our research highlights defining characteristics of successful studies and identifies robust strategies for evaluating accuracy results in this application domain. Our study considers accuracies that have been achieved across different sports and explores the notion that outcomes of some team sports could be inherently more difficult to predict than others. Finally, our study uncovers common themes of future research directions across all surveyed papers, looking for gaps and opportunities, while proposing recommendations for future researchers in this domain.


BMJ Open ◽  
2020 ◽  
Vol 10 (11) ◽  
pp. e038832
Author(s):  
Constanza L Andaur Navarro ◽  
Johanna A A G Damen ◽  
Toshihiko Takada ◽  
Steven W J Nijman ◽  
Paula Dhiman ◽  
...  

Introduction: Studies addressing the development and/or validation of diagnostic and prognostic prediction models are abundant in most clinical domains. Systematic reviews have shown that the methodological and reporting quality of prediction model studies is suboptimal. Due to the increasing availability of larger, routinely collected and complex medical data, and the rising application of Artificial Intelligence (AI) or machine learning (ML) techniques, the number of prediction model studies is expected to increase even further. Prediction models developed using AI or ML techniques are often labelled as a ‘black box’ and little is known about their methodological and reporting quality. Therefore, this comprehensive systematic review aims to evaluate the reporting quality, the methodological conduct, and the risk of bias of prediction model studies that applied ML techniques for model development and/or validation. Methods and analysis: A search will be performed in PubMed to identify studies developing and/or validating prediction models using any ML methodology and across all medical fields. Studies will be included if they were published between January 2018 and December 2019, predict patient-related outcomes, use any study design or data source, and are available in English. Screening of search results and data extraction from included articles will be performed by two independent reviewers. The primary outcomes of this systematic review are: (1) the adherence of ML-based prediction model studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD), and (2) the risk of bias in such studies as assessed using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). A narrative synthesis will be conducted for all included studies. Findings will be stratified by study type, medical field and prevalent ML methods, and will inform necessary extensions or updates of TRIPOD and PROBAST to better address prediction model studies that used AI or ML techniques. Ethics and dissemination: Ethical approval is not required for this study because only available published data will be analysed. Findings will be disseminated through peer-reviewed publications and scientific conferences. Systematic review registration: PROSPERO, CRD42019161764.


2021 ◽  
Vol 22 (6) ◽  
pp. 2903
Author(s):  
Noam Auslander ◽  
Ayal B. Gussow ◽  
Eugene V. Koonin

The exponential growth of biomedical data in recent years has spurred the application of numerous machine learning techniques to address emerging problems in biology and clinical research. By enabling automatic feature extraction, feature selection, and the generation of predictive models, these methods can be used to efficiently study complex biological systems. Machine learning techniques are frequently integrated with bioinformatic methods, as well as curated databases and biological networks, to enhance training and validation, identify the best interpretable features, and enable feature and model investigation. Here, we review recently developed methods that incorporate machine learning within the same framework as techniques from molecular evolution, protein structure analysis, systems biology, and disease genomics. We outline the challenges posed for machine learning and, in particular, deep learning in biomedicine, and suggest unique opportunities for machine learning techniques integrated with established bioinformatics approaches to overcome some of these challenges.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Khushnood Abbas ◽  
Alireza Abbasi ◽  
Shi Dong ◽  
Ling Niu ◽  
Laihang Yu ◽  
...  

Abstract. Background: Technological and research advances have produced large volumes of biomedical data. When represented as a network (graph), these data become useful for modeling entities and interactions in biological and similar complex systems. In the fields of network biology and network medicine, there is particular interest in predicting results from drug–drug, drug–disease, and protein–protein interactions to speed up drug discovery. Existing data and modern computational methods make it possible to identify potentially beneficial and harmful interactions and therefore to narrow drug trials ahead of actual clinical trials. Such automated data-driven investigation relies on machine learning techniques. However, traditional machine learning approaches require extensive preprocessing of the data, which makes them impractical for large datasets. This study presents a wide range of machine learning methods for predicting outcomes from biomedical interactions and compares the performance of traditional methods with that of more recent network-based approaches. Results: We applied 32 different network-based machine learning models to five commonly available biomedical datasets and evaluated their performance using three evaluation metrics: AUROC, AUPR, and F1-score. We did this by casting the link prediction problem as a binary classification problem: existing links are treated as positive examples, and negative examples are randomly sampled from the set of non-existent links. After experimental evaluation, we found that Prone, ACT and $LRW_5$ are the top three performers on all five datasets. Conclusions: This work presents a comparative evaluation of well-known network-based machine learning algorithms for predicting network links, with applications in the prediction of drug-target and drug–drug interactions. Our work is helpful in guiding researchers in the appropriate selection of machine learning methods for pharmaceutical tasks.
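The reduction of link prediction to binary classification described in the Results can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the one-negative-per-positive ratio, and the fixed random seed are all hypothetical choices, and the AUROC helper is a plain pairwise implementation of the metric rather than any particular library's.

```python
import itertools
import random


def sample_negative_pairs(nodes, edges, k, seed=0):
    """Randomly sample k unlinked node pairs (the 'non-existent' set)."""
    existing = {frozenset(edge) for edge in edges}
    candidates = [
        pair for pair in itertools.combinations(sorted(nodes), 2)
        if frozenset(pair) not in existing
    ]
    # random.Random.sample raises ValueError if k exceeds the candidate count.
    return random.Random(seed).sample(candidates, k)


def build_binary_dataset(nodes, edges, seed=0):
    """Existing links become label 1; sampled non-links become label 0."""
    positives = [tuple(edge) for edge in edges]
    # Assumption: one negative per positive, giving a balanced dataset.
    negatives = sample_negative_pairs(nodes, edges, len(positives), seed)
    pairs = positives + negatives
    labels = [1] * len(positives) + [0] * len(negatives)
    return pairs, labels


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Requires at least one example of each class.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Any model that assigns a score to a node pair can then be evaluated by scoring every pair in the dataset and passing the scores and labels to `auroc`. In a real evaluation, some of the positive links would be held out of the graph the model is trained on, so that it is not scored on links it has already seen.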


2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Abdul Wahab Muzaffar ◽  
Farooque Azam ◽  
Usman Qamar

Information extraction from unstructured text segments is a complex task. Although manual information extraction often produces the best results, it is hard to manage for biomedical data because of the exponential increase in data size. Thus, there is a need for automatic tools and techniques for information extraction in biomedical text mining. Relation extraction is a significant area of biomedical information extraction that has gained much importance in the last two decades. A lot of work has been done on biomedical relation extraction, focusing on rule-based and machine learning techniques. In the last decade, the focus has shifted to hybrid approaches, which show better results. This research presents a hybrid feature set for the classification of relations between biomedical entities. The main contribution of this research lies in the semantic feature set, where verb phrases are ranked using the Unified Medical Language System (UMLS) and a ranking algorithm. Support Vector Machine and Naïve Bayes, two effective machine learning techniques, are used to classify these relations. Our approach has been validated on the standard biomedical text corpus obtained from MEDLINE 2001. In conclusion, our framework outperforms all state-of-the-art approaches used for relation extraction on the same corpus.


Author(s):  
Cephas Alves da Silveira Barreto ◽  
João C. Xavier-Júnior ◽  
Anne M. P. Canuto ◽  
Ivanovitch M. D. Da Silva

The potential for processing car sensing data has increased in recent years due to the development of new technologies. Having this type of data is important, for instance, for analyzing the way drivers behave behind the steering wheel. Many studies have addressed driver behavior by developing smartphone-based telematics systems. However, very little has been done to analyze car usage patterns based on car engine sensor data, and its full potential, which would require considering all sensors within a car engine, has therefore not been explored. Aiming to bridge this gap, this paper proposes the use of Machine Learning techniques (supervised and unsupervised) on automotive engine sensor data to discover drivers’ usage patterns, and to perform classification through a distributed online sensing platform. We believe that such a platform can be useful in different domains, such as fleet management, the insurance market, fuel consumption optimization, and CO2 emission reduction, among others.

