Prediction of aircraft estimated time of arrival using machine learning methods

Forecasting residential gas consumption with machine learning algorithms on weather data

E3S Web of Conferences ◽

10.1051/e3sconf/201911105019 ◽

2019 ◽

Vol 111 ◽

pp. 05019

Author(s):

Brian de Keijzer ◽

Pol de Visser ◽

Víctor García Romillo ◽

Víctor Gómez Muñoz ◽

Daan Boesten ◽

...

Keyword(s):

Machine Learning ◽

Energy Consumption ◽

Energy Use ◽

Machine Learning Algorithms ◽

Weather Data ◽

Computational Time ◽

Percentage Error ◽

Learning Models ◽

Gas Consumption ◽

Machine Learning Models

Machine learning models have proven to be reliable methods in the forecasting of energy use in commercial and office buildings. However, little research has been done on energy forecasting in dwellings, mainly due to the difficulty of obtaining household level data while keeping the privacy of inhabitants in mind. Gaining insight into the energy consumption in the near future can be helpful in balancing the grid and insights in how to reduce the energy consumption can be received. In collaboration with OPSCHALER, a measurement campaign on the influence of housing characteristics on energy costs and comfort, several machine learning models were compared on forecasting performance and the computational time needed. Nine months of data containing the mean gas consumption of 52 dwellings on a one hour resolution was used for this research. The first 6 months were used for training, whereas the last 3 months were used to evaluate the models. The results showed that the Deep Neural Network (DNN) performed best with a 50.1 % Mean Absolute Percentage Error (MAPE) on a one hour resolution. When comparing daily and weekly resolutions, the Multivariate Linear Regression (MVLR) outperformed other models, with a 20.1 % and 17.0 % MAPE, respectively. The models were programmed in Python.

Download Full-text

REDIAL-2020: A Suite of Machine Learning Models to Estimate Anti-SARS-CoV-2 Activities

10.26434/chemrxiv.12915779.v2 ◽

2020 ◽

Author(s):

Govinda KC ◽

Giovanni Bocci ◽

Srijan Verma ◽

Mahmudulla Hassan ◽

Jayme Holmes ◽

...

Keyword(s):

Machine Learning ◽

High Throughput Screening ◽

Web Application ◽

External Validation ◽

Model Development ◽

Machine Learning Algorithms ◽

Virus Infectivity ◽

Learning Models ◽

Live Virus ◽

Machine Learning Models

Strategies for drug discovery and repositioning are an urgent need with respect to COVID-19. We developed "REDIAL-2020", a suite of machine learning models for estimating small molecule activity from molecular structure, for a range of SARS-CoV-2 related assays. Each classifier is based on three distinct types of descriptors (fingerprint, physicochemical, and pharmacophore) for parallel model development. These models were trained using high throughput screening data from the NCATS COVID19 portal (https://opendata.ncats.nih.gov/covid19/index.html), with multiple categorical machine learning algorithms. The “best models” are combined in an ensemble consensus predictor that outperforms single models where external validation is available. This suite of machine learning models is available through the DrugCentral web portal (<a href="https://drugdiscovery.utep.edu/redial">http://drugcentral.org/Redial</a>). Acceptable input formats are: drug name, PubChem CID, or SMILES; the output is an estimate of anti-SARS-CoV-2 activities. The web application reports estimated activity across three areas (viral entry, viral replication, and live virus infectivity) spanning six independent models, followed by a similarity search that displays the most similar molecules to the query among experimentally determined data. The ML models have 60% to 74% external predictivity, based on three separate datasets. Complementing the NCATS COVID19 portal, REDIAL-2020 can serve as a rapid online tool for identifying active molecules for COVID-19 treatment. The source code and specific models are available through Github (<a href="https://github.com/sirimullalab/ncats_covid">https://github.com/sirimullalab/</a>redial-2020), or via Docker Hub (https://hub.docker.com/r/sirimullalab/redial-2020) for users preferring a containerized version.

Download Full-text

REDIAL-2020: A Suite of Machine Learning Models to Estimate Anti-SARS-CoV-2 Activities

10.26434/chemrxiv.12915779 ◽

2020 ◽

Author(s):

Govinda KC ◽

Giovanni Bocci ◽

Srijan Verma ◽

Mahmudulla Hassan ◽

Jayme Holmes ◽

...

Keyword(s):

Machine Learning ◽

High Throughput Screening ◽

Web Application ◽

External Validation ◽

Model Development ◽

Machine Learning Algorithms ◽

Virus Infectivity ◽

Learning Models ◽

Live Virus ◽

Machine Learning Models

Strategies for drug discovery and repositioning are an urgent need with respect to COVID-19. We developed "REDIAL-2020", a suite of machine learning models for estimating small molecule activity from molecular structure, for a range of SARS-CoV-2 related assays. Each classifier is based on three distinct types of descriptors (fingerprint, physicochemical, and pharmacophore) for parallel model development. These models were trained using high throughput screening data from the NCATS COVID19 portal (https://opendata.ncats.nih.gov/covid19/index.html), with multiple categorical machine learning algorithms. The “best models” are combined in an ensemble consensus predictor that outperforms single models where external validation is available. This suite of machine learning models is available through the DrugCentral web portal (<a href="https://drugdiscovery.utep.edu/redial">http://drugcentral.org/Redial</a>). Acceptable input formats are: drug name, PubChem CID, or SMILES; the output is an estimate of anti-SARS-CoV-2 activities. The web application reports estimated activity across three areas (viral entry, viral replication, and live virus infectivity) spanning six independent models, followed by a similarity search that displays the most similar molecules to the query among experimentally determined data. The ML models have 60% to 74% external predictivity, based on three separate datasets. Complementing the NCATS COVID19 portal, REDIAL-2020 can serve as a rapid online tool for identifying active molecules for COVID-19 treatment. The source code and specific models are available through Github (<a href="https://github.com/sirimullalab/ncats_covid">https://github.com/sirimullalab/</a>redial-2020), or via Docker Hub (https://hub.docker.com/r/sirimullalab/redial-2020) for users preferring a containerized version.

Download Full-text

REDIAL-2020: A Suite of Machine Learning Models to Estimate Anti-SARS-CoV-2 Activities

10.26434/chemrxiv.12915779.v1 ◽

2020 ◽

Author(s):

Govinda KC ◽

Giovanni Bocci ◽

Srijan Verma ◽

Mahmudulla Hassan ◽

Jayme Holmes ◽

...

Keyword(s):

Machine Learning ◽

High Throughput Screening ◽

Web Application ◽

External Validation ◽

Model Development ◽

Machine Learning Algorithms ◽

Virus Infectivity ◽

Learning Models ◽

Live Virus ◽

Machine Learning Models

Strategies for drug discovery and repositioning are an urgent need with respect to COVID-19. We developed "REDIAL-2020", a suite of machine learning models for estimating small molecule activity from molecular structure, for a range of SARS-CoV-2 related assays. Each classifier is based on three distinct types of descriptors (fingerprint, physicochemical, and pharmacophore) for parallel model development. These models were trained using high throughput screening data from the NCATS COVID19 portal (https://opendata.ncats.nih.gov/covid19/index.html), with multiple categorical machine learning algorithms. The “best models” are combined in an ensemble consensus predictor that outperforms single models where external validation is available. This suite of machine learning models is available through the DrugCentral web portal (<a href="https://drugdiscovery.utep.edu/redial">http://drugcentral.org/Redial</a>). Acceptable input formats are: drug name, PubChem CID, or SMILES; the output is an estimate of anti-SARS-CoV-2 activities. The web application reports estimated activity across three areas (viral entry, viral replication, and live virus infectivity) spanning six independent models, followed by a similarity search that displays the most similar molecules to the query among experimentally determined data. The ML models have 60% to 74% external predictivity, based on three separate datasets. Complementing the NCATS COVID19 portal, REDIAL-2020 can serve as a rapid online tool for identifying active molecules for COVID-19 treatment. The source code and specific models are available through Github (<a href="https://github.com/sirimullalab/ncats_covid">https://github.com/sirimullalab/</a>redial-2020), or via Docker Hub (https://hub.docker.com/r/sirimullalab/redial-2020) for users preferring a containerized version.

Download Full-text

REDIAL-2020: A suite of machine learning models to estimate Anti-SARS-CoV-2 activities

10.21203/rs.3.rs-76894/v1 ◽

2020 ◽

Author(s):

Govinda KC ◽

Giovanni Bocci ◽

Srijan Verma ◽

Md Hassan ◽

Jayme Holmes ◽

...

Keyword(s):

Machine Learning ◽

High Throughput Screening ◽

Web Application ◽

External Validation ◽

Model Development ◽

Machine Learning Algorithms ◽

Virus Infectivity ◽

Learning Models ◽

Live Virus ◽

Machine Learning Models

Abstract Strategies for drug discovery and repositioning are an urgent need with respect to COVID-19. We developed "REDIAL-2020", a suite of machine learning models for estimating small molecule activity from molecular structure, for a range of SARS-CoV-2 related assays. Each classifier is based on three distinct types of descriptors (fingerprint, physicochemical, and pharmacophore) for parallel model development. These models were trained using high throughput screening data from the NCATS COVID19 portal (https://opendata.ncats.nih.gov/covid19/index.html), with multiple categorical machine learning algorithms. The “best models” are combined in an ensemble consensus predictor that outperforms single models where external validation is available. This suite of machine learning models is available through the DrugCentral web portal (http://drugcentral.org/Redial). Acceptable input formats are: drug name, PubChem CID, or SMILES; the output is an estimate of anti-SARS-CoV-2 activities. The web application reports estimated activity across three areas (viral entry, viral replication, and live virus infectivity) spanning six independent models, followed by a similarity search that displays the most similar molecules to the query among experimentally determined data. The ML models have 60% to 74% external predictivity, based on three separate datasets. Complementing the NCATS COVID19 portal, REDIAL-2020 can serve as a rapid online tool for identifying active molecules for COVID-19 treatment. The source code and specific models are available through Github (https://github.com/sirimullalab/redial-2020), or via Docker Hub (https://hub.docker.com/r/sirimullalab/redial-2020) for users preferring a containerized version.

Download Full-text

Characterizing and Evaluating the Zoonotic Potential of Novel Viruses Discovered in Vampire Bats

Viruses ◽

10.3390/v13020252 ◽

2021 ◽

Vol 13 (2) ◽

pp. 252

Author(s):

Laura M. Bergner ◽

Nardus Mollentze ◽

Richard J. Orton ◽

Carlos Tello ◽

Alice Broos ◽

...

Keyword(s):

Machine Learning ◽

Phylogenetic Analyses ◽

Human Infection ◽

Machine Learning Algorithms ◽

Zoonotic Potential ◽

Metagenomic Sequencing ◽

Learning Models ◽

Sequencing Data ◽

Vampire Bats ◽

Machine Learning Models

The contemporary surge in metagenomic sequencing has transformed knowledge of viral diversity in wildlife. However, evaluating which newly discovered viruses pose sufficient risk of infecting humans to merit detailed laboratory characterization and surveillance remains largely speculative. Machine learning algorithms have been developed to address this imbalance by ranking the relative likelihood of human infection based on viral genome sequences, but are not yet routinely applied to viruses at the time of their discovery. Here, we characterized viral genomes detected through metagenomic sequencing of feces and saliva from common vampire bats (Desmodus rotundus) and used these data as a case study in evaluating zoonotic potential using molecular sequencing data. Of 58 detected viral families, including 17 which infect mammals, the only known zoonosis detected was rabies virus; however, additional genomes were detected from the families Hepeviridae, Coronaviridae, Reoviridae, Astroviridae and Picornaviridae, all of which contain human-infecting species. In phylogenetic analyses, novel vampire bat viruses most frequently grouped with other bat viruses that are not currently known to infect humans. In agreement, machine learning models built from only phylogenetic information ranked all novel viruses similarly, yielding little insight into zoonotic potential. In contrast, genome composition-based machine learning models estimated different levels of zoonotic potential, even for closely related viruses, categorizing one out of four detected hepeviruses and two out of three picornaviruses as having high priority for further research. We highlight the value of evaluating zoonotic potential beyond ad hoc consideration of phylogeny and provide surveillance recommendations for novel viruses in a wildlife host which has frequent contact with humans and domestic animals.

Download Full-text

Short-Term Prediction of COVID-19 Cases Using Machine Learning Models

Applied Sciences ◽

10.3390/app11094266 ◽

2021 ◽

Vol 11 (9) ◽

pp. 4266

Author(s):

Md. Shahriare Satu ◽

Koushik Chandra Howlader ◽

Mufti Mahmud ◽

M. Shamim Kaiser ◽

Sheikh Mohammad Shariful Islam ◽

...

Keyword(s):

Machine Learning ◽

Web Application ◽

Learning Models ◽

Short Term ◽

Infected People ◽

Infected Case ◽

First Case ◽

Novel Coronavirus ◽

Short Term Prediction ◽

Machine Learning Models

The first case in Bangladesh of the novel coronavirus disease (COVID-19) was reported on 8 March 2020, with the number of confirmed cases rapidly rising to over 175,000 by July 2020. In the absence of effective treatment, an essential tool of health policy is the modeling and forecasting of the progress of the pandemic. We, therefore, developed a cloud-based machine learning short-term forecasting model for Bangladesh, in which several regression-based machine learning models were applied to infected case data to estimate the number of COVID-19-infected people over the following seven days. This approach can accurately forecast the number of infected cases daily by training the prior 25 days sample data recorded on our web application. The outcomes of these efforts could aid the development and assessment of prevention strategies and identify factors that most affect the spread of COVID-19 infection in Bangladesh.

Download Full-text

Model Comparison for Esp Run-Life Prediction: Classic Statistics Vs. Machine Learning

10.2118/206028-ms ◽

2021 ◽

Author(s):

Alejandro Celemín ◽

Diego A. Estupiñan ◽

Ricardo Nieto

Keyword(s):

Machine Learning ◽

Model Comparison ◽

Proportional Hazards ◽

Proportional Hazards Model ◽

Machine Learning Algorithms ◽

Slight Reduction ◽

Learning Models ◽

Validation Data ◽

Operational Conditions ◽

Machine Learning Models

Abstract Electrical Submersible Pumps reliability and run-life analysis has been extensively studied since its development. Current machine learning algorithms allow to correlate operational conditions to ESP run-life in order to generate predictions for active and new wells. Four machine learning models are compared to a linear proportional hazards model, used as a baseline for comparison purposes. Proper accuracy metrics for survival analysis problems are calculated on run-life predictions vs. actual values over training and validation data subsets. Results demonstrate that the baseline model is able to produce more consistent predictions with a slight reduction in its accuracy, compared to current machine learning models for small datasets. This study demonstrates that the quality of the date and it pre-processing supports the current shift from model-centric to data-centric approach to machine and deep learning problems.

Download Full-text

Machine Learning Models for Finger Bend Evaluation using Implemented Low cost Flex Sensor

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35742 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 3605-3611

Author(s):

Pratyush Kaware

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Low Cost ◽

Learning Algorithms ◽

Cost Effective ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Models ◽

Machine Learning Models

In this paper a cost-effective sensor has been implemented to read finger bend signals, by attaching the sensor to a finger, so as to classify them based on the degree of bent as well as the joint about which the finger was being bent. This was done by testing with various machine learning algorithms to get the most accurate and consistent classifier. Finally, we found that Support Vector Machine was the best algorithm suited to classify our data, using we were able predict live state of a finger, i.e., the degree of bent and the joints involved. The live voltage values from the sensor were transmitted using a NodeMCU micro-controller which were converted to digital and uploaded on a database for analysis.

Download Full-text

Comparison of the Performance of Machine Learning Algorithms in Predicting Heart Disease

Frontiers in Health Informatics ◽

10.30699/fhi.v10i1.349 ◽

2021 ◽

Vol 10 (1) ◽

pp. 99

Author(s):

Sajad Yousefi

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Heart Disease ◽

Decision Tree ◽

Roc Curve ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Learning Models ◽

Algorithm Performance ◽

Machine Learning Models

Introduction: Heart disease is often associated with conditions such as clogged arteries due to the sediment accumulation which causes chest pain and heart attack. Many people die due to the heart disease annually. Most countries have a shortage of cardiovascular specialists and thus, a significant percentage of misdiagnosis occurs. Hence, predicting this disease is a serious issue. Using machine learning models performed on multidimensional dataset, this article aims to find the most efficient and accurate machine learning models for disease prediction.Material and Methods: Several algorithms were utilized to predict heart disease among which Decision Tree, Random Forest and KNN supervised machine learning are highly mentioned. The algorithms are applied to the dataset taken from the UCI repository including 294 samples. The dataset includes heart disease features. To enhance the algorithm performance, these features are analyzed, the feature importance scores and cross validation are considered.Results: The algorithm performance is compared with each other, so that performance based on ROC curve and some criteria such as accuracy, precision, sensitivity and F1 score were evaluated for each model. As a result of evaluation, Accuracy, AUC ROC are 83% and 99% respectively for Decision Tree algorithm. Logistic Regression algorithm with accuracy and AUC ROC are 88% and 91% respectively has better performance than other algorithms. Therefore, these techniques can be useful for physicians to predict heart disease patients and prescribe them correctly.Conclusion: Machine learning technique can be used in medicine for analyzing the related data collections to a disease and its prediction. The area under the ROC curve and evaluating criteria related to a number of classifying algorithms of machine learning to evaluate heart disease and indeed, the prediction of heart disease is compared to determine the most appropriate classification. As a result of evaluation, better performance was observed in both Decision Tree and Logistic Regression models.

Download Full-text