The Practice of Implementing ML Service into an Internet Business Application

Practical Application ◽

Learning Techniques ◽

Business Application ◽

Internet Business

Many articles describe the theoretical aspects of development in the field of machine learning. Experience of applying that theory in real systems is described much less often. Authors typically report the efficiency, accuracy, and other performance metrics of the resulting solution, but the work stops at the prototype stage. Yet the behavior of a trained model in real conditions, rather than on test data, can differ greatly from the indicators obtained at the development stage. This article describes the experience of implementing and using in production a classification service based on machine learning techniques.

Prediction of Confusion Attempting Algebra Homework in an Intelligent Tutoring System through Machine Learning Techniques for Educational Sustainable Development

Sustainability ◽

10.3390/su11010105 ◽

2018 ◽

Vol 11 (1) ◽

pp. 105 ◽

Cited By ~ 10

Author(s):

Syed Abidi ◽

Mushtaq Hussain ◽

Yonglin Xu ◽

Wu Zhang

Keyword(s):

Machine Learning ◽

Sustainable Development ◽

Teaching And Learning ◽

Performance Metrics ◽

Intelligent Tutoring ◽

Intelligent Tutoring System ◽

Vital Role ◽

Tutoring System ◽

Incorporating substantial, sustainable development issues into teaching and learning is the ultimate task of Education for Sustainable Development (ESD). The purpose of our study was to identify the confused students who had failed to master the skill(s) given by the tutors as homework using the Intelligent Tutoring System (ITS). We focused on ASSISTments, an ITS, in this study and scrutinized the skill-builder data using machine learning techniques and methods. We used seven candidate models: Naïve Bayes (NB), Generalized Linear Model (GLM), Logistic Regression (LR), Deep Learning (DL), Decision Tree (DT), Random Forest (RF), and Gradient Boosted Trees (XGBoost). We trained, validated, and tested the learning algorithms, performed stratified cross-validation, and measured the performance of the models through various performance metrics, i.e., ROC (Receiver Operating Characteristic), Accuracy, Precision, Recall, F-Measure, Sensitivity, and Specificity. We found that RF, GLM, XGBoost, and DL were high-accuracy classifiers. However, other considerations, such as detecting unexplored features that might be related to the forecasting of outputs, can also boost the accuracy of the prediction model. Through machine learning methods, we identified the group of students that were confused when attempting the homework exercise, to help foster their knowledge and talent to play a vital role in environmental development.
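The stratified cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `stratified_folds`, the fold count, and the toy labels are all assumptions made for illustration. The idea is only that each fold receives roughly the same share of every class, so a minority class such as "confused" is represented in every validation split.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds so each fold keeps similar class ratios.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal this class's indices round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Toy labels (assumed): 1 = confused, 0 = not confused.
labels = [1] * 10 + [0] * 40
folds = stratified_folds(labels, k=5)
for number, fold in enumerate(folds):
    positives = sum(labels[i] for i in fold)
    print(f"fold {number}: {len(fold)} samples, {positives} labelled confused")
```

Each fold in turn would serve as the validation set while the remaining folds are used for training; in practice a library implementation such as scikit-learn's `StratifiedKFold` would normally be preferred over a hand-written helper.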

A Comprehensive Review and meta-analysis on Applications of Machine Learning Techniques in Intrusion Detection

Australasian Journal of Information Systems ◽

10.3127/ajis.v22i0.1667 ◽

2018 ◽

Vol 22 ◽

Cited By ~ 1

Author(s):

Manojit Chattopadhyay ◽

Rinku Sen ◽

Sumeet Gupta

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Performance Metrics ◽

Meta Analysis ◽

Cyber Attacks ◽

Business Organizations ◽

Learning Techniques ◽

Applications Of Machine Learning ◽

Network Technologies

Securing machines against various cyber-attacks has been a serious concern for researchers, statutory bodies such as governments, business organizations, and users of both wired and wireless media. During the last decade, the amount of data handled by devices, particularly servers, has increased exponentially, and hence the security of these devices has become a matter of utmost concern. This paper examines the challenges in applying machine learning techniques to intrusion detection. We review the inherent issues in defining and applying machine learning techniques to intrusion detection. We also attempt to identify the best technological solution for changing usage patterns by comparing different machine learning techniques on different datasets and summarizing their performance using various performance metrics. This paper highlights the research challenges and future trends of intrusion detection in dynamic scenarios across diverse network technologies.

Advancing Skill Development for Business Managers in Industry 4.0 - Advances in Logistics, Operations, and Management Science ◽

Artificial Intelligence in Practice

10.4018/978-1-7998-2036-9.ch005 ◽

2020 ◽

pp. 98-123

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Data Driven ◽

Practical Application ◽

Learning Techniques ◽

The Cost

In this chapter, the authors discuss machine learning techniques and artificial intelligence applications and their role in business, and present a practical application. They try to highlight how important machine learning can be in data-driven organisations, where the cost and/or the advantages of implementing such tools are far greater than those of having a human, or a team of humans, do the same work.

Deep forest model for diagnosing COVID-19 from routine blood tests

Scientific Reports ◽

10.1038/s41598-021-95957-w ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Maryam AlJame ◽

Ayyub Imtiaz ◽

Imtiaz Ahmad ◽

Ameer Mohammed

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Performance Metrics ◽

Laboratory Data ◽

Layer By Layer ◽

Infected People ◽

Global Pandemic ◽

Learning Techniques ◽

Deep Forest

The Coronavirus Disease 2019 (COVID-19) global pandemic has threatened the lives of people worldwide and posed considerable challenges. Early and accurate screening of infected people is vital for combating the disease. To help with the limited quantity of swab tests, we propose a machine learning prediction model to accurately diagnose COVID-19 from clinical and/or routine laboratory data. The model exploits a new ensemble-based method called the deep forest (DF), where multiple classifiers in multiple layers are used to encourage diversity and improve performance. The cascade level employs the layer-by-layer processing and is constructed from three different classifiers: extra trees, XGBoost, and LightGBM. The prediction model was trained and evaluated on two publicly available datasets. Experimental results show that the proposed DF model has an accuracy of 99.5%, sensitivity of 95.28%, and specificity of 99.96%. These performance metrics are comparable to other well-established machine learning techniques, and hence DF model can serve as a fast screening tool for COVID-19 patients at places where testing is scarce.
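For readers unfamiliar with the three metrics reported in this abstract, the sketch below shows how accuracy, sensitivity, and specificity follow from the four confusion-matrix counts. The counts are invented for illustration and are not the paper's data; the function name `screening_metrics` is likewise an assumption. The example also hints at why accuracy and sensitivity can diverge when negative cases greatly outnumber positive ones.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, and specificity from confusion counts.

    Assumes at least one actual positive and one actual negative,
    so neither denominator is zero.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),    # share of actual positives detected
        "specificity": tn / (tn + fp),    # share of actual negatives cleared
    }


# Hypothetical counts for an imbalanced screening set (not from the paper):
# 100 actual positives, 900 actual negatives.
metrics = screening_metrics(tp=90, fp=5, tn=895, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because negatives dominate these assumed counts, accuracy stays close to specificity even though one in ten positive cases is missed, which is why screening studies usually report sensitivity separately.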

Cardiac Disease Prediction using Supervised Machine Learning Techniques.

Journal of Physics Conference Series ◽

10.1088/1742-6596/2161/1/012013 ◽

2022 ◽

Vol 2161 (1) ◽

pp. 012013

Author(s):

Chiradeep Gupta ◽

Athina Saha ◽

N V Subba Reddy ◽

U Dinesh Acharya

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Cardiac Disease ◽

Performance Metrics ◽

Confusion Matrix ◽

Supervised Machine Learning ◽

Support Vector ◽

Ensemble Techniques ◽

Diagnosis of cardiac disease needs to be accurate, precise, and reliable. The number of deaths due to cardiac attacks is increasing exponentially day by day. Thus, practical approaches for earlier diagnosis of cardiac or heart disease are pursued to achieve prompt management of the disease. Various supervised machine learning techniques, namely K-Nearest Neighbour, Decision Tree, Logistic Regression, Naïve Bayes, and Support Vector Machine (SVM), are used for predicting cardiac disease using a dataset collected from the repository of the University of California, Irvine (UCI). The results show that Logistic Regression was better than all other supervised classifiers in terms of the performance metrics. The model is also less risky, since its confusion matrix shows fewer false negatives than those of the other models. In addition, ensemble techniques can be used to improve the accuracy of the classifier. Jupyter Notebook is well suited to the implementation, since Python offers many libraries for accurate and precise work.
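The comparison of false negatives across confusion matrices described in this abstract can be sketched as follows. The labels, predictions, and model names are made up for illustration and do not come from the UCI dataset or the paper; `confusion_counts` is a hypothetical helper. The point is that two classifiers with the same accuracy can differ in how many diseased patients they miss.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            counts["tp" if pred == positive else "fn"] += 1
        else:
            counts["fp" if pred == positive else "tn"] += 1
    return counts


# Toy ground truth and predictions (assumed): 1 = disease present.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = {
    "model_a": [1, 1, 1, 0, 0, 0, 1, 0],
    "model_b": [1, 0, 0, 1, 0, 0, 0, 0],
}

# Both toy models classify 6 of 8 cases correctly, but they miss
# different numbers of diseased patients.
false_negatives = {
    name: confusion_counts(y_true, y_pred)["fn"]
    for name, y_pred in predictions.items()
}
fewest_fn = min(false_negatives, key=false_negatives.get)
print(false_negatives, "-> fewest false negatives:", fewest_fn)
```

In a diagnostic setting a missed case is usually costlier than a false alarm, so ranking models by false negatives, as the abstract does, is a reasonable complement to accuracy.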

Heart Attack Prediction by using Machine Learning Techniques

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d9439.018520 ◽

2020 ◽

Vol 8 (5) ◽

pp. 1577-1580

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Comparative Analysis ◽

Missing Data ◽

Heart Attack ◽

Performance Metrics ◽

Disease Prediction ◽

Light Weight ◽

Heart disease is very common nowadays and is a serious problem. Machine learning provides a good way to predict heart disease. The aim of this paper is to develop a simple, lightweight approach for detecting heart disease using machine learning techniques. Different machine learning techniques are applied, and their results are compared using various performance metrics. The study performs a comparative analysis of heart disease detection using a publicly available dataset from the UCI Machine Learning Repository. Several datasets are available, such as the Switzerland, Hungarian, and Cleveland datasets. The Cleveland dataset, which has 303 patient records with 14 attributes, is used for this study and for testing. The dataset is preprocessed by removing all noisy and missing data, and the preprocessed dataset is then used for analysis. Six different machine learning techniques were compared on various performance metrics. The analysis shows that, of the six techniques, SVM gives the best result, at 89.34%. A GUI was developed for the prediction of heart disease.

Prediction of Confusion Attempting Algebra Homework in an Intelligent Tutoring System through Machine Learning Techniques for Educational Sustainable Development

10.20944/preprints201811.0460.v1 ◽

2018 ◽

Cited By ~ 1

Author(s):

Syed Muhammad Raza Abidi ◽

Mushtaq Hussain ◽

Yonglin Xu ◽

Wu Zhang

Keyword(s):

Machine Learning ◽

Sustainable Development ◽

Teaching And Learning ◽

Performance Metrics ◽

Intelligent Tutoring ◽

Intelligent Tutoring System ◽

Vital Role ◽

Tutoring System ◽

Incorporating substantial sustainable development issues into teaching and learning is the ultimate task of Education for Sustainable Development (ESD). The purpose of our study is to identify the confused students who have failed to master the skill(s) given by the tutors as homework using an Intelligent Tutoring System (ITS). We focused on ASSISTments, an ITS, in this study and scrutinized the skill-builder data using machine learning techniques and methods. We used seven candidate models: Naïve Bayes (NB), Generalized Linear Model (GLM), Logistic Regression (LR), Deep Learning (DL), Decision Tree (DT), Random Forest (RF), and Gradient Boosted Trees (XGBoost). We trained, validated, and tested the learning algorithms, performed stratified cross-validation, and measured the performance of the models through various performance metrics, i.e., ROC (Receiver Operating Characteristic), Accuracy, Precision, Recall, F-Measure, Sensitivity, and Specificity. We found that GLM, DT, and RF were high-accuracy classifiers. However, other considerations, such as detecting unexplored features that might be related to the forecasting of outputs, can also boost the accuracy of the prediction model. Through machine learning methods, we identified the group of students who were confused when attempting the homework exercise, which can help students foster their knowledge and talent to play a vital role in environmental development.

Exploration of the Evolution of Airport Ground Delay Programs

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/0361198118782272 ◽

2018 ◽

Vol 2672 (23) ◽

pp. 71-81 ◽

Cited By ~ 2

Author(s):

Kexin (May) Ren ◽

Amy M. Kim ◽

Kenneth Kuhn

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Capacity Utilization ◽

Weather Conditions ◽

Lower Efficiency ◽

Weather Forecasts ◽

Learning Techniques ◽

Predictability Score ◽

Ground Delay

This study introduces a novel method of merging disparate but complementary datasets and applying machine learning techniques to ground delay program (GDP) data. More specifically, it aims to characterize GDPs with respect to changing weather forecasts, GDP plan parameters, and operational performance. The analysis aims to gain insights into GDP usage patterns (implementation and revisions), with respect to these key dimensions. It also aims to gain insights into how GDP cancelations and revisions correlate with operational efficiency and predictability. The results could be used to help traffic managers and air carriers understand complex patterns in the evolution of GDPs, so that they might, for example, better anticipate or even plan a response to a change in weather conditions. The focus is on GDPs at Newark Liberty International Airport (EWR), from 2010 through 2014. A master dataset was generated by merging several datasets on GDPs, weather forecasts, and individual flight information. Several scenarios of GDP evolution were then identified by reducing the dimensionality of the master GDP dataset, then applying cluster analysis on the lower dimensional data. It was found that GDPs at EWR can be categorized into 10 types based on weather forecasts, realized weather, GDP scope, arrival rates, and duration. The characteristics of these 10 GDP clusters were further explored by examining the relationships between GDP scenarios and their performance. It was found that GDPs under stable, low-severity weather and with large scope may score higher on the efficiency metric than expected. When GDPs called in the same weather conditions have high program rates, medium durations, and narrow scopes, capacity utilization was higher than expected—less affected flights lead to fewer cancelations and more arrivals (albeit delayed), and therefore, higher capacity utilization. 
Results also suggest that program rates are set more conservatively than needed for some poor weather conditions that end earlier than expected. GDPs with fewer revisions were associated with a higher predictability score but lower efficiency score. These findings can provide greater insights and knowledge about GDPs for future planning purposes. More specifically, the findings could, for example, be used to support discussion around, or even future guidance regarding, how to set and adjust GDP program rates. In future work additional data could be utilized to provide a more comprehensive operational picture of GDPs, and a wider range of performance metrics could be considered. It is also recommended that the patterns of how GDPs evolve over their lifetimes be further explored using other machine learning techniques that may provide new and useful insights.

Deep Forest Model for Diagnosing COVID-19 From Routine Blood Tests

10.21203/rs.3.rs-567774/v1 ◽

2021 ◽

Author(s):

Maryam AlJame ◽

Ayyub Imtiaz ◽

Imtiaz Ahmad ◽

Ameer Mohammed

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Performance Metrics ◽

Laboratory Data ◽

Layer By Layer ◽

Infected People ◽

Global Pandemic ◽

Learning Techniques ◽

Deep Forest

The Coronavirus Disease 2019 (COVID-19) global pandemic has threatened the lives of people worldwide and poses considerable challenges. Early and accurate screening of infected people is vital for combating the disease. To help with the limited quantity of swab tests, we propose a machine learning prediction model to accurately diagnose COVID-19 from clinical and/or routine laboratory data. The model exploits a new ensemble-based method called the deep forest (DF), where multiple classifiers in multiple layers are used to encourage diversity and improve performance. The cascade level employs the layer-by-layer processing and is constructed from three different classifiers: extra trees, XGBoost, and LightGBM. The prediction model was trained and evaluated on two publicly available datasets. Experimental results show that the proposed DF model has an accuracy of 99.5%, sensitivity of 95.28%, and specificity of 99.96%. These performance metrics are comparable to other well-established machine learning techniques, and hence DF model can serve as a fast screening tool for COVID-19 patients at places where testing is scarce.

Machine learning predictivity applied to consumer creditworthiness

Future Business Journal ◽

10.1186/s43093-020-00041-w ◽

2020 ◽

Vol 6 (1) ◽

Author(s):

Maisa Cardoso Aniceto ◽

Flavio Barboza ◽

Herbert Kimura

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Credit Risk ◽

Performance Metrics ◽

Prediction Models ◽

Support Vector ◽

Learning Techniques ◽

Default Prediction

Credit risk evaluation plays an important role for financial institutions, since lending may result in real and immediate losses. In particular, default prediction is one of the most challenging activities in managing credit risk. This study analyzes the adequacy of borrower classification models using a Brazilian bank's loan database and machine learning techniques. We develop Support Vector Machine, Decision Tree, Bagging, AdaBoost, and Random Forest models, and compare their predictive accuracy with a benchmark based on a Logistic Regression model. Comparisons are based on the usual classification performance metrics. Our results show that Random Forest and AdaBoost perform better than the other models. Moreover, Support Vector Machine models show poor performance using both linear and nonlinear kernels. Our findings suggest that there are value-creating opportunities for banks to improve default prediction models by exploring machine learning techniques.