The Practice of Implementing ML Service into an Internet Business Application

2021 ◽  
Vol 24 ◽  
pp. 8-14
Author(s):  
Pavels Osipovs

Many articles describe the theoretical aspects of development in the field of machine learning, but far fewer describe the experience of applying it in real systems. Authors typically report the efficiency, accuracy, and other performance metrics of the resulting solution, and the work stops at the prototype stage. Yet the behavior of a trained model in real conditions can differ greatly from the indicators obtained on test data at the development stage. This article describes the experience of implementing and operating a classification service based on machine learning techniques.

2018 ◽  
Vol 11 (1) ◽  
pp. 105 ◽  
Author(s):  
Syed Abidi ◽  
Mushtaq Hussain ◽  
Yonglin Xu ◽  
Wu Zhang

Incorporating substantial, sustainable development issues into teaching and learning is the ultimate task of Education for Sustainable Development (ESD). The purpose of our study was to identify the confused students who had failed to master the skill(s) given by the tutors as homework using the Intelligent Tutoring System (ITS). We focused on ASSISTments, an ITS, in this study and scrutinized the skill-builder data using machine learning techniques and methods. We used seven candidate models: Naïve Bayes (NB), Generalized Linear Model (GLM), Logistic Regression (LR), Deep Learning (DL), Decision Tree (DT), Random Forest (RF), and Gradient Boosted Trees (XGBoost). We trained, validated, and tested learning algorithms, performed stratified cross-validation, and measured the performance of the models through various performance metrics, i.e., ROC (Receiver Operating Characteristic), Accuracy, Precision, Recall, F-Measure, Sensitivity, and Specificity. We found RF, GLM, XGBoost, and DL were high accuracy-achieving classifiers. However, other perceptions such as detecting unexplored features that might be related to the forecasting of outputs can also boost the accuracy of the prediction model. Through machine learning methods, we identified the group of students that were confused when attempting the homework exercise, to help foster their knowledge and talent to play a vital role in environmental development.
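
The stratified cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `stratified_folds`, the toy labels, and the choice of three folds are assumptions made only for illustration. The idea is that each fold keeps roughly the same class ratio as the full dataset.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each sample index to one of k folds with similar class ratios.

    Hypothetical helper for illustration; not taken from the cited study.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Toy data (assumed): 6 "confused" and 12 "not_confused" students.
labels = ["confused"] * 6 + ["not_confused"] * 12
folds = stratified_folds(labels, k=3)

for number, fold in enumerate(folds):
    confused = sum(labels[i] == "confused" for i in fold)
    print(f"fold {number}: {confused} confused, {len(fold) - confused} not confused")
```

Each fold serves once as the test set while the remaining folds are used for training, so every sample is evaluated exactly once.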


Author(s):  
Manojit Chattopadhyay ◽  
Rinku Sen ◽  
Sumeet Gupta

Securing a machine from various cyber-attacks has been of serious concern for researchers, statutory bodies such as governments, business organizations and users in both wired and wireless media. However, during the last decade, the amount of data handling by any device, particularly servers, has increased exponentially and hence the security of these devices has become a matter of utmost concern. This paper attempts to examine the challenges in the application of machine learning techniques to intrusion detection. We review different inherent issues in defining and applying the machine learning techniques to intrusion detection. We also attempt to identify the best technological solution for changing usage pattern by comparing different machine learning techniques on different datasets and summarizing their performance using various performance metrics. This paper highlights the research challenges and future trends of intrusion detection in dynamic scenarios of intrusion detection problems in diverse network technologies.


In this chapter, the authors discuss machine learning techniques and artificial intelligence applications, their role in business, and present a practical application of it. They try to highlight how important machine learning can be in data-driven organisations, where the cost and/or the advantages to implement such tools are far greater than having a human—or a team of humans—doing it.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Maryam AlJame ◽  
Ayyub Imtiaz ◽  
Imtiaz Ahmad ◽  
Ameer Mohammed

Abstract The Coronavirus Disease 2019 (COVID-19) global pandemic has threatened the lives of people worldwide and posed considerable challenges. Early and accurate screening of infected people is vital for combating the disease. To help with the limited quantity of swab tests, we propose a machine learning prediction model to accurately diagnose COVID-19 from clinical and/or routine laboratory data. The model exploits a new ensemble-based method called the deep forest (DF), where multiple classifiers in multiple layers are used to encourage diversity and improve performance. The cascade level employs the layer-by-layer processing and is constructed from three different classifiers: extra trees, XGBoost, and LightGBM. The prediction model was trained and evaluated on two publicly available datasets. Experimental results show that the proposed DF model has an accuracy of 99.5%, sensitivity of 95.28%, and specificity of 99.96%. These performance metrics are comparable to other well-established machine learning techniques, and hence DF model can serve as a fast screening tool for COVID-19 patients at places where testing is scarce.


2022 ◽  
Vol 2161 (1) ◽  
pp. 012013
Author(s):  
Chiradeep Gupta ◽  
Athina Saha ◽  
N V Subba Reddy ◽  
U Dinesh Acharya

Abstract Diagnosis of cardiac disease needs to be accurate, precise, and reliable. The number of deaths due to cardiac attacks is increasing exponentially day by day. Thus, practical approaches for earlier diagnosis of cardiac or heart disease are pursued to achieve prompt management of the disease. Various supervised machine learning techniques, such as K-Nearest Neighbour, Decision Tree, Logistic Regression, Naïve Bayes, and Support Vector Machine (SVM), are used to predict cardiac disease using a dataset collected from the repository of the University of California, Irvine (UCI). The results show that Logistic Regression performed better than all other supervised classifiers in terms of the performance metrics. The model is also less risky, since its number of false negatives is low compared to the other models according to their confusion matrices. In addition, ensemble techniques can be applied to improve the accuracy of the classifier. Jupyter Notebook is the best tool for implementing Python programs, offering many types of libraries and header files for accurate and precise work.
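
To make the role of false negatives in a confusion matrix concrete, here is a minimal sketch in plain Python. The function names and the toy labels are assumptions for illustration only; they are not the study's data or code.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1  # A diseased patient the model missed.
        else:
            tn += 1
    return tp, fp, tn, fn


def binary_metrics(tp, fp, tn, fn):
    """Derive common performance metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # Also called recall.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Toy labels (assumed): 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)
print(binary_metrics(*counts))
```

In a screening setting, a low false-negative count corresponds to high sensitivity, which is why two models with similar accuracy can still differ in risk.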


2020 ◽  
Vol 8 (5) ◽  
pp. 1577-1580

Heart disease is very common nowadays and is a very serious problem. Machine learning provides an effective way of predicting heart disease. The aim of this paper is to develop a simple, lightweight approach for detecting heart disease using machine learning techniques. Different machine learning techniques are used, and their results are compared using various performance metrics. The study performs a comparative analysis of heart disease detection using a publicly available dataset collected from the UCI Machine Learning Repository. Several datasets are available, such as the Switzerland, Hungarian, and Cleveland datasets. The Cleveland dataset, which has 303 patient records with 14 attributes, is used for this study and testing. The dataset is preprocessed by removing all noisy and missing data, and the preprocessed dataset is then used for analysis. Six different machine learning techniques are compared on various performance metrics. The analysis shows that, of the six techniques, SVM gives the best result with 89.34%. A GUI is developed for the prediction of heart disease.


Author(s):  
Syed Muhammad Raza Abidi ◽  
Mushtaq Hussain ◽  
Yonglin Xu ◽  
Wu Zhang

Incorporating substantial sustainable development issues into teaching and learning is the ultimate task of Education for Sustainable Development (ESD). The purpose of our study is to identify the confused students who have failed to master the skill(s) given by the tutors as homework using an Intelligent Tutoring System (ITS). We focused on ASSISTments, an ITS, in this study and scrutinized the skill-builder data using machine learning techniques and methods. We used seven candidate models: Naïve Bayes (NB), Generalized Linear Model (GLM), Logistic Regression (LR), Deep Learning (DL), Decision Tree (DT), Random Forest (RF), and Gradient Boosted Trees (XGBoost). We trained, validated, and tested learning algorithms, performed stratified cross-validation, and measured the performance of the models through various performance metrics, i.e., ROC (Receiver Operating Characteristic), Accuracy, Precision, Recall, F-Measure, Sensitivity, and Specificity. We found that GLM, DT, and RF are high-accuracy classifiers. However, other perceptions such as detection of unexplored features that might be related to the forecasting of outputs can also boost the accuracy of the prediction model. Through machine learning methods, we identified the group of students who were confused when attempting the homework exercise, which can help students foster their knowledge and talent to play a vital role in environmental development.


Author(s):  
Kexin (May) Ren ◽  
Amy M. Kim ◽  
Kenneth Kuhn

This study introduces a novel method of merging disparate but complementary datasets and applying machine learning techniques to ground delay program (GDP) data. More specifically, it aims to characterize GDPs with respect to changing weather forecasts, GDP plan parameters, and operational performance. The analysis aims to gain insights into GDP usage patterns (implementation and revisions), with respect to these key dimensions. It also aims to gain insights into how GDP cancelations and revisions correlate with operational efficiency and predictability. The results could be used to help traffic managers and air carriers understand complex patterns in the evolution of GDPs, so that they might, for example, better anticipate or even plan a response to a change in weather conditions. The focus is on GDPs at Newark Liberty International Airport (EWR), from 2010 through 2014. A master dataset was generated by merging several datasets on GDPs, weather forecasts, and individual flight information. Several scenarios of GDP evolution were then identified by reducing the dimensionality of the master GDP dataset, then applying cluster analysis on the lower dimensional data. It was found that GDPs at EWR can be categorized into 10 types based on weather forecasts, realized weather, GDP scope, arrival rates, and duration. The characteristics of these 10 GDP clusters were further explored by examining the relationships between GDP scenarios and their performance. It was found that GDPs under stable, low-severity weather and with large scope may score higher on the efficiency metric than expected. When GDPs called in the same weather conditions have high program rates, medium durations, and narrow scopes, capacity utilization was higher than expected—less affected flights lead to fewer cancelations and more arrivals (albeit delayed), and therefore, higher capacity utilization. 
Results also suggest that program rates are set more conservatively than needed for some poor weather conditions that end earlier than expected. GDPs with fewer revisions were associated with a higher predictability score but lower efficiency score. These findings can provide greater insights and knowledge about GDPs for future planning purposes. More specifically, the findings could, for example, be used to support discussion around, or even future guidance regarding, how to set and adjust GDP program rates. In future work additional data could be utilized to provide a more comprehensive operational picture of GDPs, and a wider range of performance metrics could be considered. It is also recommended that the patterns of how GDPs evolve over their lifetimes be further explored using other machine learning techniques that may provide new and useful insights.


2021 ◽  
Author(s):  
Maryam AlJame ◽  
Ayyub Imtiaz ◽  
Imtiaz Ahmad ◽  
Ameer Mohammed

Abstract The Coronavirus Disease 2019 (COVID-19) global pandemic has threatened the lives of people worldwide and poses considerable challenges. Early and accurate screening of infected people is vital for combating the disease. To help with the limited quantity of swab tests, we propose a machine learning prediction model to accurately diagnose COVID-19 from clinical and/or routine laboratory data. The model exploits a new ensemble-based method called the deep forest (DF), where multiple classifiers in multiple layers are used to encourage diversity and improve performance. The cascade level employs the layer-by-layer processing and is constructed from three different classifiers: extra trees, XGBoost, and LightGBM. The prediction model was trained and evaluated on two publicly available datasets. Experimental results show that the proposed DF model has an accuracy of 99.5%, sensitivity of 95.28%, and specificity of 99.96%. These performance metrics are comparable to other well-established machine learning techniques, and hence DF model can serve as a fast screening tool for COVID-19 patients at places where testing is scarce.


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Maisa Cardoso Aniceto ◽  
Flavio Barboza ◽  
Herbert Kimura

Abstract Credit risk evaluation has a relevant role to financial institutions, since lending may result in real and immediate losses. In particular, default prediction is one of the most challenging activities for managing credit risk. This study analyzes the adequacy of borrower’s classification models using a Brazilian bank’s loan database, and exploring machine learning techniques. We develop Support Vector Machine, Decision Trees, Bagging, AdaBoost and Random Forest models, and compare their predictive accuracy with a benchmark based on a Logistic Regression model. Comparisons are analyzed based on usual classification performance metrics. Our results show that Random Forest and Adaboost perform better when compared to other models. Moreover, Support Vector Machine models show poor performance using both linear and nonlinear kernels. Our findings suggest that there are value creating opportunities for banks to improve default prediction models by exploring machine learning techniques.

