Dynamic Analytics and Forecasting Model for Covid-19 Using Machine Learning Algorithms

Webology ◽  
2021 ◽  
Vol 18 (05) ◽  
pp. 1212-1225
Author(s):  
Siva C ◽  
Maheshwari K.G ◽  
Nalinipriya G ◽  
Priscilla Mary J

The availability of correctly labelled data and the handling of categorical data are widely acknowledged as two main challenges in dynamic analysis. Clustering techniques are therefore applied to unlabelled data to group records according to their homogeneity. Many prediction methods are in common use for forecasting problems in real-time environments. The outbreak of coronavirus disease 2019 (COVID-19) created a medical emergency of worldwide concern, with a high risk of rapid spread across the entire world. Recently, ML prediction models have been used in many real-time applications that require identification and categorization. In the medical field, prediction models play a vital role in understanding the spread and consequences of infectious diseases. Machine learning based forecasting mechanisms have shown their importance in improving decision making on the upcoming course of action. The K-means and hierarchical clustering algorithms were applied directly to the updated dataset using the R programming language to create clusters of COVID patients. The counts of confirmed COVID patients were passed to the Prophet package to build a Prophet model. This forecasting model predicts future COVID counts, which is essential for clinical and healthcare leaders to take appropriate measures in advance. The experimental results indicate that hierarchical clustering produces better-quality clusters than K-means on the structured dataset. The model's predictions also help officials to take timely actions and make decisions to contain the COVID-19 crisis. This work concludes that hierarchical clustering is the best model for clustering the COVID dataset obtained from the World Health Organization (WHO).
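As a rough illustration of the K-means step mentioned above, the following is a minimal sketch of Lloyd's algorithm in plain Python. The abstract states that the authors worked in R; this sketch is not their code. The data points, the starting centroids, and the function name `kmeans` are assumptions made for the example.

```python
from math import dist

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm; `centroids` is the caller-chosen starting guess."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical (cases, deaths) pairs in thousands; NOT taken from the WHO dataset.
points = [(1.0, 0.1), (1.2, 0.2), (0.9, 0.1), (9.0, 1.1), (9.5, 1.0), (8.8, 0.9)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
```

Hierarchical clustering would instead merge the closest groups step by step, so comparing the two requires a shared quality measure; the abstract does not say which one was used.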

Road crashes are among the most common causes of accidents and deaths worldwide, and the main reasons for these accidents are usually drunkenness, drowsiness and reckless behaviour of the driver. According to the World Health Organization, road traffic injuries have risen to 1.25 billion worldwide, which makes driver drowsiness detection a major potential area for averting numerous sleep-induced road accidents. This project proposes an idea to detect drowsiness using machine learning algorithms and to alert the driver in real time to prevent a collision. The model uses the Haar Cascade algorithm, along with the OpenCV library, to monitor the real-time video of the driver and to detect the driver's eyes. The system uses the Eye Aspect Ratio (EAR) concept to determine whether the eyes are open or closed. We also feed in a data set file consisting of facial feature data points to train the machine learning algorithm. The model inspects each frame of the video, which helps to recognize the state of the driver. Furthermore, a Raspberry Pi single-board computer, combined with a camera module and an alarm system, allows the project to emulate a compact drowsiness detection system suitable for different automobiles.
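The Eye Aspect Ratio idea can be sketched as follows. The abstract does not give the formula, so this example assumes the commonly used six-landmark definition, EAR = (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|), where p1 and p4 are the eye corners. The threshold value and the landmark coordinates are hypothetical and would need tuning on real video.

```python
from math import dist

EAR_THRESHOLD = 0.2  # assumed cut-off, not a value from the paper

def eye_aspect_ratio(eye):
    """eye: six (x, y) landmarks ordered p1..p6 around the eye contour."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = dist(p2, p6) + dist(p3, p5)   # two eyelid openings
    horizontal = dist(p1, p4)                # corner-to-corner width
    return vertical / (2.0 * horizontal)

def is_closed(eye, threshold=EAR_THRESHOLD):
    """A low ratio means the eyelids are close together."""
    return eye_aspect_ratio(eye) < threshold

# Hypothetical landmark sets for illustration only.
open_eye = [(0, 0), (1, 1), (3, 1), (4, 0), (3, -1), (1, -1)]
closed_eye = [(0, 0), (1, 0.1), (3, 0.1), (4, 0), (3, -0.1), (1, -0.1)]
print(eye_aspect_ratio(open_eye), is_closed(closed_eye))
```

In a real-time system the check would normally have to hold over several consecutive frames before an alarm fires, so that ordinary blinks are ignored.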


2020 ◽  
Author(s):  
Xiao Lai ◽  
Pu Tian

Abstract Supervised machine learning, especially deep learning based on a wide variety of neural network architectures, has contributed tremendously to fields such as marketing, computer vision and natural language processing. However, the development of unsupervised machine learning algorithms has been a bottleneck of artificial intelligence. Clustering is a fundamental unsupervised task in many different subjects. Unfortunately, no present algorithm is satisfactory for clustering high-dimensional data with strong nonlinear correlations. In this work, we propose a simple and highly efficient hierarchical clustering algorithm based on encoding by composition rank vectors and tree structure, and demonstrate its utility by clustering protein structural domains. No record comparison, which is an expensive and essential step common to all present clustering algorithms, is involved. Consequently, it achieves hierarchical clustering with linear time and space complexity and is thus applicable to arbitrarily large datasets. The key factor in this algorithm is the definition of composition, which depends upon the physical nature of the target data and therefore needs to be constructed case by case. Nonetheless, the algorithm is general and applicable to any high-dimensional data with strong nonlinear correlations. We hope this algorithm will inspire a rich research field of encoding-based clustering well beyond composition rank vector trees.


Symmetry ◽  
2020 ◽  
Vol 12 (4) ◽  
pp. 581
Author(s):  
Guadalupe Obdulia Gutiérrez-Esparza ◽  
Oscar Infante Vázquez ◽  
Maite Vallejo ◽  
José Hernández-Torruco

Metabolic syndrome is a health condition that increases the risk of heart disease, diabetes, and stroke. The prognostic variables that identify this syndrome have already been defined by the World Health Organization (WHO), the National Cholesterol Education Program Third Adult Treatment Panel (ATP III), and the International Diabetes Federation. According to these guides, there is some symmetry among anthropometric prognostic variables used to classify abdominal obesity in people with metabolic syndrome. However, some appear to be more sensitive than others, and the proposed definitions have failed to appropriately classify a specific population or ethnic group. In this work, we used the ATP III criteria as the framework to rank the health parameters (clinical and anthropometric measurements, lifestyle data, and blood tests) from a data set of 2942 participants of the Mexico City Tlalpan 2020 cohort by applying machine learning algorithms. We aimed to find the most appropriate prognostic variables to classify Mexicans with metabolic syndrome. Sensitivity, specificity, and balanced accuracy were used as validation criteria. The ATP III using the Waist-to-Height Ratio (WHtR) as an anthropometric index for the diagnosis of abdominal obesity achieved better classification performance than waist circumference or body mass index. Further work is needed to assess its precision as a classification tool for metabolic syndrome in a Mexican population.
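The three validation criteria named above follow directly from a confusion matrix. The sketch below uses the standard definitions; the counts passed in and the function names are made up for illustration and do not come from the Tlalpan 2020 cohort.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard definitions computed from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }

def waist_to_height_ratio(waist_cm, height_cm):
    """WHtR is a unitless ratio; both inputs must share the same unit."""
    return waist_cm / height_cm

# Hypothetical counts, not results from the study.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
print(metrics, waist_to_height_ratio(90, 180))
```

Balanced accuracy is the natural summary when the two classes differ in size, because it weights sensitivity and specificity equally regardless of prevalence.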


2011 ◽  
Vol 20 (04) ◽  
pp. 753-781
Author(s):  
KAI CHEN ◽  
KIA MAKKI ◽  
NIKI PISSINOU

In the metropolitan region, most congestion or traffic jams are caused by the uneven distribution of traffic flow that creates bottleneck points where the traffic volume exceeds the road capacity. Additionally, unexpected incidents are the next most probable cause of these bottleneck regions. Moreover, most drivers are driving based on their empirical experience without awareness of real-time traffic situations. This unintelligent traffic behavior can make the congestion problem worse. Prediction based route guidance systems show great improvements in solving the inefficient diversion strategy problem by estimating future travel time when calculating accurate travel time is difficult. However, performances of machine learning based prediction models that are based on the historical data set degrade sharply during a congestion situation. This paper develops a new navigation system for reducing travel time of an individual driver and distributing the flow of urban traffic efficiently in order to reduce the occurrence of congestion. Compared with previous route guidance systems, the results reveal that our system, applying the advanced multi-lane prediction based real-time fastest path (AMPRFP) algorithm, can significantly reduce the travel time especially when drivers travel in a complex route environment and face frequent congestion problems. Unlike the previous system,1 it can be applied either for single lane or multi-lane urban traffic networks where the reason for congestion is significantly complex. We also demonstrate the advantages of this system and verify the results using real highway traffic data and a synthetic experiment.
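The core idea of routing on predicted rather than historical travel times can be sketched with an ordinary shortest-path search. This is plain Dijkstra used as a stand-in; it is not the AMPRFP algorithm, whose details the abstract does not give. The road graph and the predicted minutes are hypothetical.

```python
import heapq

def fastest_path(graph, source, target):
    """Dijkstra's algorithm over edge weights given as predicted travel minutes."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbour, minutes in graph.get(node, {}).items():
            new_cost = cost + minutes
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in best:
        return None, float("inf")
    # Walk the predecessor links back from the target to rebuild the route.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1], best[target]

# Hypothetical predicted travel times (minutes) between junctions.
predicted_minutes = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"D": 5.0},
    "C": {"B": 1.0, "D": 8.0},
    "D": {},
}
route, minutes = fastest_path(predicted_minutes, "A", "D")
print(route, minutes)
```

A real-time system would recompute the route whenever the predicted edge weights change, which is where the quality of the prediction model matters.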


The Bank Marketing data set on Kaggle is mostly used to predict whether bank clients will subscribe to a long-term deposit. We believe that this data set could provide further useful information, such as predicting whether a bank client could be approved for a loan. This is a critical choice that has to be made by decision makers at the bank. Building a prediction model for such a high-stakes decision requires not only high prediction accuracy but also a reasonable interpretation of the predictions. In this research, different ensemble machine learning techniques, such as Bagging and Boosting, have been deployed. Our results showed that the loan approval prediction model has an accuracy of 83.97%, which is approximately 25% better than most other state-of-the-art loan prediction models found in the literature. In addition, the model interpretation efforts in this research were able to explain a few critical cases that the bank's decision makers may encounter; therefore, the high accuracy of the designed models was accompanied by trust in the predictions. We believe that the achieved model accuracy, together with the provided interpretation information, is vitally needed for decision makers to understand how to maintain a balance between the security and reliability of their financial lending system while providing fair credit opportunities to their clients.


2019 ◽  
Author(s):  
Sungjun Hong ◽  
Sungjoo Lee ◽  
Jeonghoon Lee ◽  
Won Chul Cha ◽  
Kyunga Kim

BACKGROUND The development and application of clinical prediction models using machine learning in clinical decision support systems is attracting increasing attention. OBJECTIVE The aims of this study were to develop a prediction model for cardiac arrest in the emergency department (ED) using machine learning and sequential characteristics and to validate its clinical usefulness. METHODS This retrospective study was conducted with ED patients at a tertiary academic hospital who suffered cardiac arrest. To resolve the class imbalance problem, sampling was performed using propensity score matching. The data set was chronologically allocated to a development cohort (years 2013 to 2016) and a validation cohort (year 2017). We trained three machine learning algorithms with repeated 10-fold cross-validation. RESULTS The main performance parameters were the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). The random forest algorithm (AUROC 0.97; AUPRC 0.86) outperformed the recurrent neural network (AUROC 0.95; AUPRC 0.82) and the logistic regression algorithm (AUROC 0.92; AUPRC=0.72). The performance of the model was maintained over time, with the AUROC remaining at least 80% across the monitored time points during the 24 hours before event occurrence. CONCLUSIONS We developed a prediction model of cardiac arrest in the ED using machine learning and sequential characteristics. The model was validated for clinical usefulness by chronological visualization focused on clinical usability.
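For readers unfamiliar with the AUROC metric reported above, the following is a minimal sketch of its pairwise interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The labels and scores are invented for illustration and are unrelated to the study data.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) form of the area under the ROC curve."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical risk scores; 1 = cardiac arrest, 0 = no event.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(auroc(labels, scores))
```

This quadratic-time form is only meant to show the definition; production code would usually sort the scores once or rely on an established library.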


Machine learning is a branch of artificial intelligence that provides algorithms that can learn from data and improve from experience without human intervention. Nowadays, many machine learning algorithms play a vital role in data analytics. Such algorithms can be applied to the recent COVID pandemic across the globe. Machine learning algorithms are classified into three groups based on the type of learning process: supervised learning, unsupervised learning, and reinforcement learning. Considering the medical observations on COVID across the globe, it was decided to carry out the analysis under the supervised learning process. The data set is acquired from a reliable source, processed, and fed into the classification algorithms. Because learning is carried out with both the input data and the expected output data known, the data is labeled and is classified based on those labels. In the proposed work, three different algorithms are applied to the COVID-19 dataset and compared for their efficiency, and an algorithm is selected on that basis.


2020 ◽  
Vol 9 (3) ◽  
pp. 164-172
Author(s):  
Changsheng Jiang ◽  
Piaopiao Zhao ◽  
Weihua Li ◽  
Yun Tang ◽  
Guixia Liu

Abstract Neurotoxicity is one of the main causes of drug withdrawal, and the biological experimental methods of detecting neurotoxicity are time-consuming and laborious. In addition, the existing computational prediction models of neurotoxicity still have some shortcomings. In response to these shortcomings, we collected a large neurotoxicity data set and used PyBioMed molecular descriptors and eight machine learning algorithms to construct regression prediction models of chemical neurotoxicity. Through cross-validation and test set validation of the models, we found that the extra-trees regressor model had the best predictive performance on neurotoxicity ($q_{\mathrm{test}}^2$ = 0.784). In addition, we obtained the applicability domain of the models by calculating the standard deviation distance and the leverage distance of the training set. We also found that some molecular descriptors are closely related to neurotoxicity by calculating the contribution of the molecular descriptors to the models. Considering the accuracy of the regression models, we recommend using the extra-trees regressor model to predict chemical autonomic neurotoxicity.
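A test-set coefficient such as the one reported above is typically computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that form, with the total sum of squares taken around the mean of the observed test values; some QSAR conventions use the training-set mean instead, and the abstract does not say which was used. The numbers are invented.

```python
def q_squared(observed, predicted):
    """1 - SS_res / SS_tot, using the mean of `observed` as the baseline."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical toxicity values and model predictions, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(q_squared(observed, predicted))
```

A value of 1 means perfect prediction, while 0 means the model does no better than always predicting the mean.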


2020 ◽  
Author(s):  
Renato Cordeiro de Amorim

In a real-world data set there is always the possibility, rather high in our opinion, that different features may have different degrees of relevance. Most machine learning algorithms deal with this fact by either selecting or deselecting features in the data preprocessing phase. However, we maintain that even among relevant features there may be different degrees of relevance, and this should be taken into account during the clustering process. With over 50 years of history, K-Means is arguably the most popular partitional clustering algorithm there is. The first K-Means based clustering algorithm to compute feature weights was designed just over 30 years ago. Various such algorithms have been designed since, but there has not been, to our knowledge, a survey integrating empirical evidence of cluster recovery ability, common flaws, and possible directions for future research. This paper elaborates on the concept of feature weighting and addresses these issues by critically analysing some of the most popular, or innovative, feature weighting mechanisms based on K-Means.
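To make the notion of feature weighting concrete, here is a minimal sketch of a weighted assignment step. It assumes one common form of the weighted distance, in which each squared feature difference is multiplied by the feature weight raised to an exponent `beta`; the surveyed algorithms differ in how weights are defined and updated, and none of them is reproduced here. The points, centroids, and weights are hypothetical.

```python
def weighted_sq_distance(x, centroid, weights, beta=2.0):
    """Squared Euclidean distance with each feature scaled by weight ** beta."""
    return sum(
        (w ** beta) * (a - b) ** 2 for a, b, w in zip(x, centroid, weights)
    )

def assign(points, centroids, weights, beta=2.0):
    """Assignment step of K-Means using the weighted distance above."""
    return [
        min(
            range(len(centroids)),
            key=lambda k: weighted_sq_distance(p, centroids[k], weights, beta),
        )
        for p in points
    ]

# Hypothetical data: the second feature is treated as noise in the second call.
centroids = [(0.0, 0.0), (4.0, 9.0)]
point = (1.0, 9.0)
print(assign([point], centroids, weights=(0.5, 0.5)))  # both features count
print(assign([point], centroids, weights=(1.0, 0.0)))  # second feature ignored
```

With equal weights the point follows the second feature to the far centroid; once that feature's weight drops to zero, the same point is assigned to the centroid that is nearer on the first feature.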


10.2196/15932 ◽  
2020 ◽  
Vol 8 (8) ◽  
pp. e15932
Author(s):  
Sungjun Hong ◽  
Sungjoo Lee ◽  
Jeonghoon Lee ◽  
Won Chul Cha ◽  
Kyunga Kim

Background The development and application of clinical prediction models using machine learning in clinical decision support systems is attracting increasing attention. Objective The aims of this study were to develop a prediction model for cardiac arrest in the emergency department (ED) using machine learning and sequential characteristics and to validate its clinical usefulness. Methods This retrospective study was conducted with ED patients at a tertiary academic hospital who suffered cardiac arrest. To resolve the class imbalance problem, sampling was performed using propensity score matching. The data set was chronologically allocated to a development cohort (years 2013 to 2016) and a validation cohort (year 2017). We trained three machine learning algorithms with repeated 10-fold cross-validation. Results The main performance parameters were the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). The random forest algorithm (AUROC 0.97; AUPRC 0.86) outperformed the recurrent neural network (AUROC 0.95; AUPRC 0.82) and the logistic regression algorithm (AUROC 0.92; AUPRC=0.72). The performance of the model was maintained over time, with the AUROC remaining at least 80% across the monitored time points during the 24 hours before event occurrence. Conclusions We developed a prediction model of cardiac arrest in the ED using machine learning and sequential characteristics. The model was validated for clinical usefulness by chronological visualization focused on clinical usability.
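The chronological allocation described in the Methods can be sketched as a split on the visit year, so that the validation cohort is strictly later than the development cohort. The record layout, the field name `year`, and the sample records are assumptions for this example; only the year boundaries (2013 to 2016 for development, 2017 for validation) come from the abstract.

```python
def chronological_split(records, last_development_year):
    """Split records by year so that no validation record predates development."""
    development = [r for r in records if r["year"] <= last_development_year]
    validation = [r for r in records if r["year"] > last_development_year]
    return development, validation

# Hypothetical ED visit records; `arrest` marks the outcome of interest.
records = [
    {"id": 1, "year": 2013, "arrest": 0},
    {"id": 2, "year": 2015, "arrest": 1},
    {"id": 3, "year": 2016, "arrest": 0},
    {"id": 4, "year": 2017, "arrest": 1},
    {"id": 5, "year": 2017, "arrest": 0},
]
development, validation = chronological_split(records, last_development_year=2016)
print(len(development), len(validation))
```

Splitting by time rather than at random gives a stricter check, because the model is evaluated on patients seen after every patient it was trained on.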

