A Survey on Cardiovascular Prediction Using Variant Machine Learning

2021 ◽  
Vol 309 ◽  
pp. 01042
Author(s):  
L. Chandrika ◽  
K. Madhavi ◽  
B. Sindhuja ◽  
M. Arshi

Predicting cardiovascular disease has always been a challenging task for doctors and medical practitioners. Most practitioners and hospitals offer expensive medication, care, and surgery to treat cardiovascular patients. Early prediction of heart-related problems gives patients a better chance of survival by allowing the necessary precautions to be taken. Over the years, different methodologies have been proposed to predict cardiovascular disease, and one of the most effective is the machine learning approach. Recent scientific advances in artificial intelligence, machine learning, and deep learning have given an extra push to medical image processing and medical data analysis. Large datasets collected from medical experts help researchers predict coronary problems before they occur, and many researchers have applied different machine learning algorithms to automate this prediction. Numerous algorithms and procedures are available for predicting cardiovascular disease, notably classification methods including Artificial Neural Networks (ANN), Decision Trees (DT), Support Vector Machines (SVM), Genetic Algorithms (GA), and Naive Bayes (NB), as well as instance-based methods such as K-Nearest Neighbours (K-NN). Several studies have built prediction models using individual techniques or by combining two or more of them. This paper gives a quick and simple survey of the available prediction models reported by different researchers from 2004 to 2019 and summarizes the accuracy of the individual experiments.

Author(s):  
Cheng-Chien Lai ◽  
Wei-Hsin Huang ◽  
Betty Chia-Chen Chang ◽  
Lee-Ching Hwang

Predictors for success in smoking cessation have been studied, but a prediction model capable of providing a success rate for each patient attempting to quit smoking is still lacking. The aim of this study is to develop prediction models using machine learning algorithms to predict the outcome of smoking cessation. Data were acquired from patients who underwent a smoking cessation program at one medical center in Northern Taiwan. A total of 4875 enrollments fulfilled our inclusion criteria. Models with artificial neural network (ANN), support vector machine (SVM), random forest (RF), logistic regression (LoR), k-nearest neighbor (KNN), classification and regression tree (CART), and naïve Bayes (NB) were trained to predict the final smoking status of the patients over a six-month period. Sensitivity, specificity, accuracy, and area under the receiver operating characteristic (ROC) curve (AUC or ROC value) were used to determine the performance of the models. We adopted the ANN model, which reached slightly better performance, with a sensitivity of 0.704, a specificity of 0.567, an accuracy of 0.640, and an ROC value of 0.660 (95% confidence interval (CI): 0.617–0.702) for predicting smoking cessation outcome. A predictive model for smoking cessation was constructed. The model could aid in providing a predicted success rate for all smokers, and it has the potential to support personalized and precision medicine in the treatment of smoking cessation.
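As an illustration of the evaluation protocol described above (not the authors' actual pipeline or data), the following sketch trains a small neural network on synthetic data and reports sensitivity, specificity, accuracy, and AUC; the architecture and split are assumptions.

```python
# Illustrative sketch: ANN classifier evaluated with sensitivity, specificity,
# accuracy, and ROC AUC, as in the study; data and architecture are assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix, accuracy_score, roc_auc_score

X, y = make_classification(n_samples=4875, n_features=20, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=1)

ann = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=1)
ann.fit(X_tr, y_tr)

proba = ann.predict_proba(X_te)[:, 1]
pred = (proba >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()

print("sensitivity:", tp / (tp + fn))
print("specificity:", tn / (tn + fp))
print("accuracy:   ", accuracy_score(y_te, pred))
print("ROC AUC:    ", roc_auc_score(y_te, proba))
```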


Author(s):  
Adwait Patil

Abstract: Alzheimer’s disease (AD) is a neurodegenerative disorder. It initially starts with innocuous symptoms but gradually becomes severe. The disease is particularly dangerous because there is no cure and it is typically detected only at a later stage, so it is important to detect Alzheimer’s early in order to counter the disease and give the patient a chance of recovery. Various approaches are currently used to detect symptoms of Alzheimer’s disease at an early stage. The fuzzy-system approach is not widely used, as it heavily depends on expert knowledge, but it is quite efficient in detecting AD because it provides a mathematical foundation for interpreting human cognitive processes. Another more accurate and widely accepted approach is machine learning detection of AD stages, which uses algorithms such as Support Vector Machines (SVMs), Decision Trees, and Random Forests to detect the stage from the data provided. The final approach is deep learning on multi-modal data, which combines image, genetic, and patient data using deep models and then uses the concatenated data to detect the AD stage more efficiently; this approach is less accessible, as it requires huge volumes of data. This paper elaborates on all three approaches, provides a comparative study of them, and discusses which method is more efficient for AD detection. Keywords: Alzheimer’s Disease (AD), Fuzzy System, Machine Learning, Deep Learning, Multimodal Data
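The multi-modal idea behind the deep learning approach can be sketched as simple feature-level fusion. The block below is a hypothetical illustration, not the paper's architecture: the modality shapes, the toy labels, and the classifier are all assumptions used only to show the concatenation step.

```python
# Hypothetical sketch of feature-level (early) fusion for multi-modal AD staging:
# image-derived, genetic, and clinical feature blocks are concatenated and fed
# to a single classifier. Shapes, labels, and the model choice are illustrative only.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 600
img_feats = rng.normal(size=(n, 64))    # e.g. embeddings extracted from MRI scans
gene_feats = rng.normal(size=(n, 32))   # e.g. encoded genetic markers
clin_feats = rng.normal(size=(n, 8))    # e.g. age, cognitive scores, demographics
stage = rng.integers(0, 3, size=n)      # toy labels: 0 = normal, 1 = MCI, 2 = AD

X = np.hstack([img_feats, gene_feats, clin_feats])   # concatenated modalities
X_tr, X_te, y_tr, y_te = train_test_split(X, stage, test_size=0.25, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=800, random_state=0)
model.fit(X_tr, y_tr)
print(classification_report(y_te, model.predict(X_te)))
```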


2020 ◽  
Vol 38 (15_suppl) ◽  
pp. e17554-e17554
Author(s):  
Ioana Danciu ◽  
Samantha Erwin ◽  
Greeshma Agasthya ◽  
Tate Janet ◽  
Benjamin McMahon ◽  
...  

e17554 Background: The ability to understand and predict, at the time of diagnosis, the trajectories of prostate cancer patients is critical for deciding the appropriate treatment plan. Evidence-based approaches for outcome prediction include predictive machine learning algorithms that harness health record data. Methods: All our analyses used the Veterans Affairs Clinical Data Warehouse (CDW). We included all individuals with a non-metastatic (early stage) prostate cancer diagnosis between 2002 and 2017 as documented in the CDW cancer registry (N = 111351). Our predictors were demographics (age at diagnosis, race), disease staging parameters abstracted at diagnosis (AJCC stage grouping, Gleason score, SEER summary stage), and prostate-specific antigen (PSA) laboratory values in the 5 years prior to diagnosis (last value, the value before last, average, minimum, maximum, rate of change of the last 2 PSAs, and density). The predicted outcome was disease progression at 2 years (N = 3469) and 5 years (N = 6325), defined as metastasis (taking Abiraterone, Sipuleucel-T, Enzalutamide, or Radium 223), registry-recorded cancer-related death, or PSA > 50. We used 4 different machine learning classifiers to train prediction models: random forest, k-nearest neighbor, decision trees, and XGBoost, all with hyperparameter optimization. For testing, we used two approaches: (1) a 20% sample held out at the beginning of the study, and (2) a stratified test/train split on the remaining data. Results: The table below shows the performance of the best classifier, XGBoost. The top five predictors of disease progression were the last PSA, Gleason score, maximum PSA, age at diagnosis, and SEER summary stage. The last PSA had a significantly higher contribution than the other predictors. More than one PSA value is important for prediction, emphasizing the need to investigate the PSA trajectory in the period before diagnosis. The models remain robust overall when moving from the 2-year to the 5-year outcome. Conclusions: A machine learning based XGBoost classifier can be integrated into clinical decision support at diagnosis to robustly predict disease progression at 2 and 5 years. [Table: see text]
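A minimal sketch of the kind of workflow described, not the authors' code and not using VA data: an XGBoost classifier with a held-out test split and a feature-importance ranking. The feature names mirror those listed in the abstract, but the data, labels, and hyperparameters below are synthetic assumptions.

```python
# Illustrative sketch of an XGBoost progression classifier with a 20% held-out
# test set and feature importance ranking; data and hyperparameters are assumed.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier   # assumes the xgboost package is installed

rng = np.random.default_rng(0)
features = ["age_at_dx", "last_psa", "psa_before_last", "psa_max",
            "psa_mean", "gleason", "seer_stage"]
X = pd.DataFrame(rng.normal(size=(5000, len(features))), columns=features)
y = rng.integers(0, 2, size=5000)   # 1 = progression within the time window (toy)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)

clf = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1)
clf.fit(X_tr, y_tr)

print("test AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
ranking = sorted(zip(features, clf.feature_importances_),
                 key=lambda t: t[1], reverse=True)
print("top predictors:", ranking[:5])
```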


Author(s):  
Ahmed T. Shawky ◽  
Ismail M. Hagag

In today’s world, data mining and classification are considered to be among the most important techniques, as the world is full of data generated by various sources. Extracting useful knowledge from this data is the real challenge, and this paper addresses that challenge by using machine learning classification algorithms to draw meaningful results from the data. The aim of this research is to design a model that detects diabetes in patients with high accuracy. The paper therefore applies five different machine learning classification algorithms, namely Decision Tree, Support Vector Machine (SVM), Random Forest, Naive Bayes, and K-Nearest Neighbor (K-NN), with the purpose of predicting diabetes at an early stage. Finally, we compare the performance of these algorithms, concluding that the K-NN algorithm achieves the best accuracy (81.16%), followed by the Naive Bayes algorithm (76.06%).
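The comparison described can be sketched as follows; the dataset here is synthetic (a stand-in for a diabetes dataset, since the abstract does not include the data), and the split and parameter settings are illustrative assumptions.

```python
# Illustrative sketch: the five classifiers compared in the paper, evaluated
# by held-out accuracy on synthetic stand-in data for a diabetes dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=42)

classifiers = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Naive Bayes": GaussianNB(),
    "K-NN": KNeighborsClassifier(n_neighbors=5),
}

for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, clf.predict(X_te))
    print(f"{name}: accuracy = {acc:.4f}")
```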


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Susan Idicula-Thomas ◽  
Ulka Gawde ◽  
Prabhat Jha

Abstract Background Machine learning (ML) algorithms have been successfully employed for prediction of outcomes in clinical research. In this study, we have explored the application of ML-based algorithms to predict cause of death (CoD) from verbal autopsy records available through the Million Death Study (MDS). Methods From the MDS, 18826 unique childhood deaths at ages 1–59 months during the period 2004–13 were selected for generating the prediction models, of which over 70% were caused by six infectious diseases (pneumonia, diarrhoeal diseases, malaria, fever of unknown origin, meningitis/encephalitis, and measles). Six popular ML-based algorithms, namely support vector machine (SVM), gradient boosting modeling, C5.0, artificial neural network, k-nearest neighbor, and classification and regression tree, were used for building the CoD prediction models. Results The SVM algorithm was the best performer, with a prediction accuracy of over 0.8. The highest accuracy was found for diarrhoeal diseases (accuracy = 0.97) and the lowest for meningitis/encephalitis (accuracy = 0.80). The top signs/symptoms for classification of these CoDs were also extracted for each of the diseases. A combination of signs/symptoms presented by the deceased individual can effectively lead to the CoD diagnosis. Conclusions Overall, this study affirms that verbal autopsy tools are efficient in CoD diagnosis and that automated classification parameters captured through ML could be added to verbal autopsies to improve classification of causes of death.
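A hedged sketch of the kind of multi-class CoD classifier described: binary sign/symptom indicators feed an SVM and per-cause metrics are reported. The symptom features, data, and labels below are hypothetical, not actual MDS variables or records.

```python
# Illustrative sketch: multi-class cause-of-death classification from binary
# sign/symptom indicators with an SVM; features and data are hypothetical,
# not actual Million Death Study records.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report

rng = np.random.default_rng(7)
causes = ["pneumonia", "diarrhoeal", "malaria", "fever_unknown",
          "meningitis_encephalitis", "measles"]
n, n_symptoms = 3000, 40                 # e.g. fever, cough, loose stools, rash...
X = rng.integers(0, 2, size=(n, n_symptoms))   # 1 = symptom reported in the record
y = rng.integers(0, len(causes), size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=7)
svm = SVC(kernel="rbf", C=1.0)
svm.fit(X_tr, y_tr)
print(classification_report(y_te, svm.predict(X_te), target_names=causes))
```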


2021 ◽  
Author(s):  
Floe Foxon

Ammonoid identification is crucial to biostratigraphy, systematic palaeontology, and evolutionary biology, but may prove difficult when shell features and sutures are poorly preserved. This necessitates novel approaches to ammonoid taxonomy. This study aimed to taxonomize ammonoids by their conch geometry using supervised and unsupervised machine learning algorithms. Ammonoid measurement data (conch diameter, whorl height, whorl width, and umbilical width) were taken from the Paleobiology Database (PBDB). Eleven species with ≥50 specimens each were identified, providing N=781 unique specimens in total. Naive Bayes, Decision Tree, Random Forest, Gradient Boosting, K-Nearest Neighbours, and Support Vector Machine classifiers were applied to the PBDB data with a 5x5 nested cross-validation approach to obtain unbiased generalization performance estimates across a grid search of algorithm parameters. All supervised classifiers achieved ≥70% accuracy in identifying ammonoid species, with Naive Bayes demonstrating the least over-fitting. The unsupervised clustering algorithms K-Means, DBSCAN, OPTICS, Mean Shift, and Affinity Propagation achieved Normalized Mutual Information scores of ≥0.6, with the centroid-based methods having the most success. This presents a reasonably accurate proof-of-concept approach to ammonoid classification which may assist identification in cases where more traditional methods are not feasible.
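The 5x5 nested cross-validation protocol mentioned above can be sketched as follows; synthetic measurements stand in for the PBDB conch data, and the SVM parameter grid is an assumption, not the study's actual search space.

```python
# Illustrative sketch of 5x5 nested cross-validation: an inner grid search tunes
# hyperparameters, an outer loop estimates unbiased generalization accuracy.
# Synthetic data stands in for the four PBDB conch measurements.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

# 4 features ~ conch diameter, whorl height, whorl width, umbilical width;
# 11 classes ~ the 11 ammonoid species with >=50 specimens each.
X, y = make_classification(n_samples=781, n_features=4, n_informative=4,
                           n_redundant=0, n_classes=11, n_clusters_per_class=1,
                           random_state=0)

inner_cv = KFold(n_splits=5, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)

param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.1, 1]}
search = GridSearchCV(SVC(), param_grid, cv=inner_cv)   # inner loop: tuning
scores = cross_val_score(search, X, y, cv=outer_cv)     # outer loop: evaluation

print("nested CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```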


PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0241239
Author(s):  
Kai On Wong ◽  
Osmar R. Zaïane ◽  
Faith G. Davis ◽  
Yutaka Yasui

Background Canada is an ethnically diverse country, yet the lack of ethnicity information in many of its large databases impedes effective population research and interventions. Automated ethnicity classification using machine learning has shown potential to address this data gap, but its performance in Canada is largely unknown. This study developed a large-scale machine learning framework to predict ethnicity using a novel set of name and census location features. Methods Using the 1901 census, multiclass and binary classification machine learning pipelines were developed. The 13 ethnic categories examined were Aboriginal (First Nations, Métis, Inuit, and all-combined), Chinese, English, French, Irish, Italian, Japanese, Russian, Scottish, and others. Machine learning algorithms included regularized logistic regression, C-support vector, and naïve Bayes classifiers. Name features consisted of the entire name string, substrings, double-metaphones, and various name-entity patterns, while location features consisted of the entire location string and substrings of province, district, and subdistrict. Predictive performance metrics included sensitivity, specificity, positive predictive value, negative predictive value, F1, area under the receiver operating characteristic curve, and accuracy. Results The census contained 4,812,958 unique individuals. For multiclass classification, the highest performance achieved was 76% F1 and 91% accuracy. For the binary classifications of Chinese, French, Italian, Japanese, Russian, and others, F1 ranged from 68% to 95% (median 87%). The lower performance for English, Irish, and Scottish (F1 ranged from 63% to 67%) was likely due to their shared cultural and linguistic heritage. Adding census location features to the name-based models strongly improved prediction for the Aboriginal classification (F1 increased from 50% to 84%). Conclusions The automated machine learning approach using only name and census location features can predict the ethnicity of Canadians, with performance varying by specific ethnic category.
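A simplified sketch of name-based classification: character n-gram features of the full name string feed a regularized logistic regression. The example names and labels are made up, and the feature set is far simpler than the study's (which also used substrings, double-metaphones, name-entity patterns, and location features).

```python
# Illustrative sketch: predicting an ethnicity label from name strings using
# character n-gram features and regularized logistic regression. Names and
# labels below are invented toy examples, not 1901 census records.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

names = ["jean tremblay", "giuseppe rossi", "wong ka ming",
         "seamus o'brien", "hiroshi tanaka", "marie gagnon"]
labels = ["French", "Italian", "Chinese", "Irish", "Japanese", "French"]

model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # character n-grams
    LogisticRegression(C=1.0, max_iter=1000),                 # regularized LoR
)
model.fit(names, labels)
print(model.predict(["paolo bianchi"]))   # toy prediction on an unseen name
```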


Symmetry ◽  
2020 ◽  
Vol 12 (5) ◽  
pp. 728 ◽  
Author(s):  
Lijuan Yan ◽  
Yanshen Liu

Student performance prediction has become a hot research topic. Most of the existing prediction models are built with machine learning methods; they focus on prediction accuracy but pay less attention to interpretability. We propose a stacking ensemble model to predict and analyze student performance in academic competitions. In this model, student performance is classified into two symmetrical categorical classes. To improve accuracy, three machine learning algorithms, support vector machine (SVM), random forest, and AdaBoost, are established in the first level and then integrated by logistic regression via stacking. A feature importance analysis was applied to identify important variables. The experimental data were collected from four academic years at Hankou University. According to comparative studies on five evaluation metrics (precision, recall, F1, error, and area under the receiver operating characteristic curve (AUC)), the proposed model generally performs better than the compared models. The important variables identified by the analysis are interpretable and can be used as guidance for selecting promising students.
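The stacking design described (SVM, random forest, and AdaBoost as level-one learners, integrated by logistic regression) can be sketched with scikit-learn's StackingClassifier; the data here is synthetic and the estimator settings are assumptions, not the paper's configuration.

```python
# Illustrative sketch of the described stacking ensemble: SVM, random forest,
# and AdaBoost in the first level, combined by logistic regression via stacking.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=15, random_state=0)

level_one = [
    ("svm", SVC(probability=True, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("ada", AdaBoostClassifier(random_state=0)),
]
stack = StackingClassifier(estimators=level_one,
                           final_estimator=LogisticRegression(max_iter=1000),
                           cv=5)

auc = cross_val_score(stack, X, y, cv=5, scoring="roc_auc")
print("stacked model AUC: %.3f" % auc.mean())
```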


Author(s):  
Marco A. Alvarez ◽  
SeungJin Lim

Current search engines impose an overhead on motivated students and Internet users who employ the Web as a valuable resource for education. A user searching for good educational materials on a technical subject often spends extra time filtering irrelevant pages or ends up with commercial advertisements. It would be ideal if, given a technical subject by an educationally motivated user, suitable materials for that subject were automatically identified by affordable machine processing of the recommendation set returned by a search engine. In this scenario, the user saves a significant amount of time filtering out less useful Web pages, and the user’s learning goal on the subject can be achieved more efficiently without clicking through numerous pages. This type of convenient learning is called One-Stop Learning (OSL). In this paper, the contributions made by Lim and Ko in (Lim and Ko, 2006) for OSL are redefined and modeled using machine learning algorithms. Four selected supervised learning algorithms, Support Vector Machine (SVM), AdaBoost, Naive Bayes, and Neural Networks, are evaluated using the same data used in (Lim and Ko, 2006). The results presented in this paper are promising: the highest precision (98.9%) and overall accuracy (96.7%), obtained using SVM, are superior to the results presented by Lim and Ko. Furthermore, the machine learning approach presented here demonstrates that the small set of features used to represent each Web page yields a good solution for the OSL problem.
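As a hedged illustration of classifying returned Web pages as suitable or not for one-stop learning, the sketch below uses a TF-IDF bag-of-words representation with a linear SVM. The toy documents are invented, and the original work used a small engineered feature set rather than raw page text.

```python
# Illustrative sketch: classifying Web-page text as suitable (1) or not (0)
# for one-stop learning with a TF-IDF + linear SVM pipeline. The documents
# below are toy examples, not the data from Lim and Ko (2006).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

pages = [
    "lecture notes on binary search trees with examples and exercises",
    "buy the best laptops now with free shipping and discounts",
    "tutorial introduction to dynamic programming with worked problems",
    "limited time offer on study guides click here to order",
]
labels = [1, 0, 1, 0]   # 1 = good educational material, 0 = not

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(pages, labels)

test_pages = ["step by step tutorial on graph algorithms with practice problems"]
print("predicted label:", clf.predict(test_pages)[0])
```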


Author(s):  
Nabil Mohamed Eldakhly ◽  
Magdy Aboul-Ela ◽  
Areeg Abdalla

The particulate matter air pollutant of diameter less than 10 micrometers (PM10), a category of pollutants including solid and liquid particles, can be a health hazard for several reasons: it can harm lung tissue and the throat, aggravate asthma, and increase respiratory illness. Accurate prediction models of PM10 concentrations are essential for proper management, control, and public warning strategies. Machine learning techniques can be used to develop methods or tools that discover unseen patterns in data to solve a particular task or problem. Chance theory offers advanced concepts for treating cases where randomness and fuzziness play simultaneous roles. The main objective is to study a modification of a single machine learning algorithm, the support vector machine (SVM), in which the chance weight of the target variable, derived from chance theory, is applied to the corresponding data point, and to show that this modified SVM is superior to ensemble machine learning algorithms. The results of this study show that the SVM algorithm, when modified and combined with the right theory or technique, in particular chance theory, outperforms other modern ensemble learning algorithms.
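The core mechanism, weighting each training point by a chance-derived weight of the target variable before fitting the SVM, can be sketched as follows. The weighting rule below (inverse class frequency) is a placeholder assumption, not the paper's chance-theory computation, and the data is synthetic.

```python
# Illustrative sketch: fitting an SVM with per-sample weights, standing in for
# the chance-theory weights described in the paper. The weighting rule below
# (inverse class frequency) is a placeholder, not the authors' chance measure.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=1500, n_features=10, weights=[0.8, 0.2],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Placeholder "chance" weights: rarer target values receive larger weights.
class_freq = np.bincount(y_tr) / len(y_tr)
sample_weight = 1.0 / class_freq[y_tr]

svm = SVC(kernel="rbf", C=1.0)
svm.fit(X_tr, y_tr, sample_weight=sample_weight)   # weights enter the SVM loss
print("weighted-SVM accuracy:", accuracy_score(y_te, svm.predict(X_te)))
```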

