Machine Learning Based Method for Prediction of Heart Disease in Big Data Environment

Disease prediction is one of the challenging tasks in the healthcare domain. Conventionally, heart disease was diagnosed by experienced medical professionals and cardiologists with the help of medical and clinical tests. Even with these conventional methods, experienced medical professionals struggled to predict the disease with sufficient accuracy. In addition, manually analysing archived disease data and extracting useful knowledge from it is time-consuming and often infeasible. The advent of machine learning techniques enables the prediction of various diseases in the healthcare domain. Machine learning algorithms are trained on existing historical data, and the resulting prediction models are used to predict outcomes for previously unseen raw data. Over the past two decades, machine learning techniques have been employed extensively for disease prediction. Machine learning algorithms can learn from huge volumes of historical data stored in data marts and data warehouses built on traditional database technologies such as Oracle OnLine Analytical Processing (OLAP). However, conventional database technologies suffer from the limitation that they cannot handle huge data, unstructured data, or data that arrives at high speed. In this context, big data tools and technologies play a major role in storing huge data and facilitating its processing. In this paper, an approach is proposed for predicting heart disease using the Support Vector Machine (SVM) algorithm in a Spark environment. SVM is basically a binary classifier that can classify both linearly and non-linearly separable input data. With the help of different kernel functions, it maps non-linear data into a space in which a separating hyperplane can be found. Spark is a distributed big data processing platform whose distinguishing feature is keeping and processing huge data in memory. The proposed approach is tested with a benchmark dataset from the UCI repository, and the results are discussed.
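
The kernel idea mentioned in the abstract above can be illustrated with a small sketch. This is not the paper's Spark implementation; the function names `poly2_kernel`, `poly2_features`, and `rbf_kernel` and the sample points are hypothetical, chosen only for illustration. The sketch shows that a degree-2 polynomial kernel on 2-D inputs gives the same value as an ordinary dot product in an explicit 3-D feature space, which is why a linear separator in that space corresponds to a non-linear boundary in the original inputs.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: k(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


def poly2_features(x):
    """Explicit feature map for 2-D inputs whose dot product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


if __name__ == "__main__":
    # Hypothetical sample points, not taken from the UCI dataset.
    x, z = (1.0, 2.0), (3.0, -1.0)
    print(poly2_kernel(x, z))                          # kernel value
    print(dot(poly2_features(x), poly2_features(z)))   # same value via the feature map
    print(rbf_kernel(x, x))                            # a point compared with itself
```

The kernel computes the feature-space dot product without ever building the feature vectors, which matters when the implicit feature space is very large, as it is for the RBF kernel.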

Heart Disease Prediction Using Machine Learning

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-1131 ◽

2021 ◽

pp. 267-276

Author(s):

Baban. U. Rindhe ◽

Nikita Ahire ◽

Rupali Patil ◽

Shweta Gagare ◽

Manisha Darade

Keyword(s):

Machine Learning ◽

Data Mining ◽

Heart Disease ◽

Heart Diseases ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Whole Body ◽

Support Vector ◽

Learning Techniques

Heart-related diseases, or Cardiovascular Diseases (CVDs), have been the main cause of a huge number of deaths worldwide over the last few decades and have emerged as the most life-threatening diseases, not only in India but across the whole world. There is therefore a need for a reliable, accurate, and feasible system to diagnose such diseases in time for proper treatment. Machine learning algorithms and techniques have been applied to various medical datasets to automate the analysis of large and complex data. In recent times, many researchers have been using machine learning techniques to help the health care industry and its professionals diagnose heart-related diseases. After the brain, the heart is the next most important organ in the human body. It pumps blood and supplies it to all organs of the body. Predicting the occurrence of heart disease is significant work in the medical field. Data analytics is useful for making predictions from large amounts of information, and it helps medical centers predict various diseases. A huge amount of patient-related data is maintained on a monthly basis, and the stored data can serve as a source for predicting the occurrence of future diseases. Data mining and machine learning techniques such as Artificial Neural Network (ANN), Random Forest, and Support Vector Machine (SVM) are used to predict heart diseases. Predicting and diagnosing heart disease has become a challenge faced by doctors and hospitals both in India and abroad. To reduce the large number of deaths from heart diseases, a quick and efficient detection technique needs to be found. Data mining techniques and machine learning algorithms play a very important role in this area. Researchers are accelerating their work on developing software, with the help of machine learning algorithms, that can support doctors in both predicting and diagnosing heart disease.
The main objective of this research project is to predict the heart disease of a patient using machine learning algorithms.

Geographic Origin Discrimination of Millet Using Vis-NIR Spectroscopy Combined with Machine Learning Techniques

Foods ◽

10.3390/foods10112767 ◽

2021 ◽

Vol 10 (11) ◽

pp. 2767

Author(s):

Muhammad Hilal Kabir ◽

Mahamed Lamine Guindo ◽

Rongqin Chen ◽

Fei Liu

Keyword(s):

Machine Learning ◽

Geographic Origin ◽

Discrimination Performance ◽

Nir Spectroscopy ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Protected Designation Of Origin ◽

Learning Techniques ◽

Non Linear

Millet is a primary food for people living in dry and semi-dry regions and is distributed across most parts of Europe, Africa, and Asia. As part of the European Union (EU) efforts to establish food originality, there is a global need to create Protected Geographical Indication (PGI) and Protected Designation of Origin (PDO) schemes for crops and agricultural products to ensure the integrity of the food supply. In the present work, Visible and Near-Infrared Spectroscopy (Vis-NIR) combined with machine learning techniques was used to discriminate 16 millet varieties (n = 480) originating from various regions of China. Five different machine learning algorithms, namely K-nearest neighbor (K-NN), Linear discriminant analysis (LDA), Logistic regression (LR), Random Forest (RF), and Support vector machine (SVM), were trained on the NIR spectra of these millet samples, and their discrimination performance was assessed. Visible cluster trends were obtained from the Principal Component Analysis (PCA) of the spectral data. Cross-validation was used to optimize the performance of the models. Overall, the F-Score values were as follows: 99.5% for SVM, 99.5% for RF, 99.5% for LDA, 99.1% for K-NN, and 98.8% for LR. Both the linear and non-linear algorithms yielded positive results, but the non-linear models appear slightly better. The study revealed that Vis-NIR spectroscopy assisted by machine learning techniques can be an essential tool for tracing the origins of millet, contributing to a safe authentication method that is quick, relatively cheap, and non-destructive.
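
The F-Score used above to compare the classifiers can be computed from true and predicted labels as in the following sketch. The abstract does not say how per-class scores were averaged, so the macro average used here is an assumption, and the function names and the region labels "A", "B", "C" are hypothetical.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} computed from true and predicted label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted, but wrongly
            fn[t] += 1  # the true class t was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic mean
        # of precision and recall.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Made-up origin labels for six samples, for illustration only.
    y_true = ["A", "A", "B", "B", "C", "C"]
    y_pred = ["A", "A", "B", "C", "C", "C"]
    print(f1_per_class(y_true, y_pred))
    print(macro_f1(y_true, y_pred))
```

In a cross-validated comparison, a score like this would be computed on each held-out fold and averaged per model.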

Acoustic Diversity Classification Using Machine Learning Techniques: Towards Automated Marine Big Data Analysis

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213020600118 ◽

2020 ◽

Vol 29 (03n04) ◽

pp. 2060011

Author(s):

Emna Hachicha Belghith ◽

François Rioult ◽

Medjber Bouzidi

Keyword(s):

Machine Learning ◽

Big Data ◽

Data Analysis ◽

Big Data Analysis ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbor ◽

Learning Techniques ◽

Acoustic Diversity ◽

Marine Data

In recent years, big data has become an emerging trend that is increasingly attracting the attention of the R&D community in several fields (e.g., image processing, database engineering, data mining, artificial intelligence). Marine data is one of the fields affected by this growth, hence the appearance of a marine big data paradigm in which monitoring supports the assessment of human impact on marine data. Nonetheless, support for acoustic sound classification is missing in such environments, particularly support that takes into account the diversity of such data (i.e., sounds of living undersea species, sounds of human activities, and sounds of environmental effects). To overcome this issue, we propose in this paper an approach that efficiently enables acoustic diversity classification using machine learning techniques. The aim is to achieve automated support for marine big data analysis. We have conducted a set of experiments on a real marine dataset in order to validate our approach and show its effectiveness and efficiency. To do so, three machine learning techniques are employed: (i) classic machine learning models (i.e., k-nearest neighbor and support vector machine), (ii) deep learning based on convolutional neural networks, and (iii) transfer learning based on the reuse of pretrained models.
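
As an illustration of the simplest of the classic models named above, the following is a minimal k-nearest neighbor sketch. It is not the authors' pipeline: the two-number feature vectors, the class names, and the function name `knn_predict` are made up for the example, and real acoustic features would be much higher-dimensional.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical 2-D features standing in for acoustic descriptors.
    train_X = [
        (0.2, 0.1), (0.3, 0.2), (0.25, 0.15),   # made-up "biological" sounds
        (2.0, 1.9), (2.1, 2.0), (1.9, 2.2),     # made-up "anthropogenic" sounds
    ]
    train_y = ["biological"] * 3 + ["anthropogenic"] * 3
    print(knn_predict(train_X, train_y, (0.3, 0.1)))
    print(knn_predict(train_X, train_y, (2.0, 2.0)))
```

Using an odd `k` with two classes avoids tied votes; with more classes a tie-breaking rule would be needed.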

Predictors of Outpatients’ No-Show: Big Data Analytics using Apache Spark

10.21203/rs.3.rs-33216/v1 ◽

2020 ◽

Author(s):

Tahani Daghistani ◽

Huda AlGhamdi ◽

Riyad Alshammari ◽

Raed H. AlHazme

Keyword(s):

Machine Learning ◽

Big Data ◽

Negative Impact ◽

Big Data Analytics ◽

Quality Of Healthcare ◽

Machine Learning Techniques ◽

Healthcare Organizations ◽

Data Framework ◽

Huge Data ◽

Learning Techniques

Outpatients who fail to attend their appointments have a negative impact on healthcare outcomes. Healthcare organizations therefore face new opportunities, one of which is to improve the quality of healthcare. The main challenge is predictive analysis using techniques capable of handling the huge data generated. We propose a big data framework for identifying outpatient no-shows via feature engineering and machine learning (MLlib) on the Spark platform. The aim of this paper is to explore the factors that affect the no-show rate, which can then be used to formulate predictions using big data machine learning techniques.

A Study on Machine Learning in Big Data

Oriental journal of computer science and technology ◽

10.13005/ojcst/10.03.15 ◽

2017 ◽

Vol 10 (3) ◽

pp. 660-663

Author(s):

L. Dhanapriya ◽

Dr. S. MANJU

Keyword(s):

Machine Learning ◽

Big Data ◽

Large Data ◽

Machine Learning Algorithms ◽

Large Data Sets ◽

Machine Learning Techniques ◽

Data Sets ◽

Huge Data ◽

Learning Techniques ◽

Market Needs

With recent developments in information technology, the volume of data has surpassed the zettabyte scale, and improving business efficiency by increasing predictive ability through efficient analysis of these data has emerged as an issue in today's society. The market now needs methods capable of extracting valuable information from large data sets. Big data has recently become a focus of attention, and using machine learning techniques to extract valuable information from huge data with complex structures has become an urgent problem to resolve. The aim of this work is to provide a better understanding of machine learning techniques for discovering interesting patterns and to introduce some machine learning algorithms in order to explore the developing trend.

HEART DISEASE PREDICTION USING MACHINE LEARNING TECHNIQUES

INFORMATION TECHNOLOGY IN INDUSTRY ◽

10.17762/itii.v9i1.120 ◽

2021 ◽

Vol 9 (1) ◽

pp. 207-214

Author(s):

Krishna Kumar Yadav ◽

Dr. Anurag Sharma ◽

Dr. Abhishek Badholia

Keyword(s):

Machine Learning ◽

Disease Diagnosis ◽

Machine Learning Techniques ◽

Support Vector ◽

Correct Treatment ◽

Related Disease ◽

Huge Data ◽

Learning Techniques ◽

Supervised Learning Algorithms ◽

Life Threatening

Over the past few decades, cardiovascular or heart-related disease has been the cause of an extensive number of deaths around the globe and has emerged as a life-threatening disease, not only in India but all around the world. A feasible, accurate, and reliable system is therefore needed for timely diagnosis and correct treatment of this disease. Machine learning techniques and methods have been applied to various medical datasets to automate the analysis of large and complex data. In recent times, many researchers have used various machine learning techniques to assist the health care industry, which in turn helps professionals in diagnosing heart-related disease. This paper presents a survey of various models that adopt such techniques and algorithms, together with an analysis of their performance. Among researchers, popular models based on supervised learning algorithms include Random Forest (RF), Decision Tree (DT), Naïve Bayes, ensemble models, K-Nearest Neighbour (KNN), and Support Vector Machine (SVM).

Analyzing driving factors of land values in urban scale based on big data and non-linear machine learning techniques

Land Use Policy ◽

10.1016/j.landusepol.2020.104537 ◽

2020 ◽

Vol 94 ◽

pp. 104537 ◽

Cited By ~ 5

Author(s):

Jun Ma ◽

Jack C.P. Cheng ◽

Feifeng Jiang ◽

Weiwei Chen ◽

Jingcheng Zhang

Keyword(s):

Machine Learning ◽

Big Data ◽

Driving Factors ◽

Machine Learning Techniques ◽

Land Values ◽

Learning Techniques ◽

Non Linear ◽

Linear Machine

Short Term Stock Movements with Big Data and Market Sentiments Analytics

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i8474.078919 ◽

2019 ◽

Vol 8 (9) ◽

pp. 2305-2313

Keyword(s):

Machine Learning ◽

Big Data ◽

Quantitative Data ◽

Data Analytics ◽

Historical Data ◽

Big Data Analytics ◽

Machine Learning Techniques ◽

Short Term ◽

Trend Prediction ◽

Learning Techniques

Machine learning techniques and big data analytics are two central topics in data science. Big data is important for organizations seeking insights, and machine learning techniques are a substantial asset for analyzing massive amounts of data. In this paper, a framework is proposed to improve the accuracy of short-term stock trend prediction using a Logistic Regression model with qualitative and quantitative data. The paper also presents a comprehensive survey of stock market trend prediction that combines various data sources, applies machine learning techniques, and uses a big data analytics approach. The model has been implemented in a big data framework with Hadoop and Apache Spark. For qualitative data, tweet sentiments and news sentiments are taken into account; for quantitative data, Google Trends and historical data are considered. The proposed system improves prediction accuracy by about 3-4% compared with existing models by supplying Google Trends as input data in addition to market sentiments and historical data. The implemented model can help investors make short-term decisions to make money in the securities market, and the survey can help identify the most effective resources that strongly influence stock prices.
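
To make the Logistic Regression step concrete, here is a minimal single-machine sketch of a binary up/down classifier trained by batch gradient descent. It is not the paper's Hadoop/Spark implementation; the two input features (a sentiment score and a previous-day return), the toy values, the learning rate, and all function names are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: p(up | xi) - label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_up(w, b, x):
    """Return 1 (up) if the predicted probability is at least 0.5, else 0 (down)."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


if __name__ == "__main__":
    # Made-up rows: (sentiment score, previous-day return); label 1 = price went up.
    X = [(-0.8, -0.5), (-0.4, -0.2), (0.5, 0.3), (0.9, 0.6)]
    y = [0, 0, 1, 1]
    w, b = train_logistic(X, y)
    print([predict_up(w, b, x) for x in X])
```

Additional inputs, such as a search-trend signal, would simply be extra columns in each feature row.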

Using Machine Learning Algorithms on Prediction of Stock Price

Journal of Modeling and Optimization ◽

10.32732/jmo.2020.12.2.84 ◽

2020 ◽

Vol 12 (2) ◽

pp. 84-99

Author(s):

Li-Pang Chen

Keyword(s):

Machine Learning ◽

Stock Price ◽

Short Term Memory ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Short Term ◽

Learning Techniques ◽

Historical Database ◽

Long Short Term Memory

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four different stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three different machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Support Vector Regression (SVR). By treating the close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in the machine learning methods, it can be shown that the prediction accuracy is improved.
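
Before any of these models can be trained, the time-ordered price table has to be turned into predictor/target pairs. The abstract does not describe how this was done, so the sliding-window scheme below is only one plausible sketch; the window length, the two-column rows (open, close), the prices, and the function name `make_windows` are all hypothetical.

```python
def make_windows(rows, target_index, window=3):
    """Turn a chronologically ordered table into (features, target) pairs.

    Each feature vector is the flattened `window` previous rows; the target is
    the `target_index` column of the row that follows them.
    """
    X, y = [], []
    for t in range(window, len(rows)):
        past = rows[t - window:t]
        X.append([value for row in past for value in row])
        y.append(rows[t][target_index])
    return X, y


if __name__ == "__main__":
    # Made-up daily rows: (open price, close price).
    rows = [
        (10.0, 10.5),
        (10.5, 10.2),
        (10.2, 10.8),
        (10.8, 11.0),
        (11.0, 10.9),
    ]
    X, y = make_windows(rows, target_index=1, window=2)
    for features, target in zip(X, y):
        print(features, "->", target)
```

Because each target comes strictly after its feature window, splitting the pairs chronologically into training and test sets avoids leaking future prices into training.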

A Comparative Study of Different Machine Learning Algorithms for Disease Prediction

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i7/0177 ◽

2017 ◽

Vol 7 (7) ◽

pp. 172

Author(s):

Anantvir Singh Romana

Keyword(s):

Machine Learning ◽

Subsequent Treatment ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Disease Prediction ◽

Classification Problems ◽

Learning Techniques ◽

Neural Network Classifiers ◽

Diagnostic Detection

Accurate diagnostic detection of disease in a patient is critical; it may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently used in various classification problems because of their accurate prediction performance. Different techniques may provide different accuracies, so it is imperative to use the most suitable method, the one that provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree, and neural network classifiers on breast cancer and diabetes datasets.
