Big Data Based Diabetes and Heart Disease Prediction System by Employing Supervised Learning Algorithm

Healthcare systems generate vast amounts of data, and the growth of this data is exponential. Such voluminous data can be analysed effectively only when it is organized efficiently. Data retrieval must also be made simpler, so that a healthcare professional can compare and contrast a test sample with the database of health records. This makes better disease prediction possible, and this work presents a big data based disease prediction system built on supervised learning. The proposed approach clusters related health records based on every medical attribute, after which the disease is predicted by an SVM classifier. The performance of the proposed disease prediction system is observed to be satisfactory in terms of accuracy, precision, recall and F-measure, while consuming a reasonable amount of time.
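
The four evaluation measures named in the abstract can be illustrated with a minimal sketch. The function name `binary_metrics` and the sample labels are assumptions made for illustration; they are not taken from the paper, and the paper's own evaluation code is not shown here.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F-measure for binary labels."""
    pairs = list(zip(y_true, y_pred))
    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical labels: 1 = disease present, 0 = disease absent.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case there are three true positives, one false positive, one false negative and three true negatives, so all four measures come out at 0.75.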

Nowadays, machine learning is considered a key technique in fields of technology such as the Internet of Things (IoT), cloud computing, big data and artificial intelligence. As technology advances, large amounts of incorrect and redundant data are collected from these fields. To make use of these data for a meaningful purpose, we have to apply mining or classification techniques in the real world. In this paper, we propose two novel approaches to data classification using a supervised learning algorithm.


2021 ◽  
Vol 1 (1) ◽  
pp. 146-176
Author(s):  
Israa Nadher ◽  
Mohammad Ayache ◽  
Hussein Kanaan

Abstract: Information decision support systems are increasingly used in the era of digital data and the rise of artificial intelligence. Heart disease, one of the best known and most dangerous diseases, is receiving considerable attention, which has translated into digital prediction systems that detect the presence of the disease from the available data and information. Such systems faced many problems when they first appeared, but with the development of the machine learning field, new models are now being developed to detect the presence of this disease. In addition to algorithms, data is very important and forms the heart of prediction systems: prediction algorithms take decisions, these decisions must be based on facts, and these facts are extracted from data, so data is the starting point of every system. In this paper we propose a Heart Disease Prediction System using machine learning algorithms. We used the Cleveland dataset, which is normalized and then divided into three scenarios in terms of training and testing, respectively: 80%-20%, 50%-50% and 30%-70%. Whether the dataset is normalized or not, we have these three scenarios. For every scenario we used three machine learning algorithms, SVM, SMO and MLP, and in these algorithms we used two different kernels to test the results. These two types of simulation are added to the scenarios mentioned above: at the main level there are two types, normalized and unnormalized dataset; for each there are three types according to the amount of training and testing data; and for each of these there are two scenarios according to the type of kernel, giving 30 scenarios in total. Our proposed system has shown dominance in terms of accuracy over previous works.
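
The scenario grid described in this abstract can be enumerated with a short sketch. The kernel names `kernel_a` and `kernel_b`, the helper `split_rows` and the 100-row stand-in dataset are hypothetical, since the abstract does not name the kernels or give the implementation. The sketch covers only the normalization, split and kernel dimensions, which yield 12 combinations; the abstract reports 30 scenarios once the algorithms are counted, and that breakdown is not spelled out there.

```python
from itertools import product
import random

NORMALIZATION = ("normalized", "unnormalized")
# (train, test) fractions as listed in the abstract.
SPLITS = ((0.8, 0.2), (0.5, 0.5), (0.3, 0.7))
# Hypothetical placeholders: the abstract does not name the two kernels.
KERNELS = ("kernel_a", "kernel_b")


def split_rows(rows, train_fraction, seed=0):
    """Shuffle rows reproducibly and cut them into train and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Every combination of normalization, split ratio and kernel.
scenarios = list(product(NORMALIZATION, SPLITS, KERNELS))

rows = list(range(100))  # stand-in for dataset rows
split_sizes = []
for normalization, (train_fraction, _test_fraction), kernel in scenarios:
    train, test = split_rows(rows, train_fraction)
    split_sizes.append((len(train), len(test)))
    print(normalization, kernel, len(train), len(test))
```

Fixing the shuffle seed keeps the train and test partitions identical across kernels, so that differences in accuracy can be attributed to the kernel rather than to the split.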


2020 ◽  
Author(s):  
Lei Wang ◽  
Qing Qian ◽  
Qiang Zhang ◽  
Jishuai Wang ◽  
Wenbo Cheng ◽  
...  

Abstract Big data in medical diagnosis can provide abundant value for clinical diagnosis, decision support and many other applications, but obtaining a large amount of labeled medical data takes a lot of time and manpower. In this paper, a classification model based on a semi-supervised learning algorithm using both labeled and unlabeled data is proposed to process big data in medical diagnosis, which includes structured, semi-structured and unstructured data. For medical laboratory data, this paper proposes a self-training algorithm based on a repeated labeling strategy to solve the problem that mislabeled samples weaken the performance of classifiers. For medical record data, this paper first extracts features that are highly correlated with the classification results, based on a domain expert knowledge base, and then chooses the unlabeled medical record data with the highest confidence to expand the training set and optimize the performance of the classifiers of the tri-training algorithm, which uses a supervised learning algorithm to train three basic classifiers. The experimental results show that the proposed medical diagnosis data classification model based on the semi-supervised learning algorithm has good performance.
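
The general idea of self-training, where only high-confidence predictions on unlabeled data are added to the training set, can be sketched as follows. This is plain self-training with a nearest-centroid base classifier on one-dimensional toy data, not the paper's repeated labeling strategy or its tri-training variant; all names, the confidence formula and the threshold are assumptions chosen for illustration.

```python
def class_centroids(labeled):
    """Mean feature value per class from (value, label) pairs."""
    sums, counts = {}, {}
    for value, label in labeled:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_with_confidence(value, centroids):
    """Nearest-centroid label and a confidence in [0.5, 1.0].

    Assumes at least two classes are present in `centroids`.
    """
    ranked = sorted((abs(value - c), label) for label, c in centroids.items())
    (nearest, label), (second, _) = ranked[0], ranked[1]
    total = nearest + second
    confidence = 1.0 - nearest / total if total > 0 else 0.5
    return label, confidence


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=10):
    """Repeatedly pseudo-label confident samples and refit the centroids."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        centroids = class_centroids(labeled)
        confident, remaining = [], []
        for value in pool:
            label, confidence = predict_with_confidence(value, centroids)
            if confidence >= threshold:
                confident.append((value, label))
            else:
                remaining.append(value)
        if not confident:
            break  # nothing new to learn from; stop early
        labeled.extend(confident)
        pool = remaining
    return labeled, pool


# Hypothetical one-dimensional lab values with two seed labels.
seed_labeled = [(1.0, "low"), (9.0, "high")]
unlabeled = [1.5, 2.0, 8.0, 8.5, 4.0, 6.0]
final_labeled, still_unlabeled = self_train(seed_labeled, unlabeled)
print(final_labeled, still_unlabeled)
```

Samples that never reach the confidence threshold stay unlabeled rather than being forced into a class, which is one simple way to limit the effect of mislabeled samples.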


Author(s):  
Gaochao Xu ◽  
Yan Ding ◽  
Yuqiang Jiang ◽  
Ming Hu ◽  
Jia Zhao

Recently, big data have become a research hotspot and have been successfully exploited in a few applications such as data mining and business modeling. Although big data hold plenty of value for all fields of computer science, it is very difficult for current computing paradigms and computer hardware to process and utilize big data efficiently enough to deliver the expected benefits. In this work, we explore the possibility of employing big data in recommendation systems. We propose a simple recommendation system framework, BDRSF (Big Data Recommendation System Framework), which is based on big data with social context theories and obtains the Recommender through supervised learning on big data. Its main idea can be divided into three parts: (1) reduce the scale of the current recommendation problems according to the essence of recommending; (2) design a rational Recommender and propose a novel supervised learning algorithm to obtain it; (3) utilize the Recommender to deal with later recommendation problems. Experimental results show that BDRSF outperforms conventional recommendation systems, which clearly indicates the effectiveness and efficiency of big data with social context in personalized recommendation.


Author(s):  
Dan Luo

Background: It is well known that the semi-supervised algorithm is a classical algorithm in semi-supervised learning. Methods: This paper proposes an improved cooperative semi-supervised learning algorithm, presents the algorithm process in detail, and applies it to predict unlabeled electronic component images. Results: Experiments on the classification and recognition of electronic components show that the proposed algorithm can significantly improve accuracy in electronic device image recognition, and that the improved algorithm can be used in the actual recognition process. Conclusion: With the continuous development of science and technology, machine vision and deep learning will play a more important role in people's lives in the future. Research on identifying the number of components is bound to develop towards high precision and multiple dimensions, which will greatly improve the production efficiency of the electronic components industry.


2020 ◽  
Author(s):  
Anusha Ampavathi ◽  
Vijaya Saradhi T

Big data and its approaches are generally helpful in the healthcare and biomedical sectors for predicting disease. For trivial symptoms, it is difficult to meet doctors at the hospital at any time. Thus, big data provides essential information about diseases on the basis of the patient's symptoms. For several medical organizations, disease prediction is important for making the best feasible health care decisions. Conversely, the conventional medical care model takes structured input that requires more accurate and consistent prediction. This paper aims to develop multi-disease prediction using the improvised deep learning concept. Here, different datasets pertaining to "Diabetes, Hepatitis, lung cancer, liver tumor, heart disease, Parkinson's disease, and Alzheimer's disease" are gathered from the benchmark UCI repository for conducting the experiment. The proposed model involves three phases: (a) data normalization, (b) weighted normalized feature extraction, and (c) prediction. Initially, the dataset is normalized in order to bring each attribute's range to a certain level. Further, weighted feature extraction is performed, in which a weight function is multiplied with each attribute value to produce large scale deviation. Here, the weight function is optimized using a combination of two meta-heuristic algorithms, termed the Jaya Algorithm-based Multi-Verse Optimization algorithm (JA-MVO). The optimally extracted features are subjected to hybrid deep learning algorithms, namely "Deep Belief Network (DBN) and Recurrent Neural Network (RNN)". As a modification to the hybrid deep learning architecture, the weights of both DBN and RNN are optimized using the same hybrid optimization algorithm. Further, a comparative evaluation of the proposed prediction against existing models certifies its effectiveness through various performance measures.
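
The first two phases, normalization followed by multiplying each attribute by a weight, might look like the sketch below. Min-max scaling is assumed as the normalization method, and the weights are fixed, hypothetical values; in the paper the weight function is optimized by JA-MVO, which is not reproduced here. The function names and sample records are likewise illustrative.

```python
def min_max_normalize(rows):
    """Scale every attribute (column) into the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ]
        for row in rows
    ]


def apply_weights(rows, weights):
    """Multiply each normalized attribute value by its attribute weight."""
    return [[value * weight for value, weight in zip(row, weights)] for row in rows]


# Hypothetical records with two attributes each.
records = [[120.0, 30.0], [180.0, 50.0], [150.0, 70.0]]
# Hypothetical fixed weights; the paper optimizes these with JA-MVO.
weights = [2.0, 0.5]

normalized = min_max_normalize(records)
weighted = apply_weights(normalized, weights)
print(normalized)
print(weighted)
```

Normalizing first means the weights act on comparable scales, so a larger weight stretches the spread of an attribute regardless of its original units.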

