scholarly journals Effective Prediction of Diabetes Mellitus using Nine different Machine Learning Techniques and their Performances

Diabetes is a disease where the predominant finding is high blood sugar. The high blood sugar may either be because of deficient insulin production (Type 1) or insulin resistance in peripheral tissue cells (Type 2). Many problems occur if diabetes remains untreated and unidentified. It is additional inventor of various varieties of disorders for example: coronary failure, blindness, urinary organ diseases etc. Nine different machine learning techniques are used in this research work for prediction of diabetes. A dataset of diabetic patient’s is taken and nine different machine learning techniques are applied on the dataset. Positive likelihood ratio, Negative likelihood ratio, Positive predictive value, Negative predictive value, Disease prevalence, Specificity, Precision, Recall, F1-Score ,True positive rate, False positive rate of the applied algorithms is discussed and compared. Diabetes is growing at an increasing in the world and it requires continuous monitoring. To check this we use Logical regression, Random forest, Logical regression CV, Support Vector Machine, Artificial Neural Network (ANN), Decision Tree, k-nearest neighbors (KNN), XGB classifier.

2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Sana Shokat ◽  
Rabia Riaz ◽  
Sanam Shahla Rizvi ◽  
Inayat Khan ◽  
Anand Paul

Revolution in technology is changing the way visually impaired people read and write Braille easily. Learning Braille in its native language can be more convenient for its users. This study proposes an improved backend processing algorithm for an earlier developed touchscreen-based Braille text entry application. This application is used to collect Urdu Braille data, which is then converted to Urdu text. Braille to text conversion has been done on Hindi, Arabic, Bangla, Chinese, English, and other languages. For this study, Urdu Braille Grade 1 data were collected with multiclass (39 characters of Urdu represented by class 1, Alif (ﺍ), to class 39, Bri Yay (ے). Total (N = 144) cases for each class were collected. The dataset was collected from visually impaired students from The National Special Education School. Visually impaired users entered the Urdu Braille alphabets using touchscreen devices. The final dataset contained (N = 5638) cases. Reconstruction Independent Component Analysis (RICA)-based feature extraction model is created for Braille to Urdu text classification. The multiclass was categorized into three groups (13 each), i.e., category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), category-2 (14–26), Ray-Fay (ﻒ - ﺮ), and category-3 (27–39), Kaaf-Bri Yay (ے - ﻕ), to give better vision and understanding. The performance was evaluated in terms of true positive rate, true negative rate, positive predictive value, negative predictive value, false positive rate, total accuracy, and area under the receiver operating curve. Among all the classifiers, support vector machine has achieved the highest performance with a 99.73% accuracy. For comparisons, robust machine learning techniques, such as support vector machine, decision tree, and K-nearest neighbors were used. Currently, this work has been done on only Grade 1 Urdu Braille. In the future, we plan to enhance this work using Grade 2 Urdu Braille with text and speech feedback on touchscreen-based android phones.


Electronics ◽  
2021 ◽  
Vol 10 (22) ◽  
pp. 2857
Author(s):  
Laura Vigoya ◽  
Diego Fernandez ◽  
Victor Carneiro ◽  
Francisco Nóvoa

With advancements in engineering and science, the application of smart systems is increasing, generating a faster growth of the IoT network traffic. The limitations due to IoT restricted power and computing devices also raise concerns about security vulnerabilities. Machine learning-based techniques have recently gained credibility in a successful application for the detection of network anomalies, including IoT networks. However, machine learning techniques cannot work without representative data. Given the scarcity of IoT datasets, the DAD emerged as an instrument for knowing the behavior of dedicated IoT-MQTT networks. This paper aims to validate the DAD dataset by applying Logistic Regression, Naive Bayes, Random Forest, AdaBoost, and Support Vector Machine to detect traffic anomalies in IoT. To obtain the best results, techniques for handling unbalanced data, feature selection, and grid search for hyperparameter optimization have been used. The experimental results show that the proposed dataset can achieve a high detection rate in all the experiments, providing the best mean accuracy of 0.99 for the tree-based models, with a low false-positive rate, ensuring effective anomaly detection.


2020 ◽  
Vol 17 (8) ◽  
pp. 3449-3452
Author(s):  
M. S. Roobini ◽  
Y. Sai Satwick ◽  
A. Anil Kumar Reddy ◽  
M. Lakshmi ◽  
D. Deepa ◽  
...  

In today’s world diabetes is the major health challenges in India. It is a group of a syndrome that results in too much sugar in the blood. It is a protracted condition that affects the way the body mechanizes the blood sugar. Prevention and prediction of diabetes mellitus is increasingly gaining interest in medical sciences. The aim is how to predict at an early stage of diabetes using different machine learning techniques. In this paper basically, we use well-known classification that are Decision tree, K-Nearest Neighbors, Support Vector Machine, and Random forest. These classification techniques used with Pima Indians diabetes dataset. Therefore, we predict diabetes at different stage and analyze the performance of different classification techniques. We Also proposed a conceptual model for the prediction of diabetes mellitus using different machine learning techniques. In this paper we also compare the accuracy of the different machine learning techniques to finding the diabetes mellitus at early stage.


Computers ◽  
2019 ◽  
Vol 8 (2) ◽  
pp. 35 ◽  
Author(s):  
Xuan Dau Hoang ◽  
Ngoc Tuong Nguyen

Defacement attacks have long been considered one of prime threats to websites and web applications of companies, enterprises, and government organizations. Defacement attacks can bring serious consequences to owners of websites, including immediate interruption of website operations and damage of the owner reputation, which may result in huge financial losses. Many solutions have been researched and deployed for monitoring and detection of website defacement attacks, such as those based on checksum comparison, diff comparison, DOM tree analysis, and complicated algorithms. However, some solutions only work on static websites and others demand extensive computing resources. This paper proposes a hybrid defacement detection model based on the combination of the machine learning-based detection and the signature-based detection. The machine learning-based detection first constructs a detection profile using training data of both normal and defaced web pages. Then, it uses the profile to classify monitored web pages into either normal or attacked. The machine learning-based component can effectively detect defacements for both static pages and dynamic pages. On the other hand, the signature-based detection is used to boost the model’s processing performance for common types of defacements. Extensive experiments show that our model produces an overall accuracy of more than 99.26% and a false positive rate of about 0.27%. Moreover, our model is suitable for implementation of a real-time website defacement monitoring system because it does not demand extensive computing resources.


2019 ◽  
Vol 8 (9) ◽  
pp. 1298 ◽  
Author(s):  
Giulia Lorenzoni ◽  
Stefano Santo Sabato ◽  
Corrado Lanera ◽  
Daniele Bottigliengo ◽  
Clara Minto ◽  
...  

The present study aims to compare the performance of eight Machine Learning Techniques (MLTs) in the prediction of hospitalization among patients with heart failure, using data from the Gestione Integrata dello Scompenso Cardiaco (GISC) study. The GISC project is an ongoing study that takes place in the region of Puglia, Southern Italy. Patients with a diagnosis of heart failure are enrolled in a long-term assistance program that includes the adoption of an online platform for data sharing between general practitioners and cardiologists working in hospitals and community health districts. Logistic regression, generalized linear model net (GLMN), classification and regression tree, random forest, adaboost, logitboost, support vector machine, and neural networks were applied to evaluate the feasibility of such techniques in predicting hospitalization of 380 patients enrolled in the GISC study, using data about demographic characteristics, medical history, and clinical characteristics of each patient. The MLTs were compared both without and with missing data imputation. Overall, models trained without missing data imputation showed higher predictive performances. The GLMN showed better performance in predicting hospitalization than the other MLTs, with an average accuracy, positive predictive value and negative predictive value of 81.2%, 87.5%, and 75%, respectively. Present findings suggest that MLTs may represent a promising opportunity to predict hospital admission of heart failure patients by exploiting health care information generated by the contact of such patients with the health care system.


Agriculture ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 387
Author(s):  
Nahina Islam ◽  
Md Mamunur Rashid ◽  
Santoso Wibowo ◽  
Cheng-Yuan Xu ◽  
Ahsan Morshed ◽  
...  

This paper explores the potential of machine learning algorithms for weed and crop classification from UAV images. The identification of weeds in crops is a challenging task that has been addressed through orthomosaicing of images, feature extraction and labelling of images to train machine learning algorithms. In this paper, the performances of several machine learning algorithms, random forest (RF), support vector machine (SVM) and k-nearest neighbours (KNN), are analysed to detect weeds using UAV images collected from a chilli crop field located in Australia. The evaluation metrics used in the comparison of performance were accuracy, precision, recall, false positive rate and kappa coefficient. MATLAB is used for simulating the machine learning algorithms; and the achieved weed detection accuracies are 96% using RF, 94% using SVM and 63% using KNN. Based on this study, RF and SVM algorithms are efficient and practical to use, and can be implemented easily for detecting weed from UAV images.


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1467 ◽  
Author(s):  
Zeinab Shahbazi ◽  
Yung-Cheol Byun

Smart manufacturing systems are growing based on the various requests for predicting the reliability and quality of equipment. Many machine learning techniques are being examined to that end. Another issue which considers an important part of industry is data security and management. To overcome the problems mentioned above, we applied the integrated methods of blockchain and machine learning to secure system transactions and handle a dataset to overcome the fake dataset. To manage and analyze the collected dataset, big data techniques were used. The blockchain system was implemented in the private Hyperledger Fabric platform. Similarly, the fault diagnosis prediction aspect was evaluated based on the hybrid prediction technique. The system’s quality control was evaluated based on non-linear machine learning techniques, which modeled that complex environment and found the true positive rate of the system’s quality control approach.


MENDEL ◽  
2019 ◽  
Vol 25 (2) ◽  
pp. 1-10 ◽  
Author(s):  
Ivan Zelinka ◽  
Eslam Amer

Current commercial antivirus detection engines still rely on signature-based methods. However, with the huge increase in the number of new malware, current detection methods become not suitable. In this paper, we introduce a malware detection model based on ensemble learning. The model is trained using the minimum number of signification features that are extracted from the file header. Evaluations show that the ensemble models slightly outperform individual classification models. Experimental evaluations show that our model can predict unseen malware with an accuracy rate of 0.998 and with a false positive rate of 0.002. The paper also includes a comparison between the performance of the proposed model and with different machine learning techniques. We are emphasizing the use of machine learning based approaches to replace conventional signature-based methods.


Author(s):  
Alka Singh

Machine learning is considered to be one of the most promising tools when it comes to working with heterogeneous data. It provides a new dimension which enables one to extract relevant data and take decision for the effective functioning of the network, making use of network generated data. Every sphere of our life is now dependent on machine learning. It has flourished in every dimension. Making it versatile and ever demanding. Department of healthcare contains very abundant and sensitive information which is needed to be carefully handled. Diabetes mellitus is increasing exponentially and is spreading like anything in the world. A reliable prediction system should be present for diagnosing diabetes. Variety of machine learning techniques find their use in the examination of data from variant perspectives and summarizing it into effective information. Usage of new patterns is done to elucidate these patterns in order to deliver relevant information for their users. By making use of techniques such as SVM, random forest, logistic regression, naïve bayes etc the prediction of diabetes can be done easily and accurately. In this study we will make use of different machine learning techniques and try to find accurate prediction regarding the same.


Sign in / Sign up

Export Citation Format

Share Document