Cardiotocography Data Analysis to Predict Fetal Health Risks with Tree-Based Ensemble Learning

Pankaj Bhowmik; Pulak Chandra Bhowmik; U. A. Md. Ehsan Ali; Md. Sohrawordi

doi:10.5815/ijitcs.2021.05.03

International Journal of Information Technology and Computer Science ◽

10.5815/ijitcs.2021.05.03 ◽

2021 ◽

Vol 13 (5) ◽

pp. 30-40

Author(s):

Pankaj Bhowmik ◽

Pulak Chandra Bhowmik ◽

U. A. Md. Ehsan Ali ◽

Md. Sohrawordi

Keyword(s):

Machine Learning ◽

Ensemble Learning ◽

Health Risks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Model Assessment ◽

Decision Tree Classifier ◽

Chi Square ◽

Fetal Health ◽

Tree Classifier

A sizeable number of women face difficulties during pregnancy, which can eventually lead to serious health problems for the fetus. However, early detection of these risks can save the invaluable lives of both infants and mothers. Cardiotocography (CTG) data provides sophisticated information by monitoring the heart rate signal of the fetus, and is used to predict potential risks to fetal wellbeing and to support clinical conclusions. This paper analyzes antepartum CTG data (available on the UCI Machine Learning Repository) and develops an efficient tree-based ensemble learning (EL) classifier model to predict fetal health status. In this study, EL follows the Stacking approach, and a concise overview of this approach is given before it is developed. The study also applies distinct machine learning techniques to the CTG dataset and determines their performance. The Stacking EL technique in this paper involves four tree-based machine learning algorithms as base learners: the Random Forest classifier, Decision Tree classifier, Extra Trees classifier, and Deep Forest classifier. The CTG dataset contains 21 features, but only the 10 most important features are selected with the Chi-square method for this experiment, and the selected features are then normalized with Min-Max scaling. Following that, Grid Search is applied to tune the hyperparameters of the base algorithms. Subsequently, 10-fold cross-validation is performed to select the meta learner of the EL classifier model. A comparative model assessment is then made between the individual base learning algorithms and the EL classifier model, and the findings show the EL classifier's superiority in predicting fetal health risks, with an accuracy of about 96.05%. The study concludes that the Stacking EL approach can be a substantial paradigm in machine learning studies for improving model accuracy and reducing the error rate.
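The Min-Max scaling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual definition x' = (x - min) / (max - min); the function name `min_max_scale` and the sample readings are hypothetical and are not taken from the paper or the CTG dataset.

```python
def min_max_scale(column):
    """Rescale a list of numbers to the [0, 1] range."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature has no spread; map every value to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical heart-rate readings in beats per minute (illustrative only).
baseline_bpm = [120.0, 132.0, 140.0, 160.0]
scaled = min_max_scale(baseline_bpm)
print(scaled)
```

After scaling, the smallest value maps to 0 and the largest to 1, so features measured in different units contribute on a comparable scale.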

Ensemble-Based Machine Learning for Predicting Sudden Human Fall Using Health Data

Mathematical Problems in Engineering ◽

10.1155/2021/8608630 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Utkarsh Saxena ◽

Soumen Moulik ◽

Soumya Ranjan Nayak ◽

Thomas Hanne ◽

Diptendu Sinha Roy

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Majority Voting ◽

Support Vector ◽

Human Beings ◽

Medical Terminology ◽

Decision Tree Classifier ◽

Tree Classifier ◽

Health Parameters

We attempt to predict the accidental fall of human beings due to sudden abnormal changes in their health parameters such as blood pressure, heart rate, and sugar level. In medical terminology, this problem is known as Syncope. The primary motivation is to prevent such falls by predicting abnormal changes in these health parameters that might trigger a sudden fall. We apply various machine learning algorithms such as logistic regression, a decision tree classifier, a random forest classifier, K-Nearest Neighbours (KNN), a support vector machine, and a naive Bayes classifier on a relevant dataset and verify our results with the cross-validation method. We observe that the KNN algorithm provides the best accuracy in predicting such a fall. However, the accuracy results of some other algorithms are also very close. Thus, we move one step further and propose an ensemble model, Majority Voting, which aggregates the prediction results of multiple machine learning algorithms and finally indicates the probability of a fall that corresponds to a particular human being. The proposed ensemble algorithm yields 87.42% accuracy, which is greater than the accuracy provided by the KNN algorithm.
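A minimal sketch of majority voting over the outputs of several classifiers follows. The labels, the vote list, and the helper names are hypothetical, and reporting the fall probability as the fraction of classifiers voting "fall" is an assumption made here for illustration; the abstract does not specify how the probability is computed.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    Ties are resolved in favour of the label that appears first in the list.
    """
    label, _count = Counter(predictions).most_common(1)[0]
    return label


def vote_fraction(predictions, positive="fall"):
    """Fraction of classifiers voting for the positive label (assumed proxy for probability)."""
    return sum(1 for p in predictions if p == positive) / len(predictions)


# Hypothetical predictions from six classifiers for one person.
votes = ["fall", "no_fall", "fall", "fall", "no_fall", "fall"]
decision = majority_vote(votes)
probability = vote_fraction(votes)
print(decision, probability)
```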

Detailed Analysis of Intrusion Detection using Machine Learning Algorithms

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.a2127.059120 ◽

2020 ◽

Vol 9 (1) ◽

pp. 1894-1899 ◽

Cited By ~ 1

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Svm Classifier ◽

Learning Approaches ◽

Decision Tree Classifier ◽

Internet Users ◽

Tree Classifier ◽

Challenging Tasks

The number of internet users has increased exponentially over the years, and intrusive activities have increased significantly as well. Detecting an intrusion attack in a system connected over a network is one of the most challenging tasks in today's world. A significant number of techniques based on machine learning approaches have been developed to detect these intrusion attacks. Even though these techniques are good, they are not good enough to detect all kinds of attacks. In this paper, different machine learning algorithms are analyzed on the NSL-KDD dataset, with pre-processing steps such as one-hot encoding, feature selection, and random sampling, to find the best-performing model for detecting these attacks. The attacks in the dataset are classified into four types (Probe, DoS, U2R, and R2L), while non-attack traffic is labelled Normal. The dataset comes in two parts: KDD-Train and KDD-Test. Models are trained and tested on the dataset to measure accuracy and to understand and compare the performance of the different machine learning algorithms. The machine learning algorithms used are the Naive Bayes Classifier, Decision Tree Classifier, Random Forest Classifier, KNeighbours Classifier, Logistic Regression, SVM Classifier, and Voting Classifier. These techniques are compared according to their capability to detect the attacks. This comparison will help to find the algorithm that works best for detecting different kinds of intrusion attacks.
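The one-hot encoding step named in the abstract can be sketched as follows. The function `one_hot_encode` is a hypothetical helper, and the protocol values are illustrative samples of a categorical network-traffic feature rather than rows from the paper's data.

```python
def one_hot_encode(values):
    """Map each categorical value to a 0/1 vector with a single 1."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


# Illustrative categorical feature values (assumed, not from the paper).
protocols = ["tcp", "udp", "icmp", "tcp"]
categories, encoded = one_hot_encode(protocols)
print(categories)
print(encoded)
```

Encoding categories as separate 0/1 columns avoids implying an order between values such as "tcp" and "udp", which a single integer code would do.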

A Study of Machine Learning Algorithms for DDoS Detection

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.34922 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 174-178

Author(s):

Sheikh Shehzad Ahmed

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Algorithms ◽

Random Forest Classifier ◽

Attack Detection ◽

Machine Learning Algorithms ◽

The Internet ◽

Ddos Attacks ◽

Decision Tree Classifier ◽

Tree Classifier

The Internet is used practically everywhere in today's digital environment. With the increased use of the Internet comes an increase in the number of threats. DDoS attacks are one of the most popular types of cyber-attacks nowadays. With the fast advancement of technology, the harm caused by DDoS attacks has grown increasingly severe. Because DDoS attacks may readily modify the ports/protocols utilized or how they function, the basic features of these attacks must be examined. Machine learning approaches have also been used extensively in intrusion detection research. Still, it is unclear what features are applicable and which approach would be better suited for detection. With this in mind, the research presents a machine learning-based DDoS attack detection approach. To train the attack detection model, we employ four Machine Learning algorithms: Decision Tree classifier (ID3), k-Nearest Neighbors (k-NN), Logistic Regression, and Random Forest classifier. The results of our experiments show that the Random Forest classifier is more accurate in recognizing attacks.

Making Use of Functional Dependencies Based on Data to Find Better Classification Trees

International Journal of Circuits, Systems and Signal Processing ◽

10.46300/9106.2021.15.160 ◽

2021 ◽

Vol 15 ◽

pp. 1475-1485

Author(s):

Hyontai Sug

Keyword(s):

Machine Learning ◽

Data Mining ◽

Decision Trees ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Functional Dependencies ◽

Chi Square ◽

Chi Square Test ◽

Novel Method ◽

Categorical Attributes

For the classification task of machine learning algorithms, independence between conditional attributes is a precondition for successful data mining. Decision trees, in turn, are among the most widely used machine learning algorithms because they are easy to understand. Because dependency between conditional attributes can produce more complex trees, it is very important to supply conditional attributes that are independent of each other: the requirement for decision trees, as for other machine learning algorithms, is that conditional attributes are independent of each other and dependent on decisional attributes only. The statistical method for checking independence between attributes is the Chi-square test, but the test is effective for categorical attributes only. The applicability of the Chi-square test is therefore limited, because most datasets for data mining mix categorical and numerical attributes. To overcome this problem, and as a way to test dependency between conditional attributes, a novel method based on data-derived functional dependency is suggested, which can be applied to any dataset irrespective of the data types of its attributes. After removing highly dependent conditional attributes, better decision trees can be generated. Experiments were performed to show that the method is effective, and they showed very good results.
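The Chi-square test of independence that the abstract refers to can be sketched for two categorical attributes as below. This shows only the standard Pearson statistic on a contingency table, not the paper's functional-dependency method; the function name and the example tables are hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts; all row and column
    totals are assumed to be non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two attributes were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows are values of one attribute, columns of another.
dependent_stat, dependent_dof = chi_square_statistic([[10, 20], [20, 10]])
independent_stat, _ = chi_square_statistic([[10, 20], [20, 40]])
print(dependent_stat, dependent_dof, independent_stat)
```

A statistic near zero, as in the second table, is consistent with independent attributes; a large value relative to the chi-square distribution with the given degrees of freedom suggests dependency.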

Super ensemble learning for daily streamflow forecasting: large-scale demonstration and comparison with multiple machine learning algorithms

Neural Computing and Applications ◽

10.1007/s00521-020-05172-3 ◽

2020 ◽

Cited By ~ 1

Author(s):

Hristos Tyralis ◽

Georgia Papacharalampous ◽

Andreas Langousis

Keyword(s):

Machine Learning ◽

Ensemble Learning ◽

Large Scale ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Streamflow Forecasting ◽

Daily Streamflow

Identification and Detection of Cyberbullying on Facebook Using Machine Learning Algorithms

Journal of Cases on Information Technology ◽

10.4018/jcit.296254 ◽

2021 ◽

Vol 23 (4) ◽

pp. 1-21

Author(s):

Nureni Ayofe AZEEZ ◽

Sanjay Misra ◽

Omotola Ifeoluwa LAWAL ◽

Jonathan Oluranti

Keyword(s):

Machine Learning ◽

Social Media ◽

Nearest Neighbor ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

K Nearest Neighbor ◽

Chi Square ◽

Social Media Platforms ◽

Bayes Algorithm ◽

Use Of Social Media

The use of social media platforms such as Facebook, Twitter, Instagram, and WhatsApp has enabled many people to communicate effectively and frequently with each other, and this has also enabled cyberbullying to occur more frequently on these networks. Cyberbullying is known to cause serious health issues among social media users, so creating a way to identify and detect it is of significant importance. This paper examines unique features extracted from a Facebook dataset and develops a model that identifies and detects cyberbullying posts by applying machine learning algorithms (the Naïve Bayes algorithm and K-Nearest Neighbor). The project also uses a feature selection algorithm, the χ² test (Chi-square test), to select important features that can improve the performance of the classifiers and decrease classification time. The resulting model tends to detect cyberbullying on Facebook with a high degree of accuracy and also improves the performance of the machine learning classifiers.

Synthetic Minority Oversampling and Smote Regularised Deep Autoencoders Neural Network Techniques for Fraud Prediction in Financial Payment Services

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.l3419.1081219 ◽

2019 ◽

Vol 8 (12) ◽

pp. 3908-3915

Keyword(s):

Neural Network ◽

Machine Learning ◽

Financial Institutions ◽

Fraud Detection ◽

Machine Learning Algorithms ◽

Decision Tree Classifier ◽

Class Imbalance Problem ◽

Good Recall ◽

Tree Classifier ◽

Payment Services

Fraud in financial payment services is the most prevalent form of cybercrime. The growth of e-commerce and mobile payments in recent years is behind the rising incidence of fraud in financial payment services. According to McKinsey, "fraud losses throughout the world could be close to $44 billion by 2025." Every year, fraudulent card transactions cause billions of US dollars in losses. To reduce these losses, it is essential to design effective fraud detection algorithms, which depend on sophisticated machine learning methods to help fraud investigators. For banks and financial institutions, fraud detection systems have therefore gained great significance. Though fake transactions are very few compared with genuine transactions, care must be taken to predict them so that financial institutions can maintain customer integrity. As fraud is unlikely to occur compared with normal operations, we face the class imbalance problem. We applied the Synthetic Minority Oversampling TEchnique (SMOTE) and an ensemble of sampling methods (Balanced Random Forest Classifier, Balanced Bagging Classifier, Easy Ensemble Classifier, RUS Boost) to ensemble machine learning algorithms. Performance was assessed using sensitivity, specificity, precision, and ROC area. The purpose of this article is to analyze different predictive models to see how precisely they detect whether a transaction is a standard payment or a fraud. Instead of misclassifying a real transaction as fraud, this model seeks to improve detection of fraud. We noted that Ensemble learning using Maximum voting detects fraud better than other classifiers. The Decision Tree Classifier, Logistic Regression, and Balanced Bagging classifier are combined, and the proposed algorithm is the OptimizedEnsembleFD algorithm. The sample size is then increased and deep learning is applied. It is found that the proposed system, the Smote Regularised Deep Autoencoders (SRD Autoencoders) neural network, performs better, with good recall and accuracy for this large dataset.
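The core idea of SMOTE, creating synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours, can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's pipeline or a library implementation: the function name, the parameters `k`, `n_new`, and `seed`, and the sample points are all hypothetical.

```python
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Generate synthetic points on segments between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class (fraud) samples with two numeric features.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]
synthetic = smote_sample(minority)
print(synthetic)
```

Because every synthetic point is a convex combination of two real minority points, the new samples stay within the region already occupied by the minority class.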

Prophecy on Programming Language using Machine Learning Algorithms

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35746 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 3699-3706

Author(s):

Komal Bhaskar Thube

Keyword(s):

Machine Learning ◽

Programming Language ◽

Machine Learning Algorithms ◽

Support Vector ◽

Computer Language ◽

Decision Tree Classifier ◽

Development Environment ◽

Tree Classifier ◽

Develop Software ◽

Neighbor Classifier

A programming language is a computer language that developers use to develop software programs, scripts, or other sets of instructions for computers to execute. It is difficult to determine which programming language is most widely used. In this work, I have analyzed and compared the classification results of various machine learning models to find out which programming language is most widely used by developers. I have used the Support Vector Machine (SVM), K neighbor classifier (KNN), and Decision Tree Classifier (CART) for this comparative study. My task is to analyze different data and to assess the efficiency of each algorithm in terms of accuracy, precision, recall, and F1 score. My best accuracy was 94.29%, which was obtained using SVM. These techniques are coded in Python and executed in Jupyter Notebook, the Scientific Python Development Environment. The experiments show that SVM performs best for predictive analysis and is a well-suited algorithm for predicting the most widely used programming language.
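The four evaluation measures named in the abstract (accuracy, precision, recall, and F1 score) can be computed from predicted and true labels as in the sketch below. The function name, the labels, and the choice of "py" as the positive class are hypothetical and serve only to illustrate the standard definitions.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical true and predicted language labels.
y_true = ["py", "py", "js", "js", "py"]
y_pred = ["py", "js", "js", "py", "py"]
accuracy, precision, recall, f1 = classification_metrics(y_true, y_pred, "py")
print(accuracy, precision, recall, f1)
```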

Estimation of Individual Tree Biomass in Natural Secondary Forests Based on ALS Data and WorldView-3 Imagery

Remote Sensing ◽

10.3390/rs14020271 ◽

2022 ◽

Vol 14 (2) ◽

pp. 271

Author(s):

Yinghui Zhao ◽

Ye Ma ◽

Lindi Quackenbush ◽

Zhen Zhen

Keyword(s):

Machine Learning ◽

Ensemble Learning ◽

Tree Species ◽

Laser Scanning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Secondary Forests ◽

Species Classification ◽

Individual Tree ◽

Tree Species Classification

Individual-tree aboveground biomass (AGB) estimation can highlight the spatial distribution of AGB and is vital for precision forestry. Accurately estimating individual tree AGB is a requisite for accurate forest carbon stock assessment of natural secondary forests (NSFs). In this study, we investigated the performance of three machine learning and three ensemble learning algorithms in tree species classification based on airborne laser scanning (ALS) and WorldView-3 imagery, inversed the diameter at breast height (DBH) using an optimal tree height curve model, and mapped individual tree AGB for a site in northeast China using additive biomass equations, tree species, and inversed DBH. The results showed that the combination of ALS and WorldView-3 performed better than either single data source in tree species classification, and ensemble learning algorithms outperformed machine learning algorithms (except CNN). Seven tree species had satisfactory accuracy of individual tree AGB estimation, with R² values ranging from 0.68 to 0.85 and RMSE ranging from 7.47 kg to 36.83 kg. The average individual tree AGB was 125.32 kg and the forest AGB was 113.58 Mg/ha in the Maoershan study site in Heilongjiang Province, China. This study provides a way to classify tree species and estimate individual tree AGB of NSFs based on ALS data and WorldView-3 imagery.
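The two accuracy measures reported in the abstract, R² and RMSE, can be computed as in the sketch below. It assumes the common definitions RMSE = sqrt(mean squared residual) and R² = 1 - SS_res / SS_tot; the function names and the biomass values are hypothetical and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the same unit as the inputs."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical observed and estimated individual-tree AGB values in kg.
observed_kg = [100.0, 120.0, 140.0, 160.0]
estimated_kg = [110.0, 115.0, 145.0, 150.0]
error_kg = rmse(observed_kg, estimated_kg)
fit = r_squared(observed_kg, estimated_kg)
print(error_kg, fit)
```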

Breast Cancer Prediction Using Classification Techniques of Machine Learning

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2022.39743 ◽

2022 ◽

Vol 10 (1) ◽

pp. 51-57

Author(s):

Angela More

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Decision Tree Classifier ◽

Learning Techniques ◽

Tree Classifier ◽

Abstract Data

Abstract: Data analytics plays a vital role in diagnosis and treatment in the health care sector. To enable practitioner decision-making, huge volumes of data should be processed with machine learning techniques to produce tools for prediction and classification. Breast cancer is reported in 1 million cases per year. We propose a prediction model specifically designed for predicting breast cancer using the machine learning algorithms Decision Tree classifier, Naïve Bayes, SVM, and K-Nearest Neighbour. The model predicts the type of tumour, which can be benign (noncancerous) or malignant (cancerous). The model uses supervised learning, a machine learning concept in which dependent and independent columns are provided to the machine. It uses a classification technique that predicts the type of tumour. Keywords: Cancer, Machine learning, Prediction, Data Visualization, SVM, Naïve Bayes, Classification.
