Personality Classification Experiment by Applying k-Means Clustering

This paper describes a personality classification experiment that applies the k-means clustering algorithm. Several previous studies have attempted to predict human personality types automatically using various machine learning algorithms, but only a few of them have obtained good accuracy. To classify a person into a personality type, we used the Jungian Type Inventory. Our method consists of three parts: data collection, data preparation, and hyper-parameter tuning. Our test results showed that the k-means model has an inertia value of 107, which is a good number for an unsupervised learning model as an interim result. Based on this result, we divided the data into 16 clusters, which can be considered personality types. We will continue this research with an analysis of the larger data set to be collected in the future.
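The inertia reported above is conventionally the sum of squared distances from each point to its nearest cluster centroid. The following is a minimal sketch of that quantity using a plain k-means loop; the toy points, the function names, and the choice of k=2 are illustrative assumptions, not the authors' data or implementation.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd-style k-means; returns the final centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids

def inertia(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)

# Hypothetical toy data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids = kmeans(points, k=2)
total_inertia = inertia(points, centroids)
print(f"inertia = {total_inertia:.3f}")
```

Lower inertia means tighter clusters, but the value always shrinks as k grows, so it is usually compared across several values of k rather than read in isolation.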

Significant Impact of Improved Machine Learning Algorithm in The Processes of Large Data Sets

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit206133 ◽

2020 ◽

pp. 458-467

Author(s):

Virendra Tiwari ◽

Balendra Garg ◽

Uday Prakash Sharma

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Learning Algorithms ◽

Dynamic Environment ◽

Large Data ◽

Machine Learning Algorithms ◽

Streaming Data ◽

Machine Learning Techniques ◽

Machine Learning Algorithm ◽

Learning Mechanisms

Machine learning algorithms are capable of managing multi-dimensional data in a dynamic environment. Despite these vital features, some challenges remain to be overcome. Machine learning algorithms still require additional mechanisms or procedures for predicting a large number of new classes while managing privacy. These deficiencies show that the reliable use of a machine learning algorithm relies on human experts, because raw data may complicate the learning process and generate inaccurate results. Interpreting outcomes with expertise in machine learning mechanisms is therefore a significant challenge. Machine learning techniques suffer from issues of high dimensionality, adaptability, distributed computing, scalability, streaming data, and duplicity. The main issue of machine learning algorithms is their vulnerability in managing errors. Furthermore, machine learning techniques are also found to lack variability. This paper studies how the computational complexity of machine learning algorithms can be reduced by finding how to make predictions using an improved algorithm.

A system for analyzing large data sets using machine learning algorithms

Bulletin of Kharkov National Automobile and Highway University ◽

10.30977/bul.2219-5548.2021.94.0.142 ◽

2021 ◽

pp. 142

Author(s):

Sergey Pronin ◽

Mykhailo Miroshnichenko

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Large Data ◽

Machine Learning Algorithms ◽

Large Data Sets ◽

Data Sets

Diagnosis of breast cancer using machine learning algorithms based on features selected by Genetic Algorithm: Assessed on five datasets

Journal of University of Shanghai for Science and Technology ◽

10.51201/jusst/21/11963 ◽

2021 ◽

Vol 23 (11) ◽

pp. 749-758

Author(s):

Saranya N ◽

Kavi Priya S

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Genetic Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Cancer Prognosis ◽

Support Vector ◽

Breast Cancer Dataset ◽

Human Beings ◽

Original Dataset

Breast cancer is one of the chronic diseases affecting human beings throughout the world. Early detection of this disease is the most promising way to improve patients’ chances of survival. The strategy employed in this paper is to select the best features from various breast cancer datasets using a genetic algorithm and then apply a machine learning algorithm to predict the outcomes. Two machine learning algorithms, Support Vector Machines and Decision Tree, are used along with the Genetic Algorithm. The proposed work is evaluated on five datasets: the Wisconsin Breast Cancer-Diagnosis Dataset, the Wisconsin Breast Cancer-Original Dataset, the Wisconsin Breast Cancer-Prognosis Dataset, the ISPY1 Clinical trial Dataset, and the Breast Cancer Dataset. The results show that SVM-GA achieves a higher accuracy (98.16%) than DT-GA (97.44%).
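Genetic-algorithm feature selection of the kind described above can be sketched as a search over bit masks, where each bit says whether a feature is kept. The abstract does not give the operators or parameters used, so everything below is an assumption made for illustration: truncation selection, one-point crossover, bit-flip mutation, and a hypothetical `toy_fitness` that stands in for the classifier accuracy a real pipeline would measure.

```python
import random

def evolve_feature_mask(n_features, fitness, generations=30,
                        population_size=20, mutation_rate=0.1, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness`."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = mother[:cut] + father[cut:]
            # Bit-flip mutation: each bit flips with probability mutation_rate.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Hypothetical stand-in for classifier accuracy: reward three "useful"
# features and penalize every other selected feature.
USEFUL = {0, 2, 5}

def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in USEFUL)
    noise = sum(1 for i, bit in enumerate(mask) if bit and i not in USEFUL)
    return hits - 0.5 * noise

best_mask = evolve_feature_mask(8, toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

In a setting like the paper's, the fitness function would instead train an SVM or decision tree on the selected columns and return its validation accuracy.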

Ensemble-Based Machine Learning for Predicting Sudden Human Fall Using Health Data

Mathematical Problems in Engineering ◽

10.1155/2021/8608630 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Utkarsh Saxena ◽

Soumen Moulik ◽

Soumya Ranjan Nayak ◽

Thomas Hanne ◽

Diptendu Sinha Roy

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Majority Voting ◽

Support Vector ◽

Human Beings ◽

Medical Terminology ◽

Decision Tree Classifier ◽

Tree Classifier ◽

Health Parameters

We attempt to predict the accidental fall of human beings due to sudden abnormal changes in their health parameters such as blood pressure, heart rate, and sugar level. In medical terminology, this problem is known as Syncope. The primary motivation is to prevent such falls by predicting abnormal changes in these health parameters that might trigger a sudden fall. We apply various machine learning algorithms such as logistic regression, a decision tree classifier, a random forest classifier, K-Nearest Neighbours (KNN), a support vector machine, and a naive Bayes classifier on a relevant dataset and verify our results with the cross-validation method. We observe that the KNN algorithm provides the best accuracy in predicting such a fall. However, the accuracy results of some other algorithms are also very close. Thus, we move one step further and propose an ensemble model, Majority Voting, which aggregates the prediction results of multiple machine learning algorithms and finally indicates the probability of a fall that corresponds to a particular human being. The proposed ensemble algorithm yields 87.42% accuracy, which is greater than the accuracy provided by the KNN algorithm.
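A majority-voting ensemble of the kind described above can be sketched as follows. This is a minimal illustration of hard voting, assuming each base model emits a 0/1 label (1 = fall predicted); the model names and their predictions are made up, and the vote share is only a rough stand-in for the fall probability the abstract mentions, since the authors' exact aggregation rule is not given here.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label and the share of models that chose it.

    On a tie, Counter.most_common keeps the label that was seen first.
    """
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)

# Hypothetical per-model predictions for one person (1 = fall, 0 = no fall).
model_predictions = {
    "logistic_regression": 1,
    "decision_tree": 0,
    "knn": 1,
    "svm": 1,
    "naive_bayes": 0,
}

label, share = majority_vote(list(model_predictions.values()))
print(f"ensemble label = {label}, vote share = {share:.2f}")
```

Using an odd number of base models, as in this example, avoids ties for a two-class problem.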

Big Data Analytics in Healthcare using Machine Learning Algorithms: A Comparative Study

International Journal of Online and Biomedical Engineering (iJOE) ◽

10.3991/ijoe.v16i13.18609 ◽

2020 ◽

Vol 16 (13) ◽

pp. 19

Author(s):

Sai Hanuman Akundi ◽

Soujanya R ◽

Madhuri PM

Keyword(s):

Machine Learning ◽

Big Data ◽

Comparative Study ◽

Data Analytics ◽

Learning Algorithms ◽

Big Data Analytics ◽

Large Data ◽

Heterogeneous Data ◽

Machine Learning Algorithms ◽

Healthcare Sector

In recent years, vast quantities of data have been managed in various medical applications, and multiple organizations worldwide have generated this type of data; together, these heterogeneous data are called big data. Big data is characterized by its quantity, speed, and variety. The healthcare sector, renowned for generating large amounts of heterogeneous data, has faced the need to handle large data from different sources. We can use big data analysis to make proper decisions in the health system by tweaking some of the current machine learning algorithms. If we have a large amount of information in which we want to predict or identify patterns, machine learning would be the way forward. This article presents a brief overview of big data and of the functionality and methods of big data analytics, which play an important role in and significantly affect healthcare information technology. We also present a comparative study of machine learning algorithms. We need to make effective use of all the current machine learning algorithms to anticipate accurate outcomes in the world of nursing.

Developing QSAR Models with Defined Applicability Domains on PPARγ Binding Affinity Using Large Data Sets and Machine Learning Algorithms

Environmental Science & Technology ◽

10.1021/acs.est.0c07040 ◽

2021 ◽

Author(s):

Zhongyu Wang ◽

Jingwen Chen ◽

Huixiao Hong

Keyword(s):

Machine Learning ◽

Binding Affinity ◽

Learning Algorithms ◽

Large Data ◽

Machine Learning Algorithms ◽

Large Data Sets ◽

Data Sets ◽

Qsar Models

Music Genre Classification using Optimized Sequential Neural Network

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/641032021 ◽

2021 ◽

Vol 10 (3) ◽

pp. 1949-1958

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Learning Approaches ◽

Human Beings ◽

Advantages And Disadvantages ◽

Genre Classification ◽

Music Genre ◽

Music Genre Classification

Music makes up a huge portion of the content stored and used over the internet, with several sites and applications developed solely to provide music-related services to their users and customers. Some of the most challenging tasks in this scenario include music classification based on languages and genres, playlist suggestions based on music history, song suggestions based on playlist contents, top genres and songs based on listeners' ratings, likes, number of streams, and song loops, popularity of artists based on the number of songs released per year, hit songs per year, etc. One of the most important stages in solving the above-mentioned challenges is music genre classification. It would be impractical to analyze each and every song in a given database to identify and classify music genres, even though human beings are better at performing such tasks. Hence, useful Machine Learning algorithms and Deep Learning approaches may be used for accomplishing such tasks with ease. A thorough analysis of the different uses of Machine Learning and Deep Learning algorithms, and of the relevance of such algorithms to particular situations, would be made to highlight and contrast the advantages and disadvantages of each approach. The outcomes of the optimized models would be visualized and compared to the expected outcomes for better perception.

Exploring Complementary Models Consisting of Machine Learning Algorithms for Landslide Susceptibility Mapping

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10100639 ◽

2021 ◽

Vol 10 (10) ◽

pp. 639

Author(s):

Han Hu ◽

Changming Wang ◽

Zhu Liang ◽

Ruiyuan Gao ◽

Bailong Li

Keyword(s):

Machine Learning ◽

Land Use Planning ◽

Landslide Susceptibility ◽

Prediction Accuracy ◽

Learning Algorithms ◽

Specific Area ◽

Machine Learning Algorithms ◽

Landslide Susceptibility Mapping ◽

Human Beings ◽

Statistical Measures

Landslides frequently occur because of natural or human factors. Landslides cause huge losses to the economy as well as to human beings every year around the globe. Landslide susceptibility prediction (LSP) plays a key role in the prevention of landslides and has been under investigation for years. Although new machine learning algorithms have achieved excellent performance in terms of prediction accuracy, a sufficient quantity of training samples is essential. However, it is hard to obtain enough landslide samples in most areas, especially at the county level. The present study aims to explore an optimization model in conjunction with conventional unsupervised and supervised learning methods, which performs well with respect to prediction accuracy and comprehensibility. Logistic regression (LR), fuzzy c-means clustering (FCM) and factor analysis (FA) were combined to establish four models: the LR model, FCM coupled with LR, FA coupled with LR, and FCM and FA coupled with LR. These models were applied in a specific area. Firstly, an inventory with 114 landslides and 10 conditioning factors was prepared for modeling. Subsequently, the four models were applied to LSP. Finally, the performance was evaluated and compared by k-fold cross-validation based on statistical measures. The results showed that the model coupling FCM, FA and LR achieved the best performance among these models, with an AUC (area under the curve) value of 0.827, accuracy of 85.25%, sensitivity of 74.96% and specificity of 86.21%. The LR model performed the worst, with an AUC value of 0.736, accuracy of 77%, sensitivity of 62.52% and specificity of 72.55%. It was concluded that both dimension reduction and sample size should be considered in modeling, and that the performance can be enhanced by combining complementary methods. The combination of models should be more flexible and purposeful. This work provides a reference for related research and better guidance for engineering activities, decision-making by local administrations and land use planning.
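The accuracy, sensitivity and specificity figures quoted above are standard functions of a binary confusion matrix. The sketch below shows how they are computed, assuming labels of 1 (landslide) and 0 (no landslide); the label vectors are invented for illustration and are not the study's data.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives), specificity (recall on negatives).

    Assumes both classes appear in y_true, so no denominator is zero.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical labels for ten map cells: 1 = landslide, 0 = no landslide.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Under k-fold cross-validation, these metrics would be computed on each held-out fold and then averaged.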

Decision Tree-based Machine Learning Algorithms to Classify Rice Plant Diseases

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a4753.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 5365-5368

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Rice Plant ◽

Plant Disease ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Plant Diseases ◽

Brown Spot ◽

Human Beings ◽

Data Set

Rice is one of the most important foods on earth for human beings. India and China are two of the countries that depend most on rice. The output of this crop depends on many parameters such as soil, water supply, pesticides used, time duration, and diseases. Rice Plant Disease (RPD) is one of the important factors that decrease the quantity and quality of rice. Identifying the type of rice plant disease and taking corrective action against it in time is always challenging for farmers. Although the rice plant is affected by many diseases, Bacterial Leaf Blight (BLB), Brown Spot (BS), and Leaf Smut (LS) are the major ones. Identifying these diseases is really challenging because the infected leaf has to be inspected by the human eye. So in this paper, we focus on machine learning techniques to identify and classify RPD. We collected infected rice plant data from the UCI Machine Learning repository. The data set consists of 120 images of infected rice plants, of which 40 images are BLB, 40 are BS, and 40 are LS. Experiments are conducted using decision tree-based machine learning algorithms such as RandomForest, REPTree, and J48. To extract numerical features from the infected images, we used the ColourLayoutFilter supported by WEKA. Experimental analysis is done using 65% of the data for training and 35% for testing. The experiments show that the Random Forest algorithm performs exceptionally well in predicting RPD.
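The 65%/35% split described above works out to 78 training images and 42 test images for a 120-image data set. The sketch below shows one way such a split can be made with a shuffled holdout; the `train_test_split` helper, the seed, and the placeholder sample identifiers are assumptions for illustration, since the abstract does not say how WEKA was configured.

```python
import random

def train_test_split(items, train_fraction, seed=0):
    """Shuffle a copy of `items` and cut it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Placeholder samples: (image index, disease label), 40 images per class.
labels = ["BLB"] * 40 + ["BS"] * 40 + ["LS"] * 40
samples = list(enumerate(labels))

train, test = train_test_split(samples, train_fraction=0.65)
print(len(train), len(test))
```

A plain shuffle does not guarantee that each disease keeps the same proportion in both parts; a stratified split would be needed for that.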

Effective prediction on music therapy using hybrid SVM-ANN approach

ITM Web of Conferences ◽

10.1051/itmconf/20213701014 ◽

2021 ◽

Vol 37 ◽

pp. 01014

Author(s):

K Devendran ◽

S K Thangarasu ◽

P Keerthika ◽

R Manjula Devi ◽

B K Ponnarasee

Keyword(s):

Machine Learning ◽

Music Therapy ◽

Learning Algorithms ◽

Blood Pressure Level ◽

Visual Analog Scale Score ◽

Machine Learning Algorithms ◽

Support Vector ◽

Human Beings ◽

Psychological Issues ◽

Medical Disorders

In this world, people are moving at lightning speed. Stress has become a usual thing we experience in our day-to-day routine. Factors like work tension, emotional obstacles, brutality, etc., lead to stress. Many health issues like headaches, heart problems, and depression, as well as psychological issues, arise in human beings due to stress. Music therapy gives qualitative results in balancing physical and psychological issues. Music therapy is an expressive type of art therapy. Many beneficial effects are achieved through music therapy, such as relaxation, maintenance of blood pressure level, cure of medical disorders, stability in mood, and improved memory and sleep. Here we aimed to establish the main predictive factors of relaxation from music listening and to predict music for music therapy using various machine learning algorithms such as Decision Tree, Random Forest, Artificial Neural Network (ANN), Support Vector Machine (SVM) and a hybrid SVM-ANN algorithm. The performance of these different methods is critically examined using the accuracy metric. Various factors like age, gender, education level, music choice, and visual analog scale score before and after listening to music, for both individual and therapist suggestions on music, are considered for prediction. Our study revealed that the SVM-ANN hybrid classifier performs much better than the other machine learning algorithms.
