Personality Classification Experiment by Applying k-Means Clustering

Author(s):  
Assem Talasbek ◽  
Azamat Serek ◽  
Meirambek Zhaparov ◽  
Seong-Moo Yoo ◽  
Young-Kab Kim ◽  
...  

This paper describes a personality classification experiment that applies the k-means clustering machine learning algorithm. Several previous studies have attempted to predict human personality types automatically using various machine learning algorithms. However, only a few of them have obtained good accuracy. To classify a person into a personality type, we used the Jungian Type Inventory. Our method consists of three parts: data collection, data preparation, and hyper-parameter tuning. Our testing results showed that the k-means model has an inertia value of 107, which is a good number for an unsupervised learning model as an interim result. With this result, we divided the data into 16 clusters, which can be considered personality types. We will continue this research by analyzing larger data to be collected in the future.
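
As a rough illustration of the inertia value mentioned in this abstract, the following is a minimal pure-Python sketch of k-means. It is not the authors' implementation: the function names, the toy points, and the choice of k = 2 are assumptions made for the example (the paper uses 16 clusters on questionnaire data). Inertia is taken here as the within-cluster sum of squared distances to the nearest centroid, which is the usual definition.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100, seed=0):
    """Plain Lloyd-style k-means; returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as a start
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    # Inertia: sum of squared distances from each point to its centroid.
    inertia = sum(
        squared_distance(p, centroids[label])
        for p, label in zip(points, labels)
    )
    return labels, centroids, inertia


# Toy data (hypothetical): two small, well-separated groups.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids, inertia = kmeans(toy_points, k=2)
print(labels, round(inertia, 3))
```

Inertia always decreases as k grows, so a single value such as 107 is only meaningful relative to the data scale and the number of clusters.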

Author(s):  
Virendra Tiwari ◽  
Balendra Garg ◽  
Uday Prakash Sharma

Machine learning algorithms are capable of managing multi-dimensional data in dynamic environments. Despite these vital features, there are some challenges to overcome. Machine learning algorithms still require additional mechanisms or procedures for predicting a large number of new classes while managing privacy. These deficiencies show that the reliable use of a machine learning algorithm relies on human experts, because raw data may complicate the learning process and lead to inaccurate results. Interpreting outcomes therefore requires expertise in machine learning mechanisms, which is a significant challenge. Machine learning techniques suffer from issues of high dimensionality, adaptability, distributed computing, scalability, streaming data, and duplicity. The main issue of machine learning algorithms is their vulnerability in managing errors. Furthermore, machine learning techniques are also found to lack variability. This paper studies how the computational complexity of machine learning algorithms can be reduced by finding how to make predictions using an improved algorithm.


Author(s):  
Sergey Pronin ◽  
Mykhailo Miroshnichenko

A system for analyzing large data sets using machine learning algorithms


2021 ◽  
Vol 23 (11) ◽  
pp. 749-758
Author(s):  
Saranya N ◽  
Kavi Priya S ◽  

Breast cancer is one of the chronic diseases affecting human beings throughout the world. Early detection of this disease is the most promising way to improve patients' chances of survival. The strategy employed in this paper is to select the best features from various breast cancer datasets using a genetic algorithm and then apply a machine learning algorithm to predict the outcomes. Two machine learning algorithms, Support Vector Machines (SVM) and Decision Tree (DT), are used along with the Genetic Algorithm (GA). The proposed work is evaluated on five datasets: the Wisconsin Breast Cancer-Diagnosis Dataset, the Wisconsin Breast Cancer-Original Dataset, the Wisconsin Breast Cancer-Prognosis Dataset, the ISPY1 Clinical trial Dataset, and the Breast Cancer Dataset. The results show that SVM-GA achieves a higher accuracy (98.16%) than DT-GA (97.44%).
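
The abstract does not say how the genetic algorithm is configured, so the following is only a generic sketch of GA-based feature selection over binary feature masks. Tournament selection, one-point crossover, bit-flip mutation, elitism, and every parameter value are assumptions. The `toy_fitness` function is a hypothetical stand-in: in a setup like the one described, the fitness of a mask would instead be the validation accuracy of an SVM or decision tree trained on the selected features.

```python
import random


def tournament(population, fitness, rng):
    """Pick two distinct individuals at random and return the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def genetic_feature_selection(n_features, fitness, pop_size=20,
                              generations=30, mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(n_features)]
        for _ in range(pop_size)
    ]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_gen = [best[:]]  # elitism: the best mask always survives
        while len(next_gen) < pop_size:
            p1 = tournament(population, fitness, rng)
            p2 = tournament(population, fitness, rng)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with a small per-gene probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)
    return best


# Hypothetical fitness: reward three "informative" features and
# penalize mask size. A real fitness would train and score a classifier.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)


print(genetic_feature_selection(8, toy_fitness))
```

Because each fitness evaluation in a real wrapper retrains a classifier, the population size and generation count largely determine the running time.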


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Utkarsh Saxena ◽  
Soumen Moulik ◽  
Soumya Ranjan Nayak ◽  
Thomas Hanne ◽  
Diptendu Sinha Roy

We attempt to predict the accidental fall of human beings due to sudden abnormal changes in their health parameters such as blood pressure, heart rate, and sugar level. In medical terminology, this problem is known as Syncope. The primary motivation is to prevent such falls by predicting abnormal changes in these health parameters that might trigger a sudden fall. We apply various machine learning algorithms such as logistic regression, a decision tree classifier, a random forest classifier, K-Nearest Neighbours (KNN), a support vector machine, and a naive Bayes classifier on a relevant dataset and verify our results with the cross-validation method. We observe that the KNN algorithm provides the best accuracy in predicting such a fall. However, the accuracy results of some other algorithms are also very close. Thus, we move one step further and propose an ensemble model, Majority Voting, which aggregates the prediction results of multiple machine learning algorithms and finally indicates the probability of a fall that corresponds to a particular human being. The proposed ensemble algorithm yields 87.42% accuracy, which is greater than the accuracy provided by the KNN algorithm.
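
The majority-voting idea in this abstract can be sketched as follows. This is a simplified illustration and not the authors' model: it assumes binary labels (1 = fall, 0 = no fall), equal weight for every classifier, and treats the share of models voting "fall" as a rough fall probability. The per-model predictions are made-up values.

```python
def majority_vote(model_predictions):
    """Combine binary predictions from several models.

    model_predictions: list of lists, one inner list per model,
    each holding 0/1 labels for the same samples in the same order.
    Returns one (label, fall_share) pair per sample.
    """
    combined = []
    for votes in zip(*model_predictions):
        fall_share = sum(votes) / len(votes)  # fraction voting 1
        label = 1 if fall_share > 0.5 else 0
        combined.append((label, fall_share))
    return combined


# Hypothetical outputs of three classifiers on four samples.
knn_pred = [1, 0, 1, 0]
svm_pred = [1, 1, 0, 0]
logreg_pred = [0, 1, 1, 0]

for label, share in majority_vote([knn_pred, svm_pred, logreg_pred]):
    print(label, round(share, 2))
```

Using an odd number of models avoids ties in the binary case; with an even number, the `> 0.5` rule above resolves a tie as "no fall".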


Author(s):  
Sai Hanuman Akundi ◽  
Soujanya R ◽  
Madhuri PM

In recent years, vast quantities of data have been managed in various ways in medical applications, and multiple organizations worldwide have generated this type of data; together, these heterogeneous data are called big data. Big data is characterized by its quantity, speed, and variety. The healthcare sector, renowned for generating large amounts of heterogeneous data, has faced the need to handle large data from different sources. We can use Big Data analysis to make proper decisions in the health system by tweaking some of the current machine learning algorithms. If we have a large amount of data in which we want to make predictions or identify patterns, machine learning would be the way forward. In this article, a brief overview of Big Data and of the functionality and methods of Big Data analytics is presented; these play an important role in healthcare information technology and affect it significantly. In this paper, we also present a comparative study of machine learning algorithms. We need to make effective use of all the current machine learning algorithms to anticipate accurate outcomes in the world of nursing.


Music makes up a huge portion of the content stored and used over the internet, with several sites and applications developed solely to provide music-related services to their users and customers. Some of the most challenging tasks in this scenario would include music classification based on languages and genres, playlist suggestions based on music history, song suggestions based on playlist contents, top genres or songs based on listeners' ratings, likes, number of streams, and song loops, popularity of artists based on number of songs released per year, hit songs per year, etc. One of the most important stages in solving the above-mentioned challenges would be music genre classification. It would be impractical to analyze each and every song in a given database to identify and classify music genres, even though human beings are better at performing such tasks. Hence, Machine Learning algorithms and Deep Learning approaches may be used to accomplish such tasks with ease. A thorough analysis of the different uses of Machine Learning and Deep Learning algorithms, and of their relevance to different situations, would be made to highlight and contrast the advantages and disadvantages of each approach. The outcomes of the optimized models would be visualized and compared to the expected outcomes for better perception.


2021 ◽  
Vol 10 (10) ◽  
pp. 639
Author(s):  
Han Hu ◽  
Changming Wang ◽  
Zhu Liang ◽  
Ruiyuan Gao ◽  
Bailong Li

Landslides frequently occur because of natural or human factors. Every year, landslides cause huge losses to the economy as well as to human beings around the globe. Landslide susceptibility prediction (LSP) plays a key role in the prevention of landslides and has been under investigation for years. Although new machine learning algorithms have achieved excellent performance in terms of prediction accuracy, a sufficient quantity of training samples is essential. However, it is hard to obtain enough landslide samples in most areas, especially at the county level. The present study aims to explore an optimization model that combines conventional unsupervised and supervised learning methods and performs well with respect to prediction accuracy and comprehensibility. Logistic regression (LR), fuzzy c-means clustering (FCM), and factor analysis (FA) were combined to establish four models, which were applied in a specific area: the LR model, FCM coupled with LR, FA coupled with LR, and FCM and FA coupled with LR. First, an inventory with 114 landslides and 10 conditioning factors was prepared for modeling. Subsequently, the four models were applied to LSP. Finally, the performance was evaluated and compared by k-fold cross-validation based on statistical measures. The results showed that the model coupling FCM, FA, and LR achieved the best performance among these models, with an AUC (area under the curve) value of 0.827, accuracy of 85.25%, sensitivity of 74.96%, and specificity of 86.21%, while the LR model performed the worst, with an AUC value of 0.736, accuracy of 77%, sensitivity of 62.52%, and specificity of 72.55%. It was concluded that both dimension reduction and sample size should be considered in modeling, and that performance can be enhanced by combining complementary methods. The combination of models should be more flexible and purposeful.
This work provides a reference for related research and better guidance for engineering activities, decision-making by local administrations, and land use planning.
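
The evaluation measures named in this abstract (accuracy, sensitivity, specificity) and the k-fold splitting scheme can be sketched in a few lines. This is an illustrative sketch under simple assumptions, not the study's evaluation code: labels are binary with 1 as the positive (landslide) class, folds are contiguous and unshuffled, and the sample labels are made up.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0)
             for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical labels and predictions for ten samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(binary_metrics(y_true, y_pred))

for train, test in k_fold_indices(10, 3):
    print(len(train), test)
```

In a cross-validated study, a model would be fitted on each `train` split, scored with `binary_metrics` on the matching `test` split, and the per-fold values averaged.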


Rice is one of the most important foods on earth for human beings. India and China are two of the countries that depend most on rice. The output of this crop depends on many parameters, such as soil, water supply, pesticides used, time duration, and infecting diseases. Rice Plant Disease (RPD) is one of the important factors that decrease the quantity and quality of rice. Identifying the type of rice plant disease and taking corrective action against it in time is always challenging for farmers. Although the rice plant is affected by many diseases, Bacterial Leaf Blight (BLB), Brown Spot (BS), and Leaf Smut (LS) are the major ones. Identifying these diseases is really challenging because the infected leaf has to be examined by the human eye. So in this paper, we focus on machine learning techniques to identify and classify RPD. We collected infected rice plant data from the UCI Machine Learning repository. The data set consists of 120 images of infected rice plants, of which 40 are BLB, 40 are BS, and 40 are LS. Experiments are conducted using decision tree-based machine learning algorithms such as RandomForest, REPTree, and J48. To extract numerical features from the infected images, we used the ColourLayoutFilter supported by WEKA. Experimental analysis is done using 65% of the data for training and 35% for testing. The experiments show that the Random Forest algorithm is exceptional in predicting RPD.
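
For the 65%/35% split described in this abstract, a minimal holdout split might look like the sketch below. The experiments were run in WEKA, whose own percentage split may shuffle and round differently, so the function name, the seed, and the placeholder file names here are assumptions for illustration only.

```python
import random


def holdout_split(items, train_fraction=0.65, seed=0):
    """Shuffle a copy of items and cut it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder names for the 120 images (40 per disease class).
images = [f"{cls}_{i:02d}" for cls in ("BLB", "BS", "LS") for i in range(40)]
train, test = holdout_split(images)
print(len(train), len(test))
```

A plain shuffle does not guarantee equal class proportions in both parts; with only 40 images per class, a stratified split would keep the three diseases balanced.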


2021 ◽  
Vol 37 ◽  
pp. 01014
Author(s):  
K Devendran ◽  
S K Thangarasu ◽  
P Keerthika ◽  
R Manjula Devi ◽  
B K Ponnarasee

In this world, people are moving at lightning speed. Stress has become a usual part of our day-to-day routine. Factors such as work tension, emotional obstacles, and brutality lead to stress. Many health issues, such as headaches, heart problems, and depression, as well as psychological issues, arise in human beings due to stress. Music therapy gives qualitative results in balancing physical and psychological issues. Music therapy is an expressive type of art therapy. It has many beneficial effects, such as relaxation, maintaining blood pressure levels, relief from medical disorders, mood stability, and improved memory and sleep. Here we aimed to establish the main predictive factors of relaxation from music listening and to predict music for music therapy using various machine learning algorithms such as Decision Tree, Random Forest, Artificial Neural Network (ANN), Support Vector Machine (SVM), and a hybrid SVM-ANN algorithm. These methods are critically compared using the accuracy performance metric. Various factors such as age, gender, education level, music choice, and visual analog scale score before and after listening to music, for both individual and therapist suggestions on music, are considered for prediction. Our study revealed that the SVM-ANN hybrid classifier performs much better than the other machine learning algorithms.

