Prophecy on Programming Language using Machine Learning Algorithms

Komal Bhaskar Thube

doi:10.22214/ijraset.2021.35746

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35746 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 3699-3706

Author(s):

Komal Bhaskar Thube

Keyword(s):

Machine Learning ◽

Programming Language ◽

Machine Learning Algorithms ◽

Support Vector ◽

Computer Language ◽

Decision Tree Classifier ◽

Development Environment ◽

Tree Classifier ◽

Develop Software ◽

Neighbor Classifier

A programming language is a computer language that developers use to write software programs, scripts, or other sets of instructions for computers to execute. It is difficult to determine which programming language is most widely used. In this work, I analyze and compare the classification results of several machine learning models to find out which programming language is most widely used by developers. I use a Support Vector Machine (SVM), a K-Neighbors classifier (KNN), and a Decision Tree Classifier (CART) for the comparative study. The task is to analyze different data and classify them, assessing the efficiency of each algorithm in terms of accuracy, precision, recall, and F1 score. The best accuracy, 94.29%, was obtained using SVM. These techniques are coded in Python and executed in Jupyter Notebook, the Scientific Python Development Environment. The experiments show that SVM performs best for predictive analysis and is the most suitable of the three algorithms for predicting the most widely used programming language.
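The four evaluation measures named in this abstract can be illustrated with a minimal sketch. It assumes a binary problem with one positive class for simplicity; the function name `binary_metrics` and the label lists are made up for illustration and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels, for illustration only (not data from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

A multi-class study such as the one described would typically compute these per class and then average them; that averaging step is not shown here.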

Ensemble-Based Machine Learning for Predicting Sudden Human Fall Using Health Data

Mathematical Problems in Engineering ◽

10.1155/2021/8608630 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Utkarsh Saxena ◽

Soumen Moulik ◽

Soumya Ranjan Nayak ◽

Thomas Hanne ◽

Diptendu Sinha Roy

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Majority Voting ◽

Support Vector ◽

Human Beings ◽

Medical Terminology ◽

Decision Tree Classifier ◽

Tree Classifier ◽

Health Parameters

We attempt to predict the accidental fall of human beings due to sudden abnormal changes in their health parameters such as blood pressure, heart rate, and sugar level. In medical terminology, this problem is known as Syncope. The primary motivation is to prevent such falls by predicting abnormal changes in these health parameters that might trigger a sudden fall. We apply various machine learning algorithms such as logistic regression, a decision tree classifier, a random forest classifier, K-Nearest Neighbours (KNN), a support vector machine, and a naive Bayes classifier on a relevant dataset and verify our results with the cross-validation method. We observe that the KNN algorithm provides the best accuracy in predicting such a fall. However, the accuracy results of some other algorithms are also very close. Thus, we move one step further and propose an ensemble model, Majority Voting, which aggregates the prediction results of multiple machine learning algorithms and finally indicates the probability of a fall that corresponds to a particular human being. The proposed ensemble algorithm yields 87.42% accuracy, which is greater than the accuracy provided by the KNN algorithm.
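The majority-voting idea can be sketched as follows. This is a minimal hard-voting illustration, not the authors' implementation: the function name `majority_vote` and the per-model predictions are hypothetical, and the abstract does not say whether the paper uses hard or probability-weighted voting.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample (hard voting)."""
    combined = []
    # zip(*...) groups the votes that all models cast for the same sample.
    for votes in zip(*predictions_per_model):
        # most_common(1) picks the label with the highest count; on a tie the
        # label seen first (from the earliest model in the list) is returned.
        label, _ = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical predictions from three classifiers (1 = fall, 0 = no fall).
knn = [1, 0, 1, 0]
svm = [1, 1, 0, 0]
tree = [0, 1, 1, 0]
ensemble = majority_vote([knn, svm, tree])
print(ensemble)
```

With an odd number of models and two classes, ties cannot occur. The share of models voting for a fall could also serve as a rough probability estimate.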

Human Activity Recognition using Deep and Machine Learning Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.c8835.029420 ◽

2020 ◽

Vol 9 (4) ◽

pp. 2460-2466 ◽

Cited By ~ 1

Keyword(s):

Machine Learning ◽

Activity Recognition ◽

Human Activity ◽

Multilayer Perceptron ◽

Human Activity Recognition ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Decision Tree Classifier ◽

Tree Classifier

Human activity recognition is an active research challenge with applications in numerous fields, such as medical health care, the military, manufacturing, assistive techniques, and gaming. With advances in technology, the use of smartphones in daily life has become inevitable. The sensors in smartphones help us measure essential vital parameters. These measurements enable us to monitor human activities, a task known as human activity recognition. In this paper, we propose an automatic human activity recognition system that independently recognizes human actions. Four deep learning approaches and thirteen machine learning classifiers, including Multilayer Perceptron, Random Forest, Support Vector Machine, Decision Tree Classifier, AdaBoost Classifier, and Gradient Boosting Classifier, are applied to identify the most efficient classifier for human activity recognition. The proposed system recognizes activities such as Laying, Sitting, Standing, Walking, Walking downstairs, and Walking upstairs. A benchmark dataset is used to evaluate all the implemented classifiers. We investigated all these classifiers to identify the most suitable one for this dataset. The results show that the Multilayer Perceptron achieves an overall accuracy of 98.46% in detecting the activities. The second-best performance was observed when the classifiers were combined.

Development of Computer Vision System for Fruits

Current Journal of Applied Science and Technology ◽

10.9734/cjast/2021/v40i3631576 ◽

2021 ◽

pp. 1-11

Author(s):

Y. Dileep Sean ◽

D.D. Smith ◽

V.S.P. Bitra ◽

Vimala Bera ◽

Sk. Nafeez Umar

Keyword(s):

Machine Learning ◽

Computer Vision ◽

Fruit Quality ◽

Vision System ◽

Machine Learning Algorithms ◽

Classification Model ◽

Support Vector ◽

Small Scale ◽

Decision Tree Classifier ◽

Tree Classifier

Automated defect detection in fruits using computer vision and machine learning has become a significant area of research. In this work, a working hardware prototype of a conveyor connected to a PC is designed, constructed, and implemented to analyze fruit quality. The prototype consists of low-cost microcontrollers, a USB camera, and a MATLAB user interface. The automated classification model accepts or rejects each fruit based on its quality, i.e., good (ripe, unripe) or bad. For the classification of fruit quality, machine learning algorithms such as Support Vector Machine, KNN, Random Forest classifier, Decision Tree classifier, and ANN are used. The dataset used in this work covers the following fruit varieties: apple, orange, tomato, guava, lemon, and pomegranate. We trained, tested, and compared the performance of these five machine learning approaches and found that ANN-based fruit detection performs best. The overall accuracy obtained by the ANN model on the dataset is 95.6%. In addition, the response time of the system is 50 seconds per fruit, which is very low. The system is therefore well suited to small-scale industries and farmers looking to grow their business.

Heart Disease Prediction using Machine Learning

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f9780.059120 ◽

2020 ◽

Vol 9 (1) ◽

pp. 700-704

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Machine Learning Techniques ◽

Support Vector ◽

Disease Prediction ◽

Nearest Neighbour ◽

Decision Tree Classifier ◽

Support Vector Classifier ◽

Learning Techniques ◽

Tree Classifier

This work derives methodologies to detect heart problems at an early stage and to inform patients so that they can improve their health. To address this problem, we use machine learning techniques to predict the incidence of heart disease at an early stage. We use certain parameters, such as age, sex, height, weight, case history, smoking, and alcohol consumption, together with tests such as blood pressure, cholesterol, diabetes, ECG, and ECHO, for prediction. Many machine learning algorithms can be used to solve this problem. The algorithms considered include K-Nearest Neighbour, support vector classifier, decision tree classifier, logistic regression, and Random Forest classifier. Using these parameters and algorithms, we predict whether or not the patient has heart disease and recommend that the patient improve his or her health.

Analyzing Behavior of Cancer Patients using Machine Learning Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i8414.078919 ◽

2019 ◽

Vol 8 (9) ◽

pp. 1547-1556

Keyword(s):

Machine Learning ◽

Natural Language ◽

Cancer Patients ◽

Language Processing ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Operating Characteristics ◽

Decision Tree Classifier ◽

Tree Classifier

Online discussion forums and blogs are vibrant platforms where cancer patients express their views in the form of stories. These stories sometimes become a source of inspiration for patients who are anxiously searching for similar cases. This paper proposes a method that uses natural language processing and machine learning to analyze unstructured text gathered from patients' reviews and stories. The proposed methodology aims to identify the behavior, emotions, side effects, decisions, and demographics associated with cancer patients. The pre-processing phase involves extracting web text, cleaning it by omitting some special characters and symbols, and finally tagging it with the POS (Parts of Speech) tagger from NLTK (Natural Language Toolkit). The post-processing phase trains seven machine learning classifiers (refer to Table 6). The Decision Tree classifier shows the highest precision (0.83) among the classifiers, while the Support Vector Machine (SVM) classifier has the highest area under the receiver operating characteristic curve (AUC), at 0.98.

QSAR Models for Active Substances Against Pseudomonas aeruginosa Using Disk-diffusion Test Data

10.20944/preprints202102.0147.v1 ◽

2021 ◽

Author(s):

Cosmin Alexandru Bugeac ◽

Robert Ancuceanu ◽

Mihaela Dinu

Keyword(s):

Pseudomonas Aeruginosa ◽

Model Development ◽

Qsar Model ◽

Machine Learning Algorithms ◽

Disk Diffusion ◽

Support Vector ◽

Decision Tree Classifier ◽

K Nearest Neighbors ◽

Disk Diffusion Test ◽

Tree Classifier

Pseudomonas aeruginosa is a Gram-negative bacillus included among the six "ESKAPE" microbial species, which have an outstanding ability to "escape" currently used antibiotics; developing new antibiotics against it is of the highest priority. Whereas minimum inhibitory concentration (MIC) values against Pseudomonas aeruginosa have previously been used for QSAR model development, disk diffusion results (inhibition zones) have apparently not been used for this purpose in the literature, so we decided to explore their use. We developed multiple QSAR models using several machine learning algorithms (Support Vector Classifier, K Nearest Neighbors, Random Forest Classifier, Decision Tree Classifier, AdaBoost Classifier, Logistic Regression, and Naive Bayes Classifier). The main descriptors used in building the models belonged to the families of adjacency matrix, constitutional descriptors, first highest eigenvalue of Burden matrix, centered Moreau-Broto autocorrelation, and averaged and centered Moreau-Broto autocorrelation descriptors. A total of 32 models were built, of which 28 were selected and stacked to create a meta-model. In terms of balanced accuracy, the best performance was provided by the KNN, SVM, and AdaBoost algorithms, but the ensemble method had slightly superior results in nested cross-validation.
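Balanced accuracy, the comparison measure mentioned in this abstract, is the mean of the per-class recalls. The sketch below is a minimal pure-Python illustration under that standard definition; the function name `balanced_accuracy` and the labels are hypothetical and unrelated to the paper's data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        correct = sum(1 for t, p in zip(y_true, y_pred)
                      if t == cls and p == cls)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)


# Made-up imbalanced labels: eight "active" (1) and two "inactive" (0).
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(plain, balanced)
```

On imbalanced data the two numbers diverge: here one error on the minority class costs little plain accuracy but halves that class's recall.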

Clustering Visualization and Class Prediction using Flask of Benchmark Dataset for Unsupervised Techniques in Machine learning

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.g5943.059720 ◽

2020 ◽

Vol 9 (7) ◽

pp. 1297-1302 ◽

Cited By ~ 1

Keyword(s):

Machine Learning ◽

Dimensionality Reduction ◽

Support Vector ◽

Decision Tree Classifier ◽

Class Prediction ◽

Linear Discriminant ◽

Reduction Techniques ◽

Tree Classifier ◽

Dimensionality Reduction Techniques ◽

Clustering Pattern

Recent advances have brought greater value to Artificial Intelligence (AI) and Machine Learning (ML), which are rapidly attracting interest across many kinds of research. Clustering and dimensionality reduction techniques are among the trending methods in machine learning. Clustering techniques such as K-means and Hierarchical clustering are used to assign data points to groups (clusters). Clustering can be used in recommendation systems, in analyzing users of social media platforms, in categorizing patients with particular diseases by age group, and so on. Dimensionality reduction methods such as Principal Component Analysis and Linear Discriminant Analysis resemble clustering in some respects, but they reduce the size of the data before the clusters are plotted. In this paper, a comparative and predictive analysis is carried out on three datasets from the UCI Machine Learning benchmark, namely IRIS, Wine, and Seed, using four distinct techniques. The class prediction analysis of the datasets is done using a Flask app. The main aim is to form a good clustering pattern for each dataset with each technique. The experimental analysis calculates the accuracy of the resulting clusters using different machine learning classifiers, namely Logistic Regression, K-nearest neighbors, Support Vector Machine, Gaussian Naïve Bayes, Decision Tree Classifier, and Random Forest Classifier. Cohen's Kappa is used as a further accuracy indicator to compare the classification results. It is observed that K-means and Hierarchical clustering provide a better clustering pattern of the input dataset than the dimensionality reduction techniques. The clustering pattern is well formed with all the techniques. The KNN classifier provides improved accuracy with all the techniques on the datasets.
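Cohen's Kappa, used here as a second accuracy indicator, corrects observed agreement for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch of that standard formula; the function name `cohen_kappa` and the three-class labels are illustrative assumptions, not values from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two equally long label sequences."""
    n = len(labels_a)
    # p_o: fraction of samples on which the two labelings agree.
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    # p_e: chance agreement from the marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    # Undefined (division by zero) when expected agreement is exactly 1.
    return (observed - expected) / (1 - expected)


# Made-up true classes vs. predicted classes for six samples.
true_classes = [0, 0, 1, 1, 2, 2]
predicted = [0, 0, 1, 2, 2, 2]
kappa = cohen_kappa(true_classes, predicted)
print(round(kappa, 4))
```

In this example the raw agreement is 5/6, but kappa is lower because a third of that agreement would be expected by chance alone.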

Sentiment Analysis on Social Media Big Data With Multiple Tweet Words

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.j9684.0881019 ◽

2019 ◽

Vol 8 (10) ◽

pp. 3429-3434 ◽

Cited By ~ 2

Keyword(s):

Machine Learning ◽

Social Media ◽

Big Data ◽

Sentiment Analysis ◽

Language Processing ◽

Sentiment Classification ◽

Support Vector ◽

Decision Tree Classifier ◽

Machine Learning Classification ◽

Tree Classifier

The main objective of this paper is to analyze reviews of e-commerce products in social media big data. The analysis gives online shoppers useful results about product quality and gives businesses useful decision-making input about the products that customers most like and buy. It covers all features or opinion words available in tweets, such as capitalized words, sequences of repeated letters, emoji, slang words, exclamatory words, intensifiers, modifiers, conjunctions, and negation words. Existing work has considered only two or three features when performing sentiment analysis with the machine learning technique of Natural Language Processing (NLP). In the proposed work, familiar machine learning classification models, namely Multinomial Naïve Bayes, Support Vector Machine, Decision Tree Classifier, and Random Forest Classifier, are used for sentiment classification. The sentiment classification serves as a decision support system for both customers and businesses.
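A few of the tweet features listed in this abstract can be counted with simple pattern matching. The sketch below is only one plausible reading: it assumes "capitalized words" means words written entirely in upper case and "repeated letters" means three or more of the same character in a row. The function name `tweet_features`, the `NEGATIONS` set, and the sample tweet are all hypothetical.

```python
import re

# Hypothetical, deliberately tiny negation list for illustration.
NEGATIONS = {"not", "no", "never"}


def tweet_features(text):
    """Count a handful of surface features in one tweet."""
    tokens = re.findall(r"[A-Za-z']+", text)
    return {
        # All-caps words longer than one letter, e.g. "LOVE" but not "I".
        "capitalized_words": sum(
            1 for t in tokens if len(t) > 1 and t.isupper()),
        # Any character repeated three or more times in a row, e.g. "sooo".
        "elongated_words": sum(
            1 for t in tokens if re.search(r"(.)\1{2,}", t)),
        "exclamations": text.count("!"),
        # Listed negation words plus contractions ending in "n't".
        "negations": sum(
            1 for t in tokens
            if t.lower() in NEGATIONS or t.lower().endswith("n't")),
    }


features = tweet_features(
    "I LOVE this phone!!! It is sooo good, not bad at all")
print(features)
```

Counts like these could be appended to bag-of-words features before training a classifier; emoji, slang, and intensifier handling would need their own lexicons and are not shown.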

Effect of Data Scaling Methods on Machine Learning Algorithms and Model Performance

Technologies ◽

10.3390/technologies9030052 ◽

2021 ◽

Vol 9 (3) ◽

pp. 52

Author(s):

Md Manjurul Ahsan ◽

M. A. Parvez Mahmud ◽

Pritom Kumar Saha ◽

Kishor Datta Gupta ◽

Zahed Siddique

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Missing Data ◽

Feature Reduction ◽

Machine Learning Algorithms ◽

Mixed Data ◽

Support Vector ◽

Tree Classifier ◽

Data Scaling ◽

Scaling Methods

Heart disease, one of the main causes of high mortality around the world, requires a sophisticated and expensive diagnosis process. In the recent past, much of the literature has presented machine learning approaches as an opportunity to diagnose heart disease patients efficiently. However, challenges associated with datasets, such as missing data, inconsistent data, and mixed data (containing inconsistent missing data both as numerical and categorical), are often obstacles in medical diagnosis. This inconsistency leads to a higher probability of misprediction and misleading results. Data preprocessing steps such as feature reduction, data conversion, and data scaling are employed to form a standard dataset; such measures play a crucial role in reducing inaccuracy in the final prediction. This paper aims to evaluate eleven machine learning (ML) algorithms, namely Logistic Regression (LR), Linear Discriminant Analysis (LDA), K-Nearest Neighbors (KNN), Classification and Regression Trees (CART), Naive Bayes (NB), Support Vector Machine (SVM), XGBoost (XGB), Random Forest Classifier (RF), Gradient Boost (GB), AdaBoost (AB), and Extra Tree Classifier (ET), and six data scaling methods, namely Normalization (NR), Standscale (SS), MinMax (MM), MaxAbs (MA), Robust Scaler (RS), and Quantile Transformer (QT), on a dataset comprising information on patients with heart disease. The results show that CART, along with RS or QT, outperforms all other ML algorithms with 100% accuracy, 100% precision, 99% recall, and 100% F1 score. The study outcomes demonstrate that a model's performance varies depending on the data scaling method.
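Three of the scaling methods named in this abstract can be written out in a few lines. This is a minimal sketch of the usual textbook definitions applied to a single feature column; the function names and the sample values are illustrative assumptions, and the sketch does not handle constant columns (which would divide by zero).

```python
from statistics import fmean, pstdev


def min_max_scale(values):
    """MinMax: map the column linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Standard scaling: zero mean, unit (population) standard deviation."""
    mean, std = fmean(values), pstdev(values)
    return [(v - mean) / std for v in values]


def max_abs_scale(values):
    """MaxAbs: divide by the largest absolute value, keeping the sign."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values]


# Made-up ages standing in for one numeric feature of a patient table.
ages = [20.0, 40.0, 60.0, 80.0]
mm = min_max_scale(ages)
ss = standard_scale(ages)
ma = max_abs_scale(ages)
print(mm, ss, ma)
```

Distance-based models such as KNN and SVM are sensitive to these choices because the scale of each feature determines its weight in the distance computation.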

Prediction of Students’ Performance based on Academic, Behaviour, Extra and Co-Curricular Activities

Webology ◽

10.14704/web/v18si01/web18058 ◽

2021 ◽

Vol 18 (Special Issue 01) ◽

pp. 262-279

Author(s):

T. Jenitha ◽

S. Santhi ◽

J. Monisha Privthy Jeba

Keyword(s):

Extracurricular Activities ◽

Family Background ◽

Machine Learning Algorithms ◽

Support Vector ◽

Academic Institutions ◽

Physical And Mental Health ◽

Decision Tree Classifier ◽

Academic Scholarship ◽

Tree Classifier ◽

Training Programmes

Academic institutions hold huge volumes of data about students, such as academic scores, scores in co-curricular and extracurricular activities, family annual income, family background, and other supporting documents, so predicting an individual student's performance in all aspects manually is a difficult task. The proposed work uses data mining techniques to identify students who are eligible for scholarships and other benefits. Students are classified into different categories on the basis of academic, behavior, extra-curricular, and co-curricular activities. Machine learning algorithms such as Naive Bayes, Decision Tree Classifier, and Support Vector Machine are used to predict student performance. With the help of the proposed model, parents and instructors can monitor a student's performance and provide essential technical and moral support. The model also helps in providing academic scholarships and training to students, to support them financially and enrich their knowledge. It suggests that academic institutions organize induction or training programmes at the beginning of the semester. Technical training, motivational talks, yoga, and similar activities are organized by the institutions with students' physical and mental health in mind. On e-learning platforms, huge volumes of data and a plethora of information are generated. In this work, various learning models are constructed and their accuracies are compared to determine which algorithm performs best.
