Supervised Learning Applied to Graduation Forecast of Industrial Engineering Students

<p style="text-align: justify;">This article aims to develop a machine-learning algorithm that predicts students' graduation from the Industrial Engineering course at the Federal University of Amazonas based on their performance data. The methodology uses a dataset of 364 students admitted between 2007 and 2019 and considers characteristics that can directly or indirectly affect each student's graduation: type of high school, number of semesters taken, grade-point average, lockouts, dropouts and course terminations. Data treatment included the manual removal of several characteristics that did not add value to the output of the algorithm, resulting in a dataset of 2184 instances. Logistic regression, MLP and XGBoost models were developed and compared; each predicts a binary output of graduation or non-graduation for each student, using 70% of the dataset for training and 30% for testing. It was thus possible to identify a relationship between the six attributes explored and to achieve, with the best model, 94.15% accuracy.</p>
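The 70/30 train/test workflow described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the data are synthetic, the two features (`gpa`, `dropouts`) and the labelling rule are made up, and logistic regression is fitted with a hand-written stochastic gradient descent loop rather than any particular library.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train_logistic(X, y, lr=0.5, epochs=200):
    """Fit weights and bias with stochastic gradient descent on the log-loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, xi):
    """Return 1 (graduates) or 0 (does not graduate)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else 0

# Synthetic stand-in data, NOT the study's records: two made-up features
# scaled to [0, 1], with a made-up rule deciding the label.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(364)]
y = [1 if gpa > dropouts else 0 for gpa, dropouts in X]

# 70% of the rows for training, 30% for testing.
indices = list(range(len(X)))
rng.shuffle(indices)
cut = int(0.7 * len(indices))
train_idx, test_idx = indices[:cut], indices[cut:]

w, b = train_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
correct = sum(predict(w, b, X[i]) == y[i] for i in test_idx)
accuracy = correct / len(test_idx)
print(f"test accuracy: {accuracy:.2%}")
```

The accuracy printed here reflects only the synthetic rule above and says nothing about the 94.15% reported in the abstract.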

An Overview of the RELIEF Algorithm and Advancements

Statistical Approaches to Gene X Environment Interactions for Complex Phenotypes ◽

10.7551/mitpress/9780262034685.003.0006 ◽

2016 ◽

Author(s):

Alexandre Todorov

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Learning Algorithm ◽

Simulated Data ◽

Environment Interaction ◽

Machine Learning Algorithm ◽

Gene Environment Interaction ◽

SNP Data ◽

Relief Algorithm ◽

Gene Environment

The aim of the RELIEF algorithm is to identify the features (e.g., genes, environmental factors) that are relevant to a trait of interest, starting from a set that may include thousands of irrelevant features. Though widely used in many fields, its application to the study of gene-environment interactions has been limited thus far. We provide here an overview of this machine learning algorithm and some of its variants. Using simulated data, we then compare the performance of RELIEF to that of logistic regression in screening for gene-environment interactions in SNP data. Even though performance degrades in larger sets of markers, RELIEF remains a competitive alternative to logistic regression and shows clear promise as a tool for the study of gene-environment interactions. Areas for further improvement of the algorithm are then suggested.
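As a rough illustration of how a Relief-style filter scores features, the sketch below implements the basic two-class update in its commonly presented form: for a sampled instance, each feature's weight is lowered by its difference from the nearest same-class neighbour (the "hit") and raised by its difference from the nearest other-class neighbour (the "miss"). It assumes numeric features already scaled to [0, 1] and Manhattan distance; the toy data are made up and are not SNP data, and the variants discussed in the chapter may differ.

```python
import random

def relief(X, y, n_iter=None, seed=0):
    """Basic two-class Relief: one weight per feature, higher = more relevant."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    m = n_iter or n
    weights = [0.0] * d

    def dist(a, b):
        # Manhattan distance; features are assumed to lie in [0, 1].
        return sum(abs(p - q) for p, q in zip(a, b))

    for _ in range(m):
        i = rng.randrange(n)
        # Nearest neighbour with the same label (hit) and with the other label (miss).
        hit = min((j for j in range(n) if j != i and y[j] == y[i]),
                  key=lambda j: dist(X[i], X[j]))
        miss = min((j for j in range(n) if y[j] != y[i]),
                   key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            weights[f] += (abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])) / m
    return weights

# Toy data (made up): feature 0 decides the class, feature 1 is pure noise.
rng = random.Random(1)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [1 if x0 > 0.5 else 0 for x0, _ in X]

weights = relief(X, y)
print("feature weights:", [round(wf, 3) for wf in weights])
```

On this toy set the informative feature should receive the clearly larger weight, which is the ranking behaviour a filter method relies on.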

Plural marking patterns of nouns and their associates in the world’s languages

Studies in Language ◽

10.1075/sl.16001.che ◽

2020 ◽

Vol 44 (1) ◽

pp. 231-269

Author(s):

Rong Chen

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Learning Algorithm ◽

Machine Learning Algorithm ◽

Data Set ◽

Component Structure ◽

Universal Distribution ◽

Two Component ◽

The World ◽

Plural Marking

Abstract: Plural marking reaches most corners of languages. When a noun occurs with another linguistic element, called an associate in this paper, plural marking on the two-component structure has four logically possible patterns: doubly unmarked, noun-marked, associate-marked and doubly marked. These four patterns are not distributed homogeneously across the world’s languages, because they are shaped by two competing motivations, iconicity and economy. Some patterns are preferred over others, and this preference is consistently found in languages across the world. In other words, there exists a universal distribution of the four plural marking patterns. Furthermore, holding the view that plural marking on associates expresses plurality of nouns, I propose a hypothetical universal that uses the number of pluralized associates to predict plural marking on nouns. A data set collected from a sample of 100 languages is used to test the hypothetical universal with the machine learning algorithm logistic regression.

Job Recommendation System Implementation in Python vs. C++

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d8132.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 2299-2302

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Programming Language ◽

Mathematical Description ◽

Recommendation System ◽

Learning Algorithm ◽

Machine Learning Algorithm ◽

Learning Problem ◽

Wide Range ◽

Job Recommendation

Implementing a machine learning algorithm gives you a deep and practical appreciation of how the algorithm works. This knowledge can also help you internalize the mathematical description of the algorithm by thinking of the vectors and matrices as arrays and building computational intuition for the transformations on those structures. Implementing a machine learning algorithm requires numerous micro-decisions, such as selecting the programming language, selecting the algorithm, selecting the problem, researching the algorithm, and unit testing, and these decisions are often missing from formal algorithm descriptions. We introduce the idea of implementing a job recommendation system (a classic machine learning problem) using two algorithms, namely KNN [3] and logistic regression [3], in more than one programming language (C++ and Python), and we present an analysis and comparison of the performance of each. We focus specifically on building a model for predicting jobs in the field of computer science, but the models can be applied to a wide range of other areas as well. Implementers can use this paper to decide which language will best suit their needs for achieving accuracy along with efficiency. We use more than one algorithm to establish that our findings are not applicable to a single algorithm only.
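To make the "vectors as arrays" point concrete, here is a minimal k-nearest-neighbours classifier in plain Python. It is a sketch under stated assumptions, not the paper's implementation: the two skill-score features, the job labels, and the choice of Euclidean distance with k = 3 are all hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query and keep the k closest.
    neighbours = sorted(zip(train_X, train_y),
                        key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical candidates: (programming score, statistics score) -> job label.
train_X = [(9, 1), (8, 2), (7, 1), (1, 9), (2, 8), (1, 7)]
train_y = ["backend", "backend", "backend", "data", "data", "data"]

print(knn_predict(train_X, train_y, (8, 1)))  # closest to the "backend" group
print(knn_predict(train_X, train_y, (2, 9)))  # closest to the "data" group
```

The same function translates almost line for line into C++ with `std::vector` and `std::sort`, which is the kind of cross-language comparison the abstract describes.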

Missing Data Imputation using Machine Learning Algorithm for Supervised Learning

2021 International Conference on Computer Communication and Informatics (ICCCI) ◽

10.1109/iccci50826.2021.9402558 ◽

2021 ◽

Author(s):

D. Cenitta ◽

R. Vijaya Arjunan ◽

Prema K V

Keyword(s):

Machine Learning ◽

Missing Data ◽

Supervised Learning ◽

Learning Algorithm ◽

Machine Learning Algorithm ◽

Data Imputation ◽

Missing Data Imputation

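Penerapan algoritma C5.0 pada analisis faktor-faktor pengaruh kelulusan tepat waktu mahasiswa Teknik Informatika UMM [Application of the C5.0 algorithm to the analysis of factors influencing the on-time graduation of Informatics Engineering students at UMM]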

Repositor ◽

10.22219/repositor.v1i2.545 ◽

2020 ◽

Vol 1 (2) ◽

pp. 131

Author(s):

Vinna Rahmayanti ◽

Yufis Azhar ◽

Andriani Eka Pramudita

Keyword(s):

High School ◽

Feature Selection ◽

Grade Point Average ◽

Engineering Students ◽

Senior High School ◽

Credit System ◽

Grade Point ◽

Academic Credit ◽

Entry Status ◽

School Status

Abstract: Timely graduation of students is a problem that every higher-education institution finds difficult to overcome, including the Department of Informatics Engineering at the University of Muhammadiyah Malang (UMM). The problem must be addressed promptly, since the quality of students affects the accreditation of both the university and its departments. It is therefore necessary to analyze the factors that influence the timely graduation of Informatics Engineering students at UMM. This study uses the C5.0 algorithm to select important features and regression analysis to estimate the probability of timely graduation. The independent variables are gender, regional origin, entry status, academic credits (SKS) in the 4th and 6th semesters, semester grade point (IP) in the 2nd, 4th and 6th semesters, cumulative grade point average (IPK) in the 2nd, 4th and 6th semesters, type of senior high school, status of senior high school, parents’ education, and parents’ occupation. The C5.0 algorithm selected 8 of the 15 features, yielding better accuracy than using all features. The study also produced a regression model with an accuracy of 82%.
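Decision-tree learners in the C4.5/C5.0 family rank candidate features with entropy-based criteria. The sketch below computes one such criterion, the gain ratio, for categorical features. It is an assumption-laden illustration: the abstract does not say which criterion or settings were used, and the feature names and values here are invented.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(feature_values, labels):
    """Information gain of a feature divided by the entropy of its own split."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    return gain / split_info if split_info > 0 else 0.0

# Invented toy data: 1 = graduated on time, 0 = did not.
on_time = [1, 1, 1, 1, 0, 0, 0, 0]
gpa_band = ["high", "high", "high", "high", "low", "low", "low", "low"]
gender = ["m", "f", "m", "f", "m", "f", "m", "f"]

print("gpa_band:", gain_ratio(gpa_band, on_time))
print("gender:  ", gain_ratio(gender, on_time))
```

In this toy set `gpa_band` separates the classes perfectly while `gender` carries no information, so a selection step based on this score would keep the first and drop the second.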

Motion Detection and Prediction Using Machine Learning Algorithm

Issue 4 - Journal of Science and Technology ◽

10.46243/jst.2020.v5.i5.pp220-226 ◽

2020 ◽

pp. 220-226

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Logistic Regression ◽

Linear Regression ◽

21st Century ◽

Motion Detection ◽

Data Analytics ◽

Learning Algorithm ◽

Machine Learning Algorithm ◽

Processing Data

Machine learning is a branch of Artificial Intelligence that is gaining importance in the 21st century. With increasing processing speeds and the miniaturization of sensors, the applications of Artificial Intelligence and cognitive technologies are growing rapidly. An array of ultrasonic sensors (HC-SR04) is placed facing different directions and collects data over a particular interval on a particular day. The acquired sensor values are subjected to pre-processing, data analytics, and visualization. The prepared data are then split into training and test sets. Prediction models are designed using logistic regression and linear regression, and the models are compared on accuracy, F1 score, and precision.
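The three evaluation measures named in this abstract can be computed directly from binary predictions. The following sketch uses made-up labels (1 = motion detected, 0 = no motion) purely to show the arithmetic; it does not reproduce any result from the paper.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, and F1 score for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, f1

# Made-up ground truth and predictions for eight test readings.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy, precision, f1 = classification_metrics(y_true, y_pred)
print(accuracy, precision, f1)  # 0.75 0.75 0.75
```

Here there are 3 true positives, 1 false positive, 1 false negative and 3 true negatives, so all three measures come out to 0.75.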

A Novel Machine Learning Algorithm Predicts Dementia With Lewy Bodies Versus Parkinson’s Disease Dementia Based on Clinical and Neuropsychological Scores

Journal of Geriatric Psychiatry and Neurology ◽

10.1177/0891988721993556 ◽

2021 ◽

pp. 089198872199355

Author(s):

Anastasia Bougea ◽

Efthymia Efthymiopoulou ◽

Ioanna Spanou ◽

Panagiotis Zikos

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Neuropsychological Tests ◽

Naive Bayes ◽

Learning Algorithm ◽

Classification Model ◽

Support Vector ◽

Ensemble Model ◽

Machine Learning Algorithm

Objective: Our aim was to develop a machine learning algorithm, based only on predictors that can be collected non-invasively in the clinic, for the accurate diagnosis of dementia with Lewy bodies (DLB) and Parkinson’s disease dementia (PDD). Methods: This is an ongoing prospective cohort study (ClinicalTrials.gov identifier NCT04448340) of 78 PDD and 62 DLB subjects whose diagnostic follow-up is available for at least 3 years after the baseline assessment. We used predictors such as clinico-demographic characteristics and 6 neuropsychological tests (mini mental, PD Cognitive Rating Scale, Brief Visuospatial Memory test, Symbol digit written, Wechsler adult intelligence scale, trail making A and B). We investigated logistic regression, K-Nearest Neighbors (K-NN), Support Vector Machine (SVM), Naïve Bayes classifier, and an Ensemble Model for their ability to predict a PDD or DLB diagnosis. Results: The K-NN classification model had an overall accuracy of 91.2% based on the 15 best clinical and cognitive scores, achieving 96.42% sensitivity and 81% specificity in discriminating between DLB and PDD. The binomial logistic regression classification model achieved an accuracy of 87.5% based on the 15 best features, showing 93.93% sensitivity and 87% specificity. The SVM classification model had an overall accuracy of 84.6% based on the 15 best features, achieving 90.62% sensitivity and 78.58% specificity. A model based on Naïve Bayes classification had 82.05% accuracy, 93.10% sensitivity and 74.41% specificity. Finally, an Ensemble model, synthesized from the individual ones, achieved 89.74% accuracy, 93.75% sensitivity and 85.73% specificity. Conclusion: Machine learning methods predicted PDD or DLB diagnosis with high accuracy, sensitivity and specificity, based on clinical and neuropsychological tests that are non-invasive and easy to administer in the clinic.
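Sensitivity and specificity, the measures reported alongside accuracy above, come straight from the confusion matrix, and one simple way to combine several classifiers is a majority vote. The sketch below shows both. The abstract does not state how the Ensemble model was built, so majority voting is an assumption here, and the three models' predictions are invented for illustration.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by majority vote per subject."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]

def sensitivity_specificity(y_true, y_pred, positive):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)

# Invented diagnoses for six subjects and three hypothetical models.
y_true = ["DLB", "DLB", "DLB", "PDD", "PDD", "PDD"]
model_a = ["DLB", "DLB", "PDD", "PDD", "PDD", "DLB"]
model_b = ["DLB", "PDD", "DLB", "PDD", "PDD", "PDD"]
model_c = ["DLB", "DLB", "DLB", "PDD", "DLB", "PDD"]

ensemble = majority_vote([model_a, model_b, model_c])
print("model_a: ", sensitivity_specificity(y_true, model_a, positive="DLB"))
print("ensemble:", sensitivity_specificity(y_true, ensemble, positive="DLB"))
```

With an odd number of models and two classes a vote can never tie. In this invented example each model errs on a different subject, so the vote corrects every individual mistake; real ensembles rarely behave that cleanly.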

A Solution to Cartpole using Neural Networks and Tensorflow

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d7519.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 920-924

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Logistic Regression ◽

Computational Efficiency ◽

Learning Algorithm ◽

Machine Learning Techniques ◽

Machine Learning Algorithm ◽

Learning Techniques ◽

The Right ◽

Algorithm Benchmarking

Machine learning is not quite a new topic for discussion these days, and a lot of enthusiasts excel in this field. The problem lies with beginners, who lack the intuition needed to step ahead in it. This paper seeks a simple enough solution to this issue through an example problem, Cart-Pole, a classic machine learning benchmarking environment from OpenAI Gym. The contents provide a perspective on machine learning and help beginners become familiar with the field. The paper covers machine learning techniques such as regression, including linear and logistic regression, and introduces the basics of neural networks using familiar terms from logistic regression. Together with TensorFlow, a Google project widely used today for computational efficiency, these are the techniques used here to solve the trivial game Cart-Pole.
