Ambo University Student’s Case Classification Models using Support Vector Machine

The main objective of Ambo University is to provide quality education and to improve students' overall performance by examining individual students' problem cases. One way to analyse students' cases individually is to identify the causes of the problems and guide the students towards solving them. At present, the Department Academic Council and the Academic Commission are the bodies authorized to make these decisions, and they do so manually, which consumes considerable time and energy. This research focuses on learning classification models that predict students' problem cases using support vector classification techniques. Finally, the performance of the models is evaluated using precision, recall and F-measure.
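
The three evaluation measures named above can be computed from true and predicted labels. The sketch below is only an illustration: the case categories and the label lists are invented, and the abstract does not say how the study's classes were defined or averaged.

```python
from collections import Counter


def precision_recall_f1(y_true, y_pred, positive):
    """Return (precision, recall, F-measure) with one class treated as positive."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["tp"] += 1  # correctly predicted positive
        elif pred == positive:
            counts["fp"] += 1  # predicted positive, actually another class
        elif truth == positive:
            counts["fn"] += 1  # missed positive
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical case labels, invented for illustration only.
y_true = ["academic", "financial", "academic", "health", "academic", "financial"]
y_pred = ["academic", "academic", "academic", "health", "financial", "financial"]

p, r, f = precision_recall_f1(y_true, y_pred, positive="academic")
print(f"precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```

For a multi-class problem the same function can be called once per class and the results averaged.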

Classification and Regression Models for Genomic Selection of Skewed Phenotypes: A Case for Disease Resistance in Winter Wheat (Triticum aestivum L.)

10.1101/2021.12.16.472985 ◽ 2021

Author(s): Lance F Merrick ◽ Dennis N Lozada ◽ Xianming Chen ◽ Arron H Carter

Keyword(s): Support Vector Machine ◽ Winter Wheat ◽ Genomic Selection ◽ Stripe Rust ◽ Regression Models ◽ Prediction Models ◽ Support Vector ◽ Classification Models ◽ Breeding Lines ◽ Classification And Regression

Most genomic prediction models are linear regression models that assume continuous and normally distributed phenotypes, but responses to diseases such as stripe rust (caused by Puccinia striiformis f. sp. tritici) are commonly recorded in ordinal scales and percentages. Disease severity (SEV) and infection type (IT) data in germplasm screening nurseries generally do not follow these assumptions. In this regard, researchers may ignore the lack of normality, transform the phenotypes, use generalized linear models, or use supervised learning algorithms and classification models with no restriction on the distribution of response variables, which are less sensitive when modeling ordinal scores. The goal of this research was to compare classification and regression genomic selection models for skewed phenotypes using stripe rust SEV and IT in winter wheat. We extensively compared both regression and classification prediction models using two training populations composed of breeding lines phenotyped in four years (2016-2018, and 2020) and a diversity panel phenotyped in four years (2013-2016). The prediction models used 19,861 genotyping-by-sequencing single-nucleotide polymorphism markers. Overall, square root transformed phenotypes using rrBLUP and support vector machine regression models displayed the highest combination of accuracy and relative efficiency across the regression and classification models. Further, a classification system based on support vector machine and ordinal Bayesian models with a 2-Class scale for SEV reached the highest class accuracy of 0.99. This study showed that breeders can use linear and non-parametric regression models within their own breeding lines over combined years to accurately predict skewed phenotypes.
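
To make the square root transformation concrete, the sketch below measures how it reduces the skewness of a right-skewed percentage score. The severity values are invented for illustration and are not data from the study; the genomic models themselves (rrBLUP, support vector machines) are not reproduced here.

```python
import math
import statistics


def skewness(values):
    """Population skewness: the mean cubed z-score of the values."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Invented disease severity percentages: many low scores, a few very high ones.
severity = [0, 0, 1, 2, 2, 5, 5, 10, 20, 40, 80]

# Square root transformation of the phenotype, as named in the abstract.
transformed = [math.sqrt(v) for v in severity]

print(f"skewness before: {skewness(severity):.2f}")
print(f"skewness after:  {skewness(transformed):.2f}")
```

On this toy sample the transformed scores remain right-skewed, only less so, which is consistent with the abstract treating transformation as one option among several rather than a complete fix.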

Integration of synthetic minority oversampling technique for imbalanced class

Indonesian Journal of Electrical Engineering and Computer Science ◽ 10.11591/ijeecs.v13.i1.pp102-108 ◽ 2019 ◽ Vol 13 (1) ◽ pp. 102

Author(s): Noviyanti Santoso ◽ Wahyu Wibowo ◽ Hilda Hikmawati

Keyword(s): Machine Learning ◽ Data Mining ◽ Support Vector Machine ◽ Class Imbalance ◽ Original Data ◽ Support Vector ◽ Classification Methods ◽ Problematic Issue ◽ Imbalanced Class ◽ F Measure

In data mining, class imbalance is a problematic issue that calls for a solution. This is probably because machine learning algorithms are constructed on the assumption that the number of instances in each class is balanced, so when the classes are imbalanced the prediction results may not be appropriate. Several solutions have been offered for class imbalance, including oversampling, undersampling, and the synthetic minority oversampling technique (SMOTE). Both oversampling and undersampling have their disadvantages, so SMOTE is an alternative that overcomes them. Integrating SMOTE into data mining classification methods such as Naive Bayes, Support Vector Machine (SVM), and Random Forest (RF) is expected to improve accuracy. In this research, the SMOTE-resampled data gave better accuracy than the original data. Among the three classification methods used, RF gave the highest average AUC, F-measure, and G-means scores.
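
The core idea of SMOTE is to create each synthetic minority sample on the line segment between a minority point and one of its nearest minority neighbours. The sketch below is a simplified variant written for illustration: the function name, the random choice of base point, the value of k and the toy data are all assumptions, not details taken from the study.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Simplified sketch: requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy two-feature minority class, invented for illustration.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
print(new_points)
```

Because every synthetic point is a convex combination of two existing minority points, the new samples stay inside the region the minority class already occupies, unlike plain oversampling, which only duplicates existing rows.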

Implementation of n-gram Methodology for Rotten Tomatoes Review Dataset Sentiment Analysis

International Journal of Knowledge Discovery in Bioinformatics ◽ 10.4018/ijkdb.2017010103 ◽ 2017 ◽ Vol 7 (1) ◽ pp. 30-41 ◽ Cited By ~ 12

Author(s): Prayag Tiwari ◽ Brojo Kishore Mishra ◽ Sachin Kumar ◽ Vivek Kumar

Keyword(s): Machine Learning ◽ Support Vector Machine ◽ Sentiment Analysis ◽ Maximum Entropy ◽ Learning Strategies ◽ Supervised Machine Learning ◽ Support Vector ◽ N Gram ◽ F Measure ◽ Blog Posts

Sentiment analysis aims to capture the underlying opinion of a piece of content, which may be anything that holds a subjective view, for example an online review, comments on blog posts, a film rating and so forth. These reviews and blogs may be classified into different polarity groups, such as negative, positive, and neutral, in order to extract information from the input dataset. Supervised machine learning strategies classify these reviews. In this paper, three different machine learning algorithms, namely Support Vector Machine (SVM), Maximum Entropy (ME) and Naive Bayes (NB), are considered for the classification of human sentiments. The accuracy of the different methods is critically examined in order to assess their performance on the basis of parameters such as precision, recall, f-measure, and accuracy.
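
The n-gram methodology named in the title turns each review into counts of word sequences, which a classifier such as SVM, ME or NB can then use as features. The sketch below assumes whitespace tokenisation, lowercasing and a combination of unigrams and bigrams; the abstract does not state the paper's actual preprocessing, and the review text is invented.

```python
from collections import Counter


def ngrams(text, n):
    """Return the word n-grams of a text, in order of appearance."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=2):
    """Count all n-grams from unigrams up to max_n-grams."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(text, n))
    return counts


# Invented review text, for illustration only.
review = "not a good movie"
features = ngram_features(review)
print(ngrams(review, 2))
print(features)
```

Bigrams such as "not a" keep some word-order information that unigram counts alone would lose, which matters for negated opinions.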

Assessment of Machine Learning Models to Identify Port Jackson Shark Behaviours Using Tri-Axial Accelerometers

Sensors ◽ 10.3390/s20247096 ◽ 2020 ◽ Vol 20 (24) ◽ pp. 7096

Author(s): Julianna P. Kadar ◽ Monique A. Ladds ◽ Joanna Day ◽ Brianne Lyall ◽ Culum Brown

Keyword(s): Machine Learning ◽ Support Vector Machine ◽ Classification Tree ◽ Support Vector ◽ Fine Scale ◽ Learning Models ◽ Port Jackson ◽ F Measure ◽ Machine Learning Models ◽ Broad Scale

Movement ecology has traditionally focused on the movements of animals over large time scales, but, with advancements in sensor technology, the focus can become increasingly fine scale. Accelerometers are commonly applied to quantify animal behaviours and can elucidate fine-scale (<2 s) behaviours. Machine learning methods are commonly applied to animal accelerometry data; however, they require the trial of multiple methods to find an ideal solution. We used tri-axial accelerometers (10 Hz) to quantify four behaviours in Port Jackson sharks (Heterodontus portusjacksoni): two fine-scale behaviours (<2 s)—(1) vertical swimming and (2) chewing as proxy for foraging, and two broad-scale behaviours (>2 s–mins)—(3) resting and (4) swimming. We used validated data to calculate 66 summary statistics from tri-axial accelerometry and assessed the most important features that allowed for differentiation between the behaviours. One and two second epoch testing sets were created consisting of 10 and 20 samples from each behaviour event, respectively. We developed eight machine learning models to assess their overall accuracy and behaviour-specific accuracy (one classification tree, five ensemble learners and two neural networks). The support vector machine model classified the four behaviours better when using the longer 2 s time epoch (F-measure 89%; macro-averaged F-measure: 90%). Here, we show that this support vector machine (SVM) model can reliably classify both fine- and broad-scale behaviours in Port Jackson sharks.
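
At 10 Hz, a one second epoch holds 10 samples and a two second epoch holds 20, which matches the epoch sizes given above. The sketch below shows one way to split tri-axial data into such epochs and compute a few per-axis summary statistics. The four statistics are placeholders for the 66 used in the study, which the abstract does not list, and the signal is synthetic.

```python
import statistics

SAMPLE_RATE_HZ = 10  # sampling rate stated in the abstract


def split_epochs(samples, epoch_seconds):
    """Split (x, y, z) samples into non-overlapping epochs of fixed duration."""
    size = SAMPLE_RATE_HZ * epoch_seconds
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def summarise(epoch):
    """Compute per-axis mean, standard deviation, minimum and maximum."""
    features = {}
    for axis, values in zip("xyz", zip(*epoch)):
        features[f"{axis}_mean"] = statistics.fmean(values)
        features[f"{axis}_std"] = statistics.pstdev(values)
        features[f"{axis}_min"] = min(values)
        features[f"{axis}_max"] = max(values)
    return features


# Four seconds of synthetic accelerometer data, invented for illustration.
samples = [(0.01 * i, 0.5, 1.0 - 0.01 * i) for i in range(40)]

one_second = split_epochs(samples, 1)
two_second = split_epochs(samples, 2)
print(len(one_second), len(two_second))
print(summarise(two_second[0]))
```

Each epoch's feature dictionary would form one row of the table passed to a classifier.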

Classification study of solvation free energies of organic molecules using machine learning techniques

RSC Advances ◽ 10.1039/c4ra07961b ◽ 2014 ◽ Vol 4 (106) ◽ pp. 61624-61630 ◽ Cited By ~ 8

Author(s): N. S. Hari Narayana Moorthy ◽ Silvia A. Martins ◽ Sergio F. Sousa ◽ Maria J. Ramos ◽ Pedro A. Fernandes

Keyword(s): Machine Learning ◽ Support Vector Machine ◽ Random Forest ◽ Organic Molecules ◽ Machine Learning Techniques ◽ Support Vector ◽ Classification Models ◽ Free Energies ◽ Learning Techniques ◽ Solvation Free Energies

Classification models to predict the solvation free energies of organic molecules were developed using decision tree, random forest and support vector machine approaches and with MACCS fingerprints, MOE and PaDEL descriptors.

Detection Of Spam Comments On Instagram Using Complementary Naïve Bayes

IJCCS (Indonesian Journal of Computing and Cybernetics Systems) ◽ 10.22146/ijccs.47046 ◽ 2019 ◽ Vol 13 (3) ◽ pp. 263

Author(s): Nur Azizul Haqimi ◽ Nur Rokhman ◽ Sigit Priyanta

Keyword(s): Social Media ◽ Support Vector Machine ◽ Test Data ◽ Training Data ◽ Classification Method ◽ Support Vector ◽ Test Results ◽ Imbalanced Dataset ◽ Web Based ◽ F Measure

Instagram (IG) is a web-based and mobile social media application where users can share photos or videos with the available features. Uploaded photos or videos carry captions that explain the photo or video, and these can attract spam comments. Spam comments are comments that are not relevant to the caption and photo. The problem that arises when identifying spam is that non-spam comments are far more common than spam comments, which leads to an imbalanced dataset. Whether a dataset is balanced can influence the performance of a classification method. This research focuses on applying the Complementary Naïve Bayes (CNB) method to imbalanced datasets for the detection of Instagram spam comments. The study used TF-IDF weighting, with Support Vector Machine (SVM) as a comparison classifier. Based on the test results with 2500 training data and 100 test data on the imbalanced dataset (25% spam and 75% non-spam), CNB achieved 92% accuracy, 86% precision and 93% f-measure, whereas SVM achieved 87% accuracy, 79% precision and 88% f-measure. In conclusion, the CNB method is more suitable for detecting spam comments in cases of imbalanced datasets.
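
TF-IDF weighting gives a term a high weight when it is frequent in one comment but rare across the collection. The sketch below uses one common variant (relative term frequency multiplied by the natural log of inverse document frequency); the abstract does not say which variant the study used, and the comments are invented.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dictionary per document."""
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    weights = []
    for tokens in tokenised:
        term_counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return weights


# Invented comments, for illustration only.
comments = [
    "nice photo",
    "follow me for free followers",
    "nice view nice light",
]
weights = tf_idf(comments)
for comment, w in zip(comments, weights):
    print(comment, "->", w)
```

Under this variant a term that appears in every comment gets weight zero, so only distinguishing terms contribute to the feature vector.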

Modified framework for sarcasm detection and classification in sentiment analysis

Indonesian Journal of Electrical Engineering and Computer Science ◽ 10.11591/ijeecs.v13.i3.pp1175-1183 ◽ 2019 ◽ Vol 13 (3) ◽ pp. 1175 ◽ Cited By ~ 1

Author(s): Mohd Suhairi Md Suhaimin ◽ Mohd Hanafi Ahmad Hijazi ◽ Rayner Alfred ◽ Frans Coenen

Keyword(s): Support Vector Machine ◽ Sentiment Analysis ◽ Classification Performance ◽ Sentiment Classification ◽ Support Vector ◽ Linear Support Vector Machine ◽ Social Media Data ◽ Classification Framework ◽ Media Data ◽ F Measure

Sentiment analysis is directed at identifying people's opinions, beliefs, views and emotions in the context of the entities and attributes that appear in text. The presence of sarcasm, however, can significantly hamper sentiment analysis. In this paper a sentiment classification framework is presented that incorporates sarcasm detection. The framework was evaluated using a non-linear Support Vector Machine and Malay social media data. The results obtained demonstrated that the proposed sarcasm detection process could successfully detect the presence of sarcasm in that better sentiment classification performance was recorded. A best average F-measure score of 0.905 was recorded using the framework; a significantly better result than when sentiment classification was performed without sarcasm detection.

Improved Cost-Sensitive Support Vector Machine Classifier for Breast Cancer Diagnosis

Mathematical Problems in Engineering ◽ 10.1155/2018/3875082 ◽ 2018 ◽ Vol 2018 ◽ pp. 1-13 ◽ Cited By ~ 4

Author(s): Na Liu ◽ Jiang Shen ◽ Man Xu ◽ Dan Gan ◽ Er-Shi Qi ◽ ...

Keyword(s): Breast Cancer ◽ Support Vector Machine ◽ Cancer Diagnosis ◽ Classification Accuracy ◽ Support Vector Machine Classifier ◽ Treatment Plan ◽ Breast Cancer Diagnosis ◽ Support Vector ◽ Classification Models ◽ Misclassification Costs

As one of the most prevalent cancers among women worldwide, breast cancer has attracted considerable attention from researchers. It has been verified that accurate and early detection of breast cancer can increase the chances that patients receive the right treatment plan and survive for a long time. Nowadays, numerous classification methods have been utilized for breast cancer diagnosis. However, most of these classification models have concentrated on maximizing classification accuracy and have failed to take into account the unequal misclassification costs in breast cancer diagnosis. To the best of our knowledge, misclassifying a cancerous patient as non-cancerous has a much higher cost than misclassifying a non-cancerous patient as cancerous. Consequently, in order to tackle this deficiency and further improve the classification accuracy of breast cancer diagnosis, we propose an improved cost-sensitive support vector machine classifier (ICS-SVM) for the diagnosis of breast cancer. In the proposed approach, we take full account of the unequal misclassification costs of intelligent breast cancer diagnosis and provide more reasonable results than previous works and conventional classification models. To evaluate the performance of the proposed approach, the Wisconsin Breast Cancer (WBC) and Wisconsin Diagnostic Breast Cancer (WDBC) datasets obtained from the University of California at Irvine (UCI) machine learning repository have been studied. The experimental results demonstrate that the proposed hybrid algorithm outperforms all the existing methods. Promisingly, the proposed method can be regarded as a useful clinical tool for breast cancer diagnosis and could also be applied to the diagnosis of other illnesses.
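
Unequal misclassification costs can be illustrated with the generic expected-cost decision rule: predict "cancerous" whenever the expected cost of a missed cancer exceeds that of a false alarm. The sketch below is not the ICS-SVM proposed in the paper; the cost values, the predicted probabilities and the labels are all hypothetical.

```python
def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability above which predicting positive has the lower expected cost.

    Predict positive when p * cost_fn > (1 - p) * cost_fp,
    that is, when p > cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def total_cost(y_true, y_pred, cost_fp, cost_fn):
    """Sum the misclassification costs over all cases (1 = cancerous)."""
    cost = 0.0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 0:
            cost += cost_fp  # false alarm
        elif pred == 0 and truth == 1:
            cost += cost_fn  # missed cancer
    return cost


# Hypothetical costs: a missed cancer is five times as costly as a false alarm.
COST_FP, COST_FN = 1.0, 5.0

# Hypothetical predicted probabilities of cancer and true labels.
probs = [0.1, 0.2, 0.4, 0.6, 0.9]
y_true = [0, 1, 1, 0, 1]

threshold = cost_sensitive_threshold(COST_FP, COST_FN)
default_pred = [int(p >= 0.5) for p in probs]
cost_pred = [int(p >= threshold) for p in probs]

print("cost at 0.5 threshold:", total_cost(y_true, default_pred, COST_FP, COST_FN))
print("cost at cost-sensitive threshold:", total_cost(y_true, cost_pred, COST_FP, COST_FN))
```

Lowering the threshold trades extra false alarms for fewer missed cancers, which is the trade-off that accuracy-maximizing classifiers ignore.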
