Support Directional Shifting Vector: A Direction Based Machine Learning Classifier

Machine learning models have become very popular for providing rigorous solutions to complicated real-life problems. The field has three main domains: supervised, unsupervised, and reinforcement learning. Supervised learning mainly deals with regression and classification. Several types of classification algorithms exist, each built on a different principle. Classification performance varies with the dataset velocity and the choice of algorithm. In this article, we focus on developing a model of angular nature that performs supervised classification. We use two shifting vectors, the Support Direction Vector (SDV) and the Support Origin Vector (SOV), to form a linear function that measures the cosine angle with both the target class data and the non-target class data. The linear function takes a position that minimizes its angle with the target class data and maximizes its angle with the non-target class data. The positional error of the linear function is modelled as a loss function, which is iteratively optimized using the gradient descent algorithm. To justify the acceptability of this method, we implemented the model on three different standard datasets. The model showed accuracy comparable to existing standard supervised classification algorithms. DOI: 10.28991/esj-2021-01306
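
The abstract does not give the exact loss, so the following is only a minimal sketch of the idea under stated assumptions: a direction vector and an origin vector (standing in for the SDV and SOV) are adjusted by gradient descent so that the cosine between the direction and each origin-shifted target point approaches +1, and approaches -1 for non-target points. The loss form, the finite-difference gradient, the cos > 0 decision rule, and the toy data are all illustrative choices, not the authors' formulation.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def split(params):
    """params holds the direction vector followed by the origin vector."""
    dim = len(params) // 2
    return params[:dim], params[dim:]

def cos_to_point(params, point):
    direction, origin = split(params)
    shifted = [p - o for p, o in zip(point, origin)]
    return cosine(direction, shifted)

def angular_loss(params, targets, others):
    """Assumed loss: small when targets have cos near +1 and others near -1."""
    target_term = sum(1 - cos_to_point(params, p) for p in targets) / len(targets)
    other_term = sum(1 + cos_to_point(params, p) for p in others) / len(others)
    return target_term + other_term

def numerical_gradient(f, params, eps=1e-6):
    """Central finite differences (an assumption; an analytic gradient also works)."""
    grad = []
    for i in range(len(params)):
        up = params[:i] + [params[i] + eps] + params[i + 1:]
        down = params[:i] + [params[i] - eps] + params[i + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def train(targets, others, params, lr=0.1, steps=300):
    """Plain gradient descent on the assumed angular loss."""
    loss = lambda p: angular_loss(p, targets, others)
    for _ in range(steps):
        grad = numerical_gradient(loss, params)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params

def predict(params, point):
    """Label 1 (target) when the point lies within 90 degrees of the direction."""
    return 1 if cos_to_point(params, point) > 0 else 0

# Toy, made-up 2-D data: targets in the upper right, non-targets in the lower left.
targets = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0)]
others = [(-2.0, -2.0), (-3.0, -2.0), (-2.0, -3.0)]
initial = [1.0, 0.0, 0.5, -0.5]  # direction (1, 0), origin (0.5, -0.5)

initial_loss = angular_loss(initial, targets, others)
fitted = train(targets, others, initial)
final_loss = angular_loss(fitted, targets, others)
predictions = [predict(fitted, p) for p in targets + others]
```

Because the cosine ignores the length of the direction vector, only its orientation and the position of the origin affect the loss in this sketch.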

Privacy as Protection of the Incomputable Self: From Agnostic to Agonistic Machine Learning

Theoretical Inquiries in Law ◽

10.1515/til-2019-0004 ◽

2019 ◽

Vol 20 (1) ◽

pp. 83-121 ◽

Cited By ~ 16

Author(s):

Mireille Hildebrandt

Keyword(s):

Machine Learning ◽

Research Design ◽

Real Life ◽

Big Data Analytics ◽

Validity And Reliability ◽

Human Identity ◽

The Right To Privacy ◽

Trade Offs ◽

Life Problems ◽

The Right

This Article takes the perspective of law and philosophy, integrating insights from computer science. First, I will argue that in the era of big data analytics we need an understanding of privacy that is capable of protecting what is uncountable, incalculable or incomputable about individual persons. To instigate this new dimension of the right to privacy, I expand previous work on the relational nature of privacy, and the productive indeterminacy of human identity it implies, into an ecological understanding of privacy, taking into account the technological environment that mediates the constitution of human identity. Second, I will investigate how machine learning actually works, detecting a series of design choices that inform the accuracy of the outcome, each entailing trade-offs that determine the relevance, validity and reliability of the algorithm’s accuracy for real life problems. I argue that incomputability does not call for a rejection of machine learning per se but calls for a research design that enables those who will be affected by the algorithms to become involved and to learn how machines learn, resulting in a better understanding of their potential and limitations. A better understanding of the limitations that are inherent in machine learning will deflate some of the eschatological expectations, and provide for better decision-making about whether and if so how to implement machine learning in specific domains or contexts. I will highlight how a reliable research design aligns with purpose limitation as core to its methodological integrity. This Article, then, advocates a practice of “agonistic machine learning” that will contribute to responsible decisions about the integration of data-driven applications into our environments while simultaneously bringing them under the Rule of Law. This should also provide the best means to achieve effective protection against overdetermination of individuals by machine inferences.

Tree based Machine Learning in Predicting the Price of Green Building

Alinteri Journal of Agricultural Sciences ◽

10.47059/alinteri/v36i1/ajas21081 ◽

2021 ◽

Vol 36 (1) ◽

pp. 583-589

Author(s):

Suraya Masrom ◽

Thuraiya Mohd ◽

Nur Syafiqah Jamil

Keyword(s):

Machine Learning ◽

Negative Impact ◽

Real Life ◽

Green Building ◽

Learning Approaches ◽

Learning Models ◽

Collection Method ◽

Life Problems ◽

Experimental Implementation ◽

Machine Learning Models

Researchers and industry players acknowledge that machine learning applications are useful in assisting humans in solving many kinds of real-life problems, including in the real estate and property industry. In this paper, we present the empirical steps for implementing machine learning approaches to predict green building prices. Green buildings conserve natural resources and reduce the negative impact of building development. This paper reports on the data collection method, the preliminary data analysis with statistical methods, and the experimental implementation of the machine learning models, from training and validation to testing. The results show that tree-based machine learning produced better performance on the green building properties, and it was further tested with another five hold-out data. The testing results show that machine learning with a tree-based scheme predicted a green building price higher than the observed price in eight of the ten cases, within the acceptable valuation ranges.

Education 4.0: Teaching the Basics of KNN, LDA and Simple Perceptron Algorithms for Binary Classification Problems

Future Internet ◽

10.3390/fi13080193 ◽

2021 ◽

Vol 13 (8) ◽

pp. 193

Author(s):

Diego Lopez-Bernal ◽

David Balderas ◽

Pedro Ponce ◽

Arturo Molina

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Binary Classification ◽

Real Life ◽

Basic Knowledge ◽

Classification Problems ◽

K Nearest Neighbor ◽

Linear Discriminant ◽

Advantages And Disadvantages ◽

Life Problems

One of the main focuses of Education 4.0 is to provide students with knowledge of disruptive technologies, such as Machine Learning (ML), as well as the skills to apply this knowledge to real-life problems. Therefore, both students and professors require teaching and learning tools that facilitate the introduction to such topics. Consequently, this study aims to contribute to the development of those tools by introducing the basic theory behind three machine learning classification algorithms: K-Nearest-Neighbor (KNN), Linear Discriminant Analysis (LDA), and Simple Perceptron, and by discussing the advantages and disadvantages of each method. Moreover, we propose to analyze how these methods work under different conditions by implementing them on a test bench. Thus, in addition to describing each algorithm, we discuss their application to three different binary classification problems using three different datasets, and we compare their performance in these specific case studies. The findings of this study can be used by teachers to give students a basic knowledge of the KNN, LDA, and perceptron algorithms and, at the same time, as a guide for applying them to real-life problems that are not limited to the presented datasets.
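
As a small illustration of the first of the three algorithms, the sketch below implements K-Nearest-Neighbor for binary classification using only the Python standard library. The function name `knn_predict`, the choice of k = 3, the Euclidean distance, and the toy points are assumptions made for this example and are not taken from the paper's test bench.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Majority vote over the k nearest neighbours.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Made-up 2-D training data: class 0 near the origin, class 1 further out.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 6.5), (6.5, 7.0)]
labels = [0, 0, 0, 1, 1, 1]

near_class_0 = knn_predict(points, labels, (1.2, 1.4))
near_class_1 = knn_predict(points, labels, (6.4, 6.2))
```

KNN has no training phase: all work happens at query time, which is one of the trade-offs a comparison with LDA or the perceptron would typically highlight.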

The Psychologist in the Medical School: An Example of Solving Real-Life Problems

PsycEXTRA Dataset ◽

10.1037/e465412008-270 ◽

1970 ◽

Author(s):

Matisyohu Weisenberg ◽

Carl Eisdorfer ◽

C. Richard Fletcher ◽

Murray Wexler

Keyword(s):

Medical School ◽

Real Life ◽

Life Problems

Binary Spectrum Feature for Improved Classifier Performance

10.36227/techrxiv.12993122 ◽

2020 ◽

Author(s):

Nalika Ulapane ◽

Karthick Thiyagarajan ◽

Sarath Kodagoda

Keyword(s):

Machine Learning ◽

Classification Performance ◽

Feature Reduction ◽

Sensor Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Monitoring Task ◽

Classifier Performance ◽

Spectrum Feature

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.

Students Self-Learning of Real-Life Problems Using Significant Learning Taxonomy

SSRN Electronic Journal ◽

10.2139/ssrn.3510908 ◽

2019 ◽

Author(s):

Hameed Sulaiman

Keyword(s):

Real Life ◽

Life Problems ◽

Self Learning

Document Preprocessing with TF-IDF to Improve the Polarity Classification Performance of Unstructured Sentiment Analysis

Kinetik Game Technology Information System Computer Network Computing Electronics and Control ◽

10.22219/kinetik.v5i3.1066 ◽

2020 ◽

pp. 235-242

Author(s):

Farrikh Alzami ◽

Erika Devi Udayanti ◽

Dwi Puji Prabowo ◽

Rama Aria Megantara

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Random Forest ◽

Sentiment Analysis ◽

Classification Performance ◽

Document Preparation ◽

Learning Models ◽

Polarity Classification ◽

Negative Sentiment ◽

Machine Learning Models

Sentiment analysis in terms of polarity classification is very important in everyday life. Knowing the polarity, people can find out whether a given document has positive or negative sentiment, which helps them choose and make decisions. Sentiment analysis is usually done manually, so an automatic sentiment classification process is needed. However, it is rare to find studies that discuss which feature extraction methods and learning models are suitable for unstructured sentiment analysis in the Amazon food review case. This research explores feature extraction methods such as Bag of Words, TF-IDF, Word2Vector, and a combination of TF-IDF and Word2Vector, together with several machine learning models such as Random Forest, SVM, KNN and Naïve Bayes, to find a combination of feature extraction and learning model that can add variety to the analysis of polarity sentiment. With document preparation that handles HTML tags, punctuation and special characters, and with snowball stemming, TF-IDF combined with SVM proved suitable for polarity classification in unstructured sentiment analysis for the Amazon food review case, with a performance of 87.3 percent.
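
To make the TF-IDF weighting mentioned above concrete, here is a minimal sketch using the textbook definition (term frequency times log(N / document frequency)). The whitespace tokenizer, the unsmoothed IDF, the function name `tf_idf`, and the three made-up review snippets are assumptions for illustration; libraries commonly apply smoothed or normalized variants, and the study's exact preprocessing is not reproduced here.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Made-up review snippets, not taken from the Amazon food review dataset.
reviews = ["great coffee great price", "stale coffee", "coffee arrived late"]
weights = tf_idf(reviews)
```

A term that occurs in every document, such as "coffee" here, gets weight zero, while a term concentrated in one review, such as "great", is weighted up. That is why TF-IDF vectors can be more discriminative inputs for a classifier than raw counts.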

Transformer Oil Quality Assessment Using Random Forest with Feature Engineering

Energies ◽

10.3390/en14071809 ◽

2021 ◽

Vol 14 (7) ◽

pp. 1809

Author(s):

Mohammed El Amine Senoussaoui ◽

Mostefa Brahami ◽

Issouf Fofana

Keyword(s):

Machine Learning ◽

Random Forest ◽

Oil Quality ◽

Principal Component ◽

Condition Assessment ◽

Classification Performance ◽

Transformer Oil ◽

Classification Model ◽

Insulation Degradation ◽

Transformer Oils

Machine learning is widely used as a panacea in many engineering applications, including the condition assessment of power transformers. Most statistics attribute the main cause of transformer failure to insulation degradation. Thus, a new, simple, and effective machine-learning approach was proposed to monitor the condition of transformer oils based on some aging indicators. The proposed approach was used to compare the performance of two machine-learning classifiers: J48 decision tree and random forest. The service-aged transformer oils were classified into four groups: the oils that can be maintained in service, the oils that should be reconditioned or filtered, the oils that should be reclaimed, and the oils that must be discarded. Of the two algorithms, random forest exhibited better performance and high accuracy with only a small amount of data. Good performance was achieved not only through the application of the proposed algorithm but also through the data preprocessing approach. Before feeding the classification model, the available data were transformed using the simple k-means method. Subsequently, the obtained data were filtered through correlation-based feature selection (CFsSubset). The resulting features were then transformed again using principal component analysis and passed through the CFsSubset filter. The transformation and filtration of the data improved the classification performance of the adopted algorithms, especially random forest. Another advantage of the proposed method is the decrease in the number of the datasets required for the condition assessment of transformer oils, which is valuable for transformer condition monitoring.
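
Correlation-based feature selection scores a feature subset by rewarding correlation with the class and penalizing redundancy between features. The sketch below computes the standard CFS merit, k·r_cf / sqrt(k + k(k-1)·r_ff), under the assumption that absolute Pearson correlation is used as the association measure (implementations may use other measures for discrete data). The names `pearson` and `cfs_merit` and the toy columns are hypothetical, and this is not the paper's preprocessing pipeline.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def cfs_merit(features, target):
    """CFS merit of a feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(features)
    # Average absolute feature-class correlation.
    mean_fc = sum(abs(pearson(f, target)) for f in features) / k
    if k == 1:
        return mean_fc  # no feature-feature pairs to penalize
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    # Average absolute feature-feature correlation (redundancy).
    mean_ff = sum(abs(pearson(features[i], features[j])) for i, j in pairs) / len(pairs)
    return k * mean_fc / math.sqrt(k + k * (k - 1) * mean_ff)

# Made-up columns: one informative indicator and one unrelated to the class.
condition = [0, 0, 1, 1]
informative = [1.0, 2.0, 3.0, 4.0]
noise = [1.0, -1.0, -1.0, 1.0]

merit_single = cfs_merit([informative], condition)
merit_with_noise = cfs_merit([informative, noise], condition)
```

Adding the uninformative column lowers the merit, which is the behaviour a subset search relies on to discard features that contribute nothing beyond those already chosen.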

Use of Machine Learning to Investigate the Quantitative Checklist for Autism in Toddlers (Q-CHAT) towards Early Autism Screening

Diagnostics ◽

10.3390/diagnostics11030574 ◽

2021 ◽

Vol 11 (3) ◽

pp. 574

Author(s):

Gennaro Tartarisco ◽

Giovanni Cicceri ◽

Davide Di Pietro ◽

Elisa Leonardi ◽

Stefania Aiello ◽

...

Keyword(s):

Machine Learning ◽

High Performance ◽

Behavioral Science ◽

Autistic Traits ◽

Classification Performance ◽

Recursive Feature Elimination ◽

Diagnostic Tools ◽

Support Vector ◽

K Nearest Neighbors ◽

Autism Screening

In the past two decades, several screening instruments were developed to detect toddlers who may be autistic both in clinical and unselected samples. Among others, the Quantitative CHecklist for Autism in Toddlers (Q-CHAT) is a quantitative and normally distributed measure of autistic traits that demonstrates good psychometric properties in different settings and cultures. Recently, machine learning (ML) has been applied to behavioral science to improve the classification performance of autism screening and diagnostic tools, but mainly in children, adolescents, and adults. In this study, we used ML to investigate the accuracy and reliability of the Q-CHAT in discriminating young autistic children from those without. Five different ML algorithms (random forest (RF), naïve Bayes (NB), support vector machine (SVM), logistic regression (LR), and K-nearest neighbors (KNN)) were applied to investigate the complete set of Q-CHAT items. Our results showed that ML achieved an overall accuracy of 90%, and the SVM was the most effective, being able to classify autism with 95% accuracy. Furthermore, using the SVM–recursive feature elimination (RFE) approach, we selected a subset of 14 items ensuring 91% accuracy, while 83% accuracy was obtained from the 3 best discriminating items in common to ours and the previously reported Q-CHAT-10. This evidence confirms the high performance and cross-cultural validity of the Q-CHAT, and supports the application of ML to create shorter and faster versions of the instrument, maintaining high classification accuracy, to be used as a quick, easy, and high-performance tool in primary-care settings.
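
Recursive feature elimination repeatedly ranks the remaining features and drops the weakest until the desired number is left. The sketch below shows only that loop. In SVM-RFE the ranking comes from the weights of a linear SVM refitted at every round; here a much simpler stand-in ranker (the absolute gap between class means) is used so the example stays dependency-free. The ranker, the function names, and the toy item scores are assumptions and do not reproduce the study's analysis.

```python
def recursive_feature_elimination(rows, labels, rank_features, n_keep):
    """Drop the lowest-ranked feature each round until n_keep remain."""
    remaining = list(range(len(rows[0])))
    while len(remaining) > n_keep:
        scores = rank_features(rows, labels, remaining)
        weakest = min(remaining, key=lambda f: scores[f])
        remaining.remove(weakest)
    return remaining

def mean_gap_ranker(rows, labels, features):
    """Stand-in ranker (assumption): absolute gap between the two class means.

    SVM-RFE would instead refit a linear SVM on `features` and rank by
    squared weight.
    """
    scores = {}
    for f in features:
        pos = [row[f] for row, y in zip(rows, labels) if y == 1]
        neg = [row[f] for row, y in zip(rows, labels) if y == 0]
        scores[f] = abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    return scores

# Made-up questionnaire-style scores: only item 0 separates the two groups.
rows = [(3, 1, 0), (3, 0, 1), (2, 1, 1), (0, 1, 0), (1, 0, 1), (0, 1, 1)]
labels = [1, 1, 1, 0, 0, 0]

kept = recursive_feature_elimination(rows, labels, mean_gap_ranker, n_keep=1)
```

With a model-based ranker the scores change after each removal, which is what distinguishes recursive elimination from a one-shot ranking.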

A Machine Learning Approach to Study Glycosidase Activities from Bifidobacterium

Microorganisms ◽

10.3390/microorganisms9051034 ◽

2021 ◽

Vol 9 (5) ◽

pp. 1034

Author(s):

Carlos Sabater ◽

Lorena Ruiz ◽

Abelardo Margolles

Keyword(s):

Machine Learning ◽

Supervised Classification ◽

Machine Learning Algorithms ◽

Learning Approach ◽

Human Milk Oligosaccharides ◽

Future Studies ◽

High Fiber ◽

Machine Learning Approach ◽

Prebiotic Oligosaccharides

This study aimed to recover metagenome-assembled genomes (MAGs) from human fecal samples to characterize the glycosidase profiles of Bifidobacterium species exposed to different prebiotic oligosaccharides (galacto-oligosaccharides, fructo-oligosaccharides and human milk oligosaccharides, HMOs) as well as high-fiber diets. A total of 1806 MAGs were recovered from 487 infant and adult metagenomes. Unsupervised and supervised classification of the glycosidases encoded in MAGs using machine-learning algorithms made it possible to establish characteristic hydrolytic profiles for B. adolescentis, B. bifidum, B. breve, B. longum and B. pseudocatenulatum, yielding classification rates above 90%. Glycosidase families GH5_44, GH32, and GH110 were characteristic of B. bifidum. The presence or absence of GH1, GH2, GH5 and GH20 was characteristic of B. adolescentis, B. breve and B. pseudocatenulatum, while families GH1 and GH30 were relevant in MAGs from B. longum. These characteristic profiles made it possible to discriminate bifidobacteria regardless of prebiotic exposure. Correlation analysis of glycosidase activities suggests strong associations between glycosidase families comprising HMO-degrading enzymes, which are often found in MAGs from the same species. The mathematical models proposed here may contribute to a better understanding of the carbohydrate metabolism of some common bifidobacteria species and could be extrapolated to other microorganisms of interest in future studies.
