Recommender System: Towards Classification of Human Intentions in E-Shopping Using Machine Learning

2019 ◽  
Vol 16 (10) ◽  
pp. 4280-4285
Author(s):  
Babaljeet Kaur ◽  
Richa Sharma ◽  
Shalli Rani ◽  
Deepali Gupta

Recommender systems were introduced in the mid-1990s to help users choose the right product from the innumerable choices available. The basic idea of a recommender system is to suggest a new item or product to users in place of manual search, because a user who wants to buy a new item is often unsure which item will suit them best and meet the intended requirements. From Google News to Netflix and from Instagram to LinkedIn, recommender systems have spread their roots into almost every application domain. Nowadays, recommender systems are available for every field. This paper gives an overview of recommender systems, recommendation approaches, application areas, and the challenges recommender systems face. Further, we conduct an experiment on an online shoppers' intention dataset to predict shopper behavior using machine learning algorithms. Based on the results, we observe that the Random Forest algorithm performs best, with an ROC value of 93%.
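The reported ROC value is presumably the area under the ROC curve (AUC): the probability that the classifier scores a random positive example (a buyer) above a random negative one. A minimal sketch of that metric, computed directly from predicted scores (the data below are invented, not the paper's):

```python
def roc_auc(labels, scores):
    """Area under the ROC curve: probability that a random positive
    example is scored higher than a random negative example
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy scores from a hypothetical purchase-intention classifier.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.2, 0.8, 0.25, 0.3, 0.1]
print(roc_auc(y_true, y_score))
```

One mis-ranked pair out of nine gives an AUC of 8/9 here; a value of 0.93 means roughly 93% of buyer/non-buyer pairs are ranked correctly.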

Author(s):  
Ch. Veena ◽  
B. Vijaya Babu

Recommender systems have proven to be a valuable way to suggest information items such as books, videos, and songs to online users. Collaborative filtering methods make predictions from historical data. In this paper we introduce Apache Mahout, an open-source library that provides a rich set of components for constructing a customized recommender system from a selection of machine learning algorithms [12]. This paper also addresses challenges in collaborative filtering such as scalability and data sparsity. To deal with scalability problems, we use a distributed framework, Hadoop. We then present a customized user-based recommender system.
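Mahout implements this in Java on Hadoop; the user-based collaborative filtering idea itself can be sketched in plain Python (toy data and names, not Mahout's API): a prediction is a similarity-weighted average of other users' ratings for the target item.

```python
import math

def cosine(u, v):
    """Cosine similarity between two users over co-rated items only."""
    common = [i for i in u if i in v]
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(u[i] ** 2 for i in common))
    nv = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (nu * nv)

def predict(ratings, user, item):
    """Similarity-weighted average of neighbours' ratings for `item`."""
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or item not in r:
            continue
        s = cosine(ratings[user], r)
        num += s * r[item]
        den += abs(s)
    return num / den if den else None

# Hypothetical user-item rating matrix (1-5 stars).
ratings = {
    "alice": {"book": 5, "song": 3},
    "bob":   {"book": 4, "song": 2, "video": 4},
    "carol": {"book": 1, "video": 2},
}
print(predict(ratings, "alice", "video"))
```

Scalability is exactly what breaks here: the neighbour loop is linear in users per prediction, which is why the paper distributes the computation with Hadoop.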


2020 ◽  
Author(s):  
Srijan Gupta ◽  
Joeran Beel

The advances in the field of Automated Machine Learning (AutoML) have greatly reduced the human effort involved in selecting and optimizing machine learning algorithms. These advances, however, have not yet widely made it into recommender-system libraries. We introduce Auto-CaseRec, a Python framework based on the CaseRec recommender-system library. Auto-CaseRec provides automated algorithm selection and parameter tuning for recommendation algorithms. An initial evaluation of Auto-CaseRec against the baselines shows an average 13.88% improvement in RMSE for the MovieLens 100K dataset and an average 17.95% improvement in RMSE for the Last.fm dataset.
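At its simplest, automated parameter tuning is a search loop that samples hyperparameter configurations and keeps the one with the lowest validation RMSE. This toy sketch shows the idea only; the model and parameter names are stand-ins, not Auto-CaseRec's API:

```python
import random

def rmse(preds, truth):
    """Root-mean-square error between predictions and ground truth."""
    return (sum((p - t) ** 2 for p, t in zip(preds, truth)) / len(truth)) ** 0.5

def toy_model(factor, reg, data):
    """Stand-in for a recommender whose accuracy depends on hyperparameters."""
    return [x * factor / (1 + reg) for x in data]

def random_search(data, truth, trials=200, seed=0):
    """Sample hyperparameters at random; keep the lowest-RMSE configuration."""
    rng = random.Random(seed)
    best = (float("inf"), None)
    for _ in range(trials):
        cfg = {"factor": rng.uniform(0.5, 1.5), "reg": rng.uniform(0.0, 0.5)}
        err = rmse(toy_model(cfg["factor"], cfg["reg"], data), truth)
        best = min(best, (err, tuple(sorted(cfg.items()))))
    return best

data = [1.0, 2.0, 3.0]
truth = [1.0, 2.0, 3.0]   # perfect whenever factor/(1+reg) == 1
err, cfg = random_search(data, truth)
print(err, cfg)
```

Real AutoML frameworks replace blind random sampling with smarter strategies (e.g. Bayesian optimization) and search over algorithm choice as well as parameters.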


2021 ◽  
pp. 1-14
Author(s):  
Panagiotis Giannopoulos ◽  
Georgios Kournetas ◽  
Nikos Karacapilidis

Recommender systems are a highly applicable subclass of information filtering systems, aiming to provide users with personalized item suggestions. These systems build on collaborative filtering and content-based methods to overcome the information overload problem. Hybrid recommender systems combine the abovementioned methods and have generally proven more efficient than the classical approaches. In this paper, we propose a novel approach to developing a hybrid recommender system that can make recommendations under the limitation of processing small amounts of data with strong intercorrelation. The proposed hybrid solution integrates Machine Learning and Multi-Criteria Decision Analysis algorithms. The experimental evaluation of the proposed solution indicates that it performs better than widely used machine learning algorithms such as k-Nearest Neighbors and Decision Trees.
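The abstract does not name the specific MCDA algorithm, so as an illustrative sketch only: the simplest MCDA aggregation, a weighted sum over normalized criterion scores, could combine content-based and collaborative scores like this (all item names, scores, and weights are hypothetical):

```python
def weighted_sum_rank(items, weights):
    """Rank items by a weighted sum of normalized criterion scores,
    the simplest MCDA aggregation; weights are assumed to sum to 1."""
    def score(criteria):
        return sum(weights[c] * criteria[c] for c in weights)
    return sorted(items, key=lambda kv: score(kv[1]), reverse=True)

# Hypothetical items scored on two criteria, both already scaled to [0, 1]:
# a content-based match score and a collaborative-filtering score.
items = {
    "item_a": {"content": 0.9, "collab": 0.2},
    "item_b": {"content": 0.6, "collab": 0.7},
    "item_c": {"content": 0.3, "collab": 0.4},
}
weights = {"content": 0.4, "collab": 0.6}
ranking = weighted_sum_rank(items.items(), weights)
print([name for name, _ in ranking])
```

With little data, the criterion weights carry information that a purely learned model could not extract, which is the appeal of the hybrid design.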


Author(s):  
Kristo Radion Purba ◽  
David Asirvatham ◽  
Raja Kumar Murugesan

On Instagram, the number of followers is a common success indicator, so follower-selling services have become a significant part of the market. Influencers' accounts become inflated with fake followers, causing business owners to pay more than they should for brand endorsements. Identifying fake followers is therefore important for determining the authenticity of an influencer. This research aims to identify fake users' behavior and proposes supervised machine learning models to classify authentic and fake users. The dataset contains fake users bought from various sources, alongside authentic users. Seventeen features are used, drawn from these sources: 6 metadata, 3 media info, 2 engagement, 2 media tags, and 4 media similarity. Five machine learning algorithms were tested. Three classification approaches are proposed: 2-class classification, 4-class classification, and classification using metadata only. The Random Forest algorithm produces the highest accuracy for both the 2-class (authentic, fake) and the 4-class (authentic, active fake user, inactive fake user, spammer) classification, with accuracy up to 91.76%. The results also show that the five metadata variables, namely number of posts, followers, biography length, following, and link availability, are the strongest predictors of user class. Additionally, descriptive statistics reveal noticeable differences between fake and authentic users.
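To illustrate why metadata features predict user class so well: even a single well-chosen threshold can separate the toy examples below. This is a decision stump, the one-split building block of tree ensembles, not the paper's Random Forest, and the data are invented:

```python
def best_stump(samples, labels):
    """Fit a one-feature threshold classifier (decision stump) by
    exhaustive search; returns (accuracy, feature_index, threshold)."""
    n_features = len(samples[0])
    best = (0.0, None, None)
    for f in range(n_features):
        for t in sorted({s[f] for s in samples}):
            preds = [1 if s[f] >= t else 0 for s in samples]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            acc = max(acc, 1 - acc)   # allow flipped polarity
            best = max(best, (acc, f, t))
    return best

# Toy metadata rows: [posts, followers, following]; label 1 = authentic.
# Fake accounts here post little and follow many, mimicking the pattern
# the paper reports for its metadata variables.
X = [[120, 5000, 300], [3, 40, 2000], [80, 900, 150], [1, 10, 4000]]
y = [1, 0, 1, 0]
acc, feat, thr = best_stump(X, y)
print(acc, feat, thr)
```

A Random Forest aggregates many deeper trees over bootstrapped samples, which is what lifts accuracy to the reported 91.76% on real, noisy data.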


Author(s):  
Mustafa Berkant Selek ◽  
Sude Pehlivan ◽  
Yalcin Isler

Cardiovascular diseases, which involve heart and blood-vessel dysfunction, cause more deaths than any other disease in the world. Throughout history, many approaches have been developed to analyze cardiovascular health by diagnosing such conditions. One methodology, called phonocardiography, records and analyzes heart sounds to distinguish normal from abnormal functioning of the heart. With the emergence of machine learning applications in healthcare, this process can be automated by extracting various features from phonocardiography signals and classifying them with machine learning algorithms. Many studies have extracted time- and frequency-domain features from phonocardiography signals by first segmenting them into individual heart cycles and then classifying them using different machine learning and deep learning approaches. In this study, various time- and frequency-domain features were extracted from the complete signal rather than from segments of it. The Random Forest algorithm was found to be the most successful in terms of accuracy as well as recall and precision.
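Working on the complete signal means the features are global statistics rather than per-cycle measurements. A sketch of three such features, with a synthetic sine wave standing in for a heart-sound recording (these are generic examples, not the paper's feature set):

```python
import cmath
import math

def extract_features(signal, fs):
    """Whole-signal features: RMS energy, zero-crossing rate, and the
    dominant frequency from a discrete Fourier transform (no segmentation)."""
    n = len(signal)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    zcr = sum(signal[i] * signal[i + 1] < 0 for i in range(n - 1)) / (n - 1)
    # Naive O(n^2) DFT magnitudes for bins 1 .. n/2 - 1 (fine for a demo).
    mags = [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(1, n // 2)]
    dom = (mags.index(max(mags)) + 1) * fs / n
    return {"rms": rms, "zcr": zcr, "dominant_hz": dom}

# Synthetic 5 Hz tone sampled at 100 Hz stands in for a heart sound.
fs = 100
sig = [math.sin(2 * math.pi * 5 * t / fs) for t in range(fs)]
feats = extract_features(sig, fs)
print(feats)
```

Such a feature vector, computed once per recording, is what a Random Forest would consume, avoiding the error-prone heart-cycle segmentation step entirely.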


2020 ◽  
Vol 10 (5) ◽  
pp. 1797 ◽  
Author(s):  
Mera Kartika Delimayanti ◽  
Bedy Purnama ◽  
Ngoc Giang Nguyen ◽  
Mohammad Reza Faisal ◽  
Kunti Robiatul Mahmudah ◽  
...  

Manual classification of sleep stages is a time-consuming but necessary step in the diagnosis and treatment of sleep disorders, and its automation has been an area of active study. Previous work has applied low-dimensional fast Fourier transform (FFT) features with many machine learning algorithms. In this paper, we demonstrate that features extracted from EEG signals via the FFT improve the performance of automated sleep stage classification with machine learning methods. Unlike previous work using the FFT, we incorporated thousands of FFT features in order to classify the sleep stages into 2–6 classes. Using the expanded version of the Sleep-EDF dataset with 61 recordings, our method outperformed other state-of-the-art methods. This result indicates that high-dimensional FFT features combined with simple feature selection are effective for improving automated sleep stage classification.
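A simple filter-style selector over a high-dimensional FFT feature matrix can be sketched as follows. Variance ranking is used purely as a stand-in, since the abstract does not name the selector, and the matrix below is a toy:

```python
def variance(col):
    """Population variance of one feature column."""
    m = sum(col) / len(col)
    return sum((x - m) ** 2 for x in col) / len(col)

def select_top_k(X, k):
    """Filter-style feature selection: keep the k columns with the highest
    variance across samples; returns kept indices and the reduced matrix."""
    n_cols = len(X[0])
    scores = [(variance([row[j] for row in X]), j) for j in range(n_cols)]
    keep = sorted(j for _, j in sorted(scores, reverse=True)[:k])
    return keep, [[row[j] for j in keep] for row in X]

# Toy matrix of FFT magnitudes: 4 epochs x 5 frequency bins; the constant
# bins carry no class information and are dropped.
X = [[0.1, 5.0, 0.1, 2.0, 0.1],
     [0.1, 1.0, 0.1, 8.0, 0.1],
     [0.1, 4.0, 0.1, 1.0, 0.1],
     [0.1, 2.0, 0.1, 9.0, 0.1]]
keep, X_reduced = select_top_k(X, 2)
print(keep, X_reduced)
```

With thousands of FFT bins per epoch, a cheap filter like this prunes uninformative dimensions before the classifier ever sees them.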


2021 ◽  
Vol 9 (5) ◽  
pp. 1034
Author(s):  
Carlos Sabater ◽  
Lorena Ruiz ◽  
Abelardo Margolles

This study aimed to recover metagenome-assembled genomes (MAGs) from human fecal samples to characterize the glycosidase profiles of Bifidobacterium species exposed to different prebiotic oligosaccharides (galacto-oligosaccharides, fructo-oligosaccharides and human milk oligosaccharides, HMOs) as well as high-fiber diets. A total of 1806 MAGs were recovered from 487 infant and adult metagenomes. Unsupervised and supervised classification of the glycosidases encoded in the MAGs using machine-learning algorithms made it possible to establish characteristic hydrolytic profiles for B. adolescentis, B. bifidum, B. breve, B. longum and B. pseudocatenulatum, yielding classification rates above 90%. Glycosidase families GH5 44, GH32, and GH110 were characteristic of B. bifidum. The presence or absence of GH1, GH2, GH5 and GH20 was characteristic of B. adolescentis, B. breve and B. pseudocatenulatum, while families GH1 and GH30 were relevant in MAGs from B. longum. These characteristic profiles allowed discrimination of bifidobacteria regardless of prebiotic exposure. Correlation analysis of glycosidase activities suggests strong associations between glycosidase families comprising HMOs-degrading enzymes, which are often found in MAGs from the same species. The mathematical models proposed here may contribute to a better understanding of the carbohydrate metabolism of some common bifidobacteria species and could be extrapolated to other microorganisms of interest in future studies.
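The characteristic profiles can be pictured as presence/absence rules over glycosidase families. The toy matcher below uses two of the family sets named above purely for illustration; the study's actual discrimination was done with trained classifiers, not a lookup table:

```python
# Simplified presence-based profiles, loosely following the families the
# abstract associates with each species (illustrative only).
PROFILES = {
    "B. bifidum": {"GH32", "GH110"},
    "B. longum": {"GH1", "GH30"},
}

def match_species(gh_families):
    """Return the species whose characteristic families are all present in
    the MAG, preferring the largest (most specific) matching profile."""
    hits = [(len(profile), species)
            for species, profile in PROFILES.items()
            if profile <= gh_families]
    return max(hits)[1] if hits else "unknown"

print(match_species({"GH32", "GH110", "GH2"}))
print(match_species({"GH2"}))
```

A supervised classifier generalizes this idea: it learns which combinations of family presence/absence best separate the species, rather than requiring exact rule matches.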


2021 ◽  
Vol 13 (9) ◽  
pp. 4728
Author(s):  
Zinhle Mashaba-Munghemezulu ◽  
George Johannes Chirima ◽  
Cilence Munghemezulu

Rural communities rely on smallholder maize farms for subsistence agriculture, the main driver of local economic activity and food security. However, planted-area estimates for these farms are unknown in most developing countries. This study explores the use of Sentinel-1 and Sentinel-2 data to map smallholder maize farms. The random forest (RF) and support vector machine (SVM) learning algorithms, as well as model stacking (ST), were applied. Results show that classifying the combined Sentinel-1 and Sentinel-2 data improved the RF, SVM, and ST algorithms by 24.2%, 8.7%, and 9.1%, respectively, compared with classifying the Sentinel-1 data alone. Similarities in the estimated areas (7001.35 ± 1.2 ha for RF, 7926.03 ± 0.7 ha for SVM, and 7099.59 ± 0.8 ha for ST) show that machine learning can estimate smallholder maize areas with high accuracy. The study concludes that single-date Sentinel-1 data alone were insufficient to map smallholder maize farms, but single-date Sentinel-1 data combined with Sentinel-2 data were sufficient. These results can support the generation and validation of national crop statistics, thus contributing to food security.
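Model stacking combines the outputs of base models through a meta-learner. The sketch below is a minimal stand-in: accuracy-weighted voting replaces a trained meta-classifier, and the per-pixel predictions are invented:

```python
def fit_weights(base_outputs, labels):
    """Weight each base model by its accuracy on labelled data, then
    normalize. A real stack trains a meta-classifier on these outputs
    (ideally held-out ones) instead of simple accuracy weighting."""
    accs = [sum(p == y for p, y in zip(col, labels)) / len(labels)
            for col in base_outputs]
    total = sum(accs)
    return [a / total for a in accs]

def stack_predict(base_preds, weights):
    """Weighted vote over base-model predictions for one pixel."""
    score = sum(w * p for w, p in zip(weights, base_preds))
    return 1 if score >= 0.5 else 0

# Toy per-pixel maize(1)/non-maize(0) predictions from two base models,
# e.g. one trained on Sentinel-1 and one on Sentinel-2 features.
s1_preds = [1, 0, 1, 1, 0]   # weaker model
s2_preds = [1, 0, 0, 1, 1]   # stronger model
labels   = [1, 0, 0, 1, 1]
w = fit_weights([s1_preds, s2_preds], labels)
stacked = [stack_predict(p, w) for p in zip(s1_preds, s2_preds)]
print(w, stacked)
```

The stronger base model dominates the vote, which is the intuition behind stacking: the meta-level learns how much to trust each sensor-specific classifier.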


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 126-127
Author(s):  
Lucas S Lopes ◽  
Christine F Baes ◽  
Dan Tulpan ◽  
Luis Artur Loyola Chardulo ◽  
Otavio Machado Neto ◽  
...  

Abstract The aim of this project is to compare some state-of-the-art machine learning algorithms on the classification of steers finished in feedlots, based on performance, carcass, and meat quality traits. Precise classification of animals allows fast, real-time decision making in the animal food industry, such as culling or retention of herd animals. Beef production presents high variability in its numerous carcass and beef quality traits. Machine learning algorithms and software provide an opportunity to evaluate the interactions between traits to better classify animals. Four treatment levels of wet distiller's grain were applied to 97 Angus-Nellore animals and used as features for the classification problem. The C4.5 decision tree, Naïve Bayes (NB), Random Forest (RF), and Multilayer Perceptron (MLP) artificial neural network algorithms were used to predict and classify the animals based on recorded trait measurements, including initial and final weights, shear force, and meat color. The top-performing classifier was the C4.5 decision tree, with a classification accuracy of 96.90%, while the RF, MLP, and NB classifiers had accuracies of 55.67%, 39.17%, and 29.89%, respectively. We observed that the final decision tree model constructed with C4.5 selected only the dry matter intake (DMI) feature as a differentiator. When DMI was removed, no other feature or combination of features was strong enough to provide good prediction accuracy for any of the classifiers. In a follow-up study with a significantly larger sample size, we plan to investigate why DMI is a more relevant parameter than the other measurements.
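C4.5 chooses split features by information gain (strictly, gain ratio), which explains why a single cleanly separating feature like DMI dominates the tree. A sketch of information gain on a binary split, with toy numbers rather than the study's data:

```python
import math

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    if not labels:
        return 0.0
    n = len(labels)
    return -sum((c / n) * math.log2(c / n)
                for c in (labels.count(v) for v in set(labels)))

def info_gain(values, labels, threshold):
    """Information gain of splitting a numeric feature at `threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder

# Toy data: a feature that separates the classes cleanly (like DMI did)
# gains a full bit; a feature uncorrelated with class gains nothing.
classes = [0, 0, 1, 1]
dmi =     [7.1, 7.4, 9.0, 9.3]
weight =  [480, 520, 500, 470]
print(info_gain(dmi, classes, 8.0), info_gain(weight, classes, 490))
```

A gain of 1.0 bit versus 0.0 mirrors the study's finding: once the one high-gain feature is removed, no remaining split reduces class uncertainty much.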


Proceedings ◽  
2020 ◽  
Vol 70 (1) ◽  
pp. 109
Author(s):  
Jimy Oblitas ◽  
Jorge Ruiz

Terahertz time-domain spectroscopy is a useful technique for determining some physical characteristics of materials; it is based on selective frequency absorption of a broad-spectrum electromagnetic pulse. To investigate the potential of this technology for classifying cocoa percentages in chocolate, the terahertz spectra (0.5–10 THz) of five chocolate samples (50%, 60%, 70%, 80%, and 90% cocoa) were examined. The acquired data matrices were analyzed with MATLAB 2019b, from which the dielectric function was obtained along with the absorbance curves, and were classified using 24 mathematical classification models, achieving differentiation of around 93% with a Gaussian SVM model using a kernel scale of 0.35 and a one-against-one multiclass method. It was concluded that the combined processing and classification of data obtained from terahertz time-domain spectroscopy, together with machine learning algorithms, can successfully classify chocolates with different percentages of cocoa.
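In the MATLAB convention (my reading of the kernel-scale parameter), the predictors are divided by the kernel scale s before the Gaussian kernel is applied, so K(x, z) = exp(-||x - z||^2 / s^2). A small scale like 0.35 makes similarity fall off quickly, sharpening boundaries between cocoa classes. A sketch with invented absorbance feature vectors:

```python
import math

def gaussian_kernel(x, z, scale=0.35):
    """Gaussian (RBF) kernel with a MATLAB-style kernel scale: features
    are divided by `scale` before the squared distance is taken."""
    d2 = sum(((a - b) / scale) ** 2 for a, b in zip(x, z))
    return math.exp(-d2)

# Two hypothetical absorbance feature vectors from nearby cocoa classes.
a = [0.30, 0.55]
b = [0.32, 0.50]
print(gaussian_kernel(a, b), gaussian_kernel(a, a))
```

The one-against-one scheme then trains one such SVM per pair of the five cocoa classes (10 binary classifiers) and lets them vote on the final label.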

