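Prediksi Klasifikasi Pembangunan Merek Kosmetik Dengan Metode Enbag K-Logres Berdasarkan Keterlibatan Pengguna Facebook [Classification Prediction of Cosmetic Brand Building with the EnBag K-LoGres Method Based on Facebook User Engagement]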

Abstract: Companies that are just starting out often build their brand through market research, which is meant to introduce new products and sustain the existing business. Market surveys, however, are costly: transportation, brochure printing, additional staff salaries, and so on. Offline surveys also reach a narrower market, yield less complete and less detailed results, and take more time. For these reasons, the researchers used Facebook performance-metric data to assess cosmetic brand building, applying the K-Nearest Neighbor (K-NN) and Logistic Regression (SVM) algorithms to classify which posts consumers found most and least appealing, and then applying the EnBag K-LoGres method to both algorithms to improve their performance. Bagging was chosen because it can improve measurement results and raise classification accuracy by combining two or more algorithms. On the Facebook metric data, K-NN reached an accuracy of 68.67% and Logistic Regression (SVM) 72.67%; processing the two algorithms with the EnBag K-LoGres method gave an accuracy of 73.91%, an increase of 1.24 percentage points over the better single algorithm.

Keywords: Brand Development, Cosmetics, K-Nearest Neighbour, Logistic Regression (SVM), EnBag K-LoGres
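
The abstract above does not say how EnBag K-LoGres combines its two base learners, so the following is only a minimal sketch of one plausible reading: each bagging round draws a bootstrap sample, fits a K-NN and a logistic regression on it, and the final label is a majority vote over all rounds. The function names, the toy data, and the hyperparameters are assumptions made for illustration, not the authors' implementation.

```python
import math
import random
from collections import Counter

def knn_predict(train, x, k=3):
    """Majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def fit_logreg(train, epochs=200, lr=0.1):
    """Fit binary logistic regression with plain stochastic gradient descent."""
    w, b = [0.0] * len(train[0][0]), 0.0
    for _ in range(epochs):
        for x, y in train:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def logreg_predict(model, x):
    """Predict 1 when the linear score is non-negative (probability >= 0.5)."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0

def bagged_predict(train, x, n_rounds=5, seed=0):
    """Assumed combination rule: majority vote over both learners on each bootstrap sample."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_rounds):
        sample = [rng.choice(train) for _ in train]  # draw with replacement
        votes.append(knn_predict(sample, x))
        votes.append(logreg_predict(fit_logreg(sample), x))
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data: two engagement features per post, label 1 = well-liked post.
train = [((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.0, 0.3), 0), ((0.3, 0.0), 0),
         ((0.9, 0.8), 1), ((0.8, 0.9), 1), ((1.0, 0.7), 1), ((0.7, 1.0), 1)]
print(bagged_predict(train, (0.0, 0.0)), bagged_predict(train, (1.0, 1.0)))
```

Voting across resampled fits is what lets a bagged ensemble smooth out the variance of any single fit, which is consistent with the modest gain the abstract reports.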

Book Genre Categorization Using Machine Learning Algorithms (K-Nearest Neighbor, Support Vector Machine and Logistic Regression) using Customized Dataset

International Journal of Computer Science and Mobile Computing ◽

10.47760/ijcsmc.2021.v10i03.002 ◽

2021 ◽

Vol 10 (3) ◽

pp. 14-25

Author(s):

Parilkumar Shiroya

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Logistic Regression ◽

Nearest Neighbor ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor

Predictive Modeling and Analysis of Logistic Regression and k-Nearest Neighbor for Personal Loan Campaign

Proceedings of the 2nd International Conference on ICT for Digital, Smart, and Sustainable Development, ICIDSSD 2020, 27-28 February 2020, Jamia Hamdard, New Delhi, India ◽

10.4108/eai.27-2-2020.2303232 ◽

2021 ◽

Author(s):

Bhavya Alankar ◽

Iftikhar Alam

Keyword(s):

Logistic Regression ◽

Predictive Modeling ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Modeling And Analysis

Behavior Analysis of Illegal Fishing in the Gulf of Mexico

Journal of Homeland Security and Emergency Management ◽

10.1515/jhsem-2016-0017 ◽

2018 ◽

Vol 15 (1) ◽

Author(s):

Ali Pala ◽

Jing Zhang ◽

Jun Zhuang ◽

Nathan Allen

Keyword(s):

Logistic Regression ◽

Gulf Of Mexico ◽

Wave Height ◽

Nearest Neighbor ◽

Moon Phase ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

Illegal Fishing ◽

The Us ◽

K Nearest Neighbor Algorithm

Abstract: Illegal fishing activities in the Gulf of Mexico pose a threat to US national security and damage the economy. The US Coast Guard (USCG) estimates over 1100 incursions by Mexican fishermen into US-regulated waters in the Gulf of Mexico annually. Fishermen cross the water borders to catch red snapper, one of the Gulf of Mexico's signature and most valuable fish. A number of academic contributions have sought to improve the understanding of illegal fishing and to generate better solutions. In this study, we investigate the relationship between illegal fishing activities and environmental factors using one year of historical sighting, weather, and moon phase data. Descriptive analysis provides some interesting insights, such as sighting patterns that depend on wave height, moon phase, and hour of the day. We also develop logistic regression models that show wave height is negatively correlated with sighting occurrences for all sighting types. In addition, we oversample the data, develop two prediction models using logistic regression and the k-nearest neighbor algorithm, and compare their prediction accuracies. The results show that the k-nearest neighbor algorithm performs better in most cases.
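
The abstract mentions oversampling before model fitting but does not name the technique. As a minimal sketch, assuming plain random oversampling (duplicating minority-class rows until the classes are balanced), the step could look like the following; the feature values and the helper name `random_oversample` are hypothetical.

```python
import random
from collections import Counter

def random_oversample(rows, seed=0):
    """Duplicate minority-class rows at random until every class matches the largest one."""
    rng = random.Random(seed)
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append((features, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Top up the smaller classes with randomly repeated rows.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced

# Hypothetical rows: (wave height, moon phase fraction), label 1 = sighting occurred.
rows = [((0.5, 0.1), 1), ((0.4, 0.9), 1)]
rows += [((1.0 + 0.1 * i, 0.5), 0) for i in range(6)]
balanced = random_oversample(rows)
counts = Counter(label for _, label in balanced)
print(counts[0], counts[1])
```

Oversampling should be applied to the training split only; duplicating rows before the train/test split would leak copies of test rows into training and inflate the measured accuracy.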

IDENTIFIKASI JENIS IKAN MENGGUNAKAN MODEL HYBRID DEEP LEARNING DAN ALGORITMA KLASIFIKASI [Fish Species Identification Using a Hybrid Deep Learning Model and Classification Algorithms]

Sebatik ◽

10.46984/sebatik.v24i2.1057 ◽

2020 ◽

Vol 24 (2) ◽

Author(s):

Anifuddin Azis

Keyword(s):

Neural Networks ◽

Support Vector Machine ◽

Logistic Regression ◽

Deep Learning ◽

Random Forest ◽

Decision Tree ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Output

Indonesia is the country with the second-largest biodiversity in the world, after Brazil. It has about 25,000 plant species and 400,000 species of animals and fish. An estimated 8,500 fish species live in Indonesian waters, or 45% of the species in the world, of which roughly 7,000 are marine species. Determining the number of these species requires expertise in taxonomy. In practice, identifying a fish species is not easy, because it requires particular methods and equipment as well as taxonomic literature. Automated processing of video and images of aquatic ecosystems is beginning to be developed. In that work, detecting and identifying fish species is a greater challenge than detecting and identifying other objects. Deep learning methods, which have been successful at classifying objects in images, can analyze the data directly without a dedicated feature-extraction step. Such systems have parameters, or weights, that act both as feature extractors and as classifiers. The processed data produce an output that is expected to be as close as possible to the true output data. A CNN is a deep learning architecture that can reduce the dimensionality of the data without losing its characteristics or features. This study develops a hybrid model that uses a CNN (Convolutional Neural Network) to extract features and several classification algorithms to identify fish species. The classification algorithms used in this study are Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree, K-Nearest Neighbor (KNN), Random Forest, and Backpropagation.

ANALISA SENTIMEN PADA TINJAUAN BUKU DENGAN ALGORITMA K-NEAREST NEIGHBOUR [Sentiment Analysis of Book Reviews with the K-Nearest Neighbour Algorithm]

KONVERGENSI ◽

10.30996/konv.v13i2.2758 ◽

2019 ◽

Vol 13 (2) ◽

Author(s):

Luvia Friska Narulita

Keyword(s):

Nearest Neighbor ◽

Nearest Neighbour ◽

K Nearest Neighbor ◽

Term Frequency

Sentiment analysis of book reviews can be used to classify review documents so that they can be divided systematically into positive and negative sentiment. In the research conducted, the k-nearest neighbor method combined with term weighting and similarity computation gave fairly good results. Keywords: sentiment analysis, similarity, k nearest neighbor, term frequency
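
The abstract names three ingredients (term weighting, a similarity measure, and k-nearest neighbor) without giving details. The sketch below shows one common way they fit together, assuming raw term-frequency weights and cosine similarity; the review texts, the labels, and the function names are invented for illustration.

```python
import math
from collections import Counter

def tf_vector(text):
    """Raw term-frequency vector: lowercase whitespace tokens mapped to counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_sentiment(train, text, k=3):
    """Label a review by majority vote among its k most similar training reviews."""
    query = tf_vector(text)
    ranked = sorted(train, key=lambda item: cosine(tf_vector(item[0]), query), reverse=True)
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

# Hypothetical labelled reviews.
train = [
    ("a wonderful and moving story", "positive"),
    ("wonderful characters and a great plot", "positive"),
    ("great story and great writing", "positive"),
    ("boring plot and flat characters", "negative"),
    ("a dull and boring story", "negative"),
    ("flat writing and a dull ending", "negative"),
]
print(knn_sentiment(train, "a wonderful story with a great plot"))
```

Raw counts let frequent function words such as "a" dominate the similarity; a weighting such as TF-IDF is the usual remedy, though the abstract does not say which weighting the study used.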

Booking Prediction Models for Peer-to-peer Accommodation Listings using Logistics Regression, Decision Tree, K-Nearest Neighbor, and Random Forest Classifiers

Journal of Information Systems Engineering and Business Intelligence ◽

10.20473/jisebi.6.2.123-132 ◽

2020 ◽

Vol 6 (2) ◽

pp. 123

Author(s):

Mochammad Agus Afrianto ◽

Meditya Wasesa

Keyword(s):

Random Forest ◽

Decision Tree ◽

Revenue Management ◽

Nearest Neighbor ◽

Prediction Models ◽

Model Development ◽

Peer To Peer ◽

K Nearest Neighbor ◽

Logistics Regression ◽

Roc Score

Background: The literature on peer-to-peer accommodation has focused substantially on the price determinants of accommodation listings. Developing prediction models of the demand for accommodation listings is vital in revenue management, because accurate price and demand forecasts help determine the best revenue management responses. Objective: This study aims to develop prediction models to determine the booking likelihood of accommodation listings. Methods: Using an Airbnb dataset, we developed four machine learning models, namely Logistic Regression, Decision Tree, K-Nearest Neighbor (KNN), and Random Forest classifiers. We assessed the models using the AUC-ROC score and the model development time under the ten-fold three-way split and the ten-fold cross-validation procedures. Results: In terms of average AUC-ROC score, the Random Forest classifier outperformed the other evaluated models. In the three-way split procedure, it had a 15.03% higher AUC-ROC score than Decision Tree, 2.93% higher than KNN, and 2.38% higher than Logistic Regression. In the cross-validation procedure, it had a 26.99% higher AUC-ROC score than Decision Tree, 4.41% higher than KNN, and 3.31% higher than Logistic Regression. The Decision Tree model had the lowest AUC-ROC score but the shortest model development time. Conclusion: Random Forest models performed best in predicting the booking likelihood of accommodation listings. The model can be used by peer-to-peer accommodation owners to improve their revenue management responses.
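
For readers unfamiliar with the AUC-ROC score used to rank the models above, the following minimal sketch computes it from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are made up, and the function name `roc_auc` is an assumption, not something taken from the study.

```python
def roc_auc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # A correctly ordered pair counts 1, a tie counts 0.5, a misordered pair counts 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical booking labels (1 = booked) and model scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(labels, scores))
```

Because the score depends only on how examples are ranked, it is insensitive to the classification threshold, which makes it convenient for comparing classifiers whose raw outputs are on different scales.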

Machine Learning Approach to Differentiation of Peripheral Schwannomas and Neurofibromas: A Multi-Center Study

Neuro-Oncology ◽

10.1093/neuonc/noab211 ◽

2021 ◽

Author(s):

Michael Zhang ◽

Elizabeth Tong ◽

Sam Wong ◽

Forrest Hamrick ◽

Maryam Mohammadzadeh ◽

...

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Clinical Features ◽

Nearest Neighbor ◽

Motor Deficit ◽

Support Vector ◽

Spontaneous Pain ◽

Learning Approaches ◽

Imaging Data ◽

K Nearest Neighbor

Abstract. Background: Non-invasive differentiation between schwannomas and neurofibromas is important for appropriate management, preoperative counseling, and surgical planning, but has proven difficult using conventional imaging. The objective of this study was to develop and evaluate machine learning approaches for differentiating peripheral schwannomas from neurofibromas. Methods: We assembled a cohort of schwannomas and neurofibromas from 3 independent institutions and extracted high-dimensional radiomic features from gadolinium-enhanced, T1-weighted MRI using the PyRadiomics package on the Quantitative Imaging Feature Pipeline. Age, sex, neurogenetic syndrome, spontaneous pain, and motor deficit were recorded. We evaluated the performance of 6 radiomics-based classifier models with and without clinical features and compared model performance against human expert evaluators. Results: 107 schwannomas and 59 neurofibromas were included. The primary models included both clinical and imaging data. The accuracy of the human evaluators (0.765) did not significantly exceed the no-information rate (NIR), whereas the Support Vector Machine (0.929), Logistic Regression (0.929), and Random Forest (0.905) classifiers exceeded the NIR. Using the method of DeLong, the AUCs for the Logistic Regression (AUC=0.923) and K Nearest Neighbor (AUC=0.923) classifiers were significantly greater than that of the human evaluators (AUC=0.766; p = 0.041). Conclusions: The radiomics-based classifiers developed here proved to be more accurate and had a higher AUC on the ROC curve than expert human evaluators. This demonstrates that radiomics using routine MRI sequences and clinical features can aid in the differentiation of peripheral schwannomas and neurofibromas.

Diagnostic Model of in-Hospital Mortality in Patients with Acute ST-Segment Elevation Myocardial Infarction Used Artificial Intelligence Methods : Algorithm Development and Validation (Preprint)

10.2196/preprints.32349 ◽

2021 ◽

Author(s):

Yong Li

Keyword(s):

Artificial Intelligence ◽

Logistic Regression ◽

Hospital Mortality ◽

Nearest Neighbor ◽

Classification Model ◽

Diagnostic Model ◽

K Nearest Neighbor ◽

Nearest Neighbor Classification ◽

Artificial Intelligence Methods ◽

Neighbor Classification

BACKGROUND: Preventing in-hospital mortality in patients with ST-segment elevation myocardial infarction (STEMI) is a crucial step. OBJECTIVE: The objective of our research was to develop and externally validate a diagnostic model of in-hospital mortality in acute STEMI patients using artificial intelligence methods. METHODS: Because our datasets were highly imbalanced, we evaluated the effect of down-sampling. Down-sampling was applied to the original dataset to create one balanced dataset, which yielded two datasets: the original and the down-sampled. We non-randomly divided one American population into a training set and a test set, and used another American population as the validation set. We used artificial intelligence methods to develop and externally validate the diagnostic model of in-hospital mortality in acute STEMI patients, including logistic regression, decision tree, extreme gradient boosting (XGBoost), K nearest neighbor classification, and multi-layer perceptron. We used the confusion matrix combined with the area under the receiver operating characteristic curve (AUC) to evaluate the strengths and weaknesses of these models. RESULTS: The strongest predictors of in-hospital mortality were age, female sex, cardiogenic shock, atrial fibrillation (AF), ventricular fibrillation (VF), in-hospital bleeding, and medical history such as hypertension and old myocardial infarction. The F2 scores of logistic regression in the training set, the test set, and the validation set were 0.7, 0.7, and 0.54, respectively. The F2 scores of XGBoost were 0.74, 0.52, and 0.54, respectively. The F2 scores of the decision tree were 0.72, 0.51, and 0.52, respectively. The F2 scores of the K nearest neighbor classification model were 0.64, 0.47, and 0.49, respectively. The F2 scores of the multi-layer perceptron were 0.71, 0.54, and 0.54, respectively. The AUCs of logistic regression in the training set, the test set, and the validation set were 0.72, 0.73, and 0.76, respectively. The AUCs of XGBoost were 0.75, 0.73, and 0.75, respectively. The AUCs of the decision tree were 0.75, 0.71, and 0.74, respectively. The AUCs of the K nearest neighbor classification model were 0.71, 0.69, and 0.72, respectively. The AUCs of the multi-layer perceptron were 0.73, 0.74, and 0.75, respectively. The diagnostic model built by logistic regression was the best. CONCLUSIONS: The strongest predictors of in-hospital mortality were age, female sex, cardiogenic shock, AF, VF, in-hospital bleeding, and medical history such as hypertension and old myocardial infarction. We used artificial intelligence methods to develop and externally validate a diagnostic model of in-hospital mortality in acute STEMI patients. The diagnostic model built by logistic regression was the best. CLINICALTRIAL: We registered this study with the WHO International Clinical Trials Registry Platform (ICTRP) (registration number: ChiCTR1900027129; registered date: 1 November 2019). http://www.chictr.org.cn/edit.aspx?pid=44888&htm=4.
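
The F2 score reported above is the F-beta score with beta = 2, which weights recall more heavily than precision; that is a natural choice when a missed death (false negative) is costlier than a false alarm. A minimal sketch of the standard formula from confusion-matrix counts follows. The counts in the example are invented and do not come from the study.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta from confusion-matrix counts; beta > 1 favours recall over precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Equivalent to (1 + b2) * P * R / (b2 * P + R) with P = tp/(tp+fp), R = tp/(tp+fn).
    return (1 + b2) * tp / denom if denom else 0.0

# Hypothetical counts: 8 true positives, 4 false positives, 2 false negatives.
f2 = f_beta(tp=8, fp=4, fn=2)
f1 = f_beta(tp=8, fp=4, fn=2, beta=1.0)
print(round(f2, 3), round(f1, 3))
```

With these counts recall (0.8) exceeds precision (about 0.667), so the F2 value comes out higher than the F1 value for the same predictions.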

Image Classification of Tourist Attractions with K-Nearest Neighbor, Logistic Regression, Random Forest, and Support Vector Machine

International Journal on Advanced Science Engineering and Information Technology ◽

10.18517/ijaseit.10.6.9098 ◽

2020 ◽

Vol 10 (6) ◽

pp. 2207

Author(s):

Herry Sujaini

Keyword(s):

Support Vector Machine ◽

Logistic Regression ◽

Random Forest ◽

Image Classification ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Tourist Attractions

Exploring the determinants of and predicting the helpfulness of online user reviews using decision trees

Management Decision ◽

10.1108/md-06-2016-0398 ◽

2017 ◽

Vol 55 (4) ◽

pp. 681-700 ◽

Cited By ~ 7

Author(s):

Sangjae Lee ◽

Joon Yeon Choeh

Keyword(s):

Decision Tree ◽

Nearest Neighbor ◽

Decision Rules ◽

Online Reviews ◽

K Nearest Neighbor ◽

Product Data ◽

Content Type ◽

Logistics Regression ◽

Review Helpfulness ◽

High Level

Purpose: The purpose of this paper is to suggest important determinants of helpfulness drawn from the reviews' product data, review characteristics, and textual characteristics, and to identify the more crucial factors among these determinants by using statistical methods. Furthermore, this study proposes a classification-based review recommender using a decision tree (CRDT) to identify and recommend reviews that have a high level of helpfulness. Design/methodology/approach: This study used publicly available data from Amazon.com to construct measures of determinants and helpfulness. The authors collected data about economic transactions on Amazon.com and analyzed the associated review system. The final sample included 10,000 reviews, composed of 4,799 helpful and 5,201 not helpful reviews. Findings: The study selected the more crucial determinants from a comprehensive group of product, reviewer, and textual characteristics by using a t-test and logistic regression. The five variables found to be significant in both the t-test and the logistic regression analysis were the total number of reviews for the product, the reviewer's history macro, the reviewer's rank, the disclosure of the reviewer's name, and the length of the review in words. The decision tree method produced decision rules for determining helpfulness from the values of the product data, review characteristics, and textual characteristics. CRDT had a lower prediction error than the k-nearest neighbor (kNN) method and linear multivariate discriminant analysis. CRDT can suggest better determinants that have a greater effect on the degree of helpfulness. Practical implications: The important factors suggested as affecting review helpfulness should be considered in the design of websites, as online retail sites with more helpful reviews can provide greater potential value to customers. The results of the study suggest that managers and marketers can better understand customers' reviews and increase the value to customers by providing enhanced diagnosticity to consumers. Originality/value: This study differs from previous studies in that it investigated the determinants holistically, that is, product, review, and textual characteristics for classifying helpful reviews, and selected the more crucial determinants from this comprehensive group by using a t-test and logistic regression. This study used a decision tree, which has rarely been used in predicting review helpfulness, to provide rules for identifying helpful online reviews.
