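Pengembangan Sistem Penilaian Kematangan Tandan Buah Segar Kelapa Sawit menggunakan Citra 680 dan 750 NM [Development of a Maturity Grading System for Oil Palm Fresh Fruit Bunches Using 680 and 750 nm Images]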

One of the main problems in the palm oil industry is the grading of Fresh Fruit Bunches (FFB) at palm oil mills. The parameter used for grading is the number of fruitlets detached from the bunch. At present, grading is carried out by human graders, whose assessments are subjective and often inconsistent because of the limits of human vision and of the ability to count detached fruitlets per FFB in a very limited time. This study therefore developed a system that assesses FFB maturity based on spectroscopy and image contrast values. According to the literature, visible and near-infrared (NIR) light can be used as light sources to detect the maturity level of FFB. The light sources in this study are Light-Emitting Diode (LED) lamps with wavelengths of 680 and 750 nm, and the FFB images are acquired with a modified DSLR camera, yielding two images per FFB, one at 680 nm and one at 750 nm. The contrast value of each image is then calculated. The training data consist of 24 FFB (10 ripe and 14 unripe), and the test data consist of 77 FFB (38 ripe and 39 unripe). A Support Vector Machine (SVM) is used as the classification method. The accuracy is 66.67% on the training data and 57.14% on the test data. These results still need improvement; future work will focus on raising the accuracy of the system by increasing the amount of training and test data and by using machine learning.
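
The abstract does not say how image contrast is computed. As a minimal sketch, assuming root-mean-square (RMS) contrast on 8-bit grayscale images stored as nested lists, the two contrast values that would feed the classifier could be obtained as follows. The function names and the toy images are hypothetical and are not taken from the paper.

```python
from statistics import pstdev


def rms_contrast(image):
    """RMS contrast: population standard deviation of pixel intensities
    after scaling 8-bit values to the range [0, 1] (an assumed definition)."""
    pixels = [pixel / 255.0 for row in image for pixel in row]
    return pstdev(pixels)


def contrast_features(image_680, image_750):
    """Return one contrast value per wavelength as a 2-element feature tuple."""
    return (rms_contrast(image_680), rms_contrast(image_750))


# Toy 2x2 images standing in for the 680 nm and 750 nm captures.
flat = [[128, 128], [128, 128]]   # uniform image, no contrast
checker = [[0, 255], [255, 0]]    # maximal alternation
features = contrast_features(flat, checker)
```

Each FFB would then be represented by a two-element feature vector, which is small enough that the limited training set (24 bunches) is the main constraint rather than the feature dimension.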

Detection Of Spam Comments On Instagram Using Complementary Naïve Bayes

IJCCS (Indonesian Journal of Computing and Cybernetics Systems) ◽

10.22146/ijccs.47046 ◽

2019 ◽

Vol 13 (3) ◽

pp. 263

Author(s):

Nur Azizul Haqimi ◽

Nur Rokhman ◽

Sigit Priyanta

Keyword(s):

Social Media ◽

Support Vector Machine ◽

Test Data ◽

Training Data ◽

Classification Method ◽

Support Vector ◽

Test Results ◽

Imbalanced Dataset ◽

Web Based ◽

F Measure

Instagram (IG) is a web-based and mobile social media application where users can share photos or videos using the available features. Uploaded photos or videos carry captions that explain the content, and these posts can attract spam comments, that is, comments that are not relevant to the caption or the photo. A problem that arises when identifying spam is that non-spam comments far outnumber spam comments, which leads to an imbalanced dataset. An imbalanced dataset can influence the performance of a classification method. This research therefore focuses on applying the Complementary Naïve Bayes (CNB) method to imbalanced datasets for the detection of Instagram spam comments. The study used TF-IDF weighting, with a Support Vector Machine (SVM) as the comparison classifier. In tests with 2500 training data and 100 test data on the imbalanced dataset (25% spam and 75% non-spam), CNB achieved 92% accuracy, 86% precision, and 93% F-measure, whereas SVM achieved 87% accuracy, 79% precision, and 88% F-measure. In conclusion, the CNB method is more suitable for detecting spam comments when the dataset is imbalanced.
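
To see why accuracy alone is not enough on an imbalanced dataset, the sketch below computes accuracy, precision, recall, and F-measure from confusion-matrix counts. The counts are hypothetical, chosen only to match the 25% spam and 75% non-spam split of a 100-comment test set; they are not the paper's results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary metrics, treating spam as the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical baseline that labels every comment as non-spam.
baseline = classification_metrics(tp=0, fp=0, fn=25, tn=75)

# Hypothetical classifier that finds 20 of the 25 spam comments.
model = classification_metrics(tp=20, fp=5, fn=5, tn=70)
```

The baseline reaches 75% accuracy while detecting no spam at all, which is why precision and F-measure are reported alongside accuracy.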

Classification with imperfect training labels

Biometrika ◽

10.1093/biomet/asaa011 ◽

2020 ◽

Vol 107 (2) ◽

pp. 311-330 ◽

Cited By ~ 2

Author(s):

Timothy I Cannings ◽

Yingying Fan ◽

Richard J Samworth

Keyword(s):

Support Vector Machine ◽

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

General Setting ◽

Excess Risk ◽

Training Data ◽

Training Dataset ◽

Support Vector ◽

Nearest Neighbour ◽

Linear Discriminant

Summary: We study the effect of imperfect training data labels on the performance of classification methods. In a general setting, where the probability that an observation in the training dataset is mislabelled may depend on both the feature vector and the true label, we bound the excess risk of an arbitrary classifier trained with imperfect labels in terms of its excess risk for predicting a noisy label. This reveals conditions under which a classifier trained with imperfect labels remains consistent for classifying uncorrupted test data points. Furthermore, under stronger conditions, we derive detailed asymptotic properties for the popular $k$-nearest neighbour, support vector machine and linear discriminant analysis classifiers. One consequence of these results is that the $k$-nearest neighbour and support vector machine classifiers are robust to imperfect training labels, in the sense that the rate of convergence of the excess risk of these classifiers remains unchanged; in fact, our theoretical and empirical results even show that in some cases, imperfect labels may improve the performance of these methods. The linear discriminant analysis classifier is shown to be typically inconsistent in the presence of label noise unless the prior probabilities of the classes are equal. Our theoretical results are supported by a simulation study.
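
A toy illustration of the robustness idea, not a reproduction of the paper's asymptotic results: the sketch below builds two well-separated one-dimensional classes, flips every fifth training label (20% label noise, applied deterministically so the outcome is reproducible), and compares a 1-nearest-neighbour rule with a 5-nearest-neighbour majority vote on clean test points. All data and function names are hypothetical.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (1-D features)."""
    neighbours = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def make_noisy_training_set():
    """Class 0 at x = 0, 10, ..., 190; class 1 at x = 300, 310, ..., 490.
    Every fifth point in each class carries a flipped (wrong) label."""
    train = []
    for true_label, offset in ((0, 0), (1, 300)):
        for i in range(20):
            observed = 1 - true_label if i % 5 == 0 else true_label
            train.append((offset + 10 * i, observed))
    return train


def accuracy(train, k):
    """Accuracy on clean test points placed 3 units right of each training point."""
    tests = [
        (offset + 10 * i + 3, label)
        for label, offset in ((0, 0), (1, 300))
        for i in range(20)
    ]
    correct = sum(knn_predict(train, x, k) == label for x, label in tests)
    return correct / len(tests)


train = make_noisy_training_set()
acc_k1 = accuracy(train, k=1)  # each test point copies its nearest, possibly flipped, label
acc_k5 = accuracy(train, k=5)  # voting over 5 neighbours outvotes the single flipped label
```

In this construction any five consecutive training points contain exactly one flipped label, so the majority vote recovers the true class everywhere, while the single-neighbour rule inherits the noise rate.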

ABC-Gly: identifying protein lysine glycation sites with artificial bee colony algorithm

Current Proteomics ◽

10.2174/1570164617666191227120136 ◽

2019 ◽

Vol 17 ◽

Author(s):

Yanqiu Yao ◽

Xiaosa Zhao ◽

Qiao Ning ◽

Junping Zhou

Keyword(s):

Support Vector Machine ◽

Amino Acid ◽

Artificial Bee Colony Algorithm ◽

Artificial Bee Colony ◽

Training Dataset ◽

Support Vector ◽

Supplementary File ◽

Feature Subset ◽

Lipid Molecule ◽

Bee Colony

Background: Glycation is a nonenzymatic post-translational modification in which a sugar molecule attaches to a protein or lipid molecule. It may impair the function and change the characteristics of proteins, which may lead to metabolic diseases. To help understand the underlying molecular mechanisms of glycation, computational prediction methods have been developed because of their convenience and high speed. However, building a more effective computational tool remains a challenge in computational biology. Methods: In this study, we present an accurate identification tool named ABC-Gly for predicting lysine glycation sites. First, we used three informative features to encode the peptides: position-specific amino acid propensity, secondary structure, and the composition of k-spaced amino acid pairs. Second, to exploit discriminative features and thereby improve the prediction and generalization ability of the model, we developed a two-step feature selection that combines the Fisher score with an improved binary artificial bee colony algorithm based on the support vector machine. Finally, using the optimal feature subset, we built the model with a Support Vector Machine on the training dataset. Results: In 10-fold cross-validation on the training dataset, ABC-Gly achieved a sensitivity of 76.43%, a specificity of 91.10%, a balanced accuracy of 83.76%, an area under the receiver operating characteristic curve (AUC) of 0.9313, and a Matthews correlation coefficient (MCC) of 0.6861; on the independent dataset it achieved a balanced accuracy of 59.05%. Compared with state-of-the-art predictors on the training dataset, the proposed predictor improved the AUC by 0.156 and the MCC by 0.336. Conclusion: The detailed analysis indicates that our predictor may serve as a powerful complementary tool to existing methods for predicting protein lysine glycation. The source code and datasets of ABC-Gly are provided in Supplementary File 1.
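
The reported metrics are related by simple formulas: balanced accuracy is the mean of sensitivity and specificity, and the Matthews correlation coefficient is computed from the four confusion-matrix counts. The sketch below shows both; the function names are hypothetical, and the confusion-matrix counts used for the MCC are illustrative rather than taken from the paper.

```python
import math


def balanced_accuracy(sensitivity, specificity):
    """Mean of the true-positive rate and the true-negative rate."""
    return (sensitivity + specificity) / 2


def matthews_corrcoef(tp, fp, fn, tn):
    """MCC from confusion-matrix counts; returns 0.0 if any marginal is empty."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Sensitivity and specificity as reported for 10-fold cross-validation.
cv_balanced_accuracy = balanced_accuracy(0.7643, 0.9110)

# Illustrative counts only: a perfect and an uninformative classifier.
mcc_perfect = matthews_corrcoef(tp=10, fp=0, fn=0, tn=10)
mcc_random = matthews_corrcoef(tp=5, fp=5, fn=5, tn=5)
```

The computed value of about 0.83765 agrees with the reported balanced accuracy of 83.76% on the training dataset.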

Mie scattering and microparticle-based characterization of heavy metal ions and classification by statistical inference methods

Royal Society Open Science ◽

10.1098/rsos.190001 ◽

2019 ◽

Vol 6 (5) ◽

pp. 190001 ◽

Cited By ~ 1

Author(s):

Katherine E. Klug ◽

Christian M. Jennings ◽

Nicholas Lytal ◽

Lingling An ◽

Jeong-Yeol Yoon

Keyword(s):

Heavy Metal ◽

Support Vector Machine ◽

Metal Ions ◽

Heavy Metal Ions ◽

Mie Scattering ◽

Training Data ◽

Scattering Data ◽

Support Vector ◽

Linear Discriminant ◽

Machine Analysis

A straightforward method for classifying heavy metal ions in water is proposed, applying statistical classification and clustering techniques to non-specific microparticle scattering data. A set of carboxylated polystyrene microparticles of sizes 0.91, 0.75 and 0.40 µm was mixed with solutions of nine heavy metal ions and two control cations, and scattering measurements were collected at two angles optimized for scattering from non-aggregated and aggregated particles. Classification of these observations was conducted and compared among several machine learning techniques, including linear discriminant analysis, support vector machine analysis, K-means clustering and K-medians clustering. The highest classification accuracy was found with linear discriminant and support vector machine analysis, each reporting high classification rates for heavy metal ions with respect to the model. This may be attributed to moderate correlation between detection angle and particle size. These classification models provide reasonable discrimination between most ion species, with the highest distinction seen for Pb(II), Cd(II), Ni(II) and Co(II), followed by Fe(II) and Fe(III), potentially due to their known sorption with carboxyl groups. The support vector machine analysis was also applied to three different mixture solutions representing leaching from pipes and mine tailings, and showed good correlation with single-species data, specifically with Pb(II) and Ni(II). With more expansive training data and further processing, this method shows promise for low-cost and portable heavy metal identification and sensing.

Basketball shooting technology based on acceleration sensor fusion motion capture technology

EURASIP Journal on Advances in Signal Processing ◽

10.1186/s13634-021-00731-9 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Binbin Zhao ◽

Shihong Liu

Keyword(s):

Computer Vision ◽

Support Vector Machine ◽

Motion Capture ◽

Visual Recognition ◽

Gaussian Mixture ◽

Human Vision ◽

Support Vector ◽

Background Removal ◽

Graphics Processing ◽

Basketball Shooting

Computer vision recognition refers to using cameras and computers in place of human eyes for tasks such as target recognition, tracking, and measurement, together with further graphics processing that makes images more suitable for human viewing. Addressing the problem of combining basketball shooting technique with visual-recognition motion capture, this article presents research on basketball shooting technique based on computer vision recognition fused with motion capture technology. The proposed approach first applies preprocessing operations such as background removal and filtering-based denoising to the acquired shooting video images to obtain the action features of the people in the video sequence, and then uses a support vector machine (SVM) and a Gaussian mixture model to obtain the features of the objects. Part of the data samples are drawn from the sample set to train the model; after training, the remaining samples are classified and recognized. Simulation tests on the action database and on real shooting video show that the SVM identifies the actions in the shooting video quickly and effectively, with an average recognition accuracy of 95.9%. This verifies the feasibility of applying the technology to the recognition of shooting actions, which is conducive to following up on and improving shooting technique.

Support Vector Machine based Stress Detection System to manage COVID-19 pandemic related stress from ECG signal

AIUB Journal of Science and Engineering (AJSE) ◽

10.53799/ajse.v20i1.112 ◽

2021 ◽

Vol 20 (1) ◽

pp. 8-16

Author(s):

Md Fahim Rizwan ◽

Rayed Farhad ◽

Md. Hasan Imam

Keyword(s):

Support Vector Machine ◽

Detection System ◽

Health Condition ◽

Cardiac Patients ◽

Gaussian Kernel ◽

Training Dataset ◽

Support Vector ◽

Portable Devices ◽

Stress Detection ◽

Specialist Doctor

This study presents a detailed investigation of induced stress detection in humans using Support Vector Machine algorithms. Proper detection of stress can prevent many psychological and physiological problems, such as major depressive disorder (MDD), stress-induced cardiac rhythm abnormalities, or arrhythmia. Stress induced by the COVID-19 pandemic can worsen the condition of cardiac patients and, because of lockdown conditions, cause various abnormalities in otherwise healthy people. Therefore, this paper proposes an ECG-based technique in which the ECG is recorded with the handheld or portable devices that are now common in many countries, allowing people to take an ECG on their own at home and get preliminary information about their cardiac health. From the ECG, the RR interval, QT interval, and EDR (ECG-derived respiration) can be derived and used to develop the stress detection model. To validate the proposed model, an open-access database named "drivedb", available at PhysioNet (physionet.org), was used as the training dataset. After evaluating several SVM models by varying the ECG length, features, and SVM kernel type, the results showed an acceptable level of accuracy for the Fine Gaussian SVM (i.e., 98.3% for 1 min ECG and 93.6% for 5 min ECG) with a Gaussian kernel when using all available features (RR, QT, and EDR). This finding emphasizes the importance of including ventricular polarization and respiratory information in stress detection, and shows that stress can be detected from short recordings (i.e., from 1 min of ECG data). This will be very useful for detecting stress through portable ECG devices under lockdown conditions, so that mental health can be assessed without visiting a specialist doctor at a hospital. The technique can also alert cardiac patients before they become too stressed, which might otherwise cause severe arrhythmogenesis.
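
As a minimal sketch of one of the features named above, the RR interval can be derived from the times of successive R peaks. The example assumes the R peaks have already been detected and are given in seconds; the function names and the peak times are hypothetical and do not come from the drivedb recordings.

```python
def rr_intervals(r_peak_times):
    """Differences between consecutive R-peak times, in seconds."""
    return [later - earlier for earlier, later in zip(r_peak_times, r_peak_times[1:])]


def mean_heart_rate_bpm(r_peak_times):
    """Mean heart rate in beats per minute, derived from the mean RR interval."""
    intervals = rr_intervals(r_peak_times)
    mean_rr = sum(intervals) / len(intervals)
    return 60.0 / mean_rr


# Hypothetical R-peak times (seconds) for a perfectly regular rhythm.
peaks = [0.0, 0.8, 1.6, 2.4, 3.2]
intervals = rr_intervals(peaks)
heart_rate = mean_heart_rate_bpm(peaks)
```

Summary statistics of such intervals over a 1 min or 5 min window would be one plausible way to turn a recording into a fixed-length feature vector for the SVM; the abstract does not state which statistics the authors used.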

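Analisis Sentimen Data Twitter Tentang Pasangan Capres-Cawapres Pemilu 2019 Dengan Metode Lexicon Based Dan Support Vector Machine [Sentiment Analysis of Twitter Data on the Presidential and Vice-Presidential Candidate Pairs in the 2019 Election Using Lexicon-Based and Support Vector Machine Methods]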

Jurnal Ilmiah FIFO ◽

10.22441/fifo.2019.v11i2.004 ◽

2019 ◽

Vol 11 (2) ◽

pp. 144

Author(s):

Danar Wido Seno ◽

Arief Wibowo

Keyword(s):

Social Media ◽

Support Vector Machine ◽

Sentiment Analysis ◽

Vice President ◽

Training Data ◽

Support Vector ◽

New Words ◽

Textual Data ◽

Data Content ◽

Combination Of Methods

The growth of written content on social media produces many new words and abbreviations on Twitter, which makes it increasingly difficult for sentiment analysis to achieve high accuracy on Twitter text. In this study, the authors analyzed sentiment toward the pairs of candidates for President and Vice President of Indonesia in the 2019 elections. To obtain higher accuracy and to accommodate the evolving text on Twitter, the authors combined an unsupervised and a supervised method, namely Lexicon Based and Support Vector Machine. The study used Twitter data from October 2018, collected with search keywords containing the names of each pair of candidates, totaling 800 data items. On these 800 items, the best accuracy was 92.5%, obtained with 80% training data and 20% testing data, with per-class precision between 85.7% and 97.2% and per-class recall between 78.2% and 93.5%. Because the Lexicon Based method labels the dataset, the data for the Support Vector Machine no longer have to be labeled manually, and the lexicon dictionary can be extended as content on Twitter evolves.
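
The lexicon-based labeling step can be sketched as a word-count score: each text is labeled by comparing how many of its tokens appear in a positive word list and how many in a negative one. The tiny English lexicon, the "neutral" fallback class, and the function name below are assumptions made for illustration; the abstract does not list the lexicon or the class set.

```python
# Hypothetical miniature lexicon; a real one would hold thousands of entries.
POSITIVE_WORDS = {"good", "great", "honest"}
NEGATIVE_WORDS = {"bad", "corrupt", "liar"}


def lexicon_label(text):
    """Label a text by the difference between positive and negative word counts."""
    tokens = text.lower().split()
    score = sum(token in POSITIVE_WORDS for token in tokens)
    score -= sum(token in NEGATIVE_WORDS for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Labels produced this way could serve as training labels for a supervised model.
labels = [
    lexicon_label("a great and honest candidate"),
    lexicon_label("corrupt and bad policy"),
    lexicon_label("the debate is tonight"),
]
```

New slang or abbreviations are handled by adding entries to the word sets, which mirrors the point that the lexicon can be extended as Twitter content changes.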

Structural Damage Detection Using Supervised Nonlinear Support Vector Machine

Journal of Composites Science ◽

10.3390/jcs5110303 ◽

2021 ◽

Vol 5 (11) ◽

pp. 303

Author(s):

Kian K. Sepahvand

Keyword(s):

Support Vector Machine ◽

Damage Detection ◽

Structural Damage ◽

Natural Frequencies ◽

Training Data ◽

Support Vector ◽

Lightweight Structures ◽

Straightforward Method ◽

Classification Boundary ◽

Nonlinear Support

Damage detection using vibrational properties, such as eigenfrequencies, is an efficient and straightforward method for detecting damage in structures, components, and machines. The method, however, is very inefficient when the natural frequencies of damaged and undamaged specimens differ only slightly. This is particularly the case with lightweight structures, such as fiber-reinforced composites. The nonlinear support vector machine (SVM) provides enhanced results under such conditions by transforming the original features into a new space or applying a kernel trick. In this work, the natural frequencies of damaged and undamaged components are used for classification, employing the nonlinear SVM. The proposed methodology assumes that the frequencies are identified sequentially from an experimental modal analysis; for the purposes of this study, however, the training data are generated from FEM simulations of damaged and undamaged samples. It is shown that the nonlinear SVM with a kernel function yields a clear classification boundary between damaged and undamaged specimens, even for minor variations in natural frequencies.
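
The kernel trick mentioned above replaces an explicit feature transformation with a similarity function between pairs of samples. The abstract does not name the kernel, so the sketch below assumes the common Gaussian (RBF) kernel; the frequency values and the function name are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical first two natural frequencies (Hz) of two specimens.
undamaged = [100.0, 250.0]
damaged = [99.5, 249.0]

self_similarity = rbf_kernel(undamaged, undamaged)
loose = rbf_kernel(undamaged, damaged, gamma=0.1)   # small gamma: specimens look alike
strict = rbf_kernel(undamaged, damaged, gamma=5.0)  # large gamma: small shifts matter
```

A larger `gamma` makes the kernel value fall off faster with distance, which is what lets a nonlinear SVM separate specimens whose frequencies differ only slightly; in practice the features would usually be scaled and `gamma` chosen by validation.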

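Analisis Sentimen Twitter untuk Teks Berbahasa Indonesia dengan Maximum Entropy dan Support Vector Machine [Twitter Sentiment Analysis for Indonesian-Language Text with Maximum Entropy and Support Vector Machine]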

IJCCS (Indonesian Journal of Computing and Cybernetics Systems) ◽

10.22146/ijccs.3499 ◽

2014 ◽

Vol 8 (1) ◽

pp. 91 ◽

Cited By ~ 5

Author(s):

Noviah Dwi Putranti ◽

Edi Winarko

Keyword(s):

Support Vector Machine ◽

Maximum Entropy ◽

Social Networking Site ◽

Training Data ◽

Classification Model ◽

Support Vector ◽

Public Sentiment ◽

Pos Tagger ◽

Negative Sentiment ◽

Bahasa Indonesia

Sentiment analysis in this research is the process of classifying textual documents into two classes, positive and negative sentiment. Opinion data were obtained from the social networking site Twitter using queries in Bahasa Indonesia. The study aims to determine public sentiment toward a particular object as expressed on Twitter in Indonesian, thereby helping businesses conduct market research on public opinion. The collected data were preprocessed and POS tagged to produce a classification model through a training process. Sentiment-bearing words were collected with a dictionary-based approach, which yielded 18,069 words in this study. The Maximum Entropy algorithm is used for the POS tagger, and the Support Vector Machine algorithm is used to build the classification model from the training data. The features are unigrams with TF-IDF weighting. The classification achieved an accuracy of 86.81% in 7-fold cross-validation with the Sigmoid kernel. Manual class labeling with the POS tagger yielded an accuracy of 81.67%. Keywords: sentiment analysis, classification, maximum entropy POS tagger, support vector machine, Twitter.
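
Unigram features with TF-IDF weighting can be sketched in a few lines. The variant below uses relative term frequency multiplied by log(N / document frequency); this is one common formulation and an assumption here, since the abstract does not specify which variant was used. The toy documents and the function name are hypothetical.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document, using unigram tokens."""
    tokenized = [document.lower().split() for document in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus: two short opinions that differ in a single word.
docs = ["this film good", "this film bad"]
weights = tfidf(docs)
```

Terms that occur in every document ("this", "film") receive weight zero, so only the discriminating words ("good", "bad") contribute to the feature vector passed to the classifier.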

Peringkasan dan Support Vector Machine pada Klasifikasi Dokumen [Summarization and Support Vector Machine in Document Classification]

JURNAL INFOTEL ◽

10.20895/infotel.v9i4.312 ◽

2017 ◽

Vol 9 (4) ◽

pp. 416 ◽

Cited By ~ 1

Author(s):

Nelly Indriani Widiastuti ◽

Ednawati Rainarli ◽

Kania Evita Dewi

Keyword(s):

Support Vector Machine ◽

Naive Bayes ◽

Naïve Bayes ◽

Training Data ◽

Support Vector ◽

Good Reputation ◽

Multiclass Support Vector Machine ◽

Simple Logistic ◽

Better Than

Classification is the process of grouping objects that share features or characteristics into classes. Automatic document classification uses the frequency of words that appear in the training data as features. As the number of documents grows, so does the number of words that appear as features. Summaries are therefore used to reduce the number of words used in classification. The classification uses the multiclass Support Vector Machine (SVM) method, which is considered to have a good reputation in classification. This research tests the effect of summarization as feature selection on document classification. The summaries reduce the text to 50%. The results show that the summaries did not affect the accuracy of document classification with SVM, but they improved the accuracy of the Simple Logistic classifier. The classification tests also show that Naïve Bayes Multinomial (NBM) is more accurate than SVM.
