The Tomatoes and Chilies Type Classifications by Using Machine Learning Methods

Irzal Ahmad Sabilla; Chastine Fatichah

doi:10.28926/jdr.v4i1.93

Journal of Development Research ◽

10.28926/jdr.v4i1.93 ◽

2020 ◽

Vol 4 (1) ◽

pp. 1-6

Author(s):

Irzal Ahmad Sabilla ◽

Chastine Fatichah

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Support Vector ◽

Staple Food ◽

K Nearest Neighbor ◽

Learning Methods ◽

Linear Discriminant ◽

Machine Learning Methods

Vegetables such as tomatoes and chilies are used as flavoring ingredients. Both are processed into sauces and seasonings that accompany people's staple foods. These vegetables are easy to find in supermarkets, but many people do not know how to choose the type and quality of chilies and tomatoes. This study discusses the classification of cayenne, curly, green, and red chilies and of tomatoes in good and bad condition using machine learning and contrast enhancement techniques. The machine learning methods used are Support Vector Machine (SVM), K-Nearest Neighbor (K-NN), Linear Discriminant Analysis (LDA), and Random Forest (RF). The best method is identified by its accuracy in testing. In addition to accuracy, the study measures computation speed to check that the methods are efficient.
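The evaluation described in this abstract (accuracy plus computation time per method) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the two-dimensional features, the sample values, and the helper names `knn_predict` and `evaluate` are assumptions, and only a plain k-nearest-neighbor classifier is shown as one of the four methods named.

```python
import time
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, x, k=3):
    """Return the majority label among the k training points closest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def evaluate(predict, train_X, train_y, test_X, test_y):
    """Return (accuracy, elapsed seconds) for one prediction function."""
    start = time.perf_counter()
    predictions = [predict(train_X, train_y, x) for x in test_X]
    elapsed = time.perf_counter() - start
    correct = sum(p == t for p, t in zip(predictions, test_y))
    return correct / len(test_y), elapsed


# Hypothetical two-feature samples (for example color and elongation scores);
# these values are illustrative and do not come from the paper.
train_X = [(0.9, 0.2), (0.8, 0.3), (0.85, 0.25), (0.2, 0.9), (0.3, 0.8), (0.25, 0.85)]
train_y = ["tomato", "tomato", "tomato", "chili", "chili", "chili"]
test_X = [(0.88, 0.22), (0.22, 0.88)]
test_y = ["tomato", "chili"]

accuracy, seconds = evaluate(knn_predict, train_X, train_y, test_X, test_y)
print(f"k-NN accuracy={accuracy:.2f}, time={seconds:.6f}s")
```

Passing other prediction functions to the same `evaluate` helper would give comparable accuracy and timing figures for each method.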

Studi Komparasi Metode Machine Learning untuk Klasifikasi Citra Huruf Vokal Hiragana (Comparative Study of Machine Learning Methods for Image Classification of Hiragana Vowel Letters)

JURNAL MEDIA INFORMATIKA BUDIDARMA ◽

10.30865/mib.v5i3.3083 ◽

2021 ◽

Vol 5 (3) ◽

pp. 905

Author(s):

Muhammad Afrizal Amrustian ◽

Vika Febri Muliati ◽

Elsa Elvira Awal

Keyword(s):

Machine Learning ◽

Comparative Study ◽

Image Classification ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Learning Methods ◽

Machine Learning Methods ◽

The Comparative Study

Japanese is one of the most difficult languages to read and understand, in part because its writing does not use the alphabet. Japanese has three scripts: kanji, katakana, and hiragana. Hiragana is the most commonly used. Hiragana also has a cursive character, so each person's handwriting differs. Machine learning methods can be used to read Japanese letters by recognizing images of the letters. This study uses the hiragana vowels and compares machine learning methods for image classification of Japanese letters. The methods compared are Naïve Bayes, Support Vector Machine, Decision Tree, Random Forest, and K-Nearest Neighbor. The results show that K-Nearest Neighbor is the best method for image classification of hiragana vowels, with an accuracy of 89.4% and a low error rate.

Metabolic Syndrome Prediction Models Using Machine Learning and Sasang Constitution Type

Evidence-based Complementary and Alternative Medicine ◽

10.1155/2021/8315047 ◽

2021 ◽

Vol 2021 ◽

pp. 1-7

Author(s):

Ji-Eun Park ◽

Sujeong Mun ◽

Siwoo Lee

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Prediction Models ◽

Support Vector ◽

K Nearest Neighbor ◽

Learning Methods ◽

Machine Learning Methods ◽

Sasang Constitution ◽

Constitution Type ◽

Conventional Regression

Background. Machine learning may be a useful tool for predicting metabolic syndrome (MetS), and previous studies also suggest that the risk of MetS differs according to Sasang constitution type. The present study investigated the development of MetS prediction models utilizing machine learning methods and whether the incorporation of Sasang constitution type could improve the performance of those prediction models. Methods. Participants visiting a medical center for a health check-up were recruited in 2005 and 2006. Six kinds of machine learning were utilized (K-nearest neighbor, naive Bayes, random forest, decision tree, multilayer perceptron, and support vector machine), as was conventional logistic regression. Machine learning-derived MetS prediction models with and without the incorporation of Sasang constitution type were compared to investigate whether the former would predict MetS with higher sensitivity. Age, sex, education level, marital status, body mass index, stress, physical activity, alcohol consumption, and smoking were included as potentially predictive factors. Results. A total of 750/2,871 participants had MetS. Among the six types of machine learning methods investigated, multilayer perceptron and support vector machine exhibited the same performance as the conventional regression method, based on the areas under the receiver operating characteristic curves. The naive Bayes method exhibited the highest sensitivity (0.49), which was higher than that of the conventional regression method (0.39). The incorporation of Sasang constitution type improved the sensitivity of all of the machine learning methods investigated except for the K-nearest neighbor method. Conclusion. Machine learning-derived models may be useful for MetS prediction, and the incorporation of Sasang constitution type may increase the sensitivity of such models.
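Sensitivity, the metric this abstract uses to compare models, is the share of true cases that a model flags. A minimal sketch of how it is computed alongside specificity and accuracy is given here. The function name `binary_metrics` and the label vectors are hypothetical and are not taken from the study's data.

```python
def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for 0/1 labels (1 = condition present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical labels: 4 participants with the condition, 6 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case the model is right for 7 of 10 participants but finds only half of the true cases, which is why a study of a minority condition may report sensitivity separately from accuracy.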

Application of Machine Learning Approaches for the Design and Study of Anticancer Drugs

Current Drug Targets ◽

10.2174/1389450119666180809122244 ◽

2019 ◽

Vol 20 (5) ◽

pp. 488-500 ◽

Cited By ~ 6

Author(s):

Yan Hu ◽

Yi Lu ◽

Shuo Wang ◽

Mengying Zhang ◽

Xiaosheng Qu ◽

...

Keyword(s):

Machine Learning ◽

Drug Design ◽

Anticancer Drugs ◽

Nearest Neighbor ◽

Cost Effective ◽

Support Vector ◽

Learning Approaches ◽

K Nearest Neighbor ◽

Activity Prediction ◽

Linear Discriminant

Background: Globally, the number of cancer patients and cancer deaths continues to increase each year, and cancer has therefore become one of the world's leading causes of morbidity and mortality. In recent years, the study of anticancer drugs has become one of the most popular medical topics. Objective: This review examines the application of machine learning to predicting anticancer drug activity. Machine learning approaches such as Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbor (kNN), and Naïve Bayes (NB) were selected, and examples of their applications in anticancer drug design are listed. Results: Machine learning contributes substantially to anticancer drug design and helps researchers by saving time and cost. However, it can only be an assisting tool for drug design. Conclusion: This paper introduces the application of machine learning approaches in anticancer drug design. Many successful examples of identification and prediction of anticancer drug activity are discussed, and anticancer drug research is still actively progressing. Moreover, the merits of some web servers related to anticancer drugs are mentioned.

Cardiovascular Disease Prediction from Electrocardiogram by using Machine Learning Method: A Snapshot from the Subjects of the Malaysian Cohort

10.21203/rs.2.22561/v1 ◽

2020 ◽

Author(s):

Nazrul Anuar Nayan ◽

Hafifah Ab Hamid ◽

Mohd Zubir Suboh ◽

Noraidatulakma Abdullah ◽

Rosmina Jaafar ◽

...

Keyword(s):

Machine Learning ◽

Cardiovascular Disease ◽

Nearest Neighbor ◽

Total Cholesterol Level ◽

Population Based ◽

Support Vector ◽

K Nearest Neighbor ◽

Cvd Risk ◽

Linear Discriminant ◽

Artificial Neural Network Ann

Background: Cardiovascular disease (CVD) is the leading cause of deaths worldwide. In 2017, CVD contributed to 13,503 deaths in Malaysia. The current approaches for CVD prediction are usually invasive and costly. Machine learning (ML) techniques allow an accurate prediction by utilizing the complex interactions among relevant risk factors. Results: This study presents a case–control study involving 60 participants from The Malaysian Cohort, which is a prospective population-based project. Five parameters, namely, the R–R interval and root mean square of successive differences extracted from electrocardiogram (ECG), systolic and diastolic blood pressures, and total cholesterol level, were statistically significant in predicting CVD. Six ML algorithms, namely, linear discriminant analysis, linear and quadratic support vector machines, decision tree, k-nearest neighbor, and artificial neural network (ANN), were evaluated to determine the most accurate classifier in predicting CVD risk. ANN, which achieved 90% specificity, 90% sensitivity, and 90% accuracy, demonstrated the highest prediction performance among the six algorithms. Conclusions: In summary, by utilizing ML techniques, ECG data can serve as a good parameter for CVD prediction among the Malaysian multiethnic population.

Cardiovascular Disease Prediction from Electrocardiogram by Using Machine Learning

International Journal of Online and Biomedical Engineering (iJOE) ◽

10.3991/ijoe.v16i07.13569 ◽

2020 ◽

Vol 16 (07) ◽

pp. 34

Author(s):

Nayan Nazrul Anuar ◽

Ab Hamid Hafifah ◽

Suboh Mohd Zubir ◽

Abdullah Noraidatulakma ◽

Jaafar Rosmina ◽

...

Keyword(s):

Machine Learning ◽

Cardiovascular Disease ◽

Nearest Neighbor ◽

Total Cholesterol Level ◽

Population Based ◽

Support Vector ◽

K Nearest Neighbor ◽

Cvd Risk ◽

Linear Discriminant ◽

Artificial Neural Network Ann

Cardiovascular disease (CVD) is the leading cause of deaths worldwide. In 2017, CVD contributed to 13,503 deaths in Malaysia. The current approaches for CVD prediction are usually invasive and costly. Machine learning (ML) techniques allow an accurate prediction by utilizing the complex interactions among relevant risk factors. This study presents a case–control study involving 60 participants from The Malaysian Cohort, which is a prospective population-based project. Five parameters, namely, the R–R interval and root mean square of successive differences extracted from electrocardiogram (ECG), systolic and diastolic blood pressures, and total cholesterol level, were statistically significant in predicting CVD. Six ML algorithms, namely, linear discriminant analysis, linear and quadratic support vector machines, decision tree, k-nearest neighbor, and artificial neural network (ANN), were evaluated to determine the most accurate classifier in predicting CVD risk. ANN, which achieved 90% specificity, 90% sensitivity, and 90% accuracy, demonstrated the highest prediction performance among the six algorithms. In summary, by utilizing ML techniques, ECG data can serve as a good parameter for CVD prediction among the Malaysian multiethnic population.

Predicting Fine Particulate Matter (PM2.5) in the Greater London Area: An Ensemble Approach using Machine Learning Methods

Remote Sensing ◽

10.3390/rs12060914 ◽

2020 ◽

Vol 12 (6) ◽

pp. 914 ◽

Cited By ~ 4

Author(s):

Mahdieh Danesh Yazdi ◽

Zheng Kuang ◽

Konstantina Dimakopoulou ◽

Benjamin Barratt ◽

Esra Suel ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Nearest Neighbor ◽

Meteorological Data ◽

Fine Particulate Matter ◽

Gradient Boosting ◽

K Nearest Neighbor ◽

Learning Methods ◽

Machine Learning Methods ◽

Technological Advances

Estimating air pollution exposure has long been a challenge for environmental health researchers. Technological advances and novel machine learning methods have allowed us to increase the geographic range and accuracy of exposure models, making them a valuable tool in conducting health studies and identifying hotspots of pollution. Here, we have created a prediction model for daily PM2.5 levels in the Greater London area from 1st January 2005 to 31st December 2013 using an ensemble machine learning approach incorporating satellite aerosol optical depth (AOD), land use, and meteorological data. The predictions were made on a 1 km × 1 km scale over 3960 grid cells. The ensemble included predictions from three different machine learners: a random forest (RF), a gradient boosting machine (GBM), and a k-nearest neighbor (KNN) approach. Our ensemble model performed very well, with a ten-fold cross-validated R² of 0.828. Of the three machine learners, the random forest outperformed the GBM and KNN. Our model was particularly adept at predicting day-to-day changes in PM2.5 levels with an out-of-sample temporal R² of 0.882. However, its ability to predict spatial variability was weaker, with an R² of 0.396. We believe this to be due to the smaller spatial variation in pollutant levels in this area.
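The idea of combining several learners and scoring the result with R² can be illustrated with a short sketch. The abstract does not state how the three learners were combined, so the unweighted average used here is an assumption, and all prediction values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def average_ensemble(*prediction_sets):
    """Unweighted mean of the learners' predictions, one value per sample."""
    return [sum(values) / len(values) for values in zip(*prediction_sets)]


# Hypothetical daily PM2.5 values and per-learner predictions (not from the paper).
observed = [10.0, 12.0, 14.0, 16.0]
rf_pred = [11.0, 12.0, 13.0, 16.0]
gbm_pred = [9.0, 13.0, 15.0, 15.0]
knn_pred = [10.0, 11.0, 14.0, 17.0]

ensemble_pred = average_ensemble(rf_pred, gbm_pred, knn_pred)
print("RF R^2:", r_squared(observed, rf_pred))
print("Ensemble R^2:", r_squared(observed, ensemble_pred))
```

The toy numbers are chosen so that the individual learners' errors cancel in the average, which is the effect an ensemble relies on; real data would show a smaller gain.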

Predictions of chalcospinels with composition ABCX4 (X – S or Se)

Perspektivnye Materialy ◽

10.30791/1028-978x-2020-7-5-18 ◽

2020 ◽

pp. 5-18

Author(s):

N. N. Kiselyova ◽

V. A. Dudarev ◽

V. V. Ryazanov ◽

O. V. Sen’ko ◽

...

Keyword(s):

Machine Learning ◽

Crystal Lattice ◽

Prediction Accuracy ◽

Cross Validation ◽

Chemical Elements ◽

Optical Memory ◽

Support Vector ◽

Learning Methods ◽

Linear Discriminant ◽

Machine Learning Methods

New chalcospinels of the most common compositions were predicted: A^{I}B^{III}C^{IV}X_4 (X = S or Se) and A^{II}B^{III}C^{III}S_4 (A, B, and C are various chemical elements). They are promising candidates in the search for new materials for magneto-optical memory elements, sensors, and anodes in sodium-ion batteries. The values of the crystal lattice parameter a were estimated. Only the property values of the chemical elements were used for prediction. Chalcospinels not yet obtained were predicted with machine learning programs that are part of the information-analytical system developed by the authors (various ensembles of algorithms: binary decision trees, the linear machine, the search for logical regularities of classes, the support vector machine, the Fisher linear discriminant, k-nearest neighbors, and multilayer perceptron and neural network training). The lattice parameter of the new chalcospinels was estimated with an extensive family of regression methods from the scikit-learn package for Python and with multilevel machine learning methods proposed by the authors. According to cross-validation, the prediction accuracy for new chalcospinels is not lower than 80%, and the prediction accuracy for their lattice parameter, measured as the mean absolute error in leave-one-out cross-validation, is ± 0.1 Å. The results show that multilevel machine learning methods are effective for predicting the physical properties of substances.
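The leave-one-out mean absolute error mentioned in this abstract can be sketched in a few lines. This is only an illustration of the validation scheme: the lattice parameter values are hypothetical, and the predictor (the mean of the remaining samples) is a placeholder baseline, not one of the regression methods used by the authors.

```python
from statistics import mean


def loo_mae(targets, predict):
    """Leave-one-out MAE: hold out each value, predict it from the rest."""
    errors = []
    for i, held_out in enumerate(targets):
        training = targets[:i] + targets[i + 1:]
        errors.append(abs(predict(training) - held_out))
    return mean(errors)


# Hypothetical lattice parameters in angstroms (not from the paper).
lattice_a = [10.0, 10.2, 10.4]

# Placeholder predictor: the mean of the training targets.
mae = loo_mae(lattice_a, mean)
print(f"Leave-one-out MAE: {mae:.3f} angstrom")
```

Each sample is predicted by a model that never saw it, so the resulting error is an out-of-sample estimate even for a small data set.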

Fruit Classification Utilizing a Robotic Gripper with Integrated Sensors and Adaptive Grasping

Mathematical Problems in Engineering ◽

10.1155/2021/7157763 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Jintao Zhang ◽

Shuang Lai ◽

Huahua Yu ◽

Erjie Wang ◽

Xizhe Wang ◽

...

Keyword(s):

Machine Learning ◽

Information Acquisition ◽

Fruits And Vegetables ◽

Nearest Neighbor ◽

Machine Learning Algorithms ◽

Support Vector ◽

Tactile Sensing ◽

K Nearest Neighbor ◽

Linear Discriminant ◽

Agricultural Robots

As the core component of agricultural robots, robotic grippers are widely used for plucking, picking, and harvesting fruits and vegetables. Secure grasping is a severe challenge in agricultural applications because of the variation in the shape and hardness of agricultural products during maturation, as well as their variety and delicacy. In this study, a fruit identification method utilizing an adaptive gripper with tactile sensing and machine learning algorithms is reported. An adaptive robotic gripper is designed and manufactured to perform adaptive grasping. A tactile sensing information acquisition circuit is built, and force and bending sensors are integrated into the robotic gripper to measure the contact force distribution on the contact surface and the deformation of the soft fingers. A robotic manipulator platform is developed to collect the tactile sensing data in the grasping process. The performance of the random forest (RF), k-nearest neighbor (KNN), support vector classification (SVC), naive Bayes (NB), linear discriminant analysis (LDA), and ridge regression (RR) classifiers in identifying and classifying five types of fruits using the adaptive gripper is evaluated and compared. The RF classifier achieves the highest accuracy of 98%, while the accuracies of the other classifiers vary from 74% to 97%. The experiment illustrates that efficient and accurate fruit identification can be realized with the adaptive gripper and machine learning classifiers, and that the proposed method can provide a reference for controlling the grasping force and planning the robotic motion in the plucking, picking, and harvesting of fruits and vegetables.

Implementation of machine learning algorithms to create diabetic patient re-admission profiles

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-019-0990-x ◽

2019 ◽

Vol 19 (S9) ◽

Cited By ~ 3

Author(s):

Mohamed Alloghani ◽

Ahmed Aljaaf ◽

Abir Hussain ◽

Thar Baker ◽

Jamila Mustafina ◽

...

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Machine Learning Algorithms ◽

Support Vector ◽

Diabetic Patients ◽

K Nearest Neighbor ◽

Critical Approach ◽

Linear Discriminant ◽

Applied Machine Learning ◽

Learning Machine

Background: Machine learning is a branch of artificial intelligence concerned with the design and development of algorithms that enable computers to learn. It is steadily growing and becoming a critical approach in many domains such as health, education, and business. Methods: In this paper, we applied machine learning to the diabetes dataset with the aim of recognizing patterns and combinations of factors that characterize or explain re-admission among diabetes patients. The classifiers used include Linear Discriminant Analysis, Random Forest, k-Nearest Neighbor, Naïve Bayes, J48, and Support Vector Machine. Results: Of the 100,000 cases, 78,363 were diabetic and over 47% were readmitted. Based on the classes that the models produced, diabetic patients who are more likely to be readmitted are either women, or Caucasians, or outpatients, or those who undergo less rigorous lab procedures or treatment procedures, or those who receive less medication, and are thus discharged without proper improvements or administration of insulin despite having tested positive for HbA1c. Conclusion: Diabetic patients who do not undergo vigorous lab assessments, diagnosis, and medication are more likely to be readmitted when discharged without improvements and without receiving insulin administration, especially if they are women, Caucasians, or both.

Determination of the Geographical Origin of Coffee Beans Using Terahertz Spectroscopy Combined With Machine Learning Methods

Frontiers in Nutrition ◽

10.3389/fnut.2021.680627 ◽

2021 ◽

Vol 8 ◽

Author(s):

Si Yang ◽

Chenxi Li ◽

Yang Mei ◽

Wen Liu ◽

Rong Liu ◽

...

Keyword(s):

Machine Learning ◽

Terahertz Spectroscopy ◽

Geographic Origin ◽

Principal Component ◽

Support Vector ◽

Thz Spectroscopy ◽

Learning Methods ◽

Linear Discriminant ◽

Machine Learning Methods ◽

Coffee Beans

Different geographical origins can lead to great variance in coffee quality, taste, and commercial value. Hence, controlling the authenticity of the origin of coffee beans is of great importance for producers and consumers worldwide. In this study, terahertz (THz) spectroscopy combined with machine learning methods was investigated as a fast and non-destructive way to classify the geographic origin of coffee beans. Popular machine learning methods, including convolutional neural network (CNN), linear discriminant analysis (LDA), and support vector machine (SVM), were compared to obtain the best model. Because of the curse of dimensionality, some classification methods struggle to train effective models. Thus, principal component analysis (PCA) and a genetic algorithm (GA) were applied for LDA and SVM to create a smaller set of features. The first nine principal components (PCs), with a cumulative contribution rate of 99.9%, extracted by PCA and 21 variables selected by GA were the inputs of the LDA and SVM models. The results demonstrate that excellent classification (90% accuracy on a prediction set) could be achieved using a CNN method. The results also indicate that variable selection is an important step in creating an accurate and robust discrimination model. The performance of the LDA and SVM algorithms could be improved with spectral features extracted by PCA and GA. The GA-SVM achieved 75% accuracy on the prediction set, while the SVM and PCA-SVM achieved 50% and 65% accuracy, respectively. These results demonstrate that THz spectroscopy, together with machine learning methods, is an effective and satisfactory approach for classifying the geographical origins of coffee beans, and they suggest the potential of deep learning for authenticating agricultural products while expanding the application of THz spectroscopy.
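The "cumulative contribution rate" used to choose the number of principal components can be made concrete with a small sketch. It assumes the PCA eigenvalues (component variances) are already available; the eigenvalues below and the function name `components_for_threshold` are hypothetical and do not reproduce the paper's spectra.

```python
from itertools import accumulate


def components_for_threshold(eigenvalues, threshold):
    """Smallest number of leading components whose variance share reaches threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for count, running in enumerate(accumulate(ordered), start=1):
        if running / total >= threshold:
            return count
    return len(ordered)


# Hypothetical eigenvalues of a covariance matrix (not from the paper).
eigenvalues = [500, 300, 150, 40, 9.5, 0.5]

n_components = components_for_threshold(eigenvalues, 0.999)
print("Components needed for 99.9% of variance:", n_components)
```

With these toy values, four components explain 99% of the variance and a fifth is needed to pass 99.9%, which mirrors how a fixed threshold translates into a component count such as the nine PCs reported in the abstract.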
