Application of Raman spectroscopy and Machine Learning algorithms for fruit distillates discrimination

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Camelia Berghian-Grosan ◽  
Dana Alina Magdas

Abstract: In this pilot study, Raman spectroscopy was combined with Machine Learning algorithms for the first time to differentiate distillates with respect to trademark, geographical origin, and botanical origin. Two Raman spectral ranges (region I, 200–600 cm−1, and region II, 1200–1400 cm−1) showed the highest discrimination potential for the investigated distillates. The proposed approach proved very effective for trademark fingerprint differentiation, reaching a model accuracy of 95.5% (only one sample was misclassified). A comparable model accuracy (90.9%) was achieved for the geographical discrimination of the fruit spirits, which can be considered very good given that the classification was made within the Transylvania region, among neighbouring areas. Because the trademark fingerprint is the prevailing one, distillate types could be successfully differentiated with respect to fruit variety only within each producing entity.

2021 ◽  
Vol 12 (1) ◽  
pp. 1-17
Author(s):  
Swati V. Narwane ◽  
Sudhir D. Sawarkar

Class imbalance is a major hurdle for machine learning-based systems. The data set is the backbone of machine learning and must be studied in order to handle class imbalance. The purpose of this paper is to investigate the effect of class imbalance on data sets. The proposed methodology determines the model accuracy for a given class distribution. To find possible solutions, the behaviour of an imbalanced data set was investigated. The study considers two case studies in which the data set is divided into class distributions ranging from balanced to unbalanced. The data sets were tested with training and test data using standard machine learning algorithms. Model accuracy for each class distribution was measured with the training data set. Further, the built model was tested with each individual binary class. The results show that addressing class imbalance is essential for improving system performance. The study concludes that the system produces results biased towards the majority class. In the future, the multiclass imbalance problem can be studied using advanced algorithms.
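The bias towards the majority class described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the function name `majority_baseline` and the 90/10 label split are hypothetical, chosen only to show how a model that always predicts the majority class can report high accuracy while never detecting the minority class.

```python
from collections import Counter


def majority_baseline(labels):
    """Score a trivial classifier that always predicts the most common label.

    Returns the predicted label, the overall accuracy, and the recall
    obtained for every class present in `labels`.
    """
    counts = Counter(labels)
    majority_label, majority_count = counts.most_common(1)[0]
    accuracy = majority_count / len(labels)
    # Every majority sample is predicted correctly; every other sample is missed.
    recall = {
        label: (1.0 if label == majority_label else 0.0) for label in counts
    }
    return majority_label, accuracy, recall


# Hypothetical imbalanced binary data set (assumption, not from the paper).
labels = ["neg"] * 90 + ["pos"] * 10
label, acc, recall = majority_baseline(labels)
print(label, acc, recall)
```

Under these assumed numbers, overall accuracy looks strong even though recall on the minority class is zero, which is why accuracy alone can hide the effect of class imbalance.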


Author(s):  
Aibek Atanbekov ◽  
Habiburahman Shirzad

This paper explores the difference between the vocabularies used by Thinking and Feeling personalities. To this end, we used the Naïve Bayes Machine Learning algorithm, which showed the best accuracy in comparison with others. The concept was motivated by the essays that scholars submit for the first time at university, and by the goal of obtaining a full psychological portrait of a student from a given text alone. To train the model, we used a labeled dataset collected through a forum with real persons. This dataset contains each person's type and their social media posts. To test the model, we used another dataset, which contains information about movie characters and their speech in the movie. Psycho-type was described by the Myers-Briggs Type Indicator (MBTI), which is one of the most popular typologies. To achieve better prediction accuracy, we trained the model separately for Thinking and Feeling predictors. Overall, we achieved better accuracy than previous studies and showed the difference between the vocabularies used by Thinkers and Feelers.


2020 ◽  
Vol 39 (5) ◽  
pp. 6579-6590
Author(s):  
Sandy Çağlıyor ◽  
Başar Öztayşi ◽  
Selime Sezgin

The motion picture industry is one of the largest industries worldwide and has significant importance in the global economy. Considering the high stakes and high risks in the industry, forecast models and decision support systems are gaining importance. Several attempts have been made to estimate the theatrical performance of a movie before or at the early stages of its release. Nevertheless, these models are mostly used for predicting domestic performance, and the industry still struggles to predict box office performance in overseas markets. In this study, the aim is to design a forecast model using different machine learning algorithms to estimate the theatrical success of US movies in Turkey. A dataset of 1559 movies is constructed from various sources. First, independent variables are grouped as pre-release, distributor type, and international distribution based on their characteristics. The number of attendances is discretized into three classes. Four popular machine learning algorithms (artificial neural networks, decision tree regression, gradient boosting trees, and random forest) are employed, and the impact of each variable group is observed by comparing the performance of the models. Then the number of target classes is increased to five and eight, and the results are compared with previously developed models in the literature.

