Estimating pasture quality of Mediterranean grasslands using hyperspectral narrow bands from field spectroscopy by Random Forest and PLS regressions

2022 ◽  
Vol 192 ◽  
pp. 106614
Author(s):  
Jesús Fernández-Habas ◽  
Mónica Carriere Cañada ◽  
Alma María García Moreno ◽  
José Ramón Leal-Murillo ◽  
María P. González-Dugo ◽  
...  
Foods ◽  
2021 ◽  
Vol 10 (6) ◽  
pp. 1411
Author(s):  
José Luis P. Calle ◽  
Marta Ferreiro-González ◽  
Ana Ruiz-Rodríguez ◽  
Gerardo F. Barbero ◽  
José Á. Álvarez ◽  
...  

Sherry wine vinegar is a Spanish gourmet product under Protected Designation of Origin (PDO). Before a vinegar can be labeled as Sherry vinegar, the product must meet certain requirements as established by its PDO, which, in this case, means that it has been produced following the traditional solera and criadera ageing system. The quality of the vinegar is determined by many factors such as the raw material, the acetification process or the aging system. For this reason, mainly producers, but also consumers, would benefit from the employment of effective analytical tools that allow precisely determining the origin and quality of vinegar. In the present study, a total of 48 Sherry vinegar samples manufactured from three different starting wines (Palomino Fino, Moscatel, and Pedro Ximénez wine) were analyzed by Fourier-transform infrared (FT-IR) spectroscopy. The spectroscopic data were combined with unsupervised exploratory techniques such as hierarchical cluster analysis (HCA) and principal component analysis (PCA), as well as other nonparametric supervised techniques, namely, support vector machine (SVM) and random forest (RF), for the characterization of the samples. The HCA and PCA results present a clear grouping trend of the vinegar samples according to their raw materials. SVM in combination with leave-one-out cross-validation (LOOCV) successfully classified 100% of the samples, according to the type of wine used for their production. The RF method allowed selecting the most important variables to develop the characteristic fingerprint (“spectralprint”) of the vinegar samples according to their starting wine. Furthermore, the RF model reached 100% accuracy for both LOOCV and out-of-bag (OOB) sets.
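The leave-one-out cross-validation (LOOCV) procedure mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: a simple nearest-centroid classifier stands in for the SVM, and the two-value "spectra" are made-up toy data rather than FT-IR measurements.

```python
import math

def nearest_centroid_predict(train_X, train_y, x):
    """Return the label whose class centroid is closest to x (stand-in for the SVM)."""
    sums, counts = {}, {}
    for features, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, math.inf
    for label, acc in sums.items():
        centroid = [v / counts[label] for v in acc]
        dist = math.dist(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loocv_accuracy(X, y):
    """Hold out each sample once, train on the rest, and return the hit rate."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            hits += 1
    return hits / len(X)

# Hypothetical toy data: three well-separated groups, three samples each.
X = [[0.10, 0.90], [0.12, 0.88], [0.09, 0.93],
     [0.50, 0.50], [0.52, 0.47], [0.49, 0.51],
     [0.90, 0.10], [0.88, 0.13], [0.91, 0.08]]
y = ["Palomino Fino"] * 3 + ["Moscatel"] * 3 + ["Pedro Ximénez"] * 3

accuracy = loocv_accuracy(X, y)
```

With 48 samples, as in the study, the same loop would train 48 models, each tested on the single sample left out.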


2020 ◽  
Vol 40 (4) ◽  
pp. 360-371
Author(s):  
Yanli Cao ◽  
Xiying Fan ◽  
Yonghuan Guo ◽  
Sai Li ◽  
Haiyue Huang

The qualities of injection-molded parts are affected by process parameters. Warpage and volume shrinkage are two typical defects. Moreover, insufficient or excessively large clamping force also affects the quality of parts and the cost of the process. An experiment based on the orthogonal design was conducted to minimize the above defects. Moldflow software was used to simulate the injection process of each experiment. The entropy weight was used to determine the weight of each index, the comprehensive evaluation value was calculated, and multi-objective optimization was transformed into single-objective optimization. A regression model was established by the random forest (RF) algorithm. To further illustrate the reliability and accuracy of the model, back-propagation neural network and kriging models were taken as comparative algorithms. The results showed that the error of RF was the smallest and its performance was the best. Finally, a genetic algorithm was used to search for the minimum of the regression model established by RF. The optimal parameters were found to improve the quality of plastic parts and reduce the energy consumption. The plastic parts manufactured by the optimal process parameters showed good quality and met the requirements of production.
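A minimal sketch of how an entropy weight step can collapse several quality indices into one comprehensive value. It follows the common textbook form of the entropy weight method. The min-max normalisation, the "smaller is better" treatment of every index, and the four simulated runs are all assumptions for illustration, not values or choices taken from the paper.

```python
import math

def min_max_normalise(matrix):
    """Scale every column to [0, 1]; assumes each column has at least two distinct values."""
    columns = list(zip(*matrix))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [[(v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs)]
            for row in matrix]

def entropy_weights(normalised):
    """Give more weight to indices whose values differ more between runs."""
    m = len(normalised)
    diversities = []
    for column in zip(*normalised):
        total = sum(column)
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:  # 0 * ln(0) is treated as 0
                entropy -= p * math.log(p)
        diversities.append(1.0 - entropy / math.log(m))
    spread = sum(diversities)
    return [d / spread for d in diversities]

# Hypothetical runs: [warpage (mm), volume shrinkage (%), clamping force (t)].
runs = [
    [0.42, 5.1, 310.0],
    [0.35, 4.8, 295.0],
    [0.30, 4.5, 280.0],
    [0.38, 5.4, 330.0],
]

normalised = min_max_normalise(runs)
weights = entropy_weights(normalised)
# Comprehensive evaluation value per run; lower is better under the assumptions above.
scores = [sum(w * v for w, v in zip(weights, row)) for row in normalised]
best_run = min(range(len(scores)), key=scores.__getitem__)
```

The single `scores` column is what a regression model could then be fitted to, turning the multi-objective problem into a single-objective one.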


Author(s):  
A.V. Kozina ◽  
Yu.S. Belov

Automatically assessing the quality of machine translation is an important yet challenging task for machine translation research. Translation quality estimation is understood as predicting translation quality without access to reference translations. Translation quality depends on the specific machine translation system, and the output often requires post-editing. Manual editing is a long and expensive process. As the need to determine translation quality quickly grows, its automation is required. In this paper, we propose a quality assessment method based on ensemble supervised machine learning methods. The WMT 2019 bilingual corpus for the English-Russian language pair was used as data. The corpus contains 17,089 sentences; 85% of the data was used for training and 15% for testing the model. Linguistic features extracted from the text in the source and target languages were used to train the system, since these characteristics most accurately describe the translation in terms of quality. The following tools were used for feature extraction: a free language modeling tool based on SRILM and the Stanford part-of-speech (POS) tagger. Before training the system, the text was preprocessed. The model was trained using three regression methods: Bagging, Extra Trees, and Random Forest. The algorithms were implemented in Python using the scikit-learn library. The parameters of the random forest method were optimized using a grid search. Model performance was assessed by the mean absolute error (MAE) and the root mean square error (RMSE), as well as by the Pearson coefficient, which measures the correlation with human judgment. Testing was carried out on three machine translation systems: the Google and Bing neural systems and the Moses statistical machine translation system, in its phrase-based and syntax-based variants. The Extra Trees method performed best. In addition, for all categories of indicators under consideration, the best results were achieved with the Google machine translation system. The developed method showed good results close to human judgment. The system can be used for further research on translation quality assessment.
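The three evaluation measures named in this abstract (MAE, RMSE and the Pearson coefficient) can be written out directly. The sketch below uses only the standard definitions; the human scores and model predictions are hypothetical numbers chosen for illustration, not results from the paper.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error between reference scores and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error; penalises large misses more than MAE does."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical human quality scores and model predictions for four sentences.
human = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

mae_value = mae(human, predicted)
rmse_value = rmse(human, predicted)
r_value = pearson(human, predicted)
```

MAE and RMSE measure how far predictions fall from the human scores, while the Pearson coefficient measures only whether the two rise and fall together, which is why the abstract reports both kinds of measure.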


2018 ◽  
Vol 23 ◽  
pp. 00016 ◽  
Author(s):  
Joanna A. Kamińska

Two data mining methods – a random forest and boosted regression trees – were used to model values of roadside air pollution depending on meteorological conditions and traffic flow, using the example of data obtained in the city of Wrocław in the years 2015–2016. Eight explanatory variables – five continuous and three categorical – were considered in the models. A comparison was made of the quality of the fit of the models to empirical data. Commonly used goodness-of-fit measures did not imply a significant preference for either of the methods. Residual analysis was also performed; this showed boosted regression trees to be a more effective method for predicting typical values in the modelling of NO2, NOx and PM2.5, while the random forest method leads to smaller errors when predicting peaks.


2020 ◽  
Vol 4 (4) ◽  
pp. 37
Author(s):  
Khaled Fawagreh ◽  
Mohamed Medhat Gaber

To make healthcare available and easily accessible, the Internet of Things (IoT), which paved the way to the construction of smart cities, marked the birth of many smart applications in numerous areas, including healthcare. As a result, smart healthcare applications have been and are being developed to provide, using mobile and electronic technology, higher-quality diagnosis of diseases, better treatment of patients, and improved quality of life. Since smart healthcare applications that are mainly concerned with the prediction of healthcare data (such as diseases) rely on predictive healthcare data analytics, it is imperative for such analytics to be as accurate as possible. In this paper, we exploit supervised machine learning methods in classification and regression to improve the performance of the traditional Random Forest on healthcare datasets, both in terms of accuracy and classification/regression speed, in order to produce an effective and efficient smart healthcare application, which we have termed eGAP. eGAP uses the evolutionary game theoretic approach replicator dynamics to evolve a Random Forest ensemble. Trees of high resemblance in an initial Random Forest are clustered, and then clusters grow and shrink by adding and removing trees using replicator dynamics, according to the predictive accuracy of each subforest represented by a cluster of trees. All clusters have an initial number of trees that is equal to the number of trees in the smallest cluster. Cluster growth is performed using trees that are not initially sampled. The speed and accuracy of the proposed method have been demonstrated by an experimental study on 10 classification and 10 regression medical datasets.
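To give a feel for how replicator dynamics lets accurate clusters grow at the expense of weaker ones, here is a minimal sketch of the generic discrete-time replicator update. It is not the eGAP algorithm itself: the three clusters, their accuracies, and the use of accuracy as fitness are assumptions made for illustration.

```python
def replicator_step(shares, fitness):
    """One discrete replicator update: each share is rescaled by fitness / mean fitness."""
    mean_fitness = sum(s * f for s, f in zip(shares, fitness))
    return [s * f / mean_fitness for s, f in zip(shares, fitness)]

# Hypothetical setup: three tree clusters that start with equal shares of the forest.
shares = [1 / 3, 1 / 3, 1 / 3]
# Hypothetical predictive accuracy of the subforest formed by each cluster.
accuracy = [0.90, 0.80, 0.70]

for _ in range(10):
    shares = replicator_step(shares, accuracy)
```

After a few iterations the most accurate cluster holds most of the share, which in a forest setting would translate into adding trees to that cluster and removing them from the others.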


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Freddy Bangelesa ◽  
Elhadi Adam ◽  
Jasper Knight ◽  
Inos Dhau ◽  
Marubini Ramudzuli ◽  
...  

Soil organic carbon constitutes an important indicator of soil fertility. The purpose of this study was to predict soil organic carbon content in the mountainous terrain of eastern Lesotho, southern Africa, which is an area of high endemic biodiversity as well as an area extensively used for small-scale agriculture. An integrated field and laboratory approach was undertaken, through measurements of reflectance spectra of soil using an Analytical Spectral Devices (ASD) FieldSpec® 4 optical sensor. Soil spectra were collected on the land surface under field conditions and then on soil in the laboratory, in order to assess the accuracy of field spectroscopy-based models. The predictive performance of two different statistical models (random forest and partial least squares regression) was compared. Results show that random forest regression can most accurately predict the soil organic carbon content on an independent dataset using the field spectroscopy data. In contrast, the partial least squares regression model overfits the calibration dataset. Important wavelengths to predict soil organic carbon content were localised around the visible range (400–700 nm). This study shows that soil organic carbon can be most accurately estimated using derivative field spectroscopy measurements and random forest regression.


2000 ◽  
Vol 51 (8) ◽  
pp. 1047 ◽  
Author(s):  
Y. J. Ru ◽  
J. A. Fortune

With the decline in pasture quality in southern Australia, the development of management strategies to improve nutrient supply for grazing animals is essential and requires a clear understanding of the interaction between animals and plants. The impact of grazing intensity on the morphology of subterranean clover was previously examined. This paper reports the effect of grazing intensity on the nutritive value of subterranean clover, and the variation in quality of cultivars during the growing season. Grazing intensity influenced nutritive value and interacted with cultivar maturity. Heavy grazing depressed dry matter digestibility (DMD) by 5 percentage units in October for early maturity cultivars but increased DMD by 3 percentage units in September for mid maturity cultivars. The influence of grazing intensity on nitrogen content was small. Heavy grazing did not affect acid detergent fibre for the early maturity group, but depressed it for the mid maturity group throughout the season. Acid detergent lignin remained comparable for all cultivars during the season. Mineral content of subterranean clover showed variable response to grazing treatments. Nutritive value varied among cultivars within each maturity group. DMD ranged over 53–64%, 44–62%, and 45–53% for early, mid, and late maturity groups, respectively, at the end of the growing season. The cultivar rank in all nutritional parameters changed with the progress of the season. The large ranges in the decline rate of DMD within each maturity group during the last 8 weeks of growth gave an indication of the potential quality of the cultivars during late spring and early summer. Despite the variation in mineral content there were no cultivars in which the concentration of minerals was below the minimum requirements of sheep. 
These results indicate that there is a potential for the selection of high quality cultivars within a breeding program, and that indicative targets of grazing intensity need to be further developed with a focus on pasture quality.


Author(s):  
Harits Ar Rosyid ◽  
Utomo Pujianto ◽  
Moch Rajendra Yudhistira

There are various ways to improve the quality of a person's education, and one of them is reading. Reading broadens insight and knowledge of many subjects. However, reading ability and comprehension differ from person to person. This can be a problem for readers if the reading material exceeds their comprehension ability. Therefore, it is necessary to determine the load of reading material using Lexile Levels. A Lexile Level is a value that measures both the complexity of reading material and a person's reading ability. Thus, the reading material is classified based on its Lexile Level value. The reading material is grouped by Lexile Level into two clusters: easy and difficult. The clustering process uses the k-means method. After clustering, the reading material is classified by reading load using the Random Forest method. The k-means method was chosen because its computation is simple and fast. The Random Forest algorithm builds several decision trees and then chooses the final class from their combined votes. The results of this experiment indicate that the scenario using 2 clusters with SMOTE and GIFS preprocessing performs well, with an accuracy of 76.03%, precision of 81.85% and recall of 76.05%.
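The two-cluster k-means step can be sketched on one-dimensional scores as follows. This is an illustration under stated assumptions, not the authors' implementation: the Lexile values are invented, the centroids are seeded with the minimum and maximum value, and the input is assumed to contain at least two distinct numbers.

```python
def kmeans_two_clusters(values, max_iterations=100):
    """Split 1-D values into cluster 0 (low) and cluster 1 (high) with plain k-means."""
    low, high = min(values), max(values)  # deterministic seeding, an assumption
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each value joins the nearer centroid.
        new_labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        easy = [v for v, label in zip(values, labels) if label == 0]
        hard = [v for v, label in zip(values, labels) if label == 1]
        low = sum(easy) / len(easy)
        high = sum(hard) / len(hard)
    return labels, (low, high)

# Hypothetical Lexile measures for eight reading passages.
lexile_scores = [420, 480, 510, 560, 980, 1040, 1100, 1210]

labels, centroids = kmeans_two_clusters(lexile_scores)
```

The resulting 0/1 labels (easy/difficult) are what a classifier such as Random Forest would then be trained to predict from text features.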

