Forecasting the production of gross output in agricultural sector of the Ryazan oblast

2021 ◽  
Vol 39 (6) ◽  
Author(s):  
Elena Khudyakova ◽  
Mikhail Nikanorov ◽  
Irina Bystrenina ◽  
Tatyana Cherevatova ◽  
Irina Sycheva

Gross agricultural output is a generalised physical output indicator for an industry that includes hundreds of different types of products, and it results from the interaction of production factors. This study provides a comparative analysis of methods based on gradient boosting of regression trees, implemented in the Python programming language, to identify the optimal values of the model parameters and then construct a predictive model based on indicators that affect gross agricultural output. The purpose of this study is to build a regression model for predicting gross agricultural output at actual prices for 2020. To achieve this goal, regression analysis, forecasting, gradient boosting, and other methods were used. Gradient boosting of regression trees was applied to the conditions of the Ryazan Oblast. Four models were created, two of which were based on preliminary data processing. For each model, the optimal parameter values were found and the model's score on the test set was obtained. It was found that gradient boosting of regression trees gives adequate regression models for predicting the target variable, in particular the indicator of gross agricultural output. The investigated indicator is a complex result of the interaction of many factors that are common in agricultural production. Thus, gradient boosting of trees is highly suitable for forecasting complex open systems. Such a method can be used to forecast gross agricultural output not only for individual regions but also for the state as a whole. Based on the "test_score" model, which achieved a score of 0.994 (99%) on the test set, the predicted gross agricultural output in all categories of farms in 2020 amounted to RUB 19187.84 million.
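The idea of gradient boosting of regression trees with a small parameter search can be sketched as follows. This is a toy illustration only, assuming a single feature, depth-1 trees (stumps), squared-error loss, and made-up data and grid values; the function names are hypothetical and none of this is the study's code or data.

```python
# Sketch: gradient boosting of depth-1 regression trees (stumps) on one feature.
# Data, grid values, and function names are illustrative assumptions.

def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # the largest value would leave the right side empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbrt(xs, ys, n_estimators, learning_rate):
    """Each round fits a stump to the current residuals (the negative gradient
    of squared error) and adds a shrunken copy of it to the ensemble."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def gbrt_predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def r2_score(ys, preds):
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Toy linear data: even x for training, odd x held out.
train_x = list(range(0, 20, 2))
test_x = list(range(1, 20, 2))
train_y = [2.0 * x + 1.0 for x in train_x]
test_y = [2.0 * x + 1.0 for x in test_x]

# Small grid search over two parameters, scored by R^2 on the held-out points.
grid_scores = {}
for n_estimators in (10, 200):
    for learning_rate in (0.1, 0.5):
        model = fit_gbrt(train_x, train_y, n_estimators, learning_rate)
        preds = [gbrt_predict(model, x) for x in test_x]
        grid_scores[(n_estimators, learning_rate)] = r2_score(test_y, preds)

best_params = max(grid_scores, key=grid_scores.get)
print(best_params, round(grid_scores[best_params], 3))
```

In practice the parameters would usually be tuned on a separate validation split or by cross-validation, so that the test-set score stays an unbiased check.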

Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3625
Author(s):  
Mateusz Krzysztoń ◽  
Ewa Niewiadomska-Szynkiewicz

Intelligent wireless networks that comprise self-organizing autonomous vehicles equipped with punctual sensors and radio modules support many hostile and harsh environment monitoring systems. This work’s contribution shows the benefits of applying such networks to estimate clouds’ boundaries created by hazardous toxic substances heavier than air when accidentally released into the atmosphere. The paper addresses issues concerning sensing networks’ design, focussing on a computing scheme for online motion trajectory calculation and data exchange. A three-stage approach that incorporates three algorithms for sensing devices’ displacement calculation in a collaborative network according to the current task, namely exploration and gas cloud detection, boundary detection and estimation, and tracking the evolving cloud, is presented. A network connectivity-maintaining virtual force mobility model is used to calculate subsequent sensor positions, and multi-hop communication is used for data exchange. The main focus is on the efficient tracking of the cloud boundary. The proposed sensing scheme is sensitive to crucial mobility model parameters. The paper presents five procedures for calculating the optimal values of these parameters. In contrast to widely used techniques, the presented approach to gas cloud monitoring does not calculate sensors’ displacements based on exact values of gas concentration and concentration gradients. The sensor readings are reduced to two values: the gas concentration below or greater than the safe value. The utility and efficiency of the presented method were justified through extensive simulations, giving encouraging results. The test cases were carried out on several scenarios with regular and irregular shapes of clouds generated using a widely used box model that describes the heavy gas dispersion in the atmospheric air. 
The simulation results demonstrate that using only a rough measurement indicating that the threshold concentration value was exceeded can detect and efficiently track a gas cloud boundary. This makes the sensing system less sensitive to the quality of the gas concentration measurement. Thus, it can be easily used to detect real phenomena. Significant results are recommendations on selecting procedures for computing mobility model parameters while tracking clouds with different shapes and determining optimal values of these parameters in convex and nonconvex cloud boundaries.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Jong Ho Kim ◽  
Haewon Kim ◽  
Ji Su Jang ◽  
Sung Mi Hwang ◽  
So Young Lim ◽  
...  

Background: Predicting a difficult airway is challenging in patients with limited airway evaluation. The aim of this study is to develop and validate a model that predicts difficult laryngoscopy by machine learning of neck circumference and thyromental height as predictors that can be used even for patients with limited airway evaluation. Methods: Variables for prediction of difficult laryngoscopy included age, sex, height, weight, body mass index, neck circumference, and thyromental distance. Difficult laryngoscopy was defined as Grade 3 and 4 by the Cormack-Lehane classification. The preanesthesia and anesthesia data of 1677 patients who had undergone general anesthesia at a single center were collected. The data set was randomly stratified into a training set (80%) and a test set (20%), with equal distribution of difficult laryngoscopy. The training data sets were trained with five algorithms (logistic regression, multilayer perceptron, random forest, extreme gradient boosting, and light gradient boosting machine). The prediction models were validated through a test set. Results: The model’s performance using random forest was best (area under receiver operating characteristic curve = 0.79 [95% confidence interval: 0.72–0.86], area under precision-recall curve = 0.32 [95% confidence interval: 0.27–0.37]). Conclusions: Machine learning can predict difficult laryngoscopy through a combination of several predictors including neck circumference and thyromental height. The performance of the model can be improved with more data, a new variable and combination of models.
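The stratified 80/20 split described in the Methods can be sketched as below. This is a minimal illustration, assuming binary labels held in a Python list; the function name, seed, and the 10% positive rate in the toy labels are hypothetical and not taken from the study.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps the same share in train and test."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                       # random assignment within the class
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels: 1 = difficult laryngoscopy, 0 = not difficult.
labels = [1] * 100 + [0] * 900
train_idx, test_idx = stratified_split(labels)

train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), train_rate, test_rate)
```

Splitting within each class keeps the rare positive class equally represented in both sets, which matters when the outcome is uncommon.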


2021 ◽  
Author(s):  
Nicolai Ree ◽  
Andreas H. Göller ◽  
Jan H. Jensen

We present RegioML, an atom-based machine learning model for predicting the regioselectivities of electrophilic aromatic substitution reactions. The model relies on CM5 atomic charges computed using semiempirical tight binding (GFN1-xTB) combined with the ensemble decision tree variant light gradient boosting machine (LightGBM). The model is trained and tested on 21,201 bromination reactions with 101K reaction centers, which are split into training, test, and out-of-sample datasets with 58K, 15K, and 27K reaction centers, respectively. The accuracy is 93% for the test set and 90% for the out-of-sample set, while the precision (the percentage of positive predictions that are correct) is 88% and 80%, respectively. The test-set performance is very similar to the graph-based WLN method developed by Struble et al. (React. Chem. Eng. 2020, 5, 896), though the comparison is complicated by the possibility that some of the test and out-of-sample molecules are used to train WLN. RegioML outperforms our physics-based RegioSQM20 method (J. Cheminform. 2021, 13:10), where the precision is only 75%. Even for the out-of-sample dataset, RegioML slightly outperforms RegioSQM20. The good performance of RegioML and WLN is in large part due to the large datasets available for this type of reaction. However, for reactions where there is little experimental data, physics-based approaches like RegioSQM20 can be used to generate synthetic data for model training. We demonstrate this by showing that the performance of RegioSQM20 can be reproduced by an ML model trained on RegioSQM20-generated data.
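The two metrics quoted above can be computed from binary labels as in the following sketch. The labels are made up for illustration (1 = reactive center, 0 = non-reactive), and the function name is hypothetical.

```python
def accuracy_and_precision(y_true, y_pred):
    """Accuracy: share of all predictions that are correct.
    Precision: share of positive predictions that are correct."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # Precision is undefined when nothing is predicted positive.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else float("nan")
    return accuracy, precision


# Illustrative labels only, not data from the paper.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
accuracy, precision = accuracy_and_precision(y_true, y_pred)
print(accuracy, precision)  # 0.75 0.75
```

The two numbers coincide here only by construction of the toy labels. In a dataset where most atoms are non-reactive, accuracy can be high while precision is noticeably lower, which is why both are reported.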


2018 ◽  
Author(s):  
Erki Aun ◽  
Age Brauer ◽  
Veljo Kisand ◽  
Tanel Tenson ◽  
Maido Remm

Abstract: We have developed an easy-to-use and memory-efficient method called PhenotypeSeeker that (a) generates a k-mer-based statistical model for predicting a given phenotype and (b) predicts the phenotype from the sequencing data of a given bacterial isolate. The method was validated on 167 Klebsiella pneumoniae isolates (virulence), 200 Pseudomonas aeruginosa isolates (ciprofloxacin resistance) and 460 Clostridium difficile isolates (azithromycin resistance). The phenotype prediction models trained from these datasets performed with 88% accuracy on the K. pneumoniae test set, 88% on the P. aeruginosa test set and 96.5% on the C. difficile test set. Prediction accuracy was the same for assembled sequences and raw sequencing data; however, building the model from assembled genomes is significantly faster. On these datasets, the model building on a mid-range Linux server takes approximately 3 to 5 hours per phenotype if assembled genomes are used and 10 hours per phenotype if raw sequencing data are used. The phenotype prediction from assembled genomes takes less than one second per isolate. Thus, PhenotypeSeeker should be well-suited for predicting phenotypes from large sequencing datasets. PhenotypeSeeker is implemented in the Python programming language, is open-source software and is available at GitHub (https://github.com/bioinfo-ut/PhenotypeSeeker/).
Summary: Predicting phenotypic properties of bacterial isolates from their genomic sequences has numerous potential applications. A good example would be prediction of antimicrobial resistance and virulence phenotypes for use in medical diagnostics. We have developed a method that is able to predict phenotypes of interest from the genomic sequence of the isolate within seconds. The method uses a statistical model that can be trained automatically on isolates with known phenotype. The method is implemented in the Python programming language and can be run on a low-end Linux server and/or on laptop computers.
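A k-mer-based model first needs each isolate's sequence turned into k-mer features. The sketch below shows one way that step could look, assuming plain strings as sequences and a presence/absence encoding. The function names and toy sequences are hypothetical, and this is not PhenotypeSeeker's own code.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping substring of length k in the sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def presence_matrix(sequences, k):
    """Build a binary isolate-by-k-mer matrix over the k-mers seen in any isolate."""
    counts = [count_kmers(seq, k) for seq in sequences]
    vocabulary = sorted(set().union(*counts))
    rows = [[1 if kmer in c else 0 for kmer in vocabulary] for c in counts]
    return vocabulary, rows


# Two made-up "isolates"; real genomes are millions of bases long.
isolates = ["ATGATGA", "ATGCCGA"]
vocabulary, rows = presence_matrix(isolates, k=3)
print(vocabulary)
print(rows)
```

Each row can then serve as the feature vector of one isolate, with the known phenotype as the label for a statistical model.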


Vaccines ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. 709
Author(s):  
Ivan Dimitrov ◽  
Nevena Zaharieva ◽  
Irini Doytchinova

The identification of protective immunogens is the most important and vigorous initial step in the long-lasting and expensive process of vaccine design and development. Machine learning (ML) methods are very effective in data mining and in the analysis of big data such as microbial proteomes. They are able to significantly reduce the experimental work for discovering novel vaccine candidates. Here, we applied six supervised ML methods (partial least squares-based discriminant analysis, k nearest neighbor (kNN), random forest (RF), support vector machine (SVM), random subspace method (RSM), and extreme gradient boosting (xgboost)) on a set of 317 known bacterial immunogens and 317 bacterial non-immunogens and derived models for immunogenicity prediction. The models were validated by internal cross-validation in 10 groups from the training set and by the external test set. All of them showed good predictive ability, but the xgboost model displayed the most prominent ability to identify immunogens, recognizing 84% of the known immunogens in the test set. The combined RSM-kNN model was the best in the recognition of non-immunogens, identifying 92% of them in the test set. The three best performing ML models (xgboost, RSM-kNN, and RF) were implemented in the new version of the server VaxiJen, and the prediction of bacterial immunogens is now based on majority voting.
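Majority voting over three binary classifiers can be sketched as follows. The per-model predictions are invented for illustration (1 = immunogen, 0 = non-immunogen), and the function name is hypothetical; the server's actual implementation is not described in the abstract.

```python
def majority_vote(*model_predictions):
    """Combine equal-length lists of 0/1 predictions, one list per model.
    A sample is labelled 1 when more than half of the models say 1."""
    n_models = len(model_predictions)
    return [1 if 2 * sum(votes) > n_models else 0
            for votes in zip(*model_predictions)]


# Invented predictions for four proteins from three models.
xgboost_pred = [1, 1, 0, 0]
rsm_knn_pred = [1, 0, 0, 1]
rf_pred = [0, 1, 0, 1]

consensus = majority_vote(xgboost_pred, rsm_knn_pred, rf_pred)
print(consensus)  # [1, 1, 0, 1]
```

With an odd number of models a tie cannot occur, which is one practical reason to vote over three models.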


SPE Journal ◽  
2018 ◽  
Vol 23 (04) ◽  
pp. 1075-1089 ◽  
Author(s):  
Jared Schuetter ◽  
Srikanta Mishra ◽  
Ming Zhong ◽  
Randy LaFollette (ret.)

Summary Considerable amounts of data are being generated during the development and operation of unconventional reservoirs. Statistical methods that can provide data-driven insights into production performance are gaining in popularity. Unfortunately, the application of advanced statistical algorithms remains somewhat of a mystery to petroleum engineers and geoscientists. The objective of this paper is to provide some clarity to this issue, focusing on how to build robust predictive models and how to develop decision rules that help identify factors separating good wells from poor performers. The data for this study come from wells completed in the Wolfcamp Shale Formation in the Permian Basin. Data categories used in the study included well location and assorted metrics capturing various aspects of well architecture, well completion, stimulation, and production. Predictive models for the production metric of interest are built using simple regression and other advanced methods such as random forests (RFs), support-vector regression (SVR), gradient-boosting machine (GBM), and multidimensional Kriging. The data-fitting process involves splitting the data into a training set and a test set, building a regression model on the training set and validating it with the test set. Repeated application of a “cross-validation” procedure yields valuable information regarding the robustness of each regression-modeling approach. Furthermore, decision rules that can identify extreme behavior in production wells (i.e., top x% of the wells vs. bottom x%, as ranked by the production metric) are generated using the classification and regression-tree algorithm. The resulting decision tree (DT) provides useful insights regarding what variables (or combinations of variables) can drive production performance into such extreme categories. 
The main contributions of this paper are to provide guidelines on how to build robust predictive models, and to demonstrate the utility of DTs for identifying factors responsible for good vs. poor wells.
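The train/test splitting and repeated cross-validation described in the summary can be sketched as below. This is a minimal illustration, assuming a one-variable least-squares line as the regression model and synthetic data; the function names, fold count, repeat count, and data are hypothetical and are not from the paper.

```python
import random


def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k folds."""
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def repeated_cv_rmse(xs, ys, k=5, repeats=3, seed=0):
    """Return one held-out RMSE per fold, over several reshuffled repeats."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        for fold in k_fold_indices(len(xs), k, rng):
            held_out = set(fold)
            train = [i for i in range(len(xs)) if i not in held_out]
            slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
            squared = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in fold]
            scores.append((sum(squared) / len(squared)) ** 0.5)
    return scores


# Synthetic data: a line plus a small deterministic wiggle standing in for noise.
xs = [float(i) for i in range(20)]
ys = [3.0 * x + 2.0 + ((i * 7) % 5 - 2) * 0.5 for i, x in enumerate(xs)]

scores = repeated_cv_rmse(xs, ys)
mean_rmse = sum(scores) / len(scores)
print(len(scores), round(mean_rmse, 3), round(max(scores) - min(scores), 3))
```

The spread of the per-fold scores across repeats gives a rough sense of how robust a modeling approach is to the particular split; the same loop could be run for each candidate model and the distributions compared.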


Author(s):  
Oji-Okoro Izuchukwu ◽  
Huang Huiping ◽  
Abba Shehu Abubakar ◽  
Edun Adetunji Olufemi

The agricultural sector is seen as an engine that contributes to the growth of Nigeria's overall economy. Despite several government efforts, the sector is still characterized by low yields and limited areas under cultivation, owing to the government's dependence on a monocultural economy based on oil. This study evaluates the impact of FDI and trade on agricultural sector development in Nigeria over the period 1980-2009. A VAR model was used to analyze the variables, following a three-step procedure. Unit root tests were conducted using the Augmented Dickey-Fuller (ADF) and Phillips-Perron (PP) tests. The Johansen and Juselius multivariate cointegration test indicates that cointegration is present. The Granger causality test results show that, depending on the pair of variables, the relationship is bidirectional, unidirectional, or not causal. To boost agricultural output and develop the sector as a whole, it is recommended not only that more FDI be sourced, but also that the government provide a quality legal and administrative framework and encourage more exportation of agricultural output, which will enhance foreign exchange earnings and improve the competitiveness of Nigerian agricultural produce in the international market.


2019 ◽  
Vol 9 (7) ◽  
pp. 1428 ◽  
Author(s):  
Adedoyin Isola LAWAL ◽  
Ernest Onyebuchi FIDELIS ◽  
Abiola Ayoopo BABAJIDE ◽  
Barnabas O. OBASAJU ◽  
Oluwatoyese OYETADE ◽  
...  

This study examines the impact of fiscal policy on agricultural output in Nigeria using the most recent official data. The metrics for fiscal policy are government capital expenditure and customs duties on fertilizer. The study used annual time series data obtained from the CBN annual statistical bulletin, NCS, and FIRS; the series were found to be a mix of I(1) and I(0). This mixed order of integration led to the use of the ARDL estimation method in the empirical analysis. The study found evidence of both short- and long-run relationships between the variables (VAO, GEX, IDMF, and ACGSF) using both the Johansen co-integration and ARDL bounds tests. Government expenditure (GEX) on the agricultural sector was found to be statistically insignificant, which suggests that the government should increase agricultural capital expenditure so that its contribution becomes significant. Customs duties on fertilizer (IDMF) were found to be negatively signed and significant, indicating a negative impact on agricultural output. Policy makers should therefore be prudent in using fiscal policy instruments to achieve their desired objectives.


2014 ◽  
Vol 931-932 ◽  
pp. 738-743
Author(s):  
Satika Boonkaewwan ◽  
Srilert Chotpantarat

The Lower Yom River Basin is located in the north of Thailand. This study calibrated and validated the SWAT model in terms of streamflow and sediment concentration hydrographs (2000-2012) for three streamflow gauging stations of the Royal Irrigation Department (RID). The simulated nitrate concentrations have been influenced by land use changes during the last ten years. Optimal values of the model parameters were derived from the calibration and validation processes, which showed a good fit between observed and simulated results. In the last decade, particularly in the Lower Yom River, land use gradually shifted toward paddy fields, which increased 127.48 km2 (approx. 0.87% increase), followed by urban areas, which increased 196.66 km2 (approx. 1.35% increase). The average monthly concentration of nitrate increased 38.28 mg/l (approx. 13.40% increase), 43.17 mg/l (approx. 12.00% increase), and 43.02 mg/l (approx. 8.60% increase) at stations Y.6, Y.4 and Y.17, respectively. Accordingly, on the basis of the results presented in this study, land use changes can significantly affect nitrate concentrations.

