Crop Yield Prediction from Meteorological Data Using Efficient Machine Learning Model

Author(s):  
Huma Khan ◽  
S. M. Ghosh
2021 ◽  
Vol 12 (2) ◽  
pp. 49-66
Author(s):  
Janmenjoy Nayak ◽  
Bighnaraj Naik ◽  
Pandit Byomakesha Dash ◽  
Danilo Pelusi

Biomedical data is often unstructured, and processing it is becoming more complex by the day. Biomedical informatics therefore requires competent data analysis and data mining techniques for designing decision support system frameworks that address clinical and healthcare-related issues. Because data sets are growing larger and more complex and demand for biomedical informatics research is rising, researchers are increasingly drawn to automated machine learning models. This paper proposes an efficient machine learning model based on fuzzy c-means with meta-heuristic optimization for biomedical data analysis and clustering. Its main contributions are 1) an efficient machine learning model based on fuzzy c-means and meta-heuristic optimization for biomedical data classification, 2) the use of benchmark validation techniques and critical hypothesis testing, and 3) background on biomedical data processing from a data processing and mining perspective.
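
For readers unfamiliar with fuzzy c-means, the following is a minimal sketch of the standard algorithm only (alternating center and membership updates). It is not the authors' model: the abstract does not say which meta-heuristic is used or how it is coupled to the clustering, so that part is omitted. The function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, and the toy data are assumptions made for illustration.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means on a list of equal-length float tuples.

    Returns (centers, u) where u[i][k] is the membership of point i
    in cluster k. Hypothetical helper, not taken from the paper.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    n = len(points)

    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by u^m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / w_sum
                for d in range(dim)
            ))

        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        for i, p in enumerate(points):
            dists = [
                max(1e-12, sum((p[d] - c[d]) ** 2 for d in range(dim)) ** 0.5)
                for c in centers
            ]
            for k in range(n_clusters):
                u[i][k] = 1.0 / sum(
                    (dists[k] / dj) ** (2.0 / (m - 1.0)) for dj in dists
                )
    return centers, u


# Toy data (made up): two well-separated groups of three points.
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
toy_centers, toy_u = fuzzy_c_means(toy_points, n_clusters=2)
```

Each row of `toy_u` sums to one, and a hard label can be read off as the cluster with the largest membership. A meta-heuristic, where one is used, would typically replace or seed the random initialization, but how the paper does this is not stated in the abstract.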


Author(s):  
Fuad T. Aleskerov ◽  
Nikita Baiborodov ◽  
Sergey Demin ◽  
Sergey Shvydun ◽  
Theodore Trafalis ◽  
...  

2021 ◽  
Author(s):  
Zhihao Song ◽  
Bin Chen ◽  
Yue Huang ◽  
Li Dong ◽  
Tingting Yang

Abstract. Satellite remote-sensing aerosol optical depth (AOD) and meteorological elements were employed to invert PM2.5 in order to control air pollution more effectively. This paper proposes a restricted gradient-descent linear hybrid machine learning model (RGD–LHMLM) that integrates a random forest (RF), a gradient boosting regression tree (GBRT), and a deep neural network (DNN) to estimate the concentration of PM2.5 in China in 2019. The research data included Himawari-8 AOD with high spatiotemporal resolution, ERA-5 meteorological data, and geographic information. The results showed that, in the hybrid model developed by linear fitting, the DNN accounted for the largest proportion, with a weight coefficient of 0.62. The R² values of RF, GBRT, and DNN were 0.79, 0.81, and 0.8, respectively. The generalization ability of the hybrid model was better than that of each sub-model: R² reached 0.84, and RMSE and MAE were 12.92 µg/m³ and 8.01 µg/m³, respectively. For the RGD–LHMLM, R² was above 0.7 at more than 70 % of the sites, and RMSE and MAE were below 20 µg/m³ and 15 µg/m³, respectively, at more than 70 % of the sites, owing to seasonal differences in the correlation coefficient between the meteorological factors and PM2.5. Furthermore, the hybrid model performed best in winter (mean R² of 0.84) and worst in summer (mean R² of 0.71). The spatiotemporal distribution characteristics of PM2.5 in China were then estimated and analyzed. According to the results, pollution was severe in winter, with an average PM2.5 concentration of 62.10 µg/m³, and slight in summer, with an average PM2.5 concentration of 47.39 µg/m³. The findings also indicate that North China and East China are more polluted than other areas, with an average annual PM2.5 concentration of 82.68 µg/m³. Pollution was relatively low in Inner Mongolia, Qinghai, and Tibet, where average PM2.5 concentrations were below 40 µg/m³.
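
The idea of a linear hybrid of sub-models can be sketched as follows. This is an illustration under stated assumptions, not the RGD–LHMLM itself: the abstract does not define the restriction, so the sketch assumes the blend weights are non-negative and sum to one, and enforces that with projected gradient descent on the mean squared error. The names `project_to_simplex` and `fit_blend_weights`, the learning rate, and the toy numbers are all hypothetical.

```python
def project_to_simplex(w):
    """Euclidean projection onto {w : w_k >= 0, sum(w) == 1}."""
    s = sorted(w, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, v in enumerate(s, start=1):
        cumulative += v
        t = (cumulative - 1.0) / j
        if v - t > 0:
            theta = t  # keep the threshold from the last valid index
    return [max(x - theta, 0.0) for x in w]


def fit_blend_weights(preds, y, lr=0.01, n_iter=2000):
    """Fit weights for a linear blend of sub-model predictions.

    preds: one list of predictions per sub-model (for example RF, GBRT, DNN).
    y:     observed targets, same length as each prediction list.
    Assumed constraint: weights stay on the probability simplex.
    """
    n_models, n = len(preds), len(y)
    w = [1.0 / n_models] * n_models
    for _ in range(n_iter):
        blend = [sum(w[k] * preds[k][i] for k in range(n_models))
                 for i in range(n)]
        # Gradient of the mean squared error with respect to each weight.
        grad = [
            2.0 / n * sum((blend[i] - y[i]) * preds[k][i] for i in range(n))
            for k in range(n_models)
        ]
        w = project_to_simplex([w[k] - lr * grad[k] for k in range(n_models)])
    return w


# Toy data (made up): the target is exactly 0.7 * model_a + 0.3 * model_b.
model_a = [1.0, 0.0, 1.0, 0.0]
model_b = [0.0, 1.0, 0.0, 1.0]
target = [0.7, 0.3, 0.7, 0.3]
blend_weights = fit_blend_weights([model_a, model_b], target)
```

On this toy input the fitted weights approach 0.7 and 0.3. In practice the prediction lists would come from sub-models evaluated on held-out data, so that the weights reflect generalization rather than training fit.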

