Comprehensive Data Analysis and Forecasting of Covid19 Outbreak in India (Preprint)

2020 ◽  
Author(s):  
Amit Thakur ◽  
Rajesh Singh ◽  
Anita Gehlot ◽  
Shaik Vaseem Akram ◽  
Prabin Kumar Das

BACKGROUND COVID-19 is a disease that is spreading at a rapid pace across the entire world. The present study addresses the outbreak of COVID-19 in India and estimates the rise in cases there. It also describes the present health infrastructure and the infected health workforce, with supporting statistics.
OBJECTIVE The study aims to analyze and estimate developments in the near future with reference to COVID-19 in India. It also looks at the level of preparation of the Indian government for this outbreak. The scope is narrowed to building prediction models for the Indian region, using SVMs for time-series prediction methods that are easily built and readable under these critical conditions. The study does not cover the COVID-19 outbreak in any other country.
METHODS Support Vector Machine and Linear Regression are implemented in this study to predict the expected number of cases. For modelling, the input data are the case counts from March 15, 2020 onward. The two models are trained on these input data to predict the number of cases. The results show that both support vector machine and linear regression give good prediction accuracy.
RESULTS 1. Considering the change in slope of both curves in the graph, it can be concluded that the trained model gives a fairly good range of accuracy. 2. The graph plots the predicted values against the actual values fed in during testing of the model.
CONCLUSIONS In conclusion, the present work presents observations and predictions about the COVID-19 outbreak in the Indian region. Although the rate of growth in India is not equal to the rate of growth at world level, the situation appears dangerous because India is heading towards exponential growth. According to two separate time series forecasting models, the expected number of patients reaches into the millions within the next 30 days. Given the poor health facilities, it will be difficult to combat the outbreak unless the government takes effective measures. In contrast to strict lockdown, social distancing, isolation, patient testing and medical care need to be implemented on a war footing to combat the pandemic in India. The forecasts in this study are still at an early stage because the historical data are too limited to build a reliable model. Moreover, the rise in cases in India dates only from the last 10 days, so the training of the model may not be accurate; however, the prediction model can be improved on as more medical and demographic data become available. Furthermore, even if the predictions are only 60-70 percent correct, the nation will still face very hard days.
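The modelling workflow described in this abstract (fit on early case counts, then compare predictions with held-out actual values) can be sketched as follows. This is a minimal illustration, not the authors' code: the case counts are synthetic, the names `fit_line` and `mean_absolute_error` are hypothetical, and only the linear regression half is shown. For counts that grow exponentially, fitting the logarithm of the counts is a common alternative.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic cumulative counts (NOT real data); day 0 stands in for March 15, 2020.
days = list(range(20))
cases = [100 + 45 * d for d in days]

# Chronological split: train on the first 15 days, test on the last 5.
train_days, test_days = days[:15], days[15:]
train_cases, test_cases = cases[:15], cases[15:]

slope, intercept = fit_line(train_days, train_cases)
predicted = [slope * d + intercept for d in test_days]
test_error = mean_absolute_error(test_cases, predicted)
```

Splitting by time rather than at random keeps the test set strictly in the future of the training set, which is what a forecast is judged on.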

2019 ◽  
Vol 4 (2) ◽  
pp. 104-107
Author(s):  
Andi Bode

Coconut trees are widely used by people, so the plant is considered a multipurpose one; one of its products is coconut oil, which is produced from the fruit of the coconut tree. The volume of coconut oil produced is important to every company in the production sector that aims to meet its production targets. However, daily oil production fluctuates, so companies very much need forecasts of production volume. This study compares the support vector machine and linear regression methods, using backward elimination for feature selection, on Sales Order time series data. On the sales order dataset, the Support Vector Machine (SVM) method gave an RMSE of 0.127, SVM with Backward Elimination (BE) gave an RMSE of 0.115, Linear Regression (LR) gave an RMSE of 0.118, and LR with Backward Elimination gave an RMSE of 0.118. From this comparison it can be concluded that SVM with Backward Elimination performs better than SVM, LR, and LR with Backward Elimination.
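The comparison in this abstract rests on RMSE, where lower is better. A minimal sketch of the metric and of the ranking step, using only the four RMSE values reported above; the function name `rmse` and the dictionary layout are illustrative assumptions, not taken from the paper:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# RMSE values as reported in the abstract (BE = Backward Elimination).
reported_rmse = {
    "SVM": 0.127,
    "SVM + BE": 0.115,
    "LR": 0.118,
    "LR + BE": 0.118,
}

# The method with the smallest RMSE is the best performer.
best_method = min(reported_rmse, key=reported_rmse.get)
```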


Author(s):  
Guan-fa Li ◽  
Wen-sheng Zhu

Because wind speed and direction are random, the output power of a wind turbine is also random. Large-scale wind power integration has many adverse effects on the power quality of the power system and makes it harder to formulate power system dispatching plans. To improve prediction accuracy, an optimized method of wind speed prediction based on a support vector machine and a genetic algorithm is put forward. Compared with other optimization methods, the simulation results show that the optimized genetic algorithm not only converges quickly but also finds more suitable parameters for the data samples. When the data are updated according to the time series, the optimization range of the vaccine and parameters is adaptively adjusted and updated. Therefore, as a new optimization method, it has theoretical significance and practical application value, and it can be applied to other time series prediction models.
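The idea of searching for support vector machine parameters with a genetic algorithm can be sketched as below. This is a plain genetic algorithm with elitism, not the optimized variant the abstract describes, and `validation_error` is a hypothetical stand-in for the cross-validated error of an SVM evaluated at `(log_c, log_gamma)`.

```python
import random


def validation_error(log_c, log_gamma):
    # Hypothetical stand-in for an SVM's validation error.
    # Its minimum is placed at (1.0, -2.0) purely for illustration.
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


def genetic_search(generations=40, pop_size=20, seed=0):
    """Minimise validation_error over (log_c, log_gamma) pairs."""
    rng = random.Random(seed)
    population = [(rng.uniform(-3, 3), rng.uniform(-5, 1)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda ind: validation_error(*ind))
        history.append(validation_error(*population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()
            # Blend crossover plus small Gaussian mutation on each parameter.
            child = tuple(w * x + (1 - w) * y + rng.gauss(0, 0.1) for x, y in zip(a, b))
            children.append(child)
        population = children
    best = min(population, key=lambda ind: validation_error(*ind))
    history.append(validation_error(*best))
    return best, history


best_params, error_history = genetic_search()
```

Because the best individual is always carried over, the recorded error can never get worse from one generation to the next.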


Symmetry ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 212
Author(s):  
Yu-Wei Liu ◽  
Huan Feng ◽  
Heng-Yi Li ◽  
Ling-Ling Li

Accurate prediction of photovoltaic power is conducive to the application of clean energy and sustainable development. An improved whale algorithm is proposed to optimize the Support Vector Machine model. The characteristic of the model is that it needs less training data to symmetrically adapt to the prediction conditions of different weather, and has high prediction accuracy in different weather conditions. This study aims to (1) select light intensity, ambient temperature and relative humidity, which are strictly related to photovoltaic output power, as the input data; (2) apply wavelet soft threshold denoising to preprocess the input data, reducing the noise they contain, to symmetrically enhance the adaptability of the prediction model in different weather conditions; (3) improve the whale algorithm by using tent chaotic mapping, nonlinear disturbance and a differential evolution algorithm; (4) apply the improved whale algorithm to optimize the Support Vector Machine model in order to improve its prediction accuracy. The experiment shows that the short-term prediction model of photovoltaic power based on the symmetry concept achieves ideal accuracy in different weather. The systematic method for output power prediction of renewable energy is conducive to reducing the workload of predicting the output power and to promoting the application of clean energy and sustainable development.
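Step (2) of this abstract, wavelet soft threshold denoising, can be illustrated with a one-level transform. The abstract does not name the wavelet or the threshold rule, so the Haar wavelet, the single decomposition level, and the function names below are assumptions made for the sketch.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; anything within the threshold becomes 0."""
    magnitude = abs(value) - threshold
    if magnitude <= 0:
        return 0.0
    return math.copysign(magnitude, value)


def haar_denoise(signal, threshold):
    """One-level Haar transform, soft-threshold the detail coefficients, then invert."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    denoised = []
    for i in range(0, len(signal), 2):
        a, b = signal[i], signal[i + 1]
        approx = (a + b) / root2  # low-frequency (smooth) part
        detail = soft_threshold((a - b) / root2, threshold)  # high-frequency part
        denoised.append((approx + detail) / root2)
        denoised.append((approx - detail) / root2)
    return denoised
```

With a threshold of zero the signal is reconstructed unchanged; with a very large threshold each pair of samples collapses to its average.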


2021 ◽  
Author(s):  
Lance F Merrick ◽  
Dennis N Lozada ◽  
Xianming Chen ◽  
Arron H Carter

Most genomic prediction models are linear regression models that assume continuous and normally distributed phenotypes, but responses to diseases such as stripe rust (caused by Puccinia striiformis f. sp. tritici) are commonly recorded in ordinal scales and percentages. Disease severity (SEV) and infection type (IT) data in germplasm screening nurseries generally do not follow these assumptions. In this regard, researchers may ignore the lack of normality, transform the phenotypes, use generalized linear models, or use supervised learning algorithms and classification models with no restriction on the distribution of response variables, which are less sensitive when modeling ordinal scores. The goal of this research was to compare classification and regression genomic selection models for skewed phenotypes using stripe rust SEV and IT in winter wheat. We extensively compared both regression and classification prediction models using two training populations composed of breeding lines phenotyped in four years (2016-2018, and 2020) and a diversity panel phenotyped in four years (2013-2016). The prediction models used 19,861 genotyping-by-sequencing single-nucleotide polymorphism markers. Overall, square root transformed phenotypes using rrBLUP and support vector machine regression models displayed the highest combination of accuracy and relative efficiency across the regression and classification models. Further, a classification system based on support vector machine and ordinal Bayesian models with a 2-Class scale for SEV reached the highest class accuracy of 0.99. This study showed that breeders can use linear and non-parametric regression models within their own breeding lines over combined years to accurately predict skewed phenotypes.


2020 ◽  
Author(s):  
Zhanyou Xu ◽  
Andreomar Kurek ◽  
Steven B. Cannon ◽  
Williams D. Beavis

Abstract Selection of markers linked to alleles at quantitative trait loci (QTL) for tolerance to Iron Deficiency Chlorosis (IDC) has not been successful. Genomic selection has been advocated for continuous numeric traits such as yield and plant height. For ordinal data types such as IDC, genomic prediction models have not been systematically compared. The objectives of the research reported in this manuscript were to evaluate the most commonly used genomic prediction method, ridge regression, and its equivalent logistic ridge regression method against algorithmic modeling methods including random forest, gradient boosting, support vector machine, K-nearest neighbors, Naïve Bayes, and artificial neural network, using the usual comparator metric of prediction accuracy. In addition, we compared the methods using metrics of greater importance for decisions about selecting and culling lines for use in variety development and genetic improvement projects. These metrics include specificity, sensitivity, precision, decision accuracy, and area under the receiver operating characteristic curve. We found that Support Vector Machine provided the best specificity for culling IDC susceptible lines, while Random Forest GP models provided the best combined set of decision metrics for retaining IDC tolerant and culling IDC susceptible lines.
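Several of the decision metrics named in this abstract follow directly from a confusion matrix. The sketch below uses the standard definitions of sensitivity, specificity, and precision; the class labels, the choice of "tolerant" as the positive class, and the example lines are hypothetical and do not come from the study.

```python
def decision_metrics(actual, predicted, positive="tolerant"):
    """Compute sensitivity, specificity, precision and accuracy for a binary call."""
    tp = fp = tn = fn = 0
    for truth, call in zip(actual, predicted):
        if call == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # tolerant lines correctly retained
        "specificity": ratio(tn, tn + fp),  # susceptible lines correctly culled
        "precision": ratio(tp, tp + fp),  # retained lines that are truly tolerant
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical calls for eight lines (not data from the study).
actual = ["tolerant", "tolerant", "tolerant", "susceptible",
          "susceptible", "susceptible", "susceptible", "tolerant"]
predicted = ["tolerant", "tolerant", "susceptible", "susceptible",
             "susceptible", "tolerant", "susceptible", "tolerant"]
metrics = decision_metrics(actual, predicted)
```

A model can score well on overall accuracy while still culling poorly, which is why specificity is reported separately for the susceptible class.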


Author(s):  
Jianmin Bian ◽  
Qian Wang ◽  
Siyu Nie ◽  
Hanli Wan ◽  
Juanjuan Wu

Abstract Fluctuations in groundwater depth play an important role in the transport of nitrogen in the unsaturated zone but are often overlooked. To evaluate directly how nitrogen transport varies with fluctuations in groundwater depth, prediction models of groundwater depth and nitrogen transport were combined, using a least squares support vector machine and Hydrus-1D, and applied in the western irrigation area of Jilin in China. The calibration and testing results showed that the prediction models were reliable. Considering different groundwater depths, the concentration of nitrogen was affected significantly with a groundwater depth of 3.42–1.71 m, while it was not affected with a groundwater depth of 5.48–6.47 m. The total leaching loss of nitrogen gradually increased with the continuous decrease of groundwater depth. Furthermore, the limited groundwater depth of 1.7 m was found to reduce the risk of nitrogen pollution. This paper systematically analyzes the relationship between groundwater depth and nitrogen transport to form appropriate agriculture strategies.

