Heavy Rainfall Prediction using Gini Index in Decision Tree

2019 ◽  
Vol 8 (4) ◽  
pp. 4558-4562

In existing systems, the data are sometimes inaccurate and proper data mining techniques are not used, which increases complexity. Humans are bound to make mistakes when predicting weather conditions, and these mistakes may result in damage to both life and property. To avoid this, we use data mining algorithms for early warning, predicting rainfall from climatic variables such as maximum temperature, minimum temperature, wind speed, rainfall, humidity, pressure, dew point, cloud, sunshine and wind direction. By applying proper algorithms to the datasets and using the right metrics, we can achieve accurate rainfall prediction. Hence, we apply the decision tree algorithm with the Gini index to predict precipitation accurately, based entirely on historical data.
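The abstract does not give implementation details, so the following is only a minimal sketch of how a decision tree can use the Gini index to choose a split. The function names, the single-feature threshold search and the toy humidity data are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Find the split `value <= threshold` with the lowest weighted Gini impurity.

    Returns (threshold, weighted_impurity); threshold is None if no split exists.
    """
    best = (None, float("inf"))
    n = len(values)
    # The largest value is skipped because it would leave the right branch empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data (not from the paper): relative humidity in percent
# and whether rain was recorded on that day.
humidity = [55, 60, 65, 70, 80, 85, 90, 95]
rained = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]

threshold, impurity = best_threshold(humidity, rained)
print(threshold, impurity)
```

A full tree would repeat this search over every feature and recurse into each branch until the nodes are pure or a stopping rule applies.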

2018 ◽  
Vol 1 (94) ◽  
pp. 55-61
Author(s):  
R.O. Myalkovsky

Goal. The purpose of the research was to determine the influence of meteorological factors on potato yield in the conditions of the Right Bank Forest-Steppe of Ukraine. Methods. Field, analytical and statistical. Results. It was established that among the medium-early varieties Divo stands out with a yield of 42.3 t/ha, followed by Malin White (39.8 t/ha) and Legenda (37.1 t/ha). For Divo, the most favourable weather and climatic conditions for tuber production occurred in 2011 (45.9 t/ha) and 2013 (45.1 t/ha). For Legenda, the tuber yield was 40.6 t/ha in 2016 and 43.2 t/ha in 2017. For Malin White, it was 41.4 t/ha in 2013 and 42.1 t/ha in 2017. The medium-ripe varieties showed a slightly lower average yield over the years of research. Among them, Nadiyna stands out (40.3 t/ha), followed by Slovyanka (37.2 t/ha) and Vera (33.8 t/ha). For Vera, the highest-yielding years were 2016 (36.6 t/ha) and 2017 (37.8 t/ha). For Slovyanka and Nadiyna, they were 2011 and 2012, with yields of 42.6 and 44.3 t/ha and 46.5 and 45.3 t/ha, respectively. The yield of the medium-late varieties over the years of research was lower than that of the medium-early and medium-ripe varieties. However, Dar stands out with a high yield of 40.0 t/ha, followed by Alladin (33.6 t/ha) and Oxamit (31.3 t/ha). The most favourable years for Oxamit and Alladin were 2011 (33.5 and 36.5 t/ha) and 2017 (34.1 and 36.4 t/ha), respectively. Favourable harvest years were 2011 and 2012, with yields of 45.7 and 45.8 t/ha. Thus, on average over the years of study, the highest tuber yields were provided by the weather conditions of 2011 and 2017 for medium-early varieties (41.2-43.3 t/ha), of 2012 and 2011 for medium-ripe varieties (41.0-41.1 t/ha), and of 2012 and 2011 for medium-late varieties (37.6-38.5 t/ha), respectively.


Author(s):  
Klent Gomez Abistado ◽  
Catherine N. Arellano ◽  
Elmer A. Maravillas ◽  

This paper presents a scheme of weather forecasting using artificial neural network (ANN) and Bayesian network. The study focuses on the data representing central Cebu weather conditions. The parameters used in this study are as follows: mean dew point, minimum temperature, maximum temperature, mean temperature, mean relative humidity, rainfall, average wind speed, prevailing wind direction, and mean cloudiness. The weather data were collected from the PAG-ASA Mactan-Cebu Station located at latitude: 10°19´, longitude: 123°59´ starting from January 2011 to December 2011 and the values available represent daily averages. These data were used for training the multi-layered backpropagation ANN in predicting the weather conditions of the succeeding days. Some outputs from the ANN, such as the humidity, temperature, and amount of rainfall, are fed to the Bayesian network for statistical analysis to forecast the probability of rain. Experiments show that the system achieved 93%–100% accuracy in forecasting weather conditions.


2018 ◽  
Vol 22 (3) ◽  
pp. 225-242 ◽  
Author(s):  
K. Mathan ◽  
Priyan Malarvizhi Kumar ◽  
Parthasarathy Panchatcharam ◽  
Gunasekaran Manogaran ◽  
R. Varadharajan

Author(s):  
Moloud Abdar ◽  
Sharareh R. Niakan Kalhori ◽  
Tole Sutikno ◽  
Imam Much Ibnu Subroto ◽  
Goli Arji

Heart diseases are among the nation’s leading causes of mortality and morbidity. Data mining techniques can predict the likelihood of patients developing heart disease. The purpose of this study is to compare different data mining algorithms for predicting heart disease. This work applied and compared data mining techniques to predict the risk of heart disease. After feature analysis, models based on five algorithms, namely decision tree (C5.0), neural network, support vector machine (SVM), logistic regression and k-nearest neighbors (KNN), were developed and validated. The C5.0 decision tree built the model with the greatest accuracy, 93.02%, while KNN, SVM and neural network achieved 88.37%, 86.05% and 80.23%, respectively. The results of the decision tree are simple to interpret and apply; its rules can be easily understood by different clinical practitioners.


Author(s):  
Mihye Lee ◽  
Sachiko Ohde ◽  
Kevin Y. Urayama ◽  
Osamu Takahashi ◽  
Tsuguya Fukui

Weather affects the daily lives of individuals. However, its health effects have not been fully elucidated. It may lead to physical symptoms and/or influence mental health. Thus, we evaluated the association between weather parameters and various ailments. We used daily reports on health symptoms from 4548 individuals followed for one month in October of 2013, randomly sampled from the entirety of Japan. Weather variables from the monitoring station located closest to each participant were used as the weather exposure. A logistic mixed effects model with a random intercept for each individual was applied to evaluate the effect of temperature and humidity on physical symptoms. Stratified analyses were conducted to compare weather effects by sex and age group. The lag day effects were also assessed. Joint pain was associated with higher temperature (1.87%, 95% CI = 1.15 to 2.59) and humidity (1.38%, 95% CI = 0.78 to 2.00). Headaches increased by 0.56% (95% CI = −0.55 to 1.77) per 1 °C increase in the maximum temperature and by 1.35% per 1 °C increase in dew point. Weather was associated with various physical symptoms. Women seem to be more sensitive to weather conditions in association with physical symptoms, especially higher humidity and lower temperature.


Author(s):  
Geert Wets ◽  
Koen Vanhoof ◽  
Theo Arentze ◽  
Harry Timmermans

The utility-maximizing framework—in particular, the logit model—is the dominantly used framework in transportation demand modeling. Computational process modeling has been introduced as an alternative approach to deal with the complexity of activity-based models of travel demand. Current rule-based systems, however, lack a methodology to derive rules from data. The relevance and performance of data-mining algorithms that potentially can provide the required methodology are explored. In particular, the C4 algorithm is applied to derive a decision tree for transport mode choice in the context of activity scheduling from a large activity diary data set. The algorithm is compared with both an alternative method of inducing decision trees (CHAID) and a logit model on the basis of goodness-of-fit on the same data set. The ratio of correctly predicted cases of a holdout sample is almost identical for the three methods. This suggests that for data sets of comparable complexity, the accuracy of predictions does not provide grounds for either rejecting or choosing the C4 method. However, the method may have advantages related to robustness. Future research is required to determine the ability of decision tree-based models in predicting behavioral change.
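The comparison described above rests on two simple quantities: the choice probabilities of a logit model and the share of correctly predicted cases in a holdout sample. The sketch below illustrates both under stated assumptions. The utilities, mode names and sample observations are hypothetical, and the code is not the authors' implementation.

```python
import math


def logit_probabilities(utilities):
    """Multinomial logit choice probabilities: P_i = exp(V_i) / sum_j exp(V_j)."""
    # Subtracting the maximum utility avoids overflow and leaves the result unchanged.
    v_max = max(utilities.values())
    exp_v = {mode: math.exp(v - v_max) for mode, v in utilities.items()}
    total = sum(exp_v.values())
    return {mode: e / total for mode, e in exp_v.items()}


def hit_ratio(predicted, observed):
    """Share of holdout cases whose predicted choice equals the observed choice."""
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical systematic utilities for one trip (assumed values).
utilities = {"car": 1.2, "bike": 0.4, "transit": 0.1}
probs = logit_probabilities(utilities)
predicted_mode = max(probs, key=probs.get)

# Hypothetical holdout sample: predictions from any model against observed choices.
print(predicted_mode, hit_ratio(["car", "bike", "car"], ["car", "car", "car"]))
```

The same `hit_ratio` function can score predictions from a decision tree or from the logit model, which is what makes the three methods directly comparable on a holdout sample.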


2012 ◽  
Vol 546-547 ◽  
pp. 452-457
Author(s):  
Ping Ping Chen ◽  
Ding Ying Tan ◽  
Xiu Feng Liu

As sales volumes increase rapidly, the amount of data becomes very large, and managing customer relationships becomes a more complex problem. Data mining can be used to analyze the data and discover the rules and knowledge in them, so that customized services in electronic commerce can be sustained and enterprises’ sales can become more intelligent. With data mining, we can do the following. First, customers’ shopping behavior and preferences can be analyzed with association rules, so that recommendations can be provided during shopping and customers can find the right goods more conveniently and quickly. Second, a decision tree can be used to classify customers, which enables better communication with customers, customized shopping user interfaces and pertinent advertisements.
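Association rules are usually scored by support and confidence. The abstract does not say which measures or algorithm the authors used, so the following is a minimal sketch with hypothetical shopping baskets and assumed function names.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    Computed as support(antecedent and consequent) / support(antecedent).
    """
    ante = support(transactions, antecedent)
    if ante == 0:
        return 0.0
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante


# Hypothetical shopping baskets (not data from the paper).
baskets = [
    {"phone", "case"},
    {"phone", "case", "charger"},
    {"phone", "charger"},
    {"laptop", "mouse"},
]

print(support(baskets, {"phone", "case"}))
print(confidence(baskets, {"phone"}, {"case"}))
```

A recommender built on such rules would suggest the consequent item to customers whose basket already contains the antecedent, keeping only rules above chosen support and confidence thresholds.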


2002 ◽  
Vol 02 (01) ◽  
pp. 127-143 ◽  
Author(s):  
FRANÇOIS POULET

This paper presents a 3D user-centered interactive graphical environment dedicated to data mining. The aims of this environment are to involve the user in the data mining process (to use domain knowledge during the process), to improve comprehensibility (of both the data and the results of data mining algorithms), to improve interactivity and to use algorithms from various research areas: statistics, data analysis, visualization and machine learning. The environment is made of a set of bulletin boards onto which the graphical tools are mapped; bulletin boards are predefined or can be user-defined. Several different visualization tools may be used in a single display; these tools are linked together to improve data comprehensibility. The tools available in the environment are both graphical and non-graphical; they can be used alone or cooperatively. One of these tools is described in more detail: CIAD, a new graphical interactive decision tree construction algorithm that allows bivariate splits and so gives smaller trees (improving result comprehensibility). Its results are compared to those of existing decision tree algorithms. This environment can be used on any personal computer (it is based on open-source software and so is platform independent) as well as on high-performance graphical systems such as reality centers.


2019 ◽  
Vol 86 (4) ◽  
pp. 399-405 ◽  
Author(s):  
Bishwa Bhaskar Choudhary ◽  
Smita Sirohi

Based on ten years of data (2001–10), consisting of 12 673 observations on fortnightly milk yield of buffaloes reared in a dairy farm located in the Northern sub-tropics (29°41′0″N, 76°59′0″E), the present study establishes the relationship between weather conditions and production performance of lactating buffaloes. The critical threshold level of maximum temperature-humidity index (THI) was estimated to be 74, which is higher than that of crossbred cows. The discomfort period for buffaloes begins in mid-March and lasts up to early November. During the aggravated stress condition (THI > 82) prevailing in the region for about 5 months starting from early May, milk productivity declines by more than 1% per unit increase in maximum THI over 82. The maximum temperature and minimum humidity (viz. maximum THI) are the most critical weather parameters causing thermal stress in animals; however, the climatic conditions in the region are such that not only maximum but also minimum THI crosses the critical threshold, providing little relief to the animals during the night.
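The two thresholds reported above can be read as a simple classification rule. The sketch below encodes them under stated assumptions. The handling of values exactly at 74 or 82, the label names and the linear "1% per unit" lower bound are assumptions, and the THI formula itself is not given in the abstract, so the function takes a precomputed maximum THI.

```python
CRITICAL_THI = 74    # critical maximum-THI threshold reported in the abstract
AGGRAVATED_THI = 82  # aggravated stress is reported for THI > 82


def stress_level(max_thi):
    """Classify a daily maximum THI value using the two reported thresholds.

    Assumption: values exactly at a threshold fall into the lower category.
    """
    if max_thi > AGGRAVATED_THI:
        return "aggravated"
    if max_thi > CRITICAL_THI:
        return "discomfort"
    return "comfortable"


def minimum_yield_decline_pct(max_thi, pct_per_unit=1.0):
    """Lower bound on the milk yield decline in percent.

    Assumption: the reported 'more than 1% per unit of THI over 82' is
    treated as exactly 1% per unit and as linear.
    """
    return max(0.0, max_thi - AGGRAVATED_THI) * pct_per_unit


for thi in (70, 78, 85):
    print(thi, stress_level(thi), minimum_yield_decline_pct(thi))
```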

