Estimation of annual runoff using selected machine learning algorithms

Author(s):  
Ujjwal Singh ◽  
Rajani Kumar Pradhan ◽  
Shailendra Pratap ◽  
Martin Hanel ◽  
Ioannis Markonis ◽  
...  

Annual runoff is a key component of the water balance at the catchment and large river basin scales. It forms the boundary conditions for mathematical modelling of the hydrological balance at finer temporal and spatial scales, and it is important for assessing the impact of climate change on water resources. Several global gridded runoff datasets are currently available: GRUN and E-RUN provide monthly estimates of runoff rate at a spatial resolution of 0.5 degrees, with GRUN covering the globe and E-RUN covering Europe [1,2]. In this study, we evaluate the capability of gridded paleoclimate reconstructions of precipitation, PDSI, and temperature to estimate annual surface runoff using selected machine learning techniques. As benchmark runoff information we use the GRUN and E-RUN datasets, both aggregated to the annual time scale for the periods 1902–2014 (GRUN) and 1952–2015 (E-RUN). The following machine learning algorithms were tested: random forests, SVM, MLP, LDA, and extra trees. Reconstructed precipitation, temperature, PDSI [3], and runoff estimated using selected Budyko models with different spatial aggregations served as inputs [4–7]; different combinations of inputs were analysed. Our results show that the estimated surface runoff is in good agreement with the E-RUN and GRUN datasets for the analysed periods. The newly tested approach based on the derived machine learning models can be further applied to paleoclimatic reconstructions of runoff fields.

References:

1. Ghiggi, G., Humphrey, V., Seneviratne, S. I. & Gudmundsson, L. GRUN: an observation-based global gridded runoff dataset from 1902 to 2014. Earth Syst. Sci. Data 11, 1655–1674 (2019).
2. Gudmundsson, L. & Seneviratne, S. I. Observation-based gridded runoff estimates for Europe (E-RUN version 1.1). Earth Syst. Sci. Data 8, 279–295 (2016).
3. Cook, E. R. et al. Old World megadroughts and pluvials during the Common Era. Sci. Adv. 1, e1500561 (2015).
4. Schreiber, P. Über die Beziehungen zwischen dem Niederschlag und der Wasserführung der Flüsse in Mitteleuropa. Z. Meteorol. 21, 441–452 (1904).
5. Ol'dekop, E. M. On evaporation from the surface of river basins. Trans. Meteorol. Obs. 4, 200 (1911).
6. Turc, L. Le bilan d'eau des sols: relations entre les précipitations, l'évaporation et l'écoulement (1953).
7. Pike, J. G. The estimation of annual run-off from meteorological data in a tropical climate. J. Hydrol. 2, 116–123 (1964).
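A minimal sketch of the modelling step follows, assuming annual gridded inputs have already been extracted to a feature table. All arrays below are synthetic placeholders (not GRUN/E-RUN data), the temperature-based PET proxy is a crude assumption, and only the Schreiber and Ol'dekop Budyko formulas from the references are illustrated, with a random forest standing in for the full set of tested algorithms.

```python
# Sketch: annual runoff estimation from reconstructed climate fields.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
precip = rng.uniform(300, 2000, n)    # annual precipitation [mm], synthetic
temp = rng.uniform(-5, 25, n)         # mean annual temperature [degC], synthetic
pdsi = rng.uniform(-4, 4, n)          # Palmer Drought Severity Index, synthetic
pet = np.maximum(100.0, 58.93 * np.maximum(temp, 0))  # crude PET proxy (assumption)

# Budyko-type runoff estimates used as additional predictors:
q_schreiber = precip * np.exp(-pet / precip)          # Schreiber (1904)
q_oldekop = precip - pet * np.tanh(precip / pet)      # Ol'dekop (1911)

X = np.column_stack([precip, temp, pdsi, q_schreiber, q_oldekop])
runoff_grun = q_schreiber + rng.normal(0, 30, n)      # synthetic stand-in for GRUN

X_tr, X_te, y_tr, y_te = train_test_split(X, runoff_grun, random_state=0)
model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print(f"R^2 on held-out cells: {model.score(X_te, y_te):.3f}")
```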

2021 ◽  
pp. 1-17
Author(s):  
Ahmed Al-Tarawneh ◽  
Ja’afer Al-Saraireh

Twitter is one of the most popular platforms used to share and post ideas. Hackers and anonymous attackers use these platforms maliciously, and their behavior can be used to predict the risk of future attacks by gathering and classifying hackers' tweets using machine-learning techniques. Previous approaches for detecting infected tweets are based on human effort or text analysis, and are thus limited in capturing the hidden meaning between the lines of tweets. The main aim of this research paper is to enhance the efficiency of hacker detection on the Twitter platform using complex network techniques with adapted machine learning algorithms. This work presents a methodology that collects a list of users from a hackers' community on Twitter, together with their followers, who share posts with similar interests. The list is built from a set of suggested keywords that are the terms commonly used by hackers in their tweets. After that, a complex network is generated for all users to find relations among them in terms of network centrality, closeness, and betweenness. After extracting these values, a dataset of the most influential users in the hacker community is assembled. Subsequently, tweets belonging to users in the extracted dataset are gathered and classified into positive and negative classes, and the output of this process is fed into a machine learning stage applying different algorithms. This research builds and investigates an accurate dataset containing real users who belong to a hackers' community. Correctly classified instances were measured for accuracy using the average values of the K-nearest neighbor, Naive Bayes, Random Tree, and support vector machine techniques, demonstrating about 90% and 88% accuracy for cross-validation and percentage split, respectively. Consequently, the proposed cyber Twitter network model is able to detect hackers and determine whether tweets pose a risk to institutions and individuals, providing early warning of possible attacks.
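As an illustration of the centrality step, the sketch below builds a small follower graph and ranks users by the three measures named above; the user handles and edges are invented, and the simple sum used to combine the measures is an assumption, not the paper's method.

```python
# Sketch of the influence-ranking stage on a toy follower graph.
import networkx as nx

edges = [("userA", "userB"), ("userB", "userC"),
         ("userA", "userC"), ("userD", "userB")]   # follower -> followed
G = nx.DiGraph(edges)

# Network measures used to rank influence inside the community:
degree = nx.degree_centrality(G)
closeness = nx.closeness_centrality(G)
betweenness = nx.betweenness_centrality(G)

# Keep the most influential users for the tweet-classification stage.
influence = {u: degree[u] + closeness[u] + betweenness[u] for u in G}
top_users = sorted(influence, key=influence.get, reverse=True)[:2]
print(top_users)
```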


2020 ◽  
Vol 7 (10) ◽  
pp. 380-389
Author(s):  
Asogwa D.C ◽  
Anigbogu S.O ◽  
Anigbogu G.N ◽  
Efozia F.N

Author age prediction is the task of determining an author's age by studying the texts they have written. Predicting an author's age can be enlightening about the trends, opinions, and social and political views of an age group. Marketers often use this to promote a product or service to an age group according to its conveyed interests and opinions. Methodologies in natural language processing have made it possible to predict an author's age from text by examining the variation of linguistic characteristics, and many machine learning algorithms have been used for this task. However, in social networks, computational linguists are challenged with numerous issues, just as machine learning techniques are performance-driven with their own challenges in realistic scenarios. This work developed a model that predicts an author's age from text with a machine learning algorithm (Naïve Bayes) using three types of features: content-based, style-based, and topic-based. The trained model gave a prediction accuracy of 80%.
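A minimal sketch of such a classifier is shown below; the texts and age labels are toy placeholders, and a TF-IDF bag-of-words representation stands in for the paper's combined content-based, style-based, and topic-based features.

```python
# Sketch: Naive Bayes age-group classification from raw text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["omg that movie was lit", "in my day we wrote letters",
         "just finished my homework lol", "the grandchildren visited today"]
ages = ["young", "older", "young", "older"]   # toy labels

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
clf.fit(texts, ages)
print(clf.predict(["that concert was lit lol"]))
```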


Author(s):  
Virendra Tiwari ◽  
Balendra Garg ◽  
Uday Prakash Sharma

Machine learning algorithms are capable of managing multi-dimensional data in dynamic environments. Despite their many valuable features, some challenges remain. Machine learning algorithms still require additional mechanisms or procedures for predicting a large number of new classes while preserving privacy. These deficiencies mean that the reliable use of a machine learning algorithm depends on human experts, because raw data may complicate the learning process and generate inaccurate results. The interpretation of outcomes therefore demands expertise in machine learning mechanisms, which is a significant challenge. Machine learning techniques also suffer from issues of high dimensionality, adaptability, distributed computing, scalability, streaming data, and duplicity; their main weakness is vulnerability to errors, and they are also found to lack variability. This paper studies how the computational complexity of machine learning algorithms can be reduced by investigating how predictions can be made using an improved algorithm.


Author(s):  
Abraham García-Aliaga ◽  
Moisés Marquina ◽  
Javier Coterón ◽  
Asier Rodríguez-González ◽  
Sergio Luengo-Sánchez

The purpose of this research was to determine the on-field playing positions of a group of football players based on their technical-tactical behaviour using machine learning algorithms. Each player was characterized by a set of 52 non-spatiotemporal descriptors, including offensive, defensive, and build-up variables, computed from OPTA's on-ball event records of matches in 18 national leagues between the 2012 and 2019 seasons. To test whether positions could be identified from the statistical performance of the players, dimensionality reduction techniques were used. To better understand the differences between the player positions, the most discriminatory variables for each group were obtained as a set of rules discovered by RIPPER, a machine learning algorithm. From the combination of both techniques, we obtained useful conclusions for enhancing player performance and identifying positions on the field. The study demonstrates the suitability and potential of artificial intelligence to characterize players' positions according to their technical-tactical behaviour, providing valuable information to professionals in this sport.
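The sketch below illustrates the two-step idea on synthetic data: a projection of the 52 descriptors to a low-dimensional space, followed by an interpretable rule learner. A shallow decision tree is used here as a stand-in for RIPPER, and all player data is randomly generated.

```python
# Sketch: dimensionality reduction plus interpretable rules for positions.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 52))              # 52 technical-tactical descriptors
positions = rng.choice(["DEF", "MID", "FWD"], size=200)

pca = PCA(n_components=2)
X2 = pca.fit_transform(X)                   # low-dimensional view of players
print("Variance explained:", pca.explained_variance_ratio_.sum())

rules = DecisionTreeClassifier(max_depth=3).fit(X, positions)
print(export_text(rules, max_depth=3))      # human-readable splitting rules
```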


2017 ◽  
Vol 7 (1.1) ◽  
pp. 143 ◽  
Author(s):  
J. Deepika ◽  
T. Senthil ◽  
C. Rajan ◽  
A. Surendar

With the growing development of technology and automation, human history has been profoundly reshaped. Computing shifted from large mainframes to PCs to the cloud as ever larger volumes of data accumulated over time. This happened due to the advent of many tools and practices that elevated the next generation of computing. A large number of techniques have been developed so far to automate such computing, and research has moved towards training computers to behave similarly to human intelligence. Here the diversity of machine learning comes into play for knowledge discovery. Machine learning (ML) is applied in many areas such as medicine, marketing, telecommunications, stock trading, health care, and so on. This paper reviews the foundations of machine learning algorithms, their types and flavors, together with R code and Python scripts, where possible, for each machine learning technique.
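As a flavor of the kind of Python script such a review might pair with each technique, here is a minimal supervised-learning example on a standard dataset; it is illustrative only and not taken from the paper.

```python
# Sketch: train and evaluate a supervised classifier in a few lines.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
print(f"Mean 5-fold accuracy: {scores.mean():.3f}")
```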


2021 ◽  
Vol 22 (2) ◽  
pp. 939
Author(s):  
Jiazhi Song ◽  
Guixia Liu ◽  
Jingqing Jiang ◽  
Ping Zhang ◽  
Yanchun Liang

Accurately identifying protein–ATP binding residues is important for protein function annotation and drug design. Previous studies have used classic machine-learning algorithms like support vector machine (SVM) and random forest to predict protein–ATP binding residues; however, as new machine-learning techniques are developed, prediction performance can be further improved. In this paper, an ensemble predictor that combines deep convolutional neural networks and LightGBM through an ensemble learning algorithm is proposed. Three subclassifiers have been developed: a multi-incepResNet-based predictor, a multi-Xception-based predictor, and a LightGBM predictor. The final prediction is the combination of the outputs of the three subclassifiers with an optimized weight distribution. We examined the performance of our proposed predictor on two datasets: a classic ATP-binding benchmark dataset and a newly proposed ATP-binding dataset. Our predictor achieved area under the curve (AUC) values of 0.925 and 0.902 and Matthews correlation coefficient (MCC) values of 0.639 and 0.642, respectively, both better than other state-of-the-art prediction methods.
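The weighted-combination step can be sketched as follows; the probability vectors and weights below are invented placeholders rather than outputs of the trained subclassifiers, and the weight optimization itself is omitted.

```python
# Sketch: weighted ensemble of three subclassifier probability outputs.
import numpy as np
from sklearn.metrics import roc_auc_score, matthews_corrcoef

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
p_incep = np.array([.9, .2, .7, .8, .3, .1, .6, .4])   # multi-incepResNet (toy)
p_xcept = np.array([.8, .3, .6, .9, .2, .2, .7, .5])   # multi-Xception (toy)
p_lgbm  = np.array([.7, .1, .8, .7, .4, .3, .5, .3])   # LightGBM (toy)

weights = np.array([0.4, 0.3, 0.3])     # weight distribution to be optimized
p_final = weights @ np.vstack([p_incep, p_xcept, p_lgbm])

print("AUC:", roc_auc_score(y_true, p_final))
print("MCC:", matthews_corrcoef(y_true, p_final > 0.5))
```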


Data science in healthcare is an innovative and promising field for industry to implement data science applications. Data analytics is a recent science used to explore medical datasets in order to discover disease. It is an initial attempt to identify disease with the help of large medical datasets; using this data science methodology, users can detect their disease without the help of healthcare centres. Healthcare and data science are often linked through finances, as the industry attempts to reduce its expenses with the help of large amounts of data. Data science and medicine are developing rapidly, and it is important that they advance together, because healthcare information is highly valuable to society. Heart disease, in particular, has been increasing in everyday life. Based on heart disease, different factors in the human body are monitored to analyse and prevent it. Classifying these factors using machine learning algorithms and predicting the disease is the major part of this work, which involves supervised machine learning algorithms such as SVM, Naive Bayes, decision trees, and random forest.
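A minimal sketch of the comparison stage is given below, assuming a tabular dataset of numeric risk factors; the data is synthetic and the feature names in the comments are illustrative.

```python
# Sketch: compare the four named classifiers on synthetic heart-disease data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 8))            # e.g. age, blood pressure, cholesterol, ...
y = (X[:, 0] + X[:, 2] + rng.normal(0, .5, 300) > 0).astype(int)

for name, clf in [("SVM", SVC()), ("NaiveBayes", GaussianNB()),
                  ("DecisionTree", DecisionTreeClassifier(random_state=0)),
                  ("RandomForest", RandomForestClassifier(random_state=0))]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: {acc:.3f}")
```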


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Qiang Zhao

Archeological sites are a heritage that we have gained from our ancestors. These sites are crucial for understanding the past and the way people lived in those times; monuments and the immovable relics of antiquity are a gateway to the past. Over the years, however, critical cultural relics have faced the brunt of nature: environmental conditions have deteriorated many important immovable relics, since they cannot simply be moved away, and people moving around ancient cultural relics may also deform them. In this work, machine learning algorithms were used to identify the locations of relics. Data from satellite images were used, and a machine learning algorithm was implemented to maintain and monitor the relics. This research study dwells on the importance of the area from a research point of view and utilizes a machine learning technique called CaffeNet, a deep convolutional neural network. The results showed 96% accuracy in predicting the images, which can be used for tracking human activity and protecting heritage sites in a unique way.
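A sketch of the transfer-learning setup is shown below. Since CaffeNet is a minor variant of AlexNet, torchvision's AlexNet is used as a stand-in, and the two-class head (relic site vs. background) and the random input tensor are assumptions for illustration.

```python
# Sketch: fine-tuning an AlexNet-style CNN for satellite-image classification.
import torch
import torch.nn as nn
import torchvision.models as models

# Pretrained ImageNet weights are downloaded on first use.
net = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
net.classifier[6] = nn.Linear(4096, 2)   # hypothetical 2-class head: relic / background
net.eval()

x = torch.randn(1, 3, 224, 224)          # stand-in for one preprocessed satellite patch
with torch.no_grad():
    logits = net(x)
print(logits.softmax(dim=1))             # class probabilities
```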


2021 ◽  
Vol 4 (3) ◽  
pp. 139-143
Author(s):  
Mariana Vlad ◽  
Sorin Vlad

Machine learning (ML) is a subset of artificial intelligence (AI) that aims to develop systems that can learn and continuously improve their abilities through generalization in an autonomous manner. ML is presently all around us; almost every facet of our digital and real life embeds some ML-related content. Customer recommendation systems, customer behavior prediction, fraud detection, speech recognition, image recognition, black-and-white movie colorization, and accounting fraud detection are just some examples of the vast range of applications in which ML is involved. The techniques this paper investigates are mainly focused on the use of neural networks in accounting and finance research. An artificial neural network models the brain's ability to learn intricate patterns from the information presented at its inputs using elementary interconnected units, called neurons, grouped in layers and trained by means of a learning algorithm. The performance of the network depends on many factors, such as the number of layers, the number of neurons in each layer, the learning algorithm, and the activation functions, to name just a few. Machine learning algorithms have already started to replace humans in jobs that require document processing and decision making.
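The factors listed above map directly onto explicit hyperparameters, as the toy example below illustrates; the dataset and the specific settings are arbitrary choices for demonstration.

```python
# Sketch: a small neural network whose key design factors are explicit.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(32, 16),  # layers and neurons per layer
                    activation="relu",            # activation function
                    solver="adam",                # learning algorithm
                    max_iter=500, random_state=0)
net.fit(X, y)
print(f"Training accuracy: {net.score(X, y):.3f}")
```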


2020 ◽  
Vol 8 (6) ◽  
pp. 5482-5485

In most cases, data for an Intrusion Detection System (IDS) can only be created when all real working environments are explored under every possible attack, which is an expensive task. Network intrusion detection software shields a system and computer network from staff and non-authorized users. The detector's ultimate task is to build a predictive classifier (i.e., a model) that helps distinguish between friendly and non-friendly connections, the latter known as attacks or intrusions. This problem in network sectors is addressed by predicting from the dataset whether a connection is under attack or not. We use the KDDCup99 dataset with bio-inspired machine learning techniques (such as artificial neural networks). Bio-inspired algorithms are a game changer in computer science: the scope of this field is truly magnificent compared with nature around it, the complications of computer science being only a subset of it, opening a new era in next-generation computing, modelling, and algorithm engineering. The aim is to investigate bio-inspired machine learning techniques for better forecasting of packet connection transfers, and to propose a machine-learning-based method that accurately predicts DOS, R2L, U2R, Probe, and overall attacks with the best accuracy among the compared supervised classification machine learning algorithms. Furthermore, we compare and discuss the performance of various ML algorithms on the provided dataset with classification and evaluation reports, finding and analysing the confusion matrix. The results show that the effectiveness of the proposed bio-inspired machine learning technique can be assessed in terms of best accuracy along with precision, specificity, sensitivity, F1 score, and recall.
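A sketch of the evaluation pipeline described above follows; synthetic features stand in for the preprocessed KDDCup99 records (label 0 for normal traffic, 1 for attack), and a small multilayer perceptron represents the bio-inspired artificial neural network.

```python
# Sketch: train an ANN and report confusion matrix plus per-class metrics.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix, classification_report

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 20))                 # stand-in for KDDCup99 features
y = (X[:, 0] - X[:, 5] + rng.normal(0, .7, 1000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
ann = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
ann.fit(X_tr, y_tr)

pred = ann.predict(X_te)
print(confusion_matrix(y_te, pred))
print(classification_report(y_te, pred))        # precision, recall, F1 per class
```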

