Application of K-nearest neighbours method for water pipes failure frequency assessment

2018 ◽  
Vol 59 ◽  
pp. 00021
Author(s):  
Małgorzata Kutyłowska

The paper describes the results of failure rate modelling using the K-nearest neighbours (KNN) method. This algorithm is one of the regression methods classed as machine learning methods. The aim of the paper was to check whether this kind of modelling can be applied and to compare the current results with earlier investigations of failure rate prediction in another Polish city. Operational data from 12 years of operation, received from the water utility, were used to predict the dependent variable (failure rate). Data (249 and 294 cases for distribution pipes and house connections, respectively) from the time span 2001–2012 were used to create the KNN models. The optimal model, based on the Euclidean distance metric with the number of nearest neighbours K = 2, was validated on other data (one case for each year). The modelling was performed in Statistica 12.0.
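
The prediction step of KNN regression with the Euclidean metric and K = 2 can be sketched as follows. This is a minimal illustration, not the Statistica implementation used in the paper: the function name `knn_predict`, the two input features and all data values are hypothetical, and the prediction is assumed to be the plain (unweighted) mean of the neighbours' failure rates.

```python
import math

def knn_predict(train_x, train_y, query, k=2):
    """Average the targets of the k training points nearest to `query` (Euclidean distance)."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k

# Hypothetical toy data: (pipe length, number of failures) -> failure rate.
train_x = [(10.0, 2.0), (12.0, 3.0), (30.0, 9.0), (31.0, 10.0)]
train_y = [0.20, 0.25, 0.30, 0.32]

# The two nearest neighbours of the query are the first two rows.
prediction = knn_predict(train_x, train_y, query=(11.0, 2.5), k=2)
```

In practice the input variables would usually be rescaled before distances are computed, since a variable with a large numeric range otherwise dominates the metric.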

2018 ◽  
Vol 44 ◽  
pp. 00086
Author(s):  
Małgorzata Kutyłowska

The paper presents the results of failure rate prediction using the adaptive algorithm MARSplines. This method can be described as segmental, multiple linear regression. The range of the segments defines the range of applicability of the methodology. On the basis of operational data received from the water utility, two separate models were created for distribution pipes and house connections. The calculations were carried out in Statistica 13.1. The maximum number of basis functions was set to 30, and so-called pruning was used. The interaction level was 1, the penalty for adding a basis function was 2, and the threshold was 0.0005. The GCV error equalled 0.0018 and 0.0253 as well as 0.0738 and 0.1058 for distribution pipes and house connections in the learning and prognosis processes, respectively. The prediction results in the validation step were not satisfactory for distribution pipes, because a constant value of failure rate was observed. For house connections the forecasting was slightly better, but the overestimation still seems unacceptable from an engineering point of view.
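
The "segmental" character of MARS-type models comes from hinge basis functions, and the GCV error penalises the mean squared error by the model's effective number of parameters. The sketch below shows both ideas under stated assumptions: the names `hinge`, `mars_like_predict` and `gcv` are hypothetical, the knot and coefficients are invented, and the effective parameter count is passed in by the caller because the exact way Statistica derives it from the penalty setting is not given in the abstract.

```python
def hinge(x, knot, direction=+1):
    """Hinge basis function: max(0, x - knot) for direction +1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))

def mars_like_predict(x, intercept, terms):
    """Additive model (interaction level 1): intercept plus weighted hinge functions.

    `terms` is a list of (coefficient, knot, direction) tuples.
    """
    return intercept + sum(c * hinge(x, t, d) for c, t, d in terms)

def gcv(observed, predicted, effective_params):
    """Generalised cross-validation error: MSE / (1 - C/N)^2."""
    n = len(observed)
    if effective_params >= n:
        raise ValueError("effective_params must be smaller than the number of cases")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return mse / (1.0 - effective_params / n) ** 2

# Hypothetical one-variable model with a single knot at x = 30.
terms = [(0.5, 30.0, +1)]
below_knot = mars_like_predict(20.0, 1.0, terms)  # hinge inactive
above_knot = mars_like_predict(40.0, 1.0, terms)  # hinge active

# Hypothetical observed and predicted values, with C = 2 effective parameters.
gcv_value = gcv([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0], effective_params=2)
```

A model whose prediction stays on the flat side of every hinge returns only the intercept, which is one way a constant predicted failure rate can arise.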


2021 ◽  
Vol 11 (9) ◽  
Author(s):  
Małgorzata Kutyłowska ◽  
Dariusz Kowalski

The paper presents the possibilities of applying selected regression methods (classification trees, support vector machines, K-nearest neighbours, artificial neural networks) to the classification of sewer damage. Operational data from the time span 2006–2011, obtained from the water utility, were used for deterioration analysis. The modelling was based on the following independent variables: diameter, depth, year of construction, material and season in which the damage occurred. The following kinds of damage were classified: corrosion, crack, longitudinal crack, displacement, unsealing, failure, collapse. The main aim of the paper was to check whether the prediction methodology could be useful for classifying different kinds of sewer damage. The results indicate that the proposed classification methods are not appropriate for quality analysis of registered sewer damage. Moreover, water and sewerage companies are recommended to register types of failures using a unified notation, which would make preliminary classification easier before a modelling approach is applied. The calculations were performed in Statistica 13.1 software.


Author(s):  
Małgorzata Kutyłowska

The paper shows the results of failure rate prediction using the non-parametric regression algorithm K-nearest neighbours. The whole data set for the years 1999–2013 was divided randomly into two groups (learning: 75%, testing: 25%). In addition, data from 2014 were used to verify the model. The dependent variable (failure rate) was forecast on the basis of independent variables (number of installed house connections, total length and number of damages of water mains, distribution pipes and house connections). Four distance metrics were checked (Euclidean, quadratic Euclidean, Manhattan and Chebyshev) and four KNN models were created. Taking into consideration all constraints and assumptions, the models using the Euclidean and quadratic Euclidean distance metrics gave the best prediction results. The optimal number of nearest neighbours K equalled 2 and 3 for models KNN-E, KNN-E2, KNN-C and KNN-M, respectively. The validation error was smallest for models KNN-E and KNN-E2, at 0.0130; it was 0.0152 for model KNN-M and 0.0150 for KNN-C.
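
The four distance metrics and the random 75%/25% division can be sketched as below. This is an illustration only: the function names and the seed are hypothetical, and "quadratic Euclidean" is assumed here to mean the squared Euclidean distance.

```python
import random

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def squared_euclidean(a, b):
    # Assumed meaning of "quadratic Euclidean": Euclidean distance without the square root.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def split_rows(rows, learning_fraction=0.75, seed=0):
    """Shuffle a copy of the rows and cut it into learning and testing groups."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * learning_fraction)
    return shuffled[:cut], shuffled[cut:]

# The same pair of points measured with each metric.
a, b = (0.0, 0.0), (3.0, 4.0)
distances = {
    "euclidean": euclidean(a, b),
    "squared_euclidean": squared_euclidean(a, b),
    "manhattan": manhattan(a, b),
    "chebyshev": chebyshev(a, b),
}

# Hypothetical data set of 100 row indices.
learning, testing = split_rows(range(100))
```

Squared Euclidean distance ranks neighbours in the same order as Euclidean distance, so with unweighted averaging the two metrics select the same neighbours, which is consistent with models KNN-E and KNN-E2 reporting the same validation error.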


Author(s):  
Małgorzata Kutyłowska

The paper presents the modelling results of the failure rate of water mains, distribution pipes and house connections in one Polish city. The prediction of failure frequency was performed using artificial neural networks. A multilayer perceptron was chosen as the most suitable for modelling purposes. The neural network architecture contained 11 input signals (sale, production, consumption and losses of water, number of water meters, length and number of failures of water mains, distribution pipes and house connections). Three neurons (failure rates of the three conduit types) were placed in the output layer. One hidden layer, with between 1 and 22 hidden neurons, was used. Operating data from the years 2005–2011 were used for training the network. The optimal model was verified using operational data from 2012. Model MLP 11-10-3 was chosen as the best one for failure rate prediction. In this model the hidden and output neurons were activated by an exponential function and the learning was done using a quasi-Newton approach. During the learning process the correlation (R) and determination (R²) coefficients equalled 0.9921 and 0.9842 for water mains, 0.8685 and 0.7543 for distribution pipes, and 0.9945 and 0.9891 for house connections. The agreement between real and predicted values seems satisfactory from an engineering point of view.
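
The shape of an MLP 11-10-3 network with exponential activations can be sketched as a forward pass. This shows the architecture only: the weights are random placeholders, the input values are hypothetical, the helper names are invented, and the quasi-Newton training described in the abstract is not reproduced.

```python
import math
import random

def init_layer(n_in, n_out, rng, scale=0.1):
    """Create placeholder weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense_exp(inputs, weights, biases):
    """Fully connected layer with exponential activation: exp(w . x + b) per neuron."""
    return [
        math.exp(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

rng = random.Random(0)
hidden_w, hidden_b = init_layer(11, 10, rng)   # 11 inputs -> 10 hidden neurons
output_w, output_b = init_layer(10, 3, rng)    # 10 hidden neurons -> 3 outputs

inputs = [0.5] * 11                            # hypothetical normalised input signals
hidden = dense_exp(inputs, hidden_w, hidden_b)
outputs = dense_exp(hidden, output_w, output_b)  # one value per conduit type

# Trainable parameters: weights plus biases of both layers.
n_parameters = (11 * 10 + 10) + (10 * 3 + 3)
```

An exponential output activation keeps every predicted failure rate strictly positive, whatever the weights are.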


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 132095-132105 ◽  
Author(s):  
Amjad Ali ◽  
Muhammad Hamraz ◽  
Poom Kumam ◽  
Dost Muhammad Khan ◽  
Umair Khalil ◽  
...  

Author(s):  
Graham Goodfellow ◽  
Susannah Turner ◽  
Jane Haswell ◽  
Richard Espiner

The United Kingdom Onshore Pipeline Operators Association (UKOPA) was formed by UK pipeline operators to provide a common forum for representing operators' interests in the safe management of pipelines. This includes providing historical failure statistics for use in pipeline quantitative risk assessment, and UKOPA maintains a database to record these data. The UKOPA database holds data on product loss failures of UK major accident hazard pipelines from 1962 onwards and currently has a total length of 22,370 km of pipelines reporting. Overall exposure from 1952 to 2010 is over 785,000 km years of operating experience, with a total of 184 product loss incidents during this period. The low number of failures means that the historical failure rate for pipelines of some specific diameters, wall thicknesses and material grades is zero or statistically insignificant. It is unreasonable to assume that the failure rate for these pipelines is actually zero. However, unlike the European Gas pipeline Incident data Group (EGIG) database, which also includes the UK gas transmission pipeline data, the UKOPA database contains extensive data on measured part-wall damage that did not cause product loss. The data on damage to pipelines caused by external interference can be assessed to derive statistical distribution parameters describing the expected gouge length, gouge depth and dent depth resulting from an incident. Overall third-party interference incident rates for different class locations can also be determined. These distributions and incident rates can be used in structural reliability based techniques to predict the failure frequency due to third-party damage for a given set of pipeline parameters. The UKOPA recommended methodology for the assessment of pipeline failure frequency due to third-party damage is implemented in the FFREQ software. The distributions of third-party damage currently used in FFREQ date from the mid-1990s.
This paper describes the work involved in updating the analysis of the damage database and presents the updated distribution parameters. A comparison of predictions using the old and new distributions is also presented.
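
The exposure and incident figures quoted in the abstract give an overall historical failure rate by simple division, and a standard Poisson argument shows why a zero count on a small subset does not imply a zero rate. The sketch below is arithmetic only and is not the FFREQ structural reliability method: the function names are hypothetical, 785,000 km years is used as a round figure (the abstract says "over"), and the 1,000 km years subset exposure is invented for illustration.

```python
import math

def failure_rate(incidents, exposure_km_years):
    """Historical failure rate per km year."""
    return incidents / exposure_km_years

def zero_failure_upper_bound(exposure_km_years, confidence=0.95):
    """One-sided upper bound on a Poisson rate when no failures were observed.

    With rate r and exposure T, P(no failures) = exp(-r * T); requiring this
    probability to be at least 1 - confidence gives r <= -ln(1 - confidence) / T.
    """
    return -math.log(1.0 - confidence) / exposure_km_years

# Figures from the abstract: 184 product loss incidents in about 785,000 km years.
overall_rate = failure_rate(184, 785_000)
overall_per_1000_km_years = overall_rate * 1000

# Hypothetical pipeline subset with no recorded failures in 1,000 km years.
subset_upper_bound = zero_failure_upper_bound(1_000.0)
```

For the hypothetical subset the upper bound is roughly ten times the overall rate, so a zero historical count on limited exposure says little about the true failure frequency.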


2021 ◽  
Vol 2021 (2) ◽  
pp. 251-260
Author(s):  
Aleksey E. Tsaplin ◽  
Zh. O. Kuvondikov ◽  

Objective: To determine the most failure-prone rolling stock components and assemblies by processing statistical data obtained during operation using classical reliability theory; to develop recommendations for maintaining the operational state of individual rolling stock components. Methods: Methods for calculating quantitative reliability characteristics are applied to rolling stock operational statistical data. Results: Five years of operational data have been used to provide tabulated statistics on the failure rate of various rolling stock equipment. Reliability indicators have been calculated for various types of rolling stock equipment and the corresponding graphs have been plotted. Based on the calculations, recommendations for rolling stock maintenance have been developed. Practical importance: The calculations and recommendations identify the types of rolling stock equipment that require more attention during maintenance.


Author(s):  
Katarzyna Pietrucha-Urbanik ◽  
Barbara Tchórzewska-Cieślak

2017 ◽  
Vol 31 (2) ◽  
pp. 3-32 ◽  
Author(s):  
Susan Athey ◽  
Guido W. Imbens

In this paper, we discuss recent developments in econometrics that we view as important for empirical researchers working on policy evaluation questions. We focus on three main areas, in each case highlighting recommendations for applied work. First, we discuss new research on identification strategies in program evaluation, with particular focus on synthetic control methods, regression discontinuity, external validity, and the causal interpretation of regression methods. Second, we discuss various forms of supplementary analyses, including placebo analyses as well as sensitivity and robustness analyses, intended to make the identification strategies more credible. Third, we discuss some implications of recent advances in machine learning methods for causal effects, including methods to adjust for differences between treated and control units in high-dimensional settings, and methods for identifying and estimating heterogeneous treatment effects.


Author(s):  
Min Wang ◽  
Mahesh D. Pandey ◽  
Jovica R. Riznic

The estimation of piping failure frequency is an important task to support the probabilistic risk analysis and risk-informed in-service inspection of nuclear power plant systems. This paper describes a hierarchical or two-stage Poisson-gamma Bayesian procedure and applies it to estimate the failure frequency using the Organization for Economic Co-operation and Development/Nuclear Energy Agency pipe leakage data for United States nuclear plants. In the first stage, a generic distribution of failure rate is developed based on the failure observations from a group of similar plants. This distribution represents the interplant (plant-to-plant) variability arising from differences in construction, operation, and maintenance conditions. In the second stage, the generic prior obtained from the first stage is updated by using the data specific to a particular plant, and thus a posterior distribution of the plant-specific failure rate is derived. The two-stage Bayesian procedure is able to incorporate different levels of variability in a more consistent manner.
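
The second stage of such a procedure relies on the conjugacy of the gamma prior with the Poisson likelihood: a Gamma(alpha, beta) prior on the failure rate, updated with n failures over exposure T, gives a Gamma(alpha + n, beta + T) posterior. The sketch below shows that update only. The function names, the prior parameters and the plant data are hypothetical, and the first-stage estimation of the generic prior from the plant group is not reproduced.

```python
def update_gamma_prior(alpha, beta, failures, exposure):
    """Conjugate Poisson-gamma update; beta is a rate parameter in exposure units."""
    return alpha + failures, beta + exposure

def gamma_mean(alpha, beta):
    """Mean of a gamma distribution with shape alpha and rate beta."""
    return alpha / beta

# Hypothetical generic prior from the plant group: mean 0.002 failures per year.
prior_alpha, prior_beta = 2.0, 1000.0

# Hypothetical plant-specific record: 3 failures in 500 years of exposure.
post_alpha, post_beta = update_gamma_prior(prior_alpha, prior_beta, failures=3, exposure=500.0)

prior_mean = gamma_mean(prior_alpha, prior_beta)
posterior_mean = gamma_mean(post_alpha, post_beta)
```

The posterior mean lies between the generic prior mean and the plant's own observed rate (3/500), weighted by how much exposure each contributes.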

