Estimation of Missing Streamflow Data Using Anfis Models and Determination of the Number of Datasets for Anfis: The Case of Yeşilırmak River

Author(s):  
Kemal Saplioglu ◽  
Tulay Sugra Kucukerdem

Good data analysis is required for the optimal design of water resources projects. However, data are not regularly collected due to material or technical reasons, which results in incomplete-data problems. The available data and the record length are of great importance in solving those problems. Various studies have been conducted on missing data treatment. This study used data from the flow observation stations on Yeşilırmak River in Turkey. In the first part of the study, models were generated and compared in order to complete missing data using ANFIS, multiple regression, and the Normal Ratio Method. In the second part of the study, the minimum number of data required for ANFIS models was determined using the optimum ANFIS model. Of all methods compared in this study, ANFIS models yielded the most accurate results. A 10-year training set was also found to be sufficient.
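
As an illustration of the simplest of the compared approaches, the sketch below implements the commonly cited form of the normal ratio method, in which each neighbouring station's observation is scaled by the ratio of the target station's long-term mean to that neighbour's long-term mean, and the scaled values are averaged. The abstract does not give the exact formulation used in the study, so the function name, argument names, and sample numbers here are hypothetical.

```python
def normal_ratio_estimate(target_normal, neighbor_normals, neighbor_values):
    """Estimate a missing value at a target station (normal ratio method).

    target_normal    -- long-term mean at the target station
    neighbor_normals -- long-term means at the neighbouring stations
    neighbor_values  -- concurrent observations at the neighbouring stations
    """
    if not neighbor_values or len(neighbor_normals) != len(neighbor_values):
        raise ValueError("need one normal per neighbouring observation")
    # Scale each neighbour's observation by the ratio of long-term means.
    scaled = [
        (target_normal / normal) * value
        for normal, value in zip(neighbor_normals, neighbor_values)
    ]
    # Average the scaled observations over all neighbours.
    return sum(scaled) / len(scaled)


# Hypothetical flows: target mean 100, two neighbours with means 50 and 200.
estimate = normal_ratio_estimate(100.0, [50.0, 200.0], [10.0, 40.0])
print(estimate)
```

Unlike ANFIS or multiple regression, this estimator has no parameters to fit, which is why it is often used as a baseline.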

2013 ◽  
Vol 11 (7) ◽  
pp. 2779-2786
Author(s):  
Rahul Singhai

One relevant problem in data preprocessing is the presence of missing data, which leads to poor-quality patterns being extracted during mining. Imputation is one of the widely used procedures that replace the missing values in a data set with probable values. The advantage of this approach is that the missing data treatment is independent of the learning algorithm used, which allows the user to select the most suitable imputation method for each situation. This paper analyzes the various imputation methods proposed in the field of statistics with respect to data mining. A comparative analysis of three different imputation approaches that can be used to impute missing attribute values in data mining is given, showing the most promising method. An artificial input data file (of numeric type) with 1000 records is used to investigate the performance of these methods. A Z-test was used to test the significance of these methods.


2016 ◽  
Vol 13 (1) ◽  
pp. 83
Author(s):  
Siti Nur Zahrah Amin Burhanuddin ◽  
Sayang Mohd Deni ◽  
Norazan Mohamed Ramli

Good-quality rainfall data are essential in hydrological and meteorological analyses. Poor-quality rainfall data affect the analysis and subsequently produce misleading results. This study therefore proposes modified treatment methods for missing rainfall data that produce more accurate estimates. The old normal ratio method and the modified normal ratio method based on the trimmed mean are each combined with the geographical coordinate method. The performance of these modified methods was tested at various levels of missing data using 36 years of complete daily rainfall records from eighteen meteorological stations in Peninsular Malaysia. The results indicated that both modified methods improved the estimation of missing rainfall values at the target station, based on the lowest error measurements. The modified normal ratio method based on the trimmed mean with the geographical coordinate method was found to be the most appropriate for stations Batu Kurau and Sg. Bernam, while the modified old normal ratio method with the geographical coordinate method was the most accurate in estimating the missing data at station Genting Klang.


2020 ◽  
Vol 63 (6) ◽  
pp. 1947-1957
Author(s):  
Alexandra Hollo ◽  
Johanna L. Staubitz ◽  
Jason C. Chow

Purpose Although sampling teachers' child-directed speech in school settings is needed to understand the influence of linguistic input on child outcomes, empirical guidance for measurement procedures needed to obtain representative samples is lacking. To optimize resources needed to transcribe, code, and analyze classroom samples, this exploratory study assessed the minimum number and duration of samples needed for a reliable analysis of conventional and researcher-developed measures of teacher talk in elementary classrooms. Method This study applied fully crossed, Person (teacher) × Session (samples obtained on 3 separate occasions) generalizability studies to analyze an extant data set of three 10-min language samples provided by 28 general and special education teachers recorded during large-group instruction across the school year. Subsequently, a series of decision studies estimated the number and duration of sessions needed to obtain the criterion g coefficient (g > .70). Results The most stable variables were total number of words and mazes, requiring only a single 10-min sample, two 6-min samples, or three 3-min samples to reach criterion. No measured variables related to content or complexity were adequately stable regardless of number and duration of samples. Conclusions Generalizability studies confirmed that a large proportion of variance was attributable to individuals rather than the sampling occasion when analyzing the amount and fluency of spontaneous teacher talk. In general, conventionally reported outcomes were more stable than researcher-developed codes, which suggests some categories of teacher talk are more context dependent than others and thus require more intensive data collection to measure reliably.


Author(s):  
RUAA MUAYAD MAHMOOD ◽  
HAMSA MUNAM YASSEN ◽  
SAMAR AHMED DARWEESH ◽  
NAJWA ISSAC ABDULLA

A simple, rapid, and sensitive extractive spectrophotometric method is presented for the determination of glibenclamide (Glb), based on the formation of an ion-pair complex between Glb and the anionic dye methyl orange (MO) at pH 4. The yellow complex formed was quantitatively extracted into dichloromethane and measured at 426 nm. The colored product obeyed Beer’s law in the concentration range of 0.5–40 μg·mL⁻¹. The molar absorptivity obtained from the Beer’s law data was 31122 L·mol⁻¹·cm⁻¹, Sandell’s sensitivity was calculated to be 0.0159 μg·cm⁻², and the limits of detection (LOD) and quantification (LOQ) were 0.1086 and 0.3292 μg·mL⁻¹, respectively. The stoichiometry of the complex formed between Glb and MO was 1:1, as determined by Job’s method of continuous variation and the mole ratio method. The method was successfully applied to the analysis of a pharmaceutical formulation.
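
The reported figures can be cross-checked with a short calculation. The sketch below applies Beer's law (A = ε·l·c) and the usual definition of Sandell's sensitivity as molar mass divided by molar absorptivity. The molar mass of glibenclamide (494.0 g/mol) is an assumption taken from its molecular formula, not a value stated in the abstract, and the function names are hypothetical.

```python
def sandell_sensitivity(molar_mass_g_per_mol, molar_absorptivity):
    """Sandell's sensitivity in ug/cm^2 (analyte amount giving A = 0.001)."""
    return molar_mass_g_per_mol / molar_absorptivity


def absorbance(molar_absorptivity, conc_ug_per_ml, molar_mass_g_per_mol, path_cm=1.0):
    """Beer's law absorbance for a concentration given in ug/mL."""
    # ug/mL is numerically mg/L: divide by 1000 for g/L, then by molar mass for mol/L.
    conc_mol_per_l = conc_ug_per_ml / 1000.0 / molar_mass_g_per_mol
    return molar_absorptivity * path_cm * conc_mol_per_l


GLB_MOLAR_MASS = 494.0  # g/mol; assumption, not given in the abstract
EPSILON = 31122.0       # L/(mol*cm), as reported in the abstract

print(round(sandell_sensitivity(GLB_MOLAR_MASS, EPSILON), 4))  # 0.0159, as reported
print(absorbance(EPSILON, 10.0, GLB_MOLAR_MASS))  # predicted A for a 10 ug/mL sample
```

Under that molar-mass assumption, the computed Sandell's sensitivity agrees with the reported 0.0159 μg·cm⁻².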


2018 ◽  
Vol 154 (2) ◽  
pp. 149-155
Author(s):  
Michael Archer

1. Yearly records of worker Vespula germanica (Fabricius) taken in suction traps at Silwood Park (28 years) and at Rothamsted Research (39 years) are examined. 2. Using the autocorrelation function (ACF), a significant negative 1-year lag followed by a lesser non-significant positive 2-year lag was found in all, or parts of, each data set, indicating an underlying population dynamic of a 2-year cycle with a damped waveform. 3. The minimum number of years before the 2-year cycle with damped waveform was shown varied between 17 and 26, or was not found in some data sets. 4. Ecological factors delaying or preventing the occurrence of the 2-year cycle are considered.
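
A minimal sketch of the kind of check described in point 2: the sample autocorrelation at lags 1 and 2 of a yearly count series, compared against the approximate large-sample white-noise bound of ±1.96/√n. The abstract does not state which significance bound the author used, so that bound is an assumption, and the counts below are hypothetical values chosen only to show an alternating pattern.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom


def approx_significance_bound(n):
    """Approximate 95% bound for the ACF of white noise (assumed criterion)."""
    return 1.96 / math.sqrt(n)


# Hypothetical yearly counts alternating between high and low years.
counts = [120, 40, 110, 55, 100, 60, 95, 70, 90, 75]
bound = approx_significance_bound(len(counts))
for lag in (1, 2):
    r = autocorrelation(counts, lag)
    print(lag, round(r, 3), "significant" if abs(r) > bound else "not significant")
```

A negative lag-1 value followed by a smaller positive lag-2 value is the signature of a damped 2-year cycle that the abstract describes.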


2020 ◽  
pp. 73-75
Author(s):  
B.M. Bazrov ◽  
T.M. Gaynutdinov

The selection of technological bases is considered before the choice of the type of billet and the development of the route of the technological process. A technique is proposed for selecting the minimum number of sets of technological bases according to the criterion of equality in the cost price of manufacturing the part according to the principle of unity and combination of bases at this stage. Keywords: part, surface, coordinating size, accuracy, design and technological base, labor input, cost price.


Author(s):  
Nesma M Fahmy ◽  
Adel M Michael

Abstract Background Modern built-in spectrophotometer software supporting mathematical processes provided a solution for increasing selectivity for multicomponent mixtures. Objective Simultaneous spectrophotometric determination of the three naturally occurring antioxidants rutin (RUT), hesperidin (HES), and ascorbic acid (ASC) in bulk forms and combined pharmaceutical formulation. Method This was achieved by factorized zero order method (FZM), factorized derivative method (FD1M), and factorized derivative ratio method (FDRM), coupled with spectrum subtraction (SS). Results Mathematical filtration techniques allowed each component to be obtained separately in either its zero, first, or derivative ratio form, allowing the resolution of spectra typical to the pure components present in Vitamin C Forte® tablets. The proposed methods were applied over a concentration range of 2–50, 2–30, and 10–100 µg/mL for RUT, HES, and ASC, respectively. Conclusions Recent methods for the analysis of binary mixtures, FZM and FD1M, were successfully applied for the analysis of ternary mixtures and compared to the novel FDRM. All were revealed to be specific and sensitive with successful application on pharmaceutical formulations. Validation parameters were evaluated in accordance with the International Conference on Harmonization guidelines. Statistical results were satisfactory, revealing no significant difference regarding accuracy and precision. Highlights Factorized methods enabled the resolution of spectra identical to those of pure drugs present in mixtures. Overlapped spectra of ternary mixtures could be resolved by spectrum subtraction coupled FDRM (SS-FDRM) or by successive application of FZM and FD1M.


Author(s):  
Ahmad R. Alsaber ◽  
Jiazhu Pan ◽  
Adeeba Al-Hurban 

In environmental research, missing data are often a challenge for statistical modeling. This paper addressed some advanced techniques to deal with missing values in a data set measuring air quality using a multiple imputation (MI) approach. MCAR, MAR, and NMAR missing data techniques are applied to the data set. Five missing data levels are considered: 5%, 10%, 20%, 30%, and 40%. The imputation method used in this paper is an iterative imputation method, missForest, which is related to the random forest approach. Air quality data sets were gathered from five monitoring stations in Kuwait, aggregated to a daily basis. Logarithm transformation was carried out for all pollutant data, in order to normalize their distributions and to minimize skewness. We found high levels of missing values for NO2 (18.4%), CO (18.5%), PM10 (57.4%), SO2 (19.0%), and O3 (18.2%) data. Climatological data (i.e., air temperature, relative humidity, wind direction, and wind speed) were used as control variables for better estimation. The results show that the MAR technique had the lowest RMSE and MAE. We conclude that MI using the missForest approach has a high level of accuracy in estimating missing values. MissForest had the lowest imputation error (RMSE and MAE) among the other imputation methods and, thus, can be considered to be appropriate for analyzing air quality data.
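
The evaluation procedure described above (remove a known fraction of values, impute them, and score the imputations by RMSE and MAE) can be sketched as follows. This is only an illustration: values are removed completely at random (MCAR), and simple mean imputation stands in for missForest, which is not reimplemented here. All function names and the sample readings are hypothetical.

```python
import math
import random


def mask_mcar(values, fraction, seed=0):
    """Set a fraction of values to None completely at random (MCAR)."""
    rng = random.Random(seed)
    n_missing = round(len(values) * fraction)
    missing_idx = sorted(rng.sample(range(len(values)), n_missing))
    hidden = set(missing_idx)
    masked = [None if i in hidden else v for i, v in enumerate(values)]
    return masked, missing_idx


def mean_impute(masked):
    """Fill missing entries with the mean of the observed ones (stand-in method)."""
    observed = [v for v in masked if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in masked]


def rmse(truth, estimate):
    """Root mean squared error between paired values."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))


def mae(truth, estimate):
    """Mean absolute error between paired values."""
    return sum(abs(t - e) for t, e in zip(truth, estimate)) / len(truth)


# Hypothetical daily pollutant readings; hide 20% and score the imputation.
readings = [12.0, 15.5, 14.2, 18.9, 16.1, 13.3, 17.4, 19.8, 15.0, 14.7]
masked, idx = mask_mcar(readings, 0.20)
imputed = mean_impute(masked)
truth_at_gaps = [readings[i] for i in idx]
imputed_at_gaps = [imputed[i] for i in idx]
print(rmse(truth_at_gaps, imputed_at_gaps), mae(truth_at_gaps, imputed_at_gaps))
```

Errors are scored only at the artificially hidden positions, since those are the only places where the true value is known.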

