Estimation of Missing Streamflow Data Using ANFIS Models and Determination of the Number of Datasets for ANFIS: The Case of Yeşilırmak River

Comparative Study of Three Imputation Methods to Treat Missing Values

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽

10.24297/ijct.v11i7.3472 ◽

2013 ◽

Vol 11 (7) ◽

pp. 2779-2786

Author(s):

Rahul Singhai

Keyword(s):

Data Mining ◽

Missing Data ◽

Missing Values ◽

Learning Algorithm ◽

Poor Quality ◽

Imputation Method ◽

Data Set ◽

Imputation Methods ◽

Missing Data Treatment

One relevant problem in data preprocessing is the presence of missing data, which leads to poor-quality patterns being extracted after mining. Imputation is a widely used procedure that replaces the missing values in a data set with probable values. The advantage of this approach is that the missing data treatment is independent of the learning algorithm used, which allows the user to select the most suitable imputation method for each situation. This paper analyzes the various imputation methods proposed in the field of statistics with respect to data mining. A comparative analysis of three imputation approaches that can be used to impute missing attribute values in data mining is given, showing the most promising method. An artificial input data file (of numeric type) of 1000 records is used to investigate the performance of these methods. A Z-test approach was used to test the significance of these methods.
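The abstract does not name the three imputation methods it compares, so the following is only a minimal sketch of the general workflow it describes: impute a numeric data set of 1000 records, then compare the result with the complete data using a Z statistic. Mean imputation, the 10% missing rate, the Gaussian test data, and the function names `mean_impute` and `two_sample_z` are all assumptions made for illustration. The Z statistic also treats the two samples as independent, which they are not strictly, so it should be read as a rough check only.

```python
import math
import random
import statistics

def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]

def two_sample_z(a, b):
    """Z statistic for the difference between the means of two large samples."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se

random.seed(0)
# Artificial numeric data set of 1000 records (distribution is an assumption).
complete = [random.gauss(50.0, 10.0) for _ in range(1000)]
# Hide roughly 10% of the values completely at random (hypothetical rate).
with_gaps = [None if random.random() < 0.10 else v for v in complete]
imputed = mean_impute(with_gaps)

z = two_sample_z(complete, imputed)
print(f"z = {z:.3f}")  # |z| below 1.96 means no significant shift in the mean at the 5% level
```

Mean imputation leaves the mean of the observed values unchanged but shrinks the variance, which is one reason a comparison across several methods is useful.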


Revised Normal Ratio Methods for Imputation of Missing Rainfall Data

Scientific Research Journal ◽

10.24191/srj.v13i1.9384 ◽

2016 ◽

Vol 13 (1) ◽

pp. 83

Author(s):

Siti Nur Zahrah Amin Burhanuddin ◽

Sayang Mohd Deni ◽

Norazan Mohamed Ramli

Keyword(s):

Missing Data ◽

Daily Rainfall ◽

Peninsular Malaysia ◽

Rainfall Data ◽

Ratio Method ◽

Accurate Estimation ◽

Trimmed Mean ◽

Coordinate Method ◽

Target Station ◽

Normal Ratio

Good-quality rainfall data are essential in hydrological and meteorological analyses. Poor-quality rainfall data affect the analysis and subsequently produce misleading results. This study therefore proposes modified treatment methods for missing rainfall data that produce more accurate estimates. The old normal ratio method and the modified normal ratio method based on the trimmed mean are each combined with the geographical coordinate method. The performance of these modified methods was tested at various levels of missing data on 36 years of complete daily rainfall records from eighteen meteorology stations in Peninsular Malaysia. The results indicated that both modified methods improved the estimation of missing rainfall values at the target station, based on the least error measurements. The modified normal ratio based on the trimmed mean with the geographical coordinate method was found to be the most appropriate method for stations Batu Kurau and Sg. Bernam, while the modified old normal ratio with the geographical coordinate method was the most accurate in estimating the missing data at station Genting Klang.
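For orientation, the classical (unmodified) normal ratio method can be sketched as below. This is the common textbook form, in which each neighbouring station's reading is scaled by the ratio of the target station's long-term normal rainfall to the neighbour's, and the scaled readings are averaged. It is not the trimmed-mean or geographical-coordinate variant proposed in the paper, and the function name and station figures are hypothetical.

```python
def normal_ratio_estimate(target_normal, neighbour_normals, neighbour_values):
    """Estimate a missing rainfall value at the target station.

    target_normal     -- long-term normal rainfall at the target station
    neighbour_normals -- long-term normal rainfall at each neighbouring station
    neighbour_values  -- rainfall observed at each neighbour on the missing day
    """
    if not neighbour_normals or len(neighbour_normals) != len(neighbour_values):
        raise ValueError("need one normal and one observation per neighbour")
    # Scale each neighbour's reading by the ratio of the normals, then average.
    scaled = [
        target_normal / normal * value
        for normal, value in zip(neighbour_normals, neighbour_values)
    ]
    return sum(scaled) / len(scaled)

# Hypothetical figures: annual normals in mm, daily readings in mm.
estimate = normal_ratio_estimate(2400.0, [2000.0, 3000.0, 2400.0], [10.0, 15.0, 12.0])
print(f"estimated rainfall: {estimate:.1f} mm")
```

In this made-up case every neighbour implies the same scaled value, so the estimate equals it; with real data the scaled values differ and the average smooths them.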


Revised Normal Ratio Methods for Imputation of Missing Rainfall Data

Scientific Research Journal ◽

10.24191/srj.v13i1.5444 ◽

2016 ◽

Vol 13 (1) ◽

pp. 83 ◽

Cited By ~ 1

Author(s):

Siti Nur Zahrah Amin Burhanuddin ◽

Sayang Mohd Deni ◽

Norazan Mohamed Ramli

Keyword(s):

Missing Data ◽

Daily Rainfall ◽

Peninsular Malaysia ◽

Rainfall Data ◽

Ratio Method ◽

Accurate Estimation ◽

Trimmed Mean ◽

Coordinate Method ◽

Target Station ◽

Normal Ratio

Good-quality rainfall data are essential in hydrological and meteorological analyses. Poor-quality rainfall data affect the analysis and subsequently produce misleading results. This study therefore proposes modified treatment methods for missing rainfall data that produce more accurate estimates. The old normal ratio method and the modified normal ratio method based on the trimmed mean are each combined with the geographical coordinate method. The performance of these modified methods was tested at various levels of missing data on 36 years of complete daily rainfall records from eighteen meteorology stations in Peninsular Malaysia. The results indicated that both modified methods improved the estimation of missing rainfall values at the target station, based on the least error measurements. The modified normal ratio based on the trimmed mean with the geographical coordinate method was found to be the most appropriate method for stations Batu Kurau and Sg. Bernam, while the modified old normal ratio with the geographical coordinate method was the most accurate in estimating the missing data at station Genting Klang.


Applying Generalizability Theory to Optimize Analysis of Spontaneous Teacher Talk in Elementary Classrooms

Journal of Speech Language and Hearing Research ◽

10.1044/2020_jslhr-19-00118 ◽

2020 ◽

Vol 63 (6) ◽

pp. 1947-1957

Author(s):

Alexandra Hollo ◽

Johanna L. Staubitz ◽

Jason C. Chow

Keyword(s):

Special Education Teachers ◽

Generalizability Theory ◽

Child Outcomes ◽

Elementary Classrooms ◽

Teacher Talk ◽

Group Instruction ◽

Data Set ◽

School Year ◽

Minimum Number ◽

Representative Samples

Purpose: Although sampling teachers' child-directed speech in school settings is needed to understand the influence of linguistic input on child outcomes, empirical guidance for measurement procedures needed to obtain representative samples is lacking. To optimize resources needed to transcribe, code, and analyze classroom samples, this exploratory study assessed the minimum number and duration of samples needed for a reliable analysis of conventional and researcher-developed measures of teacher talk in elementary classrooms. Method: This study applied fully crossed, Person (teacher) × Session (samples obtained on 3 separate occasions) generalizability studies to analyze an extant data set of three 10-min language samples provided by 28 general and special education teachers recorded during large-group instruction across the school year. Subsequently, a series of decision studies estimated the number and duration of sessions needed to obtain the criterion g coefficient (g > .70). Results: The most stable variables were total number of words and mazes, requiring only a single 10-min sample, two 6-min samples, or three 3-min samples to reach criterion. No measured variables related to content or complexity were adequately stable regardless of number and duration of samples. Conclusions: Generalizability studies confirmed that a large proportion of variance was attributable to individuals rather than the sampling occasion when analyzing the amount and fluency of spontaneous teacher talk. In general, conventionally reported outcomes were more stable than researcher-developed codes, which suggests some categories of teacher talk are more context dependent than others and thus require more intensive data collection to measure reliably.
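The decision-study step can be illustrated with a minimal sketch. It assumes the standard relative g coefficient for a fully crossed Person × Session design, g = σ²(person) / (σ²(person) + σ²(residual) / n), where n is the number of sessions averaged over. The variance components used below are invented for illustration and are not taken from the study, and the function names are hypothetical.

```python
def g_coefficient(var_person, var_residual, n_sessions):
    """Relative g coefficient for a crossed Person x Session design."""
    return var_person / (var_person + var_residual / n_sessions)

def min_sessions(var_person, var_residual, criterion=0.70, max_sessions=50):
    """Smallest number of sessions whose g coefficient exceeds the criterion.

    Returns None if the criterion is not reached within max_sessions.
    """
    for n in range(1, max_sessions + 1):
        if g_coefficient(var_person, var_residual, n) > criterion:
            return n
    return None

# Invented variance components: 40% of single-session variance lies between teachers.
for n in range(1, 5):
    print(n, round(g_coefficient(4.0, 6.0, n), 3))
print("sessions needed:", min_sessions(4.0, 6.0))
```

A variable whose person variance is near zero never reaches the criterion however many sessions are added, which is consistent with the abstract's finding that some measures were not adequately stable under any sampling plan.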


Extractive Spectrophotometric Method for the Determination of Glibenclamide by Ion-Pair Complex in Pure Form and Pharmaceutical Formulation

International Journal of Drug Delivery Technology ◽

10.25258/ijddt.v9i3.20 ◽

2019 ◽

Vol 9 (03) ◽

Author(s):

RUAA MUAYAD MAHMOOD ◽

HAMSA MUNAM YASSEN ◽

SAMAR AHMED DARWEESH ◽

NAJWA ISSAC ABDULLA

Keyword(s):

Spectrophotometric Method ◽

Molar Absorptivity ◽

Ion Pair ◽

Pharmaceutical Formulation ◽

Ratio Method ◽

Colored Complex ◽

Colored Product ◽

Method Of Continuous Variation ◽

Job's Method

A simple, rapid, and sensitive extractive spectrophotometric method is presented for the determination of glibenclamide (Glb), based on the formation of an ion-pair complex between Glb and the anionic dye methyl orange (MO) at pH 4. The yellow complex formed was quantitatively extracted into dichloromethane and measured at 426 nm. The colored product obeyed Beer’s law in the concentration range 0.5–40 μg mL⁻¹. The molar absorptivity obtained from the Beer’s law data was 31122 L mol⁻¹ cm⁻¹, Sandell’s sensitivity was calculated to be 0.0159 μg cm⁻², and the limits of detection (LOD) and quantification (LOQ) were 0.1086 and 0.3292 μg mL⁻¹, respectively. The stoichiometry of the complex formed between Glb and MO was 1:1, as determined by Job’s method of continuous variation and the mole ratio method. The method was successfully applied to the analysis of a pharmaceutical formulation.
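The reported figures can be cross-checked with a short sketch. It assumes Beer's law, A = ε·c·l, and the usual definition of Sandell's sensitivity as the concentration in μg mL⁻¹ that gives an absorbance of 0.001 in a 1 cm cell, which reduces to molar mass divided by ε. The molar mass of glibenclamide (about 494.0 g/mol) is not stated in the abstract and is supplied here as an assumption, as are the function names.

```python
GLB_MOLAR_MASS = 494.0  # g/mol for glibenclamide; assumed, not given in the abstract
EPSILON = 31122.0       # L mol^-1 cm^-1, molar absorptivity reported in the abstract

def absorbance(conc_ug_per_ml, path_cm=1.0):
    """Beer's law absorbance for a concentration given in ug/mL."""
    # ug/mL equals mg/L; convert to g/L, then to mol/L.
    molar = conc_ug_per_ml * 1e-3 / GLB_MOLAR_MASS
    return EPSILON * molar * path_cm

def sandell_sensitivity():
    """Concentration (ug/mL) giving A = 0.001 at 1 cm, numerically in ug/cm^2."""
    return GLB_MOLAR_MASS / EPSILON

s = sandell_sensitivity()
print(f"Sandell's sensitivity: {s:.4f} ug/cm^2")
print(f"absorbance at that concentration: {absorbance(s):.4f}")
```

Under these assumptions the computed sensitivity rounds to the 0.0159 μg cm⁻² quoted in the abstract, so the two reported values are mutually consistent.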


The social wasp Vespula germanica (Fabricius) (Hymenoptera: Vespidae) population dynamics in England over 39 years.

The Entomologist's Monthly Magazine ◽

10.31184/m00138908.1542.3906 ◽

2018 ◽

Vol 154 (2) ◽

pp. 149-155

Author(s):

Michael Archer

Keyword(s):

Population Dynamics ◽

Population Dynamic ◽

Ecological Factors ◽

Social Wasp ◽

Data Sets ◽

Data Set ◽

Vespula Germanica ◽

The Social ◽

Minimum Number ◽

Suction Traps

1. Yearly records of worker Vespula germanica (Fabricius) taken in suction traps at Silwood Park (28 years) and at Rothamsted Research (39 years) are examined. 2. Using the autocorrelation function (ACF), a significant negative 1-year lag followed by a lesser non-significant positive 2-year lag was found in all, or parts of, each data set, indicating an underlying population dynamic of a 2-year cycle with a damped waveform. 3. The minimum number of years before the 2-year cycle with damped waveform was shown varied between 17 and 26, or was not found in some data sets. 4. Ecological factors delaying or preventing the occurrence of the 2-year cycle are considered.
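The autocorrelation pattern described in point 2 can be reproduced on synthetic data. The sketch below uses the standard sample autocorrelation estimator and an invented, damped alternating series of 39 yearly values; neither the numbers nor the function name come from the paper.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom

# Invented yearly counts: high and low years alternate, with the swing decaying.
counts = [100.0 + 50.0 * (-0.9) ** year for year in range(39)]

r1 = autocorrelation(counts, 1)
r2 = autocorrelation(counts, 2)
print(f"lag 1: {r1:.2f}, lag 2: {r2:.2f}")
```

A negative value at lag 1 followed by a smaller positive value at lag 2 is the signature of a 2-year cycle with a damped waveform. Testing significance, as the paper does, would additionally require confidence bounds on the ACF.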


Determination of the minimum number of technological bases of a part

10.36652/0042-4633-2020-9-73-75 ◽

2020 ◽

pp. 73-75

Author(s):

B.M. Bazrov ◽

T.M. Gaynutdinov

Keyword(s):

Cost Price ◽

Labor Input ◽

Part Surface ◽

Technological Base ◽

Size Accuracy ◽

Minimum Number ◽

The Cost ◽

Input Cost ◽

Selection Of

The selection of technological bases is considered before the choice of the type of billet and the development of the route of the technological process. A technique is proposed for selecting the minimum number of sets of technological bases according to the criterion of equality in the cost price of manufacturing the part according to the principle of unity and combination of bases at this stage. Keywords: part, surface, coordinating size, accuracy, design and technological base, labor input, cost price.


Comparative Study for the Assay of Plant Derived Chemicals in Pharmaceutical Formulation Using Methods Dependent on Factorized Spectra

Journal of AOAC International ◽

10.1093/jaoacint/qsab027 ◽

2021 ◽

Author(s):

Nesma M Fahmy ◽

Adel M Michael

Keyword(s):

Pharmaceutical Formulations ◽

Pharmaceutical Formulation ◽

Ternary Mixtures ◽

Ratio Method ◽

The Novel ◽

Naturally Occurring ◽

Accuracy And Precision ◽

Validation Parameters ◽

Significant Difference

Background: Modern built-in spectrophotometer software supporting mathematical processes provided a solution for increasing selectivity for multicomponent mixtures. Objective: Simultaneous spectrophotometric determination of the three naturally occurring antioxidants rutin (RUT), hesperidin (HES), and ascorbic acid (ASC) in bulk forms and combined pharmaceutical formulation. Method: This was achieved by factorized zero order method (FZM), factorized derivative method (FD1M), and factorized derivative ratio method (FDRM), coupled with spectrum subtraction (SS). Results: Mathematical filtration techniques allowed each component to be obtained separately in either its zero, first, or derivative ratio form, allowing the resolution of spectra typical to the pure components present in Vitamin C Forte® tablets. The proposed methods were applied over a concentration range of 2–50, 2–30, and 10–100 µg/mL for RUT, HES, and ASC, respectively. Conclusions: Recent methods for the analysis of binary mixtures, FZM and FD1M, were successfully applied for the analysis of ternary mixtures and compared to the novel FDRM. All were revealed to be specific and sensitive with successful application on pharmaceutical formulations. Validation parameters were evaluated in accordance with the International Conference on Harmonization guidelines. Statistical results were satisfactory, revealing no significant difference regarding accuracy and precision. Highlights: Factorized methods enabled the resolution of spectra identical to those of pure drugs present in mixtures. Overlapped spectra of ternary mixtures could be resolved by spectrum subtraction coupled FDRM (SS-FDRM) or by successive application of FZM and FD1M.


Handling Complex Missing Data Using Random Forest Approach for an Air Quality Monitoring Dataset: A Case Study of Kuwait Environmental Data (2012 to 2018)

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18031333 ◽

2021 ◽

Vol 18 (3) ◽

pp. 1333

Author(s):

Ahmad R. Alsaber ◽

Jiazhu Pan ◽

Adeeba Al-Hurban

Keyword(s):

Air Quality ◽

Missing Data ◽

Random Forest ◽

Missing Values ◽

Imputation Method ◽

Environmental Data ◽

Environmental Research ◽

Quality Data ◽

Data Set ◽

Air Quality Data

In environmental research, missing data are often a challenge for statistical modeling. This paper addressed some advanced techniques to deal with missing values in a data set measuring air quality using a multiple imputation (MI) approach. MCAR, MAR, and NMAR missing data techniques are applied to the data set. Five missing data levels are considered: 5%, 10%, 20%, 30%, and 40%. The imputation method used in this paper is an iterative imputation method, missForest, which is related to the random forest approach. Air quality data sets were gathered from five monitoring stations in Kuwait, aggregated to a daily basis. Logarithm transformation was carried out for all pollutant data, in order to normalize their distributions and to minimize skewness. We found high levels of missing values for NO2 (18.4%), CO (18.5%), PM10 (57.4%), SO2 (19.0%), and O3 (18.2%) data. Climatological data (i.e., air temperature, relative humidity, wind direction, and wind speed) were used as control variables for better estimation. The results show that the MAR technique had the lowest RMSE and MAE. We conclude that MI using the missForest approach has a high level of accuracy in estimating missing values. MissForest had the lowest imputation error (RMSE and MAE) among the other imputation methods and, thus, can be considered to be appropriate for analyzing air quality data.
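The evaluation loop described above (hide a known fraction of values, impute them, then score the imputed values against the hidden truth with RMSE and MAE) can be sketched as follows. Simple mean imputation stands in for missForest here purely to keep the example self-contained, the values are hidden completely at random, and the log-scale pollutant data are synthetic. The function names are hypothetical.

```python
import math
import random

def rmse(truth, estimate):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))

def mae(truth, estimate):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - e) for t, e in zip(truth, estimate)) / len(truth)

def score_mean_imputation(values, missing_rate, rng):
    """Hide values at random, impute with the observed mean, return (RMSE, MAE)."""
    hidden = [i for i in range(len(values)) if rng.random() < missing_rate]
    hidden_set = set(hidden)
    observed = [v for i, v in enumerate(values) if i not in hidden_set]
    if not hidden or not observed:
        raise ValueError("need at least one hidden and one observed value")
    fill = sum(observed) / len(observed)
    truth = [values[i] for i in hidden]
    return rmse(truth, [fill] * len(hidden)), mae(truth, [fill] * len(hidden))

rng = random.Random(1)
log_pollutant = [rng.gauss(3.0, 0.5) for _ in range(500)]  # synthetic, log scale
for rate in (0.05, 0.10, 0.20, 0.30, 0.40):  # the five levels named in the abstract
    r, m = score_mean_imputation(log_pollutant, rate, rng)
    print(f"{rate:.0%} missing: RMSE={r:.3f} MAE={m:.3f}")
```

Swapping in another imputer only requires replacing the `fill` step, so the same scoring can be used to compare methods at each missing-data level.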
