The Importance of Complete Data Sets for Job Scheduling Simulations

Abstract: Missing values are not uncommon in real data sets. Algorithms and methods designed for the analysis of complete data sets cannot always be applied to data with missing values. To use existing complete-data methods, data sets with missing values are first preprocessed. The alternative is to create new algorithms dedicated to data sets with missing values. The objective of our research is to compare preprocessing techniques with specialised algorithms and to identify where each is most advantageous.
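
The preprocessing route mentioned in the abstract can be illustrated with a minimal sketch. It contrasts two common generic techniques, listwise deletion and column-mean imputation. These are not necessarily the techniques the authors compared, and the function names and toy data are assumptions made for illustration only.

```python
from statistics import mean


def listwise_delete(rows):
    """Keep only the rows that contain no missing (None) entries."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # A column with no observed values cannot be imputed; leave it as None.
        col_means.append(mean(observed) if observed else None)
    return [
        [col_means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy data: two features, None marks a missing value.
data = [[1.0, 2.0], [None, 4.0], [3.0, None]]
complete_rows = listwise_delete(data)
imputed_rows = mean_impute(data)
```

Either result can then be passed to a method that expects complete data. Deletion shrinks the sample, while imputation keeps every row at the cost of introducing estimated values.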

The Genome Sequencer FLX™ System—Longer reads, more applications, straight forward bioinformatics and more complete data sets

Journal of Biotechnology ◽

10.1016/j.jbiotec.2008.03.021 ◽

2008 ◽

Vol 136 (1-2) ◽

pp. 3-10 ◽

Cited By ~ 176

Author(s):

Marcus Droege ◽

Brendon Hill

Keyword(s):

Complete Data ◽

Data Sets

MODIFIED POSSIBILISTIC FUZZY C-MEANS ALGORITHM FOR CLUSTERING INCOMPLETE DATA SETS

Acta Polytechnica ◽

10.14311/ap.2021.61.0364 ◽

2021 ◽

Vol 61 (2) ◽

pp. 364-377

Author(s):

. Rustam ◽

Koredianto Usman ◽

Mudyawati Kamaruddin ◽

Dina Chamidah ◽

. Nopendri ◽

...

Keyword(s):

Experimental Data ◽

Incomplete Data ◽

Missing Values ◽

Complete Data ◽

Noise Sensitivity ◽

Data Sets ◽

Fuzzy C Means ◽

Number Of Iterations ◽

Fuzzy C Means Algorithm

The possibilistic fuzzy c-means (PFCM) algorithm was proposed to address the noise sensitivity and coincident-cluster weaknesses of fuzzy c-means (FCM) and possibilistic c-means (PCM). However, the PFCM algorithm is only applicable to complete data sets. Therefore, this research modified the PFCM for clustering incomplete data sets, yielding two variants, OCSPFCM and NPSPFCM. Their performance was evaluated on three aspects: 1) accuracy percentage, 2) the number of iterations, and 3) centroid errors. The results showed that the NPSPFCM outperforms the OCSPFCM with missing values ranging from 5% to 30% for all experimental data sets. Furthermore, the two algorithms provide average accuracies ranging from 97.75% to 78.98% and from 98.86% to 92.49%, respectively.

Generating Multiple Imputations for Matrix Sampling Data Analyzed With Item Response Models

Journal of Educational and Behavioral Statistics ◽

10.3102/10769986022004425 ◽

1997 ◽

Vol 22 (4) ◽

pp. 425-445 ◽

Cited By ~ 12

Author(s):

Neal Thomas ◽

Nianci Gan

Keyword(s):

Missing Data ◽

Item Response ◽

The United States ◽

Complete Data ◽

Sample Survey ◽

Standard Errors ◽

Data Sets ◽

Response Models ◽

Item Response Models ◽

Survey Designs

Sample survey designs in which each participant is administered a subset of the items contained in a complete survey instrument are becoming an increasingly popular method of reducing respondent burden (Mislevy, Beaton, Kaplan, & Sheehan, 1992; Raghunathan & Grizzle, 1995; Wacholder, Carroll, Pee, & Gail, 1994). Data from these survey designs can be analyzed using multiple imputation methodology that generates several imputed values for the missing data and thus yields several complete data sets. These data sets are then analyzed using complete data estimators and their standard errors (Rubin, 1987b). Generating the imputed data sets, however, can be very difficult. We describe improvements to the methods currently used to generate the imputed data sets for item response models summarizing educational data collected by the National Assessment of Educational Progress (NAEP), an ongoing collection of samples of 4th, 8th, and 12th grade students in the United States. The improved approximations produce small to moderate changes in commonly reported estimates, with the larger changes associated with an increasing amount of missing data. The improved approximations produce larger standard errors.
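
The step of analyzing several completed data sets and combining the results is commonly done with Rubin's combining rules. The sketch below shows that pooling step only: the point estimate is the average over imputations, and the total variance adds the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m). The function name and the numbers are hypothetical; this is not the NAEP procedure or the authors' imputation method.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m complete-data estimates and their squared standard errors.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, sqrt(total)


# Hypothetical results from m = 3 imputed data sets.
pooled_estimate, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the uncertainty due to missing data, which is consistent with the abstract's remark that better approximations can lead to larger standard errors.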

Poroelastic measurement schemes resulting in complete data sets for granular and other anisotropic porous media

International Journal of Engineering Science ◽

10.1016/j.ijengsci.2009.11.005 ◽

2010 ◽

Vol 48 (4) ◽

pp. 446-459 ◽

Cited By ~ 14

Author(s):

James G. Berryman

Keyword(s):

Porous Media ◽

Complete Data ◽

Data Sets ◽

Anisotropic Porous Media

Purification, crystallization and preliminary crystallographic analysis of the adhesion domain of Epf from Streptococcus pyogenes

Acta Crystallographica Section F Structural Biology and Crystallization Communications ◽

10.1107/s1744309112020313 ◽

2012 ◽

Vol 68 (7) ◽

pp. 793-797 ◽

Cited By ~ 1

Author(s):

Christian Linke ◽

Nikolai Siemens ◽

Martin J. Middleditch ◽

Bernd Kreikemeyer ◽

Edward N. Baker

Keyword(s):

Mass Spectrometry ◽

Escherichia Coli ◽

Epithelial Cells ◽

Streptococcus Pyogenes ◽

Crystallographic Analysis ◽

Complete Data ◽

Extracellular Protein ◽

Data Sets ◽

Space Groups ◽

Sequence Identity

The extracellular protein Epf from Streptococcus pyogenes is important for streptococcal adhesion to human epithelial cells. However, Epf has no sequence identity to any protein of known structure or function. Thus, several predicted domains of the 205 kDa protein Epf were cloned separately and expressed in Escherichia coli. The N-terminal domain of Epf was crystallized in space groups P2₁ and P2₁2₁2₁ in the presence of the protease chymotrypsin. Mass spectrometry showed that the species crystallized corresponded to a fragment comprising residues 52–357 of Epf. Complete data sets were collected to 2.0 and 1.6 Å resolution, respectively, at the Australian Synchrotron.

Adoption, allonursing and allosucking in farmed red deer (Cervus elaphus)

Animal Science ◽

10.1017/s1357729800052000 ◽

2001 ◽

Vol 72 (3) ◽

pp. 483-492 ◽

Cited By ~ 15

Author(s):

L. Bartoš ◽

D. Vaňkovà ◽

J. Šiler ◽

G. Illmann

Keyword(s):

Czech Republic ◽

Pilot Study ◽

Cervus Elaphus ◽

Red Deer ◽

Standing Position ◽

Complete Data ◽

Reliable Indicator ◽

Data Sets ◽

Genital Region ◽

Anogenital Region

Following a pilot study, the aim of this study was to test the hypothesis that massaging of a calf's anogenital region by a non-maternal hind is a reliable indicator of adoption. The investigation was conducted between 28 May (1st day of calving) and 2 September (abrupt weaning of all calves) on a red deer farm at Vimperk, Czech Republic. Fifty hinds and their calves were observed, but only complete data sets of sucking bouts were considered for evaluation. Massaging occurred mostly during the 1st month of the calf's life. All filial calves were massaged repeatedly. Other calves received ano-genital massage either at least twice (termed adopted) or on a single occasion or not at all (termed non-filial). Filial and adopted calves behaved in a similar way but differently from non-filial calves. They sucked in an antiparallel standing position, in which the hind could lick their ano-genital region, more often than non-filial calves did. This occurred even when two calves were involved in the bout. When two calves were involved in the sucking bout, non-filial calves sucked from behind, between the hind's hind legs. This position occurred more frequently among non-filial calves than among filial and adopted calves. It was therefore concluded that repeated allonursing accompanied by massaging of the ano-genital region of the sucking calf by the hind can be considered a signal of adoption. Hinds usually adopted calves older than their own progeny. The adopted calves were on average 2.5 days old. This suggests that it is most likely the calf's activity which leads to bonding. No reciprocity was found in allosucking and/or allonursing. The fact that non-filial calves commonly initiated allosucking from a non-maternal hind during the day when she gave birth appeared crucial for establishing the bonding which subsequently led to adoption. Hinds may be bonded with several calves, including their own. Therefore, bonding with a non-filial calf did not principally mean failure in looking after their own progeny, as shown in other studies.

Evaluation of the Grenada Sports for Health Program

Journal of Clinical Review & Case Reports ◽

10.33140/jcrc/03/05/00004 ◽

2018 ◽

Vol 3 (5) ◽

Keyword(s):

Data Collection ◽

Physical Health ◽

Health Program ◽

Health Indicators ◽

Complete Data ◽

Data Sets ◽

Evaluation Period ◽

Waist Hip Ratio ◽

Significant Difference ◽

June July

Objective: The study measured basic health outcomes to help guide the continued implementation of the community exercise component of the Grenada Sports for Health program. Design & Methods: The study population consisted of Grenadian citizens enrolled in three different community exercise programs as part of the Royal Grenada Police Force, Point Saline and La Sagesse, Grenville, Gouyave and Tanteen community exercise program. Initial data collection for this prospective cohort study began in March 2011, and data collection continued through quarterly assessments to June/July 2014 and June/July 2016. The health indicators for the Sports for Health program were designed to monitor and analyse program participants' physical health indicators, such as Body Mass Index (BMI) and waist-to-hip ratio, over time to determine whether their participation in the community training program was promoting health benefits by reducing risk factors for non-communicable chronic diseases. Results: During the baseline evaluation period in March 2011, complete data sets were obtained for 427 participants. During the evaluation period of March 2014, 337 complete data sets were collected from participants from 2011, and during the June/July 2016 evaluation, 264 complete data sets were obtained. The BMI, waist, hip, and waist-to-hip ratio measures are presented in Table 1. A Student's t-test (α=0.05) on BMI and waist-to-hip ratio demonstrated a significant difference between the 2011 and 2016 measures. Conclusion: Participants have demonstrated a significant and positive difference in physical health indicators over three years of participation in the Sports for Health program.

Validation of the Odd Lindley Exponentiated Exponential by a Modified Goodness of Fit Test with Applications to Censored and Complete Data

Pakistan Journal of Statistics and Operation Research ◽

10.18187/pjsor.v15i3.2675 ◽

2019 ◽

pp. 745-771 ◽

Cited By ~ 7

Author(s):

Hafida Goual ◽

Haitham M. Yousof ◽

Mir Masoom Ali

Keyword(s):

Goodness Of Fit ◽

Maximum Likelihood Estimators ◽

Real Data ◽

Complete Data ◽

Data Sets ◽

Grouped Data ◽

Goodness Of Fit Test ◽

Right Censored Data ◽

Mathematical Properties ◽

Chi Squared

In this paper, we first introduce a new extension of the exponentiated exponential distribution along with several of its mathematical properties. Second, we construct a modified Chi-squared goodness-of-fit test based on the Nikulin-Rao-Robson statistic in the presence of censored and complete data. We describe the theory and the mechanism of the $Y_n^2$ statistic test, which can be used in survival and reliability data analysis. We use the maximum likelihood estimators based on the initial non-grouped data sets. Then, we conduct numerical simulations to reinforce the results. To show the applicability of our model in various fields, we illustrate it and the proposed test by applications to two real data sets for the complete data case and two other right-censored data sets.
