Random property allocation: A novel geographic imputation procedure based on a complete geocoded address file

2013 ◽  
Vol 6 ◽  
pp. 7-16 ◽  
Author(s):  
Scott R. Walter ◽  
Nectarios Rose


2020 ◽  
Vol 98 (12) ◽  
Author(s):  
Héctor Marina ◽  
Antonio Reverter ◽  
Beatriz Gutiérrez-Gil ◽  
Pamela Almeida Alexandre ◽  
Rocío Pelayo ◽  
...  

Abstract Sheep milk is mainly used to manufacture a wide variety of high-quality cheeses. The ovine cheese industry would benefit from improvement, through genetic selection, of traits related to milk coagulation properties (MCPs) and cheese yield, broadly denoted as “cheese-making traits.” Because the routine measurements of these traits needed for genetic selection are expensive and time-consuming, this study aimed to evaluate the accuracy of a cheese-making phenotype imputation method based on information from official milk control records combined with the pH of the milk. We analyzed records of milk production traits, milk composition traits, and measurements of cheese-making traits from a total of 1,145 dairy ewes of the Spanish Assaf sheep breed. The cheese-making traits comprised five related to the MCPs and two related to cheese yield. The milk and cheese-making phenotypes were adjusted for significant effects with a general linear model. The adjusted phenotypes were then used to define a multiple-phenotype imputation procedure for the cheese-making traits based on multivariate normality and Markov chain Monte Carlo sampling. Five of the seven cheese-making traits achieved a prediction accuracy of at least 0.60, computed as the correlation between the adjusted and the imputed phenotypes. Accuracy was particularly high for the logarithm of curd-firming time since rennet addition (logK20; 0.68), previously suggested as a candidate trait for improving cheese-making ability in this breed, and for the logarithm of the ratio between rennet clotting time and curd firmness at 60 min (logRCT/A60; 0.65), which other studies have defined as an indicator of milk coagulation efficiency.
This study represents a first step toward using phenotype imputation of cheese-making traits as a practical methodology for the dairy sheep industry, imputing these traits based only on the analysis of a milk sample, without the need for pedigree information. The approach could also inform future planning of specific breeding programs, given the importance of cheese-making efficiency in dairy sheep, and it highlights the potential of phenotype imputation to increase effective sample size for expensive, hard-to-measure phenotypes.
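The core of a multiple-phenotype imputation under multivariate normality can be sketched as a conditional-normal draw: fit a joint mean and covariance on complete cases, then sample each missing cheese-making phenotype from its conditional distribution given the observed milk traits. This is a hedged illustration on simulated data, not the authors' implementation; the trait layout, covariance matrix, and missingness rate are all hypothetical, and a single conditional draw stands in for the full MCMC sampler.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: columns 0-2 play the role of routine milk-control traits,
# column 3 an expensive cheese-making trait (all values hypothetical).
cov = np.array([[1.0, 0.5, 0.3, 0.6],
                [0.5, 1.0, 0.4, 0.5],
                [0.3, 0.4, 1.0, 0.4],
                [0.6, 0.5, 0.4, 1.0]])
X = rng.multivariate_normal(np.zeros(4), cov, size=500)
missing = rng.random(500) < 0.3          # ~30% of ewes lack the cheese trait

# Fit mean and covariance on complete cases only.
complete = X[~missing]
mu = complete.mean(axis=0)
S = np.cov(complete, rowvar=False)

obs, mis = [0, 1, 2], [3]
beta = S[np.ix_(mis, obs)] @ np.linalg.inv(S[np.ix_(obs, obs)])
S_cond = S[np.ix_(mis, mis)] - beta @ S[np.ix_(obs, mis)]

# Draw each missing phenotype from its conditional normal distribution.
imputed = X[:, 3].copy()
for i in np.where(missing)[0]:
    mu_cond = mu[mis] + beta @ (X[i, obs] - mu[obs])
    imputed[i] = rng.normal(mu_cond[0], np.sqrt(S_cond[0, 0]))

# Accuracy as in the abstract: correlation between true and imputed values.
accuracy = np.corrcoef(X[missing, 3], imputed[missing])[0, 1]
```

With the covariance structure assumed above, the correlation between true and imputed values is moderate rather than high, because a stochastic draw adds the full conditional variance back in; averaging several draws per animal (as a proper multiple-imputation procedure would) raises the correlation toward the conditional-mean limit.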


Author(s):  
Gregory J. Matthews ◽  
Karthik Bharath ◽  
Sebastian Kurtek ◽  
Juliet K. Brophy ◽  
George K. Thiruvathukal ◽  
...  

We consider the problem of classifying curves when they are observed only partially on their parameter domains. We propose computational methods for (i) completion of partially observed curves; (ii) assessment of completion variability through a nonparametric multiple imputation procedure; (iii) development of nearest neighbor classifiers compatible with the completion techniques. Our contributions are founded on exploiting the geometric notion of shape of a curve, defined as those aspects of a curve that remain unchanged under translations, rotations and reparameterizations. Explicit incorporation of shape information into the computational methods plays the dual role of limiting the set of all possible completions of a curve to those with similar shape while simultaneously enabling more efficient use of training data in the classifier through shape-informed neighborhoods. Our methods are then used for taxonomic classification of partially observed curves arising from images of fossilized Bovidae teeth, obtained from a novel anthropological application concerning paleoenvironmental reconstruction.
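A minimal illustration of the shape idea above: a distance between sampled planar curves that is invariant to translation and rotation, obtained by centering both curves and solving the orthogonal Procrustes problem, then plugged into a 1-nearest-neighbor rule. This is a hedged toy sketch, not the paper's method: the full framework additionally quotients out reparameterizations and handles partially observed curves, which this distance does not.

```python
import numpy as np

def shape_distance(a, b):
    # a, b: (n, 2) arrays of curve points sampled at matching parameters.
    # Remove translation by centering, then remove rotation by aligning b
    # onto a with the optimal orthogonal map (Procrustes; note this also
    # allows reflections unless the determinant is corrected).
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    u, _, vt = np.linalg.svd(b.T @ a)
    return np.linalg.norm(a - b @ (u @ vt))

def nn_classify(query, curves, labels):
    # 1-nearest-neighbor classification under the shape distance.
    return labels[min(range(len(curves)),
                      key=lambda j: shape_distance(query, curves[j]))]
```

Because the alignment is recomputed per pair, a rotated and translated copy of a training curve sits at distance (numerically) zero from it, which is exactly the invariance the shape-informed neighborhoods rely on.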


2019 ◽  
Vol 13 ◽  
pp. 117793221987388
Author(s):  
Koji Ishiya ◽  
Fuzuki Mizuno ◽  
Li Wang ◽  
Shintaroh Ueda

The incompleteness of partial human mitochondrial genome sequences makes it difficult to perform relevant comparisons among multiple resources. To deal with this issue, we propose a computational framework for deducing missing nucleotides in the human mitochondrial genome. We applied it to worldwide mitochondrial haplogroup lineages and assessed its performance. Our approach can deduce the missing nucleotides with a precision of 0.99 or higher in most human mitochondrial DNA lineages. Furthermore, although low-coverage mitochondrial genome sequences often blur relationships in multidimensional scaling analysis, our approach can correct this positional arrangement according to the corresponding mitochondrial DNA lineages. Our framework therefore provides a practical solution to compensate for the lack of genome coverage in partial and fragmented human mitochondrial genome sequences. In this study, we developed an open-source computer program, MitoIMP, implementing our imputation procedure. MitoIMP is freely available from https://github.com/omics-tools/mitoimp.
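The abstract does not spell out the imputation algorithm, but the general idea of deducing missing nucleotides from related lineages can be illustrated with a k-nearest-haplotype majority vote: match the partial sequence to a reference panel at its known positions, then fill each missing base from the closest matches. This is a hypothetical sketch, not MitoIMP's actual method; the panel, sequences, and the function name are toy illustrations.

```python
from collections import Counter

def impute_partial(query, panel, k=5):
    # query: sequence string with 'N' at missing positions.
    # panel: list of complete reference sequences of the same length.
    known = [i for i, c in enumerate(query) if c != 'N']

    # Rank panel sequences by mismatches at the query's known positions.
    def mismatches(ref):
        return sum(query[i] != ref[i] for i in known)
    nearest = sorted(panel, key=mismatches)[:k]

    # Fill each missing position by majority vote among the k nearest.
    filled = list(query)
    for i, c in enumerate(query):
        if c == 'N':
            filled[i] = Counter(ref[i] for ref in nearest).most_common(1)[0][0]
    return ''.join(filled)
```

A real implementation would additionally weight reference sequences by haplogroup assignment and handle insertions/deletions, which this fixed-length toy version ignores.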


1991 ◽  
Vol 22 (4) ◽  
pp. 281-291 ◽  
Author(s):  
Otis W. Gilley ◽  
Robert P. Leone

2015 ◽  
Author(s):  
Shinichi Nakagawa ◽  
Pierre de Villemereuil

Phylogenetic comparative methods (PCMs), especially ones based on linear models, have played a central role in understanding species’ trait evolution. These methods, however, usually assume that phylogenetic trees are known without error or uncertainty, an assumption that is most likely incorrect. So far, Markov chain Monte Carlo (MCMC)-based Bayesian methods have successfully been deployed to account for such phylogenetic uncertainty in PCMs, yet their use seems to have been limited, probably due to difficulties in implementation. Here, we propose an approach with which phylogenetic uncertainty is incorporated in a simple, readily implementable and reliable manner. Our approach uses Rubin’s rules, an integral part of the standard multiple imputation procedure often employed to recover missing data. In our case, we treat the true phylogenetic tree as a missing piece of data and apply Rubin’s rules to amalgamate parameter estimates from a number of models fitted over a set of phylogenetic trees (e.g. a Bayesian posterior distribution of phylogenetic trees). Using a simulation study, we demonstrate that our approach accounts for phylogenetic uncertainty better than alternatives such as MCMC-based Bayesian and Akaike information criterion (AIC)-based model-averaging approaches; that is, on average, it has the best 95% confidence/credible interval coverage of all methods compared. A unique property of the multiple imputation procedure is that an index named ‘relative efficiency’ can be used to quantify the number of trees required to incorporate phylogenetic uncertainty. Using the relative efficiency, we show that the required number of trees is surprisingly small (~50), at least in our simulations. In addition to these advantages, our approach can be combined seamlessly with PCMs that use multiple imputation to recover missing data.
Given the ubiquity of missing data, the multiple imputation procedure with Rubin’s rules is likely to become a popular way to deal with phylogenetic uncertainty as well as missing data in comparative analyses.
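Rubin's rules as described above are simple enough to state in a few lines: average the per-tree estimates, combine within-tree and between-tree variance into a total variance, and derive the relative efficiency that indicates how many trees suffice. The sketch below uses made-up numbers and is not the authors' code; "per-tree" here means one model fit per candidate phylogeny.

```python
import numpy as np

def rubins_rules(estimates, variances):
    # estimates: one parameter estimate per tree (imputation).
    # variances: the corresponding squared standard errors.
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    qbar = estimates.mean()              # pooled point estimate
    w = variances.mean()                 # within-imputation variance
    b = estimates.var(ddof=1)            # between-imputation variance
    t = w + (1 + 1 / m) * b              # total variance
    # Approximate fraction of missing information, and the relative
    # efficiency of using m imputations instead of infinitely many:
    lam = (1 + 1 / m) * b / t
    rel_eff = 1 / (1 + lam / m)
    return qbar, np.sqrt(t), rel_eff
```

The relative efficiency approaches 1 as the number of trees grows, which is why a modest number of trees (~50 in the authors' simulations) can already capture most of the phylogenetic uncertainty.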


10.37236/3222 ◽  
2013 ◽  
Vol 20 (2) ◽  
Author(s):  
Vindya Bhat ◽  
Vojtěch Rödl

In 1964, Erdős proved that for any $\alpha > 0$, an $l$-uniform hypergraph $G$ with $n \geq n_0(\alpha, l)$ vertices and $\alpha \binom{n}{l}$ edges contains a large complete $l$-equipartite subgraph. This implies that any sufficiently large $G$ with density $\alpha > 0$ contains a large subgraph with density at least $l!/l^l$. In this note we study a similar problem for $l$-uniform hypergraphs $Q$ with a weak quasi-random property (i.e. with edges uniformly distributed over the sufficiently large subsets of vertices). We prove that any sufficiently large quasi-random $l$-uniform hypergraph $Q$ with density $\alpha > 0$ contains a large subgraph with density at least $\frac{(l-1)!}{l^{l-1}-1}$. In particular, for $l=3$, any sufficiently large such $Q$ contains a large subgraph with density at least $\frac{1}{4}$, which is the best possible lower bound. We define jumps for quasi-random sequences of $l$-graphs, and our result implies that every number between 0 and $\frac{(l-1)!}{l^{l-1}-1}$ is a jump for quasi-random $l$-graphs. For $l=3$ this interval can be improved based on a recent result of Glebov, Král' and Volec. We prove that every number in the interval $[0, 0.3192)$ is a jump for quasi-random $3$-graphs.

