Estimating dengue transmission intensity from serological data: a comparative analysis using mixture and catalytic models.

Author(s):  
Victoria M Cox ◽  
Megan M O'Driscoll ◽  
Natsuko Imai ◽  
Ari Prayitno ◽  
Sri Rezeki Hadinegoro ◽  
...  

Background. Dengue virus (DENV) infection is a global health concern of increasing magnitude. To target intervention strategies, accurate estimates of the force of infection (FOI) are necessary. Catalytic models have been widely used to estimate DENV FOI and rely on a binary classification of serostatus as seropositive or seronegative, according to pre-defined antibody thresholds. Previous work has demonstrated that the use of thresholds can cause serostatus misclassification and biased estimates. In contrast, mixture models do not rely on thresholds and use the full distribution of antibody titres. To date, there has been limited application of mixture models to estimate DENV FOI.

Methods. We compare the application of mixture models and of time-constant and time-varying catalytic models to simulated data and to serological data collected in Vietnam from 2004 to 2009 (N ≥ 2178) and in Indonesia in 2014 (N = 3194).

Results. The simulation study showed greater estimate bias from the time-constant and time-varying catalytic models (FOI bias = 1.3% (0.05%, 4.6%) and 2.3% (0.06%, 7.8%); seroprevalence bias = 3.1% (0.25%, 9.4%) and 2.9% (0.26%, 8.7%), respectively) than from the mixture model (FOI bias = 0.41% (95% CI 0.02%, 2.7%); seroprevalence bias = 0.11% (0.01%, 3.6%)). When applied to real data from Vietnam, the mixture model frequently produced higher FOI and seroprevalence estimates than the catalytic models.

Conclusions. Our results suggest that mixture models represent valid, potentially less biased, alternatives to catalytic models, which could be particularly useful when estimating FOI and seroprevalence in low-transmission settings, where serostatus misclassification tends to be higher.
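A hedged illustration of the simpler of the two model families discussed above: under a time-constant catalytic model, the expected seroprevalence at age a is 1 − exp(−λa), so λ can be estimated by binomial maximum likelihood from age-stratified counts. The age bands, sample sizes, and FOI below are simulated for illustration and are not the study data; the time-varying and mixture analyses are not reproduced here.

```python
# Minimal sketch of a time-constant catalytic model: under a constant
# force of infection (FOI) lambda, expected seroprevalence at age a is
# 1 - exp(-lambda * a). Ages and counts below are illustrative, not the
# study data.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import binom

ages = np.array([5, 10, 15, 20, 30, 40])          # midpoints of age bands
n_tested = np.full(6, 200)
true_foi = 0.05                                    # per-year FOI used to simulate
rng = np.random.default_rng(0)
n_pos = rng.binomial(n_tested, 1 - np.exp(-true_foi * ages))

def neg_log_lik(lam):
    p = 1 - np.exp(-lam * ages)
    return -binom.logpmf(n_pos, n_tested, p).sum()

fit = minimize_scalar(neg_log_lik, bounds=(1e-4, 1.0), method="bounded")
print(f"estimated FOI: {fit.x:.4f} per year (truth: {true_foi})")
```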

Author(s):  
HUI ZHANG ◽  
Q. M. JONATHAN WU ◽  
THANH MINH NGUYEN

In this paper, we propose a novel algorithm for feature selection and model detection using Student's t-distribution based on the variational Bayesian (VB) approach. First, our method is based on the Student's t-mixture model (SMM), which has heavier tails than the Gaussian distribution and is therefore less sensitive to small numbers of data points, giving more reliable estimates of the number of components. Second, the number of components, the local feature saliency, and the parameters of the mixture model are estimated simultaneously by variational Bayesian learning. Experimental results on synthetic and real data demonstrate the improved robustness of our approach.
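For intuition about why the heavier t tails help, here is a minimal sketch of the classical EM algorithm for a one-dimensional two-component t-mixture with the degrees of freedom ν held fixed. The paper's method instead uses variational Bayesian learning and additionally estimates feature saliency and the number of components; the data, ν, and initial values below are illustrative.

```python
# Sketch: EM for a 1-D two-component Student's t mixture with fixed
# degrees of freedom nu. The latent scale weights u downweight
# outlying points, which is the source of the robustness.
import numpy as np
from scipy.stats import t as t_dist

rng = np.random.default_rng(1)
x = np.concatenate([rng.standard_t(3, 300) * 0.5,          # component 1
                    rng.standard_t(3, 300) * 0.5 + 4.0])   # component 2
nu = 3.0
pi = np.array([0.5, 0.5]); mu = np.array([-1.0, 5.0]); sig = np.array([1.0, 1.0])

for _ in range(100):
    # E-step: responsibilities r and latent scale weights u
    dens = np.stack([pi[k] * t_dist.pdf(x, nu, loc=mu[k], scale=sig[k])
                     for k in range(2)])
    r = dens / dens.sum(axis=0)
    d2 = np.stack([((x - mu[k]) / sig[k]) ** 2 for k in range(2)])
    u = (nu + 1.0) / (nu + d2)
    # M-step: weighted updates (standard t-mixture EM of McLachlan & Peel)
    pi = r.mean(axis=1)
    mu = (r * u * x).sum(axis=1) / (r * u).sum(axis=1)
    sig = np.sqrt((r * u * (x - mu[:, None]) ** 2).sum(axis=1) / r.sum(axis=1))

print("weights:", pi.round(2), "means:", mu.round(2), "scales:", sig.round(2))
```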


2010 ◽  
Vol 22 (7) ◽  
pp. 1718-1736 ◽  
Author(s):  
Shun-ichi Amari

Analysis of correlated spike trains is a hot topic of research in computational neuroscience. A general model of probability distributions for spikes includes too many parameters to be of use in analyzing real data. Instead, we need a simple but powerful generative model for correlated spikes. We developed a class of conditional mixture models that includes a number of existing models and analyzed its capabilities and limitations. We apply the model to dynamical aspects of neuron pools. When Hebbian cell assemblies coexist in a pool of neurons, the state of the pool is specified by these assemblies, such that the probability distribution of spikes is a mixture of those of the component assemblies. The probabilities of activation of the Hebbian assemblies change dynamically. We used this model as a basis for a competitive model governing the states of assemblies.
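A toy generative sketch of the mixture construction described above: in each time bin, one of two hypothetical Hebbian assemblies is active, and the pool's spike pattern is drawn from that assembly's Bernoulli firing profile, so the marginal spike distribution is a mixture of the component assembly distributions. Assembly membership, rates, and activation probabilities are invented for illustration.

```python
# Toy generative sketch: a pool of neurons whose spike pattern in each
# time bin is drawn from one of two Hebbian assemblies; the marginal
# distribution of spikes is then a mixture of the assembly distributions.
import numpy as np

rng = np.random.default_rng(2)
n_neurons, n_bins = 20, 1000
rates = np.array([
    [0.8] * 10 + [0.05] * 10,   # assembly A: first half of the pool active
    [0.05] * 10 + [0.8] * 10,   # assembly B: second half active
])
assembly_prob = np.array([0.3, 0.7])   # activation probabilities (could vary in time)

z = rng.choice(2, size=n_bins, p=assembly_prob)      # which assembly per bin
spikes = rng.random((n_bins, n_neurons)) < rates[z]  # Bernoulli spikes

print("mean firing rate per neuron:", spikes.mean(axis=0).round(2))
```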


2018 ◽  
Vol 28 (12) ◽  
pp. 3769-3784
Author(s):  
Zihang Lu ◽  
Wendy Lou

In longitudinal studies, it is often of great interest to cluster individual trajectories based on repeated measurements taken over time. Non-linear growth trajectories are often seen in practice, and individual trajectories may also be measured sparsely and at irregular time points, which can complicate the modeling process. Motivated by a study of hormone profiles in pregnant women, we propose a shape invariant growth mixture model for clustering non-linear growth trajectories. Bayesian inference via Markov chain Monte Carlo (MCMC) was employed to estimate the parameters of interest. We compared our model to the commonly used growth mixture model and to a functional clustering approach in simulation studies. Results from analyzing the real data and the simulated data are presented and discussed.
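As a rough sketch of the shape-invariant idea, assuming the common parameterization y_i(t) = a_i + b_i f(t − c_i) around a shared curve f (the authors' exact specification may differ), the snippet below simulates two clusters of sparsely and irregularly observed non-linear trajectories of the kind the model is meant to cluster.

```python
# Generative sketch of a shape invariant model: within a cluster each
# subject's trajectory is a shifted and scaled copy of a common shape f,
# observed sparsely at irregular times. The logistic f and the
# parameterization y = a + b * f(t - c) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
f = lambda t: 1.0 / (1.0 + np.exp(-t))   # shared shape (logistic)

def simulate_subject(a, b, c, n_obs):
    t = np.sort(rng.uniform(-5, 5, n_obs))            # irregular time points
    y = a + b * f(t - c) + rng.normal(0, 0.05, n_obs)  # noisy sparse trajectory
    return t, y

# Two clusters differing in their typical shift/scale parameters.
cluster_params = {"early": (0.0, 1.0, -2.0), "late": (0.2, 0.8, 2.0)}
data = {name: [simulate_subject(a + rng.normal(0, 0.05),
                                b + rng.normal(0, 0.05),
                                c + rng.normal(0, 0.3),
                                rng.integers(4, 9))    # 4 to 8 observations
               for _ in range(50)]
        for name, (a, b, c) in cluster_params.items()}
print({name: len(subjects) for name, subjects in data.items()})
```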


Viruses ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 1155
Author(s):  
Arno Swart ◽  
Miriam Maas ◽  
Ankje de Vries ◽  
Tryntsje Cuperus ◽  
Marieke Opsteegh

Serological assays, such as the enzyme-linked immunosorbent assay (ELISA), are popular tools for establishing the seroprevalence of various infectious diseases in humans and animals. In the ELISA, the optical density is measured and gives an indication of the antibody level. However, there is variability in optical density values among individuals that have been exposed to the pathogen of interest, as well as among individuals that have not been exposed, and in general the distributions of values expected for these two categories partly overlap. Often, a cut-off value is determined to decide which individuals should be considered seropositive or seronegative. However, the classical cut-off approach based on a putative threshold ignores heterogeneity of the immune response in the population and is thus not the optimal solution for the analysis of serological data. A binary mixture model does include this heterogeneity, offers measures of uncertainty, and allows direct estimation of seroprevalence without the need for correction based on sensitivity and specificity. Furthermore, the probability of being seropositive can be estimated for individual samples, and both continuous and categorical covariates (risk factors) can be included in the analysis. Using ELISA results from rats tested for the Seoul orthohantavirus, we compared the classical cut-off method with a binary mixture model set in a Bayesian framework. By comparison with real-time quantitative polymerase chain reaction (RT-qPCR) results, we show that the mixture model performs similarly to or better than cut-off methods. We therefore recommend binary mixture models over classical cut-off methods as an analysis tool. Example code is included to facilitate the practical use of binary mixture models in everyday practice.
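Independently of the example code shipped with the paper, a minimal sketch of the same idea: fit a two-component mixture to log optical densities, read the weight of the higher component as the seroprevalence estimate, and use per-sample posterior probabilities instead of a hard cut-off. This uses maximum-likelihood Gaussian mixtures via scikit-learn rather than the authors' Bayesian implementation, and the simulated ODs are illustrative.

```python
# Sketch of a binary mixture analysis of ELISA optical densities:
# fit a two-component Gaussian mixture on log(OD); the weight of the
# higher-mean component estimates seroprevalence, and each sample gets
# a posterior probability of being seropositive.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
log_od = np.concatenate([rng.normal(-1.5, 0.4, 700),   # unexposed (simulated)
                         rng.normal(0.8, 0.5, 300)])   # exposed (simulated)

gm = GaussianMixture(n_components=2, random_state=0).fit(log_od.reshape(-1, 1))
pos = np.argmax(gm.means_.ravel())            # higher-mean component = seropositive
seroprev = gm.weights_[pos]
p_pos = gm.predict_proba(log_od.reshape(-1, 1))[:, pos]

print(f"estimated seroprevalence: {seroprev:.2%} (simulated truth: 30%)")
print("first five posterior probabilities:", p_pos[:5].round(3))
```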


Genetics ◽  
2004 ◽  
Vol 166 (4) ◽  
pp. 1981-1993 ◽  
Author(s):  
Yuan-Ming Zhang ◽  
Shizhong Xu

In plants and laboratory animals, QTL mapping is commonly performed using F2 or BC individuals derived from the cross of two inbred lines. Typical QTL mapping statistics assume that each F2 individual is genotyped for the markers and phenotyped for the trait. For plant traits with low heritability, it has been suggested to use the average phenotypic values of F3 progeny derived from selfing F2 plants in place of the F2 phenotype itself. All F3 progeny derived from the same F2 plant belong to the same family, denoted F2:3. If the size of each F2:3 family (the number of F3 progeny) is sufficiently large, the average value of the family will represent the genotypic value of the F2 plant, and thus the power of QTL mapping may be significantly increased. The strategy of using F2 marker genotypes and F3 average phenotypes for QTL mapping in plants is quite similar to the daughter design of QTL mapping in dairy cattle. We study the fundamental principle of the plant version of the daughter design and develop a new statistical method to map QTL under this F2:3 strategy. We also propose to combine both the F2 phenotypes and the F2:3 average phenotypes to further increase the power of QTL mapping. The statistical method developed in this study differs from published ones in that the new method fully takes advantage of the mixture distribution of F2:3 families derived from heterozygous F2 plants. Incorporation of this new information significantly increases the statistical power of QTL detection relative to the classical F2 design, even if only a single F3 progeny is collected from each F2:3 family. The mixture model is developed on the basis of a single-QTL model and implemented via the EM algorithm. Substantial computer simulation was conducted to demonstrate the improved efficiency of the mixture model. Extension of the mixture model to multiple-QTL analysis is developed using a Bayesian approach. The computer program performing the Bayesian analysis of the simulated data is available to users for real data analysis.
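A simulation sketch of the central observation: for a heterozygous F2 parent, F3 genotypes at the QTL segregate 1/4 QQ : 1/2 Qq : 1/4 qq, so F2:3 family means, especially from small families, follow a mixture distribution rather than a single normal. The genotypic values and environmental standard deviation below are illustrative, not the paper's settings.

```python
# Simulation sketch of why F2:3 family means from heterozygous F2
# plants follow a mixture: F3 genotypes at the QTL segregate
# 1/4 QQ : 1/2 Qq : 1/4 qq, so small-family means are draws from a
# mixture over segregation outcomes.
import numpy as np

rng = np.random.default_rng(5)
geno_value = {0: -1.0, 1: 0.0, 2: 1.0}          # qq, Qq, QQ genotypic values
env_sd, family_size, n_families = 0.5, 1, 2000  # single F3 progeny per family

means = []
for _ in range(n_families):
    genos = rng.choice([0, 1, 2], size=family_size, p=[0.25, 0.5, 0.25])
    pheno = np.array([geno_value[g] for g in genos]) + rng.normal(0, env_sd, family_size)
    means.append(pheno.mean())
means = np.asarray(means)

# With family_size = 1 the three mixture components are clearly visible;
# as family_size grows, the family mean concentrates near the mid-parent value 0.
print("mean:", means.mean().round(3), "sd:", means.std().round(3))
```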


Author(s):  
P.L. Nikolaev

This article deals with a method for binary classification of images containing small text. The classification is based on the fact that such text can have two orientations: it can be positioned horizontally and read from left to right, or it can be rotated 180 degrees, in which case the image must be turned before the text can be read. Text of this kind is found on the covers of a variety of books, so when recognizing covers it is necessary to determine the orientation of the text before recognizing the text itself. The article proposes a deep neural network for determining text orientation in the context of book cover recognition. The results of training and testing a convolutional neural network on synthetic data, as well as examples of the network operating on real data, are presented.
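A minimal PyTorch sketch of such a binary orientation classifier; the three-block architecture and 64×64 grayscale input are assumptions for illustration, not the network developed in the article.

```python
# Minimal sketch of a CNN for binary text-orientation classification
# (upright vs rotated 180 degrees). Architecture and input size are
# illustrative assumptions.
import torch
import torch.nn as nn

class OrientationNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(64 * 8 * 8, 1)  # logit for P(rotated 180)

    def forward(self, x):
        x = self.features(x)            # (B, 64, 8, 8) for 64x64 input
        return self.classifier(x.flatten(1))

model = OrientationNet()
dummy = torch.randn(4, 1, 64, 64)       # batch of 4 grayscale crops
logits = model(dummy)
print(logits.shape)                      # torch.Size([4, 1])
```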


Entropy ◽  
2018 ◽  
Vol 20 (11) ◽  
pp. 828 ◽  
Author(s):  
Jixia Wang ◽  
Yameng Zhang

This paper is dedicated to the study of geometric average Asian call option pricing under non-extensive statistical mechanics for a time-varying coefficient diffusion model. We employed the non-extensive Tsallis entropy distribution, which can describe the leptokurtosis and fat-tail characteristics of returns, to model the motion of the underlying asset price. Considering that economic variables change over time, we allowed the drift and diffusion terms in our model to be time-varying functions. We used the Itô formula, the Feynman–Kac formula, and the Padé ansatz to obtain a closed-form solution for geometric average Asian option pricing with a dividend yield under the time-varying model. Moreover, the simulation study shows that the results obtained by our method fit the simulation data better than those of Zhao et al. From the analysis of real data, we identify the value of q that best fits the real stock data, and the result shows that investors underestimate risk using the Black–Scholes model compared to our model.
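As a hedged baseline to the closed-form Tsallis result, the snippet below prices a geometric-average Asian call by Monte Carlo under standard geometric Brownian motion, i.e., the Gaussian (q → 1) limit with constant coefficients; all parameters are illustrative.

```python
# Baseline sketch: Monte Carlo price of a geometric-average Asian call
# under standard geometric Brownian motion (the Gaussian q -> 1 limit
# with constant coefficients), not the paper's time-varying Tsallis model.
import numpy as np

rng = np.random.default_rng(6)
s0, strike, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
n_steps, n_paths = 252, 100_000
dt = T / n_steps

z = rng.standard_normal((n_paths, n_steps))
log_s = np.log(s0) + np.cumsum((r - 0.5 * sigma**2) * dt
                               + sigma * np.sqrt(dt) * z, axis=1)
geo_avg = np.exp(log_s.mean(axis=1))          # geometric average of each path
payoff = np.maximum(geo_avg - strike, 0.0)
price = np.exp(-r * T) * payoff.mean()
print(f"geometric Asian call ~ {price:.3f}")
```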


Metabolites ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 214
Author(s):  
Aneta Sawikowska ◽  
Anna Piasecka ◽  
Piotr Kachlicki ◽  
Paweł Krajewski

Peak overlapping is a common problem in chromatography, mainly in the case of complex biological mixtures such as metabolite extracts. Because different compounds with similar chromatographic properties co-elute, peak separation becomes challenging. In this paper, two computational methods for separating peaks, applied for the first time to large chromatographic datasets, are described, compared, and experimentally validated. The methods lead from raw observations to data that can form inputs for statistical analysis. First, in both methods, data are normalized by the mass of the sample, the baseline is removed, retention time alignment is conducted, and peak detection is performed. Then, in the first method, clustering is used to separate overlapping peaks, whereas in the second method, functional principal component analysis (FPCA) is applied for the same purpose. Simulated data and experimental results are used to present both methods and to compare them. Real data were obtained in a study of metabolomic changes in barley (Hordeum vulgare) leaves under drought stress. The results suggest that both methods are suitable for the separation of overlapping peaks, but an additional advantage of the FPCA is the possibility to assess the variability of individual compounds present within the same peaks of different chromatograms.
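A sketch of the preprocessing stages shared by both methods, baseline removal and peak detection, on a simulated chromatogram with partially overlapping peaks. The clustering- and FPCA-based separation steps themselves are not reproduced here, and the rolling-minimum baseline is a deliberately crude stand-in for the paper's procedure.

```python
# Sketch of the shared preprocessing both methods build on: baseline
# removal (a crude rolling-minimum estimate) and peak detection on a
# simulated chromatogram.
import numpy as np
from scipy.signal import find_peaks
from scipy.ndimage import minimum_filter1d

rng = np.random.default_rng(7)
t = np.linspace(0, 10, 2000)
gauss = lambda c, w, h: h * np.exp(-0.5 * ((t - c) / w) ** 2)
signal = gauss(3.0, 0.1, 1.0) + gauss(3.4, 0.1, 0.7) + gauss(7.0, 0.15, 0.5)
raw = signal + 0.2 + 0.02 * t + rng.normal(0, 0.01, t.size)  # drifting baseline

baseline = minimum_filter1d(raw, size=301)   # crude baseline estimate
corrected = raw - baseline
peaks, props = find_peaks(corrected, height=0.1, prominence=0.05)
print("detected peak retention times:", t[peaks].round(2))
```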


2021 ◽  
Vol 10 (7) ◽  
pp. 435
Author(s):  
Yongbo Wang ◽  
Nanshan Zheng ◽  
Zhengfu Bian

Since pairwise registration is a necessary step for the seamless fusion of point clouds from neighboring stations, a closed-form solution to planar feature-based registration of LiDAR (Light Detection and Ranging) point clouds is proposed in this paper. Based on the Plücker coordinate-based representation of linear features in three-dimensional space, a quad tuple-based representation of planar features is introduced, which makes it possible to directly determine the difference between any two planar features. Dual quaternions are employed to represent spatial transformation and operations between dual quaternions and the quad tuple-based representation of planar features are given, with which an error norm is constructed. Based on L2-norm-minimization, detailed derivations of the proposed solution are explained step by step. Two experiments were designed in which simulated data and real data were both used to verify the correctness and the feasibility of the proposed solution. With the simulated data, the calculated registration results were consistent with the pre-established parameters, which verifies the correctness of the presented solution. With the real data, the calculated registration results were consistent with the results calculated by iterative methods. Conclusions can be drawn from the two experiments: (1) The proposed solution does not require any initial estimates of the unknown parameters in advance, which assures the stability and robustness of the solution; (2) Using dual quaternions to represent spatial transformation greatly reduces the additional constraints in the estimation process.
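For intuition, a hedged sketch of plane-correspondence registration in a simpler, non-dual-quaternion formulation: writing each plane as the 4-tuple (n, d) with n·x = d, a rigid motion x' = Rx + t maps it to (Rn, d + (Rn)·t), so R can be recovered by SVD alignment of the normals (Kabsch) and t by linear least squares on the offsets. This is an alternative closed form for illustration, not the paper's dual-quaternion estimator.

```python
# Hedged sketch of plane-feature registration: each plane is the
# 4-tuple (n, d) with n . x = d; a rigid motion x' = R x + t maps it
# to (R n, d + (R n) . t).
import numpy as np

def register_planes(n_src, d_src, n_dst, d_dst):
    # Rotation: align source normals with destination normals (Kabsch).
    h = n_src.T @ n_dst
    u, _, vt = np.linalg.svd(h)
    s = np.diag([1.0, 1.0, np.sign(np.linalg.det(vt.T @ u.T))])
    rot = vt.T @ s @ u.T
    # Translation: d_dst - d_src = (R n_src) . t, a linear system in t.
    a = (rot @ n_src.T).T
    t, *_ = np.linalg.lstsq(a, d_dst - d_src, rcond=None)
    return rot, t

# Self-check with a synthetic transform (needs >= 3 non-parallel planes).
rng = np.random.default_rng(8)
n_src = rng.normal(size=(6, 3)); n_src /= np.linalg.norm(n_src, axis=1, keepdims=True)
d_src = rng.normal(size=6)
angle = 0.4
rot_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                     [np.sin(angle),  np.cos(angle), 0],
                     [0, 0, 1]])
t_true = np.array([1.0, -2.0, 0.5])
n_dst = (rot_true @ n_src.T).T
d_dst = d_src + n_dst @ t_true
rot, t = register_planes(n_src, d_src, n_dst, d_dst)
print(np.allclose(rot, rot_true), np.allclose(t, t_true))
```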


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Camilo Broc ◽  
Therese Truong ◽  
Benoit Liquet

Background. The increasing number of genome-wide association studies (GWAS) has revealed several loci that are associated with multiple distinct phenotypes, suggesting the existence of pleiotropic effects. Highlighting these cross-phenotype genetic associations could help to identify and understand common biological mechanisms underlying some diseases. Common approaches test the association between genetic variants and multiple traits at the SNP level. In this paper, we propose a novel gene- and pathway-level approach for the case where several independent GWAS on independent traits are available. The method is based on a generalization of the sparse group Partial Least Squares (sgPLS) that takes groups of variables into account, with a Lasso penalization that links all independent data sets. This method, called joint-sgPLS, convincingly detects signal at both the variable level and the group level.

Results. Our method has the advantage of producing a globally interpretable model while respecting the architecture of the data. It can outperform traditional methods and provides wider insight by exploiting a priori information. We compared the performance of the proposed method to other benchmark methods on simulated data and give an example of application to real data, with the aim of highlighting common susceptibility variants to breast and thyroid cancers.

Conclusion. The joint-sgPLS shows interesting properties for detecting a signal. As an extension of PLS, the method is suited to data with a large number of variables. The Lasso penalization copes with architectures of variable groups and observation sets. Furthermore, although the method has been applied here to a genetic study, its formulation is adapted to any data with a large number of variables and a known a priori group structure, in other application fields.
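A minimal sketch of the group-sparsity building block behind sgPLS-type penalties: the group soft-thresholding proximal operator, which shrinks each group of coefficients (e.g., the SNPs of one gene) and sets entire groups to zero when their norm falls below the threshold. The groups and penalty level below are illustrative; this is not the joint-sgPLS algorithm itself.

```python
# Minimal sketch of the group-sparsity building block used by
# sgPLS-type methods: the group soft-thresholding (proximal) operator.
import numpy as np

def group_soft_threshold(beta, groups, lam):
    out = np.zeros_like(beta)
    for idx in groups:                       # idx: indices of one group
        g = beta[idx]
        norm = np.linalg.norm(g)
        if norm > lam:
            out[idx] = (1 - lam / norm) * g  # shrink the whole group
    return out                               # groups with norm <= lam stay zero

beta = np.array([0.9, -0.8, 0.1, 0.05, -0.07, 1.5])
groups = [np.array([0, 1]), np.array([2, 3, 4]), np.array([5])]
print(group_soft_threshold(beta, groups, lam=0.3))  # middle group is zeroed
```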

