Bayesian hierarchical models for disease mapping applied to contagious pathologies

2019 ◽  
Author(s):  
Sylvain Coly ◽  
Myriam Garrido ◽  
David Abrial ◽  
Anne-Françoise Yao

Abstract Disease mapping aims to determine the underlying disease risk from scattered epidemiological data and to represent it on a smoothed colored map. This methodology is based on Bayesian inference and is classically dedicated to non-infectious diseases whose incidence is low and whose case distribution is spatially (and possibly temporally) structured. Over recent decades, disease mapping has been substantially improved to extend its scope of application: integrating the temporal dimension, dealing with missing data, taking into account various kinds of prior information (environmental and population covariates, assumptions concerning the distribution and evolution of the risk), dealing with overdispersion, etc. We aim to adapt this approach to rare infectious diseases. In the context of a contagious disease, a primary case can also generate secondary occurrences of the pathology in a close spatial and temporal neighborhood; this can result in local overdispersion and in higher spatial and temporal dependencies due to direct and/or indirect transmission. We proposed and tested 60 Bayesian hierarchical models on 400 simulated datasets and on real bovine tuberculosis data. This analysis shows the relevance of CAR (conditional autoregressive) processes for handling the structure of the risk. We can also conclude that negative binomial models outperform Poisson models with Gaussian noise in handling overdispersion. In addition, our study provided relevant maps that are congruent with the true risk (simulated data) and with existing knowledge of bovine tuberculosis (real data).
Author summary Disease mapping is dedicated to non-infectious diseases whose incidence is low and whose distribution is spatially (and possibly temporally) structured. In this paper, we aim to adapt this approach to rare infectious pathologies. In the context of a contagious disease, a primary case can also generate secondary occurrences of the pathology in a close spatial and temporal neighborhood, resulting in local overdispersion and in high spatial and temporal dependencies. We thus explored different spatial, temporal and spatiotemporal links and highlight those best suited to the likely risk structures of infectious diseases. We also conclude that negative binomial models outperform Poisson models with Gaussian noise in handling overdispersion. Our study also provided relevant maps that are congruent with the true risk (for simulated data) and with existing knowledge of bovine tuberculosis (for real data). Disease mapping thus appears to be a promising way to investigate rare infectious diseases.
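
As a rough illustration of how a CAR process ties the risk in one area to the risk in its neighbors, the sketch below computes the conditional mean and variance of one area's spatial effect under the intrinsic (Besag-type) CAR form. The abstract does not say which CAR variant the authors use, so this form, the function name `icar_conditional`, the toy adjacency and the values are all assumptions made for illustration.

```python
def icar_conditional(values, neighbors, i, sigma2):
    """Conditional mean and variance of area i's spatial effect, given its
    neighbors, under an intrinsic CAR prior (assumed form, see lead-in)."""
    nbrs = neighbors[i]
    if not nbrs:
        raise ValueError("area has no neighbors; the conditional is undefined")
    mean = sum(values[j] for j in nbrs) / len(nbrs)  # average of neighboring effects
    variance = sigma2 / len(nbrs)                    # more neighbors, tighter conditional
    return mean, variance


# Hypothetical example: four areas in a row, holding log relative risks.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
values = [0.2, 0.0, 0.6, 0.4]
mean, variance = icar_conditional(values, neighbors, i=1, sigma2=1.0)
print(mean, variance)
```

Under this form each area is pulled toward the average of its neighbors, which is what produces the smoothed map described in the abstract.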


PLoS ONE ◽  
2021 ◽  
Vol 16 (1) ◽  
pp. e0222898
Author(s):  
Sylvain Coly ◽  
Myriam Garrido ◽  
David Abrial ◽  
Anne-Françoise Yao

Disease mapping aims to determine the underlying disease risk from scattered epidemiological data and to represent it on a smoothed colored map. This methodology is based on Bayesian inference and is classically dedicated to non-infectious diseases whose incidence is low and whose case distribution is spatially (and possibly temporally) structured. Over recent decades, disease mapping has been substantially improved to extend its scope of application: integrating the temporal dimension, dealing with missing data, taking into account various kinds of prior information (environmental and population covariates, assumptions concerning the distribution and evolution of the risk), dealing with overdispersion, etc. We aim to adapt this approach to rare infectious diseases by proposing specific and generic variants of this methodology. In the context of a contagious disease, a primary case can also generate secondary occurrences of the pathology in a close spatial and temporal neighborhood; this can result in local overdispersion and in higher spatial and temporal dependencies due to direct and/or indirect transmission. Consequently, we test models including a negative binomial distribution (instead of the usual Poisson distribution) to deal with local overdispersion. We also use a specific spatio-temporal link in order to better model the stronger spatial and temporal dependencies due to the transmission of the disease. We proposed and tested 60 Bayesian hierarchical models on 400 simulated datasets and on real bovine tuberculosis data. This analysis shows the relevance of CAR (conditional autoregressive) processes for handling the structure of the risk. We can also conclude that negative binomial models outperform Poisson models with Gaussian noise in handling overdispersion. In addition, our study provided relevant maps that are congruent with the true risk (simulated data) and with existing knowledge of bovine tuberculosis (real data).
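
To make the overdispersion argument concrete, the following sketch compares a Poisson distribution with a negative binomial distribution that has the same mean. The mean/dispersion parameterization (variance = mu + mu^2 / r) and the values mu = 2 and r = 0.5 are assumptions chosen for illustration, not settings taken from the paper.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of observing k cases with mean mu."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def negbin_pmf(k, mu, r):
    """Negative binomial probability of k cases, parameterized by mean mu
    and dispersion r, so that the variance is mu + mu**2 / r."""
    p = r / (r + mu)
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)


def moments(pmf, kmax=2000):
    """Mean and variance of a count distribution, summed numerically up to kmax."""
    mean = sum(k * pmf(k) for k in range(kmax))
    var = sum((k - mean) ** 2 * pmf(k) for k in range(kmax))
    return mean, var


mu, r = 2.0, 0.5  # illustrative values only
pois_mean, pois_var = moments(lambda k: poisson_pmf(k, mu))
nb_mean, nb_var = moments(lambda k: negbin_pmf(k, mu, r))
print("Poisson mean, variance:", pois_mean, pois_var)
print("Negative binomial mean, variance:", nb_mean, nb_var)
```

Both distributions share the same mean, but the negative binomial variance exceeds it by mu^2 / r. That extra variance is the room the model needs when secondary cases cluster around a primary case.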



Biostatistics ◽  
2017 ◽  
Vol 18 (4) ◽  
pp. 637-650 ◽  
Author(s):  
Luis León-Novelo ◽  
Claudio Fuentes ◽  
Sarah Emerson

SUMMARY RNA-Seq data characteristically exhibits large variances, which need to be appropriately accounted for in any proposed model. We first explore the effects of this variability on the maximum likelihood estimator (MLE) of the dispersion parameter of the negative binomial distribution, and propose instead to use an estimator obtained via maximization of the marginal likelihood in a conjugate Bayesian framework. We show, via simulation studies, that the marginal MLE can better control this variation and produce a more stable and reliable estimator. We then formulate a conjugate Bayesian hierarchical model, and use this new estimator to propose a Bayesian hypothesis test to detect differentially expressed genes in RNA-Seq data. We use numerical studies to show that our much simpler approach is competitive with other negative binomial based procedures, and we use a real data set to illustrate the implementation and flexibility of the procedure.
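
The idea of estimating the dispersion by maximizing a marginal likelihood can be sketched as follows. The abstract does not spell out the conjugate model, so this sketch assumes a Beta(a, b) prior on the negative binomial success probability p, which is conjugate when the dispersion r is fixed. The function names, the counts, the prior values and the grid search (used here in place of a proper optimizer) are all hypothetical.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(counts, r, a=1.0, b=1.0):
    """Log marginal likelihood of i.i.d. NB(r, p) counts after integrating
    p out against an assumed Beta(a, b) prior."""
    n = len(counts)
    total = sum(counts)
    log_coef = sum(math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
                   for y in counts)
    return log_coef + log_beta(a + n * r, b + total) - log_beta(a, b)


def marginal_mle(counts, grid):
    """Dispersion value on the grid that maximizes the marginal likelihood."""
    return max(grid, key=lambda r: log_marginal(counts, r))


counts = [0, 3, 1, 12, 0, 7, 2, 25]      # hypothetical read counts for one gene
grid = [0.05 * k for k in range(1, 201)]  # candidate dispersions from 0.05 to 10
print(marginal_mle(counts, grid))
```

Integrating p out removes one nuisance parameter before the maximization, which is one plausible reason such an estimator would vary less than the ordinary MLE on small samples.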



Metabolites ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 214
Author(s):  
Aneta Sawikowska ◽  
Anna Piasecka ◽  
Piotr Kachlicki ◽  
Paweł Krajewski

Peak overlapping is a common problem in chromatography, mainly in the case of complex biological mixtures, i.e., metabolites. Due to the existence of the phenomenon of co-elution of different compounds with similar chromatographic properties, peak separation becomes challenging. In this paper, two computational methods of separating peaks, applied, for the first time, to large chromatographic datasets, are described, compared, and experimentally validated. The methods lead from raw observations to data that can form inputs for statistical analysis. First, in both methods, data are normalized by the mass of sample, the baseline is removed, retention time alignment is conducted, and detection of peaks is performed. Then, in the first method, clustering is used to separate overlapping peaks, whereas in the second method, functional principal component analysis (FPCA) is applied for the same purpose. Simulated data and experimental results are used as examples to present both methods and to compare them. Real data were obtained in a study of metabolomic changes in barley (Hordeum vulgare) leaves under drought stress. The results suggest that both methods are suitable for separation of overlapping peaks, but the additional advantage of the FPCA is the possibility to assess the variability of individual compounds present within the same peaks of different chromatograms.



2021 ◽  
Vol 10 (7) ◽  
pp. 435
Author(s):  
Yongbo Wang ◽  
Nanshan Zheng ◽  
Zhengfu Bian

Since pairwise registration is a necessary step for the seamless fusion of point clouds from neighboring stations, a closed-form solution to planar feature-based registration of LiDAR (Light Detection and Ranging) point clouds is proposed in this paper. Based on the Plücker coordinate-based representation of linear features in three-dimensional space, a quad tuple-based representation of planar features is introduced, which makes it possible to directly determine the difference between any two planar features. Dual quaternions are employed to represent spatial transformation and operations between dual quaternions and the quad tuple-based representation of planar features are given, with which an error norm is constructed. Based on L2-norm-minimization, detailed derivations of the proposed solution are explained step by step. Two experiments were designed in which simulated data and real data were both used to verify the correctness and the feasibility of the proposed solution. With the simulated data, the calculated registration results were consistent with the pre-established parameters, which verifies the correctness of the presented solution. With the real data, the calculated registration results were consistent with the results calculated by iterative methods. Conclusions can be drawn from the two experiments: (1) The proposed solution does not require any initial estimates of the unknown parameters in advance, which assures the stability and robustness of the solution; (2) Using dual quaternions to represent spatial transformation greatly reduces the additional constraints in the estimation process.



Information ◽  
2021 ◽  
Vol 12 (5) ◽  
pp. 202
Author(s):  
Louai Alarabi ◽  
Saleh Basalamah ◽  
Abdeltawab Hendawi ◽  
Mohammed Abdalla

The rapid spread of infectious diseases is a major public health problem. Recent developments in fighting these diseases have heightened the need for a contact tracing process. Contact tracing can be considered an ideal method for controlling the transmission of infectious diseases. The contact tracing process leads to diagnostic testing, treatment or self-isolation of suspected cases, and then treatment of infected persons; this eventually limits the spread of diseases. This paper proposes a technique named TraceAll that traces all contacts exposed to the infected patient and produces a list of these contacts to be considered potentially infected patients. First, it considers the infected patient as the querying user and starts to fetch the contacts exposed to that user. Second, it obtains all the trajectories that belong to objects that moved near the querying user. Next, it investigates these trajectories by considering the social distance and exposure period to identify whether these objects have become infected. The experimental evaluation of the proposed technique with real data sets illustrates the effectiveness of this solution. Comparative analysis experiments confirm that TraceAll outperforms baseline methods by 40% regarding the efficiency of answering contact tracing queries.
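
The two criteria named in the abstract, social distance and exposure period, can be illustrated with a naive filter over trajectories. This is not the TraceAll algorithm, whose query processing is not described here. The trajectory format (time, x, y), the fixed sampling step, the thresholds and all function names are assumptions made for this sketch.

```python
import math


def exposure_time(query, other, max_dist, step):
    """Seconds during which `other` was within max_dist of `query`.
    Both trajectories are lists of (t, x, y) sampled every `step` seconds."""
    other_at = {t: (x, y) for t, x, y in other}
    close_samples = sum(
        1 for t, x, y in query
        if t in other_at and math.dist((x, y), other_at[t]) <= max_dist
    )
    return close_samples * step


def trace_contacts(query, others, max_dist=2.0, min_exposure=900, step=60):
    """Names of objects that stayed within max_dist of the querying user
    for at least min_exposure seconds in total."""
    return [name for name, trajectory in others.items()
            if exposure_time(query, trajectory, max_dist, step) >= min_exposure]


# Hypothetical data: one sample per minute over 20 minutes.
times = [60 * k for k in range(20)]
patient = [(t, 0.0, 0.0) for t in times]
others = {
    "alice": [(t, 1.0, 0.0) for t in times],                           # close throughout
    "bob": [(t, 1.0 if t < 300 else 10.0, 0.0) for t in times],        # close for 5 minutes
}
print(trace_contacts(patient, others))
```

A linear scan like this compares the querying user against every trajectory, so it only serves to show the criteria; an efficient system would need spatial indexing to prune distant objects first.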



2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Camilo Broc ◽  
Therese Truong ◽  
Benoit Liquet

Abstract Background The increasing number of genome-wide association studies (GWAS) has revealed several loci that are associated with multiple distinct phenotypes, suggesting the existence of pleiotropic effects. Highlighting these cross-phenotype genetic associations could help to identify and understand common biological mechanisms underlying some diseases. Common approaches test the association between genetic variants and multiple traits at the SNP level. In this paper, we propose a novel gene- and pathway-level approach for the case where several independent GWAS on independent traits are available. The method is based on a generalization of the sparse group Partial Least Squares (sgPLS) to take into account groups of variables, and a Lasso penalization that links all independent data sets. This method, called joint-sgPLS, is able to convincingly detect signal at the variable level and at the group level. Results Our method has the advantage of proposing a global, readable model while coping with the architecture of the data. It can outperform traditional methods and provides wider insight in terms of a priori information. We compared the performance of the proposed method to other benchmark methods on simulated data and gave an example of application on real data with the aim of highlighting common susceptibility variants for breast and thyroid cancers. Conclusion The joint-sgPLS shows interesting properties for detecting a signal. As an extension of the PLS, the method is suited for data with a large number of variables. The choice of Lasso penalization copes with architectures of groups of variables and observation sets. Furthermore, although the method has been applied to a genetic study, its formulation is suited to any data with a high number of variables and an exposed a priori architecture in other application fields.



2021 ◽  
Vol 11 (2) ◽  
pp. 582
Author(s):  
Zean Bu ◽  
Changku Sun ◽  
Peng Wang ◽  
Hang Dong

Calibration between multiple sensors is a fundamental procedure for data fusion. To address the problems of large errors and tedious operation, we present a novel method to conduct the calibration between light detection and ranging (LiDAR) and camera. We invent a calibration target, which is an arbitrary triangular pyramid with three chessboard patterns on its three planes. The target contains both 3D information and 2D information, which can be utilized to obtain intrinsic parameters of the camera and extrinsic parameters of the system. In the proposed method, the world coordinate system is established through the triangular pyramid. We extract the equations of triangular pyramid planes to find the relative transformation between two sensors. One capture of camera and LiDAR is sufficient for calibration, and errors are reduced by minimizing the distance between points and planes. Furthermore, the accuracy can be increased by more captures. We carried out experiments on simulated data with varying degrees of noise and numbers of frames. Finally, the calibration results were verified by real data through incremental validation and analyzing the root mean square error (RMSE), demonstrating that our calibration method is robust and provides state-of-the-art performance.
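
The error being minimized, the distance between points and planes, can be sketched in a few lines. The plane convention (normal . x + d = 0), the function names and the sample points are assumptions for illustration; the authors' actual optimization and target geometry are not reproduced here.

```python
import math


def point_plane_distance(point, normal, d):
    """Distance from a 3D point to the plane normal . x + d = 0.
    The normal does not need to be unit length."""
    norm = math.sqrt(sum(c * c for c in normal))
    return abs(sum(n * p for n, p in zip(normal, point)) + d) / norm


def plane_rmse(points, normal, d):
    """Root mean square of the point-to-plane distances."""
    squared = [point_plane_distance(p, normal, d) ** 2 for p in points]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical LiDAR returns scattered around the plane z = 0.
points = [(0.0, 0.0, 0.02), (1.0, 0.5, -0.01), (0.3, 2.0, 0.03)]
print(plane_rmse(points, normal=(0.0, 0.0, 1.0), d=0.0))
```

In a calibration loop, the points would first be mapped through the candidate LiDAR-to-camera transformation, and this RMSE would be the quantity driven toward zero.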



2021 ◽  
Vol 64 ◽  
pp. 173-175
Author(s):  
Salvatore Lucio Cutuli ◽  
Flavio De Maio ◽  
Gennaro De Pascale ◽  
Domenico Luca Grieco ◽  
Francesca Romana Monzo ◽  
...  


2021 ◽  
Vol 13 (5) ◽  
pp. 2426
Author(s):  
David Bienvenido-Huertas ◽  
Jesús A. Pulido-Arcas ◽  
Carlos Rubio-Bellido ◽  
Alexis Pérez-Fargallo

In recent times, studies about the accuracy of algorithms to predict different aspects of energy use in the building sector have flourished, with energy poverty being one of the issues that has received considerable critical attention. Previous studies in this field have characterized it using different indicators, but they have failed to develop instruments to predict the risk of low-income households falling into energy poverty. This research explores the way in which six regression algorithms can accurately forecast the risk of energy poverty by means of the fuel poverty potential risk index. Using data from the national survey of socioeconomic conditions of Chilean households and generating data for different typologies of social dwellings (e.g., form ratio or roof surface area), this study simulated 38,880 cases and compared the accuracy of six algorithms. Multilayer perceptron, M5P and support vector regression delivered the best accuracy, with correlation coefficients over 99.5%. In terms of computing time, M5P outperforms the rest. Although these results suggest that energy poverty can be accurately predicted using simulated data, it remains necessary to test the algorithms against real data. These results can be useful in devising policies to tackle energy poverty in advance.



Genetics ◽  
2003 ◽  
Vol 165 (4) ◽  
pp. 2269-2282
Author(s):  
D Mester ◽  
Y Ronin ◽  
D Minkov ◽  
E Nevo ◽  
A Korol

Abstract This article is devoted to the problem of ordering in linkage groups with many dozens or even hundreds of markers. The ordering problem belongs to the field of discrete optimization on a set of all possible orders, amounting to n!/2 for n loci; hence it is considered an NP-hard problem. Several authors attempted to employ the methods developed in the well-known traveling salesman problem (TSP) for multilocus ordering, using the assumption that for a set of linked loci the true order will be the one that minimizes the total length of the linkage group. A novel, fast, and reliable algorithm developed for the TSP and based on evolution-strategy discrete optimization was applied in this study for multilocus ordering on the basis of pairwise recombination frequencies. The quality of derived maps under various complications (dominant vs. codominant markers, marker misclassification, negative and positive interference, and missing data) was analyzed using simulated data with ∼50-400 markers. High performance of the employed algorithm allows systematic treatment of the problem of verification of the obtained multilocus orders on the basis of computing-intensive bootstrap and/or jackknife approaches for detecting and removing questionable marker scores, thereby stabilizing the resulting maps. Parallel calculation technology can easily be adopted for further acceleration of the proposed algorithm. Real data analysis (on maize chromosome 1 with 230 markers) is provided to illustrate the proposed methodology.
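
The ordering criterion, choosing the marker order that minimizes the total map length, can be shown by exhaustive search on a toy example. This brute-force version is not the evolution-strategy algorithm of the article; it only makes the n!/2 search space and the objective concrete. The function names and the four-marker distance matrix are hypothetical.

```python
from itertools import permutations


def candidate_orders(n):
    """All marker orders, counting an order and its reverse once (n!/2 in total)."""
    for order in permutations(range(n)):
        if order[0] < order[-1]:
            yield order


def map_length(order, dist):
    """Total map length: sum of pairwise distances between adjacent markers."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))


def best_order(dist):
    """Order with the shortest total length; feasible only for a few markers."""
    return min(candidate_orders(len(dist)), key=lambda order: map_length(order, dist))


# Hypothetical markers: distances derived from positions on a line,
# standing in for pairwise recombination frequencies.
positions = [0.0, 0.3, 0.1, 0.2]
dist = [[abs(a - b) for b in positions] for a in positions]
print(best_order(dist))
```

Already at 15 markers there are more than 600 billion candidate orders, which is why heuristic optimizers borrowed from the TSP are needed for maps with hundreds of markers.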


