DBSIM: A Platform of Simulation Resources for Genetic Epidemiology Studies

2015 ◽  
Author(s):  
Po-Ju Yao ◽  
Ren-Hua Chung

Computer simulations are routinely conducted to evaluate new statistical methods, to compare the properties of different methods, and to mimic real data in genetic epidemiology studies. Conducting simulation studies can become a complicated task, as several challenges arise, such as the selection of an appropriate simulation tool and the specification of parameters in the simulation model. Although abundant simulated data have been generated for human genetic research, there is currently no public database designed specifically as a repository for these simulated data. In the absence of such a database, similar simulations may have been repeated for similar studies, resulting in redundant work. We created an online platform, DBSIM, for sharing simulation data and discussing simulation techniques for human genetic studies. DBSIM has a database containing simulation scripts, simulated data, and documentation from published manuscripts, as well as a discussion forum for discussing the simulated data and exchanging simulation ideas. DBSIM will be useful in three aspects. Moreover, summary statistics, such as the simulation tools that are most commonly used and the datasets that are most frequently downloaded, are provided. These statistics help researchers choose an appropriate simulation tool or select a common dataset for method comparisons. DBSIM can be accessed at http://dbsim.nhri.org.tw.

2017 ◽  
Vol 8 (4) ◽  
pp. 41 ◽  
Author(s):  
Anjana P Das ◽  
Sabu M Thampi

In underwater sensor network (UWSN) research, it is highly expensive to deploy a complete test bed, involving a complex network structure and data links, to validate a network protocol or an algorithm. This practical challenge points to the need for a simulation environment that can reproduce the actual underwater scenario without loss of generality. Because many simulators have been proposed for UWSN simulation, selecting an appropriate tool for the research requirements is very important for the validation and interpretation of results. This paper provides an in-depth survey of the different simulation tools available for UWSN simulation. We compare the features offered by each tool and their prerequisites, and report run-time experiences with some of the open-source tools. We also simulated sample scenarios in some of the open-source tools and compared the results. This survey helps researchers identify a simulation tool that satisfies their specific research requirements.


Author(s):  
Aysenur Toptan ◽  
Nathan W. Porter ◽  
Jason D. Hales ◽  
Benjamin W. Spencer ◽  
Martin Pilch ◽  
...  

When establishing the pedigree of a simulation tool, code verification is used to ensure that the implemented numerical algorithm is a faithful representation of its underlying mathematical model. During this process, numerical results on various meshes are systematically compared to a reference analytic solution. The selection of analytic solutions can be a laborious process, as it is difficult to establish adequate code confidence without performing redundant work. Here, we address this issue by applying a physics-based process that establishes a set of reference problems. In this process, code simulation options are categorized and systematically tested, which ensures that gaps in testing are easily identified and addressed. The resulting problems are primarily intended for code verification analysis but may also be useful for comparison to other simulation codes, troubleshooting activities, or training exercises. The process is used to select fifteen code verification problems relevant to the one-dimensional steady-state heat conduction equation. These problems are applicable to a wide variety of simulation tools, but, in this work, a demonstration is performed using the finite element-based nuclear fuel performance code BISON. Convergence to the analytic solution at the theoretical rate is quantified for a selection of the problems, which establishes a baseline pedigree for the code. Not only can this standard set of conduction solutions be used for verification of other codes, but also the physics-based process for selecting problems can be utilized to quantify and expand testing for any simulation tool.
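To make the convergence measurement concrete, here is a minimal sketch in R (it does not use BISON and is not one of the paper's fifteen problems): a second-order finite-difference solver for -k u''(x) = sin(pi x) on [0, 1] with homogeneous boundary conditions is compared against its analytic solution, and the observed order of convergence is estimated from errors on successively refined meshes. The equation, source term, and mesh sizes are assumptions chosen for illustration.

```r
# Sketch of a convergence check against an analytic solution (not BISON):
# solve -k u''(x) = sin(pi x) on [0, 1] with u(0) = u(1) = 0 using
# second-order central differences; the exact solution is
# u(x) = sin(pi x) / (k pi^2).
solve_heat_1d <- function(n, k = 1) {
  h <- 1 / (n + 1)                          # mesh spacing, n interior nodes
  x <- seq(h, 1 - h, length.out = n)        # interior node coordinates
  A <- matrix(0, n, n)                      # tridiagonal operator for -k u''
  diag(A) <- 2 * k / h^2
  A[cbind(1:(n - 1), 2:n)] <- -k / h^2      # superdiagonal
  A[cbind(2:n, 1:(n - 1))] <- -k / h^2      # subdiagonal
  list(x = x, u = solve(A, sin(pi * x)), h = h)
}

l2_error <- function(n, k = 1) {
  s <- solve_heat_1d(n, k)
  sqrt(s$h * sum((s$u - sin(pi * s$x) / (k * pi^2))^2))   # discrete L2 error
}

ns     <- c(16, 32, 64, 128)
errors <- sapply(ns, l2_error)
hs     <- 1 / (ns + 1)
rates  <- diff(log(errors)) / diff(log(hs))
round(rates, 3)   # should approach 2, the theoretical order of this scheme
```

If the observed rates deviate systematically from the theoretical order, that is the signal code verification is designed to catch.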


2016 ◽  
Vol 2016 ◽  
pp. 1-9 ◽  
Author(s):  
Jingyu Guo ◽  
Hongliang Qi ◽  
Yuan Xu ◽  
Zijia Chen ◽  
Shulong Li ◽  
...  

Limited-angle computed tomography (CT) has a great impact on some clinical applications. Existing iterative reconstruction algorithms often fail to reconstruct high-quality images, leading to severe artifacts near edges. The choice of initial image influences iterative reconstruction performance but has not yet been studied in depth. In this work, we propose generating an optimized initial image that exploits image symmetry, followed by total variation (TV) based iterative reconstruction. Reconstruction results on both simulated and real data indicate that the proposed method effectively removes artifacts near edges.
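The paper's exact initialization and solver are not reproduced in the abstract, so the R sketch below only illustrates two ingredients it mentions, under stated assumptions: an initial image symmetrized by mirroring across an assumed vertical symmetry axis, and a simple anisotropic total-variation penalty of the kind a TV-based reconstruction minimizes alongside data fidelity. All function names and the toy image are hypothetical.

```r
# Illustrative only: not the authors' pipeline.
symmetrize_initial <- function(img) {
  # Average a rough first-pass reconstruction with its mirror image across
  # the assumed symmetry axis (here: left-right mirroring of columns).
  (img + img[, ncol(img):1]) / 2
}

total_variation <- function(img) {
  # Anisotropic TV: sum of absolute differences between neighbouring pixels.
  dx <- img[, -1] - img[, -ncol(img)]
  dy <- img[-1, ] - img[-nrow(img), ]
  sum(abs(dx)) + sum(abs(dy))
}

# Toy example: a symmetric object with noise corrupting one half of the image
set.seed(3)
rough <- matrix(0, 64, 64)
rough[21:44, 21:44] <- 1
rough[, 33:64] <- rough[, 33:64] + matrix(rnorm(64 * 32, sd = 0.3), 64, 32)
init <- symmetrize_initial(rough)
c(tv_rough = total_variation(rough), tv_init = total_variation(init))
```

The symmetrized image has a lower TV penalty and fewer one-sided artifacts, which is the intuition behind using symmetry to build a better starting point for the iterations.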


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5852
Author(s):  
Yu-Yu Lin ◽  
Ping Chun Wu ◽  
Pei-Lung Chen ◽  
Yen-Jen Oyang ◽  
Chien-Yu Chen

Background The need for read-based phasing arises with advances in sequencing technologies. The minimum error correction (MEC) approach is the prevailing strategy for resolving haplotypes by reducing conflicts in a single-nucleotide-polymorphism (SNP) fragment matrix. However, the solution with the optimal MEC score frequently does not correspond to the real haplotypes, because MEC methods consider all positions together and conflicts in noisy regions can mislead the selection of corrections. To tackle this problem, we present a hierarchical assembly-based method designed to progressively resolve local conflicts. Results This study presents HAHap, a new phasing algorithm based on hierarchical assembly. HAHap leverages high-confidence variant pairs to build haplotypes progressively. Compared with other MEC-based methods, HAHap achieved lower phasing error rates on both real and simulated data when constructing haplotypes from whole-genome sequencing short reads. We also compared the number of error corrections (ECs) on real data with other methods, showing that HAHap predicts haplotypes with fewer ECs. In addition, we used simulated data to investigate the behavior of HAHap under different sequencing conditions, highlighting its applicability in certain situations.
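As background to the objective discussed above, here is a small R sketch of the MEC score itself (this is not HAHap's hierarchical assembly, just the quantity MEC-based methods optimize): rows of the SNP-fragment matrix are reads, columns are heterozygous sites, entries are 0/1 alleles with NA where a read does not cover a site, and the score is the minimum number of entries that must be corrected so every read matches one of the two candidate haplotypes. The toy matrix is an assumption for illustration.

```r
# MEC score of a candidate haplotype pair against a SNP-fragment matrix.
mec_score <- function(fragments, h1) {
  h2 <- 1 - h1  # complementary haplotype, assuming biallelic 0/1 coding
  per_read <- apply(fragments, 1, function(read) {
    covered <- !is.na(read)
    min(sum(read[covered] != h1[covered]),   # corrections if read comes from h1
        sum(read[covered] != h2[covered]))   # corrections if read comes from h2
  })
  sum(per_read)
}

# Toy example: 4 reads over 5 heterozygous sites
frags <- rbind(c(0,  0,  1, NA, NA),
               c(NA, 0,  1,  1, NA),
               c(1,  1, NA, NA,  0),
               c(NA, NA, 0,  0,  1))
mec_score(frags, h1 = c(0, 0, 1, 1, 0))   # lower is better under the MEC model
```

The abstract's point is that the haplotype pair minimizing this global count is not always the true one, which motivates resolving conflicts locally and progressively instead.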


2019 ◽  
Vol 17 (2) ◽  
Author(s):  
M. Tanwir Akhtar ◽  
Athar Ali Khan

Reliability data are generated in the form of success/failure outcomes. An attempt was made to model this type of data with a binomial distribution in the Bayesian paradigm. Both analytic and simulation techniques are used to fit the Bayesian model. The Laplace approximation was implemented to approximate the posterior densities of the model parameters. Parallel simulation tools were implemented with extensive use of R and JAGS. The R and JAGS code is developed and provided, and real data sets are used for illustration.
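The authors' R and JAGS code is not included in the abstract; the following is a minimal illustrative sketch of a Laplace approximation for binomial success/failure data, with a vague normal prior on the logit of the success probability. The data values and prior are assumptions made only for the example.

```r
# Minimal Laplace-approximation sketch (not the authors' code):
# posterior of theta = logit(p) for x successes out of n trials.
x <- 17; n <- 20                       # illustrative success/failure data

log_post <- function(theta) {
  p <- plogis(theta)                   # inverse logit
  dbinom(x, n, p, log = TRUE) + dnorm(theta, 0, 10, log = TRUE)
}

# Posterior mode and curvature from numerical optimisation
fit <- optim(0, function(t) -log_post(t), method = "BFGS", hessian = TRUE)
theta_hat <- fit$par
theta_sd  <- sqrt(1 / fit$hessian[1, 1])

# Laplace (normal) approximation, reported on the probability scale
cat("posterior mode of p:", round(plogis(theta_hat), 3), "\n")
cat("approx. 95% interval:",
    round(plogis(theta_hat + c(-1.96, 1.96) * theta_sd), 3), "\n")
```

A JAGS model for the same likelihood and prior would provide the simulation-based counterpart that the paper compares against this analytic approximation.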


2020 ◽  
Vol 12 (5) ◽  
pp. 771 ◽  
Author(s):  
Miguel Angel Ortíz-Barrios ◽  
Ian Cleland ◽  
Chris Nugent ◽  
Pablo Pancardo ◽  
Eric Järpe ◽  
...  

Automatic detection and recognition of Activities of Daily Living (ADL) are crucial for providing effective care to frail older adults living alone. A step forward in addressing this challenge is the deployment of smart home sensors capturing the intrinsic nature of ADLs performed by these people. As the real-life scenario is characterized by a comprehensive range of ADLs and smart home layouts, deviations are expected in the number of sensor events per activity (SEPA), a variable often used for training activity recognition models. Such models, however, rely on the availability of suitable and representative data, whose collection is habitually expensive and resource-intensive. Simulation tools are an alternative for tackling these barriers; nonetheless, an ongoing challenge is their ability to generate synthetic data representing the real SEPA. Hence, this paper proposes the use of Poisson regression modelling for transforming simulated data into a better approximation of real SEPA. First, synthetic and real data were compared to verify the equivalence hypothesis. Then, several Poisson regression models were formulated for estimating real SEPA using simulated data. The outcomes revealed that real SEPA can be better approximated (R^2_pred = 92.72%) if synthetic data are post-processed through Poisson regression incorporating dummy variables.
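A hedged sketch of the modelling step in R: the variable names (sim_sepa, activity, layout, real_sepa) are hypothetical stand-ins for the paper's variables, and the toy data exist only so the example runs end to end. It shows how a Poisson regression with dummy-coded factors can map simulated SEPA counts to predictions of real SEPA.

```r
# Hypothetical example of Poisson regression with dummy variables for SEPA.
set.seed(1)
toy <- data.frame(
  sim_sepa = rpois(120, lambda = 20),
  activity = factor(sample(c("cooking", "bathing", "sleeping"), 120, TRUE)),
  layout   = factor(sample(c("A", "B"), 120, TRUE))
)
# "Real" counts generated here only so the sketch is self-contained
toy$real_sepa <- rpois(120, lambda = exp(0.5 + 0.04 * toy$sim_sepa +
                                         0.3 * (toy$activity == "cooking")))

# Factors expand automatically into dummy variables in the model matrix
fit <- glm(real_sepa ~ sim_sepa + activity + layout,
           family = poisson(link = "log"), data = toy)
summary(fit)

# Post-processing step: predicted counts approximate the real SEPA
toy$sepa_hat <- predict(fit, type = "response")
```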


2021 ◽  
Vol 14 (1) ◽  
pp. 86-100
Author(s):  
Aleksei A. Korneev ◽  
Anatoly N. Krichevets ◽  
Konstantin V. Sugonyaev ◽  
Dmitriy V. Ushakov ◽  
Alexander G. Vinogradov ◽  
...  

Background. Spearman’s law of diminishing returns (SLODR) states that intercorrelations between scores on tests of intellectual abilities are higher when the data set comprises subjects with lower intellectual abilities, and vice versa. After almost a hundred years of research, this trend has only been detected on average. Objective. To determine whether the widely varying results obtained are due to variations in scaling and in the selection of subjects. Design. We used three methods for SLODR detection based on moderated factor analysis (MFCA) to test real data and three sets of simulated data. Of the latter group, the first set simulated a real SLODR effect. The second simulated the case of a different density of tasks of varying difficulty; it did not have a real SLODR effect. The third simulated a skewed selection of respondents with different abilities and also did not have a real SLODR effect. We selected the simulation parameters so that the correlation matrix of the simulated data was similar to the matrix created from the real data, and all distributions had similar skewness parameters (about -0.3). Results. The results of MFCA are contradictory: by this method we cannot clearly distinguish the dataset with a real SLODR effect from datasets with a similar correlation structure and skewness but without a real SLODR effect. The results allow us to conclude that when effects like SLODR are very subtle and can be identified only with a large sample, the features of the psychometric scale become very important, because small variations in scale metrics may lead either to masking of a real SLODR effect or to false identification of SLODR.
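For intuition only, the R sketch below simulates a SLODR-like pattern by letting general-factor loadings weaken with ability and then compares average intercorrelations in low- and high-ability subgroups. This is not the moderated factor analysis used in the study, and all parameters are illustrative.

```r
# Illustrative simulation of a SLODR-like pattern (not the authors' MFCA):
# scores load on a general factor more weakly at high ability, so
# intercorrelations shrink in the high-ability subgroup.
set.seed(42)
n <- 5000
g <- rnorm(n)                                   # latent general ability
loading <- 0.9 - 0.3 * plogis(g)                # weaker loadings at high g
scores <- sapply(1:6, function(j) loading * g + sqrt(1 - loading^2) * rnorm(n))

mean_offdiag <- function(m) { r <- cor(m); mean(r[lower.tri(r)]) }
low <- g < median(g)
c(low_ability  = mean_offdiag(scores[low, ]),
  high_ability = mean_offdiag(scores[!low, ]))  # lower in the high-ability half
```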


2021 ◽  
Author(s):  
Kira Villiers ◽  
Eric Dinglasan ◽  
Ben J. Hayes ◽  
Kai P. Voss-Fels

Simulation tools are key to designing and optimising breeding programs, which are many-year, high-effort endeavours. Tools that operate on real genotypes and integrate easily with other analysis software are needed so that users can incorporate simulated data into their analysis and decision-making processes. This paper presents genomicSimulation, a fast and flexible tool for the stochastic simulation of crossing and selection on real genotypes. It is fully written in C for high execution speed, has minimal dependencies, and is available as an R package for integration with R's broad range of analysis and visualisation tools. Comparison of a simulated recreation of a breeding program with the real data shows that the tool's simulated offspring correctly reproduce key population features. Both versions of genomicSimulation are freely available on GitHub: the R package version at https://github.com/vllrs/genomicSimulation/ and the C library version at https://github.com/vllrs/genomicSimulationC
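The genomicSimulation API itself is not shown in the abstract, so the sketch below uses plain R (not the package's functions) to illustrate the kind of stochastic crossing such a tool performs on phased genotypes, assuming a Haldane-style model in which crossover counts are Poisson in the map length. All names, marker positions, and parental genotypes are hypothetical.

```r
# Conceptual sketch of stochastic crossing on phased genotypes; NOT the
# genomicSimulation API. 'positions' are marker positions in Morgans on one
# chromosome; each parent is a 2 x n matrix of phased alleles.
make_gamete <- function(parent, positions) {
  n <- length(positions)
  n_xo <- rpois(1, lambda = max(positions) - min(positions))  # crossover count
  xo_sites <- sort(runif(n_xo, min(positions), max(positions)))
  # Which parental strand each marker is copied from, switching at crossovers
  strand <- 1 + (findInterval(positions, xo_sites) + sample(0:1, 1)) %% 2
  parent[cbind(strand, 1:n)]
}

cross <- function(p1, p2, positions) {
  rbind(make_gamete(p1, positions), make_gamete(p2, positions))
}

# Toy example: 10 biallelic (0/1) markers on a 1-Morgan chromosome
pos <- seq(0, 1, length.out = 10)
p1  <- rbind(rep(0, 10), rep(1, 10))
p2  <- rbind(sample(0:1, 10, TRUE), sample(0:1, 10, TRUE))
offspring <- cross(p1, p2, pos)
offspring
```

Selection can then be layered on top by scoring simulated offspring (for example, with marker effects) and retaining the best candidates as parents of the next cycle.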


1991 ◽  
Vol 158 (5) ◽  
pp. 615-623 ◽  
Author(s):  
Peter Jones ◽  
Robin M. Murray

Genes are now accepted as being important in the aetiology of schizophrenia (Gottesman & Shields, 1982; McGuffin et al., 1987), and over the past decade the emphasis in genetic research has shifted away from genetic epidemiology to searching the chromosomal DNA for the genes themselves. Despite this increasing technical sophistication, the application of linkage analysis to families multiply affected by schizophrenia has been accompanied by the familiar controversy over the exact borders of the adult clinical phenotype (Sherrington et al., 1988; St Clair et al., 1989). Indeed, the preoccupation of researchers with the vagaries of the clinical definition has resulted in repeated attempts to use genetic studies to determine the relative validity of different operational definitions of schizophrenia (McGuffin et al., 1984; Farmer et al., 1987). To us, such studies beg the question of how precisely genes are involved in the aetiology of schizophrenia; after all, genes code for proteins, not for auditory hallucinations in the third person.


Metabolites ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 214
Author(s):  
Aneta Sawikowska ◽  
Anna Piasecka ◽  
Piotr Kachlicki ◽  
Paweł Krajewski

Peak overlapping is a common problem in chromatography, mainly in the case of complex biological mixtures such as mixtures of metabolites. Because different compounds with similar chromatographic properties co-elute, peak separation becomes challenging. In this paper, two computational methods of separating peaks, applied for the first time to large chromatographic datasets, are described, compared, and experimentally validated. The methods lead from raw observations to data that can form inputs for statistical analysis. First, in both methods, the data are normalized by sample mass, the baseline is removed, retention times are aligned, and peaks are detected. Then, in the first method, clustering is used to separate overlapping peaks, whereas in the second method, functional principal component analysis (FPCA) is applied for the same purpose. Simulated data and experimental results are used as examples to present both methods and to compare them. The real data were obtained in a study of metabolomic changes in barley (Hordeum vulgare) leaves under drought stress. The results suggest that both methods are suitable for the separation of overlapping peaks, but an additional advantage of FPCA is the possibility of assessing the variability of individual compounds present within the same peaks of different chromatograms.
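As a rough illustration of the FPCA idea (not the authors' pipeline): when chromatogram segments are sampled on a common retention-time grid, FPCA can be approximated by a PCA of the centred intensity matrix, and the leading components capture the co-elution structure within an overlapping-peak region. The simulated segments below, two co-eluting Gaussian peaks with varying amounts plus noise, are purely illustrative.

```r
# FPCA approximated by PCA on curves sampled on a common grid (sketch only).
set.seed(7)
grid <- seq(0, 1, length.out = 200)                  # retention-time grid
peak <- function(center, width) dnorm(grid, center, width)

# 30 chromatogram segments: two overlapping compounds with varying amounts
amounts  <- cbind(runif(30, 0.5, 2), runif(30, 0.5, 2))
segments <- amounts %*% rbind(peak(0.45, 0.05), peak(0.55, 0.05)) +
  matrix(rnorm(30 * 200, sd = 0.05), 30, 200)

pca <- prcomp(segments, center = TRUE, scale. = FALSE)
summary(pca)$importance[, 1:3]   # leading components explain the co-elution
head(pca$x[, 1:2])               # scores: per-chromatogram contribution pattern
```

The per-chromatogram scores are what make it possible to assess the variability of individual compounds hidden within the same overlapping peak, the advantage of FPCA noted in the abstract.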

