Experimental Design and Data Generation

Author(s): Fernando Pérez-Rodríguez, Antonio Valero

2015, Vol 2015, pp. 1-8
Author(s): Andreas Friedrich, Erhan Kenar, Oliver Kohlbacher, Sven Nahnsen

Big data bioinformatics aims at drawing biological conclusions from huge and complex biological datasets. Added value from the analysis of big data, however, is only possible if the data is accompanied by accurate metadata annotation. Particularly in high-throughput experiments, intelligent approaches are needed to keep track of the experimental design, including the conditions that are studied as well as information that might be interesting for failure analysis or further experiments in the future. In addition to the management of this information, means for an integrated design and interfaces for structured data annotation are urgently needed by researchers. Here, we propose a factor-based experimental design approach that enables scientists to easily create large-scale experiments with the help of a web-based system. We present a novel implementation of a web-based interface allowing the collection of arbitrary metadata. To exchange and edit information, we provide a spreadsheet-based, human-readable format. Subsequently, sample sheets with identifiers and metainformation for data generation facilities can be created. Data files created after measurement of the samples can be uploaded to a datastore, where they are automatically linked to the previously created experimental design model.
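
As an illustration of the factor-based design idea described above, the short Python sketch below expands a set of hypothetical study factors into a full-factorial sample sheet with generated identifiers; the factor names, identifier scheme, and file name are illustrative assumptions, not the authors' implementation.

from itertools import product
import csv

# Hypothetical study factors; each factor maps to its levels (conditions).
factors = {
    "genotype": ["wild-type", "knockout"],
    "treatment": ["control", "drug"],
    "timepoint_h": [6, 24],
}

# Full-factorial expansion: every combination of factor levels becomes one sample.
names = list(factors)
rows = []
for i, levels in enumerate(product(*factors.values()), start=1):
    row = {"sample_id": f"S{i:04d}"}  # generated identifier (placeholder scheme)
    row.update(dict(zip(names, levels)))
    rows.append(row)

# Write a spreadsheet-style, tab-separated sample sheet for the data generation facility.
with open("sample_sheet.tsv", "w", newline="") as fh:
    writer = csv.DictWriter(fh, fieldnames=["sample_id"] + names, delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)

The resulting eight rows (2 × 2 × 2 conditions) can then be annotated with further metadata and linked to the data files produced for each sample.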


2018
Author(s): Harold Fellermann, Ben Shirt-Ediss, Jerzy W. Kozyra, Matthew Linsley, Dennis W. Lendrem, ...

Our PCR Simulator is a web-based application designed to introduce concepts of multi-factorial experimental design and support teaching of the polymerase chain reaction. Learners select experimental settings and receive results of their simulated reactions quickly, allowing rapid iteration between data generation and analysis. This enables the student to perform complex iterative experimental design strategies within a short teaching session. Here we provide a short overview of the user interface and underpinning model, and describe our experience using this tool in a teaching environment.
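
The abstract does not specify the underlying reaction model; as a rough illustration of the kind of relationship such a simulator can expose to students, the sketch below implements the idealized exponential amplification formula N = N0 · (1 + E)^c with per-cycle efficiency E (the function name and numbers are assumptions for illustration only).

def simulate_pcr(initial_copies: float, efficiency: float, cycles: int) -> float:
    """Idealized PCR yield: copy number grows by a factor of (1 + efficiency) per cycle.
    efficiency is the per-cycle amplification efficiency in [0, 1]; real reactions
    plateau as reagents are exhausted, which this simple model ignores."""
    return initial_copies * (1.0 + efficiency) ** cycles

# Example: 100 template copies, 90% efficiency, 30 cycles.
print(f"{simulate_pcr(100, 0.9, 30):.3e} copies")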


2017
Author(s): Matthew Amodio, David van Dijk, Krishnan Srinivasan, William S. Chen, Hussein Mohsen, ...

Abstract Biomedical researchers are generating high-throughput, high-dimensional single-cell data at a staggering rate. As costs of data generation decrease, experimental design is moving towards measurement of many different single-cell samples in the same dataset. These samples can correspond to different patients, conditions, or treatments. While scalability of methods to datasets of these sizes is a challenge on its own, dealing with large-scale experimental design presents a whole new set of problems, including batch effects and sample comparison issues. Currently, there are no computational tools that can both handle large amounts of data in a scalable manner (many cells) and at the same time deal with many samples (many patients or conditions). Moreover, data analysis currently involves the use of different tools that each operate on their own data representation, with no guarantee of a synchronized analysis pipeline. For instance, data visualization methods can be disjoint from, and mismatched with, the clustering method. To address this, we present SAUCIE, a deep neural network that leverages the high degree of parallelization and scalability offered by neural networks, as well as the deep representations of data they can learn, to perform many single-cell data analysis tasks on a unified representation.

A well-known limitation of neural networks is their lack of interpretability. Our key contribution here is a set of newly formulated regularizations (penalties) that render features learned in hidden layers of the neural network interpretable. When large multi-patient datasets are fed into SAUCIE, the various hidden layers contain denoised and batch-corrected data, a low-dimensional visualization, an unsupervised clustering, as well as other information that can be used to explore the data. We demonstrate this capability by analyzing a newly generated 180-sample dataset consisting of T cells from dengue patients in India, measured with mass cytometry. We show that SAUCIE, for the first time, can batch-correct and process these 11 million cells to identify cluster-based signatures of acute dengue infection and create a patient manifold, stratifying immune response to dengue on the basis of single-cell measurements.
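
The abstract describes SAUCIE as an autoencoder-style network whose hidden layers are regularized to expose interpretable structure. The PyTorch sketch below shows the general shape of such a model (a 2-D embedding layer plus a reconstruction loss with an added penalty term); it is an illustrative stand-in, not the published SAUCIE architecture or its actual regularizations.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AutoencoderSketch(nn.Module):
    # Toy autoencoder: the 2-D bottleneck can serve as a visualization layer,
    # and penalties on hidden activations stand in for interpretability regularizations.
    def __init__(self, n_markers: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_markers, 128), nn.ReLU(),
                                     nn.Linear(128, 2))           # embedding layer
        self.decoder = nn.Sequential(nn.Linear(2, 128), nn.ReLU(),
                                     nn.Linear(128, n_markers))   # reconstruction

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

model = AutoencoderSketch(n_markers=40)
x = torch.rand(256, 40)                       # stand-in for a batch of single-cell measurements
recon, z = model(x)
penalty = z.abs().mean()                      # placeholder regularization (not SAUCIE's)
loss = F.mse_loss(recon, x) + 0.1 * penalty   # reconstruction loss plus penalty
loss.backward()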


2019
Author(s): Angel G. Rivera-Colón, Nicolas C. Rochette, Julian M. Catchen

Abstract Restriction-site Associated DNA sequencing (RADseq) has become a powerful and versatile tool in modern population genomics, enabling large-scale genomic analyses in otherwise inaccessible biological systems. With its widespread use, different variants of the protocol have been developed to suit specific experimental needs. Researchers face the challenge of choosing the optimal molecular and sequencing protocols for their experimental design, an often-complicated process. Strategic errors can lead to improper data generation with reduced power to answer biological questions. Here we present RADinitio, simulation software for the selection and optimization of RADseq experiments via the generation of sequencing data that behaves similarly to empirical sources. RADinitio provides an evolutionary simulation of populations, implementation of various RADseq protocols with customizable parameters, and thorough assessment of missing data. We test the software's efficacy using different RAD protocols across several organisms, highlighting the importance of protocol selection for the magnitude and quality of data acquired. Additionally, we test the effects of RAD library preparation and sequencing on allelic dropout, observing that library preparation and sequencing often contribute more to missing alleles than population-level variation.
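
To make the notion of allelic dropout concrete, the sketch below estimates, under simple assumptions that are not RADinitio's actual model, how many restriction sites a random genome is expected to contain for one example enzyme, and the probability that a polymorphism inside the recognition site abolishes cutting on one allele.

CUT_SITE = "CCTGCAGG"   # SbfI recognition motif, used here only as an example enzyme

def expected_loci(genome_length: int, site: str = CUT_SITE) -> float:
    # Expected number of occurrences of the motif in a random genome of this length,
    # assuming equal base frequencies.
    return genome_length * (0.25 ** len(site))

def dropout_probability(pi: float, site_len: int = len(CUT_SITE)) -> float:
    # Probability that at least one base of the recognition site is polymorphic,
    # which would abolish cutting on that allele (population-level allelic dropout).
    return 1.0 - (1.0 - pi) ** site_len

print(f"~{expected_loci(1_000_000_000):.0f} expected SbfI loci in a 1 Gbp random genome")
print(f"dropout probability at nucleotide diversity pi = 0.01: {dropout_probability(0.01):.3f}")

As the abstract notes, dropout introduced during library preparation and sequencing can exceed this population-level component, which is precisely what simulation helps to quantify.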


Nanomaterials, 2020, Vol 10 (4), pp. 750
Author(s): Pia Anneli Sofia Kinaret, Angela Serra, Antonio Federico, Pekka Kohonen, Penny Nymark, ...

The starting point of successful hazard assessment is the generation of unbiased and trustworthy data. Conventional toxicity testing deals with extensive observations of phenotypic endpoints in vivo and complementing in vitro models. The increasing development of novel materials and chemical compounds dictates the need for a better understanding of the molecular changes occurring in exposed biological systems. Transcriptomics enables the exploration of organisms’ responses to environmental, chemical, and physical agents by observing the molecular alterations in more detail. Toxicogenomics (TGx) integrates classical toxicology with omics assays, thus allowing the characterization of the mechanism of action (MOA) of chemical compounds, novel small molecules, and engineered nanomaterials (ENMs). Lack of standardization in data generation and analysis currently hampers the full exploitation of toxicogenomics-based evidence in risk assessment. To fill this gap, TGx methods need to take into account appropriate experimental design and possible pitfalls in the transcriptomic analyses, as well as data generation and sharing that adhere to the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. In this review, we summarize the recent advancements in the design and analysis of DNA microarray, RNA sequencing (RNA-Seq), and single-cell RNA-Seq (scRNA-Seq) data. We provide guidelines on exposure time, dose and complex endpoint selection, sample quality considerations, and sample randomization. Furthermore, we summarize publicly available data resources and highlight applications of TGx data to understand and predict chemical toxicity potential. Additionally, we discuss the efforts to implement TGx into regulatory decision making to promote alternative methods for risk assessment and to support the 3R (reduction, refinement, and replacement) concept. This review is the first part of a three-article series on Transcriptomics in Toxicogenomics. These initial considerations on Experimental Design, Technologies, Publicly Available Data, and Regulatory Aspects are the starting point for the rigorous and reliable data preprocessing and modeling described in the second and third parts of the review series.
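
One of the design guidelines mentioned above, sample randomization, can be illustrated with a short Python sketch (the dose groups, replicate counts, and batch size are hypothetical): shuffling samples before assigning them to processing batches prevents a dose group from being confounded with a single batch.

import random

# Hypothetical exposure groups with four replicates each.
samples = [f"{dose}_{rep}" for dose in ("low", "mid", "high") for rep in range(1, 5)]

random.seed(7)            # fixed seed so the allocation is documented and reproducible
random.shuffle(samples)

batch_size = 4
batches = [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]
for b, members in enumerate(batches, start=1):
    print(f"batch {b}: {members}")   # dose groups are interleaved rather than processed together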


2018, Vol 41
Author(s): Wei Ji Ma

Abstract Given the many types of suboptimality in perception, I ask how one should test for multiple forms of suboptimality at the same time – or, more generally, how one should compare process models that can differ in any or all of their multiple components. In analogy to factorial experimental design, I advocate for factorial model comparison.
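
As a concrete reading of this proposal, a factorial model family can be generated as the Cartesian product of alternative choices for each model component; the components below are hypothetical examples, not the specific models discussed in the target article.

from itertools import product

# Hypothetical model components, each with alternative forms (the "factors").
components = {
    "encoding_noise": ["constant", "stimulus-dependent"],
    "decision_rule": ["optimal", "heuristic"],
    "lapse_rate": ["absent", "present"],
}

# The factorial family: one candidate model per combination of component choices.
model_family = [dict(zip(components, choice)) for choice in product(*components.values())]
for spec in model_family:
    print(spec)   # in practice each specification would be fitted and compared (e.g., by AIC)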


2019, Vol 42
Author(s): J. Alfredo Blakeley-Ruiz, Carlee S. McClintock, Ralph Lydic, Helen A. Baghdoyan, James J. Choo, ...

Abstract The Hooks et al. review of microbiota-gut-brain (MGB) literature provides a constructive criticism of the general approaches encompassing MGB research. This commentary extends their review by: (a) highlighting capabilities of advanced systems-biology “-omics” techniques for microbiome research and (b) recommending that combining these high-resolution techniques with intervention-based experimental design may be the path forward for future MGB research.


1978, Vol 48, pp. 7-29
Author(s): T. E. Lutz

This review paper deals with the use of statistical methods to evaluate systematic and random errors associated with trigonometric parallaxes. First, systematic errors which arise when using trigonometric parallaxes to calibrate luminosity systems are discussed. Next, determination of the external errors of parallax measurement is reviewed. Observatory corrections are discussed. Schilt’s point, that the computed corrections cannot be applied appropriately because the causes of these systematic differences between observatories are not known, is emphasized. However, modern parallax work is sufficiently accurate that it is necessary to determine observatory corrections if full use is to be made of the potential precision of the data. To this end, it is suggested that a prior experimental design is required. Past experience has shown that accidental overlap of observing programs will not suffice to determine observatory corrections that are meaningful.


2011, Vol 20 (4), pp. 109-113
Author(s): Karen Copple, Rajinder Koul, Devender Banda, Ellen Frye

Abstract One of the instructional techniques reported in the literature to teach communication skills to persons with autism is video modeling (VM). VM is a form of observational learning that involves watching and imitating the desired target behavior(s) exhibited by the person on the videotape. VM has been used to teach a variety of social and communicative behaviors to persons with developmental disabilities such as autism. In this paper, we describe the VM technique and summarize the results of two single-subject experimental design studies that investigated the acquisition of spontaneous requesting skills using a speech-generating device (SGD) by persons with autism following a VM intervention. The results of these two studies indicate that a VM treatment package that includes an SGD as one of its components can be effective in facilitating communication in individuals with autism who have little or no functional speech.


2014, Vol 73 (4), pp. 243-248
Author(s): Annick Darioly, Ronald E. Riggio

This study examines how applicants who are relatives of the company’s executives are perceived when they are being considered for a leadership position. In a 2 (Family ties: with vs. without) × 2 (Applicant qualifications: well-qualified vs. underqualified) experimental design, 165 Swiss employees read the applicant’s job application and evaluated the hiring decision, the perceived competence, and the perceived career progress of the target employee. This research showed that even a well-qualified potential employee received a more negative evaluation if the candidate had family ties to the company. Despite their negative evaluation of potential nepotistic hires, the participants nevertheless believed that family ties would boost the career progress of an underqualified applicant. Limitations and implications are discussed.

