Applications of single-cell genomics and computational strategies to study common disease and population-level variation

2021 ◽  
Vol 31 (10) ◽  
pp. 1728-1741 ◽  
Author(s):  
Benjamin J. Auerbach ◽  
Jian Hu ◽  
Muredach P. Reilly ◽  
Mingyao Li

The advent and rapid development of single-cell technologies have made it possible to study cellular heterogeneity at an unprecedented resolution and scale. Cellular heterogeneity underlies phenotypic differences among individuals, and studying it is an important step toward understanding the molecular mechanisms of disease. Single-cell technologies offer opportunities to characterize cellular heterogeneity from different angles, but linking cellular heterogeneity with disease phenotypes requires careful computational analysis. In this article, we review the current applications of single-cell methods in human disease studies and describe what existing studies have taught us so far about human genetic variation. As single-cell technologies become widely applicable in human disease studies, population-level studies have become a reality. We describe how to design these studies, particularly how to select study subjects, how to determine the number of cells to sequence per subject, and how to choose the sequencing depth per cell. We also discuss computational strategies for the analysis of single-cell data and describe how single-cell data can be integrated with bulk tissue data and data generated from genome-wide association studies. Finally, we point out open problems and future research directions.

2016 ◽  
Author(s):  
Florent Chuffart ◽  
Magali Richard ◽  
Daniel Jost ◽  
Helene Duplus-Bottin ◽  
Yoshikazu Ohya ◽  
...  

Despite the recent progress in sequencing technologies, genome-wide association studies (GWAS) remain limited by a statistical-power issue: many polymorphisms contribute little to common trait variation and therefore escape detection. The small contribution sometimes corresponds to incomplete penetrance, which may result from probabilistic effects on molecular regulation. In such cases, genetic mapping may benefit from the wealth of data produced by single-cell technologies. We present here the development of a novel genetic mapping method that scans genomes for single-cell Probabilistic Trait Loci, which modify the statistical properties of cellular-level quantitative traits. Phenotypic values are acquired on thousands of individual cells, and genetic association is obtained from a multivariate analysis of a matrix of Kantorovich distances. No prior assumption is required on the mode of action of the genetic loci involved and, by exploiting all single-cell values, the method can reveal non-deterministic effects. Using both simulations and yeast experimental datasets, we show that it can detect linkages that are missed by classical genetic mapping. A probabilistic effect of a single SNP on cell shape was detected and validated. The method also detected a novel locus associated with elevated gene expression noise of the yeast galactose regulon. Our results illustrate how single-cell technologies can be exploited to improve the genetic dissection of certain common traits.
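As a minimal sketch of the distance-matrix step described above, and not the authors' implementation: for two equally sized one-dimensional samples of a single-cell trait, the Kantorovich (Wasserstein-1) distance reduces to the mean absolute difference between sorted values. The function names, the equal-sample-size restriction, and the toy trait values are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def kantorovich_1d(x: Sequence[float], y: Sequence[float]) -> float:
    """Wasserstein-1 distance between two equal-size 1-D samples.

    With equal sample sizes, the optimal transport plan pairs the i-th
    smallest value of x with the i-th smallest value of y.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("this sketch assumes non-empty samples of equal size")
    xs, ys = sorted(x), sorted(y)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


def distance_matrix(samples: Dict[str, Sequence[float]]) -> List[List[float]]:
    """Pairwise Kantorovich distances, in the insertion order of `samples`."""
    names = list(samples)
    return [[kantorovich_1d(samples[a], samples[b]) for b in names] for a in names]


# Hypothetical single-cell trait values for three strains (toy numbers).
cells = {
    "strain_A": [1.0, 2.0, 3.0, 4.0],
    "strain_B": [2.0, 3.0, 4.0, 5.0],  # same shape as A, shifted by 1
    "strain_C": [0.0, 2.0, 3.0, 5.0],  # same mean as A, wider spread
}
D = distance_matrix(cells)
```

Strains A and C share the same mean but differ in spread, so a comparison of means would see no difference while their distance is non-zero. In the method described, a matrix like `D` would then enter a multivariate analysis against genotype; that step is not shown here.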


Author(s):  
Kerina H Jones ◽  
Arron S Lacey ◽  
Brian L Perkins ◽  
Mark I Rees

Abstract
Objectives: Data safe havens can bring together and combine a rich array of anonymised person-based data for research and policy evaluation within a secure setting. To date, the majority of available datasets have been structured micro-data derived from routine health-related records. Possibilities are opening up for the greater reuse of genomic data such as Genome-Wide Association Studies (GWAS) and Whole Exome/Genome Sequencing (WES or WGS). However, there are considerable challenges to be addressed if the benefits of using these data in combination with health-related data are to be realized safely.
Approach: We explore the benefits and challenges of using genomic datasets with health-related data and, using the Secure Anonymised Information Linkage (SAIL) system as a case study, the implications and way forward for Data Safe Havens seeking to incorporate genomic data for use with health-related data.
Results: The benefits of using GWAS, WES and WGS data in conjunction with health-related data include the potential to explore genetics at a population level and open up novel research areas. These include the ability to increasingly stratify and personalize how medical indications are detected and treated through precision medicine by understanding rare conditions and adding socioeconomic and environmental context to genomic data. Among the challenges are: data availability, computing capacity, technical solutions, legal and regulatory frameworks, public perceptions, individual privacy and organizational risk. Many of the challenges within these areas are common to person-based data in general, and often Data Safe Havens have been designed to address these. But there are also aspects of these challenges, and other challenges, specific to genomic data. These include issues due to the unknown clinical significance of genomic information now or in the future, with corresponding risks for privacy and impact on individuals.
Conclusion: Genomic data sets contain vast amounts of valuable information, some of which is currently undefined, but which may have direct bearing on individual health at some point. The use of these data in combination with health-related data has the potential to bring great benefits, better clinical trial stratification, epidemiology project design and clinical improvements. It is, therefore, essential that such data are surrounded by a properly designed, robust governance framework including technical and procedural access controls that enable the data to be used safely.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Natalie Stanley ◽  
Ina A. Stelzer ◽  
Amy S. Tsai ◽  
Ramin Fallahzadeh ◽  
Edward Ganio ◽  
...  

2019 ◽  
Vol 35 (14) ◽  
pp. i427-i435 ◽  
Author(s):  
Héctor Climente-González ◽  
Chloé-Agathe Azencott ◽  
Samuel Kaski ◽  
Makoto Yamada

Abstract
Motivation: Finding non-linear relationships between biomolecules and a biological outcome is computationally expensive and statistically challenging. Existing methods have important drawbacks, including among others lack of parsimony, non-convexity and computational overhead. Here we propose block HSIC Lasso, a non-linear feature selector that does not present the previous drawbacks.
Results: We compare block HSIC Lasso to other state-of-the-art feature selection techniques in both synthetic and real data, including experiments over three common types of genomic data: gene-expression microarrays, single-cell RNA sequencing and genome-wide association studies. In all cases, we observe that features selected by block HSIC Lasso retain more information about the underlying biology than those selected by other techniques. As a proof of concept, we applied block HSIC Lasso to a single-cell RNA sequencing experiment on mouse hippocampus. We discovered that many genes linked in the past to brain development and function are involved in the biological differences between the types of neurons.
Availability and implementation: Block HSIC Lasso is implemented in the Python 2/3 package pyHSICLasso, available on PyPI. Source code is available on GitHub (https://github.com/riken-aip/pyHSICLasso).
Supplementary information: Supplementary data are available at Bioinformatics online.
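To make the dependence measure behind HSIC Lasso concrete, here is a sketch of the empirical Hilbert-Schmidt Independence Criterion (HSIC) with Gaussian kernels, written in plain Python. It is not the pyHSICLasso API, and it omits the Lasso and block steps entirely; the function names, the fixed bandwidth `sigma`, the `(n - 1)^2` normalization (some references use `n^2`), and the toy data are all assumptions.

```python
import math
from typing import List, Sequence


def gaussian_gram(v: Sequence[float], sigma: float = 1.0) -> List[List[float]]:
    """Gram matrix of a Gaussian kernel over scalar observations."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in v] for a in v]


def center(k: List[List[float]]) -> List[List[float]]:
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 11^T."""
    n = len(k)
    row = [sum(r) / n for r in k]
    col = [sum(k[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[k[i][j] - row[i] - col[j] + grand for j in range(n)] for i in range(n)]


def hsic(x: Sequence[float], y: Sequence[float], sigma: float = 1.0) -> float:
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    kc = center(gaussian_gram(x, sigma))
    lc = center(gaussian_gram(y, sigma))
    # trace(Kc Lc) equals the sum of element-wise products for symmetric matrices.
    total = sum(kc[i][j] * lc[i][j] for i in range(n) for j in range(n))
    return total / (n - 1) ** 2


# Toy data: y depends on x non-linearly; their linear correlation is zero by symmetry.
x = [float(v) for v in range(-4, 5)]
y_quadratic = [v * v for v in x]
y_constant = [1.0] * len(x)  # a constant carries no information about x

hsic_dependent = hsic(x, y_quadratic)
hsic_independent = hsic(x, y_constant)
```

A feature selector built on this idea would score each candidate feature by its dependence on the outcome while penalizing redundancy between selected features; the sketch only shows the scoring ingredient.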


2020 ◽  
Vol 9 (4) ◽  
pp. 1096
Author(s):  
Jessica Gambardella ◽  
Angela Lombardi ◽  
Marco Bruno Morelli ◽  
John Ferrara ◽  
Gaetano Santulli

Inositol 1,4,5-trisphosphate receptors (ITPRs) are intracellular calcium release channels located on the endoplasmic reticulum of virtually every cell. Herein, we report an updated systematic summary of the current knowledge on the functional role of ITPRs in human disorders. Specifically, we describe the involvement of their loss-of-function and gain-of-function mutations in the pathogenesis of neurological, immunological, cardiovascular, and neoplastic human disease. Recent results from genome-wide association studies are also discussed.


2004 ◽  
Vol 68 (3) ◽  
pp. 538-559 ◽  
Author(s):  
Byron F. Brehm-Stecher ◽  
Eric A. Johnson

Summary: The field of microbiology has traditionally been concerned with and focused on studies at the population level. Information on how cells respond to their environment, interact with each other, or undergo complex processes such as cellular differentiation or gene expression has been obtained mostly by inference from population-level data. Individual microorganisms, even those in supposedly “clonal” populations, may differ widely from each other in terms of their genetic composition, physiology, biochemistry, or behavior. This genetic and phenotypic heterogeneity has important practical consequences for a number of human interests, including antibiotic or biocide resistance, the productivity and stability of industrial fermentations, the efficacy of food preservatives, and the potential of pathogens to cause disease. New appreciation of the importance of cellular heterogeneity, coupled with recent advances in technology, has driven the development of new tools and techniques for the study of individual microbial cells. Because observations made at the single-cell level are not subject to the “averaging” effects characteristic of bulk-phase, population-level methods, they offer the unique capacity to observe discrete microbiological phenomena unavailable using traditional approaches. As a result, scientists have been able to characterize microorganisms, their activities, and their interactions at unprecedented levels of detail.


2018 ◽  
Author(s):  
Paul W. Hook ◽  
Andrew S. McCallion

Genome-wide association studies have implicated thousands of non-coding variants across human phenotypes. However, they cannot directly inform the cellular context in which disease-associated variants act. Here, we use open chromatin profiles from discrete mouse cell populations to address this challenge. We applied stratified linkage disequilibrium score regression and evaluated heritability enrichment in 64 genome-wide association studies, emphasizing schizophrenia. We provide evidence that mouse-derived human open chromatin profiles can serve as powerful proxies for difficult-to-obtain human cell populations, facilitating the illumination of common disease heritability enrichment across an array of human phenotypes. We demonstrate that signatures from discrete subpopulations of cortical excitatory and inhibitory neurons are significantly enriched for schizophrenia heritability, with maximal enrichment in discrete cortical layer V excitatory neurons. We also show that differences between schizophrenia and bipolar disorder are concentrated in excitatory neurons in layers II-III, IV, and V, as well as the dentate gyrus. Finally, we use these data to fine-map variants in 177 schizophrenia loci, nominating variants in 104/177 loci, and place them in the cellular context where they may modulate risk.


2017 ◽  
Author(s):  
Alexandros Onoufriadis ◽  
Kristina Stone ◽  
Antreas Katsiamides ◽  
Ariella Amar ◽  
Yasmin Omar ◽  
...  

Abstract
Background and aims: Although genome-wide association studies (GWAS) in inflammatory bowel disease (IBD) have identified a large number of common disease susceptibility alleles for both Crohn’s disease (CD) and ulcerative colitis (UC), a substantial fraction of IBD heritability remains unexplained, suggesting that rare coding genetic variants may also have a role in pathogenesis. We used high-throughput sequencing in families with multiple cases of IBD, followed by genotyping of cases and controls, to investigate whether rare protein-altering genetic variants are associated with susceptibility to IBD.
Methods: Whole exome sequencing was carried out in 10 families in which 3 or more individuals were affected with IBD. A stepwise filtering approach was applied to exome variants to identify potential causal variants. Follow-up genotyping was performed in 6,025 IBD cases (2,948 CD; 3,077 UC) and 7,238 controls.
Results: Our exome variant analysis revealed coding variants in the NLRP7 gene that were present in affected individuals in two distinct families. Genotyping of the two variants, p.S361L and p.R801H, in IBD cases and controls showed that the p.S361L variant was significantly associated with an increased risk of ulcerative colitis (odds ratio 4.79, p=0.0039) and IBD (odds ratio 3.17, p=0.037). A combined analysis of both variants showed suggestive association with an increased risk of IBD (odds ratio 2.77, p=0.018).
Conclusions: The results suggest that NLRP7 signalling and inflammasome formation may be a significant component in the pathogenesis of IBD.
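The odds ratios reported above come from case-control genotype counts that the abstract does not give. As a sketch of the arithmetic only, the following computes an odds ratio and a Woolf (log-based) 95% confidence interval from a 2x2 table. The function name and the counts are hypothetical and do not reproduce the study's figures or its statistical test.

```python
import math
from typing import Tuple


def odds_ratio_ci(
    case_carriers: int,
    case_noncarriers: int,
    control_carriers: int,
    control_noncarriers: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    The interval uses the Woolf method: a normal approximation on the
    log odds ratio, with z = 1.96 for an approximate 95% interval.
    """
    a, b, c, d = case_carriers, case_noncarriers, control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("this sketch requires all four cell counts to be positive")
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts: 20 of 1,000 cases and 10 of 1,000 controls carry the variant.
estimate, lower, upper = odds_ratio_ci(20, 980, 10, 990)
```

With rare variants the carrier counts are small, so the interval is wide even when the point estimate looks large; an exact test is often preferred in that setting.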

