Single-Cell Analysis for Whole-Organism Datasets

Author(s):  
Angela Oliveira Pisco ◽  
Bruno Tojo ◽  
Aaron McGeever

Cell atlases are essential companions to the genome as they elucidate how genes are used in a cell type–specific manner or how the usage of genes changes over the lifetime of an organism. This review explores recent advances in whole-organism single-cell atlases, which enable understanding of cell heterogeneity and tissue and cell fate, both in health and disease. Here we provide an overview of recent efforts to build cell atlases across species and discuss the challenges that the field is currently facing. Moreover, we propose the concept of having a knowledgebase that can scale with the number of experiments and computational approaches and a new feedback loop for development and benchmarking of computational methods that includes contributions from the users. These two aspects are key for community efforts in single-cell biology that will help produce a comprehensive annotated map of cell types and states with unparalleled resolution. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

2019 ◽  
Vol 12 (1) ◽  
pp. 411-430 ◽  
Author(s):  
Pratip K. Chattopadhyay ◽  
Aidan F. Winters ◽  
Woodrow E. Lomas ◽  
Andressa S. Laino ◽  
David M. Woods

Thousands of transcripts and proteins confer function and discriminate cell types in the body. Using high-parameter technologies, we can now measure many of these markers at once, and multiple platforms are now capable of analysis on a cell-by-cell basis. Three high-parameter single-cell technologies have particular potential for discovering new biomarkers, revealing disease mechanisms, and increasing our fundamental understanding of cell biology. We review these three platforms (high-parameter flow cytometry, mass cytometry, and a new class of technologies called integrated molecular cytometry platforms) in this article. We describe the underlying hardware and instrumentation, the reagents involved, and the limitations and advantages of each platform. We also highlight the emerging field of high-parameter single-cell data analysis, providing an accessible overview of the data analysis process and choice of tools.


Author(s):  
Bethany Percha

Electronic health records (EHRs) are becoming a vital source of data for healthcare quality improvement, research, and operations. However, much of the most valuable information contained in EHRs remains buried in unstructured text. The field of clinical text mining has advanced rapidly in recent years, transitioning from rule-based approaches to machine learning and, more recently, deep learning. With new methods come new challenges, however, especially for those new to the field. This review provides an overview of clinical text mining for those who are encountering it for the first time (e.g., physician researchers, operational analytics teams, machine learning scientists from other domains). While not a comprehensive survey, this review describes the state of the art, with a particular focus on new tasks and methods developed over the past few years. It also identifies key barriers between these remarkable technical advances and the practical realities of implementation in health systems and in industry. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s):  
Siobhan Cleary ◽  
Cathal Seoighe

Diploidy has profound implications for population genetics and susceptibility to genetic diseases. Although two copies are present for most genes in the human genome, they are not necessarily both active or active at the same level in a given individual. Genomic imprinting, resulting in exclusive or biased expression in favor of the allele of paternal or maternal origin, is now believed to affect hundreds of human genes. A far greater number of genes display unequal expression of gene copies due to cis-acting genetic variants that perturb gene expression. The availability of data generated by RNA sequencing applied to large numbers of individuals and tissue types has generated unprecedented opportunities to assess the contribution of genetic variation to allelic imbalance in gene expression. Here we review the insights gained through the analysis of these data about the extent of the genetic contribution to allelic expression imbalance, the tools and statistical models for gene expression imbalance, and what the results obtained reveal about the contribution of genetic variants that alter gene expression to complex human diseases and phenotypes. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s):  
George-John Nychas ◽  
Emma Sims ◽  
Panagiotis Tsakanikas ◽  
Fady Mohareb

Food safety is one of the main challenges of the agri-food industry that is expected to be addressed in the current environment of tremendous technological progress, where consumers’ lifestyles and preferences are in a constant state of flux. Food chain transparency and trust are drivers for food integrity control and for improvements in efficiency and economic growth. Similarly, the circular economy has great potential to reduce wastage and improve the efficiency of operations in multi-stakeholder ecosystems. Throughout the food chain cycle, all food commodities are exposed to multiple hazards, resulting in a high likelihood of contamination. Such biological or chemical hazards may be naturally present at any stage of food production, whether accidentally introduced or fraudulently imposed, risking consumers’ health and their faith in the food industry. Nowadays, a massive amount of data is generated, not only from the next generation of food safety monitoring systems and along the entire food chain (primary production included) but also from the internet of things, media, and other devices. These data should be used for the benefit of society, and the scientific field of data science should be a vital player in helping to make this possible. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


2020 ◽  
Author(s):  
Konner M. Winkley ◽  
Wendy M. Reeves ◽  
Michael T. Veeman

Inductive signaling interactions between different cell types are a major mechanism for the further diversification of embryonic cell fates. Most blastomeres in the model chordate Ciona robusta become restricted to a single predominant fate between the 64-cell and mid-gastrula stages. We used single-cell RNAseq spanning this period to identify 53 distinct cell states, 25 of which are dependent on a MAPK-mediated signal critical to early Ciona patterning. Divergent gene expression between newly bifurcated sibling cell types is dominated by upregulation in the induced cell type. These upregulated genes typically include numerous transcription factors and not just one or two key regulators. The Ets family transcription factor Elk1/3/4 is upregulated in almost all the putatively direct inductions, indicating that it may act in an FGF-dependent feedback loop. We examine several bifurcations in detail and find support for a ‘broad-hourglass’ model of cell fate specification in which many genes are induced in parallel to key tissue-specific transcriptional regulators via the same set of transcriptional inputs.


Author(s):  
Jingqi Chen ◽  
Guiying Dong ◽  
Liting Song ◽  
Xingzhong Zhao ◽  
Jixin Cao ◽  
...  

The accumulation of vast amounts of multimodal data for the human brain, in both normal and disease conditions, has provided unprecedented opportunities for understanding why and how brain disorders arise. Compared with traditional analyses of single datasets, the integration of multimodal datasets covering different types of data (i.e., genomics, transcriptomics, imaging, etc.) has shed light on the mechanisms underlying brain disorders in greater detail across both the microscopic and macroscopic levels. In this review, we first briefly introduce the popular large datasets for the brain. Then, we discuss in detail how integration of multimodal human brain datasets can reveal the genetic predispositions and the abnormal molecular pathways of brain disorders. Finally, we present an outlook on how future data integration efforts may advance the diagnosis and treatment of brain disorders. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


2019 ◽  
Author(s):  
David Laehnemann ◽  
Johannes Köster ◽  
Ewa Szczurek ◽  
Davis J McCarthy ◽  
Stephanie C Hicks ◽  
...  

The recent upswing of microfluidics and combinatorial indexing strategies, further enhanced by very low sequencing costs, has turned single cell sequencing into an empowering technology; analyzing thousands, or even millions, of cells per experimental run is becoming a routine assignment in laboratories worldwide. As a consequence, we are witnessing a data revolution in single cell biology. Although some issues are similar in spirit to those experienced in bulk sequencing, many of the emerging data science problems are unique to single cell analysis; together, they give rise to the new realm of 'Single-Cell Data Science'. Here, we outline twelve challenges that will be central in bringing this new field forward. For each challenge, the current state of the art in terms of prior work is reviewed, and open problems are formulated, with an emphasis on the research goals that motivate them. This compendium is meant to serve as a guideline for established researchers, newcomers and students alike, highlighting interesting and rewarding problems in 'Single-Cell Data Science' for the coming years.


Author(s):  
Jonathan Li ◽  
Ernest Fraenkel

Induced pluripotent stem cell (iPSC) technology holds promise for modeling neurodegenerative diseases. Traditional approaches for disease modeling using animal and cellular models require knowledge of disease mutations. However, many patients with neurodegenerative diseases do not have a known genetic cause. iPSCs offer a way to generate patient-specific models and study pathways of dysfunction in an in vitro setting in order to understand the causes and subtypes of neurodegeneration. Furthermore, iPSC-based models can be used to search for candidate therapeutics using high-throughput screening. Here we review how iPSC-based models are currently being used to further our understanding of neurodegenerative diseases, as well as discuss their challenges and future directions. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


2017 ◽  
Author(s):  
Wenfa Ng

Single-cell studies increasingly reveal myriad cellular subtypes beyond those postulated or observed through optical and fluorescence microscopy as well as DNA sequencing studies. While gene sequencing at the single-cell level offers a path towards illuminating, in totality, the different subtypes of cells present, the technique does not offer answers concerning the functional repertoire of the cell, which is defined by the collection of RNA transcribed from the genome. Known as the transcriptome, transcribed RNA defines the function of the cell as proteins or effector RNA molecules, while the genome is the collection of all information endowed in the cell type, expressed or not. Thus, a particular cell state, lineage, cell fate or cellular differentiation is more fully depicted by transcriptomic analysis than by delineating the genomic context at the single-cell level. While this approach is conceptually sound and can be carried out with contemporary single-cell RNA sequencing technology and data analysis pipelines, the relative instability of RNA in the presence of environmental RNases makes sample preparation particularly challenging: degradation of cellular RNA by extraneous factors could lead to misinterpretation of the specific functions available to a cell type. Hence, because RNA is the de facto functional molecule of the cell, defining the proteomic landscape as well as the effector RNA repertoire, RNA transcriptomics at the single-cell level is the way forward if the goal is to understand all available cell types, lineages, cell fates and cellular differentiation. Given that a cell state is defined by the functions encoded by functional molecules such as proteins and RNA, single-cell RNA sequencing offers a larger contextual basis for understanding cellular decision making and functions; for example, proteins are increasingly known to work in concert with RNA effector molecules in enabling a function.
Hence, in providing a view of the diverse cell types and lineages present in a body, single-cell RNA sequencing is hampered only by the high sensitivity required to analyse the small amount of RNA available in single cells, and by the perennial problem of RNA studies: how to prevent or reduce RNA degradation by environmental RNase enzymes. The ability to reduce RNA degradation would give the cell biologist a unique view of the functional landscape of different cells in the body through the language of RNA.

