Perspectives on Allele-Specific Expression

Author(s):  
Siobhan Cleary ◽  
Cathal Seoighe

Diploidy has profound implications for population genetics and susceptibility to genetic diseases. Although two copies are present for most genes in the human genome, they are not necessarily both active or active at the same level in a given individual. Genomic imprinting, resulting in exclusive or biased expression in favor of the allele of paternal or maternal origin, is now believed to affect hundreds of human genes. A far greater number of genes display unequal expression of gene copies due to cis-acting genetic variants that perturb gene expression. The availability of data generated by RNA sequencing applied to large numbers of individuals and tissue types has generated unprecedented opportunities to assess the contribution of genetic variation to allelic imbalance in gene expression. Here we review the insights gained through the analysis of these data about the extent of the genetic contribution to allelic expression imbalance, the tools and statistical models for gene expression imbalance, and what the results obtained reveal about the contribution of genetic variants that alter gene expression to complex human diseases and phenotypes. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

Author(s):  
Bethany Percha

Electronic health records (EHRs) are becoming a vital source of data for healthcare quality improvement, research, and operations. However, much of the most valuable information contained in EHRs remains buried in unstructured text. The field of clinical text mining has advanced rapidly in recent years, transitioning from rule-based approaches to machine learning and, more recently, deep learning. With new methods come new challenges, however, especially for those new to the field. This review provides an overview of clinical text mining for those who are encountering it for the first time (e.g., physician researchers, operational analytics teams, machine learning scientists from other domains). While not a comprehensive survey, this review describes the state of the art, with a particular focus on new tasks and methods developed over the past few years. It also identifies key barriers between these remarkable technical advances and the practical realities of implementation in health systems and in industry. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


2021 ◽  
Vol 39 (1) ◽  
Author(s):  
Qian Zhang ◽  
Xuetao Cao

The innate immune response is a rapid response to pathogens or danger signals. It is precisely activated not only to efficiently eliminate pathogens but also to avoid excessive inflammation and tissue damage. cis-Regulatory element–associated chromatin architecture shaped by epigenetic factors, which we define as the epiregulome, endows innate immune cells with specialized phenotypes and unique functions by establishing cell-specific gene expression patterns, and it also contributes to resolution of the inflammatory response. In this review, we focus on two aspects: (a) how niche signals during lineage commitment or following infection and pathogenic stress program epiregulomes by regulating gene expression levels, enzymatic activities, or gene-specific targeting of chromatin modifiers and (b) how the programed epiregulomes in turn mediate regulation of gene-specific expression, which contributes to controlling the development of innate cells, or the response to infection and inflammation, in a timely manner. We also discuss the effects of innate immunometabolic rewiring on epiregulomes and speculate on several future challenges to be encountered during the exploration of the master regulators of epiregulomes in innate immunity and inflammation. Expected final online publication date for the Annual Review of Immunology, Volume 39 is April 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s):  
George-John Nychas ◽  
Emma Sims ◽  
Panagiotis Tsakanikas ◽  
Fady Mohareb

Food safety is one of the main challenges of the agri-food industry that is expected to be addressed in the current environment of tremendous technological progress, where consumers’ lifestyles and preferences are in a constant state of flux. Food chain transparency and trust are drivers for food integrity control and for improvements in efficiency and economic growth. Similarly, the circular economy has great potential to reduce wastage and improve the efficiency of operations in multi-stakeholder ecosystems. Throughout the food chain cycle, all food commodities are exposed to multiple hazards, resulting in a high likelihood of contamination. Such biological or chemical hazards may be naturally present at any stage of food production, whether accidentally introduced or fraudulently imposed, risking consumers’ health and their faith in the food industry. Nowadays, a massive amount of data is generated, not only from the next generation of food safety monitoring systems and along the entire food chain (primary production included) but also from the internet of things, media, and other devices. These data should be used for the benefit of society, and the scientific field of data science should be a vital player in helping to make this possible. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s):  
Jingqi Chen ◽  
Guiying Dong ◽  
Liting Song ◽  
Xingzhong Zhao ◽  
Jixin Cao ◽  
...  

The accumulation of vast amounts of multimodal data for the human brain, in both normal and disease conditions, has provided unprecedented opportunities for understanding why and how brain disorders arise. Compared with traditional analyses of single datasets, the integration of multimodal datasets covering different types of data (i.e., genomics, transcriptomics, imaging, etc.) has shed light on the mechanisms underlying brain disorders in greater detail across both the microscopic and macroscopic levels. In this review, we first briefly introduce the popular large datasets for the brain. Then, we discuss in detail how integration of multimodal human brain datasets can reveal the genetic predispositions and the abnormal molecular pathways of brain disorders. Finally, we present an outlook on how future data integration efforts may advance the diagnosis and treatment of brain disorders. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s):  
Jonathan Li ◽  
Ernest Fraenkel

Induced pluripotent stem cell (iPSC) technology holds promise for modeling neurodegenerative diseases. Traditional approaches for disease modeling using animal and cellular models require knowledge of disease mutations. However, many patients with neurodegenerative diseases do not have a known genetic cause. iPSCs offer a way to generate patient-specific models and study pathways of dysfunction in an in vitro setting in order to understand the causes and subtypes of neurodegeneration. Furthermore, iPSC-based models can be used to search for candidate therapeutics using high-throughput screening. Here we review how iPSC-based models are currently being used to further our understanding of neurodegenerative diseases, as well as discuss their challenges and future directions. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s):  
Yancong Zhang ◽  
Kelsey N. Thompson ◽  
Tobyn Branck ◽  
Yan Yan ◽  
Long H. Nguyen ◽  
...  

Shotgun metatranscriptomics (MTX) is an increasingly practical way to survey microbial community gene function and regulation at scale. This review begins by summarizing the motivations for community transcriptomics and the history of the field. We then explore the principles, best practices, and challenges of contemporary MTX workflows: beginning with laboratory methods for isolation and sequencing of community RNA, followed by informatics methods for quantifying RNA features, and finally statistical methods for detecting differential expression in a community context. In the second half of the review, we survey important biological findings from the MTX literature, drawing examples from the human microbiome, other (nonhuman) host-associated microbiomes, and the environment. Across these examples, MTX methods prove invaluable for probing microbe–microbe and host–microbe interactions, the dynamics of energy harvest and chemical cycling, and responses to environmental stresses. We conclude with a review of open challenges in the MTX field, including making assays and analyses more robust, accessible, and adaptable to new technologies; deciphering roles for millions of uncharacterized microbial transcripts; and solving applied problems such as biomarker discovery and development of microbial therapeutics. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s):  
Andrea Castro ◽  
Maurizio Zanetti ◽  
Hannah Carter

Next-generation sequencing technologies have revolutionized our ability to catalog the landscape of somatic mutations in tumor genomes. These mutations can sometimes create so-called neoantigens, which allow the immune system to detect and eliminate tumor cells. However, efforts that stimulate the immune system to eliminate tumors based on their molecular differences have had less success than has been hoped for, and there are conflicting reports about the role of neoantigens in the success of this approach. Here we review some of the conflicting evidence in the literature and highlight key aspects of the tumor–immune interface that are emerging as major determinants of whether mutation-derived neoantigens will contribute to an immunotherapy response. Accounting for these factors is expected to improve success rates of future immunotherapy approaches. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s):  
Angela Oliveira Pisco ◽  
Bruno Tojo ◽  
Aaron McGeever

Cell atlases are essential companions to the genome as they elucidate how genes are used in a cell type–specific manner or how the usage of genes changes over the lifetime of an organism. This review explores recent advances in whole-organism single-cell atlases, which enable understanding of cell heterogeneity and tissue and cell fate, both in health and disease. Here we provide an overview of recent efforts to build cell atlases across species and discuss the challenges that the field is currently facing. Moreover, we propose the concept of having a knowledgebase that can scale with the number of experiments and computational approaches and a new feedback loop for development and benchmarking of computational methods that includes contributions from the users. These two aspects are key for community efforts in single-cell biology that will help produce a comprehensive annotated map of cell types and states with unparalleled resolution. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s):  
Qingyu Chen ◽  
Robert Leaman ◽  
Alexis Allot ◽  
Ling Luo ◽  
Chih-Hsuan Wei ◽  
...  

The COVID-19 (coronavirus disease 2019) pandemic has had a significant impact on society, both because of the serious health effects of COVID-19 and because of public health measures implemented to slow its spread. Many of these difficulties are fundamentally information needs; attempts to address these needs have caused an information overload for both researchers and the public. Natural language processing (NLP)—the branch of artificial intelligence that interprets human language—can be applied to address many of the information needs made urgent by the COVID-19 pandemic. This review surveys approximately 150 NLP studies and more than 50 systems and datasets addressing the COVID-19 pandemic. We detail work on four core NLP tasks: information retrieval, named entity recognition, literature-based discovery, and question answering. We also describe work that directly addresses aspects of the pandemic through four additional tasks: topic modeling, sentiment and emotion analysis, caseload forecasting, and misinformation detection. We conclude by discussing observable trends and remaining challenges. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


Author(s):  
Tracey Holloway ◽  
Daegan Miller ◽  
Susan Anenberg ◽  
Minghui Diao ◽  
Bryan Duncan ◽  
...  

Data from satellite instruments provide estimates of gas and particle levels relevant to human health, even pollutants invisible to the human eye. However, the successful interpretation of satellite data requires an understanding of how satellites relate to other data sources, as well as factors affecting their application to health challenges. Drawing from the expertise and experience of the 2016–2020 NASA HAQAST (Health and Air Quality Applied Sciences Team), we present a review of satellite data for air quality and health applications. We include a discussion of satellite data for epidemiological studies and health impact assessments, as well as the use of satellite data to evaluate air quality trends, support air quality regulation, characterize smoke from wildfires, and quantify emission sources. The primary advantage of satellite data compared to in situ measurements, e.g., from air quality monitoring stations, is their spatial coverage. Satellite data can reveal where pollution levels are highest around the world, how levels have changed over daily to decadal periods, and where pollutants are transported from urban to global scales. To date, air quality and health applications have primarily utilized satellite observations and satellite-derived products relevant to near-surface particulate matter <2.5 μm in diameter (PM2.5) and nitrogen dioxide (NO2). Health and air quality communities have grown increasingly engaged in the use of satellite data, and this trend is expected to continue. From health researchers to air quality managers, and from global applications to community impacts, satellite data are transforming the way air pollution exposure is evaluated. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 4 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

