biological insight
Recently Published Documents


Total documents: 159 (five years: 79)

H-index: 22 (five years: 4)

Author(s):  
Evan J. Giangrande ◽  
Ramona S. Weber ◽  
Eric Turkheimer

In the second half of the twentieth century, twin and family studies established beyond a reasonable doubt that all forms of psychopathology are substantially heritable and highly polygenic. These conclusions were simultaneously an important theoretical advance and a difficult methodological obstacle, as it became clear that heritability is universal and undifferentiated across forms of psychopathology, and the radical polygenicity of genetic effects limits the biological insight provided by genetically informed studies at the phenotypic level. The paradigm-shifting revolution brought on by the Human Genome Project has recapitulated the great methodological promise and the profound theoretical difficulties of the twin study era. We review these issues using the rubric of genetic architecture, which we define as a search for specific genetic insight that adds to the general conclusion that psychopathology is heritable and polygenic. Although significant problems remain, we see many promising avenues for progress. Expected final online publication date for the Annual Review of Clinical Psychology, Volume 18 is May 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


2021 ◽  
Vol 119 (1) ◽  
pp. e2109649118
Author(s):  
David H. Brookes ◽  
Amirali Aghazadeh ◽  
Jennifer Listgarten

Fitness functions map biological sequences to a scalar property of interest. Accurate estimation of these functions yields biological insight and sets the foundation for model-based sequence design. However, the fitness datasets available to learn these functions are typically small relative to the large combinatorial space of sequences; characterizing how much data are needed for accurate estimation remains an open problem. There is a growing body of evidence demonstrating that empirical fitness functions display substantial sparsity when represented in terms of epistatic interactions. Moreover, the theory of Compressed Sensing provides scaling laws for the number of samples required to exactly recover a sparse function. Motivated by these results, we develop a framework to study the sparsity of fitness functions sampled from a generalization of the NK model, a widely used random field model of fitness functions. In particular, we present results that allow us to test the effect of the Generalized NK (GNK) model’s interpretable parameters—sequence length, alphabet size, and assumed interactions between sequence positions—on the sparsity of fitness functions sampled from the model and, consequently, the number of measurements required to exactly recover these functions. We validate our framework by demonstrating that GNK models with parameters set according to structural considerations can be used to accurately approximate the number of samples required to recover two empirical protein fitness functions and an RNA fitness function. In addition, we show that these GNK models identify important higher-order epistatic interactions in the empirical fitness functions using only structural information.
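The link between interaction structure and sparsity can be illustrated with a toy calculation. The sketch below is not the authors' code. It assumes a binary alphabet, a standard NK model with adjacent neighborhoods, and the Walsh-Hadamard basis as the representation of epistatic interactions. All function names and parameter values (`length = 8`, `k = 3`) are hypothetical choices for illustration. It samples one fitness function and compares the number of nonzero epistatic coefficients with the number of position subsets that fit inside at least one neighborhood.

```python
import itertools

import numpy as np


def sample_nk_fitness(length, neighborhoods, rng):
    """Return fitness values for all 2**length binary sequences.

    Each position contributes a random lookup table indexed by the
    states of the positions in its neighborhood (the NK construction).
    """
    seqs = np.array(list(itertools.product([0, 1], repeat=length)))
    fitness = np.zeros(len(seqs))
    for nb in neighborhoods:
        table = rng.normal(size=2 ** len(nb))
        # Encode the bits at the neighborhood positions as a table index.
        idx = seqs[:, nb] @ (2 ** np.arange(len(nb)))
        fitness += table[idx]
    return fitness


def walsh_hadamard_coefficients(fitness):
    """Coefficients of a binary fitness function in the Walsh-Hadamard basis."""
    n = len(fitness)
    h = np.array([[1.0]])
    base = np.array([[1.0, 1.0], [1.0, -1.0]])
    while h.shape[0] < n:
        h = np.kron(h, base)  # Sylvester construction of the Hadamard matrix
    return h @ fitness / n


def allowed_interactions(neighborhoods):
    """All position subsets contained in at least one neighborhood."""
    subsets = set()
    for nb in neighborhoods:
        for r in range(len(nb) + 1):
            subsets.update(frozenset(c) for c in itertools.combinations(nb, r))
    return subsets


# Hypothetical toy parameters: 8 positions, neighborhoods of 3 adjacent sites.
length, k = 8, 3
neighborhoods = [[(i + j) % length for j in range(k)] for i in range(length)]
rng = np.random.default_rng(0)

fitness = sample_nk_fitness(length, neighborhoods, rng)
coefs = walsh_hadamard_coefficients(fitness)
n_nonzero = int(np.sum(np.abs(coefs) > 1e-9))
n_allowed = len(allowed_interactions(neighborhoods))
print(n_nonzero, n_allowed, 2 ** length)
```

In this toy setting the nonzero coefficients are confined to interactions among positions that share a neighborhood, so the function is sparse relative to the full space of 2^8 possible coefficients. Larger alphabets and the generalized neighborhoods of the GNK model are outside the scope of this sketch.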


2021 ◽  
Author(s):  
Isabell Bludau ◽  
Charlotte Nicod ◽  
Claudia Martelli ◽  
Peng Xue ◽  
Moritz Heusel ◽  
...  

Protein complexes constitute the primary functional modules of cellular activity. To respond to perturbations, complexes undergo changes in their abundance, subunit composition or state of modification. Understanding the function of biological systems requires global strategies to capture this contextual state information on protein complexes and interaction networks. Methods based on co-fractionation paired with mass spectrometry have demonstrated the capability for deep biological insight but the scope of studies using this approach has been limited by the large measurement time per biological sample and challenges with data analysis. As such, there has been little uptake of this strategy beyond a few expert labs into the broader life science community despite rich biological information content. We present a rapid integrated experimental and computational workflow to assess the re-organization of protein complexes across multiple cellular states. It enables complex experimental designs requiring increased sample/condition numbers. The workflow combines short gradient chromatography and DIA/SWATH mass spectrometry with a data analysis toolset to quantify changes in complex organization. We applied the workflow to study the global protein complex rearrangements of THP-1 cells undergoing monocyte to macrophage differentiation and a subsequent stimulation of macrophage cells with lipopolysaccharide. We observed massive proteome organization in functions related to signaling, cell adhesion, and extracellular matrix during differentiation, and less pronounced changes in processes related to innate immune response induced by the macrophage stimulation. We therefore establish our integrated differential pipeline for rapid and state-specific profiling of protein complex organization with broad utility in complex experimental designs.


2021 ◽  
Vol 5 (Supplement_1) ◽  
pp. 372-372
Author(s):  
Albert Higgins-Chen ◽  
Yaroslav Markov ◽  
Raghav Sehgal ◽  
Morgan Levine ◽  
Kyra Thrush

The current era of multi-omics data collection has enabled researchers to obtain exceptionally comprehensive profiling of disease subjects. However, exceptionally high dimensionality can ultimately be an obstacle to biological insight. Previously, we presented a method in which penalized regression of methylation principal components reduces noise and improves prediction of age, disease, and Alzheimer’s Disease (AD) pathophysiology. However, strictly linear methods may oversimplify the complex epigenetic aging landscape. We hypothesized that non-linear deep learning methods could identify molecular signatures that better reflect individual resilience to AD. Through the use of an autoencoder to represent high dimensional methylation array data, and supplemental machine learning methods, we connect latent nonlinear representations of the brain to aging, resilience, and indications of AD. In particular, resultant age-predicting representations of methylation were correlated with enrichment of methylation regions and biological pathways. Contextualized within AD pathology, this work provides valuable, ongoing insight into resilience in AD.
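The linear baseline mentioned in this abstract, penalized regression on methylation principal components, can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic Gaussian stand-ins for methylation values, ridge regression is used as a simple example of a penalty (the abstract does not say which penalty was used), and the function names, component count, and `alpha` are hypothetical.

```python
import numpy as np


def fit_pc_ridge(x_train, y_train, n_components=10, alpha=1.0):
    """Project onto the top principal components, then fit ridge regression."""
    mean = x_train.mean(axis=0)
    centered = x_train - mean
    # Rows of vt are the principal axes of the centered training data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    y_mean = y_train.mean()
    gram = scores.T @ scores + alpha * np.eye(n_components)
    coef = np.linalg.solve(gram, scores.T @ (y_train - y_mean))
    return mean, components, coef, y_mean


def predict_pc_ridge(model, x):
    """Apply the stored centering, projection, and coefficients to new data."""
    mean, components, coef, y_mean = model
    return (x - mean) @ components.T @ coef + y_mean


# Synthetic stand-in data: a few latent factors drive many correlated sites,
# and "age" depends on the first factor. None of this is real methylation data.
rng = np.random.default_rng(0)
n_samples, n_sites, n_latent = 200, 500, 5
latent = rng.normal(size=(n_samples, n_latent))
loadings = rng.normal(size=(n_latent, n_sites))
methylation = latent @ loadings + 0.5 * rng.normal(size=(n_samples, n_sites))
age = 50 + 10 * latent[:, 0] + 2 * rng.normal(size=n_samples)

train, test = slice(0, 150), slice(150, None)
model = fit_pc_ridge(methylation[train], age[train])
pred = predict_pc_ridge(model, methylation[test])
corr = np.corrcoef(pred, age[test])[0, 1]
print(f"held-out correlation: {corr:.2f}")
```

Working in component space shrinks the regression from hundreds of correlated features to a handful of orthogonal ones, which is the noise-reduction idea described above. A nonlinear autoencoder would replace the linear projection step with a learned encoder.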


2021 ◽  
Author(s):  
Luke Reilly ◽  
Lirong Peng ◽  
Erika Lara ◽  
Daniel Ramos ◽  
Michael Fernandopulle ◽  
...  

Fully automated proteomic pipelines have the potential to achieve deep coverage of cellular proteomes with high throughput and scalability. However, it is important to evaluate performance, including both reproducibility and ability to provide meaningful levels of biological insight. Here, we present an approach combining a high-field asymmetric waveform ion mobility spectrometry (FAIMS) interface with a data-independent acquisition (DIA) proteomics approach, developed as part of the induced pluripotent stem cell (iPSC) Neurodegenerative Disease Initiative (iNDI), a large-scale effort to understand how inherited diseases may manifest in neuronal cells. Our FAIMS-DIA approach identified more than 8000 proteins per mass spectrometry (MS) acquisition and showed superior total identification, reproducibility, and accuracy compared to other existing DIA methods. Next, we applied this approach to perform longitudinal proteomic profiling of the differentiation of iPSC-derived neurons from the KOLF2.1J parental line used in iNDI. This analysis demonstrated a steady increase in expression of mature cortical neuron markers over the course of neuron differentiation. We validated the performance of our proteomics pipeline by comparing it to single-cell RNA-Seq datasets obtained in parallel, confirming expression of key markers and cell type annotations. An interactive webapp of these temporal data is available for aligned-UMAP visualization and data browsing (https://share.streamlit.io/anant-droid/singlecellumap). In summary, we report an extensively optimized and validated proteomic pipeline that will be suitable for large-scale studies such as iNDI.


Author(s):  
Sean M. S. Hayes ◽  
Jeffrey R. Sachs ◽  
Carolyn R. Cho

Network inference is a valuable approach for gaining mechanistic insight from high-dimensional biological data. Existing methods for network inference focus on ranking all possible relations (edges) among all measured quantities (features), such as genes, proteins, and metabolites, which yields a dense network that is challenging to interpret. Identifying a sparse, interpretable network using these methods thus requires an error-prone thresholding step which compromises their performance. In this article we propose a new method, DEKER-NET, that addresses this limitation by directly identifying a sparse, interpretable network without thresholding, improving real-world performance. DEKER-NET uses a novel machine learning method for feature selection in an iterative framework for network inference. DEKER-NET is extremely flexible, handling linear and nonlinear relations while making no assumptions about the underlying distribution of data, and is suitable for categorical or continuous variables. We test our method on the Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge data, demonstrating that it can directly identify sparse, interpretable networks without thresholding while maintaining performance comparable to the hypothetical best-case thresholded network of other methods.


Inorganics ◽  
2021 ◽  
Vol 9 (11) ◽  
pp. 83
Author(s):  
Louis J. Delinois ◽  
Omar De León-Vélez ◽  
Adriana Vázquez-Medina ◽  
Alondra Vélez-Cabrera ◽  
Amanda Marrero-Sánchez ◽  
...  

The heme protein cytochrome c (Cyt c) plays pivotal roles in cellular life and death processes. In the respiratory chain of mitochondria, it serves as an electron transfer protein, contributing to the proliferation of healthy cells. In the cell cytoplasm, it activates intrinsic apoptosis to terminate damaged cells. Insight into these mechanisms and the associated physicochemical properties and biomolecular interactions of Cyt c informs the anticancer therapeutic potential of the protein, especially in its ability to subvert the current limitations of small molecule-based chemotherapy. In this review, we explore the development of Cyt c as an anticancer drug by identifying cancer types that would be receptive to the cytotoxicity of the protein and factors that can be fine-tuned to enhance its apoptotic potency. To this end, some information is obtained by characterizing known drugs that operate, in part, by triggering Cyt c-induced apoptosis. The application of different smart drug delivery systems is surveyed to highlight important features for maintaining Cyt c stability and activity and improving its specificity for cancer cells and high drug payload release while recognizing the continuing limitations. This work serves to elucidate the optimization of the strategies to translate Cyt c to the clinical market.


Author(s):  
Gaetano Valenza ◽  
Luca Faes ◽  
Nicola Toschi ◽  
Riccardo Barbieri

Recent developments in computational physiology have successfully exploited advanced signal processing and artificial intelligence tools for predicting or uncovering characteristic features of physiological and pathological states in humans. While these advanced tools have demonstrated excellent diagnostic capabilities, the high complexity of these computational 'black boxes' may severely limit scientific inference, especially in terms of biological insight about both physiology and pathological aberrations. This theme issue highlights current challenges and opportunities of advanced computational tools for processing dynamical data reflecting autonomic nervous system dynamics, with a specific focus on cardiovascular control physiology and pathology. This includes the development and adaptation of complex signal processing methods, multivariate cardiovascular models, multiscale and nonlinear models for central-peripheral dynamics, as well as deep and transfer learning algorithms applied to large datasets. The breadth of this perspective highlights the issues of specificity in heartbeat-related features and supports the need for an imminent transition from the black-box paradigm to explainable and personalized clinical models in cardiovascular research. This article is part of the theme issue 'Advanced computation in cardiovascular physiology: new challenges and opportunities'.


2021 ◽  
Author(s):  
Jarrett D Egertson ◽  
Dan DiPasquo ◽  
Alana Killeen ◽  
Vadim Lobanov ◽  
Sujal Patel ◽  
...  

The proteome is perhaps the most dynamic and valuable source of functional biological insight. Current proteomic techniques are limited in their sensitivity and throughput. A typical single experiment measures no more than 8% of the human proteome from blood or 35% from cells and tissues. Here, we introduce a theoretical framework for a fundamentally different approach to proteomics that we call Protein Identification by Short-epitope Mapping (PrISM). PrISM utilizes multi-affinity reagents to target short linear epitopes with both a high affinity and low specificity. PrISM further employs a novel protein decoding algorithm that considers the stochasticity expected for single-molecule binding. In simulations, PrISM is able to identify more than 98% of proteins across the proteomes of a wide range of organisms. PrISM is robust to potential experimental confounders including false negative detection events and noise. Simulations of the approach with a chip containing 10 billion protein molecules show a dynamic range of 11.5 and 9.5 orders of magnitude for blood plasma and HeLa cells, respectively. If implemented experimentally, PrISM stands to rapidly quantify over 90% of the human proteome in a single experiment, potentially revolutionizing proteomics research.

