Frustration and Fidelity in Influenza Genome Assembly

2019 ◽  
Author(s):  
Nida Farheen ◽  
Mukund Thattai

The genome of the influenza virus consists of eight distinct single-stranded RNA segments, each encoding proteins essential for the viral life cycle. When the virus infects a host cell, these segments must be replicated and packaged into new budding virions. The viral genome is assembled with remarkably high fidelity: experiments reveal that most virions contain precisely one copy of each of the eight RNA segments. Cell-biological studies suggest that genome assembly is mediated by specific reversible and irreversible interactions between the RNA segments and their associated proteins. However, the precise inter-segment interaction network remains unresolved. Here we computationally predict that tree-like irreversible interaction networks guarantee high-fidelity genome assembly, while cyclic interaction networks lead to futile or frustrated off-pathway products. We test our prediction against multiple experimental datasets. We find that tree-like networks capture the nearest-neighbor statistics of RNA segments in packaged virions, as observed by EM tomography. Just eight tree-like networks (of a possible 262,144) optimally capture both the nearest-neighbor data and independently measured RNA-RNA contact propensities. These eight do not include the previously proposed hub-and-spoke and linear networks. Rather, each predicted network combines hub-like and linear features, consistent with evolutionary models of interaction gain and loss.

2019 ◽  
Vol 16 (160) ◽  
pp. 20190411 ◽  
Author(s):  
Nida Farheen ◽  
Mukund Thattai

The genome of the influenza virus consists of eight distinct single-stranded RNA segments, each encoding proteins essential for the viral life cycle. When the virus infects a host cell, these segments must be replicated and packaged into new budding virions. The viral genome is assembled with remarkably high fidelity: experiments reveal that most virions contain precisely one copy of each of the eight RNA segments. Cell-biological studies suggest that genome assembly is mediated by specific reversible and irreversible interactions between the RNA segments and their associated proteins. However, the precise inter-segment interaction network remains unresolved. Here, we computationally predict that tree-like irreversible interaction networks guarantee high-fidelity genome assembly, while cyclic interaction networks lead to futile or frustrated off-pathway products. We test our prediction against multiple experimental datasets. We find that tree-like networks capture the nearest-neighbour statistics of RNA segments in packaged virions, as observed by electron tomography. Just eight tree-like networks (of a possible 262 144) optimally capture both the nearest-neighbour data and independently measured RNA–RNA binding and co-localization propensities. These eight do not include the previously proposed hub-and-spoke and linear networks. Rather, each predicted network combines hub-like and linear features, consistent with evolutionary models of interaction gain and loss.
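The count of 262 144 candidate networks matches Cayley's formula for labelled trees on eight nodes, 8^(8-2) = 262 144. The sketch below is illustrative only and is not the authors' code: it checks that count and enumerates labelled trees by decoding Prüfer sequences. The function names (`cayley_count`, `prufer_to_edges`, `enumerate_trees`) are hypothetical, and segments are simply numbered 0 to n-1.

```python
from itertools import product


def cayley_count(n: int) -> int:
    """Number of labelled trees on n nodes (Cayley's formula n**(n-2))."""
    return n ** (n - 2) if n >= 2 else 1


def prufer_to_edges(seq, n):
    """Decode a Prufer sequence (length n-2, values in 0..n-1) into tree edges."""
    # Each node's degree is 1 plus the number of times it appears in the sequence.
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    edges = []
    for v in seq:
        # Attach the smallest remaining leaf to the next node in the sequence.
        leaf = min(u for u in range(n) if degree[u] == 1)
        edges.append((leaf, v))
        degree[leaf] -= 1
        degree[v] -= 1
    # Exactly two nodes of degree 1 remain; they form the last edge.
    u, w = [x for x in range(n) if degree[x] == 1]
    edges.append((u, w))
    return edges


def enumerate_trees(n):
    """Yield the edge list of every labelled tree on n nodes (n >= 2)."""
    for seq in product(range(n), repeat=n - 2):
        yield prufer_to_edges(seq, n)


if __name__ == "__main__":
    print(cayley_count(8))                     # size of the search space for 8 segments
    print(sum(1 for _ in enumerate_trees(5)))  # small case enumerated explicitly
```

Enumerating all trees for eight segments with `enumerate_trees(8)` is feasible on a laptop, which is consistent with an exhaustive comparison of every tree against experimental data; how the authors actually scored the networks is not specified in the abstract.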


2022 ◽  
Author(s):  
Aayush Grover ◽  
Laurent Gatto

Protein subcellular localization prediction plays a crucial role in improving our understanding of different diseases and consequently assists in building drug targeting and drug development pipelines. Proteins are known to co-exist at multiple subcellular locations, which makes the task of prediction extremely challenging. A protein interaction network is a graph that captures interactions between different proteins. It is safe to assume that if two proteins are interacting, they must share some subcellular locations. In this regard, we propose ProtFinder, the first deep learning-based model that exclusively relies on protein interaction networks to predict the multiple subcellular locations of proteins. We also integrate biological priors like the cellular component of Gene Ontology to make ProtFinder a more biology-aware intelligent system. ProtFinder is trained and tested using the STRING and BioPlex databases, whereas the annotations of proteins are obtained from the Human Protein Atlas. Our model gives an AUC-ROC score of 90.00% and an MCC score of 83.42% on a held-out set of proteins. We also apply ProtFinder to annotate proteins that currently do not have confident location annotations. We observe that ProtFinder is able to confirm some of these unreliable location annotations, while in some cases complementing the existing databases with novel location annotations.
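For readers unfamiliar with the MCC (Matthews correlation coefficient) reported above, the following minimal sketch computes it for a single binary label, for example one subcellular location. It is an illustration under stated assumptions, not ProtFinder's evaluation code: the helper names are hypothetical, and the abstract does not say how scores were aggregated across multiple locations.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for two equal-length sequences of 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical labels for one location across four proteins.
    counts = confusion_counts([1, 1, 0, 0], [1, 0, 0, 1])
    print(counts, mcc(*counts))
```

Unlike accuracy, the MCC uses all four confusion-matrix cells, which makes it informative when most proteins are absent from a given location.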


F1000Research ◽  
2014 ◽  
Vol 3 ◽  
pp. 146 ◽  
Author(s):  
Guanming Wu ◽  
Eric Dawson ◽  
Adrian Duong ◽  
Robin Haw ◽  
Lincoln Stein

High-throughput experiments are routinely performed in modern biological studies. However, extracting meaningful results from massive experimental data sets is a challenging task for biologists. Projecting data onto pathway and network contexts is a powerful way to unravel patterns embedded in seemingly scattered large data sets and assist knowledge discovery related to cancer and other complex diseases. We have developed a Cytoscape app called “ReactomeFIViz”, which utilizes a highly reliable gene functional interaction network and human curated pathways from Reactome and other pathway databases. This app provides a suite of features to assist biologists in performing pathway- and network-based data analysis in a biologically intuitive and user-friendly way. Biologists can use this app to uncover network and pathway patterns related to their studies, search for gene signatures from gene expression data sets, reveal pathways significantly enriched by genes in a list, and integrate multiple genomic data types into a pathway context using probabilistic graphical models. We believe our app will give researchers substantial power to analyze intrinsically noisy high-throughput experimental data to find biologically relevant information.


2020 ◽  
Author(s):  
Diogo Borges Lima ◽  
Ying Zhu ◽  
Fan Liu

Software tools that allow visualization and analysis of protein interaction networks are essential for studies in systems biology. One of the most popular network visualization tools in biology is Cytoscape, which offers a large selection of plugins for interpretation of protein interaction data. Chemical cross-linking coupled to mass spectrometry (XL-MS) is an increasingly important source for such interaction data, but there are currently no Cytoscape tools to analyze XL-MS results. Given the suitability of the Cytoscape platform, and to expand its toolbox, we introduce XlinkCyNET, an open-source Cytoscape Java plugin for exploring large-scale XL-MS-based protein interaction networks. XlinkCyNET offers rapid and easy visualization of intra- and intermolecular cross-links and the locations of protein domains in a rectangular bar style, allowing subdomain-level interrogation of the interaction network. XlinkCyNET is freely available from the Cytoscape app store: http://apps.cytoscape.org/apps/xlinkcynet and at https://www.theliulab.com/software/xlinkcynet.


Author(s):  
Divya Dasagrandhi ◽  
Arul Salomee Kamalabai Ravindran ◽  
Anusuyadevi Muthuswamy ◽  
Jayachandran K. S.

Understanding the mechanisms of a disease is highly complicated due to the complex pathways involved in disease progression. Despite several decades of research, the occurrence and prognosis of diseases are not completely understood, even with high-throughput experiments like DNA microarrays and next-generation sequencing. This is due to challenges in the analysis of huge data sets. Systems biology is one of the major divisions of bioinformatics and has provided cutting-edge techniques for the better understanding of these pathways. Constructing a protein-protein interaction network (PPIN) helps scientists identify vital proteins, which facilitates the identification of new drug targets and associated proteins. The chapter focuses on PPI databases, the construction of PPINs, and their analysis.


2020 ◽  
Vol 13 (1) ◽  
Author(s):  
Jie Zhou ◽  
Weston D. Viles ◽  
Boran Lu ◽  
Zhigang Li ◽  
Juliette C. Madan ◽  
...  

Background: Throughout their lifespans, humans continually interact with the microbial world, including those organisms which live in and on the human body. Research in this domain has revealed the extensive links between the human-associated microbiota and health. In particular, the microbiota of the human gut plays essential roles in digestion, nutrient metabolism, immune maturation and homeostasis, neurological signaling, and endocrine regulation. Microbial interaction networks are frequently estimated from data and are an indispensable tool for representing and understanding the conditional correlation between the microbes. In this high-dimensional setting, zero-inflation and the unit-sum constraint for relative abundance data pose challenges to the reliable estimation of microbial interaction networks. Methods and Results: To identify the microbial interaction network, the zero-inflated latent Ising (ZILI) model is proposed, which assumes that the distribution of relative abundance relies only on finite latent states and provides a novel way to solve issues induced by the unit-sum and zero-inflation constraints. A two-step algorithm is proposed for the model selection of ZILI. ZILI is evaluated through simulated data and subsequently applied to an infant gut microbiota dataset from the New Hampshire Birth Cohort Study. The results are compared with results from the Gaussian graphical model (GGM) and the dichotomous Ising model (DIS). Provided that ZILI is the true data-generating model, the simulation studies show that the two-step algorithm can identify the graphical structure effectively and is robust to a range of parameter settings. For the infant gut microbiota dataset, the final estimated networks from GGM and ZILI turn out to have significant overlap, with ZILI tending to select a sparser network than GGM. From the shared subnetwork, a hub taxon, Lachnospiraceae, is identified, whose involvement in human disease development has recently been reported in the literature. Conclusions: Constraints induced by the relative abundance of microbiota, such as zero inflation and unit sum, render the conditional correlation analysis unreliable for conventional methods such as GGM. The proposed ZILI model, based on optimal categoricalization, provides an alternative yet elegant way to deal with these difficulties. The results from ZILI have a reasonable biological interpretation. This model can also be used to study microbial interactions in other body parts.


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Massimiliano Zanin ◽  
Bruno F. R. Santos ◽  
Paul M. A. Antony ◽  
Clara Berenguer-Escuder ◽  
Simone B. Larsen ◽  
...  

Mitochondrial dysfunction is linked to the pathogenesis of Parkinson’s disease (PD). However, individual mitochondria-based analyses do not show a uniform feature in PD patients. Since mitochondria interact with each other, we hypothesize that PD-related features might exist in topological patterns of mitochondria interaction networks (MINs). Here we show that MINs formed nonclassical scale-free supernetworks in colonic ganglia both from healthy controls and PD patients; however, altered network topological patterns were observed in PD patients. These patterns were highly correlated with PD clinical scores, and a machine-learning approach based on the MIN features alone accurately distinguished between patients and controls with an area-under-curve value of 0.989. The MINs of midbrain dopaminergic neurons (mDANs) derived from several genetic PD patients also displayed specific changes. CRISPR/Cas9-based genome correction of alpha-synuclein point mutations reversed the changes in MINs of mDANs. Our organelle-interaction network analysis opens another critical dimension for a deeper characterization of various complex diseases with mitochondrial dysregulation.


Author(s):  
Raymond Wan ◽  
Hiroshi Mamitsuka

This chapter examines some of the available techniques for analyzing a protein interaction network (PIN) when depicted as an undirected graph. Within this graph, algorithms have been developed which identify “notable” smaller building blocks called network motifs. The authors examine these algorithms by dividing them into two broad categories based on two definitions of “notable”: (a) statistically-based methods and (b) frequency-based methods. They describe how these two classes of algorithms differ not only in terms of efficiency, but also in terms of the type of results that they report. Some publicly-available programs are demonstrated as part of their comparison. While most of the techniques are generic and were originally proposed for other types of networks, the focus of this chapter is on the application of these methods and software tools to PINs.
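As a rough illustration of the frequency-based idea, the sketch below counts occurrences of the simplest three-node motif, the triangle, in an undirected graph given as an edge list. It is a toy example and not one of the programs demonstrated in the chapter; the function name `count_triangles` and the example protein labels are hypothetical, and node labels are assumed to be mutually comparable (for example, all strings).

```python
def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs."""
    # Build an adjacency map, ignoring self-loops and duplicate edges.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    count = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                # Common neighbours w with u < v < w close a triangle,
                # so each triangle is counted exactly once.
                count += sum(1 for w in adj[u] & adj[v] if w > v)
    return count


if __name__ == "__main__":
    # Hypothetical interaction list among four proteins.
    pin = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
    print(count_triangles(pin))
```

A statistically-based method would go one step further and compare such a raw count against counts from randomized networks, for instance networks with the same degree sequence, to decide whether the motif is over-represented.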


2011 ◽  
Vol 2011 ◽  
pp. 1-14 ◽  
Author(s):  
Gaston K. Mazandu ◽  
Nicola J. Mulder

Technological developments in large-scale biological experiments, coupled with bioinformatics tools, have opened the doors to computational approaches for the global analysis of whole genomes. This has provided the opportunity to look at genes within their context in the cell. The integration of vast amounts of data generated by these technologies provides a strategy for identifying potential drug targets within microbial pathogens, the causative agents of infectious diseases. As proteins are druggable targets, functional interaction networks between proteins are used to identify proteins essential to the survival, growth, and virulence of these microbial pathogens. Here we have integrated functional genomics data to generate functional interaction networks between Mycobacterium tuberculosis proteins and carried out computational analyses to dissect the functional interaction network produced for identifying drug targets using network topological properties. This study has provided the opportunity to expand the range of potential drug targets and to move towards optimal target-based strategies.


2019 ◽  
Vol 11 (11) ◽  
pp. 3144-3157 ◽  
Author(s):  
Yutaka Satou ◽  
Ryohei Nakamura ◽  
Deli Yu ◽  
Reiko Yoshida ◽  
Mayuko Hamada ◽  
...  

Since its initial publication in 2002, the genome of Ciona intestinalis type A (Ciona robusta), the first genome sequence of an invertebrate chordate, has provided a valuable resource for a wide range of biological studies, including developmental biology, evolutionary biology, and neuroscience. The genome assembly was updated in 2008, and it included 68% of the sequence information in 14 pairs of chromosomes. However, a more contiguous genome is required for analyses of higher order genomic structure and of chromosomal evolution. Here, we provide a new genome assembly for an inbred line of this animal, constructed with short and long sequencing reads and Hi-C data. In this latest assembly, over 95% of the 123 Mb of sequence data was included in the chromosomes. Short sequencing reads predicted a genome size of 114–120 Mb; therefore, it is likely that the current assembly contains almost the entire genome, although this estimate of genome size was smaller than previous estimates. Remapping of the Hi-C data onto the new assembly revealed a large inversion in the genome of the inbred line. Moreover, a comparison of this genome assembly with that of Ciona savignyi, a different species in the same genus, revealed many chromosomal inversions between these two Ciona species, suggesting that such inversions have occurred frequently and have contributed to chromosomal evolution of Ciona species. Thus, the present assembly greatly improves an essential resource for genome-wide studies of ascidians.

