Dimensional reduction of phenotypes from 53,000 mouse models reveals a diverse landscape of gene function

Bioinformatics Advances ◽

10.1093/bioadv/vbab026 ◽

2021 ◽

Author(s):

Tomasz Konopka ◽

Letizia Vestito ◽

Damian Smedley

Keyword(s):

Mouse Models ◽

Gene Function ◽

Dimensional Reduction ◽

Phenotypic Diversity ◽

Phenotypic Data ◽

Protein Coding ◽

Visual Maps ◽

Public Data ◽

Genomic Markers ◽

The Impact

Animal models have long been used to study gene function and the impact of genetic mutations on phenotype. Through the research efforts of thousands of research groups, systematic curation of published literature, and high-throughput phenotyping screens, the collective body of knowledge for the mouse now covers the majority of protein-coding genes. We here collected data for over 53,000 mouse models with mutations in over 15,000 genomic markers and characterized by more than 254,000 annotations using more than 9,000 distinct ontology terms. We investigated dimensional reduction and embedding techniques as means to facilitate access to this diverse and high-dimensional information. Our analyses provide the first visual maps of the landscape of mouse phenotypic diversity. We also summarize some of the difficulties in producing and interpreting embeddings of sparse phenotypic data. In particular, we show that data preprocessing, filtering, and encoding have as much impact on the final embeddings as the process of dimensional reduction. Nonetheless, techniques developed in the context of dimensional reduction create opportunities for explorative analysis of this large pool of public data, including for searching for mouse models suited to study human diseases.
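The abstract's point that encoding can matter as much as the reduction step can be illustrated with a small sketch. It is not the authors' pipeline: the toy ontology, the term names, the model names, and the helper functions `with_ancestors` and `jaccard` are all hypothetical, chosen only to show how one encoding choice (propagating each annotation to its ontology ancestors) changes the similarity between sparsely annotated models before any embedding is computed.

```python
# Toy ontology as child -> parent links.
# Term names are invented for illustration; they are not real ontology identifiers.
PARENT = {
    "abnormal heart size": "cardiovascular phenotype",
    "abnormal heart rate": "cardiovascular phenotype",
    "cardiovascular phenotype": "phenotype",
    "abnormal gait": "behavior phenotype",
    "behavior phenotype": "phenotype",
}


def with_ancestors(terms, drop_root=True):
    """Return the terms plus all their ancestors in the toy ontology."""
    expanded = set()
    for term in terms:
        while term is not None:
            expanded.add(term)
            term = PARENT.get(term)  # None once the root is passed
    if drop_root:
        # The root is shared by every model, so it carries no information.
        expanded.discard("phenotype")
    return expanded


def jaccard(a, b):
    """Jaccard similarity between two sets of terms."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Three hypothetical mouse models, each with a single annotation.
models = {
    "model_A": {"abnormal heart size"},
    "model_B": {"abnormal heart rate"},
    "model_C": {"abnormal gait"},
}

# Raw encoding: sibling terms look completely unrelated.
raw_ab = jaccard(models["model_A"], models["model_B"])

# Ancestor-propagated encoding: the shared parent term creates overlap.
prop_ab = jaccard(with_ancestors(models["model_A"]),
                  with_ancestors(models["model_B"]))
prop_ac = jaccard(with_ancestors(models["model_A"]),
                  with_ancestors(models["model_C"]))

print(raw_ab, prop_ab, prop_ac)
```

Under the raw encoding, the two cardiovascular models share nothing; after propagation they overlap through their common parent, while the behavioral model stays distinct. Any embedding built from these similarities would inherit that difference, whichever reduction method is applied afterwards.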


Dimensional reduction of phenotypes from 53,000 mouse models reveals a diverse landscape of gene function

10.1101/2021.06.10.447851 ◽

2021 ◽

Author(s):

Tomasz Konopka ◽

Letizia Vestito ◽

Damian Smedley

Keyword(s):

Mouse Models ◽

Gene Function ◽

Dimensional Reduction ◽

Phenotypic Diversity ◽

Phenotypic Data ◽

Protein Coding ◽

Visual Maps ◽

Public Data ◽

Genomic Markers ◽

The Impact

Animal models have long been used to study gene function and the impact of genetic mutations on phenotype. Through the research efforts of thousands of research groups, systematic curation of published literature, and high-throughput phenotyping screens, the collective body of knowledge for the mouse now covers the majority of protein-coding genes. We here collected data for over 53,000 mouse models with mutations in over 15,000 genomic markers and characterized by more than 254,000 annotations using more than 9,000 distinct ontology terms. We investigated dimensional reduction and embedding techniques as means to facilitate access to this diverse and high-dimensional information. Our analyses provide the first visual maps of the landscape of mouse phenotypic diversity. We also summarize some of the difficulties in producing and interpreting embeddings of sparse phenotypic data. In particular, we show that data preprocessing, filtering, and encoding have as much impact on the final embeddings as the process of dimensional reduction. Nonetheless, techniques developed in the context of dimensional reduction create opportunities for explorative analysis of this large pool of public data, including for searching for mouse models suited to study human diseases.


Contribution of mobile elements to the uniqueness of human genome with more than 15,000 human-specific insertions

10.1101/083295 ◽

2016 ◽

Cited By ~ 1

Author(s):

Wanxiangfu Tang ◽

Seyoung Mun ◽

Adiya Joshi ◽

Kyundong Han ◽

Ping Liang

Keyword(s):

Genetic Diversity ◽

Human Genome ◽

Genome Evolution ◽

Gene Function ◽

Mobile Elements ◽

Dna Transposition ◽

Protein Coding ◽

Human Genome Evolution ◽

The Impact ◽

Human Specific

Mobile elements (MEs) collectively constitute at least 51% of the human genome. Due to their past incremental accumulation and the ongoing DNA transposition of members of certain subfamilies, MEs serve as a significant source of both inter- and intra-species genetic diversity during primate and human evolution. Since MEs can directly affect gene function via a plethora of mechanisms, it is believed that ME-derived genetic diversity has contributed to the phenotypic differences between humans and non-human primates, as well as among human populations and individuals. To define the specific contribution of MEs in making Homo sapiens a biologically unique species, we aimed to compile a complete list of MEs that are uniquely present in the human genome, i.e., human-specific MEs (HS-MEs). By making use of the most recent reference genome sequences for human and many other primates and an unbiased, more robust, and integrative multi-way comparative genomic approach, we identified a total of 15,463 HS-MEs. This list of HS-MEs represents a 120% increase from prior studies, with over 8,000 being newly identified as HS-MEs. Collectively, these ~15,000 HS-MEs have contributed a total of 15 million base pairs (Mbp) of sequence increase through insertion, generation of target site duplications, and transductions, as well as a 0.5 Mbp sequence loss via insertion-mediated deletions, leading to a net genome size increase of 14.5 Mbp. Other new observations made with these HS-MEs include: 1) identification of several additional ME subfamilies with significant transposition activities not visible with prior smaller datasets (e.g., L1HS, L1PA2, and HERV-K); 2) a clear similarity of the retrotransposition mechanism among L1, Alus, and SVAs that is distinct from that of HERVs, based on the pre-integration site sequence motifs; 3) the Y chromosome as a strikingly hot target for HS-MEs, particularly for LTRs, which showed an insertion rate 15 times higher than the genome average; 4) among the ME types, SVAs seem to show a very strong bias towards inserting into existing SVAs. Among the HS-MEs, more than 8,000 elements were integrated into the vicinity of ~4,900 unique genes, in regions including CDS, untranslated exon regions, promoters, and introns of protein-coding genes, as well as promoters and exons of non-coding RNAs. In seven cases, MEs participate in protein coding. Furthermore, 1,213 HS-MEs contributed to a total of 3,124 experimentally identified binding sites for 146 of the 161 transcription factors, in association with 622 genes. All these data suggest that these HS-MEs, despite being very young, already show sufficient signs of participation in gene function via regulation of transcription, splicing, and protein coding, with more potential for future participation. In conclusion, our results demonstrate that the number of MEs uniquely present in the human genome is much higher than previously known, and we predict that the same is true regarding their impact on human genome evolution and function. The comprehensive list of HS-MEs provides an important reference resource for studying the impact of DNA transposition on human genome evolution and gene function.


Functional and transcriptional profiling of non-coding RNAs in yeast reveal context-dependent phenotypes and in trans effects on the protein regulatory network

PLoS Genetics ◽

10.1371/journal.pgen.1008761 ◽

2021 ◽

Vol 17 (1) ◽

pp. e1008761

Author(s):

Laura Natalia Balarezo-Cisneros ◽

Steven Parker ◽

Marcin G. Fraczek ◽

Soukaina Timouma ◽

Ping Wang ◽

...

Keyword(s):

Large Scale ◽

Transcriptional Profiling ◽

Growth Conditions ◽

Phenotypic Data ◽

Protein Coding ◽

Protein Coding Genes ◽

Genome Wide ◽

In Trans ◽

Non Coding Rnas ◽

The Impact

Non-coding RNAs (ncRNAs), including the more recently identified Stable Unannotated Transcripts (SUTs) and Cryptic Unstable Transcripts (CUTs), are increasingly being shown to play pivotal roles in the transcriptional and post-transcriptional regulation of genes in eukaryotes. Here, we carried out a large-scale screening of ncRNAs in Saccharomyces cerevisiae and provide evidence for SUT and CUT function. Phenotypic data on 372 ncRNA deletion strains in 23 different growth conditions were collected, identifying ncRNAs responsible for significant cellular fitness changes. Transcriptome profiles were assembled for 18 haploid ncRNA deletion mutants and 2 essential ncRNA heterozygous deletants. Guided by the resulting RNA-seq data, we analysed the genome-wide dysregulation of protein-coding genes and non-coding transcripts. Novel functional ncRNAs (SUT125, SUT126, SUT035 and SUT532) that act in trans by modulating transcription factors were identified. Furthermore, we described the impact of SUTs and CUTs in modulating coding gene expression in response to different environmental conditions, regulating important biological processes such as respiration (SUT125, SUT126, SUT035, SUT432), steroid biosynthesis (CUT494, SUT053, SUT468) or rRNA processing (SUT075 and snR30). Overall, these data capture and integrate the regulatory and phenotypic network of ncRNAs and protein-coding genes, providing genome-wide evidence of the impact of ncRNAs on cellular homeostasis.


Faculty Opinions recommendation of A systematic genome-wide analysis of zebrafish protein-coding gene function.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718007453.793477821 ◽

2013 ◽

Author(s):

Martin Lowe

Keyword(s):

Gene Function ◽

Protein Coding ◽

Genome Wide Analysis ◽

Genome Wide


PSI-40 Two mitochondrial lineages revealed in North American yak

Journal of Animal Science ◽

10.1093/jas/skaa278.833 ◽

2020 ◽

Vol 98 (Supplement_4) ◽

pp. 477-477

Author(s):

Leah K Treffer ◽

Edward S Rice ◽

Anna M Fuller ◽

Samuel Cutler ◽

Jessica L Petersen

Keyword(s):

Sequence Data ◽

Haplotype Network ◽

Ovis Aries ◽

Similar Species ◽

Nucleotide Polymorphisms ◽

Mt Dna ◽

Protein Coding ◽

Sister Clade ◽

Mtdna Sequence ◽

The Impact

Domestic yak (Bos grunniens) are bovids native to the Asian Qinghai-Tibetan Plateau. Studies of Asian yak have revealed that introgression with domestic cattle has contributed to the evolution of the species. When imported to North America (NA), some hybridization with B. taurus did occur. The objective of this study was to use mitochondrial (mt) DNA sequence data to better understand the mtDNA origin of NA yak and their relationship to Asian yak and related species. The complete mtDNA sequence of 14 individuals (12 NA yak, 1 Tibetan yak, 1 Tibetan B. indicus) was generated and compared with sequences of similar species from GenBank (B. indicus, B. grunniens (Chinese), B. taurus, B. gaurus, B. primigenius, B. frontalis, Bison bison, and Ovis aries). Individuals were aligned to the B. grunniens reference genome (ARS_UNL_BGru_maternal_1.0), which was also included in the analyses. The mtDNA genes were annotated using the ARS-UCD1.2 cattle sequence as a reference. Ten unique NA yak haplotypes were identified, which a haplotype network separated into two clusters. Variation among the NA haplotypes included 93 nonsynonymous single nucleotide polymorphisms. A maximum likelihood tree including all taxa was made using IQtree after the data were partitioned into twenty-two subgroups using PartitionFinder2. Notably, six NA yak haplotypes formed a clade with B. indicus; the other four haplotypes grouped with B. grunniens and fell as a sister clade to bison, gaur and gayal. These data demonstrate two mitochondrial origins of NA yak with genetic variation in protein-coding genes. Although these data suggest yak introgression with B. indicus, it appears to date prior to importation into NA. In addition to contributing to our understanding of the species history, these results suggest the two major mtDNA haplotypes in NA yak may functionally differ. Characterization of the impact of these differences on cellular function is currently underway.


Consideration of Gut Microbiome in Murine Models of Diseases

Microorganisms ◽

10.3390/microorganisms9051062 ◽

2021 ◽

Vol 9 (5) ◽

pp. 1062

Author(s):

Chunye Zhang ◽

Craig L. Franklin ◽

Aaron C. Ericsson

Keyword(s):

Animal Models ◽

Mouse Models ◽

Biomedical Research ◽

Gut Microbiome ◽

Close Association ◽

Disease Model ◽

Potential Contributor ◽

Potential Factors ◽

Models Of Disease ◽

The Impact

The gut microbiome (GM), a complex community of bacteria, viruses, protozoa, and fungi located in the gut of humans and animals, plays significant roles in host health and disease. Animal models are widely used to investigate human diseases in biomedical research, and the GM within animal models can change due to the impact of many factors, such as the vendor, husbandry, and environment. Notably, variations in GM can contribute to differences in disease model phenotypes, which can result in poor reproducibility in biomedical research. Variation in the gut microbiome can also impact the translatability of animal models. For example, standard lab mice have different pathogen exposure experiences when compared to wild or pet store mice. As humans have antigen experiences that are more similar to the latter, the use of lab mice with more simplified microbiomes may not yield optimally translatable data. Additionally, the literature describes many methods to manipulate the GM, and differences between these methods can also result in differing interpretations of outcome measures. In this review, we focus on the GM as a potential contributor to the poor reproducibility and translatability of mouse models of disease. First, we summarize the important role of GM in host disease and health through different gut–organ axes and the close association between GM and disease susceptibility through colonization resistance, immune response, and metabolic pathways. Then, we focus on the variation in the microbiome in mouse models of disease and address how this variation can potentially impact disease phenotypes and subsequently influence research reproducibility and translatability. We also discuss the variations between genetic substrains as potential factors that cause poor reproducibility via their effects on the microbiome. In addition, we discuss the utility of complex microbiomes in prospective studies and how manipulation of the GM through differing transfer methods can impact model phenotypes. Lastly, we emphasize the need to explore appropriate methods of GM characterization and manipulation.


Reproduction in Trypanosomatids: Past and Present

Biology ◽

10.3390/biology10060471 ◽

2021 ◽

Vol 10 (6) ◽

pp. 471

Author(s):

Camino Gutiérrez-Corbo ◽

Bárbara Domínguez-Asenjo ◽

María Martínez-Valladares ◽

Yolanda Pérez-Pertejo ◽

Carlos García-Estrada ◽

...

Keyword(s):

Low Income ◽

Phenotypic Diversity ◽

Natural Populations ◽

Genetic Exchange ◽

Health Concern ◽

Current Debate ◽

Naturally Occurring ◽

Experimental Works ◽

Genomic Recombination ◽

The Impact

Diseases caused by trypanosomatids (sleeping sickness, Chagas disease, and leishmaniasis) are a serious public health concern in low-income endemic countries. These diseases are produced by single-celled parasites with a diploid genome (although aneuploidy is frequent) organized in pairs of non-condensable chromosomes. To explain the way they reproduce through the analysis of natural populations, the theory of strict clonal propagation of these microorganisms was taken as a rule at the beginning of the studies, since it partially justified their genomic stability. However, numerous experimental works provide evidence of sexual reproduction, thus explaining certain naturally occurring events that link the number of meioses per mitosis and the frequency of mating. Recent techniques have demonstrated genetic exchange between individuals of the same species under laboratory conditions, as well as the expression of meiosis-specific genes. The current debate focuses on the frequency of genomic recombination events and its impact on the natural parasite population structure. This paper reviews the results and techniques used to demonstrate the existence of sex in trypanosomatids, the inheritance of kinetoplast DNA (maxi- and minicircles), the impact of genetic exchange in these parasites, and how it can contribute to the phenotypic diversity of natural populations.


Characterization of the nuclear and cytosolic transcriptomes in human brain tissue reveals new insights into the subcellular distribution of RNA transcripts

Scientific Reports ◽

10.1038/s41598-021-83541-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Ammar Zaghlool ◽

Adnan Niazi ◽

Åsa K. Björklund ◽

Jakub Orzechowski Westholm ◽

Adam Ameur ◽

...

Keyword(s):

Human Brain ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Adult Brain ◽

Sequencing Data ◽

Human Brain Tissue ◽

Protein Coding ◽

Rna Transcripts ◽

Nuclear Rna ◽

The Impact

Transcriptome analysis has mainly relied on analyzing RNA sequencing data from whole cells, overlooking the impact of subcellular RNA localization and its influence on our understanding of gene function and on the interpretation of gene expression signatures in cells. Here, we separated cytosolic and nuclear RNA from human fetal and adult brain samples and performed a comprehensive analysis of cytosolic and nuclear transcriptomes. There are significant differences in RNA expression for protein-coding and lncRNA genes between cytosol and nucleus. We show that transcripts encoding the nuclear-encoded mitochondrial proteins are significantly enriched in the cytosol compared to the rest of protein-coding genes. Differential expression analysis between fetal and adult frontal cortex shows that results obtained from the cytosolic RNA differ from results using nuclear RNA, both at the level of transcript types and in the number of differentially expressed genes. Our data provide a resource for the subcellular localization of thousands of RNA transcripts in the human brain and highlight differences in using the cytosolic or the nuclear transcriptomes for expression analysis.


Behavior of Traffic Congestion and Public Transport in Eight Large Cities in Latin America during the COVID-19 Pandemic

Applied Sciences ◽

10.3390/app11104703 ◽

2021 ◽

Vol 11 (10) ◽

pp. 4703

Author(s):

Renato Andara ◽

Jesús Ortego-Osa ◽

Melva Inés Gómez-Caicedo ◽

Rodrigo Ramírez-Pisco ◽

Luis Manuel Navas-Gracia ◽

...

Keyword(s):

Latin American ◽

Public Transport ◽

Public Transportation ◽

Traffic Congestion ◽

Urban Transport ◽

Control Measures ◽

Johns Hopkins Hospital ◽

Large Cities ◽

Public Data ◽

The Impact

This comparative study analyzes the impact of the COVID-19 pandemic on motorized mobility in eight large cities of five Latin American countries. Public institutions and private organizations have made public data available for a better understanding of the contagion process of the pandemic, its impact, and the effectiveness of the implemented health control measures. In this research, data from the IDB Invest Dashboard were used for traffic congestion, as well as data from the Moovit© public transport platform. For the daily cases of COVID-19 contagion, those published by Johns Hopkins University were used. The analysis period runs from 9 March to 30 September 2020, approximately seven months. For each city, a descriptive statistical analysis of the loss and subsequent recovery of motorized mobility was carried out, evaluated in terms of traffic congestion and urban transport through the corresponding regression models. The recovery of traffic congestion occurs earlier and faster than that of urban transport, since the latter depends on the control measures imposed in each city. Public transportation does not appear to have been a determining factor in the spread of the pandemic in Latin American cities.


A Tale of Two Entities

ACM Transactions on Internet of Things ◽

10.1145/3437258 ◽

2021 ◽

Vol 2 (2) ◽

pp. 1-21

Author(s):

Hossam ElHussini ◽

Chadi Assi ◽

Bassam Moussa ◽

Ribal Atallah ◽

Ali Ghrayeb

Keyword(s):

Power Flow ◽

Power Grid ◽

Communication Protocols ◽

Charging Infrastructure ◽

Electric Vehicle Charging ◽

Public Data ◽

Simulation Based ◽

Charging Stations ◽

The Impact ◽

Ev Charging

With the growing market of Electric Vehicles (EV), the procurement of their charging infrastructure plays a crucial role in their adoption. Within the revolution of Internet of Things, the EV charging infrastructure is getting on board with the introduction of smart Electric Vehicle Charging Stations (EVCS), a myriad set of communication protocols, and different entities. We provide in this article an overview of this infrastructure, detailing the participating entities and the communication protocols. Further, we contextualize the current deployment of EVCSs through the use of available public data. In the light of such a survey, we identify two key concerns, the lack of standardization and multiple points of failure, which render the current deployment of EV charging infrastructure vulnerable to an array of different attacks. Moreover, we propose a novel attack scenario that exploits the unique characteristics of the EVCSs and their protocol (such as high power wattage and support for reverse power flow) to cause disturbances to the power grid. We investigate three different attack variations: a sudden surge in power demand, a sudden surge in power supply, and a switching attack. To support our claims, we showcase, using a real-world example, how an adversary can compromise an EVCS and create a traffic bottleneck by tampering with the charging schedules of EVs. Further, we perform a simulation-based study of the impact of our proposed attack variations on the WSCC 9 bus system. Our simulations show that an adversary can cause devastating effects on the power grid, which might result in blackout and cascading failure, by compromising a small number of EVCSs.
