Dimensional reduction of phenotypes from 53,000 mouse models reveals a diverse landscape of gene function

2021 ◽  
Author(s):  
Tomasz Konopka ◽  
Letizia Vestito ◽  
Damian Smedley

Animal models have long been used to study gene function and the impact of genetic mutations on phenotype. Through the research efforts of thousands of research groups, systematic curation of published literature, and high-throughput phenotyping screens, the collective body of knowledge for the mouse now covers the majority of protein-coding genes. We here collected data for over 53,000 mouse models with mutations in over 15,000 genomic markers and characterized by more than 254,000 annotations using more than 9,000 distinct ontology terms. We investigated dimensional reduction and embedding techniques as means to facilitate access to this diverse and high-dimensional information. Our analyses provide the first visual maps of the landscape of mouse phenotypic diversity. We also summarize some of the difficulties in producing and interpreting embeddings of sparse phenotypic data. In particular, we show that data preprocessing, filtering, and encoding have as much impact on the final embeddings as the process of dimensional reduction. Nonetheless, techniques developed in the context of dimensional reduction create opportunities for explorative analysis of this large pool of public data, including for searching for mouse models suited to study human diseases.


2016 ◽  
Author(s):  
Wanxiangfu Tang ◽  
Seyoung Mun ◽  
Adiya Joshi ◽  
Kyundong Han ◽  
Ping Liang

Abstract Mobile elements (MEs) collectively constitute at least 51% of the human genome. Due to their past incremental accumulation and the ongoing DNA transposition of members from certain subfamilies, MEs serve as a significant source of both inter- and intra-species genetic diversity during primate and human evolution. Since MEs can exert a direct impact on gene function via a plethora of mechanisms, it is believed that ME-derived genetic diversity has contributed to the phenotypic differences between humans and non-human primates, as well as among human populations and individuals. To define the specific contribution of MEs in making Homo sapiens a biologically unique species, we aim to compile a complete list of MEs that are uniquely present in the human genome, i.e., human-specific MEs (HS-MEs). By making use of the most recent reference genome sequences for human and many other primates and an unbiased, more robust, and integrative multi-way comparative genomic approach, we identified a total of 15,463 HS-MEs. This list of HS-MEs represents a 120% increase from prior studies, with over 8,000 being newly identified as HS-MEs. Collectively, these ~15,000 HS-MEs have contributed a total sequence increase of 15 million base pairs (Mbp) through insertion, generation of target site duplications, and transductions, as well as a 0.5 Mbp sequence loss via insertion-mediated deletions, leading to a net genome size increase of 14.5 Mbp. Other new observations made with these HS-MEs include: 1) identification of several additional ME subfamilies with significant transposition activities not visible with prior smaller datasets (e.g. L1HS, L1PA2, and HERV-K); 2) a clear similarity of the retrotransposition mechanism among L1, Alus, and SVAs that is distinct from that of HERVs, based on the pre-integration site sequence motifs; 3) the Y chromosome as a strikingly hot target for HS-MEs, particularly for LTRs, which showed an insertion rate 15 times higher than the genome average; 4) among the ME types, SVAs seem to show a very strong bias toward inserting into existing SVAs. Among the HS-MEs, more than 8,000 elements were integrated into the vicinity of ~4,900 unique genes, in regions including CDS, untranslated exon regions, promoters, and introns of protein-coding genes, as well as promoters and exons of non-coding RNAs. In seven cases, MEs participate in protein coding. Furthermore, 1,213 HS-MEs contributed to a total of 3,124 experimentally identified binding sites for 146 of the 161 transcription factors in association with 622 genes. All these data suggest that these HS-MEs, despite being very young, already show sufficient signs of participation in gene function via regulation of transcription, splicing, and protein coding, with more potential for future participation. In conclusion, our results demonstrate that the number of MEs uniquely present in the human genome is much higher than previously known, and we predict that the same is true regarding their impact on human genome evolution and function. The comprehensive list of HS-MEs provides an important reference resource for studying the impact of DNA transposition in human genome evolution and gene function.


PLoS Genetics ◽  
2021 ◽  
Vol 17 (1) ◽  
pp. e1008761
Author(s):  
Laura Natalia Balarezo-Cisneros ◽  
Steven Parker ◽  
Marcin G. Fraczek ◽  
Soukaina Timouma ◽  
Ping Wang ◽  
...  

Non-coding RNAs (ncRNAs), including the more recently identified Stable Unannotated Transcripts (SUTs) and Cryptic Unstable Transcripts (CUTs), are increasingly being shown to play pivotal roles in the transcriptional and post-transcriptional regulation of genes in eukaryotes. Here, we carried out a large-scale screening of ncRNAs in Saccharomyces cerevisiae, and provide evidence for SUT and CUT function. Phenotypic data on 372 ncRNA deletion strains in 23 different growth conditions were collected, identifying ncRNAs responsible for significant cellular fitness changes. Transcriptome profiles were assembled for 18 haploid ncRNA deletion mutants and 2 essential ncRNA heterozygous deletants. Guided by the resulting RNA-seq data, we analysed the genome-wide dysregulation of protein-coding genes and non-coding transcripts. Novel functional ncRNAs, SUT125, SUT126, SUT035 and SUT532, that act in trans by modulating transcription factors were identified. Furthermore, we described the impact of SUTs and CUTs in modulating coding gene expression in response to different environmental conditions, regulating important biological processes such as respiration (SUT125, SUT126, SUT035, SUT432), steroid biosynthesis (CUT494, SUT053, SUT468) or rRNA processing (SUT075 and snR30). Overall, these data capture and integrate the regulatory and phenotypic network of ncRNAs and protein-coding genes, providing genome-wide evidence of the impact of ncRNAs on cellular homeostasis.


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 477-477
Author(s):  
Leah K Treffer ◽  
Edward S Rice ◽  
Anna M Fuller ◽  
Samuel Cutler ◽  
Jessica L Petersen

Abstract Domestic yak (Bos grunniens) are bovids native to the Asian Qinghai-Tibetan Plateau. Studies of Asian yak have revealed that introgression with domestic cattle has contributed to the evolution of the species. When yak were imported to North America (NA), some hybridization with B. taurus did occur. The objective of this study was to use mitochondrial (mt) DNA sequence data to better understand the mtDNA origin of NA yak and their relationship to Asian yak and related species. The complete mtDNA sequence of 14 individuals (12 NA yak, 1 Tibetan yak, 1 Tibetan B. indicus) was generated and compared with sequences of similar species from GenBank (B. indicus, B. grunniens (Chinese), B. taurus, B. gaurus, B. primigenius, B. frontalis, Bison bison, and Ovis aries). Individuals were aligned to the B. grunniens reference genome (ARS_UNL_BGru_maternal_1.0), which was also included in the analyses. The mtDNA genes were annotated using the ARS-UCD1.2 cattle sequence as a reference. Ten unique NA yak haplotypes were identified, which a haplotype network separated into two clusters. Variation among the NA haplotypes included 93 nonsynonymous single nucleotide polymorphisms. A maximum likelihood tree including all taxa was made using IQtree after the data were partitioned into twenty-two subgroups using PartitionFinder2. Notably, six NA yak haplotypes formed a clade with B. indicus; the other four haplotypes grouped with B. grunniens and fell as a sister clade to bison, gaur and gayal. These data demonstrate two mitochondrial origins of NA yak with genetic variation in protein-coding genes. Although these data suggest yak introgression with B. indicus, it appears to date prior to importation into NA. In addition to contributing to our understanding of the species' history, these results suggest the two major mtDNA haplotypes in NA yak may functionally differ. Characterization of the impact of these differences on cellular function is currently underway.


2021 ◽  
Vol 9 (5) ◽  
pp. 1062
Author(s):  
Chunye Zhang ◽  
Craig L. Franklin ◽  
Aaron C. Ericsson

The gut microbiome (GM), a complex community of bacteria, viruses, protozoa, and fungi located in the gut of humans and animals, plays significant roles in host health and disease. Animal models are widely used to investigate human diseases in biomedical research, and the GM within animal models can change due to the impact of many factors, such as the vendor, husbandry, and environment. Notably, variations in GM can contribute to differences in disease model phenotypes, which can result in poor reproducibility in biomedical research. Variation in the gut microbiome can also impact the translatability of animal models. For example, standard lab mice have different pathogen exposure experiences when compared to wild or pet store mice. As humans have antigen experiences that are more similar to the latter, the use of lab mice with more simplified microbiomes may not yield optimally translatable data. Additionally, the literature describes many methods to manipulate the GM, and differences between these methods can also result in differing interpretations of outcome measures. In this review, we focus on the GM as a potential contributor to the poor reproducibility and translatability of mouse models of disease. First, we summarize the important role of GM in host disease and health through different gut–organ axes and the close association between GM and disease susceptibility through colonization resistance, immune response, and metabolic pathways. Then, we focus on the variation in the microbiome in mouse models of disease and address how this variation can potentially impact disease phenotypes and subsequently influence research reproducibility and translatability. We also discuss the variations between genetic substrains as potential factors that cause poor reproducibility via their effects on the microbiome. In addition, we discuss the utility of complex microbiomes in prospective studies and how manipulation of the GM through differing transfer methods can impact model phenotypes. Lastly, we emphasize the need to explore appropriate methods of GM characterization and manipulation.


Biology ◽  
2021 ◽  
Vol 10 (6) ◽  
pp. 471
Author(s):  
Camino Gutiérrez-Corbo ◽  
Bárbara Domínguez-Asenjo ◽  
María Martínez-Valladares ◽  
Yolanda Pérez-Pertejo ◽  
Carlos García-Estrada ◽  
...  

Diseases caused by trypanosomatids (sleeping sickness, Chagas disease, and leishmaniasis) are a serious public health concern in low-income endemic countries. These diseases are produced by single-celled parasites with a diploid genome (although aneuploidy is frequent) organized in pairs of non-condensable chromosomes. To explain the way they reproduce through the analysis of natural populations, the theory of strict clonal propagation of these microorganisms was taken as a rule at the beginning of the studies, since it partially justified their genomic stability. However, numerous experimental works provide evidence of sexual reproduction, thus explaining certain naturally occurring events that link the number of meioses per mitosis and the frequency of mating. Recent techniques have demonstrated genetic exchange between individuals of the same species under laboratory conditions, as well as the expression of meiosis-specific genes. The current debate focuses on the frequency of genomic recombination events and their impact on the natural parasite population structure. This paper reviews the results and techniques used to demonstrate the existence of sex in trypanosomatids, the inheritance of kinetoplast DNA (maxi- and minicircles), the impact of genetic exchange in these parasites, and how it can contribute to the phenotypic diversity of natural populations.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ammar Zaghlool ◽  
Adnan Niazi ◽  
Åsa K. Björklund ◽  
Jakub Orzechowski Westholm ◽  
Adam Ameur ◽  
...  

Abstract Transcriptome analysis has mainly relied on analyzing RNA sequencing data from whole cells, overlooking the impact of subcellular RNA localization and its influence on our understanding of gene function and the interpretation of gene expression signatures in cells. Here, we separated cytosolic and nuclear RNA from human fetal and adult brain samples and performed a comprehensive analysis of cytosolic and nuclear transcriptomes. There are significant differences in RNA expression for protein-coding and lncRNA genes between cytosol and nucleus. We show that transcripts encoding the nuclear-encoded mitochondrial proteins are significantly enriched in the cytosol compared to the rest of protein-coding genes. Differential expression analysis between fetal and adult frontal cortex shows that results obtained from the cytosolic RNA differ from results using nuclear RNA both at the level of transcript types and the number of differentially expressed genes. Our data provide a resource for the subcellular localization of thousands of RNA transcripts in the human brain and highlight differences in using the cytosolic or the nuclear transcriptomes for expression analysis.


2021 ◽  
Vol 11 (10) ◽  
pp. 4703
Author(s):  
Renato Andara ◽  
Jesús Ortego-Osa ◽  
Melva Inés Gómez-Caicedo ◽  
Rodrigo Ramírez-Pisco ◽  
Luis Manuel Navas-Gracia ◽  
...  

This comparative study analyzes the impact of the COVID-19 pandemic on motorized mobility in eight large cities of five Latin American countries. Public institutions and private organizations have made public data available for a better understanding of the contagion process of the pandemic, its impact, and the effectiveness of the implemented health control measures. In this research, data from the IDB Invest Dashboard were used for traffic congestion, as well as data from the Moovit© public transport platform. For the daily cases of COVID-19 contagion, those published by Johns Hopkins University were used. The analysis period runs from 9 March to 30 September 2020, approximately seven months. For each city, a descriptive statistical analysis of the loss and subsequent recovery of motorized mobility was carried out, evaluated in terms of traffic congestion and urban transport through the corresponding regression models. The recovery of traffic congestion occurs earlier and faster than that of urban transport, since the latter depends on the control measures imposed in each city. Public transportation does not appear to have been a determining factor in the spread of the pandemic in Latin American cities.


2021 ◽  
Vol 2 (2) ◽  
pp. 1-21
Author(s):  
Hossam ElHussini ◽  
Chadi Assi ◽  
Bassam Moussa ◽  
Ribal Atallah ◽  
Ali Ghrayeb

With the growing market of Electric Vehicles (EV), the procurement of their charging infrastructure plays a crucial role in their adoption. Within the revolution of the Internet of Things, the EV charging infrastructure is getting on board with the introduction of smart Electric Vehicle Charging Stations (EVCS), a myriad of communication protocols, and different entities. We provide in this article an overview of this infrastructure, detailing the participating entities and the communication protocols. Further, we contextualize the current deployment of EVCSs through the use of available public data. In the light of such a survey, we identify two key concerns, the lack of standardization and multiple points of failure, which render the current deployment of EV charging infrastructure vulnerable to an array of different attacks. Moreover, we propose a novel attack scenario that exploits the unique characteristics of the EVCSs and their protocol (such as high power wattage and support for reverse power flow) to cause disturbances to the power grid. We investigate three different attack variations: a sudden surge in power demand, a sudden surge in power supply, and a switching attack. To support our claims, we showcase using a real-world example how an adversary can compromise an EVCS and create a traffic bottleneck by tampering with the charging schedules of EVs. Further, we perform a simulation-based study of the impact of our proposed attack variations on the WSCC 9 bus system. Our simulations show that an adversary can cause devastating effects on the power grid, which might result in blackout and cascading failure, by compromising a small number of EVCSs.

