The deep(er) roots of Eukaryotes and Akaryotes

F1000Research ◽  
2020 ◽  
Vol 9 ◽  
pp. 112
Author(s):  
Ajith Harish ◽  
David Morrison

Background: Locating the root node of the “tree of life” (ToL) is one of the hardest problems in phylogenetics. The root-node or the universal common ancestor (UCA) divides descendants into organismal domains. Two notable variants of the two-domains ToL (2D-ToL) have gained support recently. One 2D-ToL posits that eukaryotes (organisms with nuclei) and akaryotes (organisms without nuclei) are sister clades that diverged from the UCA and that Asgard archaea are sister to other archaea, whereas the other proposes that eukaryotes emerged within archaea and places Asgard archaea sister to eukaryotes. Williams et al. (Nature Ecol. Evol. 4: 138–147; 2020) re-evaluated the data and methods that support the competing two-domains proposals and concluded that eukaryotes are the closest relatives of Asgard archaea. Critique: We argue that important aspects of estimating evolutionary relatedness and assessing phylogenetic signal in empirical data were overlooked. We focus on phylogenetic character reconstructions necessary to describe the UCA or its closest descendants in the absence of reliable fossils. It is well known that different character types present different perspectives on evolutionary history that relate to different phylogenetic depths. Which 2D-ToL is better supported depends on which kind of molecular features – protein-domains or their component amino acids – are better for resolving common ancestors at the roots of clades. In practice, this involves reconstructing character compositions of the ancestral nodes all the way back to the UCA. We believe the criticisms of 2D-ToL focus on superficial aspects of the data and reflect common misunderstandings of phylogenetic reconstructions using protein domains (folds). Clarifications: Models of protein domain evolution support more reliable phylogenetic reconstructions. In contrast, even the best available amino acid substitution models fail to resolve the archaeal radiation, despite employing thousands of genes. Therefore, the primary domains Eukaryotes and Akaryotes are better supported in a 2D-ToL.

F1000Research ◽  
2020 ◽  
Vol 9 ◽  
pp. 112
Author(s):  
Ajith Harish ◽  
David Morrison

Background: Locating the root node of the “tree of life” (ToL) is one of the hardest problems in phylogenetics, given the time depth. The root-node, or the universal common ancestor (UCA), groups descendants into organismal clades/domains. Two notable variants of the two-domains ToL (2D-ToL) have gained support recently. One 2D-ToL posits that eukaryotes (organisms with nuclei) and akaryotes (organisms without nuclei) are sister clades that diverged from the UCA, and that Asgard archaea are sister to other archaea. The other 2D-ToL proposes that eukaryotes emerged from within archaea and places Asgard archaea as sister to eukaryotes. Williams et al. (Nature Ecol. Evol. 4: 138–147; 2020) re-evaluated the data and methods that support the competing two-domains proposals and concluded that eukaryotes are the closest relatives of Asgard archaea. Critique: The poor resolution of the archaea in their analysis, despite employing amino acid alignments from thousands of proteins and the best-fitting substitution models, contradicts their conclusions. We argue that they overlooked important aspects of estimating evolutionary relatedness and assessing phylogenetic signal in empirical data. Which 2D-ToL is better supported depends on which kind of molecular features are better for resolving common ancestors at the roots of clades – protein-domains or their component amino acids. We focus on phylogenetic character reconstructions necessary to describe the UCA or its closest descendants in the absence of reliable fossils. Clarifications: It is well known that different character types present different perspectives on evolutionary history that relate to different phylogenetic depths. We show that protein structural-domains support more reliable phylogenetic reconstructions of deep-diverging clades in the ToL. Accordingly, Eukaryotes and Akaryotes are better supported clades in a 2D-ToL.


2020 ◽  
Author(s):  
Ajith Harish ◽  
David A. Morrison

Abstract: Locating the root-node of the “tree of life” (ToL) is one of the hardest problems in phylogenetics [1]. The root-node or the universal common ancestor (UCA) divides the descendants into organismal domains [2]. Two notable variants of the two-domains ToL (2D-ToL) have gained support recently [3,4], though Williams and colleagues (W&C) [4] claim that one is better supported than the other. Here, we argue that important aspects of estimating evolutionary relatedness and assessing phylogenetic signal in empirical data were overlooked [4]. We focus on phylogenetic character reconstructions necessary to describe the UCA or its closest descendants in the absence of reliable fossils. It is well-known that different character-types present different perspectives on evolutionary history that relate to different phylogenetic depths [5–7]. Which of the 2D-ToL [2,4] hypotheses is better supported depends on which kind of molecular features – protein-domains or their component amino-acids – are better for resolving the common ancestors (CA) at the roots of clades. In practice, this involves reconstructing character compositions of the ancestral nodes all the way back to the UCA [2,3].


2019 ◽  
Vol 17 (2) ◽  
pp. 161-171
Author(s):  
M. Thoihidul Islam ◽  
Mohammad Rashid Arif ◽  
Arif Hasan Khan Robin

Wheat blast is a devastating disease that has baffled scientists since its inception. This study characterized blast resistance-related protein domains with a view to developing molecular markers to identify wheat genotypes resistant to the blast fungus Magnaporthe oryzae. A genome browse analysis detected that the candidate resistance gene against blast could be located on several different chromosomes. An in silico analysis was conducted with fifty nucleotide-binding site leucine-rich repeat (NBS-LRR), leucine-rich repeat (LRR), pathogenesis and resistance protein-encoding accessions, on the basis of previous resistance reports. The phylogenetic tree of those putative resistance accessions, bearing resistance-related protein-encoding domains, showed that an NBS-LRR accession, JP957107.1, has 67% similarity with the disease resistance protein domain-encoding accession of the Brazilian resistant cultivar Thatcher. By contrast, the rice blast resistance Pita gene has 72% similarity with 18 pathogenesis protein domain-encoding accessions. Among putative protein domains, the disease resistance protein of Thatcher has 78% similarity with two NBS-LRR protein domains AAZ99757.1 and AAZ99757.1. Eighteen microsatellite markers were designed from eighteen putative NBS-LRR protein-encoding accessions along with the Piz3 marker. The 19 markers were unable to separate resistant and susceptible genotypes. Diffused versus conspicuous bands indicated the presence of either insertion/deletion (InDel) or single nucleotide polymorphism (SNP) among wheat genotypes. Detection of InDel or SNP markers is a subject for further investigation. Additional markers need to be designed using new NBS-LRR, pathogenesis, coiled-coil (CC), translocated intimin receptor (TIR) resistance protein-encoding accessions to find markers specific for blast resistance. J. Bangladesh Agril. Univ. 17(2): 161–171, June 2019


2013 ◽  
Vol 9 (4) ◽  
pp. 20130268
Author(s):  
Chia-Hsin Hsu ◽  
Chien-Kuo Chen ◽  
Ming-Jing Hwang

Protein domain architectures (PDAs), in which single domains are linked to form multiple-domain proteins, are a major molecular form used by evolution for the diversification of protein functions. However, the design principles of PDAs remain largely uninvestigated. In this study, we constructed networks to connect domain architectures that had grown out from the same single domain for every single domain in the Pfam-A database and found that there are three main distinctive types of these networks, which suggests that evolution can exploit PDAs in three different ways. Further analysis showed that these three different types of PDA networks are each adopted by different types of protein domains, although many networks exhibit the characteristics of more than one of the three types. Our results shed light on nature's blueprint for protein architecture and provide a framework for understanding architectural design from a network perspective.


2019 ◽  
Vol 36 (4) ◽  
pp. 757-765
Author(s):  
Jürgen F H Strassert ◽  
Mahwash Jamy ◽  
Alexander P Mylnikov ◽  
Denis V Tikhonenkov ◽  
Fabien Burki

Abstract: The resolution of the broad-scale tree of eukaryotes is constantly improving, but the evolutionary origin of several major groups remains unknown. Resolving the phylogenetic position of these “orphan” groups is important, especially those that originated early in evolution, because they represent missing evolutionary links between established groups. Telonemia is one such orphan taxon for which little is known. The group is composed of molecularly diverse biflagellated protists, often prevalent although not abundant in aquatic environments. Telonemia has been hypothesized to represent a deeply diverging eukaryotic phylum but no consensus exists as to where it is placed in the tree. Here, we established cultures and report the phylogenomic analyses of three new transcriptome data sets for divergent telonemid lineages. All our phylogenetic reconstructions, based on 248 genes and using site-heterogeneous mixture models, robustly resolve the evolutionary origin of Telonemia as sister to the Sar supergroup. This grouping remains well supported when as few as 60% of the genes are randomly subsampled, thus is not sensitive to the sets of genes used but requires a minimal alignment length to recover enough phylogenetic signal. Telonemia occupies a crucial position in the tree to examine the origin of Sar, one of the most lineage-rich eukaryote supergroups. We propose the moniker “TSAR” to accommodate this new mega-assemblage in the phylogeny of eukaryotes.


2015 ◽  
Author(s):  
Martin L Miller ◽  
Ed Reznik ◽  
Nicholas P Gauthier ◽  
Bülent Arman Aksoy ◽  
Anil Korkut ◽  
...  

In cancer genomics, frequent recurrence of mutations in independent tumor samples is a strong indication of functional impact. However, rare functional mutations can escape detection by recurrence analysis for lack of statistical power. We address this problem by extending the notion of recurrence of mutations from single genes to gene families that share homologous protein domains. In addition to lowering the threshold of detection, this sharpens the functional interpretation of the impact of mutations, as protein domains more succinctly embody function than entire genes. Mapping mutations in 22 different tumor types to equivalent positions in multiple sequence alignments of protein domains, we confirm well-known functional mutation hotspots and make two types of discoveries: 1) identification and functional interpretation of uncharacterized rare variants in one gene that are equivalent to well-characterized mutations in canonical cancer genes, such as uncharacterized ERBB4 (S303F) mutations that are analogous to canonical ERBB2 (S310F) mutations in the furin-like domain, and 2) detection of previously unknown mutation hotspots with novel functional implications. With the rapid expansion of cancer genomics projects, protein domain hotspot analysis is likely to provide many more leads linking mutations in proteins to the cancer phenotype.
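The idea of pooling mutations at equivalent alignment positions can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, gene labels, toy sequences and residue coordinates below are all hypothetical, and the sketch assumes each gene's domain instance is supplied as a gapped row of one multiple sequence alignment.

```python
from collections import Counter

def residue_to_column(aligned_seq, start=1):
    """Map each residue number of a gapped, aligned sequence to its alignment column."""
    mapping = {}
    residue = start
    for column, char in enumerate(aligned_seq):
        if char != "-":  # gaps occupy a column but no residue
            mapping[residue] = column
            residue += 1
    return mapping

def domain_hotspots(alignment, starts, mutations):
    """Tally mutations per alignment column across all genes sharing the domain.

    alignment: gene -> gapped aligned domain sequence (all the same length)
    starts:    gene -> protein residue number of the first aligned residue
    mutations: iterable of (gene, residue_number) pairs
    """
    maps = {g: residue_to_column(s, starts[g]) for g, s in alignment.items()}
    counts = Counter()
    for gene, pos in mutations:
        column = maps.get(gene, {}).get(pos)
        if column is not None:  # skip mutations outside the aligned domain
            counts[column] += 1
    return counts

# Toy, made-up data (hypothetical genes, sequences and coordinates).
alignment = {"GENE_A": "MK-TLC", "GENE_B": "MKQT-C"}
starts = {"GENE_A": 10, "GENE_B": 20}
mutations = [("GENE_A", 12), ("GENE_B", 23), ("GENE_B", 99)]
hotspots = domain_hotspots(alignment, starts, mutations)
```

In this toy input, residue 12 of GENE_A and residue 23 of GENE_B fall in the same alignment column, so two mutations that are each singletons in their own gene are counted together at the domain level.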


2019 ◽  
Author(s):  
Daniel Buchan ◽  
David Jones

Abstract: In this paper, using word2vec, we demonstrate that protein domains may have semantic “meaning” in the context of multi-domain proteins. Word2vec is a group of models which can be used to produce semantically meaningful embeddings of words or tokens in a vector space. In this work we treat multi-domain proteins as “sentences” where domain identifiers are tokens which may be considered as “words”. Using all InterPro (Finn, Attwood et al. 2017) eukaryotic proteins as a corpus of “sentences” we demonstrate that Word2vec creates functionally meaningful embeddings of protein domains. We additionally show how this can be applied to identifying the putative functional roles for Pfam (Finn, Coggill et al. 2016) Domains of Unknown Function.
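The "proteins as sentences" framing can be sketched as follows. This is only an illustration of how a corpus and skip-gram style (target, context) pairs might be built from domain architectures; it does not train an embedding and is not the authors' code. The protein names, domain tokens and window size are hypothetical placeholders.

```python
def skipgram_pairs(sentence, window=2):
    """Return (target, context) token pairs within a symmetric window."""
    pairs = []
    for i, target in enumerate(sentence):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, sentence[j]))
    return pairs

# Hypothetical domain architectures: each protein is an ordered list of domain tokens.
proteins = {
    "protein_1": ["DOM_A", "DOM_B", "DOM_C"],
    "protein_2": ["DOM_B", "DOM_D"],
    "protein_3": ["DOM_A"],  # a single-domain protein provides no context
}

# Keep only multi-domain proteins as "sentences" for the corpus.
corpus = [domains for domains in proteins.values() if len(domains) > 1]
pairs = [pair for sentence in corpus for pair in skipgram_pairs(sentence, window=1)]
```

Pairs of this kind are what a skip-gram model would learn from; domains that repeatedly share neighbours across architectures would be expected to end up with similar vectors.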


2021 ◽  
Author(s):  
Tisham De

Here, I demonstrate that sex determination and sexual dimorphism across the tree of life are deeply related to polyamine biochemistry in cells, especially to the synteny of genes: [SAT1-NR0B1], [SAT2-SHBG] and DMRT1. This synteny was found to be most distinct in mammals. Further, the common protein domain of SAT1 and SAT2, PF00583, was shown to be present in the genome of the last universal common ancestor (LUCA). Protein domain-domain interaction analysis of LUCA's genes suggests the possibility that LUCA had developed an immune defence against viruses. This domain-domain interaction analysis is the first scientific evidence indicating that viruses existed at least 3.5 billion years ago and probably co-existed with LUCA on early Hadean Earth.


2017 ◽  
Author(s):  
Arli A. Parikesit ◽  
Peter F. Stadler ◽  
Sonja J. Prohaska

Abstract: The genomic inventory of protein domains is an important indicator of an organism’s regulatory and metabolic capabilities. Existing gene annotations, however, can be plagued by substantial ascertainment biases that make it difficult to obtain and compare quantitative domain data. We find that quantitative trends across the Eukarya can be investigated based on a combination of gene prediction and standard domain annotation pipelines. Species-specific training is required, however, to account for the genomic peculiarities in many lineages. In contrast to earlier studies we find widespread statistically significant avoidance of protein domains associated with distinct functional high-level gene-ontology terms.
1998 ACM Subject Classification: J.3 Life and Medical Sciences

