The protein domains of vertebrate species in which selection is more effective have greater intrinsic structural disorder

Abstract The effectiveness of selection varies among species. It is often estimated by means of an “effective population size” based on neutral polymorphism, but this is confounded in complex ways with demography. The strength of codon bias more directly pertains to how well adaptation at many sites can be maintained in the face of deleterious mutations, but past metrics that compare codon bias across species are confounded by among-species variation in %GC content and/or amino acid composition. Here we propose a new Codon Adaptation Index of Species (CAIS) that corrects for both confounders. Unlike previous metrics, CAIS yields the expected relationship with adult vertebrate body mass. As an example of the use of CAIS, we ask whether protein domains evolve lower intrinsic structural disorder (ISD) when present in more exquisitely adapted species, as expected given that ISD is higher in eukaryotic proteomes than prokaryotic proteomes. Using phylogenetically corrected linear models, we find, contrary to expectations, that the ISD of a given protein domain evolves to be higher when in well-adapted species. This effect is stronger in young protein domains but is also present in ancient domains.
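To make the confounding by amino acid composition concrete, here is a minimal sketch of relative synonymous codon usage (RSCU), a standard within-amino-acid normalization. It is not the CAIS metric proposed in the abstract, whose definition is not given here, and it does not correct for %GC content. The function name `rscu` and the example sequence are illustrative assumptions.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: NCBI table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """Return RSCU per codon for the amino acids seen in an in-frame coding sequence.

    RSCU = observed codon count / mean count over that amino acid's synonymous codons.
    Stop codons and codons containing ambiguous bases are ignored.
    """
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")

    # Group the synonymous codons of each amino acid.
    synonyms = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            synonyms.setdefault(aa, []).append(codon)

    result = {}
    for aa, family in synonyms.items():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in family:
            result[c] = counts[c] * len(family) / total
    return result


if __name__ == "__main__":
    # Four alanine codons: GCT twice, GCC once, GCA once, GCG never.
    print(rscu("GCTGCTGCCGCA"))
```

Because each value is normalized within one amino acid's codon family, changing how often alanine occurs does not change its RSCU values; differences in base composition still would, which is the second confounder the abstract refers to.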


Genome Sequence of the Asian Honeybee in Pakistan Sheds Light on Its Phylogenetic Relationship with Other Honeybees

Insects ◽

10.3390/insects12070652 ◽

2021 ◽

Vol 12 (7) ◽

pp. 652

Author(s):

Hongwei Tan ◽

Muhammad Naeem ◽

Hussain Ali ◽

Muhammad Shakeel ◽

Haiou Kuang ◽

...

Keyword(s):

Phylogenetic Relationship ◽

Genome Sequence ◽

Apis Cerana ◽

Gc Content ◽

Protein Domain ◽

Pollination Services ◽

Protein Coding ◽

Close Relationship ◽

Genome Scale ◽

Asian Honeybee

In Pakistan, Apis cerana, the Asian honeybee, has been used for honey production and pollination services. However, its genomic makeup and phylogenetic relationship with those in other countries are still unknown. We collected A. cerana samples from the main cerana-keeping region in Pakistan and performed whole genome sequencing. A total of 28 Gb of Illumina shotgun reads were generated, which were used to assemble the genome. The obtained genome assembly had a total length of 214 Mb, with a GC content of 32.77%. The assembly had a scaffold N50 of 2.85 Mb and a BUSCO completeness score of 99%, suggesting a remarkably complete genome sequence for A. cerana in Pakistan. A MAKER pipeline was employed to annotate the genome sequence, and a total of 11,864 protein-coding genes were identified. Of them, 6750 genes were assigned at least one GO term, and 8813 genes were annotated with at least one protein domain. Genome-scale phylogeny analysis indicated an unexpectedly close relationship between A. cerana in Pakistan and those in China, suggesting a potential human introduction of the species between the two countries. Our results will facilitate the genetic improvement and conservation of A. cerana in Pakistan.
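The two assembly statistics quoted above, GC content and scaffold N50, can be computed as in the following minimal sketch. The function names and toy inputs are illustrative assumptions and are not taken from the study's pipeline.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def n50(lengths) -> int:
    """Length L such that scaffolds of length >= L cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


if __name__ == "__main__":
    print(gc_content("GGCCAATT"))
    print(n50([2, 3, 4, 5, 6]))
```

Ambiguous bases such as N are excluded from the GC denominator here; some tools divide by total sequence length instead, so reported percentages can differ slightly between pipelines.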


The genomic landscape of recombination rate variation in Chlamydomonas reinhardtii reveals a pronounced effect of linked selection

10.1101/340992 ◽

2018 ◽

Cited By ~ 1

Author(s):

Ahmed R. Hasan ◽

Rob W. Ness

Keyword(s):

Chlamydomonas Reinhardtii ◽

Recombination Rate ◽

Natural Populations ◽

Gc Content ◽

Sexual Cycle ◽

Rate Variation ◽

Effective Population ◽

Entire Genome ◽

Recombination Rate Variation ◽

Linked Selection

Abstract Recombination confers a major evolutionary advantage by breaking up linkage disequilibrium (LD) between harmful and beneficial mutations and facilitating selection. Here, we use genome-wide patterns of LD to infer fine-scale recombination rate variation in the genome of the model green alga Chlamydomonas reinhardtii and estimate rates of LD decay across the entire genome. We observe recombination rate variation of up to two orders of magnitude, finding evidence of recombination hotspots playing a role in the genome. Recombination rate is highest just upstream of genic regions, suggesting the preferential targeting of recombination breakpoints in promoter regions. Furthermore, we observe a positive correlation between GC content and recombination rate, suggesting a role for GC-biased gene conversion or selection on base composition within the GC-rich genome of C. reinhardtii. We also find a positive relationship between nucleotide diversity and recombination, consistent with widespread influence of linked selection in the genome. Finally, we use estimates of the effective rate of recombination to calculate the rate of sex that occurs in natural populations of this important model microbe, estimating a sexual cycle roughly every 770 generations. We argue that the relatively low rate of sex and the large effective population size create a population genetic environment that increases the influence of linked selection on the genome.


Selection intensity for codon bias.

Genetics ◽

10.1093/genetics/138.1.227 ◽

1994 ◽

Vol 138 (1) ◽

pp. 227-234 ◽

Cited By ~ 7

Author(s):

D L Hartl ◽

E N Moriyama ◽

S A Sawyer

Keyword(s):

Amino Acids ◽

Dna Sequences ◽

Codon Bias ◽

Enteric Bacteria ◽

Average Intensity ◽

Ancestral State ◽

Selection Intensity ◽

Coding Region ◽

Effective Population ◽

Synonymous Codons

Abstract The patterns of nonrandom usage of synonymous codons (codon bias) in enteric bacteria were analyzed. Poisson random field (PRF) theory was used to derive the expected distribution of frequencies of nucleotides differing from the ancestral state at aligned sites in a set of DNA sequences. This distribution was applied to synonymous nucleotide polymorphisms and amino acid polymorphisms in the gnd and putP genes of Escherichia coli. For the gnd gene, the average intensity of selection against disfavored synonymous codons was estimated as approximately 7.3 × 10⁻⁹; this value is significantly smaller than the estimated selection intensity against selectively disfavored amino acids in observed polymorphisms (2.0 × 10⁻⁸), but it is approximately of the same order of magnitude. The selection coefficients for optimal synonymous codons estimated from PRF theory were consistent with independent estimates based on codon usage for threonine and glycine. Across 118 genes in E. coli and Salmonella typhimurium, the distribution of estimated selection coefficients, expressed as multiples of the effective population size, has a mean and standard deviation of 0.5 ± 0.4. No significant differences were found in the degree of codon bias between conserved positions and replacement positions, suggesting that translational misincorporation is not an important selective constraint among synonymous polymorphic codons in enteric bacteria. However, across the first 100 codons of the genes, conserved amino acids with identical codons have significantly greater codon bias than that of either synonymous or nonidentical codons, suggesting that there are unique selective constraints, perhaps including mRNA secondary structures, in this part of the coding region.


Characterization of blast resistance related protein domains in wheat for molecular marker development

Journal of the Bangladesh Agricultural University ◽

10.3329/jbau.v17i2.41939 ◽

2019 ◽

Vol 17 (2) ◽

pp. 161-171

Author(s):

M. Thoihidul Islam ◽

Mohammad Rashid Arif ◽

Arif Hasan Khan Robin

Keyword(s):

Disease Resistance ◽

Blast Resistance ◽

Protein Domains ◽

Protein Domain ◽

Leucine Rich Repeat ◽

Related Protein ◽

Disease Resistance Protein ◽

Protein Encoding ◽

Resistance Protein ◽

Lrr Protein

Wheat blast is a devastating disease that has baffled scientists since it first emerged. This study characterized the blast resistance related protein domains with a view to developing molecular markers that identify wheat genotypes resistant to the blast fungus Magnaporthe oryzae. A genome browser analysis indicated that candidate blast resistance genes could be located on several different chromosomes. An in silico analysis was conducted on fifty nucleotide-binding site leucine-rich repeat (NBS-LRR), leucine-rich repeat (LRR), pathogenesis and resistance protein-encoding accessions selected on the basis of previous resistance reports. The phylogenetic tree of those putative resistance accessions, bearing resistance related protein-encoding domains, showed that an NBS-LRR accession JP957107.1 has 67% similarity with the disease resistance protein domain encoding accession of Brazilian resistant cultivar Thatcher. By contrast, the rice blast resistance Pita gene has 72% similarity with 18 pathogenesis protein domain encoding accessions. Among putative protein domains, disease resistance protein of Thatcher has 78% similarity with two NBS-LRR protein domains AAZ99757.1 and AAZ99757.1. Eighteen microsatellite markers were designed from eighteen putative NBS-LRR protein encoding accessions along with Piz3 marker. The 19 markers were unable to separate resistant and susceptible genotypes. Diffused versus conspicuous bands indicated the presence of either an insertion/deletion (InDel) or a single nucleotide polymorphism (SNP) among wheat genotypes. Detection of InDel or SNP markers is a subject for further investigation. Additional markers need to be designed using new NBS-LRR, pathogenesis, coiled-coil (CC), translocated intimin receptor (TIR) resistance protein encoding accessions to find markers specific for blast resistance. J. Bangladesh Agril. Univ. 17(2): 161–171, June 2019


The architectural design of networks of protein domain architectures

Biology Letters ◽

10.1098/rsbl.2013.0268 ◽

2013 ◽

Vol 9 (4) ◽

pp. 20130268 ◽

Cited By ~ 4

Author(s):

Chia-Hsin Hsu ◽

Chien-Kuo Chen ◽

Ming-Jing Hwang

Keyword(s):

Architectural Design ◽

Molecular Form ◽

Protein Domains ◽

Single Domain ◽

Protein Domain ◽

Protein Architecture ◽

Different Types ◽

Protein Functions ◽

Shed Light ◽

Multiple Domain

Protein domain architectures (PDAs), in which single domains are linked to form multiple-domain proteins, are a major molecular form used by evolution for the diversification of protein functions. However, the design principles of PDAs remain largely uninvestigated. In this study, we constructed networks to connect domain architectures that had grown out from the same single domain for every single domain in the Pfam-A database and found that there are three main distinctive types of these networks, which suggests that evolution can exploit PDAs in three different ways. Further analysis showed that these three different types of PDA networks are each adopted by different types of protein domains, although many networks exhibit the characteristics of more than one of the three types. Our results shed light on nature's blueprint for protein architecture and provide a framework for understanding architectural design from a network perspective.


Status of the Indus River dolphin Platanista minor

Oryx ◽

10.1046/j.1365-3008.1998.00016.x ◽

1998 ◽

Vol 32 (1) ◽

pp. 35-44 ◽

Cited By ~ 6

Author(s):

Randall R. Reeves ◽

Abdul Aleem Chaudhry

Keyword(s):

Biological Diversity ◽

Industrial Effluent ◽

River System ◽

Agricultural Runoff ◽

Water Levels ◽

Rapid Decline ◽

Indus River ◽

Effective Population ◽

Urban Sewage ◽

The Face

The endemic freshwater dolphins in the Indus River system of Pakistan, Platanista minor, have been considered endangered since the early 1970s. Measures taken to protect them from deliberate capture seem to have stopped a rapid decline, and combined counts in Sindh and Punjab provinces since the early 1980s suggest a total population of at least a few hundred animals. Severe problems remain, however. In addition to the risks inherent to any species with an effective population size in the low hundreds (at most), these dolphins are subject to long-term threats associated with living in an artificially controlled waterway used intensively by humans. Irrigation barrages partition the aggregate population into discrete subpopulations for much of the year. Dolphins that ‘escape’ during the flood season into irrigation canals or into reaches downstream of barrages where winter water levels are low have little chance of survival. A few dolphins probably die each year after being caught in fishing nets. Pollution by untreated urban sewage, agricultural runoff and industrial effluent threatens the health of the entire Indus system. The future of this dolphin species depends on Pakistan's commitment to protecting biological diversity in the face of escalating human demands on dwindling resources.


Effective Population Size Predicts Local Rates but Not Local Mitigation of Read-through Errors

Molecular Biology and Evolution ◽

10.1093/molbev/msaa210 ◽

2020 ◽

Vol 38 (1) ◽

pp. 244-262

Author(s):

Alexander T Ho ◽

Laurence D Hurst

Keyword(s):

Population Size ◽

Effective Population Size ◽

Stop Codon ◽

Neutral Theory ◽

Error Rates ◽

Species Variation ◽

Effective Population ◽

Weak Selection ◽

Error Mitigation ◽

Good Proxy

Abstract In correctly predicting that selection efficiency is positively correlated with the effective population size (Ne), the nearly neutral theory provides a coherent understanding of between-species variation in numerous genomic parameters, including heritable error (germline mutation) rates. Does the same theory also explain variation in phenotypic error rates and in abundance of error mitigation mechanisms? Translational read-through provides a model to investigate both issues as it is common, mostly nonadaptive, and has a good proxy for rate (TAA being the least leaky stop codon) and potential error mitigation via “fail-safe” 3′ additional stop codons (ASCs). Prior theory of translational read-through has suggested that when population sizes are high, weak selection for local mitigation can be effective, thus predicting a positive correlation between ASC enrichment and Ne. Contrary to prediction, we find that ASC enrichment is not correlated with Ne. ASC enrichment, although highly phylogenetically patchy, is, however, more common both in unicellular species and in genes expressed in unicellular modes in multicellular species. By contrast, Ne does positively correlate with TAA enrichment. These results imply that local phenotypic error rates, not local mitigation rates, are consistent with a drift barrier/nearly neutral model.


Differences in codon bias and GC content contribute to the balanced expression of TLR7 and TLR9

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1518976113 ◽

2016 ◽

Vol 113 (10) ◽

pp. E1362-E1371 ◽

Cited By ~ 48

Author(s):

Zachary R. Newman ◽

Janet M. Young ◽

Nicholas T. Ingolia ◽

Gregory M. Barton

Keyword(s):

Nucleic Acids ◽

Codon Bias ◽

Codon Optimization ◽

Immune Surveillance ◽

Gc Content ◽

Surveillance Strategy ◽

Protein Levels ◽

Low Levels ◽

The Cost ◽

Control Of Expression

The innate immune system detects diverse microbial species with a limited repertoire of immune receptors that recognize nucleic acids. The cost of this immune surveillance strategy is the potential for inappropriate recognition of self-derived nucleic acids and subsequent autoimmune disease. The relative expression of two closely related receptors, Toll-like receptor (TLR) 7 and TLR9, is balanced to allow recognition of microbial nucleic acids while limiting recognition of self-derived nucleic acids. Situations that tilt this balance toward TLR7 promote inappropriate responses, including autoimmunity; therefore, tight control of expression is critical for proper homeostasis. Here we report that differences in codon bias limit TLR7 expression relative to TLR9. Codon optimization of Tlr7 increases protein levels as well as responses to ligands, but, unexpectedly, these changes only modestly affect translation. Instead, we find that much of the benefit attributed to codon optimization is actually the result of enhanced transcription. Our findings, together with other recent examples, challenge the dogma that codon optimization primarily increases translation. We propose that suboptimal codon bias, which correlates with low guanine-cytosine (GC) content, limits transcription of certain genes. This mechanism may establish low levels of proteins whose overexpression leads to particularly deleterious effects, such as TLR7.


Protein Domain Hotspots Reveal Functional Mutations across Genes in Cancer

10.1101/015719 ◽

2015 ◽

Author(s):

Martin L Miller ◽

Ed Reznik ◽

Nicholas P Gauthier ◽

Bülent Arman Aksoy ◽

Anil Korkut ◽

...

Keyword(s):

Statistical Power ◽

Cancer Genomics ◽

Protein Domains ◽

Rapid Expansion ◽

Protein Domain ◽

Cancer Genes ◽

Sequence Alignments ◽

Hotspot Analysis ◽

Functional Interpretation ◽

The Impact

In cancer genomics, frequent recurrence of mutations in independent tumor samples is a strong indication of functional impact. However, rare functional mutations can escape detection by recurrence analysis for lack of statistical power. We address this problem by extending the notion of recurrence of mutations from single genes to gene families that share homologous protein domains. In addition to lowering the threshold of detection, this sharpens the functional interpretation of the impact of mutations, as protein domains more succinctly embody function than entire genes. Mapping mutations in 22 different tumor types to equivalent positions in multiple sequence alignments of protein domains, we confirm well-known functional mutation hotspots and make two types of discoveries: 1) identification and functional interpretation of uncharacterized rare variants in one gene that are equivalent to well-characterized mutations in canonical cancer genes, such as uncharacterized ERBB4 (S303F) mutations that are analogous to canonical ERBB2 (S310F) mutations in the furin-like domain, and 2) detection of previously unknown mutation hotspots with novel functional implications. With the rapid expansion of cancer genomics projects, protein domain hotspot analysis is likely to provide many more leads linking mutations in proteins to the cancer phenotype.


Selection in a growing bacterial/yeast colony biases results of mutation accumulation experiments

10.1101/2021.04.12.439444 ◽

2021 ◽

Author(s):

Anjali Mahilkar ◽

Sharvari Kemkar ◽

Supreet Saini

Keyword(s):

Natural Selection ◽

Population Size ◽

Effective Population Size ◽

Colony Growth ◽

Mutation Accumulation ◽

Rate Of Change ◽

Raw Material ◽

Effective Population ◽

The Face ◽

Mean Fitness

Abstract Mutations provide the raw material for natural selection to act. Therefore, understanding the variety and relative frequency of different types of mutations is critical to understanding the nature of genetic diversity in a population. Mutation accumulation (MA) experiments have been used in this context to estimate parameters defining mutation rates, distribution of fitness effects (DFE), and spectrum of mutations. MA experiments performed with organisms such as Drosophila have an effective population size of one. However, in MA experiments with bacteria and yeast, a single founder is allowed to grow to the size of a colony (~10⁸). The effective population size in these experiments is of the order of 10. In this scenario, it is assumed that natural selection plays a minimal role in dictating the dynamics of colony growth, and therefore the MA experiment, but this assumption has not been tested explicitly. In this work, we simulate colony growth and perform an MA experiment, and demonstrate that selection ensures that, in an MA experiment, the fraction of all mutations that are beneficial is overrepresented by a factor greater than two. The DFE of beneficial and deleterious mutations are accurately captured in an MA experiment. We show that the effect of selection in a growing colony varies non-monotonically and that, in the face of natural selection dictating an MA experiment, estimating the mutation rate of an organism is not trivial. We perform experiments with 160 MA lines of E. coli, and demonstrate that the rate of change of mean fitness is a non-monotonic function of the colony size, and that selection acts differently in different sectors of a growing colony. Overall, we demonstrate that the results of MA experiments need to be revisited taking into account the action of selection in a growing colony.
