Conversion between 100-million-year-old duplicated genes contributes to rice subspecies divergence

2020 ◽  
Author(s):  
Chendan Wei ◽  
Zhenyi Wang ◽  
Jianyu Wang ◽  
Jia Teng ◽  
Shaoqi Shen ◽  
...  

Abstract: Extensive sequence similarity between duplicated gene pairs produced by paleo-polyploidization may result from illegitimate recombination between homologous chromosomes. The genomes of Asian cultivated rice Xian/indica (XI) and Geng/japonica (GJ) have recently been updated, providing new opportunities for investigating on-going gene conversion events. Using comparative genomics and phylogenetic analyses, we evaluated gene conversion rates between duplicated genes produced by polyploidization 100 million years ago (mya) in GJ and XI. At least 5.19%–5.77% of genes duplicated across three genomes were affected by whole-gene conversion after the divergence of GJ and XI at ~0.4 mya, with more (7.77%–9.53%) showing conversion of only gene portions. Independently converted duplicates surviving in genomes of different subspecies often used the same donor genes. On-going gene conversion frequency was higher near chromosome termini, with a single pair of homoeologous chromosomes 11 and 12 in each genome most affected. Notably, on-going gene conversion has maintained similarity between very ancient duplicates, provided opportunities for further gene conversion, and accelerated rice divergence. Chromosome rearrangement after polyploidization may result in gene loss, providing a basis for on-going gene conversion, and may have contributed directly to restricted recombination/conversion between homoeologous regions. Gene conversion affected biological functions associated with multiple genes, such as catalytic activity, implying opportunities for interaction among members of large gene families, such as NBS-LRR disease-resistance genes, resulting in gene conversion.
Duplicated genes in rice subspecies generated by grass polyploidization ~100 mya remain affected by gene conversion at high frequency, with important implications for the divergence of rice subspecies.
One-sentence summary: On-going gene conversion between duplicated genes produced by 100 mya polyploidization contributes to rice subspecies divergence, often involving the same donor genes at chromosome termini.

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Chendan Wei ◽  
Zhenyi Wang ◽  
Jianyu Wang ◽  
Jia Teng ◽  
Shaoqi Shen ◽  
...  

Abstract
Background: Duplicated gene pairs produced by ancient polyploidy maintain high sequence similarity over a long period of time and may result from illegitimate recombination between homoeologous chromosomes. The genomes of Asian cultivated rice Oryza sativa ssp. indica (XI) and Oryza sativa ssp. japonica (GJ) have recently been updated, providing new opportunities for investigating ongoing gene conversion events and their impact on genome evolution.
Results: Using comparative genomics and phylogenetic analyses, we evaluated gene conversion rates between duplicated genes produced by polyploidization 100 million years ago (mya) in GJ and XI. At least 5.19–5.77% of genes duplicated across the three rice genomes were affected by whole-gene conversion after the divergence of GJ and XI at ~0.4 mya, with more (7.77–9.53%) showing conversion of only portions of genes. Independently converted duplicates surviving in the genomes of different subspecies often use the same donor genes. The ongoing gene conversion frequency was higher near chromosome termini, with a single pair of homoeologous chromosomes, 11 and 12, in each rice genome being most affected. Notably, ongoing gene conversion has maintained similarity between very ancient duplicates, provided opportunities for further gene conversion, and accelerated rice divergence. Chromosome rearrangements after polyploidization are associated with ongoing gene conversion events, and they directly restrict recombination and inhibit duplicated gene conversion between homoeologous regions. Furthermore, we found that the converted genes tended to have more similar expression patterns than nonconverted duplicates. Gene conversion affects biological functions associated with multiple genes, such as catalytic activity, implying opportunities for interaction among members of large gene families, such as NBS-LRR disease-resistance genes, contributing to the occurrence of the gene conversion.
Conclusion: Duplicated genes in rice subspecies generated by grass polyploidization ~100 mya remain affected by gene conversion at high frequency, with important implications for the divergence of rice subspecies.


2016 ◽  
Vol 113 (48) ◽  
pp. 13815-13820 ◽  
Author(s):  
Mi Ok Lee ◽  
Susanne Bornelöv ◽  
Leif Andersson ◽  
Susan J. Lamont ◽  
Junfeng Chen ◽  
...  

Defensins constitute an evolutionarily conserved family of cationic antimicrobial peptides that play a key role in host innate immune responses to infection. Defensin genes generally reside in complex genomic regions that are prone to structural variation, and defensin genes exhibit extensive copy number variation in humans and in other species. Copy number variation of defensin genes was examined in inbred lines of Leghorn and Fayoumi chickens, and a duplication of defensin7 was discovered in the Fayoumi breed. Analysis of junction sequences confirmed the occurrence of a simple tandem duplication of defensin7 with sequence identity at the junction, suggesting nonallelic homologous recombination between defensin7 and defensin6. The duplication event generated two chimeric promoters that are best explained by gene conversion followed by homologous recombination. Expression of defensin7 was not elevated in animals with two genes despite both genes being transcribed in the tissues examined. Computational prediction of promoter regions revealed the presence of several putative transcription factor binding sites generated by the duplication event. These data provide insight into the evolution and possible function of large gene families and specifically, the defensins.


2021 ◽  
Author(s):  
Christina N. Hodson ◽  
Kamil S. Jaron ◽  
Susan Gerbi ◽  
Laura Ross

Abstract: Germline restricted DNA has evolved in diverse animal taxa, and is found in several vertebrate clades, nematodes, and flies. In these lineages, either portions of chromosomes or entire chromosomes are eliminated from somatic cells early in development, restricting portions of the genome to the germline. Little is known about why germline restricted DNA has evolved, especially in flies, in which three diverse families, Chironomidae, Cecidomyiidae, and Sciaridae exhibit germline restricted chromosomes (GRCs). We conducted a genomic analysis of germline restricted chromosomes in the fungus gnat Bradysia (Sciara) coprophila (Diptera: Sciaridae), which carries two large germline restricted “L” chromosomes. We sequenced and assembled the genome of B. coprophila, and used differences in sequence coverage and k-mer frequency between somatic and germ tissues to identify GRC sequence and compare it to the other chromosomes in the genome. We found that the GRCs in B. coprophila are large, gene-rich, and have many genes with paralogs on other chromosomes in the genome. We also found that the GRC genes are extraordinarily divergent from their paralogs, and have sequence similarity to another Dipteran family (Cecidomyiidae) in phylogenetic analyses, suggesting that these chromosomes have arisen in Sciaridae through introgression from a related lineage. These results suggest that the GRCs may have evolved through an ancient hybridization event, raising questions about how this may have occurred, how these chromosomes became restricted to the germline after introgression, and why they were retained over time.


2008 ◽  
Vol 2008 ◽  
pp. 1-11 ◽  
Author(s):  
Deng Pan ◽  
Liqing Zhang

Tandemly arrayed genes (TAGs) are duplicated genes that are linked as neighbors on a chromosome, many of which have important physiological and biochemical functions. Here we performed a survey of these genes in 11 available vertebrate genomes. TAGs account for an average of about 14% of all genes in these vertebrate genomes, and about 25% of all duplications. The majority of TAGs (72–94%) have parallel transcription orientation (i.e., they are encoded on the same strand) in contrast to the genome, which has about 50% of its genes in parallel transcription orientation. The majority of tandem arrays have only two members. In all species, the proportion of genes that belong to TAGs tends to be higher in large gene families than in small ones; together with our recent finding that tandem duplication played a more important role than retroposition in large families, this fact suggests that among all types of duplication mechanisms, tandem duplication is the predominant mechanism of duplication, especially in large families. Finally, several species have a higher proportion of large tandem arrays that are species-specific than random expectation.
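
The two definitions in the abstract above (TAGs as neighboring duplicates, and parallel orientation as being encoded on the same strand) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Gene` record, the `family` label (assumed to come from some earlier homology clustering), and the rule that only immediately adjacent genes count as tandem (no intervening spacer genes) are all assumptions made for illustration, and the toy gene list is invented.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int
    strand: str  # "+" or "-"
    family: str  # hypothetical family label from a prior homology clustering


def tandem_arrays(genes):
    """Return runs of two or more adjacent same-family genes per chromosome."""
    arrays = []
    ordered = sorted(genes, key=lambda g: (g.chrom, g.start))
    for _, chrom_genes in groupby(ordered, key=lambda g: g.chrom):
        current = []
        for gene in chrom_genes:
            if current and gene.family == current[-1].family:
                current.append(gene)  # extends the current run
            else:
                if len(current) >= 2:
                    arrays.append(current)
                current = [gene]  # start a new candidate run
        if len(current) >= 2:
            arrays.append(current)
    return arrays


def parallel_fraction(arrays):
    """Fraction of neighboring pairs within arrays that share a strand."""
    pairs = [(a, b) for arr in arrays for a, b in zip(arr, arr[1:])]
    if not pairs:
        return 0.0
    return sum(a.strand == b.strand for a, b in pairs) / len(pairs)


# Invented toy annotation, for illustration only.
genes = [
    Gene("g1", "chr1", 100, "+", "famA"),
    Gene("g2", "chr1", 200, "+", "famA"),
    Gene("g3", "chr1", 300, "-", "famA"),
    Gene("g4", "chr1", 400, "+", "famB"),
    Gene("g5", "chr2", 100, "-", "famC"),
    Gene("g6", "chr2", 200, "-", "famC"),
]
arrays = tandem_arrays(genes)
fraction = parallel_fraction(arrays)
```

On this toy input the sketch finds one three-member array on chr1 and one two-member array on chr2, and two of the three neighboring pairs share a strand. A real survey would also need an explicit family-assignment step and a decision on how many spacer genes to tolerate, neither of which is specified in the abstract.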


2017 ◽  
Author(s):  
Abigail J. Moore ◽  
Jurriaan M. de Vos ◽  
Lillian P. Hancock ◽  
Eric Goolsby ◽  
Erika J. Edwards

Abstract: Hybrid enrichment is an increasingly popular approach for obtaining hundreds of loci for phylogenetic analysis across many taxa quickly and cheaply. The genes targeted for sequencing are typically single-copy loci, which facilitate a more straightforward sequence assembly and homology assignment process. However, single copy loci are relatively uncommon elements of most genomes, and as such may provide a biased evolutionary history. Furthermore, this approach limits the inclusion of most genes of functional interest, which often belong to multi-gene families. Here we demonstrate the feasibility of including large gene families in hybrid enrichment protocols for phylogeny reconstruction and subsequent analyses of molecular evolution, using a new set of bait sequences designed for the “portullugo” (Caryophyllales), a moderately sized lineage of flowering plants (~2200 species) that includes the cacti and harbors many evolutionary transitions to C4 and CAM photosynthesis. Including multi-gene families allowed us to simultaneously infer a robust phylogeny and construct a dense sampling of sequences for a major enzyme of C4 and CAM photosynthesis, which revealed the accumulation of adaptive amino acid substitutions associated with C4 and CAM origins in particular paralogs. Our final set of matrices for phylogenetic analyses included 75–218 loci across 74 taxa, with ~50% matrix completeness across datasets. Phylogenetic resolution was greatly improved across the tree, at both shallow and deep levels. Concatenation and coalescent-based approaches both resolve with strong support the sister lineage of the cacti: Anacampserotaceae + Portulacaceae, two lineages of mostly diminutive succulent herbs of warm, arid regions. In spite of this congruence, BUCKy concordance analyses demonstrated strong and conflicting signals across gene trees for the resolution of the sister group of the cacti.
Our results add to the growing number of examples illustrating the complexity of phylogenetic signals in genomic-scale data.


Genes ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 1944
Author(s):  
Shaoqi Shen ◽  
Yuxian Li ◽  
Jianyu Wang ◽  
Chendan Wei ◽  
Zhenyi Wang ◽  
...  

The peanut (Arachis hypogaea L.) is the leading oil and food crop in the legume family. The high sequence similarity of the many duplicate gene pairs generated by recursive polyploidizations could result from gene conversion, caused by illegitimate DNA recombination. Here, through synteny-based comparisons of two diploid and three tetraploid peanut genomes, we identified the duplicated genes generated from legume common tetraploidy (LCT) and peanut recent allo-tetraploidy (PRT) within genomes. In each peanut genome (or subgenome), we inferred that 6.8–13.1% of LCT-related and 11.3–16.5% of PRT-related duplicates were affected by gene conversion; the LCT-related duplicates were most affected by partial gene conversion, whereas the PRT-related duplicates were most affected by whole gene conversion. Notably, we observed that conversion between duplicates, a long-lasting consequence of polyploidization, accelerated the divergence of different Arachis genomes. Moreover, we found that the converted duplicates are unevenly distributed across the chromosomes and are more often near the ends of the chromosomes in each genome. We also confirmed that well-preserved homoeologous chromosome regions may facilitate duplicates’ conversion. In addition, we found that certain biological functions, such as catalytic activity, contain a higher number of preferentially converted genes. We identified specific domains that are involved in converted genes, implying that conversions are associated with important traits of peanut growth and development.


2019 ◽  
Author(s):  
Joel Vizueta ◽  
Alejandro Sánchez-Gracia ◽  
Julio Rozas

Abstract: Gene annotation is a critical bottleneck in genomic research, especially for the comprehensive study of very large gene families in the genomes of non-model organisms. Despite the recent progress in automatic methods, the tools developed for this task often produce inaccurate annotations, such as fused, chimeric, partial or even completely absent gene models for many family copies, which require considerable extra effort to amend. Here we present BITACORA, a bioinformatics solution that integrates sequence similarity search tools and Perl scripts to facilitate both the curation of these inaccurate annotations and the identification of previously undetected gene family copies directly from DNA sequences. We tested the performance of the BITACORA pipeline in annotating the members of two chemosensory gene families of different sizes in seven available chelicerate genome drafts. Despite the relatively high fragmentation of some of these drafts, BITACORA was able to improve the annotation of many members of these families and detected thousands of new chemoreceptors encoded in genome sequences. The program generates an output file in general feature format (GFF), with both curated and novel gene models, and a FASTA file with the predicted proteins. These outputs can be easily integrated in genomic annotation editors, greatly facilitating subsequent manual annotation and downstream evolutionary analyses.


2021 ◽  
Author(s):  
Alberto Cenci ◽  
Mairenys Concepción-Hernández ◽  
Geert Angenon ◽  
Mathieu Rouard

GDSL-type esterase/lipase (GELP) enzymes have multiple functions in plants, spanning from developmental processes to the response to biotic and abiotic stresses. Genes encoding GELP belong to a large gene family with several tens to more than a hundred members per species in angiosperms. Here, we applied iterative phylogenetic analyses to identify 10 main clusters subdivided into 44 expert-curated reference orthogroups (OGs) using three monocot and five dicot genomes. Our results show that some GELP OGs expanded while others were maintained as single copy genes. This semi-automatic approach proves to be effective to characterize large gene families and provides a solid classification framework for the GELP members in angiosperms. The orthogroup-based reference will be useful to perform comparative studies, infer gene functions and better understand the evolutionary history of this gene family.


2010 ◽  
Vol 60 (11) ◽  
pp. 2535-2539 ◽  
Author(s):  
Hui-Rong Li ◽  
Yong Yu ◽  
Wei Luo ◽  
Yin-Xin Zeng

Strain ZS314T was isolated from a sandy intertidal sediment sample collected from the coastal area off the Chinese Antarctic Zhongshan Station, east Antarctica (69° 22′ 13″ S 76° 21′ 41″ E). The cells were Gram-positive, motile, short rods. The temperature range for growth was 0–26 °C and the pH for growth ranged from 5 to 10, with optimum growth occurring within the temperature range 18–23 °C and pH range 6.0–8.0. Growth occurred in the presence of 0–6 % (w/v) NaCl, with optimum growth occurring in the presence of 2–4 % (w/v) NaCl. Strain ZS314T had MK-10 as the major menaquinone and anteiso-C15 : 0, iso-C16 : 0 and anteiso-C17 : 0 as major fatty acids. The cell-wall peptidoglycan type was B2β with ornithine as the diagnostic diamino acid. The major polar lipids were diphosphatidylglycerol and phosphatidylglycerol. The genomic DNA G+C content was approximately 67 mol%. Phylogenetic analysis based on 16S rRNA gene sequence similarity showed that strain ZS314T represents a new lineage in the family Microbacteriaceae. On the basis of the phylogenetic analyses and phenotypic characteristics, a new genus, namely Marisediminicola gen. nov., is proposed, harbouring the novel species Marisediminicola antarctica sp. nov. with the type strain ZS314T (=DSM 22350T =CCTCC AB 209077T).

