Mining EST databases to resolve evolutionary events in major crop species

Genome ◽  
2004 ◽  
Vol 47 (5) ◽  
pp. 868-876 ◽  
Author(s):  
Jessica A Schlueter ◽  
Phillip Dixon ◽  
Cheryl Granger ◽  
David Grant ◽  
Lynn Clark ◽  
...  

Using plant EST collections, we obtained 1392 potential gene duplicates across 8 plant species: Zea mays, Oryza sativa, Sorghum bicolor, Hordeum vulgare, Solanum tuberosum, Lycopersicon esculentum, Medicago truncatula, and Glycine max. We estimated the synonymous and nonsynonymous distances between each gene pair and identified two to three mixtures of normal distributions corresponding to one to three rounds of genome duplication in each species. Within the Poaceae, we found a conserved duplication event among all four species that occurred approximately 50–60 million years ago (Mya), an event that probably predates the major radiation of the grasses. In the Solanaceae, we found evidence for a conserved duplication event approximately 50–52 Mya. A duplication in soybean occurred approximately 44 Mya and a duplication in Medicago about 58 Mya. Comparing synonymous and nonsynonymous distances allowed us to determine that most duplicate gene pairs are under purifying (negative) selection. We calculated Pearson's correlation coefficients to measure how gene expression patterns have changed between duplicate pairs, and compared this across evolutionary distances. This analysis showed that some duplicates seemed to retain expression patterns between pairs, whereas others showed uncorrelated expression.

Key words: genome evolution, polyploidy, genome duplication, expressed sequence tag.
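The two per-pair measurements described in this abstract, the comparison of nonsynonymous to synonymous distance and the Pearson correlation of expression profiles, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the tolerance `tol`, and the example numbers are all assumptions made for illustration, and the sketch uses the common convention that a nonsynonymous-to-synonymous ratio below 1 indicates purifying selection.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def selection_regime(ka, ks, tol=0.05):
    """Classify a duplicate pair by its Ka/Ks ratio.

    ka: nonsynonymous distance, ks: synonymous distance.
    tol is a hypothetical band around 1 treated as 'neutral'.
    """
    if ks <= 0:
        raise ValueError("synonymous distance must be positive")
    ratio = ka / ks
    if ratio < 1 - tol:
        return "purifying"
    if ratio > 1 + tol:
        return "positive"
    return "neutral"


# Hypothetical EST counts per library for the two copies of a duplicate pair.
copy_a = [1, 2, 3, 4]
copy_b = [2, 4, 6, 8]
r = pearson(copy_a, copy_b)              # copies with proportional profiles
regime = selection_regime(ka=0.05, ks=0.5)
print(f"r = {r:.2f}, regime = {regime}")
```

A value of `r` near 1 corresponds to a pair that has retained its expression pattern, while a value near 0 corresponds to the uncorrelated pairs the abstract mentions.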

Development ◽  
2001 ◽  
Vol 128 (13) ◽  
pp. 2471-2484 ◽  
Author(s):  
James M. McClintock ◽  
Robin Carlson ◽  
Devon M. Mann ◽  
Victoria E. Prince

As a result of a whole genome duplication event in the lineage leading to teleosts, the zebrafish has seven clusters of Hox patterning genes, rather than four, as described for tetrapod vertebrates. To investigate the consequences of this genome duplication, we have carried out a detailed comparison of genes from a single Hox paralogue group, paralogue group (PG) 1. We have analyzed the sequences, expression patterns and potential functions of all four of the zebrafish PG1 Hox genes, and compared our data with that available for the three mouse genes. As the basic functions of Hox genes appear to be tightly constrained, comparison with mouse data has allowed us to identify specific changes in the developmental roles of Hox genes that have occurred during vertebrate evolution. We have found variation in expression patterns, amino acid sequences within functional domains, and potential gene functions both within the PG1 genes of zebrafish, and in comparison to mouse PG1 genes. We observed novel expression patterns in the midbrain, such that zebrafish hoxa1a and hoxc1a are expressed anterior to the domain traditionally thought to be under Hox patterning control. The hoxc1a gene shows significant coding sequence changes in known functional domains, which correlate with a reduced capacity to cause posteriorizing transformations. Moreover, the hoxb1 duplicate genes have differing functional capacities, suggesting divergence after duplication. We also find that an intriguing function ‘shuffling’ between paralogues has occurred, such that one of the zebrafish hoxb1 duplicates, hoxb1b, performs the role in hindbrain patterning played in mouse by the non-orthologous Hoxa1 gene.


Genome ◽  
2002 ◽  
Vol 45 (4) ◽  
pp. 693-701 ◽  
Author(s):  
Cheryl Granger ◽  
Virginia Coryell ◽  
Anupama Khanna ◽  
Paul Keim ◽  
Lila Vodkin ◽  
...  

Expressed sequence tags (ESTs) exhibiting homology to a BURP domain containing gene family were identified from the Glycine max (L.) Merr. EST database. These ESTs were assembled into 16 contigs of variable sizes and lengths. Consistent with the structure of known BURP domain containing proteins, the translation products exhibit a modular structure consisting of a C-terminal BURP domain, an N-terminal signal sequence, and a variable internal region. The soybean family members exhibit 35–98% similarity in a ~100-amino-acid C-terminal region, and a phylogenetic tree constructed using this region shows that some soybean family members group together in closely related pairs, triplets, and quartets, whereas others remain as singletons. The structure of these groups suggests that multiple gene duplication events occurred during the evolutionary history of this family. The depth and diversity of G. max EST libraries allowed tissue-specific expression patterns of the putative soybean BURPs to be examined. Consistent with known BURP proteins, the newly identified soybean BURPs have diverse expression patterns. Furthermore, putative paralogs can have both spatially and quantitatively distinct expression patterns. We discuss the functional and evolutionary implications of these findings, as well as the utility of EST-based analyses for identifying and characterizing gene families.

Key words: BURP domain, expressed sequence tag, gene duplication, Glycine max.


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Yuan Lu ◽  
Mikki Boswell ◽  
William Boswell ◽  
Raquel Ybanez Salinas ◽  
Markita Savage ◽  
...  

Abstract

Background: Studying functional divergences between paralogs that originated from genome duplication is a significant topic in investigating molecular evolution. Genes that exhibit basal level cyclic expression patterns, including circadian and light responsive genes, are important physiological regulators. Temporal shifts in basal gene expression patterns are important factors to be considered when studying genetic functions. However, adequate efforts have not been applied to studying basal gene expression variation on a global scale to establish transcriptional activity baselines for each organ. Furthermore, the comparison of cyclic expression patterns between paralogs created by genome duplication, and the potential functional divergence between them, has been neglected. To address these questions, we utilized a teleost fish species, Xiphophorus maculatus, and profiled gene expression within 9 organs at 3-h intervals throughout a 24-h diurnal period.

Results: Our results showed that 1.3–21.9% of genes in different organs exhibited cyclic expression patterns, with eye showing the highest fraction of cycling genes and gonads the lowest. A majority of the duplicated gene pairs exhibited divergences in their basal level expression patterns: either only one paralog exhibited an oscillating expression pattern, or both paralogs exhibited oscillating expression patterns but with different peak expression times and/or in different organs.

Conclusions: These observations suggest cyclic genes experienced significant sub-, neo-, or non-functionalization following the teleost genome duplication event. In addition, we developed a customized, web-accessible gene expression browser to facilitate data mining and data visualization for the scientific community.


2017 ◽  
Author(s):  
Jeremy Pasquier ◽  
Ingo Braasch ◽  
Peter Batzel ◽  
Cedric Cabau ◽  
Jérome Montfort ◽  
...  

Abstract: Whole genome duplications (WGD) are important evolutionary events. Our understanding of the underlying mechanisms, including the evolution of duplicated genes after WGD, however, remains incomplete. Teleost fish experienced a common WGD (teleost-specific genome duplication, or TGD) followed by a dramatic adaptive radiation leading to more than half of all vertebrate species. The analysis of gene expression patterns following TGD at the genome level has been limited by the lack of suitable genomic resources. The recent concomitant release of the genome sequence of spotted gar (a representative of holosteans, the closest lineage of teleosts that lacks the TGD) and the tissue-specific gene expression repertoires of over 20 holostean and teleostean fish species, including spotted gar, zebrafish and medaka (the PhyloFish project), offered a unique opportunity to study the evolution of gene expression following TGD in teleosts. We show that most TGD duplicates gained their current status (loss of one duplicate gene or retention of both duplicates) relatively rapidly after TGD (i.e. prior to the divergence of medaka and zebrafish lineages). The loss of one duplicate is the most common fate after TGD, with a probability of approximately 80%. In addition, the fate of duplicate genes after TGD, including subfunctionalization, neofunctionalization, or retention of two ‘similar’ copies, occurred not only before but also after the radiation of the species tested, consistent with a role of the TGD in speciation and/or evolution of gene function. Finally, we report novel cases of TGD ohnolog subfunctionalization and neofunctionalization that further illustrate the importance of these processes.


2017 ◽  
Author(s):  
Nicholas L. Panchy ◽  
Christina B. Azodi ◽  
Eamon F. Winship ◽  
Ronan C. O’Malley ◽  
Shin-Han Shiu

Abstract: Transcription factors (TFs) play a key role in regulating plant development and response to environmental stimuli. While most genes revert to single copy after a duplication event, transcription factors are retained at a significantly higher rate. However, it is unclear why TF duplicates have higher rates of retention relative to other genes. In this study, we compared three types of features (expression, sequence, and conservation) of retained TFs following whole genome duplication (WGD) events to genes with other functions, using Arabidopsis thaliana as a model. We found that gene function groups with higher maximum expression but lower mean expression tended to have higher duplicate retention rates post WGD, though TFs in particular are retained more often than would be expected based on the features examined. Conversely, expression of individual genes was not associated with duplication, but sequence conservation was. Furthermore, we found that the evolution of TF expression patterns and cis-regulatory sites favors the partitioning of ancestral states among the resulting duplicates. In particular, we found that one duplicate retains the majority of ancestral expression and cis-regulatory sites, while the “non-ancestral” duplicate was enriched for novel regulatory sites. To investigate how this partitioning pattern evolved, we modeled the retention of ancestral states in duplicate pairs using a system of differential equations. Our findings indicate that duplicate pairs evolve to a partitioned state more often than away from it, which, in combination with the accumulation of new regulatory sites in non-ancestral duplicates, suggests that selection favors partitioning via neofunctionalization.

Author Summary: Gene expression is controlled by regulatory proteins known as transcription factors. These factors control how an organism develops and responds to its environment. The evolution of transcription factor functions also contributes to the emergence of new species and crop domestication. In plants, new transcription factors mainly arise due to polyploidy, the multiplication of the genome. Although most duplicated copies are lost following a genome duplication event, transcription factors are exceptional because they are often kept. Furthermore, we found that transcription factor duplicates tend to diverge in how they are expressed and regulated in an unusual way: one copy mirrors the original, pre-duplication functional states of the ancestral gene, while the other loses the ancestral status and instead accumulates novel regulatory sites. Our results suggest these duplicate transcription factors may have been kept because one copy preserves ancestral functions while the other has evolved new ones.


2001 ◽  
Vol 24 (1-4) ◽  
pp. 35-41 ◽  
Author(s):  
Dirce Maria Carraro ◽  
Marcio R. Lambais ◽  
Helaine Carrer

Sucrose non-fermenting-1-related protein kinases (SnRKs) may play a major role in regulating gene expression in plant cells. This family of regulatory proteins is represented by sucrose non-fermenting-1 (SNF1) protein kinase in Saccharomyces cerevisiae, AMP-activated protein kinases (AMPKs) in mammals and SnRKs in higher plants. The SnRK family has been reorganized into three subfamilies according to the evolutionary relationships of their amino acid sequences. Members of the SnRK subfamily have been identified in several plants. There is evidence that they are involved in the nutritional and/or environmental stress response, although their roles are not yet well understood. We have identified at least 22 sugarcane expressed sequence tag (EST) contigs encoding putative SnRKs. The amino acid sequence alignment of both putative sugarcane SnRKs and known SnRKs revealed a highly conserved N-terminal catalytic domain. Our results indicated that sugarcane has at least one member of each SnRK subfamily. Expression pattern analysis of sugarcane EST contigs encoding putative SnRKs in 26 selected cDNA libraries from the sugarcane expressed sequence tag (SUCEST) database has indicated that members of this family are expressed throughout the plant. Members of the same subfamily showed no specific expression patterns, suggesting that their functions are not related to their phylogenetic relationships as inferred from N-terminal amino acid sequences.


2011 ◽  
Vol 2011 ◽  
pp. 1-20 ◽  
Author(s):  
Tiehui Wang ◽  
Bartolomeo Gorgoglione ◽  
Tanja Maehr ◽  
Jason W. Holland ◽  
Jose L. González Vecino ◽  
...  

The intracellular suppressors of cytokine signaling (SOCS) family members, including CISH and SOCS1 to 7 in mammals, are important regulators of cytokine signaling pathways. So far, the orthologues of all eight mammalian SOCS members have been identified in fish, with several of them having multiple copies. Whilst fish CISH, SOCS3, and SOCS5 paralogues are possibly the result of the fish-specific whole genome duplication event, gene duplication or lineage-specific genome duplication may also contribute to some paralogues, as with the three trout SOCS2s and three zebrafish SOCS5s. Fish SOCS genes are broadly expressed and also show species-specific expression patterns. They can be upregulated by cytokines, such as IFN-γ, TNF-α, IL-1β, IL-6, and IL-21, by immune stimulants such as LPS, poly I:C, and PMA, as well as by viral, bacterial, and parasitic infections in member- and species-dependent manners. Initial functional studies demonstrate conserved mechanisms of fish SOCS action via JAK/STAT pathways.


Genome ◽  
2006 ◽  
Vol 49 (5) ◽  
pp. 531-544 ◽  
Author(s):  
S Chao ◽  
G R Lazo ◽  
F You ◽  
C C Crossman ◽  
D D Hummel ◽  
...  

The US Wheat Genome Project, funded by the National Science Foundation, developed the first large public Triticeae expressed sequence tag (EST) resource. Altogether, 116 272 ESTs were produced, comprising 100 674 5′ ESTs and 15 598 3′ ESTs. These ESTs were derived from 42 cDNA libraries, which were created from hexaploid bread wheat (Triticum aestivum L.) and its close relatives, including diploid wheat (T. monococcum L. and Aegilops speltoides L.), tetraploid wheat (T. turgidum L.), and rye (Secale cereale L.), using tissues collected from various stages of plant growth and development and under diverse regimes of abiotic and biotic stress treatments. ESTs were assembled into 18 876 contigs and 23 034 singletons, or 41 910 wheat unigenes. Over 90% of the contigs contained fewer than 10 EST members, implying that the ESTs represented a diverse selection of genes and that genes expressed at low and moderate to high levels were well sampled. Statistical methods were used to study the correlation of gene expression patterns, based on the ESTs clustered in the 1536 contigs that contained at least 10 5′ EST members and thus represented the most abundant genes expressed in wheat. Analysis further identified genes in wheat that were significantly upregulated (p < 0.05) in tissues under various abiotic stresses when compared with control tissues. Though a functional annotation cannot be assigned for many of these genes, it is likely that they play a role associated with the stress response. This study predicted the possible functionality for 4% of total wheat unigenes, which leaves the remaining 96% with their functional roles and expression patterns largely unknown. Nonetheless, the EST data generated in this project provide a diverse and rich source for gene discovery in wheat.

Key words: expressed sequence tags, ESTs, gene expression profiles, wheat, Triticeae.
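The counts reported in this abstract can be cross-checked with simple arithmetic. The short Python sketch below only restates the abstract's numbers; the variable and function names are illustrative assumptions, not part of the original study.

```python
def unigene_count(contigs, singletons):
    """Unigenes are the assembled contigs plus the unassembled singletons."""
    return contigs + singletons


# Figures as reported in the abstract.
five_prime_ests = 100_674
three_prime_ests = 15_598
contigs = 18_876
singletons = 23_034
deep_contigs = 1_536  # contigs with at least 10 5' EST members

total_ests = five_prime_ests + three_prime_ests
unigenes = unigene_count(contigs, singletons)

# Share of contigs deep enough for the expression-correlation analysis.
# Note: this threshold counts 5' ESTs only, so it is not strictly the
# complement of the "fewer than 10 EST members" figure in the abstract.
deep_fraction = deep_contigs / contigs

print(total_ests)                  # 116272, matching the reported total
print(unigenes)                    # 41910, matching the reported unigenes
print(f"{deep_fraction:.1%}")      # about 8.1% of contigs
```

The small share of deep contigs is in line with the statement that over 90% of contigs contained fewer than 10 EST members.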

