Evolutionary adaptability linked to length variation in short genic tandem repeats

2018 ◽  
Author(s):  
William B Reinar ◽  
Jonfinn B Knutsen ◽  
Sissel Jentoft ◽  
Ole K Tørresen ◽  
Melinka A Butenko ◽  
...  

Abstract: There is increasing evidence that short tandem repeats (STRs) – mutational hotspots present in genes and in intergenic regions throughout most genomes – may influence gene and protein function and consequently affect the phenotype of an organism. However, the overall importance of STRs and their standing genetic variation within a population, e.g. if and how they facilitate evolutionary change and local adaptation, is still debated. Through genome-wide characterization of STRs in over a thousand wild Arabidopsis thaliana accessions, we demonstrate that STRs display significant variation in length across the species’ geographical distribution. We find that length variants are correlated with environmental conditions, key adaptive phenotypic traits, and gene expression levels. Further, we show that coding STRs are overrepresented in putative protein interaction sites. Taken together, our results suggest that these hypervariable loci play a major role in facilitating adaptation in plants, and due to the ubiquitous presence of STRs throughout the tree of life, similar roles in other organisms are likely.

Diversity ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 684
Author(s):  
Han Wang ◽  
Wenna Shao ◽  
Min Yan ◽  
Ye Xu ◽  
Shaohua Liu ◽  
...  

Class III homeodomain-leucine zipper (HD-ZIP III) genes encode plant-specific transcription factors that play pivotal roles in plant growth and development. There is no systematic report on HD-ZIP III members in Brassica plants, and their responses to stress are largely unknown. In this study, a total of 10, 9 and 16 HD-ZIP III genes were identified from B. rapa, B. oleracea and B. napus, respectively. The phylogenetic analysis showed that HD-ZIP III proteins were grouped into three clades: PHB/PHV, REV and CNA/HB8. Genes in the same group tended to have similar exon–intron structures. Various phytohormone-responsive elements and stress-responsive elements were detected in the promoter regions of HD-ZIP III genes. Gene expression levels in different tissues, as well as under different stress conditions, were investigated using public transcription profiling data. The HD-ZIP III genes were constitutively expressed among all the tested tissues and were highly accumulated in root and stem. In B. rapa, only one BrREV gene especially responded to heat stress, whereas BrPHB and BrREV members were downregulated upon cold stress, and most HD-ZIP III genes exhibited divergent responses to drought stress. In addition, we investigated the genetic variation at known miR165/166 complementary sites of the identified HD-ZIP III genes and found one single nucleotide polymorphism (SNP) in PHB members and two SNPs in REV members, which were further confirmed using Sanger sequencing. Taken together, these results provide information for the genome-wide characterization of HD-ZIP III genes and their stress response diversity in Brassica species.


Plants ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 322
Author(s):  
Zhoutao Wang ◽  
Hui Ren ◽  
Fu Xu ◽  
Guilong Lu ◽  
Wei Cheng ◽  
...  

Sugarcane is an important sugar and bioenergy ethanol crop, but its hyperploidy has stalled progress in deciphering the sugarcane genome, which has also hindered genome-wide analyses of versatile lectin receptor kinases (LecRKs). The published genome of Saccharum spontaneum, one of the two sugarcane ancestor species, enables us to characterize LecRKs and their responses to sugarcane leaf blight (SLB) triggered by Stagonospora tainanensis. A total of 429 allelic and non-allelic LecRKs, which were classified into three independently evolved types according to signal domains and phylogeny, were identified based on the genome. For closely related LecRKs in the phylogenetic tree, the motifs and exon architectures of representative L- and G-types were similar or identical. LecRKs showed an unequal distribution on chromosomes, and more G-type tandem repeats may come from the gene expansion. Comparing the differentially expressed LecRKs (DELs) in response to SLB in sugarcane hybrid and ancestor species S. spontaneum, we found that the DEL number in the shared gene sets was highly variable among sugarcane accessions, which indicated that the expression dynamics of LecRKs in response to SLB were quite different between hybrids and particularly between sugarcane hybrid and S. spontaneum. In addition, C-type LecRKs may participate in metabolic processes of plant–pathogen interaction, mainly including pathogenicity and plant resistance, indicating their putative roles in sugarcane responses to SLB infection. The present study provides a basic reference and global insight into the further study and utilization of LecRKs in plants.


2019 ◽  
Author(s):  
Tihana Vondrak ◽  
Laura Ávila Robledillo ◽  
Petr Novák ◽  
Andrea Koblížková ◽  
Pavel Neumann ◽  
...  

Abstract Background: Amplification of monomer sequences into long contiguous arrays is the main feature distinguishing satellite DNA from other tandem repeats, yet it is also the main obstacle in its investigation because these arrays are in principle difficult to assemble. Here we explore an alternative, assembly-free approach that utilizes ultra-long Oxford Nanopore reads to infer the length distribution of satellite repeat arrays, their association with other repeats and the prevailing sequence periodicities. Results: We have developed a computational workflow for similarity-based detection and downstream analysis of satellite repeats in individual nanopore reads that led to genome-wide characterization of their properties. Using the satellite DNA-rich legume plant Lathyrus sativus as a model, we demonstrated this approach by analyzing eleven major satellite repeats using a set of nanopore reads ranging from 30 to over 200 kb in length and representing 0.73x genome coverage. We found surprising differences between the analyzed repeats because only two of them were predominantly organized in long arrays typical for satellite DNA. The remaining nine satellites were found to be derived from short tandem arrays located within LTR-retrotransposons that occasionally expanded in length. While the corresponding LTR-retrotransposons were dispersed across the genome, this array expansion occurred mainly in the primary constrictions of the L. sativus chromosomes, which suggests that these genome regions are favorable for satellite DNA accumulation. Conclusions: The presented approach proved to be efficient in revealing differences in long-range organization of satellite repeats that can be used to investigate their origin and evolution in the genome.


2013 ◽  
Vol 368 (1620) ◽  
pp. 20120362 ◽  
Author(s):  
Alexandra C. Nica ◽  
Emmanouil T. Dermitzakis

The last few years have seen the development of large efforts for the analysis of genome function, especially in the context of genome variation. One of the most prominent directions has been the extensive set of studies on expression quantitative trait loci (eQTLs), namely, the discovery of genetic variants that explain variation in gene expression levels. Such studies have offered promise not just for the characterization of functional sequence variation but also for the understanding of basic processes of gene regulation and interpretation of genome-wide association studies. In this review, we discuss some of the key directions of eQTL research and its implications.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Satomi Mitsuhashi ◽  
Martin C. Frith ◽  
Naomichi Matsumoto

Abstract Background: Tandem repeats are highly mutable and contribute to the development of human disease by a variety of mechanisms. It is difficult to predict which tandem repeats may cause a disease. One hypothesis is that changeable tandem repeats are the source of genetic diseases, because disease-causing repeats are polymorphic in healthy individuals. However, it is not clear whether disease-causing repeats are more polymorphic than other repeats. Methods: We performed a genome-wide survey of the millions of human tandem repeats using publicly available long read genome sequencing data from 21 humans. We measured tandem repeat copy number changes using tandem-genotypes. Length variation of known disease-associated repeats was compared to other repeat loci. Results: We found that known Mendelian disease-causing or disease-associated repeats, especially CAG and 5′UTR GGC repeats, are relatively long and polymorphic in the general population. We also show that repeat lengths of two disease-causing tandem repeats, in ATXN3 and GLS, are correlated with near-by GWAS SNP genotypes. Conclusions: We provide a catalog of polymorphic tandem repeats across a variety of repeat unit lengths and sequences, from long read sequencing data. This method, especially if used in genome-wide association studies, may indicate possible new candidates of pathogenic or biologically important tandem repeats in human genomes.


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e6286 ◽  
Author(s):  
Yujie Zhou ◽  
Won Kyong Cho ◽  
Hee-Seong Byun ◽  
Vivek Chavan ◽  
Eui-Joon Kil ◽  
...  

Long non-coding RNAs (lncRNAs) play an important role in regulating many biological processes. In this study, tomato seeds were first irradiated by neutrons. Eight tomato mutants were then selected and infected by Tomato yellow leaf curl virus (TYLCV). RNA sequencing followed by bioinformatics analyses identified 1,563 tomato lncRNAs. About half of the lncRNAs were derived from intergenic regions, whereas antisense lncRNAs accounted for 35%. There were fewer lncRNAs identified in our study than in other studies identifying tomato lncRNAs. Functional classification of 794 lncRNAs associated with tomato genes showed that many lncRNAs were associated with binding functions required for interactions with other molecules and localized in the cytosol and membrane. In addition, we identified 19 up-regulated and 11 down-regulated tomato lncRNAs by comparing TYLCV infected plants to non-infected plants using previously published data. Based on these results, the lncRNAs identified in this study provide important resources for characterization of tomato lncRNAs in response to TYLCV infection.


2020 ◽  
Author(s):  
Satomi Mitsuhashi ◽  
Martin C Frith ◽  
Naomichi Matsumoto

Abstract Background: Tandem repeats are highly mutable and contribute to the development of human disease by a variety of mechanisms. It is difficult to predict which tandem repeats may cause a disease. One hypothesis is that changeable tandem repeats are the source of genetic diseases, because disease-causing repeats are polymorphic in healthy individuals. However, it is not clear whether disease-causing repeats are more polymorphic than other repeats. Methods: We performed a genome-wide survey of the millions of human tandem repeats using publicly available long read genome sequencing data from 21 humans. We measured tandem repeat copy number changes using tandem-genotypes. Length variation of known disease-associated repeats was compared to other repeat loci. Results: We found that known Mendelian disease-causing or disease-associated repeats, especially CAG and 5'UTR GGC repeats, are relatively long and polymorphic in the general population. We also show that repeat lengths of two disease-causing tandem repeats, in ATXN3 and GLS, are correlated with near-by GWAS SNP genotypes. Conclusions: We provide a catalog of polymorphic tandem repeats across a variety of repeat unit lengths and sequences, from long read sequencing data. This method, especially if used in genome-wide association studies (GWAS), may indicate possible new candidates of pathogenic or biologically important tandem repeats in human genomes.


2018 ◽  
Author(s):  
Michael Babokhov ◽  
Bradley I. Reinfeld ◽  
Kevin Hackbarth ◽  
Yotam Bentov ◽  
Stephen M. Fuchs

Abstract: Copy-number variation in tandem repeat coding regions is more prevalent in eukaryotic genomes than current literature suggests. We have reexamined the genomes of nearly 100 yeast strains looking to map regions of repeat variation. From this analysis we have identified that length variation is highly correlated to intrinsically disordered regions (IDRs). Furthermore, the majority of length variation is associated with tandem repeats. These repetitive regions are rich in homopolymeric amino acid sequences but nearly half of the variation comes from longer-repeating motifs. Comparisons of repeat copy number and sequence between strains of budding yeast as well as closely related fungi suggest selection for and conservation of IDR-related tandem repeats. In some instances, repeat variation has been demonstrated to mediate binding affinity, aggregation, and protein stability. With this analysis, we can identify proteins for which repeat variation may play conserved roles in modulating protein function.

