Link Between Short Tandem Repeats and Translation Initiation Site Selection

2018 ◽  
Author(s):  
M Arabfard ◽  
K Kavousi ◽  
A Delbari ◽  
M Ohadi

Abstract Recent work in yeast and humans suggests that evolutionary divergence in cis-regulatory sequences impacts translation initiation sites (TISs). Cis-elements can also affect the efficacy and amount of protein synthesis. Despite their broad biological implications, the landscape and relevance of short tandem repeats (STRs)/microsatellites to human protein-coding gene TISs remain largely unknown. Here we characterized the STR distribution in the 120 bp cDNA sequence upstream of all annotated human protein-coding gene TISs, based on the Ensembl database. Furthermore, we performed a comparative genomics study of all annotated orthologous TIS-flanking sequences across 47 vertebrate species (755,956 transcripts), aimed at identifying human-specific STRs in this interval. We also hypothesized that STRs may be used as genetic codes for the initiation of translation. The initial five amino acid sequences (excluding the initial methionine) that were flanked by STRs in human were BLASTed against the initial orthologous five amino acids in other vertebrate species (2,025,817 pair-wise TIS comparisons) in order to compare the number of events in which human-specific and non-specific STRs occurred with homologous and non-homologous TISs (i.e. ≥50% and <50% similarity of the five amino acids). We characterized human-specific STRs and found a bias in this compartment relative to the overall (human-specific and non-specific) distribution of STRs (Mann-Whitney p = 1.4 × 10⁻¹¹). We also found significant enrichment of non-homologous TISs flanked by human-specific STRs (p < 0.00001). In conclusion, our data indicate a link between STRs and TIS selection, which is supported by differential evolution of the human-specific STRs in the TIS upstream flanking sequence.

Abbreviations: cDNA, complementary DNA; CDS, coding DNA sequence; STR, short tandem repeat; TIS, translation initiation site; TSS, transcription start site.
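The two operations this abstract describes, locating STRs in an upstream sequence and labeling a TIS pair as homologous or non-homologous at a 50% cutoff, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how STRs were called, so the perfect-repeat regular expression, the unit-length range and the minimum repeat count are hypothetical choices, and the position-wise identity score is a simplified stand-in for the BLAST comparison the authors report. The function names are invented for this sketch.

```python
import re


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter unit (e.g. 'AA')."""
    n = len(unit)
    return not any(n % d == 0 and unit[:d] * (n // d) == unit for d in range(1, n))


def find_strs(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return (start, unit, repeat_count) for perfect tandem repeats in seq.

    Assumption: STRs are perfect repeats of 1-6 nt units with at least
    `min_repeats` copies; the source does not specify these thresholds.
    """
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of fixed length followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_primitive(unit):
                hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits


def tis_is_homologous(aa5_a, aa5_b, threshold=0.5):
    """Label two five-residue TIS peptides as homologous by position-wise identity.

    Assumption: plain identity replaces the BLAST similarity used in the study.
    """
    if len(aa5_a) != 5 or len(aa5_b) != 5:
        raise ValueError("expected the five residues following the initial methionine")
    matches = sum(a == b for a, b in zip(aa5_a, aa5_b))
    return matches / 5 >= threshold


# Hypothetical upstream fragment containing four copies of GCA.
print(find_strs("TTGCAGCAGCAGCATT"))
# Hypothetical peptides sharing three of five residues.
print(tis_is_homologous("ASKLT", "ASKMV"))
```

With five residues, the 50% cutoff amounts to requiring at least three matching positions under this simplified score.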

2021 ◽  
Vol 1 ◽  
Author(s):  
Max A. Verbiest ◽  
Matteo Delucchi ◽  
Tugce Bilgin Sonay ◽  
Maria Anisimova

Short tandem repeats (STRs) are abundant in genomic sequences and are known for comparatively high mutation rates; STRs are therefore thought to be a potent source of genetic diversity. In protein-coding sequences, STRs primarily encode disorder-promoting amino acids and are often located in intrinsically disordered regions (IDRs). STRs are frequently studied in the scope of microsatellite instability (MSI) in cancer, with little focus on the connection between protein STRs and IDRs. We believe, however, that this relationship should be explicitly included when ascertaining STR functionality in cancer. Here we explore this notion using all canonical human proteins from SwissProt, wherein we detected 3,699 STRs. Over 80% of these consisted completely of disorder-promoting amino acids. Of the amino acids in STR sequences, 62.1% were predicted to also be in an IDR, compared to 14.2% for non-repeat sequences. Over-representation analysis showed STR-containing proteins to be primarily located in the nucleus, where they perform protein- and nucleotide-binding functions and regulate gene expression. They were also enriched in cancer-related signaling pathways. Furthermore, we found enrichments of STR-containing proteins among those correlated with patient survival for cancers derived from eight different anatomical sites. Intriguingly, several of these cancer types are not known to have an MSI-high (MSI-H) phenotype, suggesting that protein STRs play a role in cancer pathology in non-MSI-H settings. Their intrinsic link with IDRs could therefore be an attractive topic of future research to further explore the role of STRs and IDRs in cancer. We speculate that our observations may be linked to the known dosage-sensitivity of disordered proteins, which could hint at a concentration-dependent gain-of-function mechanism in cancer for proteins containing STRs and IDRs.
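The comparison of IDR coverage inside and outside STRs reduces to counting residues under two per-residue annotations. The sketch below assumes each protein is represented by two equal-length boolean masks, one marking STR residues and one marking predicted IDR residues; this representation, the function name and the example masks are hypothetical and not taken from the study.

```python
def idr_fraction(in_str, in_idr):
    """Return (fraction of STR residues in an IDR, fraction of non-STR residues in an IDR).

    in_str, in_idr: equal-length sequences of booleans, one entry per residue.
    """
    if len(in_str) != len(in_idr):
        raise ValueError("masks must have the same length")
    str_total = sum(1 for s in in_str if s)
    non_total = len(in_str) - str_total
    str_idr = sum(1 for s, d in zip(in_str, in_idr) if s and d)
    non_idr = sum(1 for s, d in zip(in_str, in_idr) if not s and d)
    # Guard against proteins with no STR residues or no non-STR residues.
    frac_str = str_idr / str_total if str_total else 0.0
    frac_non = non_idr / non_total if non_total else 0.0
    return frac_str, frac_non


# Hypothetical 10-residue protein: residues 2-5 form an STR, residues 1-4 are disordered.
str_mask = [False, False, True, True, True, True, False, False, False, False]
idr_mask = [False, True, True, True, True, False, False, False, False, False]
print(idr_fraction(str_mask, idr_mask))
```

Pooling the counts over all proteins before dividing, rather than averaging per-protein fractions, would match a statement of the form "X% of amino acids in STR sequences"; which aggregation the authors used is not stated in the abstract.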


1989 ◽  
Vol 257 (3) ◽  
pp. 921-924 ◽  
Author(s):  
T Takeno ◽  
S S L Li

Human genomic clones containing parts of the lactate dehydrogenase B (LDH-B) gene (approx. 25 kb in length) were isolated and characterized. The protein-coding sequence of the human LDH-B gene is interrupted by six introns at codons nos. 42-43, 82, 140, 198, 237 and 278-279, and the positions of these introns are homologous to those of LDH-A genes from man and mouse. The 5' non-coding region of the human LDH-B gene is interrupted by an intron six nucleotide residues upstream of the ATG translation-initiation site, whereas those of human and mouse LDH-A genes are interrupted at 24 nucleotide residues 5' to the ATG initiation codon. As is the case for LDH-A genes from man and mouse, there is no intron in the 3' non-coding region of the human LDH-B gene.


2021 ◽  
Author(s):  
Ali Maddi ◽  
Kaveh Kavousi ◽  
Masoud Arabfard ◽  
Hamid Ohadi ◽  
Mina Ohadi

Abstract Findings in yeast and humans suggest that evolutionary divergence in cis-regulatory sequences impacts translation initiation sites (TISs). Here we employed the TIS homology concept to study a possible link between all categories of tandem repeats (TRs) and TIS selection. Human and 83 other species were selected, and data were extracted on all protein-coding genes (n = 1,611,368) and transcripts (n = 2,730,515) annotated for those species in Ensembl 102. On average, every transcript was flanked by 1.19 TRs of various categories in its 120 bp upstream RNA sequence. We detected a statistically significant excess of non-homologous TISs co-occurring with human-specific TRs, and vice versa. We conclude that TRs are abundant cis elements in the upstream sequences of TISs across species, and there is a link between all categories of TRs and TIS selection. TR-induced symmetric and stem-loop structures may function as genetic marks for TIS selection.
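An excess of non-homologous TISs co-occurring with human-specific TRs can be framed as a 2×2 contingency table: TR specificity against TIS homology. The sketch below computes an odds ratio and a Pearson chi-square statistic for such a table. The abstract does not name the test the authors applied, so the choice of chi-square is an assumption, and the counts in the example are invented for illustration only.

```python
def two_by_two(a, b, c, d):
    """Odds ratio and Pearson chi-square statistic (1 degree of freedom) for a 2x2 table.

    a: human-specific TR, non-homologous TIS
    b: human-specific TR, homologous TIS
    c: non-specific TR,   non-homologous TIS
    d: non-specific TR,   homologous TIS
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple sketch")
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, chi2


# Hypothetical counts, not data from the study.
odds_ratio, chi2 = two_by_two(30, 10, 20, 40)
print(odds_ratio, chi2)
```

An odds ratio above 1 indicates that non-homologous TISs are over-represented among human-specific TRs; the chi-square statistic is compared against the 1-degree-of-freedom distribution (critical value 3.84 at the 0.05 level).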


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7031 ◽  
Author(s):  
Thanh Hoa Le ◽  
Khue Thi Nguyen ◽  
Nga Thi Bich Nguyen ◽  
Huong Thi Thanh Doan ◽  
Takeshi Agatsuma ◽  
...  

We present the complete mitochondrial genome of Paragonimus ohirai Miyazaki, 1939 and compare its features with those of previously reported mitochondrial genomes of the pathogenic lung-fluke, Paragonimus westermani, and other members of the genus. The circular mitochondrial DNA molecule of the single fully sequenced individual of P. ohirai was 14,818 bp in length, containing 12 protein-coding, two ribosomal RNA and 22 transfer RNA genes. As is common among trematodes, an atp8 gene was absent from the mitogenome of P. ohirai and the 5′ end of nad4 overlapped with the 3′ end of nad4L by 40 bp. Paragonimus ohirai and four forms/strains of P. westermani from South Korea and India exhibited remarkably different base compositions and hence codon usage in protein-coding genes. In the fully sequenced P. ohirai individual, the non-coding region started with two long identical repeats (292 bp each), separated by tRNA-Glu. These were followed by an array of six short tandem repeats (STRs), 117 bp each. The number of short tandem repeats varied among P. ohirai individuals. A phylogenetic tree inferred from concatenated mitochondrial protein sequences of 50 strains encompassing 42 species of trematodes belonging to 14 families identified a monophyletic Paragonimidae in the class Trematoda. Characterization of additional mitogenomes in the genus Paragonimus will be useful for biomedical studies and development of molecular tools and mitochondrial markers for diagnostic, identification, hybridization and phylogenetic/epidemiological/evolutionary studies.


2021 ◽  
Author(s):  
Ali M.A. Maddi ◽  
Kaveh Kavousi ◽  
Masoud Arabfard ◽  
Hamid Ohadi ◽  
Mina Ohadi

Abstract Evolutionary divergence in cis-regulatory sequences impacts translation initiation sites (TISs). The implication of tandem repeats (TRs) in TIS selection remains elusive for the most part. Here we employed the TIS homology concept to study a possible link between all categories of TRs and TIS selection. Human and 83 other species were selected, and data were extracted on all protein-coding genes (n = 1,611,368) and transcripts (n = 2,730,515) annotated for those species in Ensembl 102. Two different weighting vectors were employed to assign TIS homology, and the results were assessed in 10-fold validation. On average, every TIS was flanked by 1.19 TRs of various categories within the 120 bp upstream sequence. We detected a statistically significant excess of non-homologous TISs co-occurring with human-specific TRs, and vice versa. We conclude that TRs are abundant cis elements in the upstream sequences of TISs across species, and there is a link between all categories of TRs and TIS selection.
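One way to read "weighing vectors" for assigning TIS homology is as per-position weights applied when comparing the amino acids that follow the initiation codon. The sketch below illustrates that idea only: the abstract does not give the vectors, the number of positions compared, or the scoring rule, so the two weight vectors, the five-residue window and the function name are all hypothetical.

```python
def weighted_tis_similarity(aa_a, aa_b, weights):
    """Weighted fraction of matching positions between two TIS-proximal peptides.

    Assumption: a match at position i contributes weights[i]; the score is the
    matched weight divided by the total weight, so it lies between 0 and 1.
    """
    if not (len(aa_a) == len(aa_b) == len(weights)):
        raise ValueError("peptides and weight vector must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    matched = sum(w for x, y, w in zip(aa_a, aa_b, weights) if x == y)
    return matched / total


# Two hypothetical weight vectors: uniform, and one favoring residues nearest the start codon.
uniform = [1, 1, 1, 1, 1]
proximal = [5, 4, 3, 2, 1]

# Hypothetical peptides matching at the first three of five positions.
print(weighted_tis_similarity("ASKLT", "ASKMV", uniform))
print(weighted_tis_similarity("ASKLT", "ASKMV", proximal))
```

Under the position-favoring vector, mismatches far from the start codon cost less than mismatches next to it, so the same pair can fall on different sides of a fixed homology cutoff depending on the vector chosen.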


1997 ◽  
Vol 45 (3) ◽  
pp. 265-270 ◽  
Author(s):  
Anna Pérez-Lezaun ◽  
Francesc Calafell ◽  
Mark Seielstad ◽  
Eva Mateu ◽  
David Comas ◽  
...  
