motif length
Recently Published Documents


Total documents: 18 (five years: 7)
H-index: 4 (five years: 2)

2021 ◽  
Vol 12 ◽  
Author(s):  
Yuanyuan Xu ◽  
Miaomiao Xing ◽  
Lixiao Song ◽  
Jiyong Yan ◽  
Wenjiang Lu ◽  
...  

Cabbage (Brassica oleracea L. var. capitata) is an important vegetable crop in the Brassicaceae family and is planted extensively worldwide. Simple sequence repeats (SSRs), markers with high polymorphism and co-dominance, are a crucial resource for genetic research. The current work identified a total of 64,546 perfect and 93,724 imperfect SSR motifs in the genome of the cabbage ‘TO1000’. We then divided the SSRs into different linkage groups based on their overall length and repeat number. Next, we characterized the cabbage genome in terms of motif length, motif type, and SSR level, and compared these features across cruciferous genomes. Furthermore, a large set of 64,546 primer pairs was successfully identified, from which a total of 1,113 SSR primers were generated, including 916 (82.3%) that showed repeatable and stable amplification. In addition, 32 informative SSR markers were screened, which could distinguish 32 cabbage genotypes in terms of their genetic diversity, with polymorphism information levels ranging from 0.14 to 0.88. Cultivars were efficiently identified by a new strategy that uses a manual diagram for cultivar identification. Lastly, the 32 cabbage accessions were clearly separated by five Bol-SSR markers. We also verified whether these SSRs were available and transferable in 10 Brassicaceae relatives. Based on these findings, the genomic SSR markers identified in the present work may facilitate cabbage research and lay a foundation for further gene tagging and genetic linkage analyses, such as marker-assisted selection, genetic mapping, and comparative genomic analysis.
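
The abstract above does not say how the polymorphism information values were computed. As a minimal sketch, assuming the widely used polymorphism information content (PIC) formula, PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², the value for one marker could be derived from its allele frequencies as follows. The function name `pic` and the example frequencies are illustrative, not taken from the paper.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    Assumes the common formula 1 - sum(p_i^2) - sum_{i<j} 2 p_i^2 p_j^2.
    Input may be counts or frequencies; it is normalised to sum to 1.
    """
    total = sum(allele_freqs)
    if total <= 0:
        raise ValueError("allele frequencies must sum to a positive value")
    p = [f / total for f in allele_freqs]
    homozygosity = sum(x * x for x in p)
    cross_term = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical marker with two equally frequent alleles
print(pic([0.5, 0.5]))
```

A marker with a single allele gives 0 under this formula, and the value rises as alleles become more numerous and more evenly distributed.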


Horticulturae ◽  
2021 ◽  
Vol 7 (6) ◽  
pp. 143
Author(s):  
Lei Zhu ◽  
Huayu Zhu ◽  
Yanman Li ◽  
Yong Wang ◽  
Xiangbin Wu ◽  
...  

Simple sequence repeats (SSRs) are widely used in map construction and in comparative and genetic diversity analyses. Here, 103,056 SSR loci were found in Cucurbita species by in silico PCR. In general, the frequency of these SSRs decreased as motif length increased, and dinucleotide motifs were the most common type. For the same repeat type, the SSR frequency decreased sharply as the repeat number increased. The majority of the SSR loci were suitable for marker development (84.75% in Cucurbita moschata, 94.53% in Cucurbita maxima, and 95.09% in Cucurbita pepo). Using these markers, cross-species transferable SSR markers between C. pepo and other Cucurbitaceae species were developed, and the complicated mosaic relationships among them were analyzed. In particular, the main syntenic relationships between C. pepo and C. moschata or C. maxima indicated that the chromosomes in the Cucurbita genomes were highly conserved during evolution. Furthermore, 66 core SSR markers were selected to measure the genetic diversity of 61 C. pepo germplasms, which were divided into two groups by structure analysis and unweighted pair group method with arithmetic mean (UPGMA) analysis. These results will promote the use of SSRs in basic and applied research on Cucurbita species.
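
To make the relationship between SSR frequency and motif length concrete, here is a minimal sketch of a perfect-SSR scan using regular expressions. It is not the pipeline used in the study. The function name `find_ssrs` and the minimum repeat thresholds are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical thresholds: motif length -> minimum number of tandem repeats
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # a k-mer followed by at least n - 1 further copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # skip motifs that are themselves repeats of a shorter unit (e.g. "ATAT")
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


example = "GG" + "AT" * 7 + "CC" + "A" * 12 + "GT"
hits = find_ssrs(example)
# Tally loci by motif length to see how frequency varies with it
by_motif_length = Counter(len(motif) for _, motif, _ in hits)
print(hits, dict(by_motif_length))
```

On a real genome, the tally in `by_motif_length` is the quantity whose decline with motif length the abstract describes.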


PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0248841
Author(s):  
Denys Bulavka ◽  
Ariel A. Aptekmann ◽  
Nicolás A. Méndez ◽  
Teresa Krick ◽  
Ignacio E. Sánchez

Linear motifs are short protein subsequences that mediate protein interactions. Hundreds of motif classes including thousands of motif instances are known. Our theory estimates how many motif classes remain undiscovered. As commonly done, we describe motif classes as regular expressions specifying motif length and the allowed amino acids at each motif position. We measure motif specificity for a pair of motif classes by quantifying how many motif-discriminating positions prevent a protein subsequence from matching the two classes at once. We derive theorems for the maximal number of motif classes that can simultaneously maintain a certain number of motif-discriminating positions between all pairs of classes in the motif universe, for a given amino acid alphabet. We also calculate the fraction of all protein subsequences that would belong to a motif class if all potential motif classes came into existence. Naturally occurring pairs of motif classes present most often a single motif-discriminating position. This mild specificity maximizes the potential number of coexisting motif classes, the expansion of the motif universe due to amino acid modifications and the fraction of amino acid sequences that code for a motif instance. As a result, thousands of linear motif classes may remain undiscovered.
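
The notion of a motif-discriminating position can be sketched in a few lines. This is one reading of the abstract, not the authors' implementation: a position is assumed to be discriminating when the two classes allow disjoint sets of amino acids there, so no subsequence can match both. The simplified pattern syntax (literal residues, `.` wildcards, `[...]` sets) and the function names are illustrative, and only classes of equal motif length are compared.

```python
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_class(pattern):
    """Parse a simplified motif pattern such as 'P.[ST]P' into per-position sets."""
    positions, i = [], 0
    while i < len(pattern):
        if pattern[i] == ".":
            positions.append(set(AMINO_ACIDS))  # wildcard: any amino acid
            i += 1
        elif pattern[i] == "[":
            j = pattern.index("]", i)
            positions.append(set(pattern[i + 1:j]))
            i = j + 1
        else:
            positions.append({pattern[i]})
            i += 1
    return positions


def discriminating_positions(a, b):
    """Count positions at which classes a and b share no allowed amino acid."""
    pa, pb = parse_class(a), parse_class(b)
    if len(pa) != len(pb):
        raise ValueError("this sketch only compares classes of equal motif length")
    return sum(1 for x, y in zip(pa, pb) if not (x & y))


print(discriminating_positions("P.[ST]P", "P.[DE]P"))
```

With zero discriminating positions, at least one subsequence matches both classes; each additional discriminating position makes the pair more specific.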


2020 ◽  
Author(s):  
Lei Zhu ◽  
Hua yu Zhu ◽  
Yan man Li ◽  
Xiang bin Wu ◽  
Jin tao Li ◽  
...  

Background: The Cucurbita genus contains economically important crops grown worldwide, yet only a limited number of molecular markers have been developed for it in past years. Simple sequence repeat (SSR) markers are powerful tools for genetic map construction, genetic diversity analysis, and genome-wide association. The availability of pumpkin genome information has made it possible to analyze SSRs genome-wide across three Cucurbita species. Results: Based on whole-genome sequences, 34,375 SSR loci were found in C. moschata, 30,577 in C. maxima, and 38,104 in C. pepo. C. pepo has the highest SSR density, with an average of 145 SSRs/Mb. In general, the frequency of SSR loci decreased as motif length increased, dinucleotide motifs were the most common motifs in all three species, and, for the same repeat type, SSR frequency decreased sharply as the repeat number increased. Most of these SSR loci were suitable for marker development (84.75% in C. moschata, 94.53% in C. maxima, and 95.09% in C. pepo). Based on these markers, we compared and analyzed cross-species SSR markers between C. pepo and other Cucurbitaceae species by in silico PCR. Using these cross-species primers, highly collinear relationships were detected between C. pepo and each of the other two species. Furthermore, the application of SSR markers to genetic diversity analysis was tested in C. pepo, and the results showed that they are good tools for this purpose. Conclusion: In this study, genome-wide SSR markers were detected in three Cucurbita species, and some of their applications were demonstrated through comparative genomics and genetic diversity analysis. The large number of genome-wide SSR markers and cross-species markers should promote basic and applied studies of Cucurbita species, such as gene mapping, QTL mapping, comparative genomics, and marker-assisted breeding.


Author(s):  
Navaneethakrishna Makaram ◽  
Ramakrishnan Swaminathan

Exercise-induced muscle damage is a condition that results in the loss of muscle function due to overexertion. Muscle fatigue is a precursor of this phenomenon, so characterizing muscle fatigue plays a crucial role in preventing muscle damage. In this work, an attempt is made to develop signal processing methods to understand the dynamics of the muscle’s electrical properties. Surface electromyography signals are recorded from 50 healthy adult volunteers during dynamic curl exercise. The signals are preprocessed, and the first difference signal is computed. Ascending and descending slopes are then used to generate a binary sequence. The binary sequence is analyzed at various motif lengths using features such as the average symbolic occurrence, modified Shannon entropy, chi-square value, time irreversibility, maximum probability of pattern, and forbidden pattern ratio. The progression of muscle fatigue is assessed using trend analysis techniques. The motif length is optimized to maximize the rho value of the features. In addition, the first and the last zones of the signal are compared with standard statistical tests. The results indicate that the recorded signals differ in frequency and amplitude, both between and within subjects, over the period of the experiment. The generated binary sequence carries information related to the complexity of the signal. The presence of more repetitive patterns across the motif lengths in the case of fatigue indicates that the signal has lower complexity. In most cases, a larger motif length resulted in better rho values. In a comparison of the first and the last zones, most of the extracted features are statistically significant with p < 0.05. At a motif length of 13, all the extracted features are significant. This analysis method can be extended to diagnose other neuromuscular conditions.
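
The symbolic analysis described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: it uses the plain Shannon entropy (the abstract's "modified" variant is not specified), codes an ascending slope as 1 and anything else as 0, and takes the forbidden pattern ratio to be the fraction of the 2^m possible motifs that never occur. The function names are hypothetical.

```python
import math
from collections import Counter


def binary_sequence(signal):
    """Code each step of the first difference: 1 if ascending, else 0."""
    return [1 if b - a > 0 else 0 for a, b in zip(signal, signal[1:])]


def motif_features(bits, m):
    """Return (entropy in bits, forbidden pattern ratio, maximum pattern probability)
    for all overlapping motifs of length m in a binary sequence."""
    words = [tuple(bits[i:i + m]) for i in range(len(bits) - m + 1)]
    if not words:
        raise ValueError("sequence is shorter than the motif length")
    counts = Counter(words)
    n = len(words)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    forbidden_ratio = 1 - len(counts) / 2 ** m  # share of motifs never observed
    max_probability = max(counts.values()) / n
    return entropy, forbidden_ratio, max_probability


# Toy signal standing in for a preprocessed recording
toy_signal = [0, 1, 0, 1, 0, 1, 0, 1, 0]
bits = binary_sequence(toy_signal)
for m in (2, 3):
    print(m, motif_features(bits, m))
```

A highly repetitive sequence yields low entropy and a high forbidden pattern ratio, which is the sense in which more repetitive patterns indicate lower signal complexity.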


2019 ◽  
Vol 20 (20) ◽  
pp. 5111 ◽  
Author(s):  
María Rodríguez ◽  
Belén Molina ◽  
Manuel Merlo ◽  
Alberto Arias-Pérez ◽  
Silvia Portela-Bens ◽  
...  

Solea senegalensis is a flatfish belonging to the Soleidae family within the Pleuronectiformes order. It has a karyotype of 2n = 42 (FN = 60; 6M + 4 SM + 8 St + 24 T) and an XX/XY system. The first pair of metacentric chromosomes has been proposed as a proto sex-chromosome originating from a Robertsonian fusion between acrocentric chromosomes. To elucidate a possible evolutionary origin of this chromosome 1, studies of genomic synteny were carried out with eight fish species. A total of 88 genes annotated within 14 BACs located in chromosome 1 of S. senegalensis were used to build syntenic maps. Six BACs (BAC5K5, BAC52C17, BAC53B20, BAC84K7, BAC56H24, and BAC48P7) were distributed across at least 5 chromosomes in the species studied, and a group of four genes from BAC53B20 (grsf1, rufy3, slc4a4 and npffr2) and genes from BAC48K7 (dmrt2, dmrt3, dmrt1, c9orf117, kank1 and fbp1) formed a conserved cluster in all species. The analysis of repetitive sequences showed that the number of retroelements and simple repeats per BAC was highest in the subcentromeric region, where the 53B20, 16E16 and 48K7 BACs were localized. This region contains all the dmrt genes, which are associated with sex determination in some species. In addition, the presence of a satellite “chromosome Y” (motif length: 860 bp) was detected in this region. These findings allowed us to trace an evolutionary trend for the large metacentric chromosome of S. senegalensis through different rearrangements; this chromosome could be at an initial phase of differentiation as a sex chromosome.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Vandita D Bhat ◽  
Kathleen L McCann ◽  
Yeming Wang ◽  
Dallas R Fonseca ◽  
Tarjani Shukla ◽  
...  

PUF (PUmilio/FBF) RNA-binding proteins recognize distinct elements. In C. elegans, PUF-8 binds to an 8-nt motif and restricts proliferation in the germline. Conversely, FBF-2 recognizes a 9-nt element and promotes mitosis. To understand how motif divergence relates to biological function, we first determined a crystal structure of PUF-8. Comparison of this structure to that of FBF-2 revealed a major difference in a central repeat. We devised a modified yeast 3-hybrid screen to identify mutations that confer recognition of an 8-nt element to FBF-2. We identified several such mutants and validated their binding to 8-nt RNA elements both structurally and biochemically. Using genome engineering, we generated a mutant animal with a substitution in FBF-2 that confers preferential binding to the PUF-8 element. The mutant largely rescued overproliferation in animals that spontaneously generate tumors in the absence of puf-8. This work highlights the critical role of motif length in the specification of biological function.


2018 ◽  
Author(s):  
Yang Li ◽  
Pengyu Ni ◽  
Shaoqiang Zhang ◽  
Guojun Li ◽  
Zhengchang Su

The availability of a large volume of chromatin immunoprecipitation followed by sequencing (ChIP-seq) datasets for various transcription factors (TFs) has provided an unprecedented opportunity to identify all functional TF binding motifs clustered in the enhancers in genomes. However, progress has been largely hindered by the lack of a highly efficient and accurate tool that is fast enough to find not only the target motifs, but also cooperative motifs contained in very large ChIP-seq datasets with a binding peak length of typical enhancers (∼1,000 bp). To circumvent this hurdle, we herein present an ultra-fast and highly accurate motif-finding algorithm, ProSampler, with automatic motif length detection. ProSampler first identifies significant k-mers in the dataset and combines highly similar significant k-mers to form preliminary motifs. ProSampler then merges preliminary motifs with subtle similarity using a novel graph-based Gibbs sampler to find core motifs. Finally, ProSampler extends the core motifs by applying a two-proportion z-test to the flanking positions to identify motifs longer than k. As the number of preliminary motifs is much smaller than that of k-mers in a dataset, we greatly reduce the search space of the Gibbs sampler compared with conventional ones. By storing flanking sequences in a hash table, we avoid extensive IO and the necessity of examining all lengths of motifs in an interval. When evaluated on both synthetic and real ChIP-seq datasets, ProSampler runs orders of magnitude faster than the fastest existing tools while more accurately discovering primary motifs as well as cooperative motifs than do the best existing tools. Using ProSampler, we revealed previously unknown complex motif occurrence patterns in large ChIP-seq datasets, thereby providing insights into the mechanisms of cooperative TF binding for gene transcriptional regulation. Therefore, by allowing fast and accurate mining of the entire ChIP-seq datasets, ProSampler can greatly facilitate the efforts to identify the entire cis-regulatory code in genomes.
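
The motif extension step relies on a two-proportion z-test at flanking positions. The sketch below shows the standard pooled form of that test only. It is not ProSampler's code, and the choice of what is counted (here, how often one nucleotide appears at a flanking position in motif sites versus background sites) and the example counts are assumptions for illustration.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 == p2, using the pooled proportion.

    x1 of n1 motif-site flanks and x2 of n2 background flanks carry the
    nucleotide of interest at the position being tested.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 0.0  # no variation at all, so no evidence of a difference
    return (p1 - p2) / se


# Hypothetical counts: 'A' at a flanking position in 60 of 100 motif sites
# versus 25 of 100 background sites
z = two_proportion_z(60, 100, 25, 100)
print(round(z, 2))
```

A large positive z suggests the flanking position is enriched for that nucleotide, in which case the position could be added to the motif, extending it beyond length k.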


2018 ◽  
Author(s):  
Chathurani Ranathunge ◽  
Gregory L. Wheeler ◽  
Melody E. Chimahusky ◽  
Andy D. Perkins ◽  
Sreepriya Pramod ◽  
...  

Microsatellites are common in most species. While an adaptive role for these highly mutable regions has been considered, little is known concerning their contribution towards phenotypic variation. We used populations of the common sunflower (Helianthus annuus) at two latitudes to quantify the effect of microsatellite allele length on phenotype at the level of gene expression. We conducted a common garden experiment with seed collected from sunflower populations in Kansas and Oklahoma followed by an RNA-Seq experiment on 95 individuals. The effect of microsatellite allele length on gene expression was assessed across 3325 microsatellites that could be consistently scored. Our study revealed 479 microsatellites at which allele length significantly correlates with gene expression (eSTRs). When irregular allele sizes not conforming to the motif length were removed, the number of eSTRs rose to 2379. The percentage of variation in gene expression explained by eSTRs ranged from 1–86% when controlling for population and allele-by-population interaction effects at the 479 eSTRs. Of these, 70.4% are in untranslated regions (UTRs). A Gene Ontology (GO) analysis revealed that eSTRs are significantly enriched for GO terms associated with cis- and trans-regulatory processes. These findings suggest that a substantial number of transcribed microsatellites can influence gene expression.
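
The abstract does not define how allele sizes were judged to conform to the motif length. One plausible reading, sketched here as an assumption, is that an allele conforms when its length differs from a reference allele by a whole number of motif units. The function name and the example values are hypothetical.

```python
def conforming_alleles(allele_lengths, motif_length, reference_length):
    """Keep alleles whose offset from the reference is a multiple of the motif length."""
    if motif_length <= 0:
        raise ValueError("motif_length must be positive")
    return [a for a in allele_lengths
            if (a - reference_length) % motif_length == 0]


# Hypothetical trinucleotide locus with a 12-bp reference allele;
# the 16-bp allele is off by one base and would be filtered out
print(conforming_alleles([12, 15, 16, 18, 21], motif_length=3, reference_length=12))
```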

