The Complex Repeats of Dictyostelium discoideum

2001 ◽  
Vol 11 (4) ◽  
pp. 585-594
Author(s):  
Gernot Glöckner ◽  
Karol Szafranski ◽  
Thomas Winckler ◽  
Theodor Dingermann ◽  
Michael A. Quail ◽  
...  

In the course of determining the sequence of the Dictyostelium discoideum genome we have characterized in detail the quantity and nature of interspersed repetitive elements present in this species. Several of the most abundant small complex repeats and transposons (DIRS-1; TRE3-A,B; TRE5-A; skipper; Tdd-4; H3R) have been described previously. In our analysis we have identified additional elements. Thus, we can now present a complete list of complex repetitive elements in D. discoideum. All elements add up to 10% of the genome. Some of the newly described elements belong to established classes (TRE3-C,D; TRE5-B,C; DGLT-A,P; Tdd-5). However, we have also defined two new classes of DNA transposable elements (DDT and thug) that have not been described thus far. Based on the nucleotide amount, we calculated the minimum copy number in each family. These range from <10 to >200 copies. Unique sequences adjacent to the element ends and truncation points in elements gave a measure for the fragmentation of the elements. Furthermore, we describe the diversity of single elements with regard to polymorphisms and conserved structures. All elements show insertion preference into loci in which other elements of the same family reside. The analysis of the complex repeats is a valuable data resource for the ongoing assembly of whole D. discoideum chromosomes. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AF135841, AF298201, AF298202, AF298203, AF298204, AF298205, AF298206, AF298207, AF298208, AF298209, AF298210 and AF298624.]
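The abstract does not spell out how the minimum copy number was derived from the nucleotide amount. One plausible reading, offered here only as an assumption, is a lower bound: if no copy can be longer than the full-length element, the number of copies is at least the family's total nucleotides divided by the full-length element size. The function name and the input values below are hypothetical and are not taken from the paper.

```python
import math

def minimum_copy_number(family_bp: int, full_length_bp: int) -> int:
    """Lower bound on copies of a repeat family (assumed method, not the paper's).

    family_bp: total nucleotides attributed to the family in the genome.
    full_length_bp: length of one complete element.
    Each copy is at most full_length_bp long, so at least
    ceil(family_bp / full_length_bp) copies are needed to account for family_bp.
    """
    if family_bp < 0 or full_length_bp <= 0:
        raise ValueError("family_bp must be >= 0 and full_length_bp must be > 0")
    return math.ceil(family_bp / full_length_bp)

# Hypothetical example: 45 kb of family sequence, 10 kb full-length element.
example_copies = minimum_copy_number(45_000, 10_000)
```

Fragmented or truncated copies would raise the true count above this bound, which is consistent with the abstract treating the figure as a minimum.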

1999 ◽  
Vol 9 (2) ◽  
pp. 121-129
Author(s):  
Alan J. Davidson ◽  
John H. Postlethwait ◽  
Yi-Lin Yan ◽  
David R. Beier ◽  
Cherie van Doren ◽  
...  

The Growth/differentiation factor (Gdf)5, 6, 7 genes form a closely related subgroup belonging to the TGF-β superfamily. In zebrafish, there are three genes that belong to the Gdf5, 6, 7 subgroup that have been named radar, dynamo, and contact. The genes radar and dynamo both encode proteins most similar to mouse GDF6. The orthologous identity of these genes on the basis of amino acid similarities has not been clear. We have identified gdf7, a fourth zebrafish gene belonging to the Gdf5, 6, 7 subgroup. To assign correct orthologies and to investigate the evolutionary relationships of the human, mouse, and zebrafish Gdf5, 6, 7 subgroup, we have compared genetic map positions of the zebrafish and mammalian genes. We have mapped zebrafish gdf7 to linkage group (LG) 17, contact to LG9, GDF6 to human chromosome (Hsa) 8 and GDF7 to Hsa2p. The radar and dynamo genes have been localized previously to LG16 and LG19, respectively. A comparison of syntenies shared among human, mouse, and zebrafish genomes indicates that gdf7 is the ortholog of mammalian GDF7/Gdf7. LG16 shares syntenic relationships with mouse chromosome (Mmu) 4, including Gdf6. Portions of LG16 and LG19 appear to be duplicate chromosomes, thus suggesting that radar and dynamo are both orthologs of Gdf6. Finally, the mapping data is consistent with contact being the zebrafish ortholog of mammalian GDF5/Gdf5. [The sequence data described in this paper have been submitted to the GenBank data library under accession numbers AF113022 and AF113023.]


1999 ◽  
Vol 9 (2) ◽  
pp. 130-136 ◽  
Author(s):  
John Murray ◽  
Jérôme Buard ◽  
David L. Neil ◽  
Edouard Yeramian ◽  
Keiji Tamaki ◽  
...  

The highly variable human minisatellites MS32 (D1S8), MS31A (D7S21), and CEB1 (D2S90) all show recombination-based repeat instability restricted to the germline. Mutation usually results in polar interallelic conversion or occasionally in crossovers, which, at MS32 at least, extend into DNA flanking the repeat array, defining a localized recombination hotspot and suggesting that cis-acting elements in flanking DNA can influence repeat instability. Therefore, comparative sequence analysis was performed to search for common flanking elements associated with these unstable loci. All three minisatellites are located in GC-rich DNA abundant in dispersed and tandem repetitive elements. There were no significant sequence similarities between different loci upstream of the unstable end of the repeat array. Only one of the three loci showed clear evidence for putative coding sequences near the minisatellite. No consistent patterns of thermal stability or DNA secondary structure were shared by DNA flanking these loci. This work extends previous data on the genomic environment of minisatellites. In addition, this work suggests that recombinational activity is not controlled by primary or secondary characteristics of the DNA sequence flanking the repeat array and is not obviously associated with gene promoters as seen in yeast. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AF048727 (CEB1), AF048728 (MS31A), and AF048729 (MS32).]


1999 ◽  
Vol 9 (7) ◽  
pp. 647-653 ◽  
Author(s):  
Fabienne Giraudeau ◽  
Elisabeth Petit ◽  
Hervé Avet-Loiseau ◽  
Yolande Hauck ◽  
Gilles Vergnaud ◽  
...  

Microsatellites and minisatellites are two classes of tandem repeat sequences differing in their size, mutation processes, and chromosomal distribution. The boundary between the two classes is not defined. We have developed a convenient, hybridization-based human library screening procedure able to detect long CA-rich sequences. Analysis of cosmid clones derived from a chromosome 1 library shows that cross-hybridizing sequences tested are imperfect CA-rich sequences, some of them showing a minisatellite organization. All but one of the 13 positive chromosome 1 clones studied are localized in chromosomal bands to which minisatellites have previously been assigned, such as the 1pter cluster. To test the applicability of the procedure to minisatellite detection on a larger scale, we then used a large-insert whole-genome PAC library. Altogether, 22 new minisatellites have been identified in positive PAC and cosmid clones and 20 of them are telomeric. Among the 42 positive PAC clones localized within the human genome by FISH and/or linkage analysis, 25 (60%) are assigned to a terminal band of the karyotype, 4 (9%) are juxtacentromeric, and 13 (31%) are interstitial. The localization of at least two of the interstitial PAC clones corresponds to previously characterized minisatellite-containing regions and/or ancestrally telomeric bands, in agreement with this minisatellite-like distribution. The data obtained are in close agreement with the parallel investigation of human genome sequence data and suggest that long human (CA)s are imperfect CA repeats belonging to the minisatellite class of sequences. This approach provides a new tool to efficiently target genomic clones originating from subtelomeric domains, from which minisatellite sequences can readily be obtained. [The sequence data described in this paper have been submitted to the EMBL data library under accession nos. AJ000377–AJ000383.]
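The localization breakdown of the 42 positive PAC clones can be recomputed from the counts given in the abstract. The snippet below is a minimal sketch; the dictionary keys and variable names are illustrative, and only the three counts come from the text.

```python
# Counts of positive PAC clones per localization class, as stated in the abstract.
counts = {"terminal": 25, "juxtacentromeric": 4, "interstitial": 13}

total = sum(counts.values())

# Percentage share of each class among all localized clones.
shares = {name: 100 * n / total for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {counts[name]}/{total} = {share:.1f}%")
```

The terminal and interstitial shares round to the 60% and 31% quoted in the abstract. The juxtacentromeric share, 4 of 42, is about 9.5%, which the abstract reports as 9%, presumably so that the three figures sum to 100%.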


2000 ◽  
Vol 10 (12) ◽  
pp. 2030-2043
Author(s):  
Justen Andrews ◽  
Gerard G. Bouffard ◽  
Chris Cheadle ◽  
Jining Lü ◽  
Kevin G. Becker ◽  
...  

Identification and annotation of all the genes in the sequenced Drosophila genome is a work in progress. Wild-type testis function requires many genes and is thus of potentially high value for the identification of transcription units. We therefore undertook a survey of the repertoire of genes expressed in the Drosophila testis by computational and microarray analysis. We generated 3141 high-quality testis expressed sequence tags (ESTs). Testis ESTs computationally collapsed into a set of 1560 cDNAs used for further analysis. Of those, 11% correspond to named genes, and 33% provide biological evidence for a predicted gene. A surprising 47% fail to align with existing ESTs and 16% with predicted genes in the current genome release. EST frequency and microarray expression profiles indicate that the testis mRNA population is highly complex and shows an extended range of transcript abundance. Furthermore, >80% of the genes expressed in the testis showed onefold overexpression relative to ovaries, or gonadectomized flies. Additionally, >3% showed more than threefold overexpression at p < 0.05. Surprisingly, 22% of the genes most highly overexpressed in testis match Drosophila genomic sequence, but not predicted genes. These data strongly support the idea that sequencing additional cDNA libraries from defined tissues, such as testis, will be important tools for refined annotation of the Drosophila genome. Additionally, these data suggest that the number of genes in Drosophila will significantly exceed the conservative estimate of 13,601. [The sequence data described in this paper have been submitted to the dbEST data library under accession nos. AI944400–AI947263 and BE661985–BE662262.] [The microarray data described in this paper have been submitted to the GEO data library under accession nos. GPLS, GSM3–GSM10.]


1999 ◽  
Vol 9 (2) ◽  
pp. 103-120 ◽  
Author(s):  
Ann E. Sluder ◽  
Siuyien Wong Mathews ◽  
David Hough ◽  
Viravuth P. Yin ◽  
Claude V. Maina

The nuclear receptor (NR) superfamily is the most abundant class of transcriptional regulators encoded in the Caenorhabditis elegans genome, with >200 predicted genes revealed by the screens and analysis of genomic sequence reported here. This is the largest number of NR genes yet described from a single species, although our analysis of available genomic sequence from the related nematode Caenorhabditis briggsae indicates that it also has a large number. Existing data demonstrate expression for 25% of the C. elegans NR sequences. Sequence conservation and statistical arguments suggest that the majority represent functional genes. An analysis of these genes based on the DNA-binding domain motif revealed that several NR classes conserved in both vertebrates and insects are also represented among the nematode genes, consistent with the existence of ancient NR classes shared among most, and perhaps all, metazoans. Most of the nematode NR sequences, however, are distinct from those currently known in other phyla, and reveal a previously unobserved diversity within the NR superfamily. In C. elegans, extensive proliferation and diversification of NR sequences have occurred on chromosome V, accounting for >50% of the predicted NR genes. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AF083222–AF083225 and AF083251–AF083234.]


2019 ◽  
Author(s):  
Ekaterina Osipova ◽  
Nikolai Hecker ◽  
Michael Hiller

Transposons and other repetitive sequences make up a large part of complex genomes. Repetitive sequences can be co-opted into a variety of functions and thus provide a source for evolutionary novelty. However, comprehensively detecting ancestral repeats that align between species is difficult since considering all repeat-overlapping seeds in alignment methods that rely on the seed-and-extend heuristic results in prohibitively high runtimes. Here, we show that ignoring repeat-overlapping alignment seeds when aligning entire genomes misses numerous alignments between repetitive elements. We present a tool, RepeatFiller, that improves genome alignments by incorporating previously-undetected local alignments between repetitive sequences. By applying RepeatFiller to genome alignments between human and 20 other representative mammals, we uncover between 22 and 84 megabases of previously-undetected alignments that mostly overlap transposable elements. We further show that the increased alignment coverage improves the annotation of conserved non-exonic elements, both by discovering numerous novel transposon-derived elements that evolve under constraint and by removing thousands of elements that are not under constraint in placental mammals. In conclusion, RepeatFiller contributes to comprehensively aligning repetitive genomic regions, which facilitates studying transposon co-option and genome evolution. Source code: https://github.com/hillerlab/GenomeAlignmentTools


PLoS Genetics ◽  
2014 ◽  
Vol 10 (4) ◽  
pp. e1004298 ◽  
Author(s):  
Concepcion M. Diez ◽  
Esteban Meca ◽  
Maud I. Tenaillon ◽  
Brandon S. Gaut

2016 ◽  
Vol 10 (1) ◽  
Author(s):  
Ana R. Cardoso ◽  
Manuela Oliveira ◽  
Antonio Amorim ◽  
Luisa Azevedo

2014 ◽  
Vol 64 (Pt_2) ◽  
pp. 316-324 ◽  
Author(s):  
Jongsik Chun ◽  
Fred A. Rainey

The polyphasic approach used today in the taxonomy and systematics of the Bacteria and Archaea includes the use of phenotypic, chemotaxonomic and genotypic data. The use of 16S rRNA gene sequence data has revolutionized our understanding of the microbial world and led to a rapid increase in the number of descriptions of novel taxa, especially at the species level. It has allowed in many cases for the demarcation of taxa into distinct species, but its limitations in a number of groups have resulted in the continued use of DNA–DNA hybridization. As technology has improved, next-generation sequencing (NGS) has provided a rapid and cost-effective approach to obtaining whole-genome sequences of microbial strains. Although some 12 000 bacterial or archaeal genome sequences are available for comparison, only 1725 of these are of actual type strains, limiting the use of genomic data in comparative taxonomic studies when there are nearly 11 000 type strains. Efforts to obtain complete genome sequences of all type strains are critical to the future of microbial systematics. The incorporation of genomics into the taxonomy and systematics of the Bacteria and Archaea coupled with computational advances will boost the credibility of taxonomy in the genomic era. This special issue of International Journal of Systematic and Evolutionary Microbiology contains both original research and review articles covering the use of genomic sequence data in microbial taxonomy and systematics. It includes contributions on specific taxa as well as outlines of approaches for incorporating genomics into new strain isolation to new taxon description workflows.


Genome ◽  
2009 ◽  
Vol 52 (3) ◽  
pp. 217-221 ◽  
Author(s):  
Xia Shen ◽  
Bruce Walsh ◽  
Jing J. Li ◽  
Hong X. Pang ◽  
Wen J. Wang ◽  
...  

While many studies of cis-elements CArG bound by serum response factor (SRF) are in progress, little is known about the positional distribution of the functional CArG elements around the transcription start site (TSS) of genes that they influence. We use a validated CArG data set to calculate the distance distribution of functional CArG elements around the TSS. Distances between adjacent CArGs were also analyzed. We compare these distributions with those derived using a control set of randomly selected CArGs (that were not experimentally validated for function). Our results show that most functional CArG elements (108 of 152, 71%) exist upstream of the annotated TSS, with copy number increasing as one moves closer to the TSS. Moreover, the average number of the CArG elements in the CArG-containing genes is significantly more than that in the control genes. Our study extends earlier bioinformatic analyses of functional CArG elements and provides an application of comparative sequence data to the identification of transcription factor binding sites.
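The positional analysis described above can be illustrated with a small sketch of how element positions might be expressed relative to the TSS. The function names, the strand convention, and the toy coordinates are hypothetical and do not come from the study's data set; only the 108-of-152 count is taken from the abstract.

```python
from typing import List, Tuple

def signed_distance_to_tss(element_pos: int, tss_pos: int, strand: str) -> int:
    """Position of an element relative to the TSS; negative means upstream.

    On the '-' strand, larger genomic coordinates lie upstream of the TSS,
    so the sign of the offset is flipped.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    offset = element_pos - tss_pos
    return offset if strand == "+" else -offset

def upstream_fraction(records: List[Tuple[int, int, str]]) -> float:
    """Fraction of (element_pos, tss_pos, strand) records lying upstream of the TSS."""
    distances = [signed_distance_to_tss(e, t, s) for e, t, s in records]
    return sum(1 for d in distances if d < 0) / len(distances)

# Hypothetical records: two genes on each strand, one element upstream in each pair.
toy_records = [(900, 1000, "+"), (1100, 1000, "+"), (5200, 5000, "-"), (4900, 5000, "-")]
toy_fraction = upstream_fraction(toy_records)

# The abstract's own figure: 108 of 152 functional CArG elements are upstream.
reported_percent = round(100 * 108 / 152)
```

Binning the signed distances (for example in fixed-width windows) would give the kind of distance distribution the abstract compares between validated and control CArG sets.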

