Weighted Genomic Distance Can Hardly Impose a Bound on the Proportion of Transpositions

Author(s):  
Shuai Jiang ◽  
Max A. Alekseyev
2008 ◽  
Vol 191 (1) ◽  
pp. 91-99 ◽  
Author(s):  
Marc Deloger ◽  
Meriem El Karoui ◽  
Marie-Agnès Petit

ABSTRACT The fundamental unit of biological diversity is the species. However, a remarkable extent of intraspecies diversity in bacteria was discovered by genome sequencing, and it reveals the need to develop clear criteria to group strains within a species. Two main types of analyses used to quantify intraspecies variation at the genome level are the average nucleotide identity (ANI), which detects the DNA conservation of the core genome, and the DNA content, which calculates the proportion of DNA shared by two genomes. Both estimates are based on BLAST alignments for the definition of DNA sequences common to the genome pair. Interestingly, however, results using these methods on intraspecies pairs are not well correlated. This prompted us to develop a genomic-distance index that takes both criteria of diversity into account and is based on the DNA maximal unique matches (MUM) shared by two genomes. The values, called MUMi, for MUM index, correlate better with the ANI than with the DNA content. Moreover, the MUMi groups strains in a way that is congruent with routinely used multilocus sequence-typing trees, as well as with ANI-based trees. We used the MUMi to determine the relatedness of all available genome pairs at the species and genus levels. Our analysis reveals a certain consistency in the current notion of bacterial species, in that the bulk of intraspecies and intragenus values are clearly separable. It also confirms that some species are much more diverse than most. As the MUMi is fast to calculate, it offers the possibility of measuring genome distances on the whole database of available genomes.


2020 ◽  
Author(s):  
Jean-Charles Walter ◽  
Jerome Rech ◽  
Nils-Ole Walliser ◽  
Jerome Dorignac ◽  
Frederic Geniet ◽  
...  

2011 ◽  
Vol 24 (1) ◽  
pp. 82-86 ◽  
Author(s):  
Péter L. Erdős ◽  
Lajos Soukup ◽  
Jens Stoye

2014 ◽  
Vol 21 (8) ◽  
pp. 622-631 ◽  
Author(s):  
Nikita Alexeev ◽  
Peter Zograf

2008 ◽  
Vol 06 (01) ◽  
pp. 23-36 ◽  
Author(s):  
WEI XU

Based on a large repertoire of chromosomal rearrangement operations, the genomic distance d between two genomes with χr and χb linear chromosomes, respectively, both containing the same (or orthologous) n genes or markers, is d = n + max(χr, χb) - c, where c is the number of cycles in the breakpoint graph of the two genomes. In this paper, we study the exact probability distribution of c. We derive the expectation and variance, and show that, in the limit, the expectation of d is [Formula: see text].
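
As a quick illustration of the distance formula quoted in the abstract above, the following minimal sketch evaluates d = n + max(χr, χb) - c for given inputs. The function name `genomic_distance` and the sample values are hypothetical; the sketch does not build a breakpoint graph or count its cycles, so c has to be supplied by the caller.

```python
def genomic_distance(n: int, chi_r: int, chi_b: int, c: int) -> int:
    """Evaluate d = n + max(chi_r, chi_b) - c.

    n      -- number of shared genes or markers
    chi_r  -- number of linear chromosomes in the first genome
    chi_b  -- number of linear chromosomes in the second genome
    c      -- number of cycles in the breakpoint graph (assumed precomputed)
    """
    if min(n, chi_r, chi_b, c) < 0:
        raise ValueError("all inputs must be non-negative")
    return n + max(chi_r, chi_b) - c


# Hypothetical example values, chosen only to show the arithmetic.
print(genomic_distance(n=100, chi_r=3, chi_b=5, c=40))  # 100 + 5 - 40 = 65
```

Because n and the chromosome counts are fixed for a given pair of genomes, the formula shows that d decreases by one for every additional cycle c.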


Genetics ◽  
2004 ◽  
Vol 166 (1) ◽  
pp. 621-629 ◽  
Author(s):  
Richard Durrett ◽  
Rasmus Nielsen ◽  
Thomas L. York

2010 ◽  
Vol 4 (1) ◽  
Author(s):  
Tim J Dexter ◽  
David Sims ◽  
Costas Mitsopoulos ◽  
Alan Mackay ◽  
Anita Grigoriadis ◽  
...  

2019 ◽  
Author(s):  
Rafał Zaborowski ◽  
Bartek Wilczyński

ABSTRACT High-throughput Chromosome Conformation Capture experiments have become the standard technique to assess the structure and dynamics of chromosomes in living cells. As with any other sufficiently advanced biochemical technique, the resulting Hi-C datasets are complex and contain multiple documented biases, the main ones being the non-uniform read coverage and the decay of contact coverage with genomic distance. Both of these effects have been studied, and there are published methods that are able to normalize different Hi-C data to mitigate these biases to some extent. It is crucial that this is done properly; otherwise the results of any comparative analysis of two or more Hi-C experiments are bound to be biased. In this paper we study both of these biases in Hi-C data and show that normalization techniques aimed at alleviating the coverage bias at the same time exacerbate the problems with contact decay bias. We also postulate that it is possible to use generalized linear models to directly compare non-normalized data and that this gives better results in identifying differential contacts between Hi-C matrices than using the normalized data.
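
To make the "decay of contact coverage with genomic distance" concrete, here is a minimal sketch that averages a square contact matrix along each diagonal, so that entry k is the mean contact count between bins k apart. This is only a common way to visualise the effect, not the normalization or generalized linear model approach of the paper; the function name `contact_decay` and the toy matrix are hypothetical.

```python
def contact_decay(matrix):
    """Return the mean contact count for each bin separation k.

    matrix -- square, symmetric list-of-lists of contact counts,
              where matrix[i][j] counts contacts between bins i and j.
    """
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("contact matrix must be square")
    means = []
    for k in range(n):
        # The k-th diagonal holds all pairs of bins separated by k bins.
        diagonal = [matrix[i][i + k] for i in range(n - k)]
        means.append(sum(diagonal) / len(diagonal))
    return means


# Toy 3-bin matrix (made-up counts) in which contacts fall with distance.
toy = [
    [10, 4, 1],
    [4, 10, 4],
    [1, 4, 10],
]
print(contact_decay(toy))  # [10.0, 4.0, 1.0]
```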


Author(s):  
Sampath Perumal ◽  
Chu Shin Koh ◽  
Lingling Jin ◽  
Miles Buchwaldt ◽  
Erin Higgins ◽  
...  

ABSTRACT High-quality nanopore genome assemblies were generated for two genotypes (Ni100 and CN115125) of Brassica nigra, a member of the agronomically important Brassica species. The N50 contig lengths for the two assemblies were 17.1 Mb (58 contigs) and 0.29 Mb (963 contigs), respectively, reflecting recent improvements in the technology. Comparison with a de novo short-read assembly for Ni100 corroborated genome integrity and quantified sequence-related error rates (0.002%). The contiguity and coverage allowed unprecedented access to low-complexity regions of the genome. Pericentromeric regions and coincidence of hypo-methylation enabled localization of active centromeres and identified a novel centromere-associated ALE class I element which appears to have proliferated through relatively recent nested transposition events (<1 million years ago). Computational abstraction was used to define a post-triplication Brassica-specific ancestral genome and to calculate the extensive rearrangements that define the genomic distance separating B. nigra from its diploid relatives.
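
The N50 contig length cited in the abstract above is a standard assembly statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch of that calculation follows; the function name `n50` and the toy contig lengths are hypothetical and unrelated to the B. nigra assemblies.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example: total length is 20, and 6 + 5 = 11 covers at least half.
print(n50([2, 3, 4, 5, 6]))  # 5
```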

