ABSTRACT
A frequent event in the evolution of prokaryotic genomes is homologous recombination, where a foreign DNA stretch replaces a genomic region similar in sequence. Recombination can affect the relative position of two genomes in a phylogenetic reconstruction in two different ways: (i) one genome can recombine with a DNA stretch that is similar to the other genome, thereby reducing their pairwise sequence divergence; (ii) one genome can recombine with a DNA stretch from an outgroup genome, increasing the pairwise divergence. While several recombination-aware phylogenetic algorithms exist, many of these cannot account for both types of recombination; some algorithms can, but do so inefficiently. Moreover, many existing algorithms require that a substantial portion of each genome has not been affected by recombination, a sometimes unrealistic assumption. Here, we propose a novel coarse-graining approach for phylogenetic reconstruction (CGP), which is recombination-aware, applicable even if all genomic regions have experienced substantial amounts of recombination, and can be used on both nucleotide and amino acid sequences. CGP considers the local density of substitutions along pairwise genome alignments, fitting a model to the empirical distribution of substitution density to infer the pairwise coalescent time. Given all pairwise coalescent times, CGP reconstructs an ultrametric tree representing vertical inheritance. Based on simulations, we show that the proposed approach can reconstruct ultrametric trees with accurate topology, branch lengths, and root positioning. Applied to a set of E. coli strains, the reconstructed trees are most consistent with gene distributions when inferred from amino acid sequences, a data type that cannot be utilized by many alternative approaches.

AUTHOR SUMMARY
In homologous recombination, segments of foreign DNA overwrite similar segments of a prokaryotic genome.
A single recombination event can simultaneously introduce many DNA substitutions. This disturbs phylogenetic signals, making it difficult to reconstruct prokaryotic family trees. While a handful of recombination-aware phylogenetic algorithms have been proposed, most do not take all effects of recombination into account; others rely on the frequently unrealistic assumption that a substantial part of a genome has not been affected by recombination at all. Here, we introduce a novel approach to phylogenetic reconstruction, which estimates the age of the most recent common ancestor of two strains from the density distribution of DNA or amino acid substitutions between their genomes. The proposed phylogenetic tree is the tree most compatible with these age estimates. Based on nucleotide or amino acid sequences, our approach accurately predicts the topology, branch lengths, and root positioning of prokaryotic family trees.
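
To make the final step concrete, the following is a minimal sketch of how a set of pairwise age estimates could be turned into an ultrametric tree. The text above does not state which tree-building procedure CGP uses, so average-linkage clustering (UPGMA) is swapped in here purely as an illustrative stand-in. The function name `upgma`, the strain labels, and the numeric age estimates are all hypothetical.

```python
from itertools import combinations


def upgma(names, times):
    """Average-linkage clustering of pairwise coalescent-time estimates.

    names: list of strain labels.
    times: dict mapping frozenset({a, b}) to the estimated age of the
           most recent common ancestor of strains a and b.
    Returns a nested tuple (left, right, age); leaves are plain labels.
    NOTE: UPGMA is an assumption for illustration, not necessarily the
    procedure used by CGP.
    """
    # Each cluster id maps to (subtree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {
        frozenset((i, j)): times[frozenset((names[i], names[j]))]
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        # Join the two clusters with the youngest estimated common ancestor.
        pair = min(dist, key=dist.get)
        i, j = sorted(pair)
        age = dist.pop(pair)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # The age to every remaining cluster is the size-weighted mean.
        for k in clusters:
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            dist[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        # The pairwise age is used directly as the height of the new node,
        # so all leaves end up equidistant from the root (ultrametric).
        clusters[next_id] = ((tree_i, tree_j, age), n_i + n_j)
        next_id += 1
    (tree, _), = clusters.values()
    return tree


# Hypothetical, slightly noisy age estimates for three strains.
names = ["A", "B", "C"]
times = {
    frozenset(("A", "B")): 1.0,
    frozenset(("A", "C")): 2.8,
    frozenset(("B", "C")): 3.2,
}
tree = upgma(names, times)
print(tree)
```

In this toy input, the two estimates involving strain C disagree slightly; averaging them is one simple way to arrive at a single root age that is compatible with both. Because node heights are ages rather than distances, the root position follows directly from the estimates without requiring an outgroup.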