homologizer: Phylogenetic phasing of gene copies into polyploid subgenomes
Summary

Organisms such as allopolyploids and F1 hybrids contain multiple subgenomes, each potentially with its own evolutionary history. These organisms present a challenge for multilocus phylogenetic inference and other analyses since it is not apparent which gene copies from different loci are from the same subgenome.

Here we introduce homologizer, a flexible Bayesian approach that uses a phylogenetic framework to infer the phasing of gene copies across loci into polyploid subgenomes.

Through the use of simulation tests we demonstrate that homologizer is robust to a wide range of factors, such as the phylogenetic informativeness of loci and incomplete lineage sorting. Furthermore, we establish the utility of homologizer on real data, by analyzing a multilocus dataset consisting of nine diploids and 19 tetraploids from the fern family Cystopteridaceae.

Finally, we describe how homologizer may potentially be used beyond its core phasing functionality to identify non-homologous sequences, such as hidden paralogs, contaminants, or allelic variation that was erroneously modelled as homeologous.