Despite the increasing feasibility of sequencing whole genomes from
diverse taxa, a persistent problem in phylogenomics is the selection of
appropriate markers or loci for a given taxonomic group or research
question. In this review, we aim to streamline the decision-making
process for selecting data types used in phylogenomic studies by
providing an introduction to commonly used types of genomic data, their
characteristics, and their associated uses in phylogenomics.
Specifically, we review the uses and features of ultraconserved elements
(UCEs; including flanking regions), anchored hybrid enrichment (AHE)
loci, conserved non-exonic elements (CNEEs), untranslated regions (UTRs),
introns, exons, mitochondrial DNA (mtDNA), single nucleotide
polymorphisms (SNPs), and anonymous regions (nonspecific regions that
are evenly or randomly distributed across the genome). These
various data types differ in their mutation rates, likelihood of
neutrality or of being strongly linked to loci under selection, and mode
of inheritance, each of which is an important consideration in
phylogenomic reconstruction. These features give each genomic region or
data type important advantages and disadvantages, depending on the
biological question, number of taxa, evolutionary timescale, and
analytical methods used. We provide a concise outline (Table
1) as a resource for efficiently comparing the key aspects of
each data type. Because many factors must be considered
when designing phylogenomic studies, this review may serve as a primer
for weighing the options among multiple potential phylogenomic data
types.