Revising transcriptome assemblies with phylogenetic information in Agalma1.0
Abstract
Motivation: One of the most common transcriptome assembly errors is to mistake different transcripts of the same gene for transcripts from multiple closely related genes. These errors are difficult to identify during assembly, but in a phylogenetic analysis they can be diagnosed from gene trees containing clades of tips from the same species with improbably short branch lengths.
Results: treeinform is a module implemented in Agalma1.0 that uses phylogenetic analyses across species to refine transcriptome assemblies. It identifies transcripts of the same gene that were incorrectly assigned to multiple genes and reassigns them to a single gene.
Availability and Implementation: treeinform is implemented in Agalma1.0, available at https://bitbucket.org/caseywdunn/[email protected]
Supplementary information: Supplementary information is available at bioRxiv.
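The diagnostic described in the abstract can be illustrated with a minimal sketch. It is not the treeinform implementation: the `Node` class, the `species@transcript` tip naming convention, the `candidate_clades` function, and the use of summed branch length within a clade as the "improbably short" criterion are all assumptions made for illustration. The sketch walks a gene tree and reports maximal clades whose tips all come from one species and whose total internal branch length falls below a user-chosen threshold.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """A gene-tree node; `length` is the branch leading to this node."""
    name: str = ""
    length: float = 0.0
    children: List["Node"] = field(default_factory=list)


def species_of(tip_name: str) -> str:
    # Assumed (hypothetical) tip naming convention: "<species>@<transcript id>".
    return tip_name.split("@", 1)[0]


def candidate_clades(root: Node, threshold: float) -> List[List[str]]:
    """Return tip names of maximal single-species clades whose summed
    branch length (excluding the clade's stem) is below `threshold`."""
    found: List[List[str]] = []

    def visit(node: Node) -> Tuple[List[str], float, bool]:
        # Returns (tips under node, branch length under node, qualifies).
        if not node.children:
            return [node.name], 0.0, True
        tips: List[str] = []
        total = 0.0
        results = []
        for child in node.children:
            child_tips, child_total, child_ok = visit(child)
            results.append((child_tips, child_ok))
            tips.extend(child_tips)
            total += child_total + child.length
        one_species = len({species_of(t) for t in tips}) == 1
        ok = one_species and total < threshold
        if not ok:
            # This node fails, so any qualifying child clade is maximal.
            for child_tips, child_ok in results:
                if child_ok and len(child_tips) > 1:
                    found.append(child_tips)
        return tips, total, ok

    tips, _, ok = visit(root)
    if ok and len(tips) > 1:
        found.append(tips)
    return found


# Toy gene tree: two nearly identical sp_a tips and one distant sp_b tip.
tree = Node(children=[
    Node(length=0.3, children=[
        Node("sp_a@t1", 0.001),
        Node("sp_a@t2", 0.002),
    ]),
    Node("sp_b@t1", 0.4),
])
print(candidate_clades(tree, threshold=0.01))
```

In this toy tree the two sp_a tips form a clade with very little branch length between them, so they are flagged as candidates for reassignment to a single gene, while the sp_b tip is left alone. The threshold value here is arbitrary; choosing it in practice would depend on the branch length distribution of the data.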