Abstract

Erroneous data can creep into sequence datasets for reasons ranging from contamination to annotation and alignment mistakes. These errors can reduce the accuracy of downstream analyses such as tree inference and, even when they do not affect the analysis, can diminish the community's confidence in the results. As datasets keep growing, checking for errors visually has become difficult, and automatic error detection methods are needed more than ever. Widely used alignment masking methods remove entire aligned sites and may therefore reduce signal as much as, or more than, they reduce noise. An alternative is to design targeted methods that look for errors in small species-specific stretches of the alignment by detecting outliers. Crucially, such a method should attempt to distinguish real heterogeneity, which includes signal, from errors. This type of error filtering is surprisingly under-explored. In this paper, we introduce TAPER, an automatic algorithm that looks for small stretches of error in sequence alignments. Our results show that TAPER removes very little data yet finds much of the error and cleans up the alignments.