Validation of De Novo Peptide Sequences with Bottom-Up Tag Convolution
De novo sequencing is indispensable for the analysis of proteins from organisms with unknown genomes, of novel splice variants, and of antibodies. However, despite the variety of methods developed to this end, distinguishing the correct interpretation of a mass spectrum from a number of incorrect alternatives often remains a challenge. Tag convolution is computed for a set of peptide sequence tags of a fixed length k generated from the input tandem mass spectra and can be viewed as a generalization of the well-known spectral convolution. We demonstrate its utility for validating de novo peptide sequences on a set of sequences generated by the PepNovo+ algorithm from high-resolution bottom-up data sets for carbonic anhydrase 2 and the Fab region of alemtuzumab, and we indicate its further potential applications.
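For orientation, the classical spectral convolution that tag convolution is said to generalize can be sketched as follows. This is only a minimal illustration of the standard definition (all pairwise mass differences between the peaks of two spectra, collected into a histogram), not the tag convolution itself. The function name, the bin width, and the toy peak lists are assumptions made for illustration and do not come from the data sets discussed here.

```python
from collections import Counter
from itertools import product


def spectral_convolution(spectrum_a, spectrum_b, bin_width=0.01):
    """Histogram of pairwise mass differences (a - b) between two peak lists.

    bin_width (in Da) is an illustrative assumption. Each difference is
    assigned to the nearest multiple of bin_width, and the histogram is
    keyed by the integer bin index.
    """
    counts = Counter()
    for a, b in product(spectrum_a, spectrum_b):
        counts[round((a - b) / bin_width)] += 1
    return counts


# Hypothetical toy data: every peak in peaks_b is a peak of peaks_a
# shifted by +15.99 Da, so one offset should dominate the histogram.
peaks_a = [100.0, 250.0, 400.0]
peaks_b = [115.99, 265.99, 415.99]

conv = spectral_convolution(peaks_a, peaks_b)
best_bin, support = conv.most_common(1)[0]
print(f"dominant offset ~ {best_bin * 0.01:.2f} Da, supported by {support} peak pairs")
```

In this toy case the most populated bin corresponds to an offset of about -15.99 Da shared by three peak pairs; a consistent mass shift between two spectra shows up as such a peak in the histogram.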