Fungal Genomes: Suffering with Functional Annotation Errors
Abstract Background Genome sequencing data are accumulating at a rapid pace; genome sequences of more than 5780 species are publicly available in the National Center for Biotechnology Information (NCBI) database alone. For research communities to use these data, however, error-free functional annotation is essential. Results The whole-proteome sequence data of 689 fungal species (7.15 million protein sequences) were analysed for functional annotation errors, which were detected in several species. Calcium-dependent protein kinases (CDPKs) and selenoproteins were targeted for the analysis because both are absent across the fungal kingdom. The analyses nevertheless revealed proteins carrying the functional annotation name CDPK. InterProScan analysis showed that none of the protein sequences tagged with the name “calcium dependent protein kinase” encodes calcium-binding EF-hands in the regulatory domain. Similarly, none of the protein sequences whose annotation name is associated with “selenocysteine” contains the selenocysteine (Sec, U) residue. Conclusion The presence of such functional annotation errors in the fungal kingdom raises great concern and needs to be addressed as early as possible.
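The name-based screen described in the Results can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the FASTA header layout, and the example records are hypothetical and chosen only for illustration. The sketch flags records whose annotation mentions selenocysteine but whose sequence has no U residue, and it collects CDPK-named records as candidates for a separate domain search (for example with InterProScan), which the code itself does not perform.

```python
from typing import Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def suspect_selenoproteins(text: str) -> List[str]:
    """Headers that mention selenocysteine although the sequence lacks U (Sec)."""
    flagged = []
    for header, seq in parse_fasta(text):
        if "selenocysteine" in header.lower() and "U" not in seq.upper():
            flagged.append(header)
    return flagged


def cdpk_candidates(text: str) -> List[str]:
    """Headers annotated as CDPK; EF-hand content must be checked separately."""
    # Assumed spellings of the annotation name; real headers may differ.
    names = (
        "calcium dependent protein kinase",
        "calcium-dependent protein kinase",
        "cdpk",
    )
    return [
        header
        for header, _ in parse_fasta(text)
        if any(name in header.lower() for name in names)
    ]


# Made-up records, not taken from any real proteome.
EXAMPLE = """>prot1 hypothetical selenocysteine-containing protein
MKTAYIAKQR
>prot2 hypothetical selenocysteine-containing protein
MKUAYIAKQR
>prot3 calcium dependent protein kinase
MSTNPKPQRK
"""

if __name__ == "__main__":
    print(suspect_selenoproteins(EXAMPLE))
    print(cdpk_candidates(EXAMPLE))
```

A keyword match of this kind only identifies candidate misannotations; confirming that a CDPK-named protein lacks EF-hands still requires a domain analysis of the sequence.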