PSEA: A phenotypic similarity ensemble approach for prioritizing candidate genes to aid Mendelian disease diagnosis
Motivation: Next-generation sequencing (NGS) is increasingly applied to the molecular diagnosis of genetic disorders. However, interpreting NGS data remains challenging because of the massive number of variants it produces. Careful assessment is required to identify the most likely disease-causing variants that best match the patient's clinical phenotypes, a process that is highly experience-dependent and not cost-effective.
Results: The Human Phenotype Ontology (HPO), together with information content (IC), is widely used for phenotypic similarity evaluation. Here, we introduce PSEA, a new phenotypic similarity evaluation tool capable of quantifying groups of HPO terms in an unbiased manner. Compared with other methods, PSEA shows the best performance and a higher tolerance to phenotypic noise and incompleteness. We also developed a web server for disease-causing gene prioritization and for visualization of weighted HPO-gene linkages.
Availability: The source code and web service are freely available at https://github.com/zhonghua-wang/psea and https://phoenix.bgi.com/psea, respectively.