Interpretation of Maturity-Onset Diabetes of the Young Genetic Variants Based on American College of Medical Genetics and Genomics Criteria: Machine-Learning Model Development (Preprint)
BACKGROUND Maturity-onset diabetes of the young (MODY) is a group of dominantly inherited forms of monogenic diabetes, with <i>HNF4A</i>-MODY, <i>GCK</i>-MODY, and <i>HNF1A</i>-MODY as the three most common forms based on the causal genes. Molecular diagnosis of MODY is important for precise treatment. Although a DNA variant causing MODY can be assessed based on the criteria of the American College of Medical Genetics and Genomics (ACMG) guidelines, gene-specific assessment of disease-causing mutations is important to differentiate among MODY subtypes. Because the ACMG criteria were not originally designed for machine-learning algorithms, they are not truly independent variables.
OBJECTIVE The aim of this study was to develop machine-learning models for interpretation of DNA variants and MODY diagnosis using the ACMG criteria.
METHODS We applied machine-learning models to interpret DNA variants in MODY genes, as defined by the ACMG criteria, based on the Human Gene Mutation Database (HGMD) and the ClinVar database.
RESULTS With a machine-learning procedure, we found that the weight matrix of the ACMG criteria differed significantly among the three MODY genes <i>HNF1A</i>, <i>HNF4A</i>, and <i>GCK</i>. The models showed high predictive ability, with accuracy above 95%.
CONCLUSIONS Our results highlight the need for applying different weights of the ACMG criteria in relation to different MODY genes for accurate functional classification. As proof of principle, we applied the ACMG criteria as feature vectors in a machine-learning model and obtained a precision-based result.
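To illustrate the idea of using ACMG criteria as feature vectors with gene-specific weights, the following is a minimal sketch and not the authors' implementation. It assumes a plain logistic regression fitted separately per gene; the abstract does not state which model type was used. The criterion subset in `CRITERIA`, the helper names, and every variant in `TOY_VARIANTS` are hypothetical and carry no biological meaning.

```python
import math

# Illustrative subset of ACMG criterion codes used as binary features.
CRITERIA = ["PVS1", "PS1", "PM1", "PM2", "PP3", "BS1", "BP4"]


def encode(criteria_met):
    """Turn the set of ACMG criteria met by a variant into a 0/1 vector."""
    return [1.0 if c in criteria_met else 0.0 for c in CRITERIA]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(w, b, x):
    """Linear score: weighted sum of the criteria plus a bias term."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent; return (w, b)."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(score(w, b, xi)) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (pathogenic) or 0 (benign)."""
    return 1 if sigmoid(score(w, b, x)) >= 0.5 else 0


# Hypothetical toy data: (criteria met, label), 1 = pathogenic, 0 = benign.
# Real training data would come from curated databases, not from this dict.
TOY_VARIANTS = {
    "GCK": [
        ({"PVS1", "PM2"}, 1),
        ({"PM1", "PM2", "PP3"}, 1),
        ({"PS1", "PM2"}, 1),
        ({"BS1", "BP4"}, 0),
        ({"BP4"}, 0),
        ({"BS1"}, 0),
    ],
    "HNF1A": [
        ({"PVS1", "PM2"}, 1),
        ({"PS1", "PP3"}, 1),
        ({"PM1", "PP3"}, 1),
        ({"PM2", "BP4"}, 0),
        ({"BS1", "BP4"}, 0),
        ({"PM2", "BS1"}, 0),
    ],
}

# One model per gene, so each gene gets its own weight vector.
models = {}
for gene, variants in TOY_VARIANTS.items():
    X = [encode(criteria) for criteria, _ in variants]
    y = [label for _, label in variants]
    models[gene] = train_logistic(X, y)

if __name__ == "__main__":
    for gene, (w, b) in models.items():
        weights = ", ".join(f"{c}={wj:+.2f}" for c, wj in zip(CRITERIA, w))
        print(f"{gene}: {weights}, bias={b:+.2f}")
```

Running the script prints one learned weight per criterion for each gene, so the same criterion can be compared across genes. Fitting a separate model per gene is only one way to obtain gene-specific weights; a single model with gene-by-criterion interaction terms would be another.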