Self-Attention based model for de-novo antibiotic resistant gene classification with enhanced reliability for Out-of-Distribution data detection
Abstract
Monitoring antibiotic resistance is of paramount importance in the face of this ongoing global epidemic. Traditional alignment-based methods for detecting antibiotic resistant genes produce a large number of false negatives. In this paper, we introduce a deep learning model based on a self-attention architecture that classifies antibiotic resistant genes into their correct classes with high precision and recall, using only protein sequences as input. In addition, deep learning models trained with traditional optimization algorithms (e.g., Adam, SGD) provide poor posterior estimates when tested against Out-of-Distribution (OoD) antibiotic resistant/non-resistant genes. We therefore train our model with an optimization method called Preconditioned Stochastic Gradient Langevin Dynamics (pSGLD), which provides reliable uncertainty estimates when tested against OoD data.
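To make the pSGLD idea named above concrete, the following is a minimal NumPy sketch of the commonly cited update rule: an RMSprop-style running average of squared gradients defines a diagonal preconditioner, which scales both the gradient step and the injected Gaussian noise. It is an illustration under stated assumptions, not the training code used in this work. The function names (`psgld_step`, `sample_toy_posterior`), the hyperparameter values, and the one-dimensional Gaussian toy posterior are all hypothetical, and the curvature correction term of the full algorithm is omitted, as is often done in practice.

```python
import numpy as np


def psgld_step(theta, grad_log_post, v, rng,
               step_size=0.05, alpha=0.99, lam=1e-5):
    """One pSGLD update (sketch; hyperparameter defaults are assumptions).

    theta         : current parameter vector
    grad_log_post : callable returning the gradient of the log posterior
    v             : running average of squared gradients (same shape as theta)
    """
    g = grad_log_post(theta)
    # RMSprop-style second-moment estimate.
    v = alpha * v + (1.0 - alpha) * g * g
    # Diagonal preconditioner.
    precond = 1.0 / (lam + np.sqrt(v))
    # Gaussian noise with variance step_size * precond.
    noise = rng.normal(size=theta.shape) * np.sqrt(step_size * precond)
    # Preconditioned gradient ascent on the log posterior plus noise.
    theta = theta + 0.5 * step_size * precond * g + noise
    return theta, v


def sample_toy_posterior(n_steps=20000, burn_in=5000, seed=0):
    """Draw samples from a toy posterior N(2, 1) (assumed for illustration)."""
    rng = np.random.default_rng(seed)
    target_mean = 2.0

    def grad_log_post(theta):
        # Gradient of log N(theta | target_mean, 1).
        return -(theta - target_mean)

    theta = np.zeros(1)
    v = np.zeros(1)
    samples = []
    for step in range(n_steps):
        theta, v = psgld_step(theta, grad_log_post, v, rng)
        if step >= burn_in:
            samples.append(theta[0])
    return np.array(samples)


if __name__ == "__main__":
    draws = sample_toy_posterior()
    print("posterior mean estimate:", draws.mean())
```

In a classifier, the retained parameter samples would each produce a predictive distribution, and averaging those distributions gives the posterior predictive whose spread can flag OoD inputs. The toy target here only shows the mechanics of the sampler.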