Deep Learning on High-Throughput Transcriptomics to Predict Drug-Induced Liver Injury

Author(s):  
Ting Li ◽  
Weida Tong ◽  
Ruth Roberts ◽  
Zhichao Liu ◽  
Shraddha Thakkar

Drug-induced liver injury (DILI) is one of the most cited reasons for the high drug attrition rate and drug withdrawal from the market. The accumulated large amount of high throughput transcriptomic profiles and advances in deep learning provide an unprecedented opportunity to improve the suboptimal performance of DILI prediction. In this study, we developed an eight-layer Deep Neural Network (DNN) model for DILI prediction using transcriptomic profiles of human cell lines (LINCS L1000 dataset) with the current largest binary DILI annotation data [i.e., DILI severity and toxicity (DILIst)]. The developed models were evaluated by Monte Carlo cross-validation (MCCV), permutation test, and an independent validation (IV) set. The developed DNN model achieved the area under the receiver operating characteristic curve (AUC) of 0.802 and 0.798, and balanced accuracy of 0.741 and 0.721 for training and an IV set, respectively, outperforming the conventional machine learning algorithms, including K-nearest neighbors (KNN), Support Vector Machine (SVM), and Random Forest (RF). Moreover, the developed DNN model provided a more balanced sensitivity of 0.839 and specificity of 0.603. Besides, we found the developed DNN model had a superior predictive performance for oncology drugs. Also, the functional and network analysis of genes driving the predictions revealed their relevance to the underlying mechanisms of DILI. The proposed DNN model could be a promising tool for early detection of DILI potential in the pre-clinical setting.
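The evaluation described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the split fraction, and the number of repeats are assumptions. It shows that balanced accuracy is the mean of sensitivity and specificity, and that Monte Carlo cross-validation repeatedly draws random train/test splits. The mean of the reported sensitivity (0.839) and specificity (0.603) equals the 0.721 balanced accuracy reported for the IV set.

```python
import random


def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy is the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


def monte_carlo_splits(n_samples, n_repeats, test_fraction, seed=0):
    """Yield (train_idx, test_idx) pairs from repeated random splits.

    Unlike k-fold CV, each repeat reshuffles all samples, so test sets
    from different repeats may overlap. The parameter values used in
    the study are not given in the abstract; the ones here are assumed.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield sorted(idx[n_test:]), sorted(idx[:n_test])


# Sensitivity and specificity as reported in the abstract.
print(round(balanced_accuracy(0.839, 0.603), 3))  # 0.721

# Hypothetical usage: 3 random 80/20 splits of 10 samples.
for train_idx, test_idx in monte_carlo_splits(10, n_repeats=3, test_fraction=0.2):
    print(len(train_idx), len(test_idx))
```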

Author(s):  
Robert Ancuceanu ◽  
Marilena Viorica Hovanet ◽  
Adriana Iuliana Anghel ◽  
Florentina Furtunescu ◽  
Monica Neagu ◽  
...  

Drug-induced liver injury (DILI) remains one of the challenges in the safety profile of both authorized and candidate drugs. Predicting hepatotoxicity from the chemical structure of a substance remains a challenge worth pursuing, and it is consistent with the current tendency to replace non-clinical tests with in vitro or in silico alternatives. In 2016, a group of researchers from the FDA published an improved annotated list of drugs with respect to their DILI risk, constituting “the largest reference drug list ranked by the risk for developing drug-induced liver injury in humans” (DILIrank). This paper is one of the few attempting to predict liver toxicity using the DILIrank dataset. Molecular descriptors were computed with the Dragon 7.0 software, and a variety of feature selection and machine learning algorithms were implemented in the R computing environment. Nested (double) cross-validation was used to externally validate the models selected. A total of 78 models with reasonable performance were selected and stacked through several approaches, including the building of multiple meta-models. The performance of the stacked models was slightly superior to that of other published models. The models were applied in a virtual screening exercise on over 100,000 compounds from the ZINC database, and about 20% of them were predicted to be non-hepatotoxic.


2020 ◽  
Vol 21 (6) ◽  
pp. 2114
Author(s):  
Robert Ancuceanu ◽  
Marilena Viorica Hovanet ◽  
Adriana Iuliana Anghel ◽  
Florentina Furtunescu ◽  
Monica Neagu ◽  
...  

Drug-induced liver injury (DILI) remains one of the challenges in the safety profile of both authorized and candidate drugs, and predicting hepatotoxicity from the chemical structure of a substance remains a task worth pursuing. Such an approach is coherent with the current tendency for replacing non-clinical tests with in vitro or in silico alternatives. In 2016, a group of researchers from the FDA published an improved annotated list of drugs with respect to their DILI risk, constituting “the largest reference drug list ranked by the risk for developing drug-induced liver injury in humans” (DILIrank). This paper is one of the few attempting to predict liver toxicity using the DILIrank dataset. Molecular descriptors were computed with the Dragon 7.0 software, and a variety of feature selection and machine learning algorithms were implemented in the R computing environment. Nested (double) cross-validation was used to externally validate the models selected. A total of 78 models with reasonable performance were selected and stacked through several approaches, including the building of multiple meta-models. The performance of the stacked models was slightly superior to other models published. The models were applied in a virtual screening exercise on over 100,000 compounds from the ZINC database and about 20% of them were predicted to be non-hepatotoxic.
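The nested (double) cross-validation mentioned above can be illustrated with index bookkeeping alone. The study itself was carried out in R; the Python sketch below is only an assumption-laden illustration. The function names and fold counts are hypothetical, and shuffling and stratification are omitted for brevity. The key property is that inner folds, used for model selection, are drawn only from the outer training portion, so each outer test fold stays untouched until the final evaluation.

```python
def k_fold_indices(indices, k):
    """Split a list of indices into k disjoint, nearly equal folds."""
    return [indices[i::k] for i in range(k)]


def nested_cv(n_samples, outer_k=5, inner_k=3):
    """Yield (outer_train, outer_test, inner_splits) for nested CV.

    inner_splits is a list of (inner_train, inner_val) pairs built only
    from outer_train, so the outer test fold never informs model or
    feature selection.
    """
    outer_folds = k_fold_indices(list(range(n_samples)), outer_k)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [j for f, fold in enumerate(outer_folds)
                       if f != i for j in fold]
        inner_folds = k_fold_indices(outer_train, inner_k)
        inner_splits = []
        for m, inner_val in enumerate(inner_folds):
            inner_train = [j for f, fold in enumerate(inner_folds)
                           if f != m for j in fold]
            inner_splits.append((inner_train, inner_val))
        yield outer_train, outer_test, inner_splits


# Hypothetical usage on 20 samples: tune on inner splits, score on outer_test.
for outer_train, outer_test, inner_splits in nested_cv(20):
    print(len(outer_train), len(outer_test), len(inner_splits))
```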


2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Anika Liu ◽  
Moritz Walter ◽  
Peter Wright ◽  
Aleksandra Bartosik ◽  
Daniela Dolciami ◽  
...  

Abstract Background: Drug-induced liver injury (DILI) is a major safety concern characterized by a complex and diverse pathogenesis. In order to identify DILI early in drug development, a better understanding of the injury and models with better predictivity are urgently needed. One approach in this regard is in silico models, which aim to predict the risk of DILI based on the compound structure. However, these models do not yet show sufficient predictive performance or interpretability to be useful for decision making by themselves, the former partially stemming from the underlying problem of labeling the in vivo DILI risk of compounds in a meaningful way for generating machine learning models. Results: As part of the Critical Assessment of Massive Data Analysis (CAMDA) “CMap Drug Safety Challenge” 2019 (http://camda2019.bioinf.jku.at), chemical structure-based models were generated using the binarized DILIrank annotations. Support Vector Machine (SVM) and Random Forest (RF) classifiers showed comparable performance to previously published models, with a mean balanced accuracy over models generated using 5-fold LOCO-CV inside a 10-fold training scheme of 0.759 ± 0.027 when predicting an external test set. In the models which used predicted protein targets as compound descriptors, we identified the most information-rich proteins, which agreed with the mechanisms of action and toxicity of nonsteroidal anti-inflammatory drugs (NSAIDs), one of the most important drug classes causing DILI, as well as with stress response via TP53 and biotransformation. In addition, we identified multiple proteins involved in xenobiotic metabolism which could be novel DILI-related off-targets, such as CLK1 and DYRK2. Moreover, we derived potential structural alerts for DILI with high precision, including furan and hydrazine derivatives; however, all derived alerts were present in approved drugs and were overly specific, indicating the need to consider quantitative variables such as dose.
Conclusion: Using chemical structure-based descriptors such as structural fingerprints and predicted protein targets, DILI prediction models were built with a predictive performance comparable to previous literature. In addition, we derived insights on proteins and pathways statistically (and potentially causally) linked to DILI from these models and inferred new structural alerts related to this adverse endpoint.
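The trade-off noted for the structural alerts, high precision combined with over-specificity, can be made concrete with a small sketch. Everything here is hypothetical: the function names and the toy labels are invented for illustration and are not data from the study. Precision is the fraction of alert-containing compounds labeled DILI-positive; recall is the fraction of DILI-positive compounds that the alert flags at all.

```python
def alert_precision(has_alert, is_dili):
    """Fraction of alert-containing compounds that are DILI-positive."""
    flagged = [label for alert, label in zip(has_alert, is_dili) if alert]
    return sum(flagged) / len(flagged) if flagged else None


def alert_recall(has_alert, is_dili):
    """Fraction of DILI-positive compounds that contain the alert."""
    positives = [alert for alert, label in zip(has_alert, is_dili) if label]
    return sum(positives) / len(positives) if positives else None


# Toy data (hypothetical): 8 compounds, 6 DILI-positive, 3 carrying the alert.
has_alert = [True, True, True, False, False, False, False, False]
is_dili = [True, True, True, True, True, True, False, False]

print(alert_precision(has_alert, is_dili))  # 1.0
print(alert_recall(has_alert, is_dili))     # 0.5
```

In this toy case every flagged compound is DILI-positive, yet half of the DILI-positive compounds are missed, which is the pattern an overly specific alert would produce.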


2015 ◽  
Vol 55 (10) ◽  
pp. 2085-2093 ◽  
Author(s):  
Youjun Xu ◽  
Ziwei Dai ◽  
Fangjin Chen ◽  
Shuaishi Gao ◽  
Jianfeng Pei ◽  
...  

2021 ◽  
Author(s):  
Tao Zhong ◽  
Zian Zhuang ◽  
Xiaoli Dong ◽  
Ka Hing Wong ◽  
Wing Tak Wong ◽  
...  

BACKGROUND Tuberculosis (TB) is a pandemic, being one of the top 10 causes of death and the main cause of death from a single source of infection. Drug-induced liver injury (DILI) is the most common and serious side effect during the treatment of TB. OBJECTIVE We aim to predict the status of liver injury in patients with TB at the clinical treatment stage. METHODS We designed an interpretable prediction model based on the XGBoost algorithm and identified the most robust and meaningful predictors of the risk of TB-DILI on the basis of clinical data extracted from the Hospital Information System of Shenzhen Nanshan Center for Chronic Disease Control from 2014 to 2019. RESULTS In total, 757 patients were included, and 287 (38%) had developed TB-DILI. Based on values of relative importance and area under the receiver operating characteristic curve, machine learning tools selected patients’ most recent alanine transaminase levels, average rate of change of patients’ last 2 measures of alanine transaminase levels, cumulative dose of pyrazinamide, and cumulative dose of ethambutol as the best predictors for assessing the risk of TB-DILI. In the validation data set, the model had a precision of 90%, recall of 74%, classification accuracy of 76%, and balanced error rate of 77% in predicting cases of TB-DILI. The area under the receiver operating characteristic curve score upon 10-fold cross-validation was 0.912 (95% CI 0.890-0.935). In addition, the model provided warnings of high risk for patients in advance of DILI onset for a median of 15 (IQR 7.3-27.5) days. CONCLUSIONS Our model shows high accuracy and interpretability in predicting cases of TB-DILI, which can provide useful information to clinicians to adjust the medication regimen and avoid more serious liver injury in patients.
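Two of the predictors named above, the most recent alanine transaminase (ALT) level and the rate of change between the last two ALT measurements, can be derived from a patient's measurement history. The sketch below is an assumption: the abstract does not define the rate precisely, so it is taken here as the difference in ALT divided by the number of days between the two latest measurements. The function name and the example values are hypothetical.

```python
from datetime import date


def alt_features(measurements):
    """Return (latest_alt, rate_per_day) from (date, ALT) tuples.

    measurements may be in any order and must contain at least two
    entries taken on different days. The per-day rate definition is an
    assumption, not taken from the study.
    """
    if len(measurements) < 2:
        raise ValueError("need at least two ALT measurements")
    ordered = sorted(measurements)
    (d_prev, alt_prev), (d_last, alt_last) = ordered[-2], ordered[-1]
    days = (d_last - d_prev).days
    if days <= 0:
        raise ValueError("last two measurements must be on different days")
    return alt_last, (alt_last - alt_prev) / days


# Hypothetical patient history (ALT in U/L), deliberately unsorted.
history = [
    (date(2019, 1, 1), 30.0),
    (date(2019, 1, 15), 44.0),
    (date(2019, 1, 8), 35.0),
]
latest, rate = alt_features(history)
print(latest, round(rate, 3))
```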


2020 ◽  
Vol 17 (7) ◽  
pp. 2628-2637 ◽  
Author(s):  
Eni Minerali ◽  
Daniel H. Foil ◽  
Kimberley M. Zorn ◽  
Thomas R. Lane ◽  
Sean Ekins

2020 ◽  
Author(s):  
Anika Liu ◽  
Moritz Walter ◽  
Peter Wright ◽  
Aleksandra Maria Bartosik ◽  
Daniela Dolciami ◽  
...  

Abstract Background: Drug-induced liver injury (DILI) is a major safety concern characterized by a complex and diverse pathogenesis. In order to identify DILI early in drug development, a better understanding of the injury and models with better predictivity are urgently needed. One approach in this regard is in silico models, which aim to predict the risk of DILI based on the compound structure. However, these models do not yet show sufficient predictive performance or interpretability to be useful for decision making by themselves, the former partially stemming from the underlying problem of labeling the in vivo DILI risk of compounds in a meaningful way for generating machine learning models. Results: As part of the Critical Assessment of Massive Data Analysis (CAMDA) “CMap Drug Safety Challenge” 2019 (http://papers.camda.info/), chemical structure-based models were generated using the binarized DILIrank annotations. Support Vector Machine (SVM) and Random Forest (RF) classifiers showed comparable performance to previously published models, with a mean balanced accuracy over models generated using 5-fold cross-validation inside a 10-fold training scheme of 0.759 ± 0.027 when predicting an external test set. In the models which used predicted protein targets as compound descriptors, we identified the most information-rich proteins, which agreed with the mechanisms of action and toxicity of nonsteroidal anti-inflammatory drugs (NSAIDs), one of the most important drug classes causing DILI, and with the previously established stress response pathways mediated by mitochondria, p38 MAPK and TP53. In addition, we identified multiple proteins involved in xenobiotic metabolism which could be novel DILI-related off-targets, such as CLK1 and DDR2. Moreover, we derived potential structural alerts for DILI with high precision, including furan and hydrazine derivatives; however, all derived alerts were present in approved drugs and were overly specific, indicating the need to consider quantitative variables such as dose. Conclusion: Using chemical structure-based descriptors such as structural fingerprints and predicted protein targets, DILI prediction models were built with a predictive performance comparable to previous literature. In addition, we derived insights on proteins and pathways statistically (and potentially causally) linked to DILI from these models and inferred new structural alerts related to this adverse endpoint.

