Machine-Learning Prediction of Drug-Induced Cardiac Arrhythmia: Analysis of Gene Expression and Clustering

2018 ◽  
Vol 46 (3) ◽  
pp. 245-275
Author(s):  
Dennis Michael Bergau ◽  
Cong Liu ◽  
Richard L. Magin ◽  
Hui Lu
2021 ◽  
Vol 12 ◽  
Author(s):  
Wojciech Lesiński ◽  
Krzysztof Mnich ◽  
Witold R. Rudnicki

Motivation: Drug-induced liver injury (DILI) is one of the primary problems in drug development. Early prediction of DILI, based on the chemical properties of substances and experiments performed on cell lines, would bring a significant reduction in the cost of clinical trials and faster development of drugs. The current study aims to build predictive models of risk of DILI for chemical compounds using multiple sources of information. Methods: Using several supervised machine learning algorithms, we built predictive models for several alternative splits of compounds between DILI and non-DILI classes. To this end, we used chemical properties of the given compounds, their effects on gene expression levels in six human cell lines treated with them, as well as their toxicological profiles. First, we identified the most informative variables in all data sets. Then, these variables were used to build machine learning models. Finally, composite models were built with the Super Learner approach. All modeling was performed using multiple repeats of cross-validation for unbiased and precise estimates of performance. Results: With one exception, gene expression profiles of human cell lines were non-informative and resulted in random models. Toxicological reports were not useful for prediction of DILI. The best results were obtained for models discerning between harmless compounds and those for which any level of DILI was observed (AUC = 0.75). These models were built with the Random Forest algorithm that used molecular descriptors.
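
The abstract above reports performance as AUC estimated over multiple repeats of cross-validation. The sketch below illustrates those two ideas only; it is not the authors' pipeline. The function names, the toy labels and scores, and the fold counts are all assumptions chosen for illustration, and no model is trained here.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_kfold(n_samples, k, repeats, seed=0):
    """Yield (repeat, train_idx, test_idx), reshuffling on every repeat."""
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [idx[i::k] for i in range(k)]
        for i, test in enumerate(folds):
            train = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
            yield r, train, test


# Toy data (hypothetical): two negatives, two positives.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# 3 repeats of 5-fold cross-validation over 10 hypothetical compounds.
splits = list(repeated_kfold(n_samples=10, k=5, repeats=3))
```

In practice one would fit a classifier on each training split, score the held-out split, and average the per-split AUC values over all repeats to reduce the variance of the estimate.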


Author(s):  
Yue Wu ◽  
Jieqiang Zhu ◽  
Peter Fu ◽  
Weida Tong ◽  
Huixiao Hong ◽  
...  

An effective approach for assessing a drug’s potential to induce autoimmune diseases (ADs) is needed in drug development. Here, we aim to develop a workflow to examine the association between structural alerts and drug-induced ADs to improve toxicological prescreening tools. Considering reactive metabolite (RM) formation as a well-documented mechanism for drug-induced ADs, we investigated whether the presence of certain RM-related structural alerts was predictive for the risk of drug-induced AD. We constructed a database containing 171 RM-related structural alerts, generated a dataset of 407 AD- and non-AD-associated drugs, and performed statistical analysis. The nitrogen-containing benzene substituent alerts were found to be significantly associated with the risk of drug-induced ADs (odds ratio = 2.95, p = 0.0036). Furthermore, we developed a machine-learning-based predictive model using daily dose and nitrogen-containing benzene substituent alerts as the top inputs and achieved an area under the curve (AUC) of 70%. Additionally, we confirmed the reactivity of the nitrogen-containing benzene substituent aniline and related metabolites using quantum chemistry analysis and explored the underlying mechanisms. These identified structural alerts could be helpful in identifying drug candidates that carry a potential risk of drug-induced ADs to improve their safety profiles.
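
The association above is summarized as an odds ratio. As a rough illustration of how such a figure is derived from a 2x2 table of alert presence against AD association, here is a minimal sketch. The cell counts are hypothetical (the abstract does not report them), the function name is an assumption, and the confidence interval uses Woolf's logit approximation, which may differ from the test the authors applied.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI (Woolf's logit method).

    a: alert present, AD-associated    b: alert present, not AD-associated
    c: alert absent,  AD-associated    d: alert absent,  not AD-associated
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimate")
    # Odds of AD with the alert (a/b) divided by odds without it (c/d).
    ratio = (a * d) / (b * c)
    # Standard error of log(OR) under Woolf's approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - z * se)
    upper = math.exp(math.log(ratio) + z * se)
    return ratio, lower, upper


# Hypothetical counts, NOT taken from the study.
ratio, lower, upper = odds_ratio_ci(20, 10, 30, 60)
```

An interval whose lower bound stays above 1 is the usual informal sign that the alert is associated with higher odds of the outcome.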


2021 ◽  
Vol 11 (2) ◽  
pp. 61
Author(s):  
Jiande Wu ◽  
Chindo Hicks

Background: Breast cancer is a heterogeneous disease defined by molecular types and subtypes. Advances in genomic research have enabled use of precision medicine in clinical management of breast cancer. A critical unmet medical need is distinguishing triple negative breast cancer, the most aggressive and lethal form of breast cancer, from non-triple negative breast cancer. Here we propose use of a machine learning (ML) approach for classification of triple negative breast cancer and non-triple negative breast cancer patients using gene expression data. Methods: We performed analysis of RNA-Sequence data from 110 triple negative and 992 non-triple negative breast cancer tumor samples from The Cancer Genome Atlas to select the features (genes) used in the development and validation of the classification models. We evaluated four different classification models, including Support Vector Machines, K-nearest neighbor, Naïve Bayes and Decision tree, using features selected at different threshold levels to train the models for classifying the two types of breast cancer. For performance evaluation and validation, the proposed methods were applied to independent gene expression datasets. Results: Among the four ML algorithms evaluated, the Support Vector Machine algorithm classified breast cancer into triple negative and non-triple negative types more accurately and made fewer misclassification errors than the other three algorithms. Conclusions: The prediction results show that ML algorithms are efficient and can be used for classification of breast cancer into triple negative and non-triple negative breast cancer types.
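
Of the four classifiers named above, K-nearest neighbor is the simplest to show in a few lines. The sketch below is a generic illustration, not the authors' implementation: the two-gene expression values, the labels, the function name and the choice of k = 3 are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest neighbors."""
    # Euclidean distance from the query to every training sample.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical expression values for two selected genes per tumor sample.
train_x = [(1.0, 0.9), (1.2, 1.1), (0.8, 1.0),
           (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
train_y = ["TNBC", "TNBC", "TNBC",
           "non-TNBC", "non-TNBC", "non-TNBC"]

pred_a = knn_predict(train_x, train_y, (1.1, 1.0))
pred_b = knn_predict(train_x, train_y, (3.0, 3.0))
```

With real data the feature vectors would hold the selected genes' expression levels, and with classes as imbalanced as 110 versus 992 samples, accuracy alone would be a weak measure of performance.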


2016 ◽  
Vol 119 (suppl_1) ◽  
Author(s):  
Elena Matsa ◽  
Paul W Burridge ◽  
Kun-Hsing Yu ◽  
Haodi Wu ◽  
Vittavat Termglinchan ◽  
...  

Rapid improvements in human induced pluripotent stem cell (hiPSC) differentiation methodologies have allowed previously unattainable access to high-purity, patient-specific cardiomyocytes (CMs) for use in disease modeling, cardiac regeneration, and drug testing. In the present study, we investigate the ability of hiPSC-derived cardiomyocytes (hiPSC-CMs) to reflect the donor’s genetic identity and serve as preclinical functional readout platforms for precision medicine. We used footprint-free Sendai virus to create two separate hiPSC clones from the fibroblasts of five different individuals lacking known mutations associated with cardiovascular disease. Whole genome expression profiling of hiPSC-CMs showed that inter-patient variation was greater than intra-patient variation, thereby verifying that reprogramming and cardiac differentiation technologies can preserve patient-specific gene expression signatures. Gene ontologies (GOs) accounting for inter-patient variation were mostly metabolic or epigenetic. Toxicology analysis based on gene expression profiles predicted patient-specific susceptibility of hiPSC-CMs to cardiotoxicity, and functional assays using drugs targeting key regulators in pathways predicted to produce cardiotoxicity showed inter-patient differential responses in hiPSC-CMs. Our data suggest that hiPSC-CMs can be used in vitro to predict and help prevent patient-specific drug-induced cardiotoxicity, potentially enabling personalized patient consultation in the future.


Cell Cycle ◽  
2018 ◽  
Vol 17 (4) ◽  
pp. 486-491 ◽  
Author(s):  
Nicolas Borisov ◽  
Victor Tkachev ◽  
Maria Suntsova ◽  
Olga Kovalchuk ◽  
Alex Zhavoronkov ◽  
...  

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5285 ◽  
Author(s):  
Mei Sze Tan ◽  
Siow-Wee Chang ◽  
Phaik Leng Cheah ◽  
Hwa Jen Yap

Although most cervical cancer cases are reported to be closely related to Human Papillomavirus (HPV) infection, there is a need to study genes that stand up differentially in the final actualization of cervical cancers following HPV infection. In this study, we proposed an integrative machine learning approach to analyse multiple gene expression profiles in cervical cancer in order to identify a set of genetic markers that are associated with and may eventually aid in the diagnosis or prognosis of cervical cancers. The proposed integrative analysis is composed of three steps: namely, (i) gene expression analysis of individual dataset; (ii) meta-analysis of multiple datasets; and (iii) feature selection and machine learning analysis. As a result, 21 gene expressions were identified through the integrative machine learning analysis, which included seven supervised methods and one unsupervised method. A functional analysis with GSEA (Gene Set Enrichment Analysis) was performed on the selected 21-gene expression set and showed significant enrichment in a potential nine-gene expression signature, namely PEG3, SPON1, BTD and RPLP2 (upregulated genes) and PRDX3, COPB2, LSM3, SLC5A3 and AS1B (downregulated genes).

