Computational medicine: quantitative modeling of complex diseases

2019 ◽  
Vol 21 (2) ◽  
pp. 429-440 ◽  
Author(s):  
Basant K Tiwary

Abstract: Biological complex systems are composed of numerous components that interact within and across different scales. The ever-increasing generation of high-throughput biomedical data has given us an opportunity to develop quantitative models of nonlinear biological systems, with implications for health and disease. Multidimensional molecular data can be modeled using various statistical methods at different scales of biological organization, such as the genome, transcriptome and proteome. I will discuss recent advances in the application of computational medicine to complex diseases, including network-based studies, genome-scale metabolic modeling, kinetic modeling and support vector machines, with specific examples from cancer, psychiatric disorders and type 2 diabetes. Recent advances in translating these computational models into diagnosis and the identification of drug targets for complex diseases are discussed, as well as the challenges researchers and clinicians face in taking computational medicine from bench to bedside.

eLife ◽  
2014 ◽  
Vol 3 ◽  
Author(s):  
Keren Yizhak ◽  
Edoardo Gaude ◽  
Sylvia Le Dévédec ◽  
Yedael Y Waldman ◽  
Gideon Y Stein ◽  
...  

Utilizing molecular data to derive functional physiological models tailored for specific cancer cells can facilitate the use of individually tailored therapies. To this end we present an approach termed PRIME for generating cell-specific genome-scale metabolic models (GSMMs) based on molecular and phenotypic data. We build >280 models of normal and cancer cell-lines that successfully predict metabolic phenotypes in an individual manner. We utilize this set of cell-specific models to predict drug targets that selectively inhibit cancerous but not normal cell proliferation. The top predicted target, MLYCD, is experimentally validated and the metabolic effects of MLYCD depletion investigated. Furthermore, we tested cell-specific predicted responses to the inhibition of metabolic enzymes, and successfully inferred the prognosis of cancer patients based on their PRIME-derived individual GSMMs. These results lay a computational basis and a counterpart experimental proof of concept for future personalized metabolic modeling applications, enhancing the search for novel selective anticancer therapies.


2020 ◽  
Vol 27 ◽  
Author(s):  
Gabriela Bitencourt-Ferreira ◽  
Camila Rizzotto ◽  
Walter Filgueira de Azevedo Junior

Background: Analysis of atomic coordinates of protein-ligand complexes can provide three-dimensional data to generate computational models to evaluate binding affinity and thermodynamic state functions. Application of machine learning techniques can create models to assess protein-ligand potential energy and binding affinity. These methods show superior predictive performance when compared with classical scoring functions available in docking programs. Objective: Our purpose here is to review the development and application of the program SAnDReS. We describe the creation of machine learning models to assess the binding affinity of protein-ligand complexes. Method: SAnDReS implements machine learning methods available in the scikit-learn library. This program is available for download at https://github.com/azevedolab/sandres. SAnDReS uses crystallographic structures, binding, and thermodynamic data to create targeted scoring functions. Results: Recent applications of the program SAnDReS to drug targets such as coagulation factor Xa, cyclin-dependent kinases, and HIV-1 protease were able to create targeted scoring functions to predict inhibition of these proteins. These targeted models outperform classical scoring functions. Conclusion: Here, we reviewed the development of machine learning scoring functions to predict binding affinity through the application of the program SAnDReS. Our studies show the superior predictive performance of the SAnDReS-developed models when compared with classical scoring functions available in programs such as AutoDock4, Molegro Virtual Docker, and AutoDock Vina.


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Itziar Irigoien ◽  
Basilio Sierra ◽  
Concepción Arenas

In the problem of one-class classification (OCC), one of the classes, the target class, has to be distinguished from all other possible objects, which are considered nontargets. This situation arises in many biomedical problems, for example in diagnosis, image-based tumor recognition, or the analysis of electrocardiogram data. In this paper an approach to OCC based on a typicality test is experimentally compared with reference state-of-the-art OCC techniques (Gaussian, mixture of Gaussians, naive Parzen, Parzen, and support vector data description) using biomedical data sets. We evaluate the ability of the procedures using twelve experimental data sets with not necessarily continuous data. As there are few benchmark data sets for one-class classification, all data sets considered in the evaluation have multiple classes. Each class in turn is considered as the target class, and the units in the other classes are considered as new units to be classified. The results of the comparison show the good performance of the typicality approach, which is applicable to high-dimensional data; it is worth mentioning that it can be used for any kind of data (continuous, discrete, or nominal), whereas applying state-of-the-art approaches is not straightforward when nominal variables are present.
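
The one-class setting described above can be illustrated with a minimal sketch of the Gaussian baseline, one of the reference techniques the abstract lists. This is not the typicality test the paper proposes. The function names, the diagonal-covariance simplification, the `accept_fraction` parameter, and the toy data are all assumptions made for illustration.

```python
import math


def distance(model, x):
    """Scaled Euclidean distance of x from the fitted target-class mean."""
    return math.sqrt(sum(
        (xi - m) ** 2 / v
        for xi, m, v in zip(x, model["means"], model["variances"])
    ))


def fit_gaussian_occ(targets, accept_fraction=0.95):
    """Fit a diagonal-covariance Gaussian to target rows only (an assumed
    simplification) and choose a distance threshold from the training data."""
    n, d = len(targets), len(targets[0])
    means = [sum(row[j] for row in targets) / n for j in range(d)]
    # Per-feature sample variance; the floor avoids division by zero.
    variances = [
        max(sum((row[j] - means[j]) ** 2 for row in targets) / (n - 1), 1e-12)
        for j in range(d)
    ]
    model = {"means": means, "variances": variances}
    dists = sorted(distance(model, row) for row in targets)
    # Threshold: distance below which accept_fraction of training targets fall.
    idx = min(n - 1, max(0, math.ceil(accept_fraction * n) - 1))
    model["threshold"] = dists[idx]
    return model


def is_target(model, x):
    """Accept x as a target if it lies within the learned threshold."""
    return distance(model, x) <= model["threshold"]


# Toy target-class measurements (hypothetical values).
targets = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [1.1, 2.0], [1.0, 2.2], [0.8, 1.8]]
model = fit_gaussian_occ(targets)
print(is_target(model, [1.0, 2.0]), is_target(model, [5.0, 9.0]))
```

Only target-class rows are used for fitting, which is the defining constraint of OCC. Nontarget units appear only at evaluation time, as in the paper's protocol where each class in turn plays the target.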


Microbiology ◽  
2014 ◽  
Vol 160 (6) ◽  
pp. 1252-1266 ◽  
Author(s):  
Hassan B. Hartman ◽  
David A. Fell ◽  
Sergio Rossell ◽  
Peter Ruhdal Jensen ◽  
Martin J. Woodward ◽  
...  

Salmonella enterica sv. Typhimurium is an established model organism for Gram-negative, intracellular pathogens. Owing to the rapid spread of resistance to antibiotics among this group of pathogens, new approaches to identify suitable target proteins are required. Based on the genome sequence of S. Typhimurium and associated databases, a genome-scale metabolic model was constructed. Output was based on an experimental determination of the biomass of Salmonella when growing in glucose minimal medium. Linear programming was used to simulate variations in the energy demand while growing in glucose minimal medium. By grouping reactions with similar flux responses, a subnetwork of 34 reactions responding to this variation was identified (the catabolic core). This network was used to identify sets of one and two reactions that when removed from the genome-scale model interfered with energy and biomass generation. Eleven such sets were found to be essential for the production of biomass precursors. Experimental investigation of seven of these showed that knockouts of the associated genes resulted in attenuated growth for four pairs of reactions, whilst three single reactions were shown to be essential for growth.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Liang He ◽  
Haiyan Xu ◽  
Ginger Y. Ke

Purpose: Despite better accessibility and flexibility, peer-to-peer (P2P) lending has suffered from excessive credit risks, which may cause significant losses to the lenders and even lead to the collapse of P2P platforms. The purpose of this research is to construct a hybrid predictive framework that integrates classification, feature selection, and data balance algorithms to cope with the high-dimensional and imbalanced nature of P2P credit data. Design/methodology/approach: An improved synthetic minority over-sampling technique (IMSMOTE) is developed to incorporate randomness and probability into the traditional synthetic minority over-sampling technique (SMOTE) to enhance the quality of synthetic samples and the controllability of synthetic processes. IMSMOTE is then implemented along with grey relational clustering (GRC) and the support vector machine (SVM) to facilitate a comprehensive assessment of P2P credit risks. To enhance the associativity and functionality of the algorithm, a dynamic selection approach is integrated with GRC and then fed into the SVM's adaptive parameter adjustment process to select the optimal critical value. A quantitative model is constructed to recognize key criteria via multidimensional representativeness. Findings: A series of experiments based on real-world P2P data from Prosper Funding LLC demonstrates that our proposed model outperforms other existing approaches. It is also confirmed that the grey-based GRC approach with dynamic selection succeeds in reducing data dimensions, selecting a critical value, and identifying key criteria, and that IMSMOTE can efficiently handle the imbalanced data. Originality/value: The grey-based machine-learning framework proposed in this work can be practically implemented by P2P platforms in predicting borrowers' credit risks. The dynamic selection approach makes the first attempt in the literature to select a critical value and indicate key criteria in a dynamic, visual and quantitative manner.
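
For readers unfamiliar with the traditional SMOTE that the abstract builds on, the following is a minimal sketch of its interpolation step. It shows plain SMOTE only and does not reproduce the authors' IMSMOTE modifications. The function names, parameters, and toy data are assumptions for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_synthetic, k=3, seed=0):
    """Create synthetic minority samples by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


# Toy minority class: the four corners of the unit square (hypothetical data).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_synthetic=10)
print(len(new_samples))
```

Because every synthetic point lies on a segment between two real minority samples, the new data stay inside the region the minority class already occupies.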


2018 ◽  
Vol 34 ◽  
pp. 52-58 ◽  
Author(s):  
Jingsheng Shi ◽  
Guanglei Zhao ◽  
Yibing Wei

The dynamic balance between acetylation and deacetylation of histones plays a crucial role in the epigenetic regulation of gene expression. It is equilibrated by two families of enzymes: histone acetyltransferases and histone deacetylases (HDACs). HDACs repress transcription by regulating the conformation of the higher-order chromatin structure. HDAC inhibitors have recently become a class of chemical agents for potential treatment of the abnormal chromatin remodeling process involved in certain cancers. In this study, we constructed a large dataset to predict the activity value of HDAC1 inhibitors. Each compound was represented with seven fingerprints, and computational models were subsequently developed to predict HDAC1 inhibitors via five machine learning methods. These methods include naïve Bayes, k-nearest neighbor, C4.5 decision tree, random forest, and support vector machine (SVM) algorithms. The best predictive model was the CDK fingerprint with SVM, which exhibited an accuracy of 0.89. This model also performed best in five-fold cross-validation. Some representative substructure alerts associated with HDAC1 inhibition were identified by using MoSS in KNIME, which could facilitate the identification of HDAC1 inhibitors.
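
The evaluation scheme mentioned above (a classifier on binary fingerprints, scored by five-fold cross-validation) can be sketched as follows. The sketch uses k-nearest neighbor, one of the five methods the abstract names, with Hamming distance as an assumed metric. The six-bit fingerprints, activity labels, and function names are made up for illustration and do not come from the study.

```python
def hamming(a, b):
    """Number of differing bits between two binary fingerprints."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training fingerprints closest to x."""
    ranked = sorted(range(len(train_X)), key=lambda i: hamming(train_X[i], x))[:k]
    votes = [train_y[i] for i in ranked]
    return max(set(votes), key=votes.count)


def k_fold_accuracy(X, y, n_folds=5, k=3):
    """Each sample is predicted once, by a model that never saw its fold."""
    correct = 0
    for fold in range(n_folds):
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            if knn_predict(train_X, train_y, X[i], k) == y[i]:
                correct += 1
    return correct / len(X)


# Hypothetical 6-bit fingerprints; 1 = active, 0 = inactive (toy labels).
X = [
    [1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0], [0, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 0], [0, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 0], [0, 0, 1, 1, 1, 1],
]
y = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
accuracy = k_fold_accuracy(X, y)
print(accuracy)
```

The toy set is deliberately well separated, so the printed accuracy says nothing about real HDAC1 data. The point is only the fold bookkeeping: every compound is scored exactly once by a model trained without it.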


2012 ◽  
Vol 23 (4) ◽  
pp. 617-623 ◽  
Author(s):  
Tae Yong Kim ◽  
Seung Bum Sohn ◽  
Yu Bin Kim ◽  
Won Jun Kim ◽  
Sang Yup Lee

Molecules ◽  
2019 ◽  
Vol 24 (10) ◽  
pp. 1973 ◽  
Author(s):  
Nalini Schaduangrat ◽  
Chanin Nantasenamat ◽  
Virapong Prachayasittikul ◽  
Watshara Shoombuatong

Anticancer peptides (ACPs) have emerged as a new class of therapeutic agent for cancer treatment due to their lower toxicity as well as greater efficacy, selectivity and specificity when compared to conventional small molecule drugs. However, the experimental identification of ACPs remains a time-consuming and expensive endeavor. Therefore, it is desirable to develop and improve upon existing computational models for predicting and characterizing ACPs. In this study, we present a bioinformatics tool called ACPred, which is an interpretable tool for the prediction and characterization of the anticancer activities of peptides. ACPred was developed by utilizing powerful machine learning models (support vector machine and random forest) and various classes of peptide features. A jackknife cross-validation test showed that ACPred can achieve an overall accuracy of 95.61% in identifying ACPs. In addition, analysis revealed the following distinguishing characteristics of ACPs: (i) hydrophobic residues enhance the cationic properties of α-helical ACPs, resulting in better cell penetration; (ii) the amphipathic nature of the α-helical structure plays a crucial role in its mechanism of cytotoxicity; and (iii) the formation of disulfide bridges on β-sheets is vital for structural maintenance, which correlates with the ability to kill cancer cells. Finally, for the convenience of experimental scientists, the ACPred web server was established and made freely available online.
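
The jackknife cross-validation test mentioned above holds out one sample at a time, trains on the rest, and averages the results. The sketch below shows that loop with a nearest-centroid classifier as a stand-in. It is not ACPred's support vector machine or random forest, and the two-dimensional feature vectors, labels, and function names are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the class whose mean feature vector is closest."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def jackknife_accuracy(X, y):
    """Leave-one-out: each sample is predicted by a model trained on all others."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical peptide feature vectors; 1 = ACP, 0 = non-ACP (toy labels).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y = [0, 0, 0, 1, 1, 1]
loo_accuracy = jackknife_accuracy(X, y)
print(loo_accuracy)
```

With n samples the model is refit n times, which is costly for large data sets but makes use of nearly all the data in every training round.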


2020 ◽  
Vol 21 (12) ◽  
pp. 4541 ◽  
Author(s):  
Erica Gianazza ◽  
Maura Brioschi ◽  
Roberta Baetta ◽  
Alice Mallia ◽  
Cristina Banfi ◽  
...  

Platelets are a heterogeneous small anucleate blood cell population with a central role both in physiological haemostasis and in pathological states, spanning from thrombosis to inflammation, and cancer. Recent advances in proteomic studies provided additional important information concerning the platelet biology and the response of platelets to several pathophysiological pathways. Platelets circulate systemically and can be easily isolated from human samples, making proteomic application very interesting for characterizing the complexity of platelet functions in health and disease as well as for identifying and quantifying potential platelet proteins as biomarkers and novel antiplatelet therapeutic targets. To date, the highly dynamic protein content of platelets has been studied in resting and activated platelets, and several subproteomes have been characterized including platelet-derived microparticles, platelet granules, platelet releasates, platelet membrane proteins, and specific platelet post-translational modifications. In this review, a critical overview is provided on principal platelet proteomic studies focused on platelet biology from signaling to granules content, platelet proteome changes in several diseases, and the impact of drugs on platelet functions. Moreover, recent advances in quantitative platelet proteomics are discussed, emphasizing the importance of targeted quantification methods for more precise, robust and accurate quantification of selected proteins, which might be used as biomarkers for disease diagnosis, prognosis and therapy, and their strong clinical impact in the near future.

