Benchmarks for interpretation of QSAR models

10.21203/rs.3.rs-271027/v1 ◽

2021 ◽

Author(s):

Mariia Matveieva ◽

Pavel Polishchuk

Keyword(s):

Neural Networks ◽

Complex Nature ◽

Data Sets ◽

Knowledge Based ◽

Box Models ◽

Quantitative Metrics ◽

QSAR Models ◽

Pharmacophore Hypotheses ◽

New Interpretation ◽

Conventional Models

Abstract: Interpretation of QSAR models is useful for understanding the complex nature of biological or physicochemical processes, guiding structural optimization, and performing knowledge-based validation of the models. Highly predictive models are usually complex, and their interpretation is non-trivial. This is particularly true for modern neural networks. Various approaches to interpreting these models exist, but it is difficult to evaluate and compare the performance and applicability of these ever-emerging methods. Herein, we developed several benchmark data sets with end-points determined by pre-defined patterns. These data sets are intended for evaluating the ability of interpretation approaches to retrieve these patterns. They represent tasks of different complexity levels, from simple atom-based additive properties to pharmacophore hypotheses. We proposed several quantitative metrics of interpretation performance. The applicability of the benchmarks and metrics was demonstrated on a set of conventional models and end-to-end graph convolutional neural networks interpreted by the previously suggested universal ML-agnostic approach for structural interpretation. We anticipate that these benchmarks will be useful for evaluating new interpretation approaches and investigating the decision making of complex “black box” models.

Benchmarks for interpretation of QSAR models

Journal of Cheminformatics ◽

10.1186/s13321-021-00519-x ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Mariia Matveieva ◽

Pavel Polishchuk

Keyword(s):

Neural Networks ◽

Complex Nature ◽

Data Sets ◽

Knowledge Based ◽

Box Models ◽

Quantitative Metrics ◽

QSAR Models ◽

Pharmacophore Hypothesis ◽

New Interpretation ◽

Conventional Models

Abstract: Interpretation of QSAR models is useful for understanding the complex nature of biological or physicochemical processes, guiding structural optimization, and performing knowledge-based validation of the models. Highly predictive models are usually complex, and their interpretation is non-trivial. This is particularly true for modern neural networks. Various approaches to interpreting these models exist, but it is difficult to evaluate and compare the performance and applicability of these ever-emerging methods. Herein, we developed several benchmark data sets with end-points determined by pre-defined patterns. These data sets are intended for evaluating the ability of interpretation approaches to retrieve these patterns. They represent tasks of different complexity levels, from simple atom-based additive properties to a pharmacophore hypothesis. We proposed several quantitative metrics of interpretation performance. The applicability of the benchmarks and metrics was demonstrated on a set of conventional models and end-to-end graph convolutional neural networks, interpreted by the previously suggested universal ML-agnostic approach for structural interpretation. We anticipate that these benchmarks will be useful for evaluating new interpretation approaches and investigating the decision making of complex “black box” models.
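As a rough illustration of what a benchmark with a pre-defined pattern might look like, the sketch below builds a toy end-point equal to the number of nitrogen atoms in a molecule, so the "true" contribution of every atom is known in advance. It then scores an interpretation method by how many truly contributing atoms appear among its top-ranked atoms. The nitrogen-count end-point, the simplified SMILES tokenizer, and the names `atom_tokens`, `true_contributions`, `endpoint`, and `top_n_score` are all assumptions made for this example; they are not taken from the paper and are not the authors' implementation.

```python
import re

def atom_tokens(smiles):
    """Return heavy-atom symbols from a simple SMILES string.

    Assumption: only unbracketed organic-subset atoms occur (no [NH+], etc.).
    """
    return re.findall(r"Cl|Br|[BCNOPSFI]|[bcnops]", smiles)

def true_contributions(smiles):
    """Ground-truth atom contributions: 1 for each nitrogen, 0 otherwise."""
    return [1.0 if atom in ("N", "n") else 0.0 for atom in atom_tokens(smiles)]

def endpoint(smiles):
    """Toy additive end-point: the sum of the atom contributions (nitrogen count)."""
    return sum(true_contributions(smiles))

def top_n_score(true_contrib, estimated_contrib):
    """Fraction of truly contributing atoms found among the top-n ranked atoms.

    n is the number of atoms with a positive ground-truth contribution.
    Returns None when the molecule has no contributing atom.
    """
    n = sum(1 for c in true_contrib if c > 0)
    if n == 0:
        return None
    ranked = sorted(range(len(estimated_contrib)),
                    key=lambda i: estimated_contrib[i], reverse=True)[:n]
    hits = sum(1 for i in ranked if true_contrib[i] > 0)
    return hits / n

smiles = "CCN(C)C(=O)c1ccncc1"
truth = true_contributions(smiles)

# A hypothetical interpretation output that ranks one nitrogen and one oxygen highest.
estimate = [0.0] * len(truth)
estimate[2] = 0.9   # amide nitrogen (correct)
estimate[5] = 0.8   # carbonyl oxygen (incorrect)

print(endpoint(smiles), top_n_score(truth, estimate))
```

Because the ground truth is fixed by construction, any interpretation method that outputs per-atom contributions can be scored the same way, regardless of the underlying model.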

Elaboration of Novel TTK1 Inhibitory Leads via QSAR-Guided Selection of Crystallographic Pharmacophores Followed By In vitro Assay

Current Computer-Aided Drug Design ◽

10.2174/1573409916666200611122736 ◽

2020 ◽

Vol 16 ◽

Author(s):

Mahmoud A. Al-Sha'er ◽

Mutasem O. Taha

Keyword(s):

Standard Error ◽

QSAR Model ◽

QSAR Modeling ◽

Vitro Assay ◽

Physicochemical Descriptors ◽

QSAR Models ◽

IC50 Values ◽

Pharmacophore Hypotheses ◽

Selection Of

Introduction: Tyrosine threonine kinase (TTK1) is a key regulator of chromosome segregation. TTK targeting has recently received attention as a possible route to improved anticancer therapies. Objective: We employed our established method of QSAR-guided selection of the best crystallographic pharmacophore(s) to identify important binding interactions that anchor inhibitors in the TTK1 binding site. Method: Sixty-one TTK1 crystallographic complexes were used to extract 315 pharmacophore hypotheses. QSAR modeling was subsequently used to choose a single crystallographic pharmacophore that, when combined with other physicochemical descriptors, explains bioactivity variation within a list of 55 diverse inhibitors. Results: The best QSAR model was robust and predictive (r²(55) = 0.75; r²LOO = 0.72; r²PRESS against an external test list of 12 compounds = 0.67; standard error of estimate for the training set, S = 0.63; standard error of estimate for the test set, Stest = 0.62). The resulting pharmacophore and QSAR models were used to screen the National Cancer Institute (NCI) database for new TTK1 inhibitors. Conclusion: Five hits showed significant TTK1 inhibitory profiles, with IC50 values ranging between 11.7 and 76.6 µM.
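For readers unfamiliar with the statistics quoted in the Results, the following minimal sketch shows how a fitted r² and a leave-one-out r² (often written q²) can be computed. It assumes a single-descriptor ordinary least-squares model and made-up data purely for illustration; the study's actual model combines a pharmacophore with several descriptors, and the helper names `fit_line`, `r2_fit`, and `r2_loo` are hypothetical. One common convention is used here, in which the total sum of squares is taken around the mean of all observations.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def total_sum_of_squares(y):
    mean_y = sum(y) / len(y)
    return sum((yi - mean_y) ** 2 for yi in y)

def r2_fit(x, y):
    """Coefficient of determination of the model fitted on all data."""
    slope, intercept = fit_line(x, y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - ss_res / total_sum_of_squares(y)

def r2_loo(x, y):
    """Leave-one-out r²: each point is predicted by a model fitted without it."""
    press = 0.0  # predictive residual sum of squares
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    return 1.0 - press / total_sum_of_squares(y)

# Made-up descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.1, 1.9, 3.2, 3.9, 5.1]

print(r2_fit(descriptor, activity), r2_loo(descriptor, activity))
```

With this convention the leave-one-out value cannot exceed the fitted r², which is why a small gap between the two (such as 0.75 versus 0.72) is usually read as a sign that the model is not overfitted.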

Forecasting container freight rates for major trade routes: a comparison of artificial neural networks and conventional models

Maritime Economics & Logistics ◽

10.1057/s41278-020-00156-5 ◽

2020 ◽

Cited By ~ 1

Author(s):

Ziaul Haque Munim ◽

Hans-Joachim Schramm

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Trade Routes ◽

Freight Rates ◽

Artificial Neural ◽

Conventional Models ◽

Container Freight

Generation of geometric interpolations of building types with deep variational autoencoders

Design Science ◽

10.1017/dsj.2020.31 ◽

2020 ◽

Vol 6 ◽

Author(s):

Jaime de Miguel Rodríguez ◽

Maria Eugenia Villafañe ◽

Luka Piškorec ◽

Fernando Sancho Caparrini

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Large Data ◽

Learning Model ◽

Large Data Sets ◽

Data Sets ◽

Connectivity Map ◽

Data Set ◽

3D Objects ◽

Machine Learning Model

Abstract: This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a ‘connectivity map’ that is especially suited to express the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features on a given building type. In the experiments that are described in this paper, more than 150k input samples belonging to two building types have been processed during the training of a VAE model. The main contribution of this paper has been to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.

AEWS: an integrated knowledge-based system with neural networks for reliability prediction

Computers in Industry ◽

10.1016/s0166-3615(97)00076-6 ◽

1998 ◽

Vol 35 (2) ◽

pp. 101-108 ◽

Cited By ~ 5

Author(s):

Young B Moon ◽

C. Kenneth Divers ◽

Hyune-Ju Kim

Keyword(s):

Neural Networks ◽

Reliability Prediction ◽

Knowledge Based System ◽

Knowledge Based ◽

Integrated Knowledge

Convolutional Neural Networks in Computer-Aided Diagnosis of Colorectal Polyps and Cancer: A Review

10.20944/preprints202110.0135.v1 ◽

2021 ◽

Author(s):

Kamyab Keshtkar

Keyword(s):

Colorectal Cancer ◽

Neural Networks ◽

Deep Learning ◽

Computer Aided Diagnosis ◽

Colorectal Polyps ◽

Support Vector ◽

Training Process ◽

Computer Aided ◽

Conventional Models ◽

Aided Diagnosis

As a relatively high percentage of adenoma polyps are missed, a computer-aided diagnosis (CAD) tool based on deep learning can aid the endoscopist in diagnosing colorectal polyps or colorectal cancer, decreasing the polyp miss rate and helping to prevent colorectal cancer mortality. The convolutional neural network (CNN) is a deep learning method that, over the last decade, has achieved better results in detecting and segmenting specific objects in images than conventional models such as regression, support vector machines or artificial neural networks. In recent years, according to studies in medical imaging, CNN models have achieved promising results in detecting masses and lesions in various body organs, including colorectal polyps. In this review, the structure and architecture of CNN models and how colonoscopy images are processed as input and converted to the output are explained in detail. In most primary studies conducted in the colorectal polyp detection and classification field, the CNN model has been regarded as a black box, since the calculations performed at different layers during model training have not been clarified precisely. Furthermore, I discuss the differences between the CNN and conventional models, inspect how to train the CNN model for diagnosing colorectal polyps or cancer, and evaluate model performance after the training process.

Empirical modeling of very large data sets using neural networks

Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium ◽

10.1109/ijcnn.2000.859413 ◽

2000 ◽

Cited By ~ 5

Author(s):

A.J. Owens

Keyword(s):

Neural Networks ◽

Large Data ◽

Empirical Modeling ◽

Large Data Sets ◽

Data Sets

New modeling of reconfigurable microstrip antenna using hybrid structure of simulation driven and knowledge based artificial neural networks

Pamukkale University Journal of Engineering Sciences ◽

10.5505/pajes.2020.67809 ◽

2020 ◽

Vol 26 (5) ◽

pp. 935-943

Author(s):

Ashrf Aoad ◽

Zafer Aydin

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Microstrip Antenna ◽

Hybrid Structure ◽

Knowledge Based ◽

Artificial Neural

Limits and Convergence properties of the Sequentially Markovian Coalescent

10.1101/2020.07.23.217091 ◽

2020 ◽

Author(s):

Thibaut Sellinger ◽

Diala Abu Awad ◽

Aurélien Tellier

Keyword(s):

Sequence Data ◽

Demographic History ◽

Simulated Data ◽

Simultaneous Estimation ◽

Data Sets ◽

Performance Limits ◽

Biological Variables ◽

Convergence Proofs ◽

New Interpretation ◽

Population Demographic

Abstract: Many methods based on the Sequentially Markovian Coalescent (SMC) have been and are being developed. These methods make use of genome sequence data to uncover population demographic history. More recently, new methods have extended the original theoretical framework, allowing the simultaneous estimation of the demographic history and other biological variables. These methods can be applied to many different species, under different model assumptions, in hopes of unlocking the population/species evolutionary history. Although convergence proofs in particular cases have been given using simulated data, a clear outline of the performance limits of these methods is lacking. Here we explore the limits of this methodology and present a tool that helps users quantify what information can be confidently retrieved from given data sets. In addition, we study the consequences for inference accuracy of violating the hypotheses and assumptions of SMC approaches, such as the presence of transposable elements, variable recombination and mutation rates along the sequence, and SNP call errors. We also provide a new interpretation of the SMC through the use of the estimated transition matrix and offer recommendations for the most efficient use of these methods under budget constraints, notably through the building of data sets that would be better adapted to the biological question at hand.

Renosterveld Conservation in South Africa: A Case Study for Handling Uncertainty in Knowledge-Based Neural Networks for Environmental Management

Journal of Environmental Informatics ◽

10.3808/jei.200900140 ◽

2009 ◽

Vol 13 (1) ◽

pp. 56-65 ◽

Cited By ~ 2

Author(s):

R. Chandra

Keyword(s):

South Africa ◽

Neural Networks ◽

Environmental Management ◽

Knowledge Based
