scholarly journals Evaluation of multi-target deep neural network models for compound potency prediction under increasingly challenging test conditions

Author(s):  
Raquel Rodríguez-Pérez ◽  
Jürgen Bajorath

AbstractMachine learning (ML) enables modeling of quantitative structure–activity relationships (QSAR) and compound potency predictions. Recently, multi-target QSAR models have been gaining increasing attention. Simultaneous compound potency predictions for multiple targets can be carried out using ensembles of independently derived target-based QSAR models or in a more integrated and advanced manner using multi-target deep neural networks (MT-DNNs). Herein, single-target and multi-target ML models were systematically compared on a large scale in compound potency value predictions for 270 human targets. By design, this large-magnitude evaluation has been a special feature of our study. To these ends, MT-DNN, single-target DNN (ST-DNN), support vector regression (SVR), and random forest regression (RFR) models were implemented. Different test systems were defined to benchmark these ML methods under conditions of varying complexity. Source compounds were divided into training and test sets in a compound- or analog series-based manner taking target information into account. Data partitioning approaches used for model training and evaluation were shown to influence the relative performance of ML methods, especially for the most challenging compound data sets. For example, the performance of MT-DNNs with per-target models yielded superior performance compared to single-target models. For a test compound or its analogs, the availability of potency measurements for multiple targets affected model performance, revealing the influence of ML synergies.

Author(s):  
Robert J. O’Shea ◽  
Amy Rose Sharkey ◽  
Gary J. R. Cook ◽  
Vicky Goh

Abstract Objectives To perform a systematic review of design and reporting of imaging studies applying convolutional neural network models for radiological cancer diagnosis. Methods A comprehensive search of PUBMED, EMBASE, MEDLINE and SCOPUS was performed for published studies applying convolutional neural network models to radiological cancer diagnosis from January 1, 2016, to August 1, 2020. Two independent reviewers measured compliance with the Checklist for Artificial Intelligence in Medical Imaging (CLAIM). Compliance was defined as the proportion of applicable CLAIM items satisfied. Results One hundred eighty-six of 655 screened studies were included. Many studies did not meet the criteria for current design and reporting guidelines. Twenty-seven percent of studies documented eligibility criteria for their data (50/186, 95% CI 21–34%), 31% reported demographics for their study population (58/186, 95% CI 25–39%) and 49% of studies assessed model performance on test data partitions (91/186, 95% CI 42–57%). Median CLAIM compliance was 0.40 (IQR 0.33–0.49). Compliance correlated positively with publication year (ρ = 0.15, p = .04) and journal H-index (ρ = 0.27, p < .001). Clinical journals demonstrated higher mean compliance than technical journals (0.44 vs. 0.37, p < .001). Conclusions Our findings highlight opportunities for improved design and reporting of convolutional neural network research for radiological cancer diagnosis. Key Points • Imaging studies applying convolutional neural networks (CNNs) for cancer diagnosis frequently omit key clinical information including eligibility criteria and population demographics. • Fewer than half of imaging studies assessed model performance on explicitly unobserved test data partitions. • Design and reporting standards have improved in CNN research for radiological cancer diagnosis, though many opportunities remain for further progress.


2021 ◽  
Vol 11 (15) ◽  
pp. 6918
Author(s):  
Chidubem Iddianozie ◽  
Gavin McArdle

The effectiveness of a machine learning model is impacted by the data representation used. Consequently, it is crucial to investigate robust representations for efficient machine learning methods. In this paper, we explore the link between data representations and model performance for inference tasks on spatial networks. We argue that representations which explicitly encode the relations between spatial entities would improve model performance. Specifically, we consider homogeneous and heterogeneous representations of spatial networks. We recognise that the expressive nature of the heterogeneous representation may benefit spatial networks and could improve model performance on certain tasks. Thus, we carry out an empirical study using Graph Neural Network models for two inference tasks on spatial networks. Our results demonstrate that heterogeneous representations improves model performance for down-stream inference tasks on spatial networks.


1997 ◽  
pp. 931-935 ◽  
Author(s):  
Anders Lansner ◽  
Örjan Ekeberg ◽  
Erik Fransén ◽  
Per Hammarlund ◽  
Tomas Wilhelmsson

2020 ◽  
Vol 34 (05) ◽  
pp. 9282-9289
Author(s):  
Qingyang Wu ◽  
Lei Li ◽  
Hao Zhou ◽  
Ying Zeng ◽  
Zhou Yu

Many social media news writers are not professionally trained. Therefore, social media platforms have to hire professional editors to adjust amateur headlines to attract more readers. We propose to automate this headline editing process through neural network models to provide more immediate writing support for these social media news writers. To train such a neural headline editing model, we collected a dataset which contains articles with original headlines and professionally edited headlines. However, it is expensive to collect a large number of professionally edited headlines. To solve this low-resource problem, we design an encoder-decoder model which leverages large scale pre-trained language models. We further improve the pre-trained model's quality by introducing a headline generation task as an intermediate task before the headline editing task. Also, we propose Self Importance-Aware (SIA) loss to address the different levels of editing in the dataset by down-weighting the importance of easily classified tokens and sentences. With the help of Pre-training, Adaptation, and SIA, the model learns to generate headlines in the professional editor's style. Experimental results show that our method significantly improves the quality of headline editing comparing against previous methods.


Author(s):  
Sacha J. van Albada ◽  
Jari Pronold ◽  
Alexander van Meegen ◽  
Markus Diesmann

AbstractWe are entering an age of ‘big’ computational neuroscience, in which neural network models are increasing in size and in numbers of underlying data sets. Consolidating the zoo of models into large-scale models simultaneously consistent with a wide range of data is only possible through the effort of large teams, which can be spread across multiple research institutions. To ensure that computational neuroscientists can build on each other’s work, it is important to make models publicly available as well-documented code. This chapter describes such an open-source model, which relates the connectivity structure of all vision-related cortical areas of the macaque monkey with their resting-state dynamics. We give a brief overview of how to use the executable model specification, which employs NEST as simulation engine, and show its runtime scaling. The solutions found serve as an example for organizing the workflow of future models from the raw experimental data to the visualization of the results, expose the challenges, and give guidance for the construction of an ICT infrastructure for neuroscience.


Author(s):  
Vo Ngoc Phu ◽  
Vo Thi Ngoc Tran

Artificial intelligence (ARTINT) and information have been famous fields for many years. A reason has been that many different areas have been promoted quickly based on the ARTINT and information, and they have created many significant values for many years. These crucial values have certainly been used more and more for many economies of the countries in the world, other sciences, companies, organizations, etc. Many massive corporations, big organizations, etc. have been established rapidly because these economies have been developed in the strongest way. Unsurprisingly, lots of information and large-scale data sets have been created clearly from these corporations, organizations, etc. This has been the major challenges for many commercial applications, studies, etc. to process and store them successfully. To handle this problem, many algorithms have been proposed for processing these big data sets.


2018 ◽  
Vol 7 (3.15) ◽  
pp. 95 ◽  
Author(s):  
M Zabir ◽  
N Fazira ◽  
Zaidah Ibrahim ◽  
Nurbaity Sabri

This paper aims to evaluate the accuracy performance of pre-trained Convolutional Neural Network (CNN) models, namely AlexNet and GoogLeNet accompanied by one custom CNN. AlexNet and GoogLeNet have been proven for their good capabilities as these network models had entered ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and produce relatively good results. The evaluation results in this research are based on the accuracy, loss and time taken of the training and validation processes. The dataset used is Caltech101 by California Institute of Technology (Caltech) that contains 101 object categories. The result reveals that custom CNN architecture produces 91.05% accuracy whereas AlexNet and GoogLeNet achieve similar accuracy which is 99.65%. GoogLeNet consistency arrives at an early training stage and provides minimum error function compared to the other two models. 


2011 ◽  
Vol 403-408 ◽  
pp. 3805-3812 ◽  
Author(s):  
Kong Hui Guo ◽  
Xian Yun Wang

Nonparametric models of hydraulic damper based on support vector regression (SVR) are developed. Then these models are compared with two kinds neural network models. One is backpropagation neural network (BPNN) model; another is radial basis function neural network (RBFNN) model. Comparisons are carried out both on virtual damper and actual damper. The force-velocity relation of a virtual damper is obtained based on a rheological model. Then these data are used to identify the characteristics of the virtual damper. The dynamometer measurements of an actual displacement-dependent damper are obtained by experiment. And these data are used to identify the characteristics of this actual damper. The comparisons show that BPNN model is best at identifying the characteristics of the virtual damper, but SVR model is best at identifying the characteristics of the actual damper. The reason is that all experimental data include noise more or less. When the amplitude of the noise is smaller than the parameter of SVR, the noise can not affect the construction of the resulting model. So when training a model based on the experimental data, SVR is superior to other neural networks methods.


2018 ◽  
Vol 12 (3) ◽  
pp. 891-905 ◽  
Author(s):  
Andrew M. Snauffer ◽  
William W. Hsieh ◽  
Alex J. Cannon ◽  
Markus A. Schnorbus

Abstract. Estimates of surface snow water equivalent (SWE) in mixed alpine environments with seasonal melts are particularly difficult in areas of high vegetation density, topographic relief, and snow accumulations. These three confounding factors dominate much of the province of British Columbia (BC), Canada. An artificial neural network (ANN) was created using as predictors six gridded SWE products previously evaluated for BC. Relevant spatiotemporal covariates were also included as predictors, and observations from manual snow surveys at stations located throughout BC were used as target data. Mean absolute errors (MAEs) and interannual correlations for April surveys were found using cross-validation. The ANN using the three best-performing SWE products (ANN3) had the lowest mean station MAE across the province. ANN3 outperformed each product as well as product means and multiple linear regression (MLR) models in all of BC's five physiographic regions except for the BC Plains. Subsequent comparisons with predictions generated by the Variable Infiltration Capacity (VIC) hydrologic model found ANN3 to better estimate SWE over the VIC domain and within most regions. The superior performance of ANN3 over the individual products, product means, MLR, and VIC was found to be statistically significant across the province.


Sign in / Sign up

Export Citation Format

Share Document