Elicitability and identifiability of set-valued measures of systemic risk

Abstract Identification and scoring functions are statistical tools to assess the calibration of risk measure estimates and to compare their performance with other estimates, e.g. in backtesting. A risk measure is called identifiable (elicitable) if it admits a strict identification function (strictly consistent scoring function). We consider measures of systemic risk introduced in Feinstein et al. (SIAM J. Financial Math. 8:672–708, 2017). Since these are set-valued, we work within the theoretical framework of Fissler et al. (preprint, available online at arXiv:1910.07912v2, 2020) for forecast evaluation of set-valued functionals. We construct oriented selective identification functions, which induce a mixture representation of (strictly) consistent scoring functions. Their applicability is demonstrated with a comprehensive simulation study.

On the elicitability of range value at risk

Statistics & Risk Modeling ◽

10.1515/strm-2020-0037 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Tobias Fissler ◽

Johanna F. Ziegel

Keyword(s):

At Risk ◽

Least Squares ◽

Simulation Study ◽

Diagnostic Tool ◽

Value At Risk ◽

Risk Measure ◽

Predictive Performance ◽

Scoring Functions ◽

Classical Approach ◽

Least Squares Regression

Abstract The debate of which quantitative risk measure to choose in practice has mainly focused on the dichotomy between value at risk (VaR) and expected shortfall (ES). Range value at risk (RVaR) is a natural interpolation between VaR and ES, constituting a tradeoff between the sensitivity of ES and the robustness of VaR, turning it into a practically relevant risk measure on its own. Hence, there is a need to statistically assess, compare and rank the predictive performance of different RVaR models, tasks subsumed under the term “comparative backtesting” in finance. This is best done in terms of strictly consistent loss or scoring functions, i.e., functions which are minimized in expectation by the correct risk measure forecast. Much like ES, RVaR does not admit strictly consistent scoring functions, i.e., it is not elicitable. Mitigating this negative result, we show that a triplet of RVaR with two VaR-components is elicitable. We characterize all strictly consistent scoring functions for this triplet. Additional properties of these scoring functions are examined, including the diagnostic tool of Murphy diagrams. The results are illustrated with a simulation study, and we put our approach in perspective with respect to the classical approach of trimmed least squares regression.
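
As a small illustration of what "strictly consistent" means for the VaR component, the sketch below uses the standard quantile (pinball) score, which is minimized in expectation by the true quantile. It is not the scoring function for the RVaR triplet characterized in the paper; the sample size, the grid of candidate forecasts and all function names are assumptions made for this example.

```python
import random


def pinball_loss(forecast, outcome, alpha):
    """Quantile (pinball) score for a single forecast/outcome pair."""
    indicator = 1.0 if outcome <= forecast else 0.0
    return (indicator - alpha) * (forecast - outcome)


def mean_score(forecast, outcomes, alpha):
    """Average score of one fixed forecast over all observed outcomes."""
    return sum(pinball_loss(forecast, y, alpha) for y in outcomes) / len(outcomes)


random.seed(0)
alpha = 0.95
# Hypothetical data: standard normal outcomes (illustration only).
outcomes = [random.gauss(0.0, 1.0) for _ in range(2000)]

# Candidate forecasts on a grid from 0.00 to 3.00 in steps of 0.02.
candidates = [i / 50 for i in range(151)]
best = min(candidates, key=lambda x: mean_score(x, outcomes, alpha))

# Empirical alpha-quantile of the sample, for comparison.
empirical_quantile = sorted(outcomes)[int(alpha * len(outcomes)) - 1]

print(f"score-minimizing forecast: {best:.2f}")
print(f"empirical {alpha:.0%} quantile: {empirical_quantile:.3f}")
```

The forecast with the lowest average score lands next to the empirical quantile, which is the behaviour that makes such scores usable for ranking competing forecasts in comparative backtesting.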

Random Forest Refinement of Pairwise Potentials for Protein-ligand Decoy Detection

10.26434/chemrxiv.8047820.v1 ◽

2019 ◽

Cited By ~ 1

Author(s):

Jun Pei ◽

Zheng Zheng ◽

Hyunji Kim ◽

Lin Song ◽

Sarah Walworth ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Probability Function ◽

Pair Potential ◽

Scoring Function ◽

Stable Structure ◽

Scoring Functions ◽

Atom Pair ◽

Data Set ◽

Atom Pairs

An accurate scoring function is expected to correctly select the most stable structure from a set of pose candidates. One can hypothesize that a scoring function’s ability to identify the most stable structure might be improved by emphasizing the most relevant atom pairwise interactions. However, it is hard to evaluate the relative importance of each atom pair using traditional means. With the introduction of machine learning methods, it has become possible to determine the relative importance of each atom pair present in a scoring function. In this work, we use the Random Forest (RF) method to refine a pair potential developed by our laboratory (GARF) by identifying relevant atom pairs that optimize the performance of the potential on our given task. Our goal is to construct a machine learning (ML) model that can accurately differentiate the native ligand binding pose from candidate poses using a potential refined by RF optimization. We successfully constructed RF models on an unbalanced data set with the ‘comparison’ concept, and the resultant RF models were tested on CASF-2013. In a comparison of the performance of our RF models against 29 scoring functions, we found our models outperformed the other scoring functions in predicting the native pose. In addition, we used two artificially designed potential models to address the importance of the GARF potential in the RF models: (1) a scrambled probability function set, which was obtained by mixing up atom pairs and probability functions in GARF, and (2) a uniform probability function set, which shares the same peak positions with GARF but has fixed peak heights. The accuracy comparison of RF models based on the scrambled, uniform, and original GARF potentials clearly showed that the peak positions in the GARF potential are important while the well depths are not.

ASFP (Artificial Intelligence based Scoring Function Platform): a web server for the development of customized scoring functions

Journal of Cheminformatics ◽

10.1186/s13321-021-00486-3 ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Xujun Zhang ◽

Chao Shen ◽

Xueying Guo ◽

Zhe Wang ◽

Gaoqi Weng ◽

...

Keyword(s):

High Efficiency ◽

Low Cost ◽

Pearson Correlation ◽

Scoring Function ◽

Web Server ◽

Scoring Functions ◽

Protein Ligand Interactions ◽

Prediction Module ◽

Ligand Interactions ◽

Benchmark Datasets

Abstract Virtual screening (VS) based on molecular docking has emerged as one of the mainstream technologies of drug discovery due to its low cost and high efficiency. However, the scoring functions (SFs) implemented in most docking programs are not always accurate enough, and improving their prediction accuracy remains a major challenge. Here, we propose an integrated platform called ASFP, a web server for the development of customized SFs for structure-based VS. There are three main modules in ASFP: (1) the descriptor generation module, which can generate up to 3437 descriptors for the modelling of protein–ligand interactions; (2) the AI-based SF construction module, which can establish target-specific SFs based on the pre-generated descriptors through three machine learning (ML) techniques; (3) the online prediction module, which provides some well-constructed target-specific SFs for VS and an additional generic SF for binding affinity prediction. Our methodology has been validated on several benchmark datasets. The target-specific SFs achieve an average ROC AUC of 0.973 across 32 targets, and the generic SF achieves a Pearson correlation coefficient of 0.81 on the PDBbind version 2016 core set. To sum up, the ASFP server is a powerful tool for structure-based VS.

Incorporating structural similarity into a scoring function to enhance the prediction of binding affinities

Journal of Cheminformatics ◽

10.1186/s13321-021-00493-4 ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Beihong Ji ◽

Xibing He ◽

Yuzhao Zhang ◽

Jingchen Zhai ◽

Viet Hoang Man ◽

...

Keyword(s):

Computational Cost ◽

Scoring Function ◽

Structural Similarity ◽

Scoring Functions ◽

Binding Affinities ◽

Autodock Vina ◽

Predictive Index ◽

Drug Lead ◽

Screening Performance ◽

Calibration Algorithm

Abstract In this study, we developed a novel algorithm to improve the screening performance of an arbitrary docking scoring function by recalibrating the docking score of a query compound based on its structural similarity to a set of training compounds, at negligible extra computational cost. Two popular docking methods, Glide and AutoDock Vina, were adopted as the original scoring functions to be processed with our new algorithm, and similar improvements in performance were achieved. Predicted binding affinities were compared against experimental data from the ChEMBL and DUD-E databases. Eleven representative drug receptors from diverse drug target categories were used to evaluate the hybrid scoring function. The effects of four different fingerprints (FP2, FP3, FP4, and MACCS) and four different compound similarity effect (CSE) functions were explored. Encouragingly, the screening performance was significantly improved for all 11 drug targets, especially when CSE = S^4 (S is the Tanimoto structural similarity) and the FP2 fingerprint were applied. The average predictive index (PI) values increased from 0.34 to 0.66 and from 0.39 to 0.71 for the Glide and AutoDock Vina scoring functions, respectively. To evaluate the performance of the calibration algorithm in drug lead identification, we also imposed an upper limit on the structural similarity to mimic the real scenario of screening diverse libraries, in which query ligands are general-purpose screening compounds that are not necessarily structurally similar to reference ligands. Encouragingly, we found our hybrid scoring function still outperformed the original docking scoring function. The hybrid scoring function was further evaluated using external datasets for two systems, and we found the PI values increased from 0.24 to 0.46 and from 0.14 to 0.42 for the A2AR and CFX systems, respectively. In conclusion, our calibration algorithm can significantly improve virtual screening performance in both the drug lead optimization and identification phases at negligible computational cost.
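
The Tanimoto similarity S mentioned in the abstract can be sketched as follows for fingerprints stored as sets of on-bit indices. The abstract does not state the recalibration formula, so this example only computes S and, assuming the reported CSE function is the fourth power of S, shows how strongly that power down-weights dissimilar compounds. The fingerprints and function name are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical on-bit indices for a query and a training (reference) compound.
query = {1, 4, 7, 9}
reference = {1, 4, 8, 9, 12}

s = tanimoto(query, reference)  # 3 shared bits out of 6 distinct bits
weight = s ** 4                 # assumed reading of the abstract's CSE function

print(f"S = {s:.3f}, S^4 = {weight:.4f}")
```

With S = 0.5 the fourth power is already 0.0625, so under this reading only training compounds that closely resemble the query would contribute noticeably to a recalibrated score.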

Learning from Docked Ligands: Ligand-Based Features Rescue Structure-Based Scoring Functions When Trained On Docked Poses

10.26434/chemrxiv.13637756 ◽

2021 ◽

Author(s):

Fergus Boyles ◽

Charlotte M Deane ◽

Garrett Morris

Keyword(s):

Machine Learning ◽

Ligand Binding ◽

Crystal Structures ◽

Binding Affinity ◽

Scoring Function ◽

Scoring Functions ◽

Data Set ◽

Core Sets ◽

Strong Performance

Machine learning scoring functions for protein-ligand binding affinity have been found to consistently outperform classical scoring functions when trained and tested on crystal structures of bound protein-ligand complexes. However, it is less clear how these methods perform when applied to docked poses of complexes. We explore how the use of docked, rather than crystallographic, poses for both training and testing affects the performance of machine learning scoring functions. Using the PDBbind Core Sets as benchmarks, we show that the performance of a structure-based machine learning scoring function trained and tested on docked poses is lower than that of the same scoring function trained and tested on crystallographic poses. We construct a hybrid scoring function by combining both structure-based and ligand-based features, and show that its ability to predict binding affinity using docked poses is comparable to that of purely structure-based scoring functions trained and tested on crystal poses. Despite strong performance on docked poses of the PDBbind Core Sets, we find that our hybrid scoring function fails to generalise to a new data set, demonstrating the need for improved scoring functions and additional validation benchmarks. Code and data to reproduce our results are available from https://github.com/oxpig/learning-from-docked-poses.

Selecting Machine-Learning Scoring Functions for Structure-Based Virtual Screening

10.26434/chemrxiv.12967160 ◽

2020 ◽

Author(s):

Pedro Ballester

Keyword(s):

Machine Learning ◽

Drug Discovery ◽

Virtual Screening ◽

Predictive Accuracy ◽

Scoring Function ◽

3D Models ◽

Large Datasets ◽

Scoring Functions ◽

Discovery Process ◽

Drug Discovery Process

Interest in docking technologies has grown in parallel with the ever-increasing number and diversity of 3D models for macromolecular therapeutic targets. Structure-Based Virtual Screening (SBVS) aims at leveraging these experimental structures to discover the necessary starting points for the drug discovery process. It is now established that Machine Learning (ML) can strongly enhance the predictive accuracy of scoring functions for SBVS by exploiting large datasets from targets, molecules and their associations. However, with greater choice, the question of which ML-based scoring function is the most suitable for prospective use on a given target has gained importance. Here we analyse two approaches to selecting an existing scoring function for the target, along with a third approach that consists of generating a scoring function tailored to the target. These analyses required discussing the limitations of popular SBVS benchmarks, the alternatives for benchmarking scoring functions for SBVS, and how to generate or use them with freely available software.

The Time-Spatial Dimension of Eurozone Banking Systemic Risk

Risks ◽

10.3390/risks7030075 ◽

2019 ◽

Vol 7 (3) ◽

pp. 75 ◽

Cited By ~ 2

Author(s):

Matteo Foglia ◽

Eliana Angelini

Keyword(s):

Spatial Econometrics ◽

Systemic Risk ◽

Spatial Dependence ◽

Financial Stability ◽

Risk Measure ◽

Spatial Dimension ◽

Time Dimension ◽

Cross Sectional ◽

Time Space ◽

Contagion Risk

In this paper, we measure systemic risk with a novel methodology based on a “spatial-temporal” approach. We propose a new bank systemic risk measure that considers the two components of systemic risk: the cross-sectional dimension and the time dimension. The aim is to highlight the “time-space dynamics” of contagion, i.e., whether the CDS spread of bank i depends on the CDS spreads of other banks. To do this, we use an advanced spatial econometrics design with a time-varying spatial dependence that can be interpreted as an index of the degree of cross-sectional spillovers. The findings highlight that Eurozone banks show strong spatial dependence in the evolution of CDS spreads; that is, the contagion effect is present and persistent. Moreover, we analyse the role of the European Central Bank in managing contagion risk. We find that monetary policy has been effective in reducing systemic risk. However, the results show that systemic risk does not imply a policy intervention, highlighting how financial stability policy is not yet an objective.

A copula-based systemic risk measure: application to investment-grade and high-yield CDS portfolios

Applied Economics Letters ◽

10.1080/13504851.2019.1676867 ◽

2019 ◽

Vol 27 (15) ◽

pp. 1264-1271

Author(s):

So Eun Choi ◽

Hyun Jin Jang ◽

Geon Ho Choe

Keyword(s):

Systemic Risk ◽

Risk Measure ◽

High Yield ◽

Investment Grade

Assessing Molecular Docking Tools to Guide Targeted Drug Discovery of CD38 Inhibitors

International Journal of Molecular Sciences ◽

10.3390/ijms21155183 ◽

2020 ◽

Vol 21 (15) ◽

pp. 5183 ◽

Cited By ~ 1

Author(s):

Eric D. Boittier ◽

Yat Yin Tang ◽

McKenna E. Buckley ◽

Zachariah P. Schuurs ◽

Derek J. Richard ◽

...

Keyword(s):

Scoring Function ◽

Pose Prediction ◽

Scoring Functions ◽

Molecular Fingerprints ◽

Biologically Relevant ◽

Protein Ligand Interactions ◽

Molecular Features ◽

Ligand Interactions ◽

Model Protein ◽

Scoring Accuracy

A promising protein target for computational drug development, the human cluster of differentiation 38 (CD38), plays a crucial role in many physiological and pathological processes, primarily through the upstream regulation of factors that control cytoplasmic Ca²⁺ concentrations. Recently, a small-molecule inhibitor of CD38 was shown to slow down pathways relating to aging and DNA damage. We examined the performance of seven docking programs for their ability to model protein-ligand interactions with CD38. A test set of twelve CD38 crystal structures, containing crystallized biologically relevant substrates, was used to assess pose prediction. The ranking of the programs by median RMSD between the native and predicted poses was Vina, AD4 > PLANTS, Gold, Glide, Molegro > rDock. Forty-two compounds with known affinities were docked to assess the accuracy of the programs at affinity/ranking predictions. The rankings based on scoring power were: Vina, PLANTS > Glide, Gold > Molegro >> AutoDock 4 >> rDock. Out of the top four performing programs, Glide had the only scoring function that did not appear to show bias towards overpredicting the affinity of the ligand based on its size. Factors that affect the reliability of pose prediction and scoring are discussed. General limitations and known biases of scoring functions are examined, aided in part by using molecular fingerprints and Random Forest classifiers. This machine learning approach may be used to systematically diagnose molecular features that are correlated with poor scoring accuracy.
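
The pose-prediction metric in the abstract, the median RMSD between native and predicted poses, can be sketched as below. This is a minimal illustration with made-up coordinates and hypothetical names; it assumes both poses list the same atoms in the same order in a shared reference frame, and it omits the symmetry correction that a real docking assessment may apply.

```python
import math
from statistics import median


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom "ligand" and three predicted poses for it.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # identical to the native pose
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],  # shifted by 2 along z
    [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0)],  # shifted by (3, 4, 0)
]

per_pose = [rmsd(native, pose) for pose in predicted]
print(per_pose, median(per_pose))
```

The median is used rather than the mean because a single badly misplaced pose would otherwise dominate the summary for a program.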

MTD-PLS and docking study for a series of substituted 2-phenylindole derivatives with oestrogenic activity

Chemical Papers ◽

10.2478/s11696-011-0040-3 ◽

2011 ◽

Vol 65 (4) ◽

Author(s):

Edward Seclaman ◽

Alina Bora ◽

Sorin Avram ◽

Zeno Simon ◽

Ludovic Kurunczi

Keyword(s):

Oestrogen Receptor ◽

Scoring Function ◽

Docking Study ◽

Scoring Functions ◽

X Ray Diffraction ◽

X Ray ◽

X Ray Crystallography ◽

Receptor Complexes ◽

Test Sets ◽

Latent Structures

Abstract A series of 36 substituted 2-phenylindoles was analysed using minimal topological difference-projections in latent structures variant (MTD-PLS) and molecular docking, using fast rigid exhaustive docking (FRED) and AutoDock Vina programs. For quantitative structure activity relationships (QSAR) validation, a sphere exclusion algorithm in the multi-dimensional descriptor space was used to construct training and test sets. Docking procedures were based on X-ray crystallography studies using the human alpha oestrogen receptor-17β-oestradiol complex. The ranking abilities of the different scoring functions of the FRED package were presented, and the most suitable scoring function (Chemgauss3) for the oestrogen receptor was chosen. Although the series studied contains only a limited number of compounds, the MTD-PLS method and the docking procedure provided coherent results in concordance with the X-ray diffraction data for different ligand-oestrogen receptor complexes.
