Diverse, High-Quality Test Set for the Validation of Protein−Ligand Docking Performance

2007 ◽  
Vol 50 (4) ◽  
pp. 726-741 ◽  
Author(s):  
Michael J. Hartshorn ◽  
Marcel L. Verdonk ◽  
Gianni Chessari ◽  
Suzanne C. Brewerton ◽  
Wijnand T. M. Mooij ◽  
...  
2012 ◽  
Vol E95.D (12) ◽  
pp. 3001-3009 ◽  
Author(s):  
Michiko INOUE ◽  
Akira TAKETANI ◽  
Tomokazu YONEDA ◽  
Hideo FUJIWARA

10.29007/zbb8 ◽  
2018 ◽  
Author(s):  
Emanuele Di Rosa ◽  
Enrico Giunchiglia ◽  
Massimo Narizzano ◽  
Gabriele Palma ◽  
Alessandra Puddu

Software testing is the most widely used technique for software verification in industry. For safety-critical software, the test set may be required to cover a high percentage (up to 100%) of the software code according to some metric. Unfortunately, attaining such high percentages is not easy with standard automatic test-generation tools, and manual generation by domain experts is often necessary, significantly increasing the associated costs.

In previous papers, we showed how the test-generation process for C programs can be automated via the bounded model checker CBMC. In particular, we showed how CBMC can be used productively to automatically generate test sets covering 100% of the branches of 5 modules of ERTMS/ETCS, a safety-critical industrial software by Ansaldo STS. Unfortunately, the test sets we generated automatically are of lower "quality" than those generated manually by domain experts: both attain the desired 100% branch coverage, but the automatically generated test sets are roughly twice the size of the corresponding manually generated ones. Indeed, the automatically generated test sets contain redundant tests, i.e., tests that do not contribute to reaching the desired 100% branch coverage. These redundant tests are useless from the perspective of branch coverage, are not easy to detect and eliminate a posteriori, and, if kept, imply additional costs during the verification process.

In this paper we present a new methodology for the automatic generation of "high quality" test sets guaranteeing full branch coverage. Starting from an initially empty test set T, the basic idea is to extend T with a test covering as many as possible of the branches not yet covered by T. This requires an analysis of the control-flow graph of the program, first to identify a path p with the desired property, and then to run a tool (CBMC in our case) that either returns a test causing the execution of p or establishes that no such test exists (under the given assumptions). We have experimented with the methodology on 31 modules of the Ansaldo STS ERTMS/ETCS software, thus greatly extending the benchmark set. For 27 of the 31 modules we succeeded in automatically generating "high quality" test sets attaining full branch coverage: all feasible branches are executed by at least one test, and our test sets are significantly smaller than those manually generated by domain experts (and thus also significantly smaller than the test sets automatically generated with our previous methodology). For the remaining 4 modules, however, we were unable to automatically generate test sets attaining full branch coverage: these modules contain complex functions that fall outside CBMC's capacity.

Our analysis of 31 modules greatly extends our previous analysis based on 5 modules, confirming that automatic test-generation tools based on CBMC can be used productively in industry to attain full branch coverage. Further, the methodology presented in this paper increases productivity by substantially reducing the number of generated tests and thus the costs of the testing phase.
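A minimal Python sketch of the greedy loop the abstract describes. The helpers `candidate_paths`, `cbmc_find_test`, and `covered_by` are hypothetical stand-ins (the abstract does not expose the authors' tooling): the first is assumed to yield control-flow-graph paths ordered by how many still-uncovered branches they traverse, the second to wrap a CBMC invocation, the third to report the branches a test exercises.

```python
def generate_test_set(branches, candidate_paths, cbmc_find_test, covered_by):
    """Greedy loop: each new test should cover as many uncovered branches
    as possible.  Schematic sketch; all three callables are hypothetical."""
    tests = []
    uncovered = set(branches)
    while uncovered:
        test = None
        for path in candidate_paths(uncovered):   # best candidates first
            test = cbmc_find_test(path)           # CBMC: a test, or None if
            if test is not None:                  # the path is infeasible
                break
        if test is None:
            # No remaining path is executable under the given assumptions,
            # so the leftover branches are treated as infeasible.
            break
        tests.append(test)
        uncovered -= covered_by(test)             # one test may cover many
    return tests
```

Because each accepted test is chosen to cover a maximal batch of still-uncovered branches, the loop avoids the redundant tests that inflated the sizes of the earlier automatically generated sets.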


2014 ◽  
Vol 644-650 ◽  
pp. 2137-2142
Author(s):  
Jian Wang ◽  
Yan Liu ◽  
Zhi Guang Zhang ◽  
Zhan Jiang Yu ◽  
Bao Gui Wang ◽  
...  

A new kind of human-mimicking shooting platform is needed so that small-arms experiments can be automated and standardized, and the core of the platform design is the modelling of the human-gun interaction system. The main objective of this paper is to model the human-gun interaction system by testing it experimentally. First, a testing scheme for measuring the interaction between the gun and the human shoulder is proposed, and high-quality test data are collected. Then, the parameters of the human-gun model are calculated by parameter identification, and a 3D model of the human-gun system is built. Finally, a dynamic simulation is performed in ADAMS, and the experimentally derived human-gun model is verified.
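The abstract does not state the model structure or the identification method. A minimal sketch, assuming a lumped spring-damper representation of the shoulder contact (force F = c·v + k·x) identified from measured recoil data by linear least squares; the data arrays below are synthetic placeholders, not measurements from the paper.

```python
import numpy as np

def identify_shoulder_params(x, v, F):
    """Least-squares fit of damping c and stiffness k from recoil data,
    under the assumed model F = c*v + k*x."""
    A = np.column_stack([v, x])              # regressors: velocity, displacement
    (c, k), *_ = np.linalg.lstsq(A, F, rcond=None)
    return c, k

# Synthetic usage example (random noise stands in for real sensor data):
rng = np.random.default_rng(0)
t = np.linspace(0.0, 0.05, 500)              # 50 ms recoil window
x = 0.01 * np.sin(40.0 * t)                  # shoulder displacement, m
v = np.gradient(x, t)                        # shoulder velocity, m/s
F = 800.0 * v + 30000.0 * x + rng.normal(0.0, 0.5, t.size)
print(identify_shoulder_params(x, v, F))     # recovers roughly (800, 30000)
```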


2013 ◽  
Vol 774-776 ◽  
pp. 1604-1608
Author(s):  
Yi Zhang ◽  
Gang Wang ◽  
Ping Rong Lin

Program slices produced by different standard slicing techniques are entered into a fusion matrix for optimal fusion, which measures how consistently the slices agree. In the actual fusion process, slicing techniques with high fusion consistency and a balanced fusion distribution are used to allocate the weight coefficients, yielding the final fusion estimation formula. We then use slice fusion, path conditions, and the internal mechanism by which software faults are triggered and propagated to construct the test constraint of a fault. This helps to direct the design of high-quality test cases and to evaluate the applicability of adaptive random testing.
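The abstract refers to a "final fusion estimation formula" without stating it. A minimal sketch, assuming the formula is a normalized weighted sum of per-technique estimates; the weights and estimate values below are invented placeholders, not values from the paper.

```python
import numpy as np

def fuse(estimates, weights):
    """Combine per-slicing-technique estimates with normalized weights."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # weight coefficients sum to 1
    return float(np.dot(w, estimates))

# e.g. three slicing techniques, weighted by their fusion consistency:
print(fuse([0.8, 0.6, 0.7], [0.5, 0.2, 0.3]))   # -> 0.73
```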


2020 ◽  
Author(s):  
BN Acharya

This study describes screening of the DrugBank library of approved drugs by pharmacophore modeling and receptor-ligand docking. A 3D-QSAR model was generated on the inhibition constants (Ki,AutoDock) estimated by AutoDock. This 3D-QSAR model was statistically validated by Fischer's randomization test and further evaluated on a test set comprising 75 molecules; the Ki,AutoDock values of 49 of these molecules were predicted correctly by the model. The validated 3D-QSAR model was then used to screen the DrugBank library of approved molecules to identify potential molecules against the novel SARS coronavirus 2 (SARS-CoV-2). Ten of the 40 shortlisted molecules were kinase inhibitors.
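For context on where the Ki,AutoDock values come from: AutoDock derives its estimated inhibition constant from the predicted binding free energy via Ki = exp(ΔG/RT). The sketch below reproduces that standard conversion; the example ΔG value is made up, not taken from this study.

```python
import math

R = 1.987e-3        # gas constant, kcal/(mol*K)
T = 298.15          # temperature assumed by AutoDock's estimate, K

def ki_from_dg(dg_kcal_per_mol):
    """Inhibition constant (mol/L) from binding free energy (kcal/mol)."""
    return math.exp(dg_kcal_per_mol / (R * T))

# e.g. a docked pose with a predicted dG of -9.0 kcal/mol:
print(f"{ki_from_dg(-9.0):.2e} M")   # ~2.5e-07 M, i.e. ~250 nM
```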


ACTA IMEKO ◽  
2017 ◽  
Vol 6 (3) ◽  
pp. 29 ◽  
Author(s):  
Piercarlo Dondi ◽  
Luca Lombardi ◽  
Marco Malagodi ◽  
Maurizio Licchelli

Measuring historical violins provides crucial information about the morphology of the instruments, useful both to researchers and to violin makers. Generally, these measurements are taken manually with a calliper, but they can be repeated only occasionally, owing both to the restricted access to these precious instruments and to the need to avoid accidental damage to the wood or the varnishes. In this work, we describe and assess the accuracy of a protocol for the acquisition and creation of high-quality 3D models of violins suitable for taking accurate measurements. Six historical violins of the 17th and 18th centuries, kept in the "Museo del Violino" in Cremona (Italy), were used as the test set. The quality of the final outcomes was checked by comparing measurements taken on the 3D meshes with the corresponding ones taken by calliper on the original instruments. Finally, a comparison between the sound boards of the instruments was performed.
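To illustrate how a calliper-style measurement can be taken on a mesh rather than on the instrument, here is a small sketch using the `trimesh` library. A unit icosphere stands in for a violin scan (in practice one would `trimesh.load` the acquired model), and the landmark coordinates are placeholders, not points from the paper's protocol.

```python
import numpy as np
import trimesh   # pip install trimesh

mesh = trimesh.creation.icosphere(subdivisions=3)   # stand-in for the scan

# Two landmark points (e.g. upper and lower bout extremes on a real scan).
landmarks = np.array([[0.0, 0.0, 1.2],
                      [0.0, 0.0, -1.2]])

# Snap each landmark to the nearest point on the mesh surface, then take
# the straight-line distance between them, as a calliper would.
closest, _, _ = trimesh.proximity.closest_point(mesh, landmarks)
length = np.linalg.norm(closest[0] - closest[1])
print(f"measured distance: {length:.3f} (mesh units)")
```

Taking the measurement on the snapped surface points rather than the raw landmark coordinates keeps the virtual measurement anchored to the reconstructed geometry, mirroring physical calliper contact.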


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. 6558-6558
Author(s):  
Fernando Jose Suarez Saiz ◽  
Corey Sanders ◽  
Rick J Stevens ◽  
Robert Nielsen ◽  
Michael W Britt ◽  
...  

6558 Background: Finding high-quality science to support decisions for individual patients is challenging. Common approaches to assessing the quality and relevance of clinical literature rely on bibliometrics or expert knowledge. We describe a method to automatically identify clinically relevant, high-quality scientific citations from abstract content. Methods: We used machine learning trained on text from PubMed papers cited in 3 expert resources: NCCN, NCI-PDQ, and Hemonc.org. The balanced training data comprised text cited in at least two of these sources, forming an "on-topic" set (i.e., relevant and high quality), and an "off-topic" set not cited in any of the three sources and published in lower-ranked journals according to a citation-based score. Articles were part of an oncology clinical-trial corpus generated with a standard PubMed query. We used a gradient-boosted-tree approach with binary logistic supervised classification. Briefly, 988 texts were processed to produce a term frequency-inverse document frequency (tf-idf) n-gram representation of both the training and the test set (70/30 split). Ideal parameters were determined using 1000-fold cross-validation. Results: Our model classified papers in the test set with 0.93 accuracy (95% CI 0.90-0.96, p ≤ 0.0001), with sensitivity 0.95 and specificity 0.91. Some false positives contained language considered clinically relevant that may have been missed or not yet included in the expert resources. False negatives revealed a potential bias towards chemotherapy-focused research over radiation therapy and surgical approaches. Conclusions: Machine learning can be used to automatically identify relevant clinical publications from bibliographic databases, without relying on expert curation or bibliometric methods, and may reduce the time clinicians spend finding pertinent evidence for a patient. The approach generalizes to any setting where a corpus of high-quality publications exists to serve as a training set, or where document metadata is unreliable, as is the case for "grey" literature within oncology and beyond, in other diseases. Future work will extend this approach and may integrate it into oncology clinical decision-support tools.
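A minimal sketch of the pipeline described above: a tf-idf n-gram representation feeding a gradient-boosted-tree classifier with a 70/30 split. The abstract does not name the library; scikit-learn is used here as a stand-in, and the toy on-topic/off-topic corpus below is an invented placeholder for the 988 curated abstracts.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder corpus: in the study, "on-topic" texts are cited in >= 2
# expert resources and "off-topic" texts in none of them.
texts = (["randomized phase iii trial of chemotherapy in metastatic disease"] * 20
         + ["case report of an unrelated observation in a small cohort"] * 20)
labels = [1] * 20 + [0] * 20                 # 1 = on-topic, 0 = off-topic

# tf-idf over unigrams and bigrams, as an n-gram representation.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=2)
X = vectorizer.fit_transform(texts)

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.30, random_state=42, stratify=labels)

# Gradient-boosted trees with logistic (deviance) loss for binary labels.
clf = GradientBoostingClassifier()
clf.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

In practice the hyperparameters would be tuned by cross-validation, as the abstract reports, rather than left at their defaults.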

