Greedy Algorithm for a Training Set Reduction in the Kernel Methods

Author(s):  
Vojtěch Franc ◽  
Václav Hlaváč
2011 ◽  
Author(s):  
Jeffrey S. Katz ◽  
John F. Magnotti ◽  
Anthony A. Wright

CCIT Journal ◽  
2019 ◽  
Vol 12 (2) ◽  
pp. 170-176
Author(s):  
Anggit Dwi Hartanto ◽  
Aji Surya Mandala ◽  
Dimas Rio P.L. ◽  
Sidiq Aminudin ◽  
Andika Yudirianto

Pacman is a maze game that uses artificial intelligence, implemented as several algorithms embedded in the program. This work implements Dijkstra's algorithm to solve the minimum-route problem for the Pacman ghosts, whose role is to chase the player. Dijkstra's algorithm follows a principle similar to a greedy algorithm: starting from the first point, it connects successive points until it reaches the destination, comparing the accumulated values from the starting point and then examining each connected next node to match it to a path. In the testing phase, Dijkstra's algorithm proved quite good at finding the minimum route for pursuing the player, obtaining a value of 13, in agreement with manual calculations.
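
The minimum-route search described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the graph `maze`, its node names, and its edge weights are hypothetical, and the function name `dijkstra` is chosen here for convenience.

```python
import heapq


def dijkstra(graph, start, goal):
    """Return (cost, path) for a minimum-cost route from start to goal.

    graph maps each node to a dict of {neighbor: non-negative edge weight}.
    Returns (inf, []) when the goal cannot be reached.
    """
    queue = [(0, start, [start])]  # (cost so far, current node, path taken)
    best = {start: 0}              # cheapest known cost to reach each node
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical maze: junctions are nodes, corridor lengths are edge weights.
maze = {
    "ghost": {"A": 2, "B": 5},
    "A": {"ghost": 2, "B": 1, "C": 4},
    "B": {"ghost": 5, "A": 1, "player": 6},
    "C": {"A": 4, "player": 2},
    "player": {"B": 6, "C": 2},
}

cost, path = dijkstra(maze, "ghost", "player")
print(cost, path)
```

On this toy graph the ghost reaches the player through `A` and `C`. The greedy flavor noted in the abstract shows up in the priority queue, which always expands the cheapest node reached so far.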


2020 ◽  
Vol 2020 (10) ◽  
pp. 64-1-64-5
Author(s):  
Mustafa I. Jaber ◽  
Christopher W. Szeto ◽  
Bing Song ◽  
Liudmila Beziaeva ◽  
Stephen C. Benz ◽  
...  

In this paper, we propose a patch-based system to classify non-small cell lung cancer (NSCLC) diagnostic whole slide images (WSIs) into two major histopathological subtypes: adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC). Classifying patients accurately is important for prognosis and therapy decisions. The proposed system was trained and tested on 876 subtyped NSCLC gigapixel-resolution diagnostic WSIs from 805 patients – 664 in the training set and 141 in the test set. The algorithm has modules for: 1) auto-generated tumor/non-tumor masking using a trained residual neural network (ResNet34), 2) cell-density map generation (based on color deconvolution, local drain segmentation, and watershed transformation), 3) patch-level feature extraction using a pre-trained ResNet34, 4) a tower of linear SVMs for different cell ranges, and 5) a majority voting module for aggregating subtype predictions in unseen testing WSIs. The proposed system was trained and tested on several WSI magnifications ranging from x4 to x40 with a best ROC AUC of 0.95 and an accuracy of 0.86 in test samples. This fully-automated histopathology subtyping method outperforms similar published state-of-the-art methods for diagnostic WSIs.
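
The final aggregation step (module 5) can be illustrated with a short sketch. This is one plausible reading of "majority voting" and not the authors' code: the function name `majority_vote`, the input format (one predicted label per patch), and the alphabetical tie-break are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(patch_predictions):
    """Aggregate per-patch subtype labels into one slide-level label.

    Ties are broken alphabetically; this tie-break rule is an assumption,
    since the abstract does not describe how ties are handled.
    """
    if not patch_predictions:
        raise ValueError("at least one patch prediction is required")
    counts = Counter(patch_predictions)
    top = max(counts.values())
    # Collect every label that reached the highest count, then pick one.
    winners = sorted(label for label, n in counts.items() if n == top)
    return winners[0]


# Hypothetical patch-level predictions for a single whole slide image.
slide_label = majority_vote(["LUAD", "LUSC", "LUAD", "LUAD", "LUSC"])
print(slide_label)
```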


2020 ◽  
Author(s):  
Xin Yi See ◽  
Benjamin Reiner ◽  
Xuelan Wen ◽  
T. Alexander Wheeler ◽  
Channing Klein ◽  
...  

Herein, we describe the use of iterative supervised principal component analysis (ISPCA) in de novo catalyst design. The regioselective synthesis of 2,5-dimethyl-1,3,4-triphenyl-1H-pyrrole (C) via Ti-catalyzed formal [2+2+1] cycloaddition of phenyl propyne and azobenzene was targeted as a proof of principle. The initial reaction conditions led to an unselective mixture of all possible pyrrole regioisomers. ISPCA was conducted on a training set of catalysts, and their performance was regressed against the scores from the top three principal components. Component loadings from this PCA space along with k-means clustering were used to inform the design of new test catalysts. The selectivity of a prospective test set was predicted in silico using the ISPCA model, and only optimal candidates were synthesized and tested experimentally. This data-driven predictive-modeling workflow was iterated, and after only three generations the catalytic selectivity was improved from 0.5 (statistical mixture of products) to over 11 (> 90% C) by incorporating 2,6-dimethyl-4-(pyrrolidin-1-yl)pyridine as a ligand. The successful development of a highly selective catalyst without resorting to long, stochastic screening processes demonstrates the inherent power of ISPCA in de novo catalyst design and should motivate the general use of ISPCA in reaction development.


2018 ◽  
Author(s):  
Caitlin C. Bannan ◽  
David Mobley ◽  
A. Geoff Skillman

A variety of fields would benefit from accurate pKa predictions, especially drug design, due to the effect a change in ionization state can have on a molecule's physicochemical properties. Participants in the recent SAMPL6 blind challenge were asked to submit predictions for microscopic and macroscopic pKas of 24 drug-like small molecules. We recently built a general model for predicting pKas using a Gaussian process regression trained on physical and chemical features of each ionizable group. Our pipeline takes a molecular graph and uses the OpenEye Toolkits to calculate features describing the removal of a proton. These features are fed into a Scikit-learn Gaussian process to predict microscopic pKas, which are then used to analytically determine macroscopic pKas. Our Gaussian process is trained on a set of 2,700 macroscopic pKas from monoprotic and select diprotic molecules. Here, we share our results for microscopic and macroscopic predictions in the SAMPL6 challenge. Overall, we ranked in the middle of the pack compared to other participants, but our fairly good agreement with experiment is still promising considering that the challenge molecules are chemically diverse and often polyprotic while our training set is predominantly monoprotic. Of particular importance to us when building this model was to include an uncertainty estimate based on the chemistry of the molecule that would reflect the likely accuracy of our prediction. Our model reports large uncertainties for the molecules that appear to have chemistry outside our domain of applicability, along with good agreement in quantile-quantile plots, indicating it can predict its own accuracy. The challenge highlighted a variety of means to improve our model, including adding more polyprotic molecules to our training set and more carefully considering what functional groups we do or do not identify as ionizable.


Author(s):  
Golokesh Santra ◽  
Nitai Sylvetsky ◽  
Gershom Martin

We present a family of minimally empirical double-hybrid DFT functionals parametrized against the very large and diverse GMTKN55 benchmark. The very recently proposed wB97M(2) empirical double hybrid (with 16 empirical parameters) has the lowest WTMAD2 (weighted mean absolute deviation over GMTKN55) ever reported at 2.19 kcal/mol. However, our xrevDSD-PBEP86-D4 functional reaches a statistically equivalent WTMAD2=2.22 kcal/mol, using just a handful of empirical parameters, and the xrevDOD-PBEP86-D4 functional reaches 2.25 kcal/mol with just opposite-spin MP2 correlation, making it amenable to reduced-scaling algorithms. In general, the D4 empirical dispersion correction is clearly superior to D3BJ. If one eschews dispersion corrections of any kind, noDispSD-SCAN offers a viable alternative. Parametrization over the entire GMTKN55 dataset yields substantial improvement over the small training set previously employed in the DSD papers.

