Pushing the limits of solubility prediction via quality-oriented data selection

2020 ◽  
Author(s):  
Murat Sorkun ◽  
J. M. Koelman ◽  
Süleyman Er

Accurate prediction of the solubility of chemical substances in solvents remains a challenge. The sparsity of high-quality solubility data is recognized as the biggest hurdle in the development of robust data-driven methods for practical use. Nonetheless, the effects of the quality and quantity of data on aqueous solubility predictions have not yet been scrutinized. In this study, the roles of the size and the quality of datasets in the performance of solubility prediction models are unraveled, and the concepts of actual and observed performance are introduced. In an effort to narrow the gap between actual and observed performance, a quality-oriented data selection method, which evaluates the quality of data and extracts the most accurate part of it through statistical validation, is designed. By applying this method to the largest publicly available solubility database and using a consensus machine learning approach, a top-performing solubility prediction model is achieved.

2014 ◽  
pp. 3-29 ◽  
Author(s):  
Michael Reiter ◽  
Uwe Breitenbucher ◽  
Oliver Kopp ◽  
Dimka Karastoyanova

1987 ◽  
Vol 54 ◽  
pp. 3-3
Author(s):  
Stephen E. Frantzich

Integrating C-SPAN coverage into a traditional course provides some unique opportunities and burdens. On the opportunity side, the ability to see the subject matter relatively directly sparks interest, verifies class material and allows for some creative activities not possible using traditional resources. On the more negative side, the approaches outlined in this paper do not necessarily make teaching easier. Since faculty seldom have the opportunity to become C-SPAN “junkies” watching all the coverage, students will bring questions and examples to class which challenge the instructor more than the material stimulated by contact with traditional written sources. In evaluating many of the exercises, the instructor will have to rely on the student's interpretation and the quality of data selection and analysis. Grading will more often be based on how well the student makes his case, rather than the instructor knowing the contours of what the student should conclude ahead of time.


2019 ◽  
Vol 29 (3) ◽  
pp. 368-395 ◽  
Author(s):  
Thomas W. Price ◽  
Yihuan Dong ◽  
Rui Zhi ◽  
Benjamin Paaßen ◽  
Nicholas Lytle ◽  
...  

2019 ◽  
Vol 79 ◽  
pp. 03001
Author(s):  
Fan Wang ◽  
Chunfu Shao ◽  
Qi Chen ◽  
Tianyi Meng ◽  
Changwen Li

ATR-FTIR combined with chemometrics was applied to establish SVM classification models for evaluating the sensory quality of Chinese Moutai-flavour liquor. Transformation of the ATR-FTIR data, selection of effective wavenumbers, and determination of the SVM parameters c and gamma were performed in succession, and the models were then verified on unknown samples. The taste-prediction models for raw grain and cleanliness have an accuracy reaching 90%. The after-taste model has an accuracy of 80%, and the others are below 70%. For some flavours, ATR-FTIR combined with chemometrics provides an effective method for quality analysis of Chinese Moutai-flavour liquor.


ADMET & DMPK ◽  
2020 ◽  
Author(s):  
Gabriela Falcón-Cano ◽  
Christophe Molina ◽  
Miguel Angel Cabrera-Pérez

In-silico prediction of aqueous solubility plays an important role during the drug discovery and development processes. For many years, the limited performance of in-silico solubility models has been attributed to the lack of high-quality solubility data for pharmaceutical molecules. However, some studies suggest that the poor accuracy of solubility prediction is not related to the quality of the experimental data and that more precise methodologies (algorithms and/or sets of descriptors) are required for predicting the aqueous solubility of pharmaceutical molecules. In this study, a large and diverse database was generated with aqueous solubility values collected from two public sources; two new recursive machine-learning approaches were developed for data cleaning and variable selection; and a consensus model based on regression and classification algorithms was created. The modeling protocol, which includes the curation of chemical and experimental data, was implemented in KNIME, with the aim of obtaining an automated workflow for the prediction of new databases. Finally, we compared several methods or models available in the literature with our consensus model, showing results comparable to, or even outperforming, previously published models.


Author(s):  
B. L. Armbruster ◽  
B. Kraus ◽  
M. Pan

One goal in electron microscopy of biological specimens is to improve the quality of data to equal the resolution capabilities of modern transmission electron microscopes. Radiation damage and beam-induced movement caused by charging of the sample, low image contrast at high resolution, and sensitivity to external vibration and drift in side-entry specimen holders limit the effective resolution one can achieve. Several methods have been developed to address these limitations: cryomethods are widely employed to preserve and stabilize specimens against some of the adverse effects of the vacuum and electron beam irradiation, spot-scan imaging reduces charging and associated beam-induced movement, and energy-filtered imaging removes the “fog” caused by inelastic scattering of electrons, which is particularly pronounced in thick specimens. Although most cryoholders can easily achieve a 3.4 Å resolution specification, information perpendicular to the goniometer axis may be degraded due to vibration. Absolute drift after mechanical and thermal equilibration as well as drift after movement of a holder may cause loss of resolution in any direction.

