Advanced PLS Techniques in Chemometrics and Their Applications to Molecular Design

Author(s): Kiyoshi Hasegawa, Kimito Funatsu

In quantitative structure-activity and structure-property relationship (QSAR and QSPR) studies, multivariate statistical methods are commonly used for analysis. Partial least squares (PLS) is of particular interest because it can analyze data with strongly collinear, noisy, and numerous X variables while simultaneously modeling several response variables Y. PLS also provides several prediction regions and diagnostic plots as statistical measures. PLS has evolved to cope with the severe demands posed by complex X and Y data structures. In this review article, the authors select four advanced PLS techniques and outline their algorithms with representative examples. In particular, they describe how to uncover the inner relations embedded in the data and how to use this information for molecular design.
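To make the collinearity point above concrete, the following is a minimal sketch of basic single-response PLS (PLS1) fitted with the NIPALS algorithm in NumPy. It is not one of the four advanced techniques the review covers. The function names `pls1_fit` and `pls1_predict` and the synthetic data are assumptions introduced here for illustration only.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with NIPALS (illustrative sketch)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean           # centered residual matrices
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                         # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xr @ w                            # latent scores
        tt = t @ t
        p = Xr.T @ t / tt                     # X loadings
        c = (yr @ t) / tt                     # y loading
        Xr = Xr - np.outer(t, p)              # deflate X
        yr = yr - c * t                       # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)    # coefficients in the original X space
    return coef, x_mean, y_mean

def pls1_predict(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean

# Synthetic data (assumed): 10 strongly collinear descriptors driven by 2 latent factors.
rng = np.random.default_rng(0)
T = rng.normal(size=(40, 2))
X = T @ rng.normal(size=(2, 10)) + 0.01 * rng.normal(size=(40, 10))
y = T @ np.array([1.5, -0.7]) + 0.01 * rng.normal(size=40)

model = pls1_fit(X, y, n_components=2)
resid = y - pls1_predict(model, X)
r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
print(f"training R^2 = {r2:.4f}")
```

Ordinary least squares would be ill-conditioned on these ten near-duplicate descriptors. PLS regresses on a few latent scores instead, which is why it tolerates collinear and numerous X variables.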

2010, Vol 09 (supp01), pp. 9-22
Author(s): GUI-NING LU, XUE-QIN TAO, ZHI DANG, WEILIN HUANG, ZHONG LI

The environmental fate of polychlorinated dibenzo-p-dioxins and polychlorinated dibenzofurans (PCDD/Fs) has become a major issue in recent decades. Quantitative structure–property relationship (QSPR) modeling is a powerful approach for predicting the properties of environmental organic pollutants from their structure descriptors. In this study, QSPR models were established for estimating the water solubility (-log S_W) and n-octanol/water partition coefficient (log K_OW) of PCDD/Fs. The models were generated from quantum chemical descriptors computed with density functional theory at the B3LYP/6-31G(d) level, using partial least squares (PLS) analysis with an optimizing procedure. The optimized models for -log S_W and log K_OW had high correlation coefficients (R² > 0.983). Both the internal cross-validation test [Formula: see text] and the external validation test (R² > 0.965) showed that the models had high precision and good predictive capability. The -log S_W and log K_OW values predicted by the models are very close to the observed values. The PLS analysis indicated that PCDD/Fs with a larger electronic spatial extent (R_e), lower molecular total energy (E_T), and smaller energy gap between the lowest unoccupied and highest occupied molecular orbitals (E_LUMO - E_HOMO) tend to be less soluble in water but more lipophilic.
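The internal and external validation statistics mentioned above can be sketched as follows. This is an assumed, generic formulation (leave-one-out Q² as 1 - PRESS/SS, and external R² computed the same way on held-out compounds) and may differ from the exact definitions used in the study. The data are synthetic, not PCDD/F descriptors, and all function names are hypothetical.

```python
import numpy as np

def pls1_coef(X, y, n_components):
    """Compact NIPALS PLS1 on already-centered data; returns coefficients."""
    Xr, yr = X.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w
        p = Xr.T @ t / (t @ t)
        c = (yr @ t) / (t @ t)
        Xr, yr = Xr - np.outer(t, p), yr - c * t   # deflate both blocks
        W.append(w); P.append(p); q.append(c)
    W, P = np.array(W).T, np.array(P).T
    return W @ np.linalg.solve(P.T @ W, np.array(q))

def fit_predict(X_train, y_train, X_new, n_components):
    """Center on the training set only, fit PLS1, and predict new samples."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    coef = pls1_coef(X_train - x_mean, y_train - y_mean, n_components)
    return (X_new - x_mean) @ coef + y_mean

def q2_loo(X, y, n_components):
    """Leave-one-out cross-validated Q^2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        pred = fit_predict(X[keep], y[keep], X[i:i + 1], n_components)[0]
        press += (y[i] - pred) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def r2_external(y_obs, y_pred):
    """Coefficient of determination on compounds not used for fitting."""
    return 1.0 - np.sum((y_obs - y_pred) ** 2) / np.sum((y_obs - y_obs.mean()) ** 2)

# Synthetic stand-in data (assumed): 30 "compounds", 6 collinear descriptors.
rng = np.random.default_rng(1)
T = rng.normal(size=(30, 2))
X = T @ rng.normal(size=(2, 6)) + 0.02 * rng.normal(size=(30, 6))
y = T @ np.array([1.0, 0.5]) + 0.02 * rng.normal(size=30)

X_train, y_train, X_test, y_test = X[:22], y[:22], X[22:], y[22:]
q2 = q2_loo(X_train, y_train, n_components=2)
r2_ext = r2_external(y_test, fit_predict(X_train, y_train, X_test, 2))
print(f"Q2 (LOO) = {q2:.3f}, external R2 = {r2_ext:.3f}")
```

The centering statistics are recomputed inside each cross-validation fold so that the left-out compound never influences the model that predicts it.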

