Development of a Protein-Ligand Extended Connectivity (PLEC) Fingerprint and Its Application for Binding Affinity Predictions.
<div>Fingerprints (FPs) are the most common small-molecule representation in cheminformatics. A wide variety of fingerprints exists, and the Extended Connectivity Fingerprint (ECFP) is one of the best suited for general applications. Despite this abundance, only a few FPs represent the 3D structure of the molecule, and hardly any encode protein-ligand interactions. Here, we present the Protein-Ligand Extended Connectivity (PLEC) fingerprint, which implicitly encodes protein-ligand interactions by pairing the ECFP environments of the ligand and the protein. PLEC fingerprints were used to construct several machine learning (ML) models tailored for predicting protein-ligand binding affinities (pK<sub>i/d</sub>). Even the simplest linear model built on the PLEC fingerprint achieved a Pearson correlation of R<sub>p</sub> = 0.83 on the PDBbind v2016 "core set", demonstrating the descriptive power of the fingerprint. The PLEC fingerprint has been implemented in the Open Drug Discovery Toolkit (https://github.com/oddt/oddt).</div>
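The pairing idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the ODDT implementation: the function names `stable_hash` and `toy_plec`, the dictionary layout of the atoms, the string environment labels, the 4.5 Å contact cutoff, and the fingerprint size are all assumptions made for illustration. The sketch assumes each atom already carries a list of ECFP-like environment identifiers (one per depth), pairs the ligand and protein environments depth by depth for every atom pair in contact, and hashes each pair into a folded count vector.

```python
import hashlib
import math


def stable_hash(text, size):
    """Map a string to a bit position in [0, size) reproducibly across runs."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % size


def toy_plec(ligand_atoms, protein_atoms, cutoff=4.5, size=1024):
    """Toy interaction fingerprint (hypothetical helper, not the ODDT API).

    Each atom is a dict with "xyz" (3D coordinates) and "envs" (a list of
    precomputed environment identifiers, one per depth). Returns a sparse
    count fingerprint as {bit_index: count}.
    """
    counts = {}
    for lig in ligand_atoms:
        for prot in protein_atoms:
            # Only atom pairs closer than the cutoff count as a contact.
            if math.dist(lig["xyz"], prot["xyz"]) > cutoff:
                continue
            # Assumption of this sketch: environments are paired depth by depth.
            for lig_env, prot_env in zip(lig["envs"], prot["envs"]):
                bit = stable_hash(f"{lig_env}|{prot_env}", size)
                counts[bit] = counts.get(bit, 0) + 1
    return counts


# Made-up example: one ligand atom, one protein atom in contact, one far away.
ligand = [{"xyz": (0.0, 0.0, 0.0), "envs": ["N", "N(C)(C)"]}]
protein = [
    {"xyz": (3.0, 0.0, 0.0), "envs": ["O", "O(=C)"]},
    {"xyz": (10.0, 0.0, 0.0), "envs": ["C", "C(C)(N)"]},
]
fp = toy_plec(ligand, protein)
print(sum(fp.values()))  # number of hashed environment pairs
```

A count vector of this kind can be expanded into a fixed-length array of `size` entries and passed to any regression model. Folding into a fixed size keeps the representation compact, at the cost of occasional hash collisions between unrelated environment pairs.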