Selected Applications in Drug Design and Property Prediction

Over the past decades, computer-assisted drug design and property prediction have evolved into a widely used and well-established discipline that supports the R&D process in the life sciences. The following selected challenges in drug design were addressed and solved with software systems provided by Molecular Networks.

Analysis of HTS Results

Task

Develop a virtual screening tool based on the results of an HTS assay of a library of hydantoin derivatives to test new virtual libraries.

Solution

The hydrogen bonding potential of each compound in the hydantoin library (5,328 molecules, 185 of them hits) was calculated and condensed into an autocorrelation vector. Two thirds of the dataset were used to train a self-organizing neural network (Kohonen network) that serves as a classification filter. The remaining third was passed through the filter, which correctly classified 96% of the hits and 92% of the non-hits.
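The split and the per-class evaluation described above can be sketched in a few lines. This is a minimal illustration, not the published workflow: the function names `stratified_split` and `per_class_accuracy` are hypothetical, the split is assumed to be stratified (the text does not say how the two thirds were chosen), and no Kohonen network is trained here.

```python
import random

def stratified_split(labels, train_fraction=2 / 3, seed=0):
    """Return (train_idx, test_idx), keeping the hit/non-hit ratio in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train_idx.extend(members[:cut])
        test_idx.extend(members[cut:])
    return sorted(train_idx), sorted(test_idx)

def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly classified items, reported separately per class."""
    rates = {}
    for cls in set(y_true):
        pairs = [(t, p) for t, p in zip(y_true, y_pred) if t == cls]
        rates[cls] = sum(t == p for t, p in pairs) / len(pairs)
    return rates

# Library size taken from the text: 5,328 molecules, 185 of them hits (label 1).
labels = [1] * 185 + [0] * (5328 - 185)
train_idx, test_idx = stratified_split(labels)
```

Reporting hits and non-hits separately matters for a library this imbalanced: a filter that labels every compound a non-hit would already be right for about 96.5% of the molecules (5,143 of 5,328) while finding no hits at all.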

Software

  • Molecular descriptor package ADRIANA.Code for calculating the hydrogen bonding potential descriptors
  • Neural networks package SONNIA for developing the classification filter

Slide Show

Analysis of HTS results

Reference

Teckentrup, A.; Briem, H.; Gasteiger, J. Mining High-Throughput Screening Data of Combinatorial Libraries: Development of a Filter to Distinguish Hits from Nonhits. J. Chem. Inf. Comput. Sci. 2004, 44, 626-634. (DOI: dx.doi.org/10.1021/ci034223v)

Prediction of the Isoform Specificity of Human CYP450 Substrates

Task

Develop a model to predict the isoform specificity of human CYP450 substrates.

Solution

A dataset of 146 compounds with known isoform specificity towards the three major CYP450 isoenzymes was extracted from the literature (80 3A4, 45 2D6, and 21 2C9 substrates). For each compound, a set of 242 descriptors was calculated, including global molecular, shape- and size-related, and substructure-based descriptors as well as 2D and 3D autocorrelation vectors. After an automatic feature selection step, the remaining 12 descriptors were submitted to a support vector machine (SVM) learning algorithm. The final model achieved an average predictivity of 90% (recall), 89% (leave-one-out cross-validation), and 84% (2-fold cross-validation). On an external validation dataset of 233 compounds with known predominant isoform specificity, the model reached a predictivity of 83%.
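To make the leave-one-out figure concrete, the validation loop can be sketched as follows. This is only an illustration under stated assumptions: a nearest-centroid rule stands in for the SVM so that the example needs no external library, the descriptor values are made up, and the function names are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (NOT an SVM): pick the class whose mean vector is closest to x."""
    best_cls, best_dist = None, float("inf")
    for cls in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == cls]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_cls, best_dist = cls, dist
    return best_cls

def leave_one_out_accuracy(X, y, predict=nearest_centroid_predict):
    """Hold out each compound once, train on the rest, and count correct predictions."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        correct += predict(train_X, train_y, X[i]) == y[i]
    return correct / len(X)

# Toy data: two made-up descriptors per compound, three isoform labels.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
     [5.0, 5.1], [5.2, 4.9], [4.9, 5.0],
     [0.0, 5.0], [0.2, 5.2], [0.1, 4.9]]
y = ["3A4"] * 3 + ["2D6"] * 3 + ["2C9"] * 3
loo = leave_one_out_accuracy(X, y)
```

As a point of comparison for the reported percentages, always predicting the largest class (3A4, 80 of 146 compounds) would be correct for about 55% of the training set.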

Software

Slide Show

Prediction of the isoform specificity of human CYP450 substrates

Reference

Terfloth, L.; Bienfait, B.; Gasteiger, J. Ligand-based Models for the Isoform Specificity of Cytochrome P450 3A4, 2D6, and 2C9 Substrates. J. Chem. Inf. Model. 2007, 47, 1688-1701. (DOI: dx.doi.org/10.1021/ci700010t)

Finding New Lead Structures and Lead Hopping

Task

Find new possible lead structures for dopamine and benzodiazepine agonists in a compound collection from a chemical supplier with unknown biological activity.

Solution

A dataset of 112 dopamine agonists (DPA) and 60 benzodiazepine agonists (BDA) was merged with 8,223 compounds available from a chemical supplier. All compounds were encoded by 2D autocorrelation vectors, which take into account the constitution of the molecules as well as physicochemical atom properties, and were projected onto a two-dimensional map using a self-organizing neural network (Kohonen network). The resulting Kohonen map shows a clear separation of DPA, BDA, and supplier compounds. However, a few compounds with unknown biological activity were assigned to neurons that also contained either BDAs or DPAs. These compounds could serve as candidate lead structures for developing new BDA or DPA compounds.
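The idea behind a 2D (topological) autocorrelation vector can be sketched as below. This is a generic textbook-style formulation, A(d) = sum of p_i * p_j over atom pairs that are d bonds apart, and not the ADRIANA.Code implementation; the function names, the atom property values, and the choice to count each atom with itself at distance 0 are assumptions made for illustration.

```python
from collections import deque

def topological_distances(n_atoms, bonds):
    """All-pairs shortest path lengths (counted in bonds) via breadth-first search."""
    neighbors = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    dist = [[None] * n_atoms for _ in range(n_atoms)]
    for start in range(n_atoms):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in neighbors[cur]:
                if dist[start][nxt] is None:
                    dist[start][nxt] = dist[start][cur] + 1
                    queue.append(nxt)
    return dist

def autocorrelation_2d(props, bonds, max_distance):
    """A(d) = sum of p_i * p_j over atom pairs i <= j that are d bonds apart."""
    dist = topological_distances(len(props), bonds)
    vector = [0.0] * (max_distance + 1)
    for i in range(len(props)):
        for j in range(i, len(props)):
            d = dist[i][j]
            if d is not None and d <= max_distance:
                vector[d] += props[i] * props[j]
    return vector

# Toy example: a three-atom chain 0-1-2 with made-up atom property values.
vec = autocorrelation_2d([1.0, 2.0, 3.0], [(0, 1), (1, 2)], max_distance=2)
```

The vector has the same length for every molecule regardless of its size, which is what allows structurally diverse compounds to be fed into a single Kohonen network.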

Software

  • Molecular descriptor package ADRIANA.Code for calculating the 2D autocorrelation descriptors
  • Neural networks package SONNIA for generating and analyzing the Kohonen map

Slide Show

Finding new lead structures

Reference

Bauknecht, H.; Zell, A.; Bayer, H.; Levi, P.; Wagener, M.; Sadowski, J.; Gasteiger, J. Locating Biologically Active Compounds in Medium-Sized Heterogeneous Datasets by Topological Autocorrelation Vectors: Dopamine and Benzodiazepine Agonists. J. Chem. Inf. Comput. Sci. 1996, 36, 1205-1213. (DOI: dx.doi.org/10.1021/ci960346m)

Modeling Biological Activities

Task

Model the binding affinities of a dataset of 30 steroid molecules binding to the corticoid binding globulin receptor.

Solution

For each of the 30 steroids with known affinity to the corticoid binding globulin (CBG) receptor, the molecular electrostatic potential was calculated and condensed into an autocorrelation vector. Using a combination of a Kohonen network and a backpropagation neural network, these autocorrelation vectors were successfully applied to model the binding affinities of the steroid molecules to the CBG receptor, with a cross-validated r² of 0.86.
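The cross-validated coefficient of determination reported above can be computed from observed affinities and the predictions made for each compound while it was left out of training. The sketch below assumes the common definition q² = 1 - PRESS / SS_total; the text does not state which definition was used, and the numbers are made up for illustration.

```python
def cross_validated_r2(observed, predicted_cv):
    """q2 = 1 - PRESS / SS_total, where predicted_cv[i] comes from a model
    that did not see compound i during training."""
    mean_obs = sum(observed) / len(observed)
    # PRESS: sum of squared errors of the held-out predictions.
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted_cv))
    # SS_total: spread of the observed values around their mean.
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_total

# Made-up affinities and held-out predictions for four compounds.
observed = [1.0, 2.0, 3.0, 4.0]
predicted_cv = [1.5, 2.0, 3.0, 3.5]
q2 = cross_validated_r2(observed, predicted_cv)
```

A value of 1 means the held-out predictions match the observations exactly, while a value of 0 means the model does no better than predicting the mean affinity for every compound.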

Software

  • Molecular descriptor package ADRIANA.Code for calculating the electrostatic potential descriptors
  • Neural networks package SONNIA for building and analyzing the predictive model

Slide Show

Modeling of biological activities

Reference

Wagener, M.; Sadowski, J.; Gasteiger, J. Autocorrelation of Molecular Surface Properties for Modeling Corticosteroid Binding Globulin and Cytosolic Ah Receptor Activity by Neural Networks. J. Am. Chem. Soc. 1995, 117, 7769-7775. (DOI: dx.doi.org/10.1021/ja00134a023)