QSAR Modeling of ErbB1 Inhibitors Using Genetic Algorithm-Based Regression
Gaining efficiency in quantitative structure–activity relationships
ErbB1 kinase is the cell-surface receptor for epidermal growth factor. Its dysfunction has been implicated in diseases such as cancer. This paper illustrates the use of the Reaxys® Medicinal Chemistry knowledgebase to build efficient QSAR models using ErbB1 kinase as an example.
QSAR relates molecular descriptors to biological activity for known compounds.
Reaxys Medicinal Chemistry can provide computational chemists with large and diverse datasets for building predictive models. These models can be used for virtual screening or during lead optimization. In brief, QSAR (quantitative structure–activity relationship) modeling uses a mathematical equation that relates molecular descriptors to biological activity for a known series of compounds, yielding a model for evaluating the activity of new chemical entities (1, 2).
1. Engel, T. (2006) Basic Overview of Cheminformatics. J. Chem. Inf. Model. 46: 2267–2277.
2. Polanski, J., Bak, A., Gielciak, R. and Magdiarz, T. (2006) Modeling robust QSAR. J. Chem. Inf. Model. 46: 2310–2318.
This paper illustrates the use of the Reaxys Medicinal Chemistry knowledgebase to build efficient QSAR models, using ErbB1 kinase as an example.
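As a rough illustration of the idea that a QSAR model is an equation linking descriptors to activity, the sketch below fits a straight line to one descriptor by ordinary least squares and uses it to score a new compound. This is not the genetic algorithm-based regression named in the title; it is the simplest possible stand-in. The descriptor values, the activities, and the function names fit_line and predict are all invented for illustration.

```python
# Hypothetical toy data, invented for illustration only:
# one descriptor value per compound and its measured activity (as pX).
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [5.1, 5.9, 7.1, 7.9]


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x_new):
    """Estimate the activity of a new compound from its descriptor value."""
    return slope * x_new + intercept


slope, intercept = fit_line(descriptor, activity)
estimated = predict(slope, intercept, 5.0)
```

A real QSAR model would typically use many descriptors and a validation set; the single-descriptor form is chosen here only to keep the relationship between the equation and the prediction visible.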
The first step in building a suitable dataset for QSAR modeling is to collect all pertinent chemical and biological information relating to the ErbB1 target. Searching by target name followed by selecting the specific isoform enables easy retrieval of all the relevant chemical and biological data for ErbB1 (Figure 1). Similar searches were conducted for the other ErbB kinases to illustrate the extent of chemical and biological data available in Reaxys Medicinal Chemistry (Table 1).
To facilitate comparison of bioactivity data from different publications and assay types, every in vitro data point in Reaxys Medicinal Chemistry has a pX value. A pX value is calculated by transforming a parameter such as EC50, IC50, or Ki into its negative base-10 logarithm (pEC50, pIC50, pKi). These normalized values make it easy to quantify compound–target affinity and to compare data across sources.
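The pX transformation can be sketched in a few lines. The example assumes the usual convention that the concentration is expressed in molar units before the logarithm is taken, which is consistent with a pX of 7 corresponding to 100 nM. The function name to_px and the unit table are illustrative and are not part of Reaxys.

```python
import math

# Conversion factors from common concentration units to molar (mol/L).
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def to_px(value, unit="nM"):
    """Return -log10 of a concentration (EC50, IC50, Ki, ...) given in `unit`."""
    if value <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(value * UNIT_TO_MOLAR[unit])


px_100_nm = to_px(100, "nM")   # approximately 7
px_1_um = to_px(1, "uM")       # approximately 6
```

Because the scale is logarithmic, each unit increase in pX corresponds to a tenfold lower concentration, so higher pX values indicate more potent compounds.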
| Target | Number of bioactivities | Number of substances | Number of citations | Number of bioactivities with pX > 7 | Number of substances with pX > 7 |
An activity profile for the most potent inhibitors of all EGFR (ErbB) receptors, those with a pX value greater than 7.0 (affinity <100 nM), can be generated and viewed as a Heatmap (Figure 2). The Heatmap visualizes the relationships between compounds and their targets in terms of key parameters, allowing rapid identification of relevant compound–target interactions. The highest pX values were selected for display in the Heatmap.
In the Heatmap, biological affinities or activities are quantified as pX values and displayed on a color scale from 1 (low activity) in blue to 15 (high activity) in red. The color of each Heatmap cell represents the maximal pX retrieved for a given compound (row) against a given target (column). The thumbnail provides an overview of the entire Heatmap, with a panel highlighting the section of the map currently displayed on screen. The dataset can also be analyzed using the data density display, which shows the number of compounds retrieved per target.
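The logic behind the Heatmap cells and the data density display can be approximated outside the tool. The sketch below keeps, for each compound–target pair, the maximal pX above a threshold of 7.0, then counts compounds per target. The records, compound identifiers, and function names are hypothetical and do not reflect the Reaxys export format or any actual Reaxys data.

```python
from collections import defaultdict

# Hypothetical (compound, target, pX) records, invented for illustration.
records = [
    ("cpd-A", "ErbB1", 8.2),
    ("cpd-A", "ErbB1", 7.5),
    ("cpd-A", "ErbB2", 6.1),
    ("cpd-B", "ErbB1", 9.0),
    ("cpd-B", "ErbB2", 7.4),
    ("cpd-C", "ErbB4", 5.0),
]


def max_px_matrix(rows, threshold=7.0):
    """Map (compound, target) to the maximal pX, keeping only pX > threshold."""
    cells = {}
    for compound, target, px in rows:
        if px <= threshold:
            continue
        key = (compound, target)
        cells[key] = max(px, cells.get(key, px))
    return cells


def data_density(cells):
    """Count how many compounds have a retained cell for each target."""
    counts = defaultdict(int)
    for _compound, target in cells:
        counts[target] += 1
    return dict(counts)


cells = max_px_matrix(records)
density = data_density(cells)
```

Storing the cells as a dictionary keyed by (compound, target) keeps the sketch dependency-free; a plotting or data-frame library could render the same structure as a colored grid.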