QSAR Modeling of ErbB1 Inhibitors


Application note

QSAR Modeling of ErbB1 Inhibitors Using Genetic Algorithm-Based Regression

Gaining efficiency in quantitative structure–activity relationships

ErbB1 kinase is the cell-surface receptor for epidermal growth factor. Its dysfunction has been implicated in diseases such as cancer. This paper illustrates the use of the Reaxys® Medicinal Chemistry knowledgebase to build efficient QSAR models using ErbB1 kinase as an example.

QSAR relates molecular descriptors to biological activity for known compounds.

Reaxys Medicinal Chemistry can provide computational chemists with large and diverse datasets for building predictive models. These models can be used for virtual screening or during lead optimization. In brief, QSAR (quantitative structure–activity relationship) modeling is based on a mathematical equation that relates molecular descriptors to biological activity for a known series of compounds, creating a model for evaluating the activity of new chemical entities (1, 2). This paper illustrates the use of the Reaxys Medicinal Chemistry knowledgebase to build efficient QSAR models. ErbB1 kinase will be used as an example.

1. Engel, T. (2006) Basic Overview of Cheminformatics. J. Chem. Inf. Model. 46: 2267–2277.
2. Polanski, J., Bak, A., Gielciak, R. and Magdiarz, T. (2006) Modeling robust QSAR. J. Chem. Inf. Model. 46: 2310–2318.
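The idea of an equation that maps descriptors to activity can be illustrated with a minimal sketch. It fits a straight line between a single hypothetical descriptor and pX by ordinary least squares, then scores a new compound. The data are synthetic and invented for illustration, the function names `fit_line` and `predict` are assumptions, and plain least squares stands in here for whatever regression method a real study would use.

```python
# Synthetic, purely illustrative data: one hypothetical descriptor value per
# compound and a made-up pX activity. None of these numbers come from Reaxys.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity_px = [5.1, 5.9, 7.1, 7.9, 9.0]


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x_new):
    """Estimate the activity of a new compound from its descriptor value."""
    slope, intercept = model
    return slope * x_new + intercept


model = fit_line(descriptor, activity_px)
predicted_px = predict(model, 6.0)
```

A practical QSAR model would use many descriptors and be validated on compounds held out from the fit. The single-descriptor case only shows the shape of the relationship.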

Dataset selection

The first step in building a suitable dataset for QSAR modeling is to collect all pertinent chemical and biological information relating to the ErbB1 target. Searching by target name followed by selecting the specific isoform enables easy retrieval of all the relevant chemical and biological data for ErbB1 (Figure 1). Similar searches were conducted for the other ErbB kinases to illustrate the extent of chemical and biological data available in Reaxys Medicinal Chemistry (Table 1).

To facilitate comparisons of bioactivity data from different publications and assay types, all in vitro data points in Reaxys Medicinal Chemistry have pX values. pX values are calculated by transforming parameters such as EC50, IC50 and Ki into their negative logarithms (pEC50, pIC50, pKi). These normalized values make it easy to quantify compound–target affinity and to compare data drawn from different sources.
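The transformation can be sketched as follows, assuming the usual convention that the concentration is expressed in mol/L before the negative base-10 logarithm is taken. This is consistent with the pX > 7 and <100 nM equivalence used later in this note. The helper names `to_px` and `px_to_nm` and the unit table are illustrative and are not part of any Reaxys interface.

```python
import math

# Multipliers that convert a concentration in the given unit to mol/L.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def to_px(value, unit="nM"):
    """Return pX = -log10(concentration in mol/L) for an IC50, EC50 or Ki."""
    if value <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(value * UNIT_TO_MOLAR[unit])


def px_to_nm(px):
    """Inverse transform: the concentration in nM that corresponds to a pX."""
    return 10 ** (9 - px)
```

For example, `to_px(100, "nM")` evaluates to approximately 7, and `px_to_nm(7)` gives 100 nM. A higher pX therefore means a lower concentration and a more potent compound.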

Figure 1. EGFR (ErbB) kinase query (Note: VEGFR not included in this search)
| Target | Number of bioactivities | Number of substances | Number of citations | Number of bioactivities with pX > 7 | Number of substances with pX > 7 |
|---|---|---|---|---|---|
| EGFR all* | 201,803 | 80,905 | 3,147 | 23,049 | 10,099 |
| ErbB1 | 4,255 | 3,776 | 271 | 749 | 659 |
| ErbB2 | 57,687 | 34,251 | 1,190 | 4,494 | 3,154 |
| ErbB3 | 2,295 | 2,201 | 103 | 42 | 20 |
| ErbB4 | 15,295 | 11,252 | 485 | 344 | 272 |

Table 1. Statistics on the EGFR (ErbB) kinase data available in Reaxys Medicinal Chemistry. *VEGFR is not included, but ErbB variants, isoforms and mutants are included.

An activity profile for the most potent inhibitors of all EGFR (ErbB) receptors with a pX value of greater than 7.0 (affinity <100 nM) can be generated and viewed as a Heatmap (Figure 2). The Heatmap visualizes the relationships between compounds and their targets in terms of key parameters, allowing rapid identification of relevant compound–target interactions. The highest pX values were selected for display in the Heatmap.

Figure 2. Heatmap for inhibitors of all EGFR kinases with pX activity above 7.0 (affinity <100 nM)

In the Heatmap, biological affinities or activities are quantified as pX values and displayed on a color scale from 1 (low activity) in blue to 15 (high activity) in red. The color of each Heatmap cell represents the maximal pX retrieved for a given compound (row) against a given target (column). The thumbnail provides an overview of the entire Heatmap, with a panel highlighting the section of the map currently displayed on screen. The dataset can also be analyzed using the data density display, which shows the number of compounds retrieved per target.
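The logic behind such a compound-by-target view can be sketched in a few lines. The sketch keeps the maximal pX per compound and target pair, filters cells above a potency threshold, and counts compounds per target. The records are hypothetical placeholders rather than Reaxys data, and the function names are assumptions made for illustration. It does not reproduce how the Heatmap itself is implemented.

```python
from collections import defaultdict

# Hypothetical (compound, target, pX) records; not real Reaxys data.
records = [
    ("cpd-A", "ErbB1", 8.2),
    ("cpd-A", "ErbB1", 7.5),
    ("cpd-A", "ErbB2", 6.1),
    ("cpd-B", "ErbB1", 6.9),
    ("cpd-B", "ErbB2", 7.4),
    ("cpd-C", "ErbB4", 9.0),
]


def max_px_matrix(rows):
    """Keep the highest pX seen for each (compound, target) pair."""
    matrix = defaultdict(dict)
    for compound, target, px in rows:
        best = matrix[compound].get(target)
        if best is None or px > best:
            matrix[compound][target] = px
    return {compound: dict(cells) for compound, cells in matrix.items()}


def potent_cells(matrix, threshold=7.0):
    """Keep only cells with pX above the threshold; drop empty compounds."""
    filtered = {}
    for compound, cells in matrix.items():
        kept = {t: px for t, px in cells.items() if px > threshold}
        if kept:
            filtered[compound] = kept
    return filtered


def compounds_per_target(matrix):
    """Data density: the number of compounds with a value for each target."""
    counts = defaultdict(int)
    for cells in matrix.values():
        for target in cells:
            counts[target] += 1
    return dict(counts)


matrix = max_px_matrix(records)
potent = potent_cells(matrix)
density = compounds_per_target(matrix)
```

Taking the maximum rather than the mean matches the description above, where each cell shows the best value reported for a compound against a target.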