
10 - QSAR in drug discovery

from PART II - COMPUTATIONAL CHEMISTRY METHODOLOGY

Published online by Cambridge University Press: 06 July 2010

Kenneth M. Merz, Jr (University of Florida)
Dagmar Ringe (Brandeis University, Massachusetts)
Charles H. Reynolds (Johnson & Johnson Pharmaceutical Research & Development)

Summary

INTRODUCTION

With nearly fifty years of methodological development and application behind it (the Hansch article of 1963 is often considered the first in the field), quantitative structure/activity relationship (QSAR) modeling is a well-established area of research. As is perhaps true of any computational field, QSAR modeling has been both blessed and sometimes cursed in the literature. In the first volume of the well-known book series Reviews in Computational Chemistry, Boyd summarized several documented cases in which QSAR modeling was instrumental in discovering new drugs or drug candidates in advanced phases of clinical trials. The methodologies used at that time were relatively simple, employing a small number of physicochemical descriptors and statistical methods such as multiple linear regression. QSAR modeling was viewed solely as a tool for lead optimization; that is, it was employed to elucidate the relationship between structure and activity in relatively small congeneric compound series and to predict relatively small structural modifications leading to enhanced activity.
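To make the early approach concrete, the following is a minimal sketch of a Hansch-style multiple linear regression on two physicochemical descriptors. Everything in it is an assumption made for illustration: the descriptor values (logP and Hammett sigma), the coefficients used to generate the activities, and the helper names `fit_mlr` and `predict` are hypothetical and do not come from the chapter or from any real compound series.

```python
# Illustrative only: the descriptor values and activities below are synthetic,
# generated from made-up coefficients; they are not data from the chapter.

def fit_mlr(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    X is a list of descriptor rows; an intercept column is appended here.
    Returns coefficients in descriptor order, with the intercept last.
    """
    rows = [list(r) + [1.0] for r in X]
    p = len(rows[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("descriptors are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(p):
            if k != col:
                f = A[k][col] / A[col][col]
                A[k] = [a - f * b for a, b in zip(A[k], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(coefs, descriptors):
    """Apply the fitted linear model to one compound's descriptors."""
    return sum(c * d for c, d in zip(coefs, descriptors)) + coefs[-1]


# Hypothetical congeneric series: (logP, Hammett sigma) per compound.
descriptors = [(1.2, -0.17), (1.9, 0.00), (2.4, 0.23), (2.8, 0.54),
               (3.1, 0.06), (3.6, 0.78)]
# Synthetic activities log(1/C) built from assumed coefficients 0.8, 1.5, 2.0.
activity = [0.8 * lp + 1.5 * s + 2.0 for lp, s in descriptors]

coefs = fit_mlr(descriptors, activity)
new_analog = (3.0, 0.40)  # a hypothetical small structural modification
predicted_activity = predict(coefs, new_analog)
print("coefficients (logP, sigma, intercept):", [round(c, 3) for c in coefs])
print("predicted log(1/C):", round(predicted_activity, 3))
```

Because the synthetic activities are noise-free, the fit simply recovers the assumed coefficients. With real assay data the same procedure would give estimates with error, and the model would need validation before being used to rank proposed analogs.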

Since the late 1980s the field has changed dramatically, driven by changes in the size, complexity, and availability of experimental data sets of biologically active compounds. These changes have coincided with advances in chemometrics, resulting in a significant increase in the number of chemical descriptors and in the growing use of machine learning and advanced statistical modeling techniques in QSAR studies.

Type: Chapter
Information: Drug Design: Structure- and Ligand-Based Approaches, pp. 151–164
Publisher: Cambridge University Press
Print publication year: 2010


References

Hansch, C.; Streich, M.; Geiger, F.; Muir, R. M.; Maloney, P. P.; Fujita, T. Correlation of biological activity of plant growth regulators and chloromycetin derivatives with Hammett constants and partition coefficients. J. Am. Chem. Soc. 1963, 85, 2817–2824.
Boyd, D. Successes of computer-assisted molecular design. In: Reviews in Computational Chemistry, Boyd, D.; Lipkowitz, K. B.; Eds. New York, NY: VCH; 1990, 355–371.
Doweyko, A. M. QSAR: dead or alive? J. Comput. Aided Mol. Des. 2008, 22, 81–89.
Tropsha, A. Recent trends in quantitative structure-activity relationships. In: Burger's Medicinal Chemistry and Drug Discovery, Abraham, D.; Ed. New York, NY: John Wiley & Sons, 2003; 49–77.
Stouch, T. R.; Kenyon, J. R.; Johnson, S. R.; Chen, X. Q.; Doweyko, A.; Li, Y. In silico ADME/Tox: why models fail. J. Comput. Aided Mol. Des. 2003, 17, 83–92.
Jorgensen, W. L.; Tirado-Rives, J. QSAR/QSPR and proprietary data. J. Chem. Inf. Model. 2006, 46, 937.
Maggiora, G. M. On outliers and activity cliffs: why QSAR often disappoints. J. Chem. Inf. Model. 2006, 46, 1535.
Johnson, S. R. The trouble with QSAR (or how I learned to stop worrying and embrace fallacy). J. Chem. Inf. Model. 2008, 48, 25–26.
PubChem. http://pubchem.ncbi.nlm.nih.gov/. 2008.
Roth, B. L.; Kroeze, W. K. Screening the receptorome yields validated molecular targets for drug discovery. Curr. Pharm. Des. 2006, 12, 1785–1795.
NCI. http://dtp.nci.nih.gov/docs/3d_database/structural_information/smiles_strings.html. 2007.
FDA. http://www.fda.gov/cder/Offices/OPS_IO/. 2005.
NTP. http://ntp.niehs.nih.gov/ntpweb/. 2005.
DSSTox. http://www.epa.gov/nheerl/dsstox/About.html. 2005.
Oprea, T.; Tropsha, A. Target, chemical and bioactivity databases: integration is key. Drug Discov. Today 2006, 3, 357–365.
Hansch, C.; Fujita, T. ρ-σ-π analysis: a method for the correlation of biological activity and chemical structure. J. Am. Chem. Soc. 1964, 86, 1616–1626.
Zheng, W.; Tropsha, A. Novel variable selection quantitative structure–property relationship approach based on the k-nearest-neighbor principle. J. Chem. Inf. Comput. Sci. 2000, 40, 185–194.
Tropsha, A. Application of predictive QSAR models to database mining. In: Cheminformatics in Drug Discovery, Oprea, T.; Ed. Weinheim: Wiley-VCH; 2005, 437–455.
Tropsha, A. Predictive QSAR (quantitative structure activity relationships) modeling. In: Comprehensive Medicinal Chemistry II, Martin, Y. C.; Ed. Amsterdam: Elsevier, 2006; 113–126.
Papa, E.; Villa, F.; Gramatica, P. Statistically validated QSARs, based on theoretical descriptors, for modeling aquatic toxicity of organic chemicals in Pimephales promelas (fathead minnow). J. Chem. Inf. Model. 2005, 45, 1256–1266.
Tetko, I. V. Neural network studies. 4. Introduction to associative neural networks. J. Chem. Inf. Comput. Sci. 2002, 42, 717–728.
Zupan, J.; Novic, M.; Gasteiger, J. Neural networks with counter-propagation learning-strategy used for modeling. Chemometrics Intelligent Lab. Syst. 1995, 27(2), 175–187.
Devillers, J. Strengths and weaknesses of the back propagation neural network in QSAR and QSPR studies. In: Genetic Algorithms in Molecular Modeling, Devillers, J.; Ed. San Diego, CA: Academic Press; 1996; 1–24.
Engels, M. F. M.; Wouters, L.; Verbeeck, R.; Vanhoof, G. Outlier mining in high throughput screening experiments. J. Biomol. Screen. 2002, 7, 341–351.
Schuurmann, G.; Aptula, A. O.; Kuhne, R.; Ebert, R. U. Stepwise discrimination between four modes of toxic action of phenols in the Tetrahymena pyriformis assay. Chem. Res. Toxicol. 2003, 16, 974–987.
Xue, Y.; Li, H.; Ung, C. Y.; Yap, C. W.; Chen, Y. Z. Classification of a diverse set of Tetrahymena pyriformis toxicity chemical compounds from molecular descriptors by statistical learning methods. Chem. Res. Toxicol. 2006, 19, 1030–1039.
Breiman, L.; Friedman, J. H.; Olshen, R. A.; Stone, C. J. Classification and Regression Trees. Florence, KY: Wadsworth; 1984.
Deconinck, E.; Hancock, T.; Coomans, D.; Massart, D. L.; Vander Heyden, Y. Classification of drugs in absorption classes using the classification and regression trees (CART) methodology. J. Pharm. Biomed. Anal. 2005, 39, 91–103.
MOE. http://www.chemcomp.com/fdept/prodinfo.htm#Cheminformatics. 2005.
Put, R.; Perrin, C.; Questier, F.; Coomans, D.; Massart, D. L.; Vander Heyden, Y. V. Classification and regression tree analysis for molecular descriptor selection and retention prediction in chromatographic quantitative structure-retention relationship studies. J. Chromatogr. A 2003, 988, 261–276.
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
Svetnik, V.; Liaw, A.; Tong, C.; Culberson, J. C.; Sheridan, R. P.; Feuston, B. P. Random forest: a classification and regression tool for compound classification and QSAR modeling. J. Chem. Inf. Comput. Sci. 2003, 43, 1947–1958.
Put, R.; Xu, Q. S.; Massart, D. L.; Heyden, Y. V. Multivariate adaptive regression splines (MARS) in chromatographic quantitative structure-retention relationship studies. J. Chromatogr. A 2004, 1055, 11–19.
Friedman, J. H. Multivariate adaptive regression splines. Ann. Stat. 1991, 19, 1–67.
Vapnik, V. N. The Nature of Statistical Learning Theory. New York, NY: Springer-Verlag; 1995.
Aires-de-Sousa, J.; Gasteiger, J. Prediction of enantiomeric excess in a combinatorial library of catalytic enantioselective reactions. J. Comb. Chem. 2005, 7, 298–301.
Oloff, S.; Mailman, R. B.; Tropsha, A. Application of validated QSAR models of D1 dopaminergic antagonists for database mining. J. Med. Chem. 2005, 48, 7322–7332.
Chohan, K. K.; Paine, S. W.; Waters, N. J. Quantitative structure activity relationships in drug metabolism. Curr. Top. Med. Chem. 2006, 6, 1569–1578.
Golbraikh, A.; Tropsha, A. Beware of q2! J. Mol. Graph. Model. 2002a, 20, 269–276.
Tropsha, A.; Gramatica, P.; Gombar, V. K. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 2003, 22, 69–77.
Kubinyi, H.; Hamprecht, F. A.; Mietzner, T. Three-dimensional quantitative similarity-activity relationships (3D QSiAR) from SEAL similarity matrices. J. Med. Chem. 1998, 41, 2553–2564.
Novellino, E.; Fattorusso, C.; Greco, G. Use of comparative molecular field analysis and cluster analysis in series design. Pharm. Acta Helv. 1995, 70, 149–154.
Norinder, U. Single and domain mode variable selection in 3D QSAR applications. J. Chemomet. 1996, 10, 95–105.
Tropsha, A.; Cho, S. J. Cross-validated R2-guided region selection for CoMFA studies. In: 3D QSAR in Drug Design, Vol. III, Kubinyi, H.; Folkers, G.; Martin, Y. C.; Eds. Dordrecht: Kluwer Academic; 1998, 57–69.
Golbraikh, A.; Tropsha, A. Predictive QSAR modeling based on diversity sampling of experimental datasets for the training and test set selection. J. Comput. Aided Mol. Des. 2002b, 16, 357–369.
Golbraikh, A.; Shen, M.; Xiao, Z.; Xiao, Y. D.; Lee, K. H.; Tropsha, A. Rational selection of training and test sets for the development of validated QSAR models. J. Comput. Aided Mol. Des. 2003b, 17, 241–253.
Pavan, M.; Netzeva, T. I.; Worth, A. P. Validation of a QSAR model for acute toxicity. SAR QSAR Environ. Res. 2006, 17, 147–171.
Vracko, M.; Bandelj, V.; Barbieri, P.; Benfenati, E.; Chaudhry, Q.; Cronin, M.; Devillers, J.; Gallegos, A.; Gini, G.; Gramatica, P.; Helma, C.; Mazzatorta, P.; Neagu, D.; Netzeva, T.; Pavan, M.; Patlewicz, G.; Randic, M.; Tsakovska, I.; Worth, A. Validation of counter propagation neural network models for predictive toxicology according to the OECD principles: a case study. SAR QSAR Environ. Res. 2006, 17, 265–284.
Saliner, A. G.; Netzeva, T. I.; Worth, A. P. Prediction of estrogenicity: validation of a classification model. SAR QSAR Environ. Res. 2006, 17, 195–223.
Roberts, D. W.; Aptula, A. O.; Patlewicz, G. Mechanistic applicability domains for non-animal based prediction of toxicological endpoints: QSAR analysis of the schiff base applicability domain for skin sensitization. Chem. Res. Toxicol. 2006, 19, 1228–1233.
Estrada, E.; Patlewicz, G. On the usefulness of graph-theoretic descriptors in predicting theoretical parameters: phototoxicity of polycyclic aromatic hydrocarbons (PAHs). Croat. Chem. Acta 2004, 77, 203–211.
Moss, G. P.; Cronin, M. T. D. Quantitative structure-permeability relationships for percutaneous absorption: re-analysis of steroid data. Int. J. Pharm. 2002, 238, 105–109.
Leo, A. J.; Hansch, C. Role of hydrophobic effects in mechanistic QSAR. Perspectives in Drug Discov. Des. 1999, 17, 1–25.
Zhang, S.; Golbraikh, A.; Tropsha, A. Development of quantitative structure-binding affinity relationship models based on novel geometrical chemical descriptors of the protein-ligand interfaces. J. Med. Chem. 2006b, 49, 2713–2724.
Golbraikh, A.; Bonchev, D.; Tropsha, A. Novel chirality descriptors derived from molecular topology. J. Chem. Inf. Comput. Sci. 2001, 41, 147–158.
Kovatcheva, A.; Buchbauer, G.; Golbraikh, A.; Wolschann, P. QSAR modeling of alpha-campholenic derivatives with sandalwood odor. J. Chem. Inf. Comput. Sci. 2003, 43, 259–266.
Shen, M.; Xiao, Y.; Golbraikh, A.; Gombar, V. K.; Tropsha, A. Development and validation of k-nearest-neighbor QSPR models of metabolic stability of drug candidates. J. Med. Chem. 2003, 46, 3013–3020.
Shen, M.; LeTiran, A.; Xiao, Y.; Golbraikh, A.; Kohn, H.; Tropsha, A. Quantitative structure-activity relationship analysis of functionalized amino acid anticonvulsant agents using k nearest neighbor and simulated annealing PLS methods. J. Med. Chem. 2002, 45, 2811–2823.
Shen, M.; Beguin, C.; Golbraikh, A.; Stables, J. P.; Kohn, H.; Tropsha, A. Application of predictive QSAR models to database mining: identification and experimental validation of novel anticonvulsant compounds. J. Med. Chem. 2004, 47, 2356–2364.
Zhang, S.; Golbraikh, A.; Oloff, S.; Kohn, H.; Tropsha, A. A novel automated lazy learning QSAR (ALL-QSAR) approach: method development, applications, and virtual screening of chemical databases using validated ALL-QSAR models. J. Chem. Inf. Model. 2006a, 46, 1984–1995.
Golbraikh, A.; Shen, M.; Xiao, Z.; Xiao, Y. D.; Lee, K. H.; Tropsha, A. Rational selection of training and test sets for the development of validated QSAR models. J. Comput. Aided Mol. Des. 2003a, 17, 241–253.
Mandel, J. Use of the singular value decomposition in regression-analysis. Am. Stat. 1982, 36, 15–24.
Afantitis, A.; Melagraki, G.; Sarimveis, H.; Koutentis, P. A.; Markopoulos, J.; Igglessi-Markopoulou, O. A novel QSAR model for predicting induction of apoptosis by 4-aryl-4H-chromenes. Bioorg. Med. Chem. 2006, 14, 6686–6694.
Netzeva, T. I.; Gallegos, S. A.; Worth, A. P. Comparison of the applicability domain of a quantitative structure-activity relationship for estrogenicity with a large chemical inventory. Environ. Toxicol. Chem. 2006, 25, 1223–1230.
Tong, W.; Xie, Q.; Hong, H.; Shi, L.; Fang, H.; Perkins, R. Assessment of prediction confidence and domain extrapolation of two structure-activity relationship models for predicting estrogen receptor binding activity. Environ. Health Perspect. 2004, 112, 1249–1254.
Helma, C. Lazy structure-activity relationships (lazar) for the prediction of rodent carcinogenicity and Salmonella mutagenicity. Mol. Divers. 2006, 10, 147–158.
Zhu, H.; Tropsha, A.; Fourches, D.; Varnek, A.; Papa, E.; Gramatica, P.; Oberg, T.; Dao, P.; Cherkasov, A.; Tetko, I. V. Combinatorial QSAR modeling of chemical toxicants tested against Tetrahymena pyriformis. J. Chem. Inf. Model. 2008, 48, 766–784.
Wang, X. S.; Tang, H.; Golbraikh, A.; Tropsha, A. Combinatorial QSAR modeling of specificity and subtype selectivity of ligands binding to serotonin receptors 5HT1E and 5HT1F. J. Chem. Inf. Model. 2008, 48, 997–1013.
Cerqueira, L. P.; Golbraikh, A.; Oloff, S.; Xiao, Y.; Tropsha, A. Combinatorial QSAR modeling of P-glycoprotein substrates. J. Chem. Inf. Model. 2006, 46, 1245–1254.
Kovatcheva, A.; Golbraikh, A.; Oloff, S.; Xiao, Y. D.; Zheng, W.; Wolschann, P.; Buchbauer, G.; Tropsha, A. Combinatorial QSAR of ambergris fragrance compounds. J. Chem. Inf. Comput. Sci. 2004, 44, 582–595.
Sachs, L. Handbook of Statistics. New York, NY: Springer-Verlag; 1984.
Zhu, H.; Tropsha, A.; Fourches, D.; Varnek, A.; Papa, E.; Gramatica, P.; Oberg, T.; Dao, P.; Cherkasov, A.; Tetko, I. V. Combinatorial QSAR modeling of chemical toxicants tested against Tetrahymena pyriformis. J. Chem. Inf. Model. 2008, 48(4), 766–784.
Aptula, A. O.; Roberts, D. W.; Cronin, M. T. D.; Schultz, T. W. Chemistry-toxicity relationships for the effects of di- and trihydroxybenzenes to Tetrahymena pyriformis. Chem. Res. Toxicol. 2005, 18, 844–854.
Netzeva, T. I.; Schultz, T. W. QSARs for the aquatic toxicity of aromatic aldehydes from Tetrahymena data. Chemosphere 2005, 61, 1632–1643.
Schultz, T. W.; Sinks, G. D.; Miller, L. A. Population growth impairment of sulfur-containing compounds to Tetrahymena pyriformis. Environ. Toxicol. 2001, 16, 543–549.
Schultz, T. W.; Cronin, M. T.; Netzeva, T. I.; Aptula, A. O. Structure-toxicity relationships for aliphatic chemicals evaluated with Tetrahymena pyriformis. Chem. Res. Toxicol. 2002, 15, 1602–1609.
Schultz, T. W.; Netzeva, T. I. Development and evaluation of QSARs for ecotoxic endpoints: the benzene response-surface model for Tetrahymena toxicity. In: Modeling Environmental Fate and Toxicity, Cronin, M. T. D.; Livingstone, D. J.; Eds. Boca Raton, FL: CRC Press; 2004, 265–284.
Schultz, T. W.; Netzeva, T. I.; Roberts, D. W.; Cronin, M. T. Structure-toxicity relationships for the effects to Tetrahymena pyriformis of aliphatic, carbonyl-containing, alpha,beta-unsaturated chemicals. Chem. Res. Toxicol. 2005, 18, 330–341.
Schultz, T. W.; Yarbrough, J. W.; Woldemeskel, M. Toxicity to Tetrahymena and abiotic thiol reactivity of aromatic isothiocyanates. Cell Biol. Toxicol. 2005, 21, 181–189.
Schultz, T. W. Structure-toxicity relationships for benzenes evaluated with Tetrahymena pyriformis. Chem. Res. Toxicol. 1999, 12, 1262–1267.
Schultz, T. W.; Hewitt, M.; Netzeva, T. I.; Cronin, M. T. D. Assessing applicability domains of toxicological QSARs: definition, confidence in predicted values, and the role of mechanisms of action. QSAR Comb. Sci. 2007, 26, 238–254.
Gramatica, P. Principles of QSAR models validation: internal and external. QSAR Comb. Sci. 2007, 26, 694–701.
Yang, C.; Richard, A. M.; Cross, K. P. The art of data mining the minefields of toxicity databases to link chemistry to biology. Curr. Comput. Aided Drug Des. 2006, 2, 135–150.
Irwin, J. J.; Shoichet, B. K. ZINC – a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 2005, 45, 177–182.
Medina-Franco, J. L.; Golbraikh, A.; Oloff, S.; Castillo, R.; Tropsha, A. Quantitative structure-activity relationship analysis of pyridinone HIV-1 reverse transcriptase inhibitors using the k nearest neighbor method and QSAR-based database mining. J. Comput. Aided Mol. Des. 2005, 19, 229–242.
Zhang, S.; Wei, L.; Bastow, K.; Zheng, W.; Brossi, A.; Lee, K. H.; Tropsha, A. Antitumor Agents 252. Application of validated QSAR models to database mining: discovery of novel tylophorine derivatives as potential anticancer agents. J. Comput. Aided Mol. Des. 2007, 21, 97–112.
Hsieh, J. H.; Wang, X. S.; Teotico, D.; Golbraikh, A.; Tropsha, A. Differentiation of AmpC beta-lactamase binders vs. decoys using classification kNN QSAR modeling and application of the QSAR classifier to virtual screening. J. Comput. Aided Mol. Des. 2008, 22(9), 593–609.
Tropsha, A.; Cho, S. J.; Zheng, W. “New tricks for an old dog”: development and application of novel QSAR methods for rational design of combinatorial chemical libraries and database mining. In: Rational Drug Design: Novel Methodology and Practical Applications, Parrill, A. L.; Reddy, M. R.; Eds. Washington, DC: American Chemical Society; 1999, 198–211.
Cho, S. J.; Zheng, W.; Tropsha, A. Rational combinatorial library design. 2. Rational design of targeted combinatorial peptide libraries using chemical similarity probe and the inverse QSAR approaches. J. Chem. Inf. Comput. Sci. 1998, 38, 259–268.
Gussio, R.; Pattabiraman, N.; Kellogg, G. E.; Zaharevitz, D. W. Use of 3D QSAR methodology for data mining the National Cancer Institute Repository of Small Molecules: application to HIV-1 reverse transcriptase inhibition. Methods 1998, 14, 255–263.
Tropsha, A.; Zheng, W. Identification of the descriptor pharmacophores using variable selection QSAR: applications to database mining. Curr. Pharm. Des. 2001, 7, 599–612.
Maybridge. http://www.daylight.com/products/databases/Maybridge.html. 2005.
Babaoglu, K.; Simeonov, A.; Irwin, J. J.; Nelson, M. E.; Feng, B.; Thomas, C. J.; Cancian, L.; Costi, M. P.; Maltby, D. A.; Jadhav, A.; Inglese, J.; Austin, C. P.; Shoichet, B. K. Comprehensive mechanistic analysis of hits from high-throughput and docking screens against beta-lactamase. J. Med. Chem. 2008, 51, 2502–2511.
Austin, C. P.; Brady, L. S.; Insel, T. R.; Collins, F. S. NIH Molecular Libraries Initiative. Science 2004, 306, 1138–1139.
