Encyclopedia of Nanotechnology DOI 10.1007/978-94-007-6178-0_100975-1 © Springer Science+Business Media Dordrecht 2015

Computational Chemistry for Drug Discovery
Giulia Palermo (a, b) and Marco De Vivo (a, *)
(a) Department of Drug Discovery and Development - CompuNet, Istituto Italiano di Tecnologia, Genoa, Italy
(b) Laboratory of Computational Chemistry and Biochemistry, Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
(*) Corresponding author

Synonyms
Drug design; Molecular modeling

Definition
Computational chemistry uses physics-based algorithms and computers to simulate chemical events and calculate chemical properties of atoms and molecules. In drug design and discovery, diverse computational chemistry approaches are used to predict events such as the binding of a drug to its target and to calculate the chemical properties needed to design potential new drugs.

Overview
Computational methods are now routinely used to accelerate the long and costly drug discovery process. Typically, once the drug discovery target is selected, drug discovery activities are divided into (1) the hit identification phase, in which the aim is to identify chemical compounds with promising activity toward the target; (2) the lead generation phase, in which hit compounds are improved in potency against the target; and, finally, (3) the lead optimization phase, in which lead compounds are optimized to generate drug-like molecules ultimately able to exert their beneficial pharmacological effect in patients (Fig. 1). Computations can help in all these drug discovery activities, from the identification of the drug target, commonly a receptor or an enzyme, to the design and optimization of a new drug-like compound. While computational methods for target identification rely mainly on computational sciences such as bioinformatics and computational genomics, different computational approaches are used once the target has been identified and the search for small-molecule inhibitors has commenced, starting from the hit identification phase and moving toward the lead generation and optimization phases (Fig. 1). At that point, routinely applied computational chemistry approaches include methods for structure-based drug design (SBDD), when structural data of the target protein are available, and ligand-based drug design (LBDD), when structural information on the target is missing or not fully reliable. Overall, these methods facilitate the identification of promising chemical scaffolds that interfere favorably with the target’s function, producing a positive pharmacological effect.
Experimental biochemical and pharmacological data on the new compounds, such as their in vitro inhibitory potency and in vivo efficacy, can be used to check the computational predictions while also forming the basis upon which better models can be constructed, leading to the design of superior compounds [1].
The impact of computational chemistry on drug discovery has been intensified in the last few decades by the rapid development of faster architectures and better algorithms for time-affordable high-level computations. Several theoretical methods that were once prohibitive for effective drug discovery are now increasingly used for hit identification and lead generation. Recently, for example, CPU-intensive and GPU-based free energy perturbation (FEP) calculations have been applied to accurately estimate the binding free energy of closely related chemical analogs, generating very promising results for lead generation and optimization [2]. Also, classical molecular dynamics (MD) is currently proposed as a practical computational tool for studying the energetics and kinetics of a ligand binding to a target protein. This is relevant for lead optimization, representing a new frontier in computationally driven drug discovery. Recently, in fact, compounds have been evaluated not only for their ability to bind tightly to the target but also for their capacity to remain bound to the target for a long time (i.e., considering the k_on and k_off of binding), increasing the chances of efficacy in vivo. Finally, in the last decade, quantum mechanics (QM) has become ever more accessible for performing SBDD for lead generation and optimization. For example, QM and hybrid QM/molecular mechanics (QM/MM) methods are increasingly used to study the interaction of covalent inhibitors with a drug discovery target.
It is challenging to tailor a perfect fit between a new compound and its target in order to generate potent inhibitory effects (i.e., high affinity), which is the goal during the lead generation phase. However, there are several other challenges during the lead optimization phase. A potent inhibitor is not a drug. There are other physicochemical variables that dictate the pharmacokinetics (PK) of each compound, affecting its drug-likeness and, ultimately, its efficacy and safety in vivo. Absorption, distribution, metabolism, excretion, and toxicity (ADMET) are key parameters that need to be optimized to generate a drug candidate with good chances of success in clinical trials.
ADMET prediction at early stages of the drug discovery process is key to preventing, or at least limiting, later failures in costly clinical trials. In this respect, chemometrics and quantitative structure–activity relationship (QSAR) approaches play a prominent role in creating predictive models for selecting and prioritizing compounds, typically during the lead optimization phase. Thus, each computational chemistry method can impact and accelerate a given phase of the drug discovery process, from docking and MD for hit identification and lead generation to QSAR for ADMET optimization (see Fig. 1).
Detailed methodological descriptions of each method can be found in many excellent review articles and books that focus on the theoretical background of computational chemistry [3–5]. This essay aims instead to comprehensively outline the applicability of the computational chemistry methods and approaches now used to accelerate drug design and discovery, with particular emphasis on SBDD. The point is to show how each computational approach is best suited to a particular phase of the drug discovery pipeline, from SBDD for the hit identification and lead generation phases to QSAR methods for lead optimization, where drug-like properties are tuned to generate a promising drug candidate. The everyday use of once-prohibitive computational methods, such as MD- and QM-based methods for SBDD, will also be highlighted.
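As a small illustration of the binding quantities mentioned in this overview (binding free energy, k_on, and k_off), the following Python sketch relates the two rate constants to the dissociation constant, the standard binding free energy, and the residence time. It assumes simple one-step, reversible 1:1 binding and a 1 M standard state; the function name `binding_summary` and all numerical values are hypothetical and are not taken from the works cited here.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def binding_summary(k_on, k_off, temperature=298.15):
    """Summarize 1:1 binding from rate constants (illustrative sketch).

    k_on  : association rate constant, in 1/(M*s)
    k_off : dissociation rate constant, in 1/s
    Returns the dissociation constant K_d (M), the standard binding free
    energy (kcal/mol, 1 M standard state), and the residence time (s).
    """
    if k_on <= 0 or k_off <= 0:
        raise ValueError("rate constants must be positive")
    k_d = k_off / k_on                                # K_d = k_off / k_on
    delta_g = R_KCAL * temperature * math.log(k_d)    # dG = RT ln(K_d)
    residence_time = 1.0 / k_off                      # tau = 1 / k_off
    return {"K_d": k_d, "delta_G": delta_g, "residence_time": residence_time}


# Two hypothetical ligands with the same affinity but different kinetics.
slow_binder = binding_summary(k_on=1e6, k_off=1e-3)
fast_binder = binding_summary(k_on=1e8, k_off=1e-1)

for name, result in (("slow", slow_binder), ("fast", fast_binder)):
    print(name, result)
```

In this hypothetical case both ligands have the same affinity (K_d of 1 nM, about -12.3 kcal/mol at 298.15 K), yet the first one remains bound roughly 100 times longer, which is the kind of distinction that affinity alone does not capture.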

Fig. 1 Drug discovery pipeline, typically formed by three phases, namely, “hit identification,” “hit to lead,” and “lead optimization.” Different computational approaches can be applied to each phase of the drug discovery pipeline


Computational Methods for SBDD
Computational approaches to structure-based drug design (SBDD) rely on knowledge of the target protein structure, which is usually provided by high-resolution X-ray crystallography or NMR data. Through a detailed analysis of the interaction between the target structure and the (new) ligands, SBDD approaches allow informed decisions to be made in designing more potent (i.e., with high affinity for the target) and selective compounds. Therefore, these methods are mostly used for hit identification and during the hit-to-lead phase (Fig. 1). A wide range of computational chemistry approaches can be applied to SBDD, including force field-based methods such as molecular docking calculations, classical MD- or Monte Carlo (MC)-based simulations, and more sophisticated QM-based methods [1, 6]. Ultimately, the new rationally designed compounds must be experimentally evaluated to verify whether they are potent inhibitors. The experimental testing of each compound demonstrates the interdisciplinary nature of effective drug discovery. It is also essential for establishing and evaluating the predictive power of computational approaches in identifying and generating promising drug candidates.

Force Field-Based Approaches for SBDD
Computational approaches based on molecular mechanics (MM) allow the energy and several properties of molecular systems to be computed [3]. The basic functional form of a force field describes the potential energy of the system, which is determined as the sum of different contributions that are parameterized to reproduce experimental or ab initio data. The interactions are divided into bonded interactions and nonbonded interactions. The typical force field equation (Chart 1) contains the terms for the bonded (highlighted in red) and nonbonded (highlighted in blue) interactions. In detail, the bonded terms describe the chemical bonds between two neighboring atoms, bond angles between three atoms, and dihedral angles between four atoms. Improper dihedral angle terms can be additionally applied to maintain planar or tetrahedral conformations. The functional form of the bond and angle terms is quadratic, while the dihedral term uses a trigonometric function. In the typical force field equation, R is the distance between two atoms i and j that are bound together via a covalent bond, θ is the bond angle, R_eq and θ_eq refer to equilibrium bond lengths and angles, and K_R and K_θ are the vibrational constants. V_n is the torsional barrier corresponding to the nth barrier of a given torsional angle with phase γ. The last term of the typical force field equation refers to the nonbonded interactions, which are composed of a Lennard-Jones term for the van der Waals interactions and a Coulomb term for the electrostatic interactions between atoms i and j. Force fields like AMBER, OPLS, GROMOS, and CHARMM are extensively used to study standard biomolecular systems such as proteins, DNA, and RNA. Corrections and extensions are recurrently

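\[
V = \underbrace{\sum_{\text{bonds}} K_R (R - R_{eq})^2 + \sum_{\text{angles}} K_\theta (\theta - \theta_{eq})^2 + \sum_{\text{dihedrals}} \frac{V_n}{2}\left[1 + \cos(n\phi - \gamma)\right]}_{\text{bonded interactions}} + \underbrace{\sum_{i<j} \left[\frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}} + \frac{q_i q_j}{\varepsilon R_{ij}}\right]}_{\text{non-bonded interactions}}
\]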
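The force field terms described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any particular package: the bond and angle terms are the quadratic forms given in the text, while the specific dihedral form (V_n/2 with a cosine) and the A/R^12 - B/R^6 Lennard-Jones form follow the widely used AMBER-style convention and are assumptions here. All function names, the Coulomb conversion factor, and the parameter values are hypothetical placeholders.

```python
import math

# Coulomb conversion factor in kcal*Angstrom/(mol*e^2). Approximate value;
# individual force fields use slightly different constants (assumption).
COULOMB_CONSTANT = 332.0637


def bond_energy(k_r, r, r_eq):
    """Quadratic bond-stretching term: K_R * (R - R_eq)^2."""
    return k_r * (r - r_eq) ** 2


def angle_energy(k_theta, theta, theta_eq):
    """Quadratic angle-bending term: K_theta * (theta - theta_eq)^2 (radians)."""
    return k_theta * (theta - theta_eq) ** 2


def dihedral_energy(v_n, n, phi, gamma):
    """Trigonometric torsion term: (V_n / 2) * [1 + cos(n*phi - gamma)]."""
    return 0.5 * v_n * (1.0 + math.cos(n * phi - gamma))


def nonbonded_energy(a_ij, b_ij, q_i, q_j, r_ij, dielectric=1.0):
    """Lennard-Jones plus Coulomb term for one atom pair (i, j)."""
    lennard_jones = a_ij / r_ij ** 12 - b_ij / r_ij ** 6
    coulomb = COULOMB_CONSTANT * q_i * q_j / (dielectric * r_ij)
    return lennard_jones + coulomb


def total_energy(bonds, angles, dihedrals, pairs):
    """Sum all contributions; each argument is a list of parameter tuples."""
    return (
        sum(bond_energy(*b) for b in bonds)
        + sum(angle_energy(*a) for a in angles)
        + sum(dihedral_energy(*d) for d in dihedrals)
        + sum(nonbonded_energy(*p) for p in pairs)
    )


# Hypothetical parameters for a toy system (not from any real force field).
example = total_energy(
    bonds=[(300.0, 1.1, 1.0)],        # (K_R, R, R_eq)
    angles=[],                        # (K_theta, theta, theta_eq)
    dihedrals=[(2.0, 3, 0.0, 0.0)],   # (V_n, n, phi, gamma)
    pairs=[],                         # (A_ij, B_ij, q_i, q_j, R_ij)
)
print(example)
```

Keeping each term in its own function mirrors the additive structure of the force field: the total potential energy is simply the sum of independent bonded and nonbonded contributions, each with its own parameters.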