Insights into Molecular Assembly of ACCase Heteromeric Complex in Chlorella variabilis—A Homology Modelling, Docking and Molecular Dynamic Simulation Study Namrata Misra, Prasanna Kumar Panda, Mahesh Chandra Patra, Sukanta Kumar Pradhan & Barada Kanta Mishra Applied Biochemistry and Biotechnology Part A: Enzyme Engineering and Biotechnology ISSN 0273-2289 Appl Biochem Biotechnol DOI 10.1007/s12010-013-0277-0


Author's personal copy Appl Biochem Biotechnol DOI 10.1007/s12010-013-0277-0

Insights into Molecular Assembly of ACCase Heteromeric Complex in Chlorella variabilis—A Homology Modelling, Docking and Molecular Dynamic Simulation Study Namrata Misra & Prasanna Kumar Panda & Mahesh Chandra Patra & Sukanta Kumar Pradhan & Barada Kanta Mishra

Received: 21 March 2013 / Accepted: 29 April 2013 © Springer Science+Business Media New York 2013

Abstract Acetyl-CoA carboxylase (ACCase), a biotin-dependent enzyme that catalyses the first committed step of fatty acid biosynthesis, is considered a potential target for improving lipid accumulation in oleaginous feedstocks, including microalgae. ACCase is composed of three distinct conserved domains, and understanding the structural details of each catalytic domain is essential for gaining insights into the molecular basis of complex formation and the mechanism of biotin transport. In the absence of a crystal structure for any heteromeric ACCase to date, we report here the first heteromeric association model of ACCase from an oleaginous green microalga, Chlorella variabilis, using a combination of homology modelling, docking and molecular dynamic simulations. The binding sites of the docked biotin carboxylase (BC) and carboxyltransferase (CT) were predicted to be contiguous but distinct in the biotin carboxyl carrier protein (BCCP) molecule. Simulation studies revealed considerable flexibility for the BC and CT domains in the BCCP-bound forms, thus indicating the adaptive behaviour of BCCP. Further, principal component analysis revealed that in the presence of BCCP, the BC and CT domains exhibited an open-state conformation via the outward clockwise rotation of the binding helices. These conformational changes might be responsible for binding of the BCCP domain and its translocation to the respective active sites. Various rearrangements of inter-domain hydrogen bonds (H-bonds) contributed to conformational changes in the structures. H-bond interactions between the interacting residue pairs Glu201(BCCP)/Arg255(BC) and Asp224(BCCP)/Gln228(CT) were found to be essential for the intermolecular assembly. The present findings are consistent with previous biochemical studies.

Keywords Acetyl-CoA carboxylase · Chlorella variabilis · Homology modelling · Docking · Molecular dynamic simulations · Principal component analysis · Microalgae · Biofuel

Electronic supplementary material The online version of this article (doi:10.1007/s12010-013-0277-0) contains supplementary material, which is available to authorized users.

N. Misra · P. K. Panda · B. K. Mishra Academy of Scientific and Innovative Research, CSIR-Institute of Minerals and Materials Technology, Bhubaneswar 751 013 Odisha, India

N. Misra · P. K. Panda (*) Bioresources Engineering Department, CSIR-Institute of Minerals and Materials Technology, Bhubaneswar 751 013 Odisha, India e-mail: [email protected]

M. C. Patra · S. K. Pradhan Department of Bioinformatics, Odisha University of Agriculture and Technology, Bhubaneswar 751 003 Odisha, India

Introduction

Acetyl-CoA carboxylase (ACCase) is a key regulatory enzyme that catalyses the first committed step of fatty acid biosynthesis by converting acetyl-CoA to malonyl-CoA. ACCase activity has been implicated in the formation of various compounds such as lipids, polyketides and flavonoids in both prokaryotes and eukaryotes [1]. In recent years, genetic engineering of ACCase for improved biofuel production has been an area of intense research owing to its critical role in augmenting the lipid content of several oleaginous transgenic plants and microalgal species [2–5]. In addition, this enzyme has been an attractive target for the development of new herbicides to selectively control grass weeds, and a number of studies have been undertaken in the past few years to understand the structural and mechanistic details underlying the resistance of mutant ACCase enzymes to herbicide inhibition [6–9]. ACCase is composed of three distinct conserved protein domains, viz. biotin carboxylase (BC), biotin carboxyl carrier protein (BCCP) and α-, β-carboxyltransferase (CT). These components catalyse the following partial reactions:

\[
\text{BCCP-biotin} + \text{ATP} + \text{HCO}_3^- \;\overset{\text{BC},\ \text{Mg}^{2+}}{\rightleftharpoons}\; \text{BCCP-biotin-CO}_2 + \text{ADP} + \text{P}_i \tag{1}
\]

\[
\text{BCCP-biotin-CO}_2 + \text{acetyl-CoA} \;\overset{\text{CT}}{\rightleftharpoons}\; \text{malonyl-CoA} + \text{biotin} \tag{2}
\]

The BC domain carboxylates the biotin prosthetic group attached to the mobile arm of the BCCP domain in the presence of ATP, bicarbonate and Mg2+ (Reaction 1), and the CT domain subsequently transfers CO2 from the resulting carboxybiotin intermediate to acetyl-CoA (Reaction 2) to form malonyl-CoA. Thus, to effect catalysis, the BCCP domain must shuttle the tethered biotin between the BC and CT domains. Physically, two different isoforms of ACCase exist. A heteromeric (prokaryotic) form of ACCase, in which the domains are located on individual subunits, is found in bacteria and in the chloroplasts of plants and algae. In contrast, a single large homomeric (eukaryotic) ACCase, in which each of these constituent domains is located on one multifunctional polypeptide, is found in fungi, animals and the cytosol of plants and algae [5, 10]. Previous genetic engineering studies have confirmed that plastidial heteromeric ACCase is involved in de novo fatty acid biosynthesis [5]. Therefore, because of its direct involvement in lipid accumulation, a thorough understanding of its multi-domain structure is of prime importance. Although three-dimensional (3D) crystal structures of individual catalytic domains of heteromeric ACCase from several bacterial and eukaryotic species are known [11–16], an atomic resolution model of the entire ACCase protein is still unavailable, probably because of the labile nature of the complex [17]. This limits our understanding of the underlying inter-domain interactions and molecular dynamics leading to formation of the ACCase complex.


In this context, computational modelling and docking protocols have become powerful alternatives to experimental approaches for elucidating 3D structures and protein–protein associations [18, 19]. For instance, a recent protein docking study revealed the binding interaction between the BCCP and BC domains of the pyruvate carboxylase enzyme from Rhizobium etli [20], the latter being a close homolog of biotin-dependent ACCase [21]. Further, the structure–function relationships and dynamic behaviour of some essential proteins responsible for the biosynthesis of lipids in microalgae, namely GPAT (glycerol-3-phosphate acyl transferase), FabH (β-ketoacyl-ACP synthase III; KASIII) and TE (thioesterase), have been reported using a combination of homology modelling, docking and molecular dynamic (MD) simulation protocols [22–25]. In addition, past studies have demonstrated that such in-depth analysis of crucial lipid biosynthetic enzymes can facilitate engineering of potential microalgal strains for commercially viable biofuel production [26, 27]. In this work, for the first time, we report the heteromeric association model of the ACCase enzyme from Chlorella variabilis, an oleaginous microalgal biofuel feedstock, employing homology modelling, docking, MD simulations and principal component analysis (PCA). The binding modes, conformational changes, concerted domain motions and intermolecular hydrogen bonds (H-bonds) were analysed for the developed structures. The docking results of the BC and CT domains with the BCCP domain (BCCP–BC and BCCP–CT) offer insights into the binding interfaces that contribute to their interactions and delineate the underlying molecular mechanism for regulating biotin access to the active sites of BC and CT.

Materials and Methods

Homology Modelling

Protein sequences of the C. variabilis heteromeric ACCase subunits, which include BC, BCCP, CTα and CTβ, were retrieved from the SwissProt database with accession numbers E1Z5P4, E1Z723, E1ZJK9 and F2YGI4, respectively. Homology models of each of the domains were generated using the Modeller 9v9 program [28]. Modeller calculates the 3D structure of a given protein, including all non-hydrogen atoms, based on an alignment between the target sequence and one or more known related crystal structures. The template structures used for homology modelling were selected on the basis of a high percentage of sequence identity with the target protein. Specifically, (1) the 3D model of the BC domain (residues 89–446) was constructed using the crystal structure of Staphylococcus aureus BC complexed with an ATP analogue and Mg2+ ions (PDB code: 2VPQ chain A; 59 % sequence identity) [16]; (2) the BCCP biotinyl domain (residues 159–237) was constructed using the crystal structure of Escherichia coli BCCP protein (PDB code: 3BDO chain A; 43 % sequence identity) [11]; and (3) the α (residues 94–413) and β (residues 29–292) chains of the CT domain were developed separately using the crystal structures of the A chain (2F9I chain A; 50 % sequence identity) and the B chain containing a Zn2+ ion (2F9I chain B; 54 % sequence identity) of S. aureus CT, respectively [15]. The modelled α and β chains were subsequently superimposed onto the crystal structure of the template using Discovery Studio v3.1, to keep the same relative orientation of the two subunits and to construct a single heterodimeric structure of the CT domain. For each of the homology models, one hundred conformations were generated, which were ranked and filtered according to the lowest discrete optimized protein energy (DOPE) scores for further analyses.
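Template selection above rests on percentage sequence identity. As a minimal sketch of how such a figure can be computed from a pairwise alignment, the following Python function counts identical residues over the aligned columns in which neither sequence has a gap. The denominator convention is an assumption (several conventions exist, and the source does not say which one underlies the reported 43–59 % values), and the two short sequences are invented toy strings, not the actual ACCase alignments.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumption: gapped columns are excluded from the denominator; other
    conventions (e.g. dividing by the shorter sequence length) give other values.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy alignment (hypothetical sequences, for illustration only).
toy_target = "MKV-LSGA"
toy_template = "MRVALS-A"
toy_identity = percent_identity(toy_target, toy_template)
print(f"{toy_identity:.1f} % identity")
```

In this toy case six columns are compared and five match. The same function could be applied to each target/template pair once the alignment has been exported as two gapped strings.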


Multiple sequence alignment of selected homologous sequences for the developed models was performed using MAFFT v6.9 [29], and the alignments were further inspected and edited using BioEdit v7.1 [30]. The final alignment results were viewed with the Jalview program (http://www.jalview.org).

Molecular Docking

Protein–Ligand Docking Studies

Docking of ATP with BC, biotin with BCCP, and acetyl-CoA and biotin with the β and α chains of the CT domain, respectively, was carried out using the AutoDock 4.2 program [31]. AutoDock combines rapid energy evaluation through pre-calculated grids of affinity potentials with a variety of search algorithms to find the most suitable binding region for a ligand on a given protein molecule. Hydrogen atoms were added to the receptor PDB files, and the structures were prepared for docking calculations using Kollman partial charges for the protein molecules and Gasteiger charges for the ligands. The binding interface for the receptor protein structures (BC, BCCP and CT) was defined according to previously published reports. With this prior knowledge, the binding sites were created within 5 Å of the predicted active sites. Grid maps were generated around the active sites with 60×60×60 grid points and a spacing of 0.375 Å for each protein molecule. The Lamarckian genetic algorithm was used to search for the best ligand conformer. The population size was set to 150, the maximum number of energy evaluations to 2,500,000 and the maximum number of generations to 270,000. The final conformations were clustered and ranked according to the AutoDock scoring function and further corroborated with knowledge of putative active site residues determined by previously reported mutational studies. The structure of ATP was extracted from the BC crystal structure used as the template for homology modelling in this study, while structure files of the other ligands, biotin and acetyl-CoA, were retrieved from the PubChem database [32].
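The grid settings above fix the size of the search box. A small sketch of that arithmetic follows, assuming the stated 60 refers to the number of grid intervals per axis, so that the box edge is 60 × 0.375 Å; if it instead counts points, the edge would be 59 × 0.375 Å. The box centre and test coordinates are hypothetical values chosen only for illustration.

```python
def grid_box_edge(npts: int, spacing: float) -> float:
    """Edge length in angstrom spanned by `npts` intervals of width `spacing`."""
    return npts * spacing


def inside_box(point, center, npts: int, spacing: float) -> bool:
    """True if `point` lies within the cubic grid box centred on `center`."""
    half_edge = grid_box_edge(npts, spacing) / 2.0
    return all(abs(p - c) <= half_edge for p, c in zip(point, center))


# Values from the text: 60 points per axis, 0.375 angstrom spacing.
edge = grid_box_edge(60, 0.375)

# Hypothetical box centre and ligand atom positions (not from the paper).
center = (0.0, 0.0, 0.0)
atom_near = (5.0, -3.0, 10.0)
atom_far = (12.0, 0.0, 0.0)

print(f"box edge: {edge} angstrom")
print(inside_box(atom_near, center, 60, 0.375))
print(inside_box(atom_far, center, 60, 0.375))
```

A check of this kind is a quick way to confirm that every atom of a docked ligand, plus the 5 Å margin around the predicted active site, fits inside the grid before the maps are computed.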
Protein–Protein Docking Studies

The modelled BCCP domain was docked into the active sites of the BC and CT domains using the PatchDock server [33], which is well suited to molecular docking of large protein–protein macromolecular complexes. PatchDock uses a molecular docking algorithm based on shape complementarity principles. All docking studies were performed using the restraints that the B subdomain of BC and the α chain of CT should interact with the BCCP protein molecule. The clustering RMSD was set to 4 Å. The 50 best-ranked docked poses for each of the BCCP–BC and BCCP–CT complexes were visually inspected and evaluated. The binding poses with the lowest docking energy score were selected for further analyses.

MD Simulations

All MD simulation experiments were performed using the GROMACS 4.0.5 program [34] with the GROMOS96 53a6 force field [35] in the isothermal–isobaric (NPT) ensemble. The MD simulation protocol was as follows: polar hydrogen atoms were added, and the protonation states of ionisable groups were chosen for pH 7.0. Each system was solvated in a cubic box with the simple point charge (SPC) water model. The distance between the solute and the box was maintained at 10 Å. The simulation systems were neutralized by adding appropriate counter ions (Na+ and Cl−) in place of randomly chosen water molecules. All simulations were run under periodic boundary conditions in the NPT ensemble, using the Berendsen coupling scheme [36] to keep the temperature (300 K) and the pressure (1 bar) constant. All bond lengths were constrained using the LINCS algorithm [37], and the electrostatic interactions were calculated using the particle mesh Ewald method [38]. A 2-fs time step was used to integrate the equations of motion. The systems were subjected to a pre-equilibration process that included energy minimization using the steepest descent integrator until the maximum force reached 1,000 kJ/mol/nm, followed by position-restrained dynamics for 10 ns. The trajectories were saved at every 1-ps interval. The GROMACS topologies for the ligands were generated using the Dundee PRODRG server [39]. Trajectories obtained from the various simulations were analysed using the Grace (http://plasma-gate.weizmann.ac.il/Grace/) and VMD [40] programs. Analysis of hydrogen bonds from the simulation trajectory was performed using the GROMACS g_hbond utility with a cutoff distance of 3.5 Å. The most prominent characteristics of the motions during the simulations were analysed through PCA [41, 42]. The PyMOL program (http://www.pymol.org/) was used to visualise the PCA results as porcupine plots, which give a graphical summary of the motions along the trajectory. In a porcupine plot, each Cα atom has a cone pointing in the direction of motion of the atom; the length of the cone defines the amplitude of the motion, and the size of the cone indicates the number of such Cα atoms.

Structure Validation

The developed models were validated stereochemically using the PROCHECK [43], QMEAN [44], PROSA [45], VERIFY3D [46] and ERRAT [47] web servers. These structure validation tools use disparate methods to calculate quantitative scores that can be used to assess model quality and guide further selection of the most accurate protein models.
PROCHECK assesses the stereochemical quality of a protein structure by inspecting the dihedral angles (φ and ψ) in the Ramachandran plot. The QMEAN server uses a five-term composite scoring function which includes a torsional potential, a pairwise interaction potential, a solvation term, terms for quantitative comparison between predicted secondary structure and observed model secondary structure, and residue accessibilities [48]. The QMEAN score (qualitative model energy analysis) output of the QMEAN server validates model quality, where lower score signifies a stable model. PROSA calculates an overall Z-score that can be used to evaluate the structural quality of a computationally determined protein model with respect to the crystal and NMR structures present in the PDB. VERIFY3D is used to analyse the compatibility of the 3D protein model with its own amino acid sequence. ERRAT outputs an overall model quality factor to examine non-bonded atomic interactions. A higher score from VERIFY3D and ERRAT indicates a more native-like model.
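The PCA step described under MD Simulations can be illustrated with a small, dependency-free sketch: build the covariance matrix of the Cα coordinates over the trajectory frames and extract its dominant eigenvector, whose per-atom components are what a porcupine plot draws as cones. This is an illustration under stated assumptions, not the authors' workflow. The source does not say which tools produced the eigenvectors (GROMACS ships g_covar and g_anaeig for this purpose), the frames are assumed to be already superimposed, power iteration stands in for a full eigen-decomposition, and the two-atom trajectory is synthetic.

```python
import math
import random


def first_principal_mode(frames, iterations=200):
    """Return (eigenvalue, unit eigenvector) of the dominant covariance mode.

    frames: list of flat coordinate lists [x1, y1, z1, x2, ...], one per
    trajectory frame, assumed to be already fitted to a reference structure.
    """
    n_frames = len(frames)
    dim = len(frames[0])
    mean = [sum(frame[i] for frame in frames) / n_frames for i in range(dim)]
    centered = [[frame[i] - mean[i] for i in range(dim)] for frame in frames]
    # Covariance matrix of the atomic coordinates.
    cov = [
        [sum(row[i] * row[j] for row in centered) / (n_frames - 1) for j in range(dim)]
        for i in range(dim)
    ]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        product = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break
        vec = [x / norm for x in product]
    cov_vec = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
    eigenvalue = sum(v * cv for v, cv in zip(vec, cov_vec))
    return eigenvalue, vec


def per_atom_amplitudes(mode):
    """Length of the mode vector on each atom (the cone length in a porcupine plot)."""
    return [math.sqrt(sum(c * c for c in mode[k:k + 3])) for k in range(0, len(mode), 3)]


# Synthetic two-atom trajectory: atom 0 oscillates along x, the rest is small noise.
random.seed(0)
toy_frames = []
for t in range(40):
    frame = [random.gauss(0.0, 0.01) for _ in range(6)]
    frame[0] += 2.0 * math.sin(0.3 * t)
    toy_frames.append(frame)

eigenvalue, mode = first_principal_mode(toy_frames)
amplitudes = per_atom_amplitudes(mode)
print(amplitudes)
```

For a real protein the coordinate vector has three entries per Cα atom, so a linear algebra library is the practical choice; the pure-Python loops here only keep the sketch self-contained.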

Results and Discussion

Structural Model and Validation of BC

The 3D structure of the C. variabilis BC protein was generated using the S. aureus BC crystal structure in complex with two Mg2+ ions [16]. This protein belongs to the ATP-grasp family and is dependent on divalent cations like Mg2+ to regulate ATP activity [49, 50]. Other enzymes in this family include D-Ala:D-Ala ligase [51], glutathione synthetase [52], carbamoyl phosphate synthetase [53] and glycinamide ribonucleotide transformylase [54].


The protein is formed by three subdomains referred to as A (N-terminal), B (ATP-grasp fold) and C (C-terminal), in agreement with the template (Fig. 1a, b). The A subdomain, delineated by residues Lys89 to Ile189 of the model, consists of five strands of parallel β-pleated sheet flanked on either side by a total of four α-helices in a conformation similar to the Rossmann fold. The B subdomain (Val217-Tyr288) extends from the core region of the protein, where it folds into two α-helical regions and three β-strands. Following the B subdomain, the polypeptide chain folds back into the body of the protein to form the C-terminal subdomain (Arg293-Lys446). In addition to these three major structural units, an AB linker segment delineated by residues Ser193 to Ala212 is observed to connect the A and B subdomains of BC. Thus, BC appears to be cylindrical in shape, formed by the A and C subdomains with the B subdomain positioned as a lid on top of the binding pocket [9, 55, 56]. Multiple sequence alignment revealed that the primary structure of the BC domain was highly conserved among bacterial, algal and higher plant species (Fig. SS1). The quality of the predicted 3D model was further assessed using various structure validation tools. The accuracy of the model was evidently high (Table 1), as revealed by VERIFY3D, ERRAT (Fig. SS4a), PROSA (Figs. SS5a and SS6a) and QMEAN scores. Besides, the PROCHECK results (Fig. SS7a) indicated that the model possesses sufficient

Fig. 1 a Ribbon representation of the overall molecular architecture of BC representing the three distinct subdomains, viz. A, B and C, which are indicated in orange, green and yellow color, respectively. The bound ATP and Mg2+ ions are shown as dark pink sticks and light pink spheres, respectively. The P-loop and AB linker regions are also marked. b Superimposed backbones of modeled BC (green) and template (2VPQ) crystal structure (orange). c The amino acid residues located within approximately 4 Å of the ATP binding site in BC are shown. The residues (carbon: green; nitrogen: blue; oxygen: red) and ATP (pink) are shown in stick representation. H-bond interactions are displayed in black dashed lines with corresponding distances in angstrom. d Stick representation displaying the superimposition of conserved active site residues responsible for binding ATP in BC model (green) and template (orange). H-bond interactions are depicted by green dashed lines with corresponding bond distances in angstrom. Mg2+ ions are shown as spheres

Table 1 Assessment scores for homology models using structure validation programs

Structure   PROCHECK                                                   QMEAN(d)   PROSA(e)   VERIFY3D(f)   ERRAT(g)
            Core(a)   Allowed(b)   Generously allowed   Disallowed(c)
BC          89.8 %    8.9 %        1.0 %                0.3 %          0.643      −4.84      69.36 %       77.844 %
BCCP        75.9 %    20.7 %       1.7 %                1.7 %          0.409      −5.5       84.81 %       50.00 %
CTα         89.5 %    7.3 %        2.2 %                1.1 %          0.702      −7.36      75.39 %       85.479 %
CTβ         88.2 %    10.0 %       1.3 %                0.4 %          0.696      −6.38      84.15 %       85.098 %

(a) The core region corresponds to conformations of the polypeptide backbone where there are no steric clashes
(b) The allowed region corresponds to conformations in which shorter van der Waals radii are used in the calculation, that is, the atoms are allowed steric clashes, permitting consideration of left-handed alpha helices
(c) Disallowed regions involve steric hindrance between the side chain group and main chain atoms and typically occur in turn regions of proteins
(d) QMEAN score, whose value should fall within the reliability zone of 0 to 1
(e) PROSA Z-score indicates the overall model quality score. It can be used to check whether the Z-score of the modelled structure is within the range of scores typically found for native proteins of similar size
(f) Percentage of residues with VERIFY3D average score >0.2
(g) Percentage of the protein for which the calculated error value falls below the 95 % rejection limit

stereochemical quality having more than 99 % of the residues in the allowed region of the Ramachandran plot (89.8 % in most favoured, 8.9 % in additional allowed and 1.0 % in generously allowed) and only one residue (Glu420) falling in the disallowed region, which was found to be placed far away from the binding pocket.

ATP Binding Site of BC

Since BC catalyses the ATP-dependent carboxylation of biotin, a docking study was performed to discern the molecular mechanism of interaction between ATP and the BC domain. Previous mutational experiments have identified several amino acids located at the interface between the B and C subdomains as crucial for ATP interactions [16, 55]. In the present study, ATP was docked at the proposed binding site, and the lowest energy conformation with stable binding at the active site was selected for further analysis. The docked pose of ATP and its interaction with BC is shown in Fig. 1c. Since the residues present within 4 Å of the binding site are functionally important owing to their crucial role in biological interaction, these residues were mapped in the predicted binding site of BC. The residues Val289, Arg287, Glu286, Met242, Lys244, Ile243, Lys202, Gly253, Ala247, Arg252, Gly248, Gly249, Gly250, Gly251, Asn321, Gln318, His294 and Gly363 were found to be positioned within about 4 Å of ATP. Three H-bond interactions were observed between the adenosine ring of ATP and Val289, Arg287 and Lys244 at distances of 3.0, 3.0 and 3.1 Å, respectively, and the ribose moiety of ATP docked into a cavity formed by residues His294, Gln318 and Asn321. Hydrophobic interactions were observed between the phosphate group of ATP and several glycine residues, notably Gly248, Gly249, Gly250, Gly251 and Gly253, present in the P-loop region of BC [16] (Fig. 1c). These predictions are in good agreement with previous site-directed mutational studies of BC suggesting the critical role of the glycine-rich P-loop in ATP binding [50, 57–59].
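The mapping of residues within 4 Å of the docked ligand amounts to a simple distance filter. The sketch below shows one way to express it, assuming a residue counts as "within 4 Å" when any of its atoms lies within the cutoff of any ligand atom; the source does not spell out the exact criterion. The residue labels and coordinates are invented placeholders, not atoms from the BC model.

```python
import math


def residues_within(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return sorted labels of residues with any atom within `cutoff` of the ligand.

    protein_atoms: iterable of (residue_label, (x, y, z)) tuples.
    ligand_atoms: iterable of (x, y, z) tuples.
    """
    ligand_atoms = list(ligand_atoms)
    near = set()
    for label, xyz in protein_atoms:
        if label in near:
            continue  # residue already selected through another atom
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            near.add(label)
    return sorted(near)


# Hypothetical coordinates in angstrom (placeholders for illustration only).
toy_ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
toy_protein = [
    ("RES1", (0.0, 3.0, 0.0)),   # 3.0 from the first ligand atom
    ("RES2", (5.0, 0.0, 0.0)),   # 3.5 from the second ligand atom
    ("RES3", (0.0, 0.0, 9.0)),   # far from both ligand atoms
    ("RES3", (0.0, 6.0, 0.0)),   # second atom of RES3, still beyond 4.0
]

binding_site = residues_within(toy_protein, toy_ligand)
print(binding_site)
```

With coordinates parsed from a docked complex, the same function would return a residue list comparable in form to the one reported for the ATP site.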
In addition to hydrophobic interactions, the phosphate group of ATP also exhibited stable H-bonds with Lys202 and Gly251 at a distance of 2.9 and 2.8 Å, respectively. Together, the observed docking results


are consistent with previous experimental findings which corroborate that the hydrophobic and positively charged residues of BC are determinants for ATP binding [16, 55]. Furthermore, we superimposed the predicted ATP-binding site of the template on that of the model to analyse the variations in amino acids, which revealed that all the residues were identical except for five model/template pairs, viz. Arg287/Lys200, Val289/Ile202, Gly363/Ile276, Asn321/Met234 and Met242/Ile155 (Fig. 1d). These variations in active site residues could potentially be useful targets for developing engineered enzymes with desired substrate specificity.

Structural Model and Validation of BCCP

The homology-derived model of the C-terminal biotinyl domain of the C. variabilis BCCP protein was built using the crystal structure of the E. coli BCCP protein as template [11]. The N-terminal region of the protein was not modelled because our investigation focused primarily on the BCCP biotinyl domain responsible for catalysing biotin-mediated reactions [60, 61]. The modelled protein (residues 159–237) contains two very similar sets of four-stranded β-sheets compactly arranged around a conserved hydrophobic core (Fig. 2a), which is the characteristic fold of the biotin and lipoyl enzyme superfamily [62]. The central hydrophobic core of the biotinyl domain consists of residues Glu211, Pro204, Val200, Ile199, Phe184, Glu181, Pro177, Ala178, Pro179 and Glu180, which contribute towards stabilization of its β-sandwich conformation [8]. Multiple sequence alignment revealed that the C-terminal biotinyl domain of BCCP, which contains an average of 80 amino acids, is highly conserved in bacteria, algae and higher plants. Further analysis showed that the residues contributing to the hydrophobic core of the protein share high identity among the selected homologous species (Fig. SS2). The developed BCCP model was validated stereochemically using various structure validation tools.
The statistical parameters obtained for the model are summarized in Table 1. The PROCHECK results revealed 75.9 %, 20.7 %, 1.7 % and 1.7 % of residues in the most favoured, additionally allowed, generously allowed and disallowed regions, respectively, of the Ramachandran plot (Fig. SS7b). The presence of long loop regions linking the β strands and the predominance of polar residues in the BCCP protein, which have a greater tendency to adopt unusual backbone conformations, are the possible reasons for the slightly lower values in the Ramachandran plot [63, 64]. However, it is important to note that all active site residues were found to be well within the allowed region, and the sole residue present in the disallowed region, Ser54, was found to be far away from the predicted active site pocket. Moreover, high scores from the ERRAT (Fig. SS4b), VERIFY3D, QMEAN and PROSA (Figs. SS5b and SS6b) programs further confirm the quality and reasonableness of the modelled BCCP.

Biotin Binding Site of BCCP

Following structure determination, molecular docking of biotin with BCCP was performed to explore their potential binding interfaces. The best converged docked pose of the biotin moiety was near a tight β-turn located between the β4 and β5 strands, in a similar orientation as in the template [11]. Previous studies have shown that a conserved lysine residue of the ‘AMKLM’ biotinylation motif harbours biotin through covalent binding [65]. Interestingly, in the present study, no such consensus motif could be predicted in the modelled C. variabilis BCCP protein. Instead, a proline residue was observed at the position structurally equivalent to the lysine residue in the template (Fig. SS2). The binding site of the docked BCCP–biotin complex is shown in Fig. 2b. In analogy to the template, the docked pose


revealed that the ureido ring of biotin interacts with a loop region (Arg174 to Phe184) of BCCP known as the ‘thumb’. The biotin group makes H-bonds with residues such as Pro177 and Ala178 present in the thumb region and also with Gln206 located between the β4 and β5 strands. These results are consistent with a previous biochemical study demonstrating partial burial of the biotin prosthetic group in the surface of BCCP, lying close to the thumb region [66]. Apart from these crucial interactions, several other hydrophobic and polar residues also contribute to stable binding, such as Tyr174, Glu201, Pro204, Val200, Ile199, Phe184, Glu181, Pro179, Glu180 and Ser176.

Structural Model and Validation of CT

BLAST analysis of the query protein sequences of the C. variabilis CT domain against the PDB database identified a few crystal structures of CT domains, predominantly from bacterial species, as the most significant and reliable templates for determining the 3D structure of the C. variabilis CT domain. Further, the crystal structure of the CT domain from S. aureus, with the highest sequence identity of 50 %, implied maximum structural conservation between the CT domains of S. aureus and C. variabilis ACCase. Therefore, in the present study, the dimeric structure of the C. variabilis CT domain was generated using the S. aureus CT crystal structure as template [15]. The α and β chains were modelled separately and were ranked by the DOPE score of Modeller. One model for each of the chains was subsequently selected by taking into account the reciprocal positions of the two chains and the stereochemical quality as assessed by various structure validation tools. The best heterodimeric CT model containing both chains was then further analysed for structural details. The protein is composed of two subdomains, N- and C-terminal, which are commonly referred to as the β and α subunits, respectively (Fig. 3a, b). Both subunits adopt a β–β–α superhelix fold, a characteristic topology of the crotonase superfamily [60, 67].
The core of each subunit is formed by a long, twisted and tapered seven-stranded mixed β-sheet (C β1–β5/N β5–β9, C β6/N β11 and C β8/N β13) orthogonal to a short two-stranded β platform (C β5/N β10 and C β7/N β12). The seven-stranded β-sheet is shielded by surrounding α-helices (C α12/N α9, C α4/N α2, C α6/N α4, C α7/N α5 and C α8/N α6). These helical secondary elements constitute the triangular face of the wedge-shaped configuration (Fig. 3a, b). Sequence analysis of CT proteins from bacteria, algae and higher plants showed a characteristic zinc domain

Fig. 2 a Ribbon diagram of BCCP. The biotin prosthetic group is shown in stick representation with carbon, nitrogen, oxygen and sulfur atoms colored in grey, blue, red and yellow, respectively. The secondary structural elements are labeled. b Stick representation displaying the interaction between biotin and the residues (carbon: green; nitrogen: blue; oxygen: red) of BCCP domain, along with H-bonds (shown in black dashed lines) and corresponding bond distances in angstrom. The thumb region is also labeled


Fig. 3 Ribbon representation of CT domain showing the labeled secondary structures. a α subunit is colored in pink. b β subunit is colored in purple. The four cysteine residues of conserved zinc domain are indicated, and the zinc atom is shown as red sphere

(CX2CX15CX2C), which is strictly conserved in the β subunit (Fig. SS3). The zinc domain is a common four-cysteine ‘zinc ribbon’ motif (Cys36, Cys39, Cys55 and Cys58) present in a wide variety of proteins including ribosomal proteins, RNA polymerase II and the basal transcription factors [15]. Although the precise function of this domain is unclear, it has been proposed to provide shelter to the acetyl-CoA binding site [15]. Further studies have also demonstrated that while deletion of the CT zinc finger motif completely abrogated enzymatic activity, mutagenesis of the cysteine residues to alanine resulted in reduced enzyme activity [68]. The statistical parameters of the various structure validation tools obtained for the final selected α and β subunits of the CT domain are summarized in Table 1. Overall, high scores from the ERRAT (Fig. SS4c, d), VERIFY3D, PROSA (Figs. SS5c, d and SS6c, d) and QMEAN programs indicated the selected models to be reliably accurate. The Ramachandran plot for the CTα model showed about 99 % of the residues in the allowed regions (89.5 % in most favoured, 7.3 % in additional allowed and 2.2 % in generously allowed) and 1.1 % in the disallowed region (Fig. SS7c). Similarly, for the CTβ model, more than 99 % of the residues were found to be in the allowed regions (88.2 % in most favoured, 10.0 % in additional allowed and 1.3 % in generously allowed) and 0.4 % in the disallowed region (Fig. SS7d). However, the three residues of the CTα model (Ala366, Asn124 and Thr98) and the lone residue of CTβ (Gln30) in the disallowed region were found to be away from the active site, thus supporting the use of the developed protein model for further studies.

Acetyl-CoA and Biotin Binding Site of CT

The binding site of the acetyl-CoA and biotin substrates in the S. aureus CT domain (template) was earlier reported by Bilder et al. [15] based on superimposition with the Streptomyces coelicolor crystal structure [11]. In S.
aureus, biotin is located predominantly in the α (C-terminal) subunit, while acetyl-CoA is located in the β (N-terminal) subunit, adjacent to the zinc-binding motif of the CT domain [15]. The high sequence identity (≥50 %) between the CT domains of S. aureus and C. variabilis suggests that the substrate binding sites may be fully conserved in these structures. As a further validation step, the binding poses of acetyl-CoA and biotin were determined for the modelled C. variabilis CT domain using the AutoDock 4.2 program [31]. We analysed the 30 top-ranked conformations, clustered according to the AutoDock scoring function, and found that in all of them acetyl-CoA and biotin were located in the active site, similar to the proposed substrate-binding pose in the template structure. The best docked poses of biotin and acetyl-CoA and their interactions with the CT structure are shown in Fig. 4a, b. The acetyl-CoA substrate was positioned within the binding pocket lined by residues Thr213, Phe136, Asn137, Arg103, Arg110, Asp106, Ala107, Met139, Ser142 and Val216. Similarly, the biotin substrate docked into a cavity formed by Arg241, Phe237, Arg333, Ser321, Ser297, Ile328, Tyr318, Leu329, Gln177, Lys98 and Arg175. A single H-bond between acetyl-CoA and Arg103, with an inter-atomic distance of 1.9 Å, was observed at the acetyl-CoA binding site. Similarly, three H-bonds were observed between biotin and Ser297, with distances of 2.1, 2.3 and 1.9 Å, at the biotin-binding site (Fig. 4b).

Prediction of Interaction between BCCP and BC/CT Domains of ACCase

After obtaining homology models of the three ACCase domains with well-validated geometry and energy profiles, we carried out protein–protein docking to understand the intermolecular association of the BCCP domain with the active sites of the BC and CT domains and the movement of the carboxybiotin intermediate between them.

The Binding Interface of BCCP–CT Complex

The docked BCCP–CT complex was constructed using PatchDock. As shown in Fig. 5a, the majority of the high-scoring solutions point to a binding cleft of the CTα subunit that closely matches the loop regions of BCCP. Specifically, the loop residues that lie in the thumb region and between the β6 and β7 strands were found to be involved in binding. The binding cleft of CT presumably allows the biotin arm of BCCP to sweep through a large radius, alternately reaching the active sites of BC and CT. In the BCCP–CT complex, three BCCP residues, Ser221, Asp224 and Arg175, interacted with the CT protein through six H-bonds. 
The residue Ser221 alone made three H-bonds with Tyr221; Arg175 made two H-bonds with residue Gly187; similarly, Asp224 interacted with Gln228 of CT through two H-bonds. The interacting H-bond distances were observed within

Fig. 4 a Molecular surface of CT showing the binding modes of acetyl-CoA and biotin at the active site of the domain. b Stick representation of the amino acid residues lying within the binding site of acetyl-CoA (green) and biotin (yellow). The helices and interacting residues of CTα and CTβ subunits are shown in pink and purple color, respectively. H-bond interactions are indicated by black dashed lines with corresponding distances in angstrom


Fig. 5 a Schematic drawing of the intermolecular interactions between BCCP and CT domains. The BCCP domain (cyan) docks into the binding cleft region present in α subunit (pink) of CT domain. Key residues involved in interactions are shown in sticks, and H-bonds are represented by black dashed lines. b Schematic drawing of the intermolecular interactions between BCCP and BC domains. The BCCP domain (cyan) docks into the cleft region formed by B subdomain (green) and C subdomain (yellow) of BC. c The single heteromeric ACCase complex. The binding site of the BC and CT are predicted to be contiguous but distinct in BCCP molecule. The domains are individually colored: BC domain (orange), BCCP domain (cyan), α (lemon) and β (magenta) subunits of CT domain. The BC and CT domains are shown in ribbon, whereas the BCCP domain is shown in surface representation

the range of 1.2 to 2.4 Å (Fig. 5a). These observations are in accordance with previous predictions of the BCCP binding site in propionyl-CoA carboxylase, a close homologue of the ACCase enzyme in S. coelicolor, and support the predominantly hydrophilic character of the protein–protein interactions, with an extensive network of H-bonds [11].

The Binding Interface of BCCP–BC Complex

BCCP is known to interact predominantly with the B subdomain of BC [13]. However, further structural details that characterize the BCCP–BC interactions are unclear. Therefore, to elucidate the binding mechanism, a docked BCCP–BC complex was generated using PatchDock. As in the BCCP–CT complex, BCCP binds to a cleft region lined by residues mainly from the central α helices of the B subdomain, together with a few residues from the C subdomain of BC (Fig. 5b). The docking analysis revealed that the binding sites of CT and BC are contiguous but distinct on the BCCP molecule. The binding interface includes the loop residues that lie in the thumb region and between the β4 and β5 strands of BCCP. In the BCCP–BC complex, electrostatic interactions between charged surface residues contribute to binding and stability, unlike in the BCCP–CT complex, where hydrophilic interactions were predominant. In addition to the electrostatic interactions, the BCCP–BC interface was characterized by six H-bonds. While two H-bonds were observed between Trp205 of BCCP and Arg320 of BC,


the remaining four were shared between Glu201, Glu203, Thr172 and Gly171 of BCCP and Arg255, Glu273, Arg255 and Gln269 of BC, respectively, with distances in the range of 1.8–2.8 Å (Fig. 5b). The computational protocol, combining homology modelling and molecular docking, finally converged on a single heteromeric structure of ACCase showing the inter-domain interactions (Fig. 5c).

RMSD and RMSF Analyses

The root mean square deviation (RMSD) of backbone atoms with respect to the initial conformation was calculated as a function of time to assess the conformational stability of the proteins during the simulations. After an initial steep rise in RMSD over the first ~3 ns, the five simulation systems, BC (free), CT (free), BCCP (free), BCCP–CT (bound) and BCCP–BC (bound), converged to final stable RMSD values of 3.4, 2.5, 2.8, 3.5 and 4.5 Å, respectively (Fig. 6a). Interestingly, the BCCP-bound structures (BCCP–BC and BCCP–CT) showed higher RMSD values than the corresponding unbound forms (BC-free and CT-free), indicating that binding of the BCCP domain induces structural flexibility in the BC and CT domains. In particular, the largest deviation was observed in the BCCP–BC complex, with a final RMSD of 4.5 Å. To further analyse the intra-domain fluctuations that could mediate the overall rearrangement of the protein complex, the RMSD values of the A, B and C subdomains of the BC domain were determined. The B subdomain exhibited significant BCCP-mediated conformational changes, with an elevated RMSD compared with the other two subdomains, A and C (Fig. 6b). This points to a crucial role of subdomain B in binding BCCP, consistent with our docking analysis. To study the fluctuations of individual residues in detail, the root mean square fluctuation (RMSF) of the Cα atoms about their time-averaged positions was analysed for each of the five ensembles. 
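As a minimal sketch of the two quantities defined above, the following Python snippet computes a per-frame RMSD against the initial conformation and a per-atom RMSF about the time-averaged position. It is an illustration only and not the analysis code used in this study: the function names, the toy coordinates and the use of NumPy are assumptions introduced here, and the frames are assumed to be already superimposed on the reference, so no least-squares fitting is performed.

```python
import numpy as np

def rmsd_per_frame(traj, ref):
    """RMSD of each frame relative to a reference structure.

    traj: array of shape (n_frames, n_atoms, 3); ref: array of shape (n_atoms, 3).
    Assumption: every frame has already been fitted onto `ref`.
    """
    diff = traj - ref                              # broadcast over frames
    return np.sqrt((diff ** 2).sum(axis=2).mean(axis=1))

def rmsf_per_atom(traj):
    """RMSF of each atom about its time-averaged position."""
    mean_pos = traj.mean(axis=0)                   # (n_atoms, 3) average structure
    diff = traj - mean_pos
    return np.sqrt((diff ** 2).sum(axis=2).mean(axis=0))

# Hypothetical toy trajectory: 3 frames, 2 atoms, coordinates in angstrom.
toy = np.array([
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [1.0, 2.0, 0.0]],
    [[0.0, 0.0, 0.0], [1.0, -2.0, 0.0]],
])

rmsd = rmsd_per_frame(toy, toy[0])   # deviation from the initial conformation
rmsf = rmsf_per_atom(toy)            # the second atom is the only mobile one
print(rmsd, rmsf)
```

In this picture, a plateau in the RMSD trace corresponds to the "final stable RMSD" quoted for each system, while peaks in the per-residue RMSF profile mark flexible regions such as loops.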
The loop regions showed large fluctuations in all simulation systems because fewer H-bond constraints limit their flexibility. Comparative analysis of BCCP (free), BCCP–BC (bound) and BCCP–CT (bound) revealed the highest RMSF values in the BCCP–BC structure, followed by BCCP–CT and free BCCP (Fig. 7a). The thumb and loop regions of BCCP that harbour the biotin moiety increased their conformational fluctuations significantly upon binding to the BC and CT domains, which is probably essential for the protein to act as an adaptor between the BC and CT domains and to facilitate inter-domain translocation of the biotinyl moiety [12]. In addition, higher flexibility was noticed in the B subdomain of BC, in both the unbound and BCCP-bound forms (Fig. 7b). The conformational flexibility of the B subdomain is reasonable, as 15 % of the amino acids in this region are glycine, including a glycine-rich stretch of five residues (Gly248, Gly249, Gly250, Gly251 and Gly253) located in the P-loop of the B subdomain. This is consistent with the instability of the B subdomain observed

Fig. 6 a RMSD of the Cα carbon atoms calculated for the five simulation systems. b RMSD analysis for the three subdomains of BC


Fig. 7 RMSF of the Cα carbon atoms from their time-averaged positions of (a) BCCP-free (black), BCCP of BCCP–BC (red) and BCCP of BCCP–CT (blue). b BC (free) and BCCP–BC (bound). c CT (free) and BCCP–CT (bound). The unbound and BCCP-bound forms are marked using black and red lines, respectively

during our simulation studies, and it is in agreement with the reported crystal structures of BC, which show that conformational flexibility is a prerequisite for proper functioning of the enzyme [55]. Further analysis of the RMSF results for the BCCP–BC and BCCP–CT-bound complexes indicated that the overall flexibility of BC and CT also increased upon binding to BCCP. In particular, larger RMSF values were observed in regions directly in contact with BCCP, including the residues located in the central α helix of the B subdomain of BC and the C-terminal residues of the CT α subunit (Fig. 7c). Such high flexibility is probably due to the very few specific main-chain contacts that contribute to BCCP binding in the BC and CT domains. Similar observations were reported in a previous study of BCCP binding to the BC domain of pyruvate carboxylase from R. etli [20]. Taken together, the RMSD and RMSF analyses suggest that high flexibility and increased inter-domain concerted dynamic motions, as observed in the modelled structures during


simulations, could contribute to their rapid dislocation and translocation within the ACCase enzyme complex.

PCA Analysis

To characterize the overall domain motions of BC and CT and to understand how these movements are influenced by BCCP binding, we performed PCA of the covariance matrix obtained from the trajectories. PCA identifies relevant low-energy displacements of groups of residues and emphasizes the amplitude and direction of the dominant motions of a protein by projecting the trajectories onto a space of reduced dimensionality, thus decomposing complex motions into a few principal motions, each characterized by an eigenvector and an eigenvalue. The eigenvalue of a given motion represents the contribution of the corresponding eigenvector to the global motion of the protein. Only the first few eigenvectors describe significant motions in a protein; the remaining eigenvectors correspond to small, Gaussian-distributed random fluctuations [69, 70]. In the present study, the first eigenvector accounts for 39 %, 60 %, 42 %, 52 % and 26 % of the motions in BC (free), BCCP–BC (bound), CT (free), BCCP–CT (bound) and BCCP (free), respectively (Fig. 8). These results imply that the variance captured by the first eigenvector is lower for the free BC and CT domains than for their corresponding BCCP-bound forms. This is in agreement with our RMSD and RMSF analyses showing that BCCP binding increases correlated motions in the domains. Further, we observed that in the presence of ATP, the α helices within subdomain B of BC (BC-free) move in an anti-clockwise direction relative to the other two subdomains, thereby closing off the active site pocket (Fig. 9a, b). It is pertinent to note that the crystal structure of BC from E. coli also exhibited a similar rotation of approximately 45° of the B subdomain lid in the presence of ATP to attain an occluded-state conformation [55]. 
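The percentage of motion attributed to the first eigenvector can be illustrated with a short Python sketch of covariance-matrix PCA. This is a hedged example rather than the procedure used in this study: the function name, the synthetic trajectory (one large collective motion plus small random noise) and the use of NumPy are assumptions introduced here, and the coordinates are assumed to be already fitted to a reference structure.

```python
import numpy as np

def pca_of_trajectory(traj):
    """Eigen-decompose the positional covariance matrix of a trajectory.

    traj: array of shape (n_frames, n_atoms, 3), assumed already fitted.
    Returns (variance fractions, eigenvectors as columns), largest first.
    """
    n_frames = traj.shape[0]
    x = traj.reshape(n_frames, -1)             # flatten to (n_frames, 3N)
    x = x - x.mean(axis=0)                     # subtract the average structure
    cov = x.T @ x / n_frames                   # 3N x 3N covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder so the largest comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals / eigvals.sum(), eigvecs

# Synthetic, hypothetical trajectory: one collective mode plus small noise.
rng = np.random.default_rng(0)
n_frames, n_atoms = 500, 4
direction = np.ones(3 * n_atoms) / np.sqrt(3 * n_atoms)    # unit collective mode
amplitude = rng.normal(0.0, 2.0, size=n_frames)            # large-amplitude motion
noise = rng.normal(0.0, 0.1, size=(n_frames, 3 * n_atoms)) # small random fluctuations
traj = (amplitude[:, None] * direction + noise).reshape(n_frames, n_atoms, 3)

fractions, modes = pca_of_trajectory(traj)
print(f"first eigenvector: {100 * fractions[0]:.0f} % of total variance")
```

Each entry of `fractions` is an eigenvalue divided by the sum of all eigenvalues, which is the sense in which a first eigenvector "accounts for" a given percentage of the motion; the per-atom components of the first column of `modes` are what a porcupine plot draws as cones.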
In contrast, the B subdomain in the BCCP–BC-bound complex displayed an open-state conformation through an outward clockwise rotation, thus facilitating its binding to BCCP (Fig. 9c). Likewise, the BCCP–CT-bound complex displayed a similar rearrangement in the CTα domain (Fig. 10a, b). In addition to the BC and CT domains, conformational movement was also observed in the BCCP structure of the BCCP–BC and BCCP–CT-bound complexes (Figs. 9c and 10b). The regions involved in binding to the BC and CT domains, particularly the thumb and biotin loop, showed greater fluctuations (anti-clockwise rotation) than the rest of the protein as

Fig. 8 Eigenvalue spectra of the diagonalised covariance matrix for the different simulation systems


Fig. 9 Dominant motions of BC domain in unbound and BCCP-bound simulation systems using PCA. Porcupine plot of the first eigenvector in (a) BC (free) domain with ATP substrate. b Snapshot showing the rotation of B subdomain (ribbon representation) relative to the other two subdomains (surface representation). c BCCP–BC (bound) complex. The BCCP domain is colored in cyan, whereas the A, B and C subdomains of BC are colored in orange, green and yellow, respectively. For clarity, ATP substrate of BC is not shown in the figure. The arrows indicate the direction of motions

revealed by the presence of more porcupine cones in these regions. Specifically, such fluctuations were more pronounced in the BCCP–BC complex (Fig. 9c) than in the BCCP–CT complex (Fig. 10b), consistent with the RMSF profiles.

H-bond Interactions

As noticeable conformational movements in the BC and CT domains were observed after BCCP binding, we examined the changes in H-bond interactions during the simulations. Snapshots of the superimposed interacting residues of the BCCP–BC and BCCP–CT-bound complexes before and after the 10 ns simulation are shown in Fig. 11a, b, respectively. After the simulation, the number of strong H-bonds was reduced in both BCCP-docked complexes. However, in the BCCP–BC complex, a few new H-bonds, a salt bridge and cationic–pi interactions provide stability to the bound complex. A close analysis


Fig. 10 Dominant motions of CT domain in unbound and BCCP-bound simulation systems using PCA. Porcupine plot of the first eigenvector in (a) CT (free) domain and (b) BCCP–CT (bound) complex. The BCCP domain is colored in cyan, whereas the α and β subunits of CT are colored in pink and purple, respectively. The arrows indicate the direction of motions

of the results indicates that although the predicted H-bonded residues, which include Glu201, Trp205, Thr172 and Gly171 of BCCP and Arg255, Arg320 and Gln269 of BC in the BCCP–BC complex, were strictly conserved, they formed a totally different interaction network after simulation, except for the interaction between Glu201 and Arg255. In addition, a strong salt bridge was observed between these two residues after simulation, indicating that they contribute significantly to maintaining a stable association between BCCP and BC. The residue Trp205 of BCCP, which had H-bond contact with Arg320 of BC before simulation, moved towards a fluctuating Asn321 during simulation, resulting in the formation of a cationic–pi interaction in addition to a weak cationic–pi interaction maintained with Arg320. Similarly, residues Thr172 and Gly171 of BCCP, which were initially H-bonded to Arg255 and Gln269 of BC, respectively, moved apart from their original positions and exhibited only weak electrostatic interactions with them. In addition, conformational alterations resulted in three new H-bonds in BCCP/BC that include Met169/Gln269, Met169/Gln272 and


Fig. 11 The conformational changes observed at the active site region of the systems: (a) BCCP–BC and (b) BCCP–CT complex. The amino acid residues are shown in stick form where pink, cyan, yellow and green represent CTα, BCCP, C subdomain and B subdomain of BC, respectively; the residues are colored in their correspondingly light and dark shades before and after simulation and labeled in red and black color, respectively. The H-bond, salt bridge, electrostatic and cationic–pi interactions are shown in black, red, pink and blue dashed lines, respectively

Pro168/Gln272 (Fig. 11a). The distance plot in Fig. 12a shows that after 8 ns of simulation, the above residues formed H-bonds and maintained these interactions until the end of the simulation. Similar variations in H-bond interactions were noticed within the binding site of BCCP and CT. The residue Asp224 of BCCP, which was initially H-bonded to Gln228 of CT, preserved this interaction throughout the simulation and formed two new H-bonds with Asn230 and Arg185, thus retaining the binding interface of the BCCP–CT complex. On the other hand, residue Gly187 of CT, which was H-bonded to Arg175 of BCCP before simulation, fluctuated towards Arg236, resulting in the formation of a new H-bond


Fig. 12 The distance plot of H-bond interactions observed between the residues of (a) BCCP and BC domains, and (b) BCCP and CT domains

(Fig. 11b). The distance plot of the four H-bonds in the BCCP–CT complex during the course of the simulation is shown in Fig. 12b. Altogether, the changes in H-bond interactions during simulation may be attributed to the significant conformational changes revealed by our PCA. Thus, the dynamics and mutations of the predicted inter-domain binding residues should be taken into account when engineering the ACCase enzyme.
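Distance plots of this kind can be summarized numerically. The Python sketch below, offered only as an illustration, takes a time series of donor–acceptor distances and reports the fraction of frames in which the pair lies within a cutoff, together with the frame from which the contact persists to the end of the series. The function names, the 3.5 Å cutoff and the distance values are hypothetical choices made for this example and are not taken from the simulations reported here.

```python
def hbond_occupancy(distances, cutoff=3.5):
    """Fraction of frames in which the distance is within `cutoff` (angstrom)."""
    if not distances:
        raise ValueError("empty distance series")
    return sum(d <= cutoff for d in distances) / len(distances)

def stable_from(distances, cutoff=3.5):
    """Index of the frame from which the contact persists to the end, or None."""
    start = None
    for i, d in enumerate(distances):
        if d <= cutoff:
            if start is None:
                start = i          # a candidate onset of the stable contact
        else:
            start = None           # contact broken, discard the candidate
    return start

# Hypothetical donor-acceptor distances (angstrom), one value per ns.
series = [6.1, 5.4, 4.8, 3.2, 4.0, 3.4, 3.1, 2.9, 3.0, 2.8]

occupancy = hbond_occupancy(series)
onset = stable_from(series)
print(occupancy, onset)   # prints: 0.6 5
```

A contact that forms late but then persists, as described for the new BCCP/BC H-bonds, would show a moderate occupancy together with a well-defined onset frame, whereas a transient contact would return no onset.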

Conclusion

In this study, homology modelling, docking and MD simulations were performed to identify the substrate binding sites and interacting residues that govern the specificity and molecular recognition between the BCCP domain and the other two catalytic domains, namely BC and CT, of the C. variabilis heteromeric ACCase enzyme. Docking studies indicated that electrostatic and hydrophilic interactions predominantly contribute to the formation of the BCCP–BC and BCCP–CT complexes, respectively. Further analysis showed non-overlapping and distinct BC and CT binding sites on BCCP. In the presence of BCCP, the BC and CT domains exhibited greater conformational flexibility and inter-domain movement, as revealed by MD simulations and PCA. The conserved residue pairs Glu201(BCCP)/Arg255(BC) and Asp224(BCCP)/Gln228(CT) were found to play a vital role in maintaining stability between the interacting domains. Overall, the present findings are in agreement with previous experimental reports and may have implications for engineering the ACCase enzyme for increased microalgal biofuel production.

Acknowledgements This work was partially funded by the Department of Biotechnology, Government of India. N.M. acknowledges the support of the Council for Scientific and Industrial Research, India for granting Senior Research Fellowship. Technical help rendered by Mr. Bikram Kumar Parida in preparation of the figures is gratefully acknowledged.

References

1. Podkowinski, J., & Tworak, A. (2011). BioTechnologia - Journal of Biotechnology, Computational Biology and Bionanotechnology, 92, 321–335.
2. Klaus, D., Ohlrogge, J. B., Neuhaus, H. E., & Dormann, P. (2004). Planta, 219, 389–396.
3. Roesler, K., Shintani, D., Savage, L., Boddupalli, S., & Ohlrogge, J. (1997). Plant Physiology, 113, 75–81.
4. Wan, M., Liu, P., Xia, J., Rosenberg, J. N., Oyler, G. A., Betenbaugh, M. J., et al. (2011). Applied Microbiology and Biotechnology, 91, 835–844.
5. Huerlimann, R., & Heimann, K. (2012). Critical Reviews in Biotechnology, 1–17.
6. Zhu, X. L., Yang, W. C., Yu, N. X., Yang, S. G., & Yang, G. F. (2011). Journal of Molecular Modeling, 17, 495–503.
7. Zhu, X. L., & Yang, G. F. (2012). Current Computer-Aided Drug Design, 8, 62–69.
8. Zhu, X. L., Zhang, L., Chen, Q., Wan, J., & Yang, G. F. (2006). Journal of Chemical Information and Modeling, 46, 1819–1826.
9. Zhu, X. L., Fei, H. G., Zhan, C. G., & Yang, G. F. (2009). Journal of Chemical Information and Modeling, 49, 1936–1943.
10. Tong, L. (2005). Cellular and Molecular Life Sciences, 62, 1784–1803.
11. Diacovich, L., Mitchell, D. L., Pham, H., Gago, G., Melgar, M. M., Khosla, C., et al. (2004). Biochemistry, 43, 14027–14036.
12. Athappilly, F. K., & Hendrickson, W. A. (1995). Structure, 3, 1407–1419.
13. Waldrop, G. L., Rayment, I., & Holden, H. M. (1994). Biochemistry, 33, 10249–10256.
14. Cho, C. Y., Yu, L. P., & Tong, L. (2009). Journal of Biological Chemistry, 284, 11690–11697.
15. Bilder, P., Lightle, S., Bainbridge, G., Ohren, J., Finzel, B., Sun, F., et al. (2006). Biochemistry, 45, 1712–1722.
16. Mochalkin, I., Miller, J. R., Evdokimov, A., Lightle, S., Yan, C., Stover, C. K., et al. (2008). Protein Science, 17, 1706–1718.
17. Polyak, S. W., Abell, A. D., Wilce, M. C. J., Zhang, L., & Booker, G. W. (2012). Applied Microbiology and Biotechnology, 93, 983–992.
18. Marti-Renom, M. A., Stuart, A. C., Fiser, A., Sanchez, R., Melo, F., & Sali, A. (2012). Annual Reviews of Biophysics and Biomolecular Structures, 29, 291–325.
19. Smith, G. R., & Sternberg, M. J. E. (2002). Current Opinion in Structural Biology, 12, 28–35.
20. Lietzan, A. D., Menefee, A. L., Zeczycki, T. N., Kumar, S., Attwood, P. V., Wallace, J. C., et al. (2011). Biochemistry, 50, 9708–9723.
21. Jitrapakdee, S., & Wallace, J. C. (2003). Current Protein & Peptide Science, 4, 217–229.
22. Misra, N., & Panda, P. K. (2013). OMICS: A Journal of Integrative Biology, 17, 173–186.
23. Baral, M., Misra, N., Panda, P. K., & Thirunavoukkarasu, M. (2012). Biotechnology and Biotechnological Equipment, 26, 2794–2800.
24. Misra, N., Patra, M. C., Panda, P. K., Sukla, L. B., & Mishra, B. K. (2013). Journal of Biomolecular Structure and Dynamics, 31, 241–257.
25. Blatti, J. L., Beld, J., Behnke, C. A., Mendez, M., Mayfield, S. P., & Burkart, M. D. (2012). PLoS One, 7, 1–12.
26. Radakovits, R., Jinkerson, R. E., Darzins, A., & Posewitz, M. C. (2010). Eukaryotic Cell, 9, 486–501.
27. Yu, W. L., Ansari, W., Schoepp, N. G., Hannon, M. J., Mayfield, S. P., & Burkart, M. D. (2011). Microbial Cell Factories, 10, 91–102.
28. Sali, A., & Blundell, T. L. (1993). Journal of Molecular Biology, 234, 779–815.
29. Katoh, K., Kuma, K., Toh, H., & Miyata, T. (2005). Nucleic Acids Research, 33, 511–518.
30. Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., & Kumar, S. (2011). Molecular Biology and Evolution, 28, 2731–2739.
31. Morris, G. M., Goodsell, D. S., Halliday, R. S., Huey, R., Hart, W. E., Belew, R. K., et al. (1998). Journal of Computational Chemistry, 19, 1639–1662.
32. Wang, Y., Bolton, E., Dracheva, S., Karapetyan, K., Shoemaker, B. A., Suzek, T. O., et al. (2010). Nucleic Acids Research, 38, D255–D266.
33. Schneidman-Duhovny, D., Inbar, Y., Nussinov, R., & Wolfson, H. J. (2005). Nucleic Acids Research, 33, 363–367.
34. Van Der Spoel, D., Lindahl, E., Hess, B., Groenhof, G., Mark, A. E., & Berendsen, H. J. (2005). Journal of Computational Chemistry, 26, 1701–1718.
35. Oostenbrink, C., Villa, A., Mark, A. E., & van Gunsteren, W. F. (2004). Journal of Computational Chemistry, 25, 1656–1676.
36. Berendsen, H. J. C., Postma, J. P. M., van Gunsteren, W. F., Dinola, A., & Haak, J. R. (1984). Journal of Chemical Physics, 81, 3684–3690.
37. Hess, B., Bekker, H., Berendsen, H. J. C., & Fraaije, J. G. E. M. (1997). Journal of Computational Chemistry, 18, 1463–1472.
38. Darden, T., York, D., & Pedersen, L. (1993). Journal of Chemical Physics, 98, 10089–10092.
39. Schuttelkopf, A. W., & van Aalten, D. M. (2004). Acta Crystallographica Section D: Biological Crystallography, 60, 1355–1363.
40. Humphrey, W., Dalke, A., & Schulten, K. (1996). Journal of Molecular Graphics, 14, 33–38.
41. Yang, L. W., Eyal, E., Bahar, I., & Kitao, A. (2009). Bioinformatics, 25, 606–614.
42. Lauria, A., Ippolito, M., & Almerico, A. M. (2009). Computational Biology and Chemistry, 33, 386–390.
43. Laskowski, R. A., MacArthur, M. W., Moss, D. S., & Thornton, J. M. (1993). Journal of Applied Crystallography, 26, 283–291.
44. Benkert, P., Kunzli, M., & Schwede, T. (2009). Nucleic Acids Research, 37, W510–W514.
45. Wiederstein, M., & Sippl, M. J. (2007). Nucleic Acids Research, 35, W407–W410.
46. Eisenberg, D., Luthy, R., & Bowie, J. U. (1997). Methods in Enzymology, 277, 396–404.
47. Colovos, C., & Yeates, T. O. (1993). Protein Science, 2, 1511–1519.
48. Benkert, P., Tosatto, S. C., & Schomburg, D. (2008). Proteins, 71, 261–277.
49. Galperin, M. Y., & Koonin, E. V. (1997). Protein Science, 6, 2639–2643.
50. Climent, I., & Rubio, V. (1986). Archives of Biochemistry and Biophysics, 251, 465–470.
51. Fan, C., Moews, P. C., Walsh, C. T., & Knox, J. R. (1994). Science, 266, 439–443.
52. Hara, T., Kato, H., Katsube, Y., & Oda, J. (1996). Biochemistry, 35, 11967–11974.
53. Thoden, J. B., Wesenberg, G., Raushel, F. M., & Holden, H. M. (1999). Biochemistry, 38, 2347–2357.
54. Thoden, J. B., Firestine, S., Nixon, A., Benkovic, S., & Holden, H. M. (2000). Biochemistry, 39, 8791–8802.
55. Thoden, J. B., Blanchard, C. Z., Holden, H. M., & Waldrop, G. L. (2000). Journal of Biological Chemistry, 275, 16183–16190.
56. Kondo, S., Nakajima, Y., Sugio, S., Yong-Biao, J., Sueda, S., & Kondo, H. (2004). Acta Crystallographica, D60, 486–492.
57. Post, L. E., Post, D. J., & Raushel, F. M. (1990). Journal of Biological Chemistry, 265, 7742–7747.
58. Reinstein, J., Brune, M., & Wittinghofer, A. (1988). Biochemistry, 27, 4712–4720.
59. Saraste, M., Sibbald, P. R., & Wittinghofer, A. (1990). Trends in Biochemical Sciences, 15, 430–434.
60. Cronan, J. E., & Waldrop, G. L. (2002). Progress in Lipid Research, 41, 407–435.
61. Samols, D., Thornton, C. G., Murtif, V. L., Kumar, G. K., Haase, F. C., & Wood, H. G. (1988). Journal of Biological Chemistry, 263, 6461–6464.
62. Toh, H., Kondo, H., & Tanabe, T. (1993). European Journal of Biochemistry, 215, 687–696.
63. Pal, D., & Chakrabarti, P. (2002). Biopolymers, 63, 195–206.
64. Gunasekaran, K., Ramakrishnan, C., & Balaram, P. (1996). Journal of Molecular Biology, 264, 191–198.
65. Thelen, J. J., Mekhedov, S., & Ohlrogge, J. B. (2001). Plant Physiology, 125, 2016–2028.
66. Fall, R. R., Glaser, M., & Vagelos, P. R. (1976). Journal of Biological Chemistry, 251, 2063–2069.
67. Holden, H. M., Benning, M. M., Haller, T., & Gerlt, J. A. (2001). Accounts of Chemical Research, 34, 145–157.
68. Kozaki, A., Mayumi, K., & Sasaki, Y. (2001). Journal of Biological Chemistry, 276, 39919–39925.
69. Amadei, A., Linssen, A. B. M., & Berendsen, H. J. C. (1993). Proteins: Structure Function, and Bioinformatics, 17, 412–425.
70. Garcia, A. E. (1992). Physical Review Letters, 68, 2696–2699.