Journal of Applied Bioinformatics & Computational Biology

Ambrose, J Appl Bioinforma Comput Biol 2018, 7:3 DOI: 10.4172/2329-9533.1000157

Research Article

In Silico Sequence Analysis, Homology Modeling and Function Annotation of Farnesyl Diphosphate Synthase (FDS) of Azadirachta indica

George Oche Ambrose1*, Olanrewaju John Afees2,3, Alakanse Suleiman Oluwaseun1, Chia Terkuma4,5, Mtomga Iorwuese6, Ayuba Abdullateef Kayode1, Balogun Basheer Ajibola1, Abiodun Wisdom Oshireku1, Afolayan Daniel Todimu1, Fagbemi Ranti-Ade Rebecca1 and Adekunle Precious1

1Department of Biochemistry, University of Ilorin, Ilorin, Nigeria

2Department of Anatomy, University of Ilorin, Ilorin, Nigeria

3Ben Carson School of Medicine, Ileshan-Remo, Nigeria

4Department of Anatomy, Nile University of Nigeria, Abuja

5Department of Anatomy, University of Nigeria, Enugu Campus

6Department of Anatomy, Ahmadu Bello University, Zaria, Nigeria

*Corresponding author: George Oche Ambrose, Department of Biochemistry, University of Ilorin, Ilorin, Nigeria; E-mail: [email protected]

Received Date: September 11, 2018; Accepted Date: October 16, 2018; Published Date: October 23, 2018

Abstract

Azadirachta indica, commonly referred to as Neem, is a member of the mahogany family, Meliaceae. It has enormous therapeutic benefits and shows antibacterial, antiviral, anti-inflammatory, antioxidant and anticarcinogenic activity. Hence, it is recognized as a tree for solving global health problems. In the present study, subcellular localization prediction reveals that farnesyl diphosphate synthase (FDS) of Azadirachta indica is a cytoplasmic protein. The 3D structure of the protein was predicted by homology modeling using SwissModel. Quality assessment indicated that the model is reliable. Furthermore, FDS was found to be involved in the following biological processes: metabolic process, cellular process, cellular metabolic process and primary metabolic process. The biochemical function of the protein involves transferase activity and metal ion binding.

Keywords: Neem; Sequence analysis; FDS; Homology modeling

Introduction

The World Health Organization (WHO) estimated that about 80% of people residing in developing countries depend exclusively on traditional medicine for their primary health care. Azadirachta indica, mainly cultivated in the Indian subcontinent, has been used extensively in the treatment of different diseases [1]. FDS is a key enzyme in isoprenoid biosynthesis. It provides sesquiterpene precursors for several classes of essential metabolites such as sterols, ubiquinones, carotenoids and dolichols, and it supplies substrates for farnesylation and geranylgeranylation [2]. FDS catalyzes the condensation of the diphosphates of the C5 alcohols (isopentenyl and dimethylallyl) to form the C10 and C15 diphosphates (geranyl and farnesyl) [3]. Farnesyl diphosphate serves as a substrate for dehydrodolichyl diphosphate synthase, whose products function as carriers of the sugars that form glycoproteins and glycolipids [4,5]. It also serves as a signal molecule that interacts with an orphan receptor [6]. FDS has been localized in the cytosol [7], mitochondria [8,9], chloroplasts [10] and peroxisomes [11]. Recent studies have shown that the sterol pathway, and FDS in particular, is a promising target for the development of new antiprotozoal drugs. FDS is the most likely molecular target of aminobisphosphonates (e.g. risedronate), a set of compounds that have been shown to have antiprotozoal activity [12]. This protein, together with other enzymes involved in isoprenoid biosynthesis, is an attractive drug target, yet little is known about the compartmentalization of the biosynthetic pathway, which has been predicted in this study.

The 3D structure information of FDS will help us to understand its function in the formation of geranyl and farnesyl diphosphate and the interaction of its domains with their ligands. A gap exists between the number of available protein sequences and the number of solved structures because experimental 3D structure determination requires X-ray crystallography or NMR spectroscopy, which are time-consuming, tedious and generate a large amount of data. In silico 3D structure prediction has bridged this gap. Computational study of biological sequences has become a very informative and highly interdisciplinary field of modern science in which statistical and algorithmic methods play a vital role [13]. In the present study we performed sequence analysis on Azadirachta indica FDS, including primary and secondary structure analysis. We also carried out homology modeling to predict the 3D structure of FDS and finally assessed the quality of the predicted model.

Figure 1: 3D structure of modeled FDS showing (a) Hydrogen bonds (b) Hydrophobic interaction.

All articles published in Journal of Applied Bioinformatics & Computational Biology are the property of SciTechnol and are protected by copyright laws. Copyright © 2018, SciTechnol, All Rights Reserved.

Citation: Ambrose GO, Afees OJ, Oluwaseun AS, Terkuma C, Iorwuese M, et al. (2018) In Silico Sequence Analysis, Homology Modeling and Function Annotation of Farnesyl Diphosphate Synthase (FDS) of Azadirachta indica. J Appl Bioinforma Comput Biol 7:3. doi: 10.4172/2329-9533.1000157

Volume 7 • Issue 3 • 1000157

Figure 2: Z-score of (a) query and (b) template protein using ProSA-web.

Figure 3: Ramachandran plot using Procheck.

Materials and Methods

Protein accession and sequence analysis

The protein sequence of FDS was retrieved from the NCBI database using the accession number AIG15449.1. Physicochemical properties of FDS were computed with the GPMAW tool (www.alphalyse.com/). The parameters evaluated by GPMAW included the molecular weight, amino acid composition, hydrophobicity index, isoelectric point, molar extinction coefficient (280 nm), molar absorbance and monoisotopic mass. Because the subcellular localization of a protein is crucial to understanding its function, the subcellular localization of FDS was predicted with CELLO v.2.5 [14,15].

Secondary structure prediction

The secondary structure of the FDS sequence was predicted using GOR IV and SOPMA. The GOR IV method is based on information theory and was developed by J. Garnier, D. Osguthorpe and B. Robson [16]. PredictProtein [17] was also employed to calculate and analyze the secondary structure features of the FDS sequence.

3D structure prediction using homology approach

In silico prediction of the 3D structure of a protein is based on threading, ab initio and homology modeling approaches. In this study, homology modeling was considered suitable for 3D structure prediction only if the target sequence shared more than 60% sequence similarity with a template. Templates were searched in the Swiss-Model (an online tool for 3D structure prediction) template library using the BLAST and HHblits algorithms. Based on maximum identity and GMQE values, the best template, PDB ID 4kk2.1.A with an identity of 79.41%, was selected from a total of 50 templates. Homology modeling was then carried out using this template, whose structure has been solved by X-ray diffraction.
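The template selection rule described above (discard templates below the similarity cutoff, then prefer the highest identity and GMQE) can be sketched as follows. This is only an illustration under stated assumptions: the `Template` class and `select_template` function are hypothetical names, identity is used as a stand-in for the similarity cutoff, and apart from the identity of 4kk2.1.A (79.41%) every value in the candidate list is a made-up placeholder, since the full list of 50 templates is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Template:
    template_id: str   # SWISS-MODEL style template identifier
    identity: float    # percent sequence identity to the target
    gmqe: float        # Global Model Quality Estimate, between 0 and 1


def select_template(candidates: List[Template],
                    min_identity: float = 60.0) -> Optional[Template]:
    """Return the candidate with the highest identity (ties broken by GMQE),
    or None when no candidate exceeds the identity cutoff."""
    eligible = [t for t in candidates if t.identity > min_identity]
    if not eligible:
        return None
    return max(eligible, key=lambda t: (t.identity, t.gmqe))


# Only the identity of 4kk2.1.A comes from the text; its GMQE and the two
# other records are hypothetical placeholders for illustration.
candidates = [
    Template("4kk2.1.A", 79.41, 0.85),
    Template("hypothetical_1", 74.20, 0.88),
    Template("hypothetical_2", 55.00, 0.70),
]

best = select_template(candidates)
print(best)
```

Ranking on the tuple `(identity, gmqe)` makes identity the primary criterion; a different weighting of the two scores would be an equally plausible reading of the selection rule.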

Quality and reliability assessments

After the 3D model of FDS was generated, energy minimization was performed with the GROMOS96 force field in Swiss-PdbViewer. Stereochemical analysis and structural evaluation were carried out using the Procheck Ramachandran plot and ProSA-web Z-scores [18].


Visualization of the generated model was carried out using Discovery Studio Visualizer.

Function annotations of the protein

The function of FDS was annotated using ProFunc. To find conserved domains in the protein and thereby identify its family, the sequence was searched against close orthologous family members using the NCBI Conserved Domain Database [19].

Figure 4: Overall quality factor checked by ERRAT.

Submission of the model in protein model database (PMDB)

The model generated for the Azadirachta indica FDS protein was successfully submitted to the protein model database (PMDB) [20] under PMDB ID: PM0081789.

Results and Discussion

Protein sequence analysis

GPMAW was used to calculate the physicochemical properties from the protein sequence. FDS was predicted to have 342 amino acid residues, a molecular weight of 39,570.32 Daltons and an isoelectric point of 5.67. An isoelectric point below 7 indicates an acidic protein that carries a net negative charge at neutral pH. The hydrophobic index of a protein is a number representing the hydrophobic properties of its amino acid residues [21]. The negative hydrophobic index of FDS (-0.32) indicates that the protein is hydrophilic and soluble. Leucine and glutamic acid were among the most abundant residues; the high content of charged residues such as glutamic acid is consistent with the hydrophilic nature of the protein and its favorable interaction with water. The most abundant residue in the FDS sequence is leucine (11.99%), and the least abundant are tryptophan and methionine (1.46% each). The comparative composition of the query and template sequences is shown in Table 1.

Functions within the cell are often localized in specific cellular compartments; therefore, predicting the subcellular localization of unknown proteins can provide information about their functions and can be useful in understanding the mechanisms of disease and in developing drugs. CELLO predicted that FDS is a cytoplasmic protein.

The secondary structure of FDS was predicted using SOPMA and GOR IV. The results reveal that the sequence is composed mainly of alpha helices, with random coils and extended (beta) strands making up the remainder. Table 2 shows a comparative analysis by GOR IV and SOPMA. FDS is predicted to consist chiefly of alpha helix (64.62% SOPMA; 41.81% GOR IV) and random coil (27.19% SOPMA; 39.18% GOR IV).
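The hydrophobic index discussed above can be cross-checked from the amino acid composition alone. The sketch below assumes that the index is a grand average of hydropathy computed with the Kyte and Doolittle scale [21]; how GPMAW computes its value internally is not stated in the text, so this is an assumption. The residue counts are the FDS column of Table 1, and `gravy_from_counts` is a hypothetical helper name.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter residue code.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# FDS residue counts as listed in Table 1 (342 residues in total).
FDS_COUNTS = {
    "A": 21, "C": 7, "D": 23, "E": 30, "F": 14, "G": 17, "H": 9,
    "I": 16, "K": 30, "L": 41, "M": 5, "N": 14, "P": 8, "Q": 11,
    "R": 14, "S": 19, "T": 11, "V": 27, "W": 5, "Y": 20,
}


def gravy_from_counts(counts):
    """Grand average of hydropathy: count-weighted mean of the scale values."""
    total = sum(counts.values())
    weighted = sum(KYTE_DOOLITTLE[aa] * n for aa, n in counts.items())
    return weighted / total


gravy = gravy_from_counts(FDS_COUNTS)
print(f"Residues: {sum(FDS_COUNTS.values())}, GRAVY: {gravy:.2f}")
```

Under these assumptions the composition-based value rounds to -0.32, in line with the hydrophobic index reported for FDS. Because the average depends only on composition, the sequence itself is not needed for this check.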


Figure 5: 3D structure of FDS protein.

3D Structure Prediction using Homology Modeling Approach

The 3D structure of a protein is very important for understanding its function, localization and interactions. The most common structure prediction method is homology modeling. The tertiary (3D) structure of FDS was predicted using Swiss Model, with an initial search for solved templates with similar sequences. Table 3 gives the top five templates, which were aligned with the target amino acid sequence. Templates with the best E-value, the maximum query sequence coverage and the highest percentage similarity were considered for homology modeling. 4kk2.1.A, the X-ray crystal structure of the Artemisia spiciformis chimeric FPP/GFPP synthase, was selected. The 3D structure of the modeled FDS showed hydrogen bonds and hydrophobic interactions between its amino acid residues (Figure 1).

Z-score and Ramachandran plot, among several assessment methods, were used to check the quality and reliability of the model. The Z-score is indicative of overall model quality and is used to check whether the input structure is within the range of scores typically found for native proteins of similar size [22]. ProSA-web was used to obtain the Z-score of both template and query. The Z-score of the query protein was -10.81 and that of the template was -10.56 (Figure 2). The stereochemical quality of the model was checked with the Procheck server, which analyzes residue-by-residue geometry and overall structure geometry and produces a Ramachandran plot. The Ramachandran plot showed 94.1% of residues in the most favorable region (Figure 3). Because a model with more than 90% of its residues in the most favorable region is considered to be of good quality, this indicates a reliable model. Furthermore, the reliability of the model was assessed with ERRAT, which analyzes the statistics of non-bonded interactions between different atom types and plots the value of the error function versus the position of a 9-residue sliding window [22]. ERRAT gave an overall quality factor of 99.094 (Figure 4). The obtained 3D structure of the FDS protein was visualized in Discovery Studio Visualizer (Figure 5).
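A Ramachandran plot places each residue according to its two backbone dihedral angles, phi and psi. The following minimal sketch shows how a signed dihedral angle can be computed from four atom positions. It is not Procheck's implementation; the function name `dihedral_deg` and the toy coordinates are assumptions made for illustration and are not taken from the FDS model.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond.

    For residue i, phi uses the atoms C(i-1), N(i), CA(i), C(i) and
    psi uses N(i), CA(i), C(i), N(i+1).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Toy coordinates (not from the FDS model): a right-angle twist and a trans arrangement.
right_angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(right_angle, trans)
```

Applying such a function to every residue of a model yields the (phi, psi) pairs whose distribution over favored and disallowed regions is summarized by the percentage quoted above.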
Name             3-Letter   FDS    4kk2
Alanine          Ala        21     17
Cysteine         Cys        7      5
Aspartic Acid    Asp        23     25
Glutamic Acid    Glu        30     25
Phenylalanine    Phe        14     16
Glycine          Gly        17     21
Histidine        His        9      16
Isoleucine       Ile        16     18
Lysine           Lys        30     28
Leucine          Leu        41     39
Methionine       Met        5      10
Asparagine       Asn        14     10
Proline          Pro        8      11
Glutamine        Gln        11     15
Arginine         Arg        14     13
Serine           Ser        19     26
Threonine        Thr        11     17
Valine           Val        27     30
Tryptophan       Trp        5      5
Tyrosine         Tyr        20     19
Total residues              342    366

Table 1: Comparative amino acid composition of query and template proteins.
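The percentages quoted in the text follow directly from the counts in Table 1. The short sketch below recomputes them; the counts are copied from the table, while the helper name `percent_composition` is a hypothetical choice for this illustration.

```python
# Residue counts from Table 1: (FDS query, 4kk2 template).
TABLE1 = {
    "Ala": (21, 17), "Cys": (7, 5), "Asp": (23, 25), "Glu": (30, 25),
    "Phe": (14, 16), "Gly": (17, 21), "His": (9, 16), "Ile": (16, 18),
    "Lys": (30, 28), "Leu": (41, 39), "Met": (5, 10), "Asn": (14, 10),
    "Pro": (8, 11), "Gln": (11, 15), "Arg": (14, 13), "Ser": (19, 26),
    "Thr": (11, 17), "Val": (27, 30), "Trp": (5, 5), "Tyr": (20, 19),
}


def percent_composition(counts):
    """Convert residue counts into percentages of the sequence length."""
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


fds_counts = {aa: pair[0] for aa, pair in TABLE1.items()}
template_counts = {aa: pair[1] for aa, pair in TABLE1.items()}
fds_pct = percent_composition(fds_counts)

most_abundant = max(fds_counts, key=fds_counts.get)
lowest = min(fds_counts.values())
least_abundant = sorted(aa for aa, n in fds_counts.items() if n == lowest)

print(most_abundant, round(fds_pct[most_abundant], 2))
print(least_abundant, round(100.0 * lowest / sum(fds_counts.values()), 2))
```

The column sums also reproduce the totals in the last row of the table (342 residues for FDS and 366 for the template), which is a quick consistency check on the reconstructed counts.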

Function annotation of the protein

ProFunc was used to hypothetically annotate the function of the Azadirachta indica FDS protein. The protein was found to be involved in four major biological processes: metabolic process, cellular process, cellular metabolic process and primary metabolic process. The biochemical functions of the protein include inositol pentakisphosphate 2-kinase activity, endopeptidase activity, dimethylallyltranstransferase activity, transferase activity and metal ion binding activity. The function of the modeled FDS protein was investigated further by identifying its family through a search of the NCBI Conserved Domain Database (NCBI CDD) for conserved domains. The result revealed that the Azadirachta indica FDS protein has a polyprenyl_synt domain and belongs to the polyprenyl synthetase family.

Secondary structure   GOR IV    SOPMA
Alpha helix           41.81%    64.62%
3-10 helix            0.00%     0.00%
Pi helix              0.00%     0.00%
Beta bridge           0.00%     0.00%
Extended strand       19.01%    5.56%
Beta turn             0.00%     2.63%
Bend region           0.00%     0.00%
Random coil           39.18%    27.19%
Ambiguous states      0.00%     0.00%
Other states          0.00%     0.00%
Sequence length       342       342

Table 2: Secondary structure prediction of FDS protein by GOR IV and SOPMA.
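The percentages in Table 2 can be converted back into approximate residue counts for the 342-residue sequence, which makes the difference between the two predictors easier to see. The sketch below uses only the non-zero rows of the table; `residue_counts` is a hypothetical helper, and rounding to the nearest integer is an assumption about how the published percentages were derived.

```python
SEQUENCE_LENGTH = 342

# Non-zero rows of Table 2, in percent of the sequence.
TABLE2 = {
    "GOR IV": {"Alpha helix": 41.81, "Extended strand": 19.01,
               "Random coil": 39.18},
    "SOPMA": {"Alpha helix": 64.62, "Extended strand": 5.56,
              "Beta turn": 2.63, "Random coil": 27.19},
}


def residue_counts(percentages, length):
    """Approximate number of residues assigned to each secondary structure state."""
    return {state: round(pct * length / 100.0)
            for state, pct in percentages.items()}


counts = {method: residue_counts(pcts, SEQUENCE_LENGTH)
          for method, pcts in TABLE2.items()}

for method, per_state in counts.items():
    print(method, per_state, "total:", sum(per_state.values()))
```

Under this rounding assumption both columns account for all 342 residues, and the two methods differ mainly in how many residues they assign to alpha helix rather than to extended strand or random coil.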

Submission of the model in protein model database (PMDB)

The predicted structure of the protein was submitted to the protein model database (PMDB) and can be found using PMDB ID: PM0081789.

Conclusion

The major aim of this study was to perform sequence analysis, structure analysis and homology modeling on Azadirachta indica FDS. We adopted various sequence and structure analysis tools that were useful in understanding the sequence and its structure. ProFunc was used to functionally annotate the protein, while NCBI CDD was used to search for conserved domains of the protein. Finally, we used a homology modeling approach to propose the first 3D structure of the Azadirachta indica FDS protein. The predicted 3D structure will give more insight into the function and structure of the protein. In addition, this structure can be used in drug design or in understanding interactions between proteins.

References

1. Kumar VS, Navaratnam V (2013) Neem (Azadirachta indica): Prehistory to contemporary medicinal uses to humankind. Asian Pac J Trop Biomed 3: 505-514.
2. Szkopińska A, Plochocka D (2005) Farnesyl diphosphate synthase; regulation of product specificity. Acta Biochim Pol 52: 45-55.
3. Gabelli SB, McLellan JS, Montalvetti A, Oldfield E, Docampo R, et al. (2006) Structure and mechanism of the farnesyl diphosphate synthase from Trypanosoma cruzi: Implications for drug design. Proteins: Struct, Funct, Bioinf 62: 80-88.
4. Correll CC, Ng L, Edwards PA (1994) Identification of farnesol as the non-sterol derivative of mevalonic acid required for the accelerated degradation of 3-hydroxy-3-methylglutaryl-coenzyme A reductase. J Biol Chem 269: 17390-17393.
5. Shearer AG, Hampton RY (2005) Lipid-mediated, reversible misfolding of a sterol-sensing domain protein. EMBO J 24: 149-159.
6. Shivdasani RA, Rosenblatt MF, Zucker-Franklin D, Jackson CW, Hunt P, et al. (1995) Transcription factor NF-E2 is required for platelet formation independent of the actions of thrombopoietin/MGDF in megakaryocyte development. Cell 81: 695-704.
7. Grünler J, Ericsson J, Dallner G (1994) Branch-point reactions in the biosynthesis of cholesterol, dolichol, ubiquinone and prenylated proteins. Biochim Biophys Acta 1212: 259-277.
8. Runquist M, Ericsson J, Thelin A, Chojnacki T, Dallner G (1994) Isoprenoid biosynthesis in rat liver mitochondria. Studies on farnesyl pyrophosphate synthase and trans-prenyltransferase. J Biol Chem 269: 5804-5809.
9. Cunillera N, Arró M, Delourme D, Karst F, Boronat A, et al. (1996) Arabidopsis thaliana contains two differentially expressed farnesyl-diphosphate synthase genes. J Biol Chem 271: 7774-7780.
10. Sanmiya K, Ueno O, Matsuoka M, Yamamoto N (1999) Localization of farnesyl diphosphate synthase in chloroplasts. Plant Cell Physiol 40: 348-354.
11. Olivier LM, Krisans SK (2000) Peroxisomal protein targeting and identification of peroxisomal targeting signals in cholesterol biosynthetic enzymes. Biochim Biophys Acta 1529: 89-102.
12. Ortiz-Gómez A, Jiménez C, Estévez AM, Carrero-Lérida J, Ruiz-Pérez LM, et al. (2006) Farnesyl diphosphate synthase is a cytosolic enzyme in Leishmania major promastigotes and its overexpression confers resistance to risedronate. Eukaryotic Cell 5: 1057-1064.
13. Giancarlo R, Siragusa A, Siragusa E, Utro F (2007) A basic analysis toolkit for biological sequences. Algorithm Mol Biol 2: 10.
14. Yu CS, Lin CJ, Hwang JK (2004) Predicting subcellular localization of proteins for Gram-negative bacteria by support vector machines based on n-peptide compositions. Protein Sci 13: 1402-1406.
15. Yu CS, Chen YC, Lu CH, Hwang JK (2006) Prediction of protein subcellular localization. Proteins: Struct, Funct, Bioinf 64: 643-651.
16. Cuff JA, Clamp ME, Siddiqui AS, Finlay M, Barton GJ (1998) JPred: A consensus secondary structure prediction server. Bioinformatics 14: 892-893.
17. Rost B, Yachdav G, Liu J (2004) The PredictProtein server. Nucleic Acids Res 32: W321-W326.
18. Wiederstein M, Sippl MJ (2007) ProSA-web: Interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Res 35: W407-W410.
19. Marchler-Bauer A, Bryant SH (2004) CD-Search: Protein domain annotations on the fly. Nucleic Acids Res 32: W327-W331.
20. Arnold K, Kiefer F, Kopp J, Battey JN, Podvinec M, et al. (2009) The protein model portal. J Struct Funct Genomics 10: 1-8.
21. Kyte J, Doolittle RF (1983) A simple method for displaying the hydropathic character of a protein. J Mol Biol 157: 105-132.
22. Idrees S, Nadeem S, Kanwal S, Ehsan B, Yousaf A, et al. (2012) In silico sequence analysis, homology modeling and function annotation of Ocimum basilicum hypothetical protein G1CT28_OCIBA. Int J Bioautomat 16: 111-118.
