Hyperlipidemia, Disease Associations and Top 10 Potential Drug Targets: A Network View
Sneha Rai and Sonika Bhatnagar*
Computational and Structural Biology Laboratory, Division of Biotechnology, Netaji Subhas Institute of Technology, Dwarka, New Delhi 110078, India.

Corresponding author: *Dr. Sonika Bhatnagar, Computational and Structural Biology Laboratory, Division of Biotechnology, Netaji Subhas Institute of Technology, New Delhi 110078, India. Phone: +91-11-25099027, Fax: +91-11-25099022. E-mail: [email protected] (S. Bhatnagar), [email protected]

Abstract: Owing to sedentary lifestyles and lipid-rich diets, the prevalence of acquired hyperlipidemia (AH) has increased. In this work, a lipid protein-protein interaction network (LPPIN) was prepared for AH by incorporating differentially expressed genes in obese fatty liver as seed nodes, protein interactions from PathwayLinker and lipid interactions from STITCH 4.0. Pathway and disease cluster analysis was performed using KOBAS 2.0. Topological analysis was carried out using Cytoscape 3.2.0. Cholesterol, diacylglycerol (DAG), phosphatidylinositol bis-phosphate (PIP2) and inositol triphosphate (IP3) were identified as core lipids that influence the signaling pathways in the LPPIN. RAC serine/threonine-protein kinase (AKT1) was a highly essential central protein. The Gastrin-CREB pathway was highly enriched, and all enriched pathways in the LPPIN showed crosstalk with the phosphatidylinositol 3-kinase (PI3K)-Akt pathway, correlating with the central role of AKT1 in the network. The disease clusters identified in the LPPIN were cardiovascular disease (CVD), cancer, Alzheimer’s disease (AD) and Type II diabetes (T2D). Commercially approved drug targets for hyperlipidemia in each disease cluster may be repurposed for treatment of the specific disease. We also report the top 10 potential drug targets that control progression from hyperlipidemia to the respective disease state. ToppGene Suite was employed to identify candidates, followed by a) discarding nodes with high closeness centrality and b) selecting nodes with high bridging centrality (BrC). Three potential targets could be mapped to specific disease clusters in the LPPIN. Lipids associated with AH and with each disease cluster identified may be useful as prognostic fingerprints. Our work provides an integrative view of lipid-protein interactions leading to AH and its associated diseases.
Keywords: Acquired hyperlipidemia; Lipid protein interaction network; Hyperlipidemia associated diseases; Drug targets; Fingerprint

Abbreviations: CVD, cardiovascular diseases; AH, acquired hyperlipidemia; CE, cholesteryl ester; SM, sphingomyelin; PC, phosphatidylcholine; lPC, lysophosphatidylcholine; lPE, lysophosphatidylethanolamine; lPS, lysophosphatidylserine; PS, phosphatidylserine; TAG, triacylglycerol; DAG, diacylglycerol; DEG, differentially expressed genes; LPPIN, lipid protein-protein interaction network; FA, fatty acyl; GL, glycerolipids; GPL, glycerophospholipids; SL, sphingolipids; PPIN, protein-protein interaction network; BC, betweenness centrality; CC, closeness centrality; BrC, bridging centrality; PI, phosphatidylinositol; AA, arachidonic acid; FcεR, Fc epsilon receptor; NGF, nerve growth factor; TLR, Toll-like receptor; FcγR, Fc gamma receptor; AD, Alzheimer’s disease; CREB, cAMP response element binding protein; PPAR, peroxisome proliferator-activated receptor; PIP2, phosphatidylinositol bis-phosphate; IP3, inositol triphosphate; PI3K, phosphatidylinositol 3-kinase; Akt, RAC-alpha serine/threonine-protein kinase; T2D, Type 2 diabetes; FFA, free fatty acids; PI-3,4,5-P(3), phosphatidylinositol-3,4,5-triphosphate; PI-3,4-P(2), phosphatidylinositol-3,4-bisphosphate; PPAR-γ, peroxisome proliferator-activated receptor gamma; PCSK9, proprotein convertase subtilisin kexin type 9; HMGCR, 3-hydroxy-3-methyl-glutaryl-CoA reductase; MTTP, microsomal triglyceride transport protein; PPARα, peroxisome proliferator-activated receptor α; NPC1L1, Niemann-Pick C1-like 1; ApoB-100, apolipoprotein B-100; COTL1, coactosin-like binding protein 1; VASP, vasodilator-stimulated phosphoprotein; HHAT, hedgehog acyltransferase

Introduction
Hyperlipidemia is a well-known and widely accepted risk factor for the advancement of atherosclerosis, which ultimately leads to cardiovascular disease (CVD) (Lewandowski et al., 2011; Kundumani-Sridharan et al., 2013; Navar-Boggan et al., 2015). Studies of coronary artery disease risk factors in India showed that 45.6% of individuals suffered from dyslipidemia (Sekhri et al., 2014). A rise in the level of dietary lipids and cholesterol is also associated with the development of certain types of cancer (Sako et al., 2004; Habis et al., 2014; McDonnell et al., 2014). The hyperlipidemia-associated increase in plasma lipid concentration may arise from either familial or acquired causes. In either form, hyperlipidemia is linked to an increased incidence of premature atherosclerosis, pancreatitis and chylomicronemia syndrome (Cox et al., 1990; Stang et al., 2005; Xu et al., 2015). The primary form of hyperlipidemia occurs mainly due to genetic defects in lipid metabolism (De Castro-Oros et al., 2010). The common causes of acquired hyperlipidemia (AH) include a lipid-rich or otherwise unhealthy diet, obesity, glucose intolerance, hypothyroidism, liver disease, cigarette smoking and alcohol consumption. Increasing urbanization is accompanied by increased consumption of lipid- and energy-rich food and by decreased physical activity, leading to AH (Yusuf et al., 2001; Evans et al., 2004). The liver plays an important role in lipid metabolism (Nguyen et al., 2008), and fatty liver is an important player in the pathogenesis of hyperlipidemia, with a strong correlation between liver fat, insulin resistance and overproduction of large, triglyceride-laden VLDL particles (Stahlman et al., 2012). Elevated lipid levels are the key players in the pathogenesis of hyperlipidemia. Therefore, lipidomics techniques are employed to determine the types and levels of lipids under physiological and pathological conditions.
These studies have revealed that human plasma contains distinct lipid species belonging to six broad lipid classes, namely fatty acyls (FA), glycerolipids (GL), glycerophospholipids (GPL), sphingolipids (SL), sterols and prenols (Quehenberger et al., 2010). GL like diacylglycerols (DAG) and phosphatidylcholine (PC) are elevated in dyslipidemic individuals. Cholesterol-derived cholesteryl esters (CE) are also increased during lipid imbalance. Similarly, palmitic and vaccenic acid contribute to increased levels of triacylglycerol (TAG) (Stahlman et al., 2012). Apart from their role in energy metabolism and cell membrane structure, lipids (e.g. phosphoinositides, eicosanoids, SL and fatty acids) affect diverse cellular processes like apoptosis, cell proliferation, metabolism and
migration (Wymann et al., 2008). Abnormalities in either lipid levels or lipid signaling cascades are an important component of the pathophysiology of a number of diseases (Watson, 2006; Chen et al., 2015; Feng et al., 2015; Henk et al., 2015; Schmitz et al., 2015). The effect of lipids on the formation, stability and disruption of atherosclerotic plaque is also well established (Meikle et al., 2011). In atherosclerotic plaque, nearly 24 different lipid species belonging to the CE, sphingomyelin (SM), PC, lysophosphatidylcholine (lPC), lysophosphatidylethanolamine (lPE), phosphatidylethanolamine (PE), lysophosphatidylserine (lPS), phosphatidylserine (PS) and TAG classes were identified (Stegemann et al., 2011). Microarray studies conducted on the familial form of hyperlipidemia showed that genes involved in the uptake, synthesis, cellular outflow and disposal of cholesterol were differentially expressed. Increased lipid accumulation in monocytes, due to a rise in the uptake of oxidized or native LDL, was observed in these studies (Watson, 2006; Mosig et al., 2008). However, to date, studies on AH have been carried out in animal models only (Puskas et al., 2004; Kim et al., 2005; Takahashi et al., 2012). Several systems biology studies integrating the interplay between small molecules and proteins have played a crucial role in the study of drug treatment, signal transduction and metabolism (Butcher et al., 2004; Schadt et al., 2006; Cheng et al., 2012; Bhattacharya et al., 2013). However, no study to date has integrated differentially expressed genes with protein-lipid interactions to provide a combined view of the protein-protein and protein-lipid interactions in AH. We therefore combined publicly available differentially expressed gene (DEG) data from human fatty liver with protein-lipid interaction information to develop a lipid protein-protein interaction network (LPPIN) of AH by two-tier data integration.
Further, topological and pathway analysis of the LPPIN was performed to identify choke points, enriched pathways and central effectors in AH. The Gastrin-CREB (cAMP response element binding protein) pathway was the most significantly overrepresented in the LPPIN. CVD, T2D, AD and cancer disease gene clusters figured prominently in the LPPIN. Critical nodes for the development of AH-based complications and candidate drug targets were identified in each disease cluster. The LPPIN provides a comprehensive model for the effect of lipids in health and disease. Further, several approved drug targets for the treatment of AH could be mapped onto every disease cluster identified, which has implications for the polypharmacology of AH-induced disorders. Candidate drug targets were mined from each disease cluster. Seven common lipid species associated with AH and unique lipids in each disease cluster may act as useful markers for the development of AH and its associated diseases.
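
The two-tier data integration outlined above can be pictured with a minimal Python sketch. It is an illustration only and not the pipeline used in this work: the function name `build_lppin`, the node names and every edge are hypothetical placeholders, and the rule of keeping only direct interaction partners of the seed nodes is an assumption made for the example.

```python
from collections import defaultdict


def build_lppin(seeds, ppi_edges, lipid_edges):
    """Assemble a toy lipid protein-protein interaction network in two tiers."""
    graph = defaultdict(set)   # node -> set of neighbouring nodes
    kind = {}                  # node -> "protein" or "lipid"

    # Every seed (a differentially expressed gene product) enters as a protein node.
    for seed in seeds:
        graph[seed]            # touching the key creates an empty neighbour set
        kind[seed] = "protein"

    # Tier 1: keep protein-protein edges that touch at least one seed.
    # ASSUMPTION: only first neighbours of the seeds are retained.
    for a, b in ppi_edges:
        if a in seeds or b in seeds:
            graph[a].add(b)
            graph[b].add(a)
            kind[a] = kind[b] = "protein"

    # Tier 2: attach lipids that interact with a protein already in the network.
    proteins = set(kind)       # snapshot taken before any lipid is added
    for lipid, protein in lipid_edges:
        if protein in proteins:
            graph[lipid].add(protein)
            graph[protein].add(lipid)
            kind[lipid] = "lipid"

    return dict(graph), kind


# Hypothetical placeholder data; none of these names or edges come from the study.
seeds = {"SEED1", "SEED2"}
ppi_edges = [("SEED1", "P3"), ("P3", "P4"), ("SEED2", "P5")]
lipid_edges = [("LIPID_X", "P3"), ("LIPID_Y", "P9"), ("LIPID_X", "SEED2")]

graph, kind = build_lppin(seeds, ppi_edges, lipid_edges)
for node in sorted(graph):
    print(node, kind[node], sorted(graph[node]))
```

In this toy input the edge between P3 and P4 is dropped because neither end is a seed, and LIPID_Y is dropped because its partner P9 never entered the protein tier. A real integration would follow whatever neighbourhood rule the interaction sources apply.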

Materials & Methods
Data mining
Seed nodes were mined and data were integrated from three sources: a) The expression profile GSE15653 (Pihlajamaki et al., 2009), consisting of fatty liver samples obtained from 5 lean controls and 4 obese test subjects, was downloaded from the Gene Expression Omnibus (GEO) database. DEGs were identified by comparing the lean and obese groups using GEO2R, which performs comparisons on the original submitter-supplied processed data tables using the GEOquery and limma R packages from the Bioconductor project (http://www.bioconductor.org). P-values were adjusted for multiple testing using the Benjamini-Hochberg false discovery rate method. The DEGs were selected based on cut-off value, |logFC| >1.5 and P-value 2 were taken to be significant.
The following parameters of centrality were computed: The degree of a node is the number of other nodes connected to it. Nodes with large degree are called hub nodes and have crucial regulatory functions in the cell. The betweenness centrality (BC) of a node is a measure of the number of shortest paths passing through that node. A node with a high BC value controls the flow of information through the network and is called a bottleneck node. Nodes having high degree and BC values are significant in scale-free biological networks and are termed Hub-Bottlenecks (HBNs). These nodes are likely to be essential to the system (Hahn et al., 2005; Joy et al., 2005) and often coincide with high-degree hub nodes (Goh et al., 2003). Closeness centrality (CC) analysis selects nodes having the minimum average distance to all other nodes in the network. CC evaluates the probability that a node (protein or lipid) is functionally relevant to several other nodes (proteins or lipids) in the network and is used to determine the core central nodes of the network (da Silva et al., 2008; Scardoni et al., 2009). Nodes possessing high CC have been used to identify central metabolites in a large metabolic network
(Ma et al., 2003). Usually, the essential nodes of the network are centrally located and have high degree, CC and BC (Hahn et al., 2005). BrC is a new centrality measure that is used to identify nodes that are situated between highly connected modules and thus transduce a large amount of information. The BrC of a node is calculated as the product of its BC and its bridging coefficient, where the bridging coefficient measures how well the node is located between highly connected components of the network (Hwang et al., 2008). Bridging nodes are noticeably different from nodes having high degree and BC, as they are less lethal and are regulated independently. Owing to their importance in information flow combined with low lethality, bridging nodes are good candidates for drug targets, especially in the human system (Hopkins, 2008).
Functional annotation and disease gene identification
In order to identify overrepresented pathways, processes and disease genes in the lipid-protein network, the KOBAS 2.0 (http://kobas.cbi.pku.edu.cn) (Xie et al., 2011) web server was employed. For functional annotation, the entire list of network proteins was submitted with the following parameters: annotate and identify, Homo sapiens as the species background, and all databases enabled. Statistically significant hits were identified by applying the hypergeometric test with Benjamini-Hochberg FDR correction and a P-value ≤ 0.05. The KOBAS 2.0 server integrates information for 1327 species from 5 pathway databases (PID, Reactome, KEGG PATHWAY, Panther and BioCyc) and utilizes information from OMIM, KEGG DISEASE, GAD, FunDO and the NHGRI GWAS Catalog for human disease annotation.
Disease cluster analysis
The core disease genes (P-value