    Application of Chemical Proteomics to Biomarker Discovery in Cardiac Research


  Application of Chemical Proteomics to Biomarker Identification in Cardiac Research (with a summary in Dutch)

Thesis submitted to obtain the degree of doctor at Utrecht University (Universiteit Utrecht), on the authority of the rector magnificus, prof. dr. J.C. Stoof, in accordance with the decision of the Board for the Conferral of Doctoral Degrees, to be defended in public on Wednesday 15 September 2010 at 4.15 pm

 

by Thin Thin Aye, born on 2 June 1978 in Yangon, Myanmar

Supervisor (promotor): Prof. dr. A.J.R. Heck
Co-supervisor (co-promotor): Dr. A. Scholten

This thesis is dedicated to my beloved (late) father who supported me.

The cover picture was kindly provided by Dr. Arjen Scholten.

Table of Contents

Chapter 1 Proteomics in cardiovascular research
1.1 Protein profiling
1.2 Quantitative proteomics applied in cardiovascular research
1.3 Functional proteomics approach
1.4 Proteomics and biomarker discovery

Chapter 2 A multi-angular mass spectrometric view at cyclic nucleotide dependent protein kinases: in-vivo characterization and structure/function relationships

Chapter 3 Selectivity in enrichment of PKA RI and RII isoforms and their interactors using modified cAMP affinity resins

Chapter 4 Proteome-wide protein concentrations in the human heart

Chapter 5 Reorganized PKA-AKAP Associations in the Failing Human Heart

Chapter 6 Summary and outlook
Summary in Dutch (Samenvatting)
Curriculum Vitae
List of publications
Acknowledgements (Dankwoord)

 

Chapter 1 Introduction


Proteomics in cardiovascular research

Cardiovascular diseases (CVDs) are dysfunctional conditions of the heart, arteries and veins that supply oxygen to vital, life-sustaining areas of the body. They are among the main causes of death throughout the western world, claiming more than 17 million lives each year. According to the World Health Organization (WHO), an estimated 30% of all deaths result from various forms of CVD, a proportion of which can be partly attributed to genetic defects. For instance, hypertrophic cardiomyopathy originates from a variety of mutations in genes encoding sarcomere proteins, the structural proteins of heart muscle (myocardium) [1]. However, for CVDs such as myocardial infarction (MI) and congestive heart failure, it is unclear which genes are responsible for the onset of disease. It is well known that, apart from genetic predisposition, environmental factors such as stress level, food intake and personal lifestyle can contribute to phenotypes that lead to CVDs. As such, proteins, the functional and dynamic entities of cells and tissues, rather than genes, are the most useful indicators for studying the onset and progression of the various forms of CVD, as well as the response to therapeutic intervention. Proteomics, the study of protein expression, function and interaction in an organism in different states, is an invaluable approach for gaining insight into CVDs. This burgeoning area of research is aptly termed cardiovascular proteomics. This thesis focuses on the use of so-called shotgun and chemical proteomics technologies to study cardiac disease at a global level, but also at the more targeted level of signal transduction.

1. Proteomics Methods and Applications in cardiac research

For convenience of discussion, cardiovascular proteomics can be divided into four areas: (i) protein profiling, the identification of all the proteins present in a system; (ii) quantitative proteomics, the investigation of protein abundance, in a relative or absolute manner, in a specific system under different conditions or perturbations; (iii) functional proteomics, the study of protein function via interactions with DNA, metabolites or other proteins; and (iv) biomarker discovery, the search for protein candidates that can serve as specific and sensitive indicators of CVDs, where a distinction is made between markers of disease onset (diagnostic), disease progression (prognostic) and response to therapeutic intervention (drug biomarkers).

1.1. Protein profiling

Protein profiling is the large-scale identification and classification of proteins in an organism. Recent advances in proteomic technologies permit large-scale protein profiling by two main approaches: (i) gel-based protein profiling and (ii) gel-free protein profiling, also referred to as 2D-LC-MS/MS or MudPIT (Multidimensional Protein Identification Technology). Both approaches are illustrated in Figure 1.

Figure 1. After protein extraction, for instance from heart tissue, samples are prepared for mass spectrometric analysis using gel-based (A) and gel-free (B) workflows. At each stage, separations can be varied to achieve optimal analytical depth. Mass spectrometric analysis can also be varied, using different instrumentation and different instrumental parameters.

1.1.1. Gel-based proteomics approaches

One of the most common approaches is the coupling of 1D SDS-PAGE (1-DE) with LC-MS/MS. The gel lane is cut into 10-70 pieces, which are subsequently digested and subjected to mass spectrometric analysis. An advantage of this approach is that it retains protein molecular weight (MW) information while allowing large-scale identification of the proteins present, although the dynamic range and resolution of 1-DE are often too poor to analyze low-abundance proteins and/or post-translational modifications (PTMs). Alternatively, 2-DE (two-dimensional gel electrophoresis) can be applied. In 2-DE, proteins are first separated by their isoelectric points (pI) using isoelectric focusing (IEF, first dimension), followed by SDS-PAGE, where proteins are separated according to their MW (second dimension). The combination of two orthogonal separation techniques distributes proteins as spots across a two-dimensional gel profile with high resolution. Therefore, 2-DE is reasonably suitable for the evaluation of PTMs. For instance, 2-DE was applied to uncover the differential phosphorylation states of troponins, which regulate the Ca2+ sensitivity of the cytoskeleton to control the contraction-relaxation cycles of the heart [2]. To date, public 2-DE protein-mapping repositories of human cardiac proteins such as HSC-2DPAGE [3, 4], HEART-2DPAGE [5] and HP-2DPAGE [6] have been established. These can be used to compare profiles of tissues of healthy and diseased origin to uncover novel players in different pathological conditions of the heart. In addition, 2-DE heart protein databases for other animals, such as rat [7], dog [8], pig and cow, are under construction [9]. Nevertheless, 2-DE is typically biased against membrane proteins, low-abundance proteins and proteins with extreme pIs and MWs. A 2-DE gel typically contains 1000-1500 spots, which limits its application to the top 10 percent of the proteome. Also, after 2-DE, each spot has to be analyzed separately to establish the identity of the protein, leading to a large number of sample handling steps. To circumvent this, profiling strategies based on chromatographic separation coupled to high-resolution mass spectrometry were developed, now often termed MudPIT or shotgun proteomics, to allow large-scale peptide sequencing of protein mixtures [10-13].

1.1.2. Gel-free proteomics approaches

MudPIT allows the identification of proteins with widely varying pIs, molecular weights and chemical properties, making this method less biased.
For instance, more than one thousand unique proteins from a human left ventricle tissue lysate could be identified in a single MudPIT analysis [14], indicative of the maturity of this technology. In another example, a large-scale investigation of the sub-cellular localization and annotation of the mouse cardiac proteome was reported using such shotgun proteomic methods [15]. Although shotgun proteomics has been highly successful in determining the protein composition of biological samples, it still suffers from several drawbacks. First, the huge dynamic range of proteins expressed in complex biological mixtures, which easily exceeds six orders of magnitude in cells [16] and ten orders in body fluids [17], is currently not matched by mass spectrometers, which operate at a dynamic range of at most 4-5 orders of magnitude. This precludes truly comprehensive proteomics, as it prevents the detection of important low-abundance proteins such as signaling proteins. Second, with two-dimensional peptide separations, the molecular weight information of the identified proteins is lost.
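The dynamic-range mismatch described above can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that an instrument only detects proteins whose abundance lies within a fixed number of orders of magnitude of the most abundant species; the function name, the protein names and the copy numbers are hypothetical and not taken from the cited studies.

```python
import math

def detectable(abundances, instrument_orders):
    """Return the names of proteins whose abundance lies within
    `instrument_orders` orders of magnitude of the most abundant protein.
    This single-window model is a simplifying assumption."""
    top = max(abundances.values())
    return {
        name
        for name, amount in abundances.items()
        if math.log10(top / amount) <= instrument_orders
    }

# Hypothetical copies per cell, spanning six orders of magnitude.
sample = {
    "abundant_structural_protein": 1e7,
    "metabolic_enzyme": 1e5,
    "kinase": 1e3,
    "low_abundance_signaling_protein": 1e1,
}

# An instrument covering 4.5 orders of magnitude misses the last entry.
seen = detectable(sample, instrument_orders=4.5)
```

Under this toy model, any protein more than 4.5 orders of magnitude below the most abundant one is lost, which is why prefractionation or enrichment is needed to reach signaling proteins.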

1.2. Quantitative proteomics applied in cardiovascular research

The profiling approaches described above are well suited to characterizing complex biological samples, but they have less potential for studying the dynamics of protein expression, which plays a key role in many pathological effects. One important goal in cardiovascular proteomics is to assess protein concentration as a function of a distinct pathological perturbation. Nowadays, it is possible to obtain information on the differential expression of many proteins by examining the intensities of protein spots or bands in a gel, or peak intensities within an MS spectrum. These and other so-called label-free quantitation methods form one class of quantitative proteomics workflows. Another class makes use of stable isotope labels. Both classes are discussed in more detail below and are summarized in Figure 2.

Figure 2. Schematic representation of several common quantitative mass spectrometry workflows. (A) Label-free quantitation, which compares two or more MS analyses by peptide intensities or spectral counts. Isotope label-based quantitative proteomics can be subdivided according to the stage at which the stable isotope is incorporated: (B) at the metabolic stage, (C) after protein isolation and (D) after digestion to peptides. In yet another approach, (E) synthetic stable isotope-labeled peptides are spiked into samples at a given concentration as internal standards for isotope-based absolute quantitation.

1.2.1. Label-free quantitative proteomics

In label-free quantitation, proteins are first digested to peptides before tandem MS (MS/MS) analysis and database searching for identification. This approach is schematically depicted in Figure 2A. Relative protein abundance between two samples can be determined by (i) chromatographic peak intensity measurements [18] or (ii) the comparison of spectral counts [13, 19-21]. The use of peak intensity in LC-MS was first reported by Chelius et al. [22]. In this method, peptide peaks are first distinguished from background noise and from neighboring peaks (peak detection), and isotope patterns of the detected peaks are assigned by deconvolution. LC-MS retention times are carefully aligned so that corresponding mass peaks are correctly matched between multiple LC-MS runs (peak matching), which compensates for day-to-day variation of the chromatographic system. Chromatographic peak intensity (either peak area or peak height) is then calculated and normalized to enable more accurate matching and quantitation, making this method useful for the analysis of changes in protein abundance in complex biological samples.

In the spectral counting approach, relative protein quantitation is achieved by comparing the number of identified MS/MS spectra from the same protein in each of multiple LC-MS/MS datasets. The approach is rapid and sensitive within a protein dynamic range of 3-4 orders of magnitude. It can also be automated and is amenable to large-scale proteome analysis. Gramolini et al. performed a large-scale proteomics survey of cardiac ventricles isolated from a mouse model of cardiomyopathy over-expressing a phospholamban mutant and demonstrated impairment of Ca2+ handling at different time points. They used a rigorous comparative profiling strategy, based on a relatively unbiased and sensitive method of protein detection and spectral counting, to reveal the temporal patterns of differential protein expression [23]. In theory, an increase in protein abundance results in an increase in the number of its identified peptides, and vice versa [20, 24, 25].
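As a minimal sketch of how spectral counts from two LC-MS/MS runs might be compared for the same protein, the example below expresses each count as a fraction of all identified spectra in its run and reports the log2 ratio. The normalization to total spectra and the pseudocount (which avoids division by zero for undetected proteins) are illustrative assumptions rather than steps prescribed by the cited studies, and all names and numbers are hypothetical.

```python
import math

def log2_spectral_count_ratio(counts_a, counts_b, protein, pseudocount=0.5):
    """Log2 ratio (run B over run A) of one protein's spectral count,
    each expressed as a fraction of the total spectra in its own run."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    frac_a = (counts_a.get(protein, 0) + pseudocount) / total_a
    frac_b = (counts_b.get(protein, 0) + pseudocount) / total_b
    return math.log2(frac_b / frac_a)

# Hypothetical spectral counts per protein in two runs.
run_a = {"P1": 10, "P2": 90}
run_b = {"P1": 40, "P2": 160}

# P1 roughly doubles as a fraction of all spectra; P2 decreases slightly.
p1_change = log2_spectral_count_ratio(run_a, run_b, "P1")
p2_change = log2_spectral_count_ratio(run_a, run_b, "P2")
```

Because the same protein is compared with itself across runs, no correction for protein size is needed here; that correction only matters when different proteins are compared within one sample.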
Since larger proteins tend to contribute more peptides and spectra than smaller ones, simply because they generate more peptides, spectral counting data need to be adjusted to avoid over-estimating the abundance of high-MW proteins relative to low-MW ones. A number of groups have proposed normalized abundance factors based on transformed spectral counts. One of the simplest is Fabb, in which the number of spectral counts of a protein is corrected by the protein's MW (spectral counts/MW) [26]. Another example is the normalized spectral abundance factor (NSAF), which is calculated as the number of spectral counts identifying a protein divided by the protein's length (L), normalized by the sum of spectral counts/L over all proteins in the experiment [27]. The protein abundance index (PAI) is another spectral count-based quantitation measure. It is the ratio between the number of observed peptides and the number of observable (i.e., theoretically predicted) peptides per protein [28], and it shows a linear relationship with the logarithm of protein concentration. The PAI value was later converted to an exponentially modified PAI (emPAI), which is proportional to protein content in a protein mixture [29]. emPAI values can be calculated easily and do not require additional experiments beyond protein identification, so they can be used routinely to report approximate absolute protein abundances in large-scale analyses.

Recently, a modified spectral counting strategy termed absolute protein expression (APEX) profiling was developed by the Marcotte group to measure the absolute protein concentration per cell from the proportionality between protein abundance and the number of peptides observed [30, 31]. The key to APEX is the introduction of appropriate correction factors that make the fraction of expected peptides and the fraction of observed peptides proportional to one another. A protein's absolute abundance is indicated by its APEX score, which is calculated from the fraction of observed peptide mass spectra associated with that protein, corrected by a prior estimate of the number of unique peptides expected from the protein in a MudPIT experiment. This critical correction factor (the Oi value of each protein) is calculated with a machine-learning classification algorithm that predicts which peptides of a given protein will be observed, based on peptide length and amino acid composition. The APEX technique has been implemented in the APEX Quantitative Proteomics Tool [32], free open-source software for the absolute quantification of proteins. Very recently, APEX was shown to correlate well with absolute quantitation experiments in which protein concentrations were determined from spiked internal standards analyzed by selected reaction monitoring mass spectrometry [33].

Most recently, Griffin et al. developed a normalized label-free quantitative method called the normalized spectral index (SIN), which combines three MS abundance features: peptide count, spectral count and summed fragment-ion (MS/MS) intensity [34]. SIN is calculated from the cumulative fragment-ion (MS/MS) intensity of each significantly identified peptide (including all its spectra) of a particular protein.
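The simpler count-based measures described above (Fabb, NSAF and emPAI) can be made concrete with a short sketch. Fabb is implemented here exactly as stated in the text (spectral counts divided by MW); NSAF and emPAI follow their commonly published definitions, with emPAI taken as 10^PAI - 1 and protein content as the emPAI share of the summed emPAI. Function names and example numbers are hypothetical, and this is an illustration rather than the implementation used in the cited work.

```python
def fabb(spectral_count, mw_kda):
    """Spectral counts corrected for protein molecular weight."""
    return spectral_count / mw_kda

def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor for every protein:
    (count / length), divided by the sum of (count / length) over all proteins."""
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}

def empai(observed_peptides, observable_peptides):
    """Exponentially modified PAI, where PAI = observed / observable peptides."""
    pai = observed_peptides / observable_peptides
    return 10 ** pai - 1

def mol_percent(empai_values):
    """Approximate molar protein content (%) from emPAI values."""
    total = sum(empai_values.values())
    return {p: 100 * value / total for p, value in empai_values.items()}

# Hypothetical example: protein A is three times longer than protein B
# and yields three times as many spectra, so both get the same NSAF.
counts = {"A": 30, "B": 10}
lengths = {"A": 300, "B": 100}
nsaf_values = nsaf(counts, lengths)

empai_values = {"A": empai(5, 10), "B": empai(2, 10)}
content = mol_percent(empai_values)
```

The example shows why length correction matters: on raw spectral counts protein A would appear three times more abundant than protein B, whereas after normalization the two are equal.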
SIN is reported to be highly reproducible because it largely eliminates the variance between replicate MS measurements. It was also reported to accurately quantify and predict the abundance of thousands of proteins in replicate MS measurements of the same and of distinct samples.

1.2.2. Label-based quantitative proteomics

A common strategy to quantify proteins is to use a reference point or internal standard. This internal standard should have physicochemical properties similar to those of the analyte, so that it behaves identically during chromatographic separation and subsequent MS analysis. A stable isotope-labeled analogue is one of the best internal standards, as it has these properties while creating a mass difference that is easily detected in the mass spectrometer. A number of stable isotope labeling approaches have been developed for "shotgun" quantitative proteomics. These include SILAC (Stable Isotope Labeling by Amino acids in Cell culture) [35], ICAT (Isotope-Coded Affinity Tag) [19, 36], 18O/16O enzymatic labeling [37], ICPL (Isotope-Coded Protein Labeling) [38], TMT (Tandem Mass Tags) [39], iTRAQ (Isobaric Tags for Relative and Absolute Quantification) [40] and dimethyl labeling [41].

In metabolic labeling (Figure 2B), stable isotopes are incorporated into proteins by supplying them in the growth medium that is consumed and metabolized by the cells [35]. The isotope labels are either supplied as the sole carbon or nitrogen source, or incorporated via specific amino acids that contain heavy isotopes and for which the cells are auxotrophic; the latter approach is called SILAC. Stable isotope metabolic labeling has been used successfully in a variety of model organisms, ranging from bacteria and yeast to Drosophila and mammals [42]. Recently, a mouse bearing solely heavy isotope-labeled lysine residues was developed by the Mann lab. This so-called SILAC mouse is labeled through a diet containing only a 13C6-substituted heavy version of lysine, and the diet is maintained over several generations to achieve full labeling. No obvious effects on growth, behavior or fertility were observed in this mouse model [43]. It is a versatile tool for quantitative (tissue) proteomics. Initial MS analysis of different generations of these SILAC mice made it possible to follow the incorporation rates of heavy lysine into newly synthesized proteins in various tissues under in vivo conditions.

Figure 2C and D represent chemical derivatization techniques for isotope labeling of proteins and peptides, respectively. Chemical labeling is particularly advantageous for human or animal tissue samples, where metabolic incorporation cannot easily be achieved. A handful of chemical derivatization platforms are available, and they can be divided into two major classes based on their readout in the mass spectrometer: (i) quantitation based on the relative intensities of fragment-ion peaks at fixed m/z values within an MS/MS spectrum (MS/MS-based quantitation), as in iTRAQ and TMT, and (ii) quantitation based on the relative intensities of extracted ion chromatograms (XICs) of precursor ions within a single data set (MS-based quantitation).
MS-based quantitation is the readout used for AQUA, ICAT and dimethyl labeling, and also for SILAC. The individual methods are explained below.

The iTRAQ method uses amine-specific isobaric reagents to label the primary amines of peptides. It can label up to eight different biological samples [44]. Labeled peptides from the different samples are mixed, analyzed and quantitated. The MS spectra of each peptide in the sample are simple and easy to interpret because of the isobaric nature of the tags. Upon fragmentation, the isobaric tags release reporter ions with distinct m/z values (e.g. 114.1, 115.1, 116.1 and 117.1 for 4-plex iTRAQ). The relative intensities of the reporter ions determine the relative abundance of the peptide in the respective samples. Relative abundance at the protein level can be obtained by combining the reporter-ion intensities of multiple peptides. iTRAQ was successfully applied to the relative and absolute quantitation of drug-protein binding, in which a mixed broad-specificity kinase inhibitor matrix was used in combination with free kinase inhibitors [45]. This made it possible to identify new drug targets for clinically important kinase inhibitors. However, the labeling reagent is costly and less stable. Although current high-accuracy MS instrumentation is able to identify MS/MS fragments at low m/z, the ion statistics used in peak recognition and quantitation are relatively poor compared to MS-based quantitation, owing to the overall lower number of ions in MS/MS.

In the ICAT method, proteins from two or more biological samples are labeled with a reagent consisting of a 13C- or 12C-ethylene glycol linker, a biotin affinity tag and a thiol-specific reactive group that selectively couples to the side chain of reduced cysteine residues [19]. The labeled samples are mixed, digested and purified on an avidin column to enrich the labeled peptides prior to MS analysis [46]. The relative abundances of peptides are quantified from their intensity ratios. A major advantage of this tagging system is that it allows enrichment of the modified peptides via affinity purification of the biotin moiety, thereby enhancing the detection of low-abundance proteins. A major limitation is that ICAT reagents selectively label the relatively infrequent cysteine residues, so proteins without a cysteine cannot be quantified with this method.

Dimethyl labeling through reductive amination was introduced into the proteomics field by Hsu et al. [41]. As with iTRAQ, labeling is performed after proteolysis. Formaldehyde reacts with the N-terminus or the amino group of a lysine side chain to form an intermediate Schiff base, which is subsequently reduced to a methylated amine by sodium cyanoborohydride (NaBH3CN). The resulting secondary amine undergoes the same reaction to form a tertiary amine with two methyl groups. This results in a mass increase of 28 Da (2x CH3) or 32 Da (2x CHD2) per modified amino group. The method can be extended to a triplex strategy by introducing a third label using 13CD2O and NaBD3CN. This strategy is applicable to the analysis of complex samples, including cell lysates and affinity-purified proteins [47, 48]. Fully automated, online and on-column sequential triplex dimethyl labeling is the latest advance for this type of labeling [49].
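The nominal 28 Da and 32 Da shifts quoted above can be checked from monoisotopic atomic masses. The sketch below computes the mass added per labeled amine when two N-H hydrogens are replaced by two methyl groups. The shift for the third label (13CD3 groups, from 13CD2O and NaBD3CN) is not stated in the text and is derived here from the reagents; the function names and the example peptide are hypothetical.

```python
# Monoisotopic masses in Da.
MASS = {"H": 1.00782503, "D": 2.01410178, "C12": 12.0, "C13": 13.00335484}

def dimethyl_shift(carbon, deuterium_per_methyl):
    """Mass added to one amine by dimethylation.
    Each methyl group adds one carbon and three hydrogens/deuteriums,
    and replaces one N-H hydrogen, so one H is subtracted per methyl."""
    n_d = deuterium_per_methyl
    per_methyl = (
        MASS[carbon] + n_d * MASS["D"] + (3 - n_d) * MASS["H"] - MASS["H"]
    )
    return 2 * per_methyl

light = dimethyl_shift("C12", 0)         # two CH3 groups, about 28.03 Da
intermediate = dimethyl_shift("C12", 2)  # two CHD2 groups, about 32.06 Da
heavy = dimethyl_shift("C13", 3)         # two 13CD3 groups, about 36.08 Da

def mz_spacing(shift_a, shift_b, n_amines, charge):
    """m/z distance between two differently labeled forms of one peptide
    carrying `n_amines` labeled amines at the given charge state."""
    return n_amines * abs(shift_b - shift_a) / charge

# Hypothetical tryptic peptide with a free N-terminus and one lysine
# (two labeled amines), observed as a doubly charged ion.
spacing = mz_spacing(light, intermediate, n_amines=2, charge=2)
```

The roughly 4 Da difference per labeled amine between successive labels is what separates the isotope clusters of the differently labeled peptides in the MS spectrum, so the ratio of their precursor intensities can be read out directly.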
A major advantage of dimethyl labeling is that the cost of the reagents is minimal.

Probably the most direct approach for introducing stable isotope-labeled peptides is to synthesize them chemically and 'spike' known quantities into the sample as internal standards (Figure 2E). This approach is well suited to the quantification of candidate biomarkers in body fluids [50]. To reduce interference from background ions, quantification can be performed on specific fragments of the peptide generated in a triple-quadrupole mass spectrometer using selected reaction monitoring (SRM), often also referred to as multiple reaction monitoring (MRM). The mass spectrometer is set to detect a preprogrammed precursor-fragment combination with very high sensitivity and specificity. The internal peptide standard is introduced during or after protein digestion. Because suitable internal standards need to be identified and synthesized, this approach is usually limited to the follow-up of a small number of preselected proteins, rather than serving as a discovery tool.

Although stable isotope labeling has been applied successfully to protein quantification, it remains technically difficult to characterize the global proteome comprehensively, owing to the high cost of the labeling reagents, the nature of the methodology and, especially, the limited dynamic range of mass spectrometers when analyzing complex biological samples. Furthermore, simultaneous quantification of proteins from a large population of samples is often problematic.
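The spiked-standard approach of Figure 2E comes down to a simple ratio calculation, sketched below. It assumes that the endogenous (light) and synthetic (heavy) peptide respond identically in the mass spectrometer, so the ratio of their summed SRM transition areas equals their molar ratio. The function name, transition areas and spike amount are hypothetical.

```python
def absolute_amount(light_areas, heavy_areas, spiked_fmol):
    """Estimate the amount (fmol) of an endogenous peptide from the SRM
    peak areas of its transitions and those of a heavy-labeled standard
    spiked in at a known amount."""
    heavy_total = sum(heavy_areas)
    if heavy_total <= 0:
        raise ValueError("heavy standard was not detected")
    return sum(light_areas) / heavy_total * spiked_fmol

# Hypothetical peak areas for two transitions of one peptide.
endogenous = [1200.0, 800.0]
standard = [2400.0, 1600.0]

# The light signal is half the heavy signal, so 50 fmol spiked in
# corresponds to 25 fmol of endogenous peptide.
amount_fmol = absolute_amount(endogenous, standard, spiked_fmol=50.0)
```

Summing several transitions per peptide is one common way to stabilize the ratio; reporting each transition's ratio separately would additionally reveal interference on individual fragments.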

1.3. Functional proteomics approach

Differential changes in protein profiles, measured by quantitative proteomics, are widely used to evaluate the involvement of critical pathways in a time- and disease-dependent manner. Although current sensitive MS analysis allows the identification of thousands of proteins in a single experiment, the biological roles of these proteins are not revealed. It is now evident that proteins do not act alone, but rather in concert with other proteins in their vicinity. Hence, the functions of proteins and their molecular mechanisms can be inferred from their interacting protein partners, so-called "guilt by association". This makes it important to investigate proteins in the context of the complexes they form, rather than individually [51, 52]. Illustrative examples of such protein complexes are signaling modules, which consist of multiple signaling proteins that are together responsible for the fast and efficient transmission of a specific stimulus. This can be achieved by physically tethering a signaling module to the cellular compartment where its function is required. For instance, as described in this thesis, the protein kinase A anchoring proteins (AKAPs, see review [53]) tether cAMP-dependent protein kinase (PKA) to distinct loci within the cell to provide specificity in space and time. This is much needed, as PKA is implicated in a large number of functions. In the heart, for example, it plays a major role in contraction and relaxation by regulating the Ca2+ concentration in cardiac myocytes. Furthermore, the importance of other cardiac signaling modules has been shown, including β-adrenergic receptor-associating proteins [54] and many protein kinase C binding partners (see review [55]). However, many of these signaling molecules are of relatively low abundance in the heart compared to cytoskeletal and other muscle proteins.
Therefore, the protein profiling approaches discussed earlier often do not permit the identification of these complexes, and if their components are identified, it is only with marginal sequence coverage. Hence, methods to detect these low-abundance proteins in a complex sample are required. Sample complexity can be reduced by proteome prefractionation techniques such as immunodepletion of high-abundance proteins, subcellular fractionation and affinity purification. The latter has been applied in this thesis and is the focus of the next paragraphs. The selective nature of affinity purification allows a large enrichment of a specific subset of the proteome, and thus a more thorough investigation of the enriched proteins.

1.3.1. Key technologies in targeted proteomics

Targeted proteomics is a hypothesis-driven approach in which proteome prefractionation is based on selective molecular interactions between a 'bait', which can be a small molecule, a ligand or a protein, and its targets, i.e. (other) proteins. To date, several methodologies have been applied to isolate protein complexes and identify their constituents. There are two main small molecule-based approaches, which are collectively called chemical proteomics (Figure 3, see review [56]).

Figure 3. Targeted proteomics workflows to enrich and characterize low-abundance signaling complexes. (A) Cardiac tissue lysates are incubated with a soluble activity-based probe. After binding of the probe to its target(s), the reactive group is activated to form covalent bonds with the target proteins. Specific capture of the probe/protein complex can then be achieved. (B) Beads are directly coated with the affinity-based ligand and subsequently incubated with a protein lysate to isolate the ligand's targets. The enriched protein complexes are washed with a mild buffer to remove non-specifically bound proteins. Subsequently, common separation techniques can be applied, followed by digestion and LC-MS/MS analysis.

As shown in Figure 3A, the first method, activity-based protein profiling (ABPP), makes use of highly engineered small molecules (probes) that usually consist of four parts: an affinity region, a reactive moiety, a linker region and a coupling region. The affinity region is the actual 'bait' that interacts with the protein(s) of interest. Once the probe is bound, the reactive group can be activated. Usually these groups are photosensitive. Activation couples the probe covalently to its primary target. The linker connects the probe to a second affinity handle, the coupling group, for which biotin is often used. This second bait allows efficient purification of the targeted protein(s) from the complex lysate. The advantage of ABPP is the ability to monitor enzyme activity directly, rather than being limited to protein or mRNA abundance. Recently, ABPP combined with MS analysis (ABPP-MudPIT) enabled the identification of hundreds of active enzymes from a single biological system [57]. It is especially useful for profiling inhibitor selectivity, as the potency of an inhibitor can be tested against hundreds of targets simultaneously. A possible drawback of the ABPP approach is the bulky nature of the tag, which affects protein-binding affinity, probe uptake and cellular and tissue distribution, thus hindering in vivo profiling experiments. It is also less suited to dilute, large-volume samples such as body fluids, which would require large amounts of expensive probe.

The alternative approach, depicted in Figure 3B, makes use of pre-immobilized small-molecule 'baits'. This requires the generation of a linkable version of the compound of interest, for which many different chemistries can be used. The resulting affinity beads can be incubated with a complex protein lysate to isolate the targets of the 'bait'. The affinity purification step is based on the highly specific, reversible interaction of proteins with the immobilized compound.
The captured sub-proteome can then be retrieved from the affinity beads by different elution steps. This approach typically suffers from non-specific binding, which can be reduced in several reported ways, for example by using competition binding assays [58] or by varying the linker region between the beads and the compound [59].

There are also several protein 'bait'-based methods, such as immunoprecipitation and the use of tagged proteins, of which tandem affinity purification (the TAP-tag method) is an elegant example. The TAP tag consists of two IgG-binding domains of protein A and a calmodulin-binding peptide (CBP), separated by a TEV protease cleavage site [60]. Through sequential affinity purification, the interacting proteins of the tagged protein can be isolated and identified. A clear advantage of the small-molecule methods is that they can be applied to protein lysates of any origin, even very rare human tissue. The tagged-protein method requires ectopic expression of the tagged version in a cell line, which precludes the use of tissue samples. Immunoprecipitation, which is feasible in primary tissue, requires a clean antibody, which is not readily available for most proteins. A proper control is also often difficult to obtain in such an approach. These disadvantages do not apply to chemical proteomics, although it has drawbacks of its own. For instance, it can only target proteins that have affinity for small molecules, and it requires prior knowledge of the chemical interaction characteristics of the target of interest, although the technique is also applied to the unbiased identification of drug targets. Both enrichment methods are highly suitable for coupling to MS for interaction analysis. In addition, the enrichment makes a more comprehensive analysis of PTMs feasible. The affinity bead-based chemical proteomics setup is the method of choice in this thesis, where it is applied to the study of cardiac disease.
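The competition binding idea mentioned above for reducing non-specific binding can be illustrated with a small filter. The sketch assumes two pull-downs with the same affinity beads, one of them performed in the presence of excess free ligand, and treats a protein as a specific binder when its spectral count drops by at least a chosen factor in the competed sample. The threshold, pseudocount, protein names and counts are all hypothetical, not values from this thesis or the cited work.

```python
def specific_binders(pulldown, competed, min_fold=4.0, pseudocount=1.0):
    """Return {protein: fold depletion} for proteins whose spectral count
    in the plain pull-down exceeds the count in the ligand-competed
    pull-down by at least `min_fold`."""
    hits = {}
    for protein, count in pulldown.items():
        competed_count = competed.get(protein, 0)
        fold = (count + pseudocount) / (competed_count + pseudocount)
        if fold >= min_fold:
            hits[protein] = fold
    return hits

# Hypothetical spectral counts from a bead pull-down.
pulldown = {
    "kinase_regulatory_subunit": 40,
    "anchoring_protein": 15,
    "abundant_muscle_protein": 50,
}
# Same experiment with excess free ligand added to the lysate.
competed = {
    "kinase_regulatory_subunit": 2,
    "anchoring_protein": 0,
    "abundant_muscle_protein": 48,
}

hits = specific_binders(pulldown, competed)
```

Note that indirect interactors (here the anchoring protein) are also depleted by competition, because they are lost together with the direct binder, whereas proteins sticking to the beads themselves are not.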

1.4. Proteomics and biomarker discovery

The discovery of novel biomarkers involves the profiling of biological samples in search of disease- or drug-related qualitative and quantitative changes in protein levels and/or modifications. This can be achieved by the comparative analysis of protein expression in normal and diseased tissues. Several requirements apply: (i) the ability to detect as many proteins as possible, (ii) a dynamic range wide enough to detect low-abundance proteins, (iii) confident protein identification, (iv) high reproducibility and consistency of the platforms, so that biological differences can be distinguished from technical ones, (v) the ability to quantify significant differences over sample variability and (vi) the ability to profile and compare a large number of samples for validation. Initial mapping of the healthy proteomes of plasma, urine, liver and heart has been reported [61, 62]. A database of cardiac proteins has also been constructed [6, 63, 64], which includes protein alterations observed in cardiomyopathic samples. Examples of protein-based cardiac disease biomarkers derived exclusively from cardiac tissue are the cardiac troponins, which are in clinical use for the diagnosis of acute myocardial infarction. Other CVD biomarkers are creatine kinase [65] and high-sensitivity C-reactive protein (CRP) [66]. Progress in biomarker discovery at the protein level is hampered by the dynamic range of the proteins present in these systems. Anderson et al. reported that the plasma proteome spans a linear dynamic range of 10-12 orders of magnitude [17], while current proteomic techniques resolve protein abundance only within 3-4 orders of magnitude. Although high-abundance and tissue-leakage proteins in plasma can serve as surrogate measures of cardiac disease progression, existing biomarkers provide only limited accuracy in predicting patient risk.
Hence, there is a noticeable shift in the approach to biomarker discovery, away from the direct analysis of body fluids and towards the comparison of diseased and healthy primary tissues or proximal fluids. A major disadvantage of this approach is that the differential proteins observed must subsequently be measured in a body fluid to avoid invasive test procedures. Recently, de Kleijn and co-workers discovered that local ruptured plaques contain molecular information that is predictive of atherothrombotic events in all vascular territories, and that the local atherosclerotic plaque is a potential source of prognostic biomarkers [67]. In their study, they used longitudinal sections of plaques and compared plaque proteins between two patient types, one with a diagnostic cardiovascular event and one with follow-up treatment, in order to investigate the prognostic effect using the complementary power of proteomics. The future challenges for biomarker discovery will be the advancement of mass spectrometric instrumentation, the development of novel and better upfront sample separations, and the further application of specific enrichment techniques to uncover lower-abundance signature proteins that are indicative of protein changes in vivo. Proteomics will thus continue to be an indispensable approach to decipher cellular mechanisms and to link these mechanisms to cardiovascular disease and health.
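The dynamic range argument made in this section can be illustrated with a short calculation. The sketch below is only an illustration: the two concentrations are hypothetical values chosen to fall within the 10-12 orders of magnitude quoted for plasma, and the function and variable names are not taken from any cited work.

```python
import math


def orders_of_magnitude(highest: float, lowest: float) -> float:
    """Return the dynamic range between two concentrations as log10(highest / lowest)."""
    if highest <= 0 or lowest <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(highest / lowest)


# Hypothetical plasma concentrations in g/L, chosen only to illustrate the scale.
abundant_protein = 40.0   # a very abundant plasma protein (assumed value)
rare_protein = 4.0e-10    # a low-abundance candidate marker (assumed value)

sample_range = orders_of_magnitude(abundant_protein, rare_protein)
instrument_range = 4.0    # upper end of the 3-4 orders of magnitude quoted in the text

# Orders of magnitude that depletion, fractionation or enrichment would have to bridge.
gap = max(0.0, sample_range - instrument_range)
print(f"sample range: {sample_range:.1f} orders, gap to bridge: {gap:.1f} orders")
```

Under these assumed values the gap amounts to several orders of magnitude, which is why depletion of abundant proteins and specific enrichment are discussed as necessary complements to the mass spectrometric measurement itself.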

1.5. Animal models of heart disease

Most of the technologies discussed above can identify a relatively large pool of proteins within a specific system, or proteins that are able to interact with other proteins of interest. However, to further validate protein targets that affect or are affected by cellular functions in a living animal, in vivo experiments are required. Proteomics is an emerging tool in this area of research as well. Thus far, a handful of animal models of human heart disease have been studied with proteomics. Recently, a gel-based kinase assay coupled to MS identification was used to map global kinase activity in the context of cardiomyopathy in the postnatal heart of transgenic mice expressing activated MKK6 (mitogen-activated protein kinase kinase 6) [68]. A differential proteomic profiling study was performed on right ventricular hypertrophy using a rat model of pulmonary artery banding [69]. White et al. used proteomics to characterize global changes in cardiac protein expression in response to ischemia/reperfusion injury [70]. However, the gene expression patterns of small animals differ from those of larger mammals such as humans. Hence, investigations have to move to larger animals such as dog [71] and cattle [72].


2. Scope and outline of this thesis

In cardiac research, the prognostic outcomes following diagnosis are still relatively poor. Although significant progress has been made in identifying genetic, physiological and environmental factors that predispose individuals to different cardiac diseases, the etiology of these diseases has shown an unanticipated level of complexity. Heart muscle expresses several thousand distinct proteins, several hundred of which are likely tissue specific and critical for heart muscle function, performance and capacity [73]. Hence, the systematic identification of these proteins and the determination of their relative abundance in healthy and diseased cardiac tissue could provide a better understanding of the molecular determinants of disease. Finding the proteins that specifically change in response to certain pathologies is the ultimate goal of biomarker discovery. Understanding the molecular mechanisms of signaling proteins is an important aspect of this goal, as these proteins are often promising therapeutic targets [74]. Signaling proteins are the orchestrators of tissue function. As a consequence, their function and behavior are very complex and dynamic, with constant activation and deactivation through protein modification, allowing the system to quickly reach an equilibrium in which cell function is optimal for the environmental conditions at hand. Under developing pathological circumstances, small but chronic alterations in this complex signaling network could result in the development of diseases such as cardiac hypertrophy, myopathy and ultimately heart failure. Hence, additional research into the molecular basis of signaling proteins is needed to better understand disease initiation and progression at the molecular level and ultimately to help the further development of therapeutic strategies. Protein kinases and phosphatases are among the main signaling proteins that control cardiac contraction and rhythm.
PKA (cAMP-dependent protein kinase) and PKG (cGMP-dependent protein kinase) are two important examples of kinases that are heavily involved in cardiac function. Chapter 2 reviews the various contributions of mass spectrometry to the characterization and understanding of PKA and PKG signaling. The in-depth characterization of these kinases is reviewed with respect to their PTMs (especially phosphorylation), the similarities and differences between their various isozymes, and the identification and specificity of their binding partners as studied by proteomics-based methodologies. In addition, studies on structural properties by, for instance, native mass spectrometry, H/D exchange and ion mobility are discussed. Combining mass spectrometry based data with other biophysical and biochemical data has been of great help in unraveling the intricate regulation of kinase function in the cell in all its magnificent complexity. PKA is the main target of the second messenger cAMP. It is a widely distributed and multifunctional kinase in many tissues and cell types, where it is involved in a multitude of different signaling pathways in many compartments of the cell. To prevent the simultaneous cross-activation of parallel PKA pathways, its function is tightly regulated through interaction of PKA with the highly diverse family of AKAPs (A-kinase anchoring proteins). These localize PKA activity in space and time. The genetically distinct isoforms of PKA each bind to different AKAPs; however, for many AKAPs this specificity is largely unknown. Therefore, chapter 3 focuses on the development of a chemical proteomics method to screen the specificity of different PKA isoforms towards their AKAPs directly in tissue and cell lysates. Based on the differential affinity chromatography characteristics of two cAMP analogs, coupled to mass spectrometry, the specificity of many AKAPs is confirmed, and new specificities are also established by this method. The results described in this chapter also provide insight into the presence of cell- or tissue-specific AKAPs. As mentioned above, we strongly consider signaling proteins as putative biomarkers for cardiac disease, as they are more likely to function at the onset of disease. In the heart, cAMP is a key regulator of excitation-contraction coupling (ECC) and mediates the sympathetic control over this mechanism through the activation of PKA [75]. It has also been shown that in hearts with dilated cardiomyopathy (DCM), where the ECC is severely hindered, there are alterations in cAMP signaling [76]. Therefore, in chapter 4, we investigated the potential of a cAMP-based chemical proteomics method for application in biomarker discovery. A large set of PKA-AKAP complexes was enriched from human left ventricle, thereby allowing their detailed study by mass spectrometry. Not only do the concentrations of certain cyclic nucleotide binding proteins change in DCM, but differential association of PKA with AKAPs was also observed between the healthy and the DCM state. These data provide important clues on the dysregulation of specific PKA signaling nodes upon progression to DCM, and the method is a promising new tool for biomarker discovery in cardiac diseases.
Chapter 5 describes the study of the healthy human left ventricle proteome in the finest detail, using a multifaceted analysis platform that combines differential sample fractionations, enzymatic digestions and peptide fragmentation techniques (CAD and ETcaD) to enhance (i) protein coverage, (ii) sequence coverage, (iii) protein identification confidence and (iv) the accuracy of protein concentration determination. These absolute abundance data provide a valuable resource to identify putative novel biomarkers for cardiac disease, and also allow the specific evaluation of signaling pathways in the ventricle in the context of their abundance. Special emphasis is put on the expression levels of kinases and phosphatases in the healthy human left ventricle, in conjunction with the endogenous phosphorylation sites identified. With the recent advances in mass spectrometry, quantitative assessments of proteins hold immense potential for biomarker discovery in cardiac disease. In addition, subproteomic analysis by means of depletion of abundant proteins, affinity purification and subcellular fractionation may facilitate the identification of vital molecules by reducing the complexity of the biological system. With a panel of candidate diagnostic biomarkers, validation can be performed on larger sample sizes using various proteomics techniques. In conclusion, chapter 6 summarizes and remarks on the various contributions of mass spectrometry-based proteomics, which offers a promising platform to understand the pathogenesis of cardiac disease and, subsequently, to discover biomarkers for early detection.
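The comparison between healthy and DCM tissue outlined above comes down to asking, per AKAP, how much its association with PKA changes between the two states. A minimal sketch of such a comparison is given below. It assumes that each AKAP signal has already been normalized to the PKA-R signal of the same pull-down; the AKAP names, the values and the two-fold threshold are all hypothetical and do not represent data or settings from this thesis.

```python
import math


def log2_ratio(diseased: float, healthy: float) -> float:
    """Return the log2 fold change of a normalized signal, diseased over healthy."""
    if diseased <= 0 or healthy <= 0:
        raise ValueError("signals must be positive")
    return math.log2(diseased / healthy)


# Hypothetical AKAP signals, each divided by the PKA-R signal of the same pull-down.
healthy = {"AKAP_A": 0.5, "AKAP_B": 0.25, "AKAP_C": 0.125}
dcm = {"AKAP_A": 0.25, "AKAP_B": 0.25, "AKAP_C": 0.5}

threshold = 1.0  # assumed cut-off: an absolute log2 ratio of 1 equals a two-fold change

changes = {name: log2_ratio(dcm[name], healthy[name]) for name in healthy}
flagged = sorted(name for name, value in changes.items() if abs(value) >= threshold)

for name, value in changes.items():
    print(f"{name}: log2(DCM/healthy) = {value:+.2f}")
print("candidates with altered PKA association:", flagged)
```

In practice such ratios would need replicate measurements and a statistical test before any protein is called differential; the fixed threshold here only keeps the example short.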


References:

1 Maron, B. J. (2009) Hypertrophic cardiomyopathy centers. Am J Cardiol. 104, 1158-1159
2 Stanley, B. A., Gundry, R. L., Cotter, R. J. and Van Eyk, J. E. (2004) Heart disease, clinical proteomics and mass spectrometry. Dis Markers. 20, 167-178
3 Corbett, J. M., Wheeler, C. H., Baker, C. S., Yacoub, M. H. and Dunn, M. J. (1994) The human myocardial two-dimensional gel protein database: update 1994. Electrophoresis. 15, 1459-1465
4 Corbett, J. M., Wheeler, C. H. and Dunn, M. J. (1995) Coelectrophoresis of cardiac tissue from human, dog, rat and mouse: towards the establishment of an integrated two-dimensional protein database. Electrophoresis. 16, 1524-1529
5 Pleissner, K. P., Soding, P., Sander, S., Oswald, H., Neuss, M., Regitz-Zagrosek, V. and Fleck, E. (1997) Dilated cardiomyopathy-associated proteins and their presentation in a WWW-accessible two-dimensional gel protein database. Electrophoresis. 18, 802-808
6 Muller, E. C., Thiede, B., Zimny-Arndt, U., Scheler, C., Prehm, J., Muller-Werdan, U., Wittmann-Liebold, B., Otto, A. and Jungblut, P. (1996) High-performance human myocardial two-dimensional electrophoresis database: edition 1996. Electrophoresis. 17, 1700-1712
7 Li, X. P., Pleissner, K. P., Scheler, C., Regitz-Zagrosek, V., Salnikow, J. and Jungblut, P. R. (1999) A two-dimensional electrophoresis database of rat heart proteins. Electrophoresis. 20, 891-897
8 Dunn, M. J., Corbett, J. M. and Wheeler, C. H. (1997) HSC-2DPAGE and the two-dimensional gel electrophoresis database of dog heart proteins. Electrophoresis. 18, 2795-2802
9 McGregor, E. and Dunn, M. J. (2003) Proteomics of heart disease. Hum Mol Genet. 12 Spec No 2, R135-144
10 Link, A. J., Eng, J., Schieltz, D. M., Carmack, E., Mize, G. J., Morris, D. R., Garvik, B. M. and Yates, J. R., 3rd. (1999) Direct analysis of protein complexes using mass spectrometry. Nat Biotechnol. 17, 676-682
11 McCormack, A. L., Schieltz, D. M., Goode, B., Yang, S., Barnes, G., Drubin, D. and Yates, J. R., 3rd. (1997) Direct analysis and identification of proteins in mixtures by LC/MS/MS and database searching at the low-femtomole level. Anal Chem. 69, 767-776
12 Peng, J. and Gygi, S. P. (2001) Proteomics: the move to mixtures. J Mass Spectrom. 36, 1083-1091
13 Washburn, M. P., Wolters, D. and Yates, J. R., 3rd. (2001) Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat Biotechnol. 19, 242-247
14 Kline, K. G., Frewen, B., Bristow, M. R., Maccoss, M. J. and Wu, C. C. (2008) High quality catalog of proteotypic peptides from human heart. J Proteome Res. 7, 5055-5061

15 Bousette, N., Kislinger, T., Fong, V., Isserlin, R., Hewel, J. A., Emili, A. and Gramolini, A. O. (2009) Large-scale characterization and analysis of the murine cardiac proteome. J Proteome Res. 8, 1887-1901
16 Ghaemmaghami, S., Huh, W. K., Bower, K., Howson, R. W., Belle, A., Dephoure, N., O'Shea, E. K. and Weissman, J. S. (2003) Global analysis of protein expression in yeast. Nature. 425, 737-741
17 Anderson, N. L. and Anderson, N. G. (2002) The human plasma proteome: history, character, and diagnostic prospects. Mol Cell Proteomics. 1, 845-867
18 Wang, G., Wu, W. W., Zeng, W., Chou, C. L. and Shen, R. F. (2006) Label-free protein quantification using LC-coupled ion trap or FT mass spectrometry: Reproducibility, linearity, and application with complex proteomes. J Proteome Res. 5, 1214-1223
19 Gygi, S. P., Rist, B., Gerber, S. A., Turecek, F., Gelb, M. H. and Aebersold, R. (1999) Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat Biotechnol. 17, 994-999
20 Liu, H., Sadygov, R. G. and Yates, J. R., 3rd. (2004) A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal Chem. 76, 4193-4201
21 Gilchrist, A., Au, C. E., Hiding, J., Bell, A. W., Fernandez-Rodriguez, J., Lesimple, S., Nagaya, H., Roy, L., Gosline, S. J., Hallett, M., Paiement, J., Kearney, R. E., Nilsson, T. and Bergeron, J. J. (2006) Quantitative proteomics analysis of the secretory pathway. Cell. 127, 1265-1281
22 Chelius, D. and Bondarenko, P. V. (2002) Quantitative profiling of proteins in complex mixtures using liquid chromatography and mass spectrometry. J Proteome Res. 1, 317-323
23 Gramolini, A. O., Kislinger, T., Alikhani-Koopaei, R., Fong, V., Thompson, N. J., Isserlin, R., Sharma, P., Oudit, G. Y., Trivieri, M. G., Fagan, A., Kannan, A., Higgins, D. G., Huedig, H., Hess, G., Arab, S., Seidman, J. G., Seidman, C. E., Frey, B., Perry, M., Backx, P. H., Liu, P. P., MacLennan, D. H. and Emili, A. (2008) Comparative proteomics profiling of a phospholamban mutant mouse model of dilated cardiomyopathy reveals progressive intracellular stress responses. Mol Cell Proteomics. 7, 519-533
24 Gao, J., Friedrichs, M. S., Dongre, A. R. and Opiteck, G. J. (2005) Guidelines for the routine application of the peptide hits technique. J Am Soc Mass Spectrom. 16, 1231-1238
25 Zybailov, B., Coleman, M. K., Florens, L. and Washburn, M. P. (2005) Correlation of relative abundance ratios derived from peptide ion chromatograms and spectrum counting for quantitative proteomic analysis using stable isotope labeling. Anal Chem. 77, 6218-6224
26 Scholten, A., Poh, M. K., van Veen, T. A., van Breukelen, B., Vos, M. A. and Heck, A. J. (2006) Analysis of the cGMP/cAMP interactome using a chemical proteomics approach in mammalian heart tissue validates sphingosine kinase type 1-interacting protein as a genuine and highly abundant AKAP. J Proteome Res. 5, 1435-1447

27 Florens, L., Carozza, M. J., Swanson, S. K., Fournier, M., Coleman, M. K., Workman, J. L. and Washburn, M. P. (2006) Analyzing chromatin remodeling complexes using shotgun proteomics and normalized spectral abundance factors. Methods. 40, 303-311
28 Rappsilber, J., Ryder, U., Lamond, A. I. and Mann, M. (2002) Large-scale proteomic analysis of the human spliceosome. Genome Res. 12, 1231-1245
29 Ishihama, Y., Oda, Y., Tabata, T., Sato, T., Nagasu, T., Rappsilber, J. and Mann, M. (2005) Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Mol Cell Proteomics. 4, 1265-1272
30 Lu, P., Vogel, C., Wang, R., Yao, X. and Marcotte, E. M. (2007) Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation. Nat Biotechnol. 25, 117-124
31 Vogel, C. and Marcotte, E. M. (2009) Absolute abundance for the masses. Nat Biotechnol. 27, 825-826
32 Braisted, J. C., Kuntumalla, S., Vogel, C., Marcotte, E. M., Rodrigues, A. R., Wang, R., Huang, S. T., Ferlanti, E. S., Saeed, A. I., Fleischmann, R. D., Peterson, S. N. and Pieper, R. (2008) The APEX Quantitative Proteomics Tool: generating protein quantitation estimates from LC-MS/MS proteomics results. BMC Bioinformatics. 9, 529
33 Malmstrom, J., Beck, M., Schmidt, A., Lange, V., Deutsch, E. W. and Aebersold, R. (2009) Proteome-wide cellular protein concentrations of the human pathogen Leptospira interrogans. Nature. 460, 762-765
34 Griffin, N. M., Yu, J., Long, F., Oh, P., Shore, S., Li, Y., Koziol, J. A. and Schnitzer, J. E. Label-free, normalized quantification of complex mass spectrometry data for proteomic analysis. Nat Biotechnol. 28, 83-89
35 Ong, S. E., Blagoev, B., Kratchmarova, I., Kristensen, D. B., Steen, H., Pandey, A. and Mann, M. (2002) Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol Cell Proteomics. 1, 376-386
36 Han, D. K., Eng, J., Zhou, H. and Aebersold, R. (2001) Quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry. Nat Biotechnol. 19, 946-951
37 Mirgorodskaya, O. A., Kozmin, Y. P., Titov, M. I., Korner, R., Sonksen, C. P. and Roepstorff, P. (2000) Quantitation of peptides and proteins by matrix-assisted laser desorption/ionization mass spectrometry using (18)O-labeled internal standards. Rapid Commun Mass Spectrom. 14, 1226-1232
38 Kellermann, J. (2008) ICPL--isotope-coded protein label. Methods Mol Biol. 424, 113-123
39 Dayon, L., Hainard, A., Licker, V., Turck, N., Kuhn, K., Hochstrasser, D. F., Burkhard, P. R. and Sanchez, J. C. (2008) Relative quantification of proteins in human cerebrospinal fluids by MS/MS using 6-plex isobaric tags. Anal Chem. 80, 2921-2931

40 Ross, P. L., Huang, Y. N., Marchese, J. N., Williamson, B., Parker, K., Hattan, S., Khainovski, N., Pillai, S., Dey, S., Daniels, S., Purkayastha, S., Juhasz, P., Martin, S., Bartlet-Jones, M., He, F., Jacobson, A. and Pappin, D. J. (2004) Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol Cell Proteomics. 3, 1154-1169
41 Hsu, J. L., Huang, S. Y., Chow, N. H. and Chen, S. H. (2003) Stable-isotope dimethyl labeling for quantitative proteomics. Anal Chem. 75, 6843-6852
42 Gouw, J. W., Krijgsveld, J. and Heck, A. J. Quantitative proteomics by metabolic labeling of model organisms. Mol Cell Proteomics. 9, 11-24
43 Kruger, M., Moser, M., Ussar, S., Thievessen, I., Luber, C. A., Forner, F., Schmidt, S., Zanivan, S., Fassler, R. and Mann, M. (2008) SILAC mouse for quantitative proteomics uncovers kindlin-3 as an essential factor for red blood cell function. Cell. 134, 353-364
44 Bantscheff, M., Boesche, M., Eberhard, D., Mathieson, T., Sweetman, G. and Kuster, B. (2008) Robust and sensitive iTRAQ quantification on an LTQ Orbitrap mass spectrometer. Mol Cell Proteomics. 7, 1702-1713
45 Bantscheff, M., Eberhard, D., Abraham, Y., Bastuck, S., Boesche, M., Hobson, S., Mathieson, T., Perrin, J., Raida, M., Rau, C., Reader, V., Sweetman, G., Bauer, A., Bouwmeester, T., Hopf, C., Kruse, U., Neubauer, G., Ramsden, N., Rick, J., Kuster, B. and Drewes, G. (2007) Quantitative chemical proteomics reveals mechanisms of action of clinical ABL kinase inhibitors. Nat Biotechnol. 25, 1035-1044
46 Smolka, M. B., Zhou, H., Purkayastha, S. and Aebersold, R. (2001) Optimization of the isotope-coded affinity tag-labeling procedure for quantitative proteome analysis. Anal Biochem. 297, 25-31
47 Boersema, P. J., Aye, T. T., van Veen, T. A., Heck, A. J. and Mohammed, S. (2008) Triplex protein quantification based on stable isotope labeling by peptide dimethylation applied to cell and tissue lysates. Proteomics. 8, 4624-4632
48 Boersema, P. J., Raijmakers, R., Lemeer, S., Mohammed, S. and Heck, A. J. (2009) Multiplex peptide stable isotope dimethyl labeling for quantitative proteomics. Nat Protoc. 4, 484-494
49 Raijmakers, R., Berkers, C. R., de Jong, A., Ovaa, H., Heck, A. J. and Mohammed, S. (2008) Automated online sequential isotope labeling for protein quantitation applied to proteasome tissue-specific diversity. Mol Cell Proteomics. 7, 1755-1762
50 Anderson, L. and Hunter, C. L. (2006) Quantitative mass spectrometric multiple reaction monitoring assays for major plasma proteins. Mol Cell Proteomics. 5, 573-588
51 Gavin, A. C., Bosche, M., Krause, R., Grandi, P., Marzioch, M., Bauer, A., Schultz, J., Rick, J. M., Michon, A. M., Cruciat, C. M., Remor, M., Hofert, C., Schelder, M., Brajenovic, M., Ruffner, H., Merino, A., Klein, K., Hudak, M., Dickson, D., Rudi, T., Gnau, V., Bauch, A., Bastuck, S., Huhse, B., Leutwein, C., Heurtier, M. A., Copley, R. R., Edelmann, A., Querfurth, E., Rybin, V., Drewes, G., Raida, M., Bouwmeester, T., Bork, P., Seraphin, B., Kuster, B., Neubauer, G. and Superti-Furga, G. (2002) Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 415, 141-147
52 Ho, Y., Gruhler, A., Heilbut, A., Bader, G. D., Moore, L., Adams, S. L., Millar, A., Taylor, P., Bennett, K., Boutilier, K., Yang, L., Wolting, C., Donaldson, I., Schandorff, S., Shewnarane, J., Vo, M., Taggart, J., Goudreault, M., Muskat, B., Alfarano, C., Dewar, D., Lin, Z., Michalickova, K., Willems, A. R., Sassi, H., Nielsen, P. A., Rasmussen, K. J., Andersen, J. R., Johansen, L. E., Hansen, L. H., Jespersen, H., Podtelejnikov, A., Nielsen, E., Crawford, J., Poulsen, V., Sorensen, B. D., Matthiesen, J., Hendrickson, R. C., Gleeson, F., Pawson, T., Moran, M. F., Durocher, D., Mann, M., Hogue, C. W., Figeys, D. and Tyers, M. (2002) Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature. 415, 180-183
53 Ruehr, M. L., Russell, M. A. and Bond, M. (2004) A-kinase anchoring protein targeting of protein kinase A in the heart. J Mol Cell Cardiol. 37, 653-665
54 Pouyssegur, J. (2000) Signal transduction. An arresting start for MAPK. Science. 290, 1515-1518
55 Vondriska, T. M., Pass, J. M. and Ping, P. (2004) Scaffold proteins and assembly of multiprotein signaling complexes. J Mol Cell Cardiol. 37, 391-397
56 Bantscheff, M., Scholten, A. and Heck, A. J. (2009) Revealing promiscuous drug-target interactions by chemical proteomics. Drug Discov Today. 14, 1021-1029
57 Barglow, K. T. and Cravatt, B. F. (2007) Activity-based protein profiling for the functional annotation of enzymes. Nat Methods. 4, 822-827
58 Scholten, A., van Veen, T. A., Vos, M. A. and Heck, A. J. (2007) Diversity of cAMP-dependent protein kinase isoforms and their anchoring proteins in mouse ventricular tissue. J Proteome Res. 6, 1705-1717
59 Shiyama, T., Furuya, M., Yamazaki, A., Terada, T. and Tanaka, A. (2004) Design and synthesis of novel hydrophilic spacers for the reduction of nonspecific binding proteins on affinity resins. Bioorg Med Chem. 12, 2831-2841
60 Puig, O., Caspary, F., Rigaut, G., Rutz, B., Bouveret, E., Bragado-Nilsson, E., Wilm, M. and Seraphin, B. (2001) The tandem affinity purification (TAP) method: a general procedure of protein complex purification. Methods. 24, 218-229
61 Hanash, S. (2004) HUPO initiatives relevant to clinical proteomics. Mol Cell Proteomics. 3, 298-301
62 Omenn, G. S. (2004) Advancement of biomarker discovery and validation through the HUPO plasma proteome project. Dis Markers. 20, 131-134
63 Evans, G., Wheeler, C. H., Corbett, J. M. and Dunn, M. J. (1997) Construction of HSC-2DPAGE: a two-dimensional gel electrophoresis database of heart proteins. Electrophoresis. 18, 471-479
64 Pleissner, K. P., Sander, S., Oswald, H., Regitz-Zagrosek, V. and Fleck, E. (1996) The construction of the World Wide Web-accessible myocardial two-dimensional gel electrophoresis protein database "HEART-2DPAGE": a practical approach. Electrophoresis. 17, 1386-1392
65 Watanabe, M., Okamura, T., Kokubo, Y., Higashiyama, A. and Okayama, A. (2009) Elevated serum creatine kinase predicts first-ever myocardial infarction: a 12-year population-based cohort study in Japan, the Suita study. Int J Epidemiol. 38, 1571-1579
66 Ji, S. R., Ma, L., Bai, C. J., Shi, J. M., Li, H. Y., Potempa, L. A., Filep, J. G., Zhao, J. and Wu, Y. (2009) Monomeric C-reactive protein activates endothelial cells via interaction with lipid raft microdomains. Faseb J. 23, 1806-1816
67 de Kleijn, D. P., Moll, F. L., Hellings, W. E., Ozsarlak-Sozer, G., de Bruin, P., Doevendans, P. A., Vink, A., Catanzariti, L. M., Schoneveld, A. H., Algra, A., Daemen, M. J., Biessen, E. A., de Jager, W., Zhang, H., de Vries, J. P., Falk, E., Lim, S. K., van der Spek, P. J., Sze, S. K. and Pasterkamp, G. (2009) Local Atherosclerotic Plaques Are a Source of Prognostic Biomarkers for Adverse Cardiovascular Events. Arterioscler Thromb Vasc Biol
68 Fernando, P., Deng, W., Pekalska, B., DeRepentigny, Y., Kothary, R., Kelly, J. F. and Megeney, L. A. (2005) Active kinase proteome screening reveals novel signal complexity in cardiomyopathy. Mol Cell Proteomics. 4, 673-682
69 Faber, M. J., Dalinghaus, M., Lankhuizen, I. M., Bezstarosti, K., Dekkers, D. H., Duncker, D. J., Helbing, W. A. and Lamers, J. M. (2005) Proteomic changes in the pressure overloaded right ventricle after 6 weeks in young rats: correlations with the degree of hypertrophy. Proteomics. 5, 2519-2530
70 White, M. Y., Cordwell, S. J., McCarron, H. C., Prasan, A. M., Craft, G., Hambly, B. D. and Jeremy, R. W. (2005) Proteomics of ischemia/reperfusion injury in rabbit myocardium reveals alterations to proteins of essential functional systems. Proteomics. 5, 1395-1410
71 Heinke, M. Y., Wheeler, C. H., Chang, D., Einstein, R., Drake-Holland, A., Dunn, M. J. and dos Remedios, C. G. (1998) Protein changes observed in pacing-induced heart failure using two-dimensional electrophoresis. Electrophoresis. 19, 2021-2030
72 Weekes, J., Wheeler, C. H., Yan, J. X., Weil, J., Eschenhagen, T., Scholtysik, G. and Dunn, M. J. (1999) Bovine dilated cardiomyopathy: proteomic analysis of an animal model of human dilated cardiomyopathy. Electrophoresis. 20, 898-906
73 Gramolini, A. O., Kislinger, T., Liu, P., MacLennan, D. H. and Emili, A. (2007) Analyzing the cardiac muscle proteome by liquid chromatography-mass spectrometry-based expression proteomics. Methods Mol Biol. 357, 15-31
74 Sridhar, R., Hanson-Painton, O. and Cooper, D. R. (2000) Protein kinases as therapeutic targets. Pharm Res. 17, 1345-1353
75 Lissandron, V. and Zaccolo, M. (2006) Compartmentalized cAMP/PKA signalling regulates cardiac excitation-contraction coupling. J Muscle Res Cell Motil. 27, 399-403
76 Movsesian, M. A. and Bristow, M. R. (2005) Alterations in cAMP-mediated signaling and their role in the pathophysiology of dilated cardiomyopathy. Curr Top Dev Biol. 68, 25-48

Chapter 2 A Multi-Angular Mass Spectrometric View at Cyclic Nucleotide Dependent Protein Kinases: In Vivo Characterization and Structure/Function Relationships

Arjen Scholten¹,², Thin Thin Aye¹,², Albert J.R. Heck¹,²

¹ Department of Biomolecular Mass Spectrometry, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Sorbonnelaan 16, 3584 CA Utrecht, The Netherlands

² Netherlands Proteomics Centre, Utrecht University, The Netherlands

Mass Spectrom Rev. 2008 Jul-Aug;27(4):331-53. Review.


Abstract

Mass spectrometry has evolved in recent years into a well-accepted and increasingly important complementary technique in molecular and structural biology. Here we review the many contributions that mass spectrometry based studies have made in recent years to our understanding of the important cyclic nucleotide activated protein kinase A (PKA) and protein kinase G (PKG). We describe both the characterization of kinase isozymes, substrate phosphorylation, binding partners and post-translational modifications by proteomics-based methodologies, and the structural and functional properties of these kinases as revealed by native mass spectrometry, H/D exchange MS and ion mobility. Combining all these mass spectrometry based data with other biophysical and biochemical data has been of great help in unraveling the intricate regulation of kinase function in the cell in all its magnificent complexity.


Contents

2.1. Introduction
  2.1.1. Kinases, signaling and second messengers
  2.1.2. The cyclic nucleotide dependent protein kinases PKA and PKG
    2.1.2.1. Genes, isozymes, distribution and localization
    2.1.2.2. Structural features
      2.1.2.2.1. The N-terminal domain: dimerization, intracellular localization and autoinhibition
      2.1.2.2.2. Cyclic nucleotide binding domains
      2.1.2.2.3. Catalytic domain
2.2. Mass Spectrometry based Proteomic Analyses
  2.2.1. Characterization of isozymes
  2.2.2. Identification of PKA/PKG interactors
  2.2.3. Phosphorylation states of PKA and PKG
  2.2.4. Phosphorylation by PKA and PKG
2.3. Mass Spectrometry based Structural Biology
  2.3.1. Structural properties of PKA probed by mass spectrometry
  2.3.2. Structural properties of PKG probed by mass spectrometry
    2.3.2.1. Native mass spectrometry analysis of cGMP binding and co-factor binding to PKG
    2.3.2.2. HDX mass spectrometry analysis of cGMP binding to PKG
2.4. Outlook

2.1. Introduction

2.1.1. Kinases, signaling and second messengers

Protein kinases represent a large and diverse class of proteins that play essential roles in intracellular signal transduction. All kinases catalyze the same chemical reaction: phosphorylation, by transfer of the γ-phosphate of ATP to the hydroxyl moiety of the substrate. Modulation of kinase activity is vitally important, and kinase malfunction is linked to many diseases. In eukaryotic systems a plethora of possibly more than 500 genes encode kinases, as a whole often referred to as the “kinome”(1), illustrating their importance and specificity of function. Organisms coordinate activities at every level of their organization, often through complex multiprotein signaling complexes. In multicellular organisms, intercellular signaling events can act over great distances to induce physiological responses. In many cases, detection of a primary signal by cognate receptors affects the level of a second messenger that in turn controls the activity of kinases or phosphatases. Classic examples of such second messenger molecules are the cyclic nucleotides cyclic adenosine monophosphate (cAMP) and cyclic guanosine monophosphate (cGMP). A rise in cAMP concentration is induced by binding of, among others, catecholamines to a G-protein-coupled β-adrenergic receptor that is directly associated with adenylate cyclase, which converts ATP into cAMP(2). cGMP production can be induced by nitric oxide (NO) that originates from different nearby cell types. NO directly activates the intracellular soluble guanylate cyclase (sGC), which produces cGMP from GTP(3, 4). The formation of cGMP can also be initiated by a membrane-bound guanylate cyclase (pGC) that responds to extracellular binding of natriuretic peptides(5). The second messenger molecules may activate a range of proteins, including both kinases and phosphatases.
The major targets of the second messenger molecules cAMP and cGMP are the protein kinases PKA (cAMP-dependent protein kinase) and PKG (cGMP-dependent protein kinase), respectively, the main subjects of this review. Through the years PKA, and to a lesser extent PKG, have been the subject of extensive investigation, and much of the knowledge we currently have on kinase activity and regulation originates from these studies. In recent years mass spectrometry has taken an important role in these studies, for instance in the characterization of isozymes of the kinases, their post-translational modifications and their interaction partners in multi-protein signaling scaffolds, but also in the more in-depth structural analysis of conformational adaptation of the kinases upon activation, co-factor binding or binding to a scaffold. Here we review many of these recent studies, highlighting specifically the role of mass spectrometry in all its flavors, ranging from state-of-the-art proteomics analysis using nanoLC MS/MS to more structural biology oriented studies that use H/D exchange monitored by mass spectrometry to follow protein folding and conformations. Figure 1 displays an overview of the areas of research in which mass spectrometry has contributed to our understanding of the important kinases PKA and PKG. Before we review these mass spectrometry based studies, we provide an introduction to the current knowledge about the protein kinases PKA and PKG.
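For reference, the reactions described in this section can be written out explicitly. The scheme below is a summary in standard biochemical notation; the pyrophosphate (PP_i) released in the cyclase reactions and the charge on the phosphate ester are textbook details added here for completeness and are not spelled out in the text.

```latex
% Requires the amsmath package.
\begin{align*}
\mathrm{ATP} &\xrightarrow{\text{adenylate cyclase}} \mathrm{cAMP} + \mathrm{PP_i} \\
\mathrm{GTP} &\xrightarrow{\text{sGC or pGC}} \mathrm{cGMP} + \mathrm{PP_i} \\
\text{protein-OH} + \mathrm{ATP} &\xrightarrow{\text{kinase}} \text{protein-O-PO}_3^{2-} + \mathrm{ADP}
\end{align*}
```

The first two lines produce the second messengers; the third is the phosphoryl transfer catalyzed by all kinases, including PKA and PKG once they are activated by cAMP and cGMP, respectively.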

Figure 1. Schematic outline of this review, depicting the areas of research wherein mass spectrometry has contributed to our improved understanding of the kinases PKA and PKG.

2.1.2. The cyclic nucleotide dependent protein kinases PKA and PKG
2.1.2.1. Genes, isozymes, distribution and localization
The major target of cGMP is cGMP-dependent protein kinase (PKG). Mammals have two PKG genes, prkg1 (6, 7) and prkg2 (8), that encode PKG type I and type II, respectively. PKG I has two splice variants, Iα and Iβ, that yield PKG forms with distinct differences in their first 80-100 amino acids (9, 10). All PKG isozymes form dimeric protein structures with similar domain architectures, as depicted in Figure 2. At the very N-terminus resides the dimerization domain, followed by the autoinhibitory sequence, two cGMP binding domains and a catalytic domain. Binding of cGMP to both cGMP binding domains fully activates the protein to phosphorylate intracellular targets. Different types of PKG have different tissue distributions: the Iα isozyme is mainly found in lung, heart, dorsal root ganglia and cerebellum. The Iβ isozyme is highly expressed in platelets, hippocampal neurons and olfactory bulb neurons. Smooth muscle, including uterus, vessels, intestine and trachea, contains both Iα and Iβ isozymes (11). Myristoylation anchors PKG type II at the plasma membrane (12). This type is only present in kidney, cerebellum and mucosa (13).

Figure 2. Domain organization and structural models of PKA and PKG, in their inactivated and activated forms. A) Domain organization of PKA. B) Domain organization of PKG. Domains with high sequence homology between PKA and PKG are depicted in similar colors. AI = auto-inhibitory domain. C) Structural model of PKA activation. PKA is a holoenzyme consisting of a regulatory subunit dimer and two catalytic subunits. Activation of PKA occurs when four molecules of cAMP bind to the R subunit dimer, two to each subunit. When both cAMP binding sites are occupied, the R subunit adopts a conformation with low affinity for the C subunit and the holoenzyme dissociates. D) Structural model of PKG activation. Activation of PKG occurs when four molecules of cGMP bind. They induce a significant conformational change in the protein whereby the auto-inhibitory (AI) domain, which obstructs activity, is released. [Color Figure can be viewed in the online issue, which is available at www.interscience.wiley.com]

The main effector of cAMP is PKA. Although PKA has a similar domain architecture to PKG, it is composed of two genetically distinct subunits, the regulatory subunit (PKA-R) and the catalytic subunit (PKA-C), which form a heterotetrameric holoenzyme [(PKA-R)2-(PKA-C)2]. Upon stimulation by cAMP, the holoenzyme dissociates into [R2(cAMP)4] and two free, active PKA-C subunits that can phosphorylate intracellular targets (Figure 2). As cAMP levels drop, cAMP dissociates from the PKA-R dimer, thereby preparing it to bind and inactivate PKA-C again (14, 15). The mammalian PKA family is quite diverse and includes four types of PKA-R and three of PKA-C, each encoded by a unique gene: PKA-RIα, -RIIα, -RIβ and -RIIβ, and PKA-Cα, -Cβ and -Cγ (16, 17). The currently available evidence suggests that each PKA-R isozyme can form homodimers, which can associate with any of the PKA-C isozymes (18, 19). To complicate matters further, the existence of a PKA-RIα/RIβ heterodimer has also been reported (20). A further diversification of PKA family members in mammals is attained by the presence of multiple splice variants of the C-subunits (21). This eventually leads to a large variety of different PKA holoenzymes.
Concerning tissue distributions, it was demonstrated that RIα is expressed in the heart and central nervous system, whereas RIβ expression is more restricted to nervous tissues such as the spinal cord and the brain (22). Furthermore, RIIα and RIIβ are both expressed in the brain and show distinct patterns of expression, with RIIα predominantly expressed in the heart and RIIβ expressed mainly in the liver and fat tissue (23). The differential expression pattern hints at different functional assignments of the different
PKA subtypes, thereby suggesting they are not redundant proteins. This is further supported by studies using different PKA-R knock-out mouse models, as reviewed earlier (24). Where and when enzymes become active has profound implications for the cellular processes that they control. For PKA, strong spatial resolution is attained through the interaction of the regulatory subunit dimer with the diverse family of A-kinase anchoring proteins (AKAPs), which localize cAMP signaling complexes, often including phosphatases and phosphodiesterases, to discrete intracellular compartments (25). Since the discovery of the first AKAP, MAP2 (26), the family of PKA anchoring proteins has grown to more than 50 very diverse members (including splice variants). An overview of the AKAPs described in the literature, including nomenclature synonyms and splice variants, the tissues in which they have primarily been detected, and the accompanying proteins observed in the scaffolding complexes, is given in Table 1.

Table 1. Summary of AKAP nomenclature, described binding partners and tissue distribution. For AKAP7 and AKAP9, specific splice isoforms, shown in brackets, form different complexes or show different tissue distributions.

| Gene symbol | Synonyms / splice variants | Proteins in complex | Tissue(s) | Refs |
|---|---|---|---|---|
| AKAP1 | D-AKAP1, sAKAP84, AKAP121, AKAP149 | PKA, PP1, PDE4A, PTPD1, AMY-1 | Testis, sperm | (27-32) |
| AKAP2 | AKAP-KL, PALM2-AKAP2 | PKA | Kidney, lung, cerebellum, heart | (33, 34) |
| AKAP3 | AKAP110, FSP95, SOB1 | PKA, Ropporin1, AKAP4, PDE4A, Sp17, Gα13, Fibrous sheath protein 1 and 2 | Testis, sperm | (35-40) |
| AKAP4 | p82, AKAP82, Fsc1 | PKA, ASP, AKAP3, Fibrous sheath protein 1 and 2 | Testis, sperm | (38, 41) |
| AKAP5 | AKAP150, AKAP75, AKAP79 | PKA, AMPA-R, PP2B, PKC, β2-AR, Kir2.1, SAP97, PSD-95, NMDA-R, L-type Ca2+-channels, GABA-A-R, IQGAP1, Adenylate Cyclase V and VI, ASIC1a and ASIC2a | Brain, neurons | (42-53) |
| AKAP6 | mAKAP, AKAP100 | PKA, PDE4D3, RyR, PP1, NCX1, PKC, PP2A, nesprin-1α, Epac1, ERK5-kinase, PDK1 | Skeletal muscle, heart, brain | (54-61) |
| AKAP7 | AKAP15, AKAP18 | PKA, Cav- and Nav-channels, AQP2 channels | Pancreas, brain (α and β), kidney (γ), placenta and heart (δ) | (62-66) |
| AKAP8 | AKAP95 | PKA, p68 RNA helicase, D-type Cyclins, PDE4A, AMY-1, ACAP-D2/Eg7, Caspase-3, G1-S Cyclins, Fidgetin, HDAC3 | Brain, thyroid, T-lymphocytes | (30, 32, 67-75) |
| AKAP9 | AKAP350, AKAP450, CG-NAP, Yotiao | PKA, NMDA-R, PP1, PKCε, KCNQ1 channel, TACC1, Casein kinase 1, Intracellular Cl- channel (CLIC), γ-tubulin complex 2 and 3 (GCP2/3), Ran, PDE4D3, IP3-receptor | Brain, skeletal muscle, heart, testis, pancreas (Yotiao); brain, colon, liver (AKAP350); kidney (AKAP450) | (76-87) |
| AKAP10 | D-AKAP2 | PKA, PDZK1 | | (88, 89) |
| AKAP11 | AKAP220 | PKA, PP1c, GSK3β | Testis, kidney, lung | (90-92) |
| AKAP12 | Gravin, AKAP250, SSeCKS | PKA, PKC, β2-AR, CaM, Cyclin-D, GalT, Src | Brain, testis | (93-99) |
| AKAP13 | AKAP-Lbc, Ht31 | PKA, Rho, 14-3-3, CTNNAL1, tTG | Heart, placenta, lung | (100-104) |
| AKAP14 | T-AKAP80, AKAP28 | - | Testis | (105) |
| VIL2 | Ezrin, Villin 2 | PKA, CFTR, NHERF, E3KARP, NHE3 | Spleen, intestine, uterus, kidney, heart | (106-108) |
| WAVE1 | SCAR | PKA, Abl (Abelson tyrosine kinase), Arp2/3, NAP125, PIR121, HSPC300, Abi, WRP, Rac, profilin | Brain | (109-111) |
| Rab32 | - | PKA | Liver, kidney | (112) |
| SKIP | - | PKA, Sphingosine kinase type 1 | Heart | (113, 114) |
| BIG2 | Brefeldin A inhibited GEP2 | PKA, exocyst protein 70, AMY1 | Heart, brain, placenta, kidney, pancreas | (115), (116-118) |
| MTG16b | Myeloid Translocation Gene 16b | PKA, PDE4A, PDE7A | T-lymphocytes | (30, 119) |
| Myosin VIIA | - | PKA, Vezatin | Eyes, ears | (120, 121) |
| MyRIP | Slac2-c | PKA, Rab27, Myosin-Va/VIIA | Brain, kidney, heart, liver, lung, muscle, testis and pancreas | (122, 123) |
| Pericentrin | - | PKA, PKC | Kidney, thymus, liver | (124, 125) |
| MAP2 | Microtubule associated protein 2 | PKA, Tubulin, ERK3 | Brain, testis | (126-128) |
| Myospryn | CMYA5 | PKA, dysbindin, desmin | Heart, skeletal muscle | (129-131) |
| Merlin | Schwannomin | PKA, Ezrin | Neurons, hippocampus | (132, 133) |
| Synemin | - | PKA, many cytoskeletal proteins | Heart, brain | (134) |
| α4-integrin | - | PKA, paxillin | Ubiquitous | (135, 136) |

The most important similarity between AKAPs is their ability to anchor PKA-R. Much effort has been put into the characterization of the PKA-AKAP interaction. Early on, deletion mapping studies identified a region of MAP2 that mediates association with PKA-RII (137, 138). Later work by Carr et al. identified the interaction domain as an amphipathic helix of approximately 14-17 amino acid residues (139). An amphipathic helix is defined as an alpha helix with opposing polar and non-polar faces, i.e. a hydrophilic and a hydrophobic side. Although no strong consensus sequence has emerged, helical wheel alignments of over
20 AKAPs have shown the presence of the amphipathic helix to be essential for PKA anchoring. Further evidence for this was provided by the introduction of presumed helix-disrupting residues in the amphipathic helix domain, which abolished the PKA-R interaction both in vitro and in vivo (139, 140). Most AKAPs known to date have high affinity specifically for the RII regulatory subunit of PKA, whereas a few also bind, or even bind more specifically, to RI. For instance, the dual-specificity AKAP mAKAP (AKAP6) binds to both RI and RII in a similar fashion via the amphipathic helix. Recently, it was shown that α4 integrins bind specifically to the RI subunit of PKA (135). However, this interaction requires the intact holoenzyme and seems to occur through a different kind of interaction than that observed for the other AKAPs, i.e. not via the dimerization motif of the RI dimer, extending the repertoire of PKA scaffolding proteins in cAMP signaling even further.
2.1.2.2. Structural Features
PKG and PKA belong to the family of serine/threonine kinases. Both are activated by cyclic nucleotides and share many common structural features. Both kinases contain three functional domains: an amino-terminal domain that mediates dimerization, intracellular localization and auto-inhibition; a regulatory domain that has two in-tandem cyclic nucleotide binding pockets; and a catalytic domain consisting of an ATP/Mg2+ and a substrate binding site. Figure 2 gives schematic representations of the domain structure of PKA and PKG. Besides the strong similarities, the striking difference between the two is that PKA forms the earlier mentioned [(PKA-R)2-(PKA-C)2] heterotetramer, whereas PKG is a homodimer in which the regulatory and catalytic domains are present on a single polypeptide. In recent years, crystallographic structures of PKA-R, PKA-C and the holoenzyme have greatly contributed to our understanding of protein kinase function and regulation (for review, see Taylor et al. (14, 15)).
To date, there is no crystal structure of PKG available. Below, the three different functional domains of PKA and PKG will be discussed in more detail.
2.1.2.2.1. The N-terminal domain: dimerization, intracellular localization and autoinhibition
For both cyclic nucleotide kinases, the N-terminus regulates the same three functions: dimerization, intracellular localization and auto-inhibition. For PKA, detailed solution NMR structure determination by Newlon et al. (141) allowed a comprehensive view of the interaction between the two PKA-R monomers. In addition, they described the interaction of the PKA-R dimer with a representative AKAP-anchoring domain peptide, which allowed the first view of the molecular mechanism underlying PKA-R’s intracellular localization through interaction with AKAPs (141-143). The (PKA-R)2 molecule forms a so-called X-type four-helix bundle dimerization motif with an extended hydrophobic face. This hydrophobic face is essential for the interaction with the non-polar side of the AKAP’s amphipathic helix motif.

The structures of the dimerization domains of all PKG isozymes (Iα, Iβ and II) are very different from that of PKA. They dimerize through an α-helix with a hydrophobic leucine/isoleucine zipper motif. This helix contains a leucine or isoleucine at every seventh position, also referred to as a heptad repeat (144-146). The presence of this motif makes all the PKG isozymes homodimeric proteins (147-150). Deletion studies within the N-terminal domain revealed several of its functions. It was established early on that proteolytic cleavage of PKG Iα yielded a fragment, later designated PKG ∆1-77, that is monomeric (151), binds two cGMP molecules and is constitutively active (152). These data indicate that the N-terminus keeps PKG in the inactive, auto-inhibited state and ensures dimerization. The autoinhibitory region of PKG contains a pseudo-substrate sequence (K/R-K/R-X-G/A-I/V-SA-E-P/S) that efficiently inhibits it in the absence of bound cGMP (153). Later, the autoinhibitory domain was pinpointed to the region around Ile63 and Ser64 in PKG Iα (154, 155). Besides regulation of PKG by cGMP, a more complex regulatory mechanism was proposed when it was found that both PKG isozymes can autophosphorylate several residues in their N-termini (149, 156-158). Although the exact physiological role of the autophosphorylation events is unclear, it was found that recombinant autophosphorylated PKG showed an increased affinity for cAMP in vitro (158-160). Although PKG Iα and Iβ do not differ in sequence beyond the N-terminus, the activation constant (Ka) is 15-fold higher for PKG Iβ (148). Therefore it was hypothesized that the N-terminus also mediates PKG’s affinity for cGMP. PKG Iα was observed to have a high-affinity and a low-affinity site, which bind cGMP with Kd values of 10 and 150 nM, respectively, while PKG Iβ has two low-affinity sites (148, 161). High-affinity binding of cGMP to PKG Iα is based on positive cooperativity between the two binding sites, i.e. binding of one molecule of cGMP facilitates the binding of another. Clearly this was mediated by the N-terminus, as PKG ∆1-77 had lost cooperativity (162). Although much less well defined than PKA-AKAP interactions, for PKG it was found that the N-terminus is essential for interaction with so-called G-kinase anchoring proteins (GKAPs) (163-165). Interestingly, PKG type II is myristoylated at the amino terminus, localizing the enzyme to the plasma membrane and enabling it to phosphorylate the intestinal chloride channel CFTR (12, 166).
2.1.2.2.2. Cyclic nucleotide binding domains
Throughout cyclic nucleotide binding proteins, the motifs present in these pockets are well conserved across species, even as far as E. coli and Drosophila (167, 168). All cAMP and cGMP binding sites show high structural and sequence conservation. Crystallographic structures of PKA-R showed that each cAMP-binding domain is composed of a helical subdomain and an eight-stranded β-barrel where cAMP binds. The essential feature of the β-barrel is a conserved phosphate binding cassette (PBC) that anchors the cAMP. The cAMP binding domains are joined to the dimerization domain by a flexible linker, which also
includes the autoinhibitory sequence that docks to PKA-C in the absence of cAMP. There is also strong conservation between cAMP and cGMP binding domains. Interestingly, a single residue seems to determine cAMP/cGMP specificity. In PKG (and in cGMP-specific phosphodiesterases (PDEs) and cyclic nucleotide gated (CNG) ion channels), the cGMP specificity (>100-fold over cAMP) is largely determined by the presence of a single threonine (e.g. T177 and T301 in human PKG Iα) that specifically interacts with the guanine base of cGMP (169). In PKA, this residue is replaced by an alanine (e.g. A211 and A335 in human PKA-RIα), thereby providing an almost entirely hydrophobic environment for the more hydrophobic adenine moiety. When the threonine of PKG’s cGMP binding domain is substituted by an alanine, the cGMP specificity over cAMP is abolished. Likewise, substitution of the invariant alanine in PKA by a threonine increases its affinity for cGMP (170, 171). Another important site in these pockets is the glutamate within the conserved sequence F-G-E (about 10 amino acids upstream of the earlier mentioned T or A residues), which forms a hydrogen bond with the ribose 2’-hydroxyl group of either cGMP or cAMP. An arginine next to the threonine (or alanine) is conserved and essential for chelating the cyclic phosphate diester (172-174).
2.1.2.2.3. Catalytic domain
Not only PKA and PKG, but all eukaryotic protein kinases show high structural similarity in their catalytic domains; the two-lobed catalytic core consists of a small lobe with the ATP-binding domain and a large lobe that harbors the peptide (substrate) binding and catalytic site (175, 176). From extensive crystallographic analyses of PKA-R and PKA-C in the presence and absence of cAMP, ATP and peptide substrates and inhibitors, major aspects of the structure-function relationships could be studied. Some of the results obtained with PKA could be extrapolated to PKG, for which a crystal structure is still lacking.
The study by Dostmann et al., in which the catalytic domain of PKG is modeled on PKA-C, is most useful in this respect (177). Extensive studies utilizing peptide libraries revealed very similar substrate specificity for PKA and PKG: KRAERKASIY and TQAKRKKSNA, or more generally R(R/K)X(S/T) and (R/K2-3)(X/K)(S/T), respectively (177, 178). Therefore it is believed that subtle differences in the substrate interaction sites on the large catalytic lobe of PKG and PKA determine substrate specificity. Many substrates of PKA and PKG can be phosphorylated by both kinases in vitro, once again underlining the importance of compartmentalization in vivo. The crystallographic structure of the PKA-C subunit in complex with the inhibitor peptide PKI 5-24 and Mg2+/ATP revealed many of the kinase/substrate interactions involved in catalysis (179, 180). Extrapolation of these results to PKG shows many similarities (181). First there is the GXGXXGXV motif that starts at Gly366 and is postulated to function as a hydrophobic pocket to bind the adenine ring of ATP. Further towards the C-terminus resides the linker region (Lys431-Ser454) that links the two lobes. Conserved Glu443 of this linker is
believed to function as the electronegative interactor with one of the basic residues in the substrate consensus sequence (possibly Lys6 in TQAKRKKSNA). The region between Tyr481 and Asn488 is a candidate to function as the catalytic loop, with Asp483 as the possible proton acceptor of the serine or threonine hydroxyl group. Lys485, also in the catalytic loop, is implicated in facilitating phosphotransfer by neutralizing the negative charge of the γ-phosphate. The very C-terminus of PKG is likely to contribute to substrate recognition, as it does in PKA (182). In the cleft between the two lobes, the actual phosphotransferase reaction takes place. The most important structural feature within the catalytic domain of PKG is the permanent phosphorylation of Thr516 (in PKG Iα) (183). This phosphorylation is essential for catalytic activity. In many other kinases a similar phosphorylation at a threonine or tyrosine residue in the catalytic domain is essential for activity, and in several cases this event actually works as the on/off switch of the protein. Therefore the region that contains this phosphorylation is also designated the activation loop (184). It is believed that hydrogen bonds of the phosphate oxygens permit proper orientation of the substrate.
2.2. Mass Spectrometry based Proteomic Analyses
2.2.1. Characterization of isozymes
The identification of specific isozymes of PKA and PKG occurring in vivo is not straightforward. For instance, two different isozymes of PKA, type I and type II, were initially identified based on their pattern of elution from cellulose columns (185), but this method requires significant band shifts and is typically unable to resolve more closely homologous isozymes. Moreover, several isoforms of the PKA catalytic subunit have also been observed. The specific isozymes of PKA (and PKG) have high sequence similarity (for PKA see Figure 3), but often only a single isozyme is recruited for specific tasks (186, 187).
Illustrative is the identification of PKA-isozyme specific interactions with different AKAPs (188). This makes the proper analysis of specific isozymes imperative in the study of cyclic nucleotide signaling function. Hitherto, the identification of specific isozymes of PKA and PKG in vivo in a complex sample, such as a protein lysate, has relied heavily on enrichment techniques based on specific antibodies, which has however turned out to be quite cumbersome, as all PKA (and PKG) isozymes have high sequence and structural homology. In recent years, mass spectrometry, and specifically nanoLC MS/MS, has become the standard for high-throughput protein identification. One of the major challenges in the mass spectrometric identification of proteins in highly complex mixtures is the limited dynamic range. This makes the identification of lower-abundance proteins (signaling proteins like PKA and PKG) in the presence of highly abundant housekeeping proteins (like actin, GAPDH, myosin etc.) intrinsically difficult. Therefore, the characterization and identification of e.g. cAMP/cGMP signaling proteins by mass spectrometry requires pre-fractionation, for instance by the use of the abovementioned
PKA/PKG-targeted antibodies or, alternatively, by using affinity resins with immobilized cAMP or cGMP. The latter approach is nowadays often referred to as chemical proteomics (75, 189, 190). For isozyme characterization, identification by a few peptides, which is still the standard in most proteomics experiments, is not sufficient. Isozyme characterization requires high sequence coverage of the proteins of interest, which can best be obtained by a combination of protein enrichment, in-solution digestion by a variety of proteases and state-of-the-art nanoLC-MS/MS analysis. A specific enrichment with immobilized cAMP proved very valuable in the identification of all in vivo occurring PKA-R isozymes in mammalian tissue within a single experiment (34). To illustrate this, Figure 3 shows an alignment of the sequences of all four known human PKA-R subunits (RIα, RIβ, RIIα and RIIβ), with the sequence coverage obtained in chemical proteomics experiments in human heart tissue indicated (unpublished data). Peptides that are potentially shared by different isozymes are gray-colored, whereas isozyme-specific (unique) peptides are in boxes. These unique peptides are true indicators of the presence of the isozyme in the sample. In principle, these uniquely detected peptides provide a means for differential and absolute quantification of the isozyme, using stable isotope labeling (191) or the AQUA methodology, in which an isotopically labeled internal standard is prepared (192). Unfortunately, the affinity enrichment technique using immobilized cAMP does not directly enrich for the catalytic subunit of PKA, for which many isozymes have also been described. Bowen et al. (193) reported a similar strategy that may be used to focus on PKA-C isozymes and their splice variants occurring in the nematode C. elegans. To this end, the C-subunits were affinity purified using an immobilized PKI (Protein Kinase Inhibitor) peptide.
Mass spectrometric analysis revealed several isozymes and splice variants whose abundance was found to depend significantly on the developmental stage of the nematode (193).
2.2.2. Identification of PKA/PKG interactors
In a pioneering study by Lohmann et al. (127), immobilized cAMP was used in combination with SDS-PAGE to find that the regulatory subunit of PKA co-purifies with a variety of specific proteins and that this group of specific proteins differs between tissues. One of the co-purified proteins was identified as MAP2, designated as the first AKAP two years earlier (26). The diverse family of AKAPs provides strong spatial resolution for PKA through interaction with the regulatory subunit dimer. Through these interactions, cAMP signaling is concentrated in larger multiprotein signaling complexes, often including phosphatases and PDEs, in discrete intracellular compartments (25). With current mass spectrometric techniques, the identification of many different AKAPs in a single sample is achievable (34, 114). We performed an experiment similar to that of Lohmann et al. (127) but identified all proteins by state-of-the-art LC-MS/MS. In a single experiment on a single mouse ventricular tissue sample, we were able to identify 13 different AKAP families. For several of them multiple splice variants were identified based on distinct isozyme determining
peptide identifications. It was also demonstrated how such an approach can aid in the identification of novel AKAPs, for instance by probing the protein sequences in the cAMP-interactome dataset in silico for the presence of AKAP interaction domains (139, 194). In this way, we designated sphingosine kinase 1 interacting protein (SKIP) as a potential novel AKAP in rat ventricular tissue (114).
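
As an illustration of what such an in silico probe for amphipathic helices could look like, the sketch below slides a window along a protein sequence and scores each window by its helical hydrophobic moment. This is not the procedure used in the cited studies; the Kyte-Doolittle hydropathy scale, the 100° rotation per residue of an ideal α-helix, the 14-residue window and the function names are all assumptions made for the example.

```python
import math

# Kyte-Doolittle hydropathy values (positive = hydrophobic); assumed scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

def hydrophobic_moment(window, angle_deg=100.0):
    """Mean hydrophobic moment of a window modeled as an ideal alpha helix."""
    sin_sum = cos_sum = 0.0
    for i, aa in enumerate(window):
        theta = math.radians(i * angle_deg)
        h = KD.get(aa, 0.0)  # unknown residues contribute nothing
        sin_sum += h * math.sin(theta)
        cos_sum += h * math.cos(theta)
    return math.hypot(sin_sum, cos_sum) / len(window)

def best_helix_window(seq, width=14):
    """Return (start, window, moment) for the highest-scoring window, or None."""
    best = None
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        moment = hydrophobic_moment(window)
        if best is None or moment > best[2]:
            best = (start, window, moment)
    return best
```

A high moment only indicates that hydrophobic and polar residues would segregate onto opposite faces if the segment were helical; candidates found this way would still need experimental confirmation of PKA-R binding.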

Figure 3. Alignment of the four human PKA regulatory subunits RIα, RIβ, RIIα and RIIβ, together with the extensive sequence coverage obtained in cAMP affinity-based chemical proteomics experiments in human heart tissue (unpublished data), whereby the gray boxed peptide sequences are unique peptides, whereas the red colored boxes represent peptide sequences shared between at least two of the isozymes. [Color Figure can be viewed in the online issue, which is available at www.interscience.wiley.com]
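
The distinction between shared and isozyme-unique peptides can be made concrete with a small in silico digest. The sketch below is an illustration only: the two sequences are invented toy strings (not PKA-R sequences), the cleavage rule is the usual simplified trypsin rule (after K or R, not before P, no missed cleavages), and the function names are hypothetical.

```python
import re
from collections import defaultdict

def tryptic_peptides(seq, min_len=6):
    """Cleave C-terminal to K or R, except before P; no missed cleavages."""
    pieces = re.split(r"(?<=[KR])(?!P)", seq)
    return {p for p in pieces if len(p) >= min_len}

def classify_peptides(isozymes):
    """Split peptides into those unique to one isozyme and those shared."""
    owners = defaultdict(set)
    for name, seq in isozymes.items():
        for pep in tryptic_peptides(seq):
            owners[pep].add(name)
    unique = {p: next(iter(o)) for p, o in owners.items() if len(o) == 1}
    shared = {p: sorted(o) for p, o in owners.items() if len(o) > 1}
    return unique, shared

# Toy sequences for illustration only; they are NOT real PKA-R sequences.
toy = {
    "isoA": "MASGTTKELEQYVQKHNIQALLKDSIVQLCTAR",
    "isoB": "MASGSSKELEQYVQKHNIQALLKDCIVHLCISK",
}

unique, shared = classify_peptides(toy)
print("unique:", unique)
print("shared:", shared)
```

Only the peptides in `unique` would count as evidence for the presence of one particular isozyme, which is why high sequence coverage matters for closely related proteins.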

Figure 4. Overview of detected AKAPs in chemical proteomics experiments on mouse ventricular heart tissue, whereby their putative intracellular interaction points are annotated. Adapted from (34). [Color Figure can be viewed in the online issue, which is available at www.interscience.wiley.com]

Figure 4 gives an overview of all AKAPs detected in such chemical proteomics experiments on mouse ventricular heart tissue, with their putative intracellular interaction points annotated. More and more biologists are starting to use the protein identification power of current LC-MS/MS techniques, often at the exploratory point of an investigation where the characterization of specific interactors of a target protein is required. These unknown interactors are often obtained by different affinity purification methods such as chemical proteomics (195), genetically tagged proteins (196) and immunoprecipitation. Often, the initial identification of an interactor or complex constituent is the starting point for a range of functional biological experiments. In the cyclic nucleotide signaling field this is of particular interest, as the AKAP conundrum shows that specificity of signaling is attained by intracellular localization of specific PKA-AKAP complexes near their targets. A nice example of this latter approach was reported by Schlossmann and co-workers, who observed the co-purification of an unknown protein when isolating PKG Iβ by immobilized cGMP and immunoprecipitation (187). Initially, mass spectrometry was used for identification of this novel protein, which was designated IRAG. The IP3-receptor was also found to be part of this complex. Subsequent experiments then showed the function of this complex in the regulation
of intracellular calcium levels. In experiments in our laboratory we were able to co-purify PKG I with IRAG in rat lung tissue, obtaining a high sequence coverage that allowed us to identify several phosphorylation sites on IRAG (unpublished data). A potential advantage and disadvantage of the use of immobilized cyclic nucleotide beads is that they cross-react, as both cAMP and cGMP beads enrich for both PKA and PKG. It would be interesting in the future to design beads that selectively enrich for either PKA or PKG, or even better for specific PKA isozymes.
2.2.3. Phosphorylation states of PKA and PKG
Besides phosphorylating substrate proteins, both PKA and PKG are known to be phosphorylated in vivo and to have the ability to autophosphorylate (156, 197). PKA-C is found autophosphorylated at Thr197 and Ser338, while PKA-RII autophosphorylates at Ser97 (198). PKA-RI does not seem to autophosphorylate (199). In addition, PKA-R and PKA-C are subject to heterophosphorylation by other kinases, such as casein kinase II and glycogen synthase kinase 3β (200). LC-ESI-MS/MS techniques allow the study of protein phosphorylation owing to specific ion fragmentation patterns that occur upon dissociation of the phosphopeptides in tandem MS. In combination with essential phosphopeptide enrichment techniques, such as metal ion affinity resins based on immobilized metal affinity chromatography (IMAC) (201, 202) and the more recently introduced use of solid titanium dioxide (TiO2) particles (157, 203, 204), a wide-range analysis of in vivo phosphorylation sites has become possible. Implementation of these techniques in analytical platforms has tremendously increased the ability to detect phosphorylated peptides, even when they are of quite low abundance, for instance due to sub-stoichiometric phosphorylation events. The TiO2-based phosphopeptide enrichment technique was pioneered in our laboratory.
In fact, the analysis of PKG’s autophosphorylation sites was one of the first biological questions addressed using this new enrichment method (157). Bovine PKG Iα was autophosphorylated in vitro for 0, 10 and 60 minutes. Using the TiO2 enrichment technique and two different proteases, the PKG sequence could be largely covered. From the extracted ion chromatograms the amount of phosphorylation was differentially quantified and the extent of autophosphorylation could be determined. In this study Thr516 and Ser26 were found to be endogenously phosphorylated in PKG Iα purified from bovine lung. Thr516, as mentioned earlier, resides in the activation loop and is required for catalytic activity. Ser50, Thr58 and Ser72 were found to be rapidly autophosphorylated, while autophosphorylation on Ser44, Ser64 and Thr84 showed a slower incorporation. More recently, the mass spectrometric characterization of endogenous mouse ventricular PKG enriched by immobilized cGMP revealed the presence in vivo of a phosphate at Ser64, Thr84 and Thr516 (205). Characterization of PKA-RIα in mouse ventricular tissue resulted in the identification of phosphorylation at Ser77 and Ser83, while for PKA-RIIα the known Ser97 site was found (34).
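
The kind of differential quantification from extracted ion chromatograms described above can be sketched as a simple ratio of peak areas. The example below is only an illustration: the peak areas are invented numbers, not data from (157), and the function name is hypothetical. It also assumes that the phosphorylated and unmodified forms of a peptide respond equally in the mass spectrometer, which is generally not the case, so the result is best read as a relative trend across time points rather than an absolute stoichiometry.

```python
def phospho_fraction(areas):
    """Fraction of total peak area carried by phosphorylated forms.

    `areas` maps the number of phosphates on the peptide (0, 1, 2, ...)
    to the integrated peak area of that form.
    """
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("no signal for this peptide")
    return sum(a for n, a in areas.items() if n > 0) / total

# Invented peak areas for one peptide at 0, 10 and 60 minutes.
time_course = {
    0: {0: 9.0e6, 1: 1.0e6},
    10: {0: 5.0e6, 1: 5.0e6},
    60: {0: 1.0e6, 1: 9.0e6},
}

for minutes, areas in sorted(time_course.items()):
    print(minutes, "min:", round(phospho_fraction(areas), 2))
```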

Figure 5. A) Overlaid Single Ion Chromatograms (SICs) of PKA RIα peptide 71-92 in the non-, singly and doubly phosphorylated form. SICs were generated with precursor masses of the 3+ ions at m/z 802.09, 828.74 and 855.40 ± 0.02 Da, respectively. B) MS/MS spectrum corresponding to the peak at 35.82 min. C) MS/MS spectrum of the peak at 36.55 min. D) MS/MS spectrum of the doubly phosphorylated peptide with a retention time of 37.29 min. Also depicted are observed b- and y-ions that were crucial for the annotation of the phosphate group at the specified position(s) in the differentially modified peptide 71-92. Adapted from (34).

Interestingly, as illustrated in Figure 5, for PKA-RIα in mouse ventricular tissue, peptides originating from amino acids 71-92, containing the phosphorylation sites Ser77 and Ser83, were detected as non-, singly and doubly phosphorylated, revealing that at least 4 differently phosphorylated PKA-RIα species are present in vivo in mouse heart tissue, adding to the complexity of regulation in the PKA system. Recently, more mass spectrometric data on the post-translational modifications of PKA-C have been reported as well (206), including two autophosphorylation sites on the Cα-subunit of recombinantly expressed murine PKA. The exact physiological roles of most of these phosphorylation events remain to be resolved, which is not a simple task.
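
The three precursor m/z values quoted in the Figure 5 caption can be checked for internal consistency: each phosphorylation adds HPO3 to the peptide, so for a 3+ ion every added phosphate should shift the m/z by one third of that mass. The short sketch below performs this check; the monoisotopic mass of HPO3 (79.96633 Da) is a standard value, while the function name is chosen for this example.

```python
HPO3 = 79.96633  # monoisotopic mass added per phosphorylation, in Da

def expected_mz(unmodified_mz, charge, n_phospho):
    """m/z expected after adding n_phospho phosphate groups at fixed charge."""
    return unmodified_mz + n_phospho * HPO3 / charge

reported = [802.09, 828.74, 855.40]  # 3+ ions from the Figure 5 caption
for n, mz in enumerate(reported):
    calc = expected_mz(reported[0], 3, n)
    within = abs(calc - mz) <= 0.02  # tolerance stated in the caption
    print(f"{n} phosphate(s): calculated {calc:.3f}, reported {mz:.2f}, within tolerance: {within}")
```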


2.2.4. Phosphorylation by PKA and PKG

Based on an extensive body of work with peptide substrates in vitro and mapping of potential physiological phosphorylation sites in vivo, PKA is well known to phosphorylate substrates with the general motif R(R/K)X(S/T), whereas the consensus for PKG is (R/K)2-3(X/K)(S/T), i.e. two to three basic residues upstream of the phosphoacceptor. In a recent large-scale phosphoproteomic dataset obtained from human activated cells (203), thousands of phosphopeptides were reported. Evaluation of short protein sequence motifs around the detected sites for agreement with the known PKA consensus motif identified hundreds of putative PKA-induced phosphorylation sites (203). Similarly, in other large phosphoproteomics datasets (202, 204) numerous putative PKA- and PKG-induced phosphorylation sites can be identified. Huang et al. (207) reported a more targeted, systematic proteomics approach for the identification of substrates of PKA and PKG in pregnant rat uteri, using recombinant PKA and PKG and stable isotope dimethyl labeling (208, 209), whereby quantification depended on MS detection. To facilitate detection, exogenous phosphatases were added to the samples to remove intrinsic phosphorylation, followed by a heating step to inactivate all enzymes. A total of 61 and 12 substrate candidates were identified in vitro for PKA and PKG, respectively. Most of these sites contained the consensus motif of the respective kinase and only a few sites overlapped, indicating good specificity. Moreover, differential phosphoproteomics analysis using stable isotope dimethyl labeling and MS was performed to detect changes in protein phosphorylation upon kinase stimulation in vivo. It is expected that in the next few years a wealth of new data on phosphoproteins and their specific sites of phosphorylation will become available, both from comprehensive phosphoproteomics analyses and from more targeted approaches that zoom in on a specific kinase isozyme or substrate.
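The kind of motif evaluation described above, checking the residues around a site against the PKA consensus R(R/K)X(S/T), can be sketched as a simple pattern search. This is an illustration only and not the procedure used in (203) or (207); the function name and the example sequence are hypothetical.

```python
import re

# PKA consensus R(R/K)X(S/T) written as a regular expression.
# The lookahead makes the match zero-width, so overlapping motifs are all found.
PKA_MOTIF = re.compile(r"(?=R[RK].([ST]))")


def find_pka_sites(sequence):
    """Return (position, residue) for each Ser/Thr in a PKA consensus context.

    Positions are 1-based and refer to the phosphoacceptor residue,
    which sits three residues after the leading Arg.
    """
    sites = []
    for match in PKA_MOTIF.finditer(sequence):
        acceptor_position = match.start() + 4  # 0-based start + 3, then 1-based
        sites.append((acceptor_position, match.group(1)))
    return sites


# Hypothetical sequence, not taken from any real protein.
example = "MARRASLKRKGSPQ"
print(find_pka_sites(example))
```

A match only indicates that a site fits the consensus; as noted in the text, such sites remain putative until confirmed experimentally.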
Such data will certainly help to better understand the specificity of PKA and PKG, and their isozymes, in vivo. However, as mentioned above, the functional annotation of this wealth of phosphoproteomics data already presents the next big challenge.

2.3. Mass Spectrometry based Structural Biology

For PKA, a wealth of information is available on its structural properties, primarily through high-resolution X-ray structures, albeit mostly of individual domains, as described in the introduction. However, to fully understand the molecular function and structural biology of PKA and PKG, one would ideally investigate the dynamic properties of the kinases in solution, which can be achieved with a variety of biophysical techniques. To complement crystallographic structures, fluorescence anisotropy (210, 211), NMR (141, 212) and, most recently, FT-IR (213) have been used to probe the dynamic behavior of PKA-R in its binding to PKA-C and AKAPs. Fluorescence anisotropy, for instance, showed that the flexible linker of PKA-RIα becomes more ordered upon binding to PKA-C, before the holoenzyme structure was solved. This was confirmed for PKA-RIIβ by endogenous tryptophan fluorescence (214). FT-IR studies suggested an overall increase in dynamics of both PKA-R and PKA-C in the holoenzyme, compared to the cAMP bound


PKA-R and active PKA-C (213). That cAMP binding stabilizes PKA-R was also supported by the observed stabilization against urea unfolding of 7.1 kcal mol⁻¹ upon cAMP binding (215). NMR studies on the N-terminal docking/dimerization domain of PKA (PKA-R 1-44) revealed the different interfaces that PKA-RI and PKA-RII present, thereby explaining the specificity for the individual subtypes of AKAPs (142, 188, 212).

2.3.1. Structural properties of PKA probed by mass spectrometry

To complement the above studies and to gain further insight into the structural changes that occur upon ligand binding and/or protein-protein interactions, hydrogen/deuterium exchange (HDX) in combination with mass spectrometry has been used to study PKA. Detailed reviews on this technique are available (216, 217). Komives and co-workers (218) investigated the detailed protein-peptide interface of PKA-C in the presence of ATP/Mg2+ and PKI(5-24), a strong PKA inhibitor peptide. Matrix-assisted laser desorption ionization (MALDI) time-of-flight (TOF) MS was employed for the analysis (217, 219, 220). With the help of the PKA-C crystal structure (180), already available at the time, the peptides originating from the ATP and PKI interaction sites could be retrieved and evaluated for their deuterium incorporation. In the presence of these molecules, the sites of interaction clearly became less accessible, which was directly indicative of their location within the protein structure. HDX mass spectrometry has also been used to address other questions in the structural biology of PKA, such as the mapping of the interaction surface between PKA-R and PKA-C. The initial report used a double truncation mutant of PKA-RIα (94-244), in the presence and absence of cAMP and PKA-C (221), and later the experiment was performed with full-length PKA-RIα (222) and PKA-RIIβ (223, 224).
These analyses validated many important interaction residues on both PKA-R and PKA-C involved in deactivation of PKA, a few of which were already known from extensive, labor-intensive, targeted mutagenesis studies. In addition, the differences indicated that cAMP induces a long-distance conformational change within the protein structure that abolishes PKA-C binding (221). All these observations were later confirmed by the first holoenzyme crystal structure of PKA (225). A recent, particularly interesting HDX-MS study focused on the differences between PKA-RI and PKA-RII when interacting with an α-helical A-kinase binding (AKB) motif from the dual-specificity D-AKAP2. It was observed that D-AKAP2 uses two distinct binding modes towards the different PKA-R isozymes (226), as schematically depicted in Figure 6. Moreover, HDX-MS was used, in combination with other biophysical methods, to evaluate the kinase-dead mutant of PKA-C, in which Lys72 was replaced with a histidine (227). This mutant was further studied in its unphosphorylated form and in its form phosphorylated at Thr197. Interestingly, the kinase-dead PKA-C could still bind ATP. The HDX-MS results indicated that the small lobe was much more exposed in the mutant, making the kinase less stable. This effect was diminished by the phosphorylation of Thr197, a striking demonstration of the long-distance effect of phosphorylation on the organization of the active site in PKA-C (227).
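The comparison underlying such HDX experiments, deuterium incorporation of a peptide with and without a binding partner, can be illustrated with a short calculation. This is a minimal sketch under stated assumptions and not the data treatment of the cited studies: the function names and all masses are hypothetical, uptake is taken as the shift in centroid mass, and no back-exchange correction is applied.

```python
def deuterium_uptake(labeled_centroid, unlabeled_centroid):
    """Deuterium incorporated (Da), taken as the centroid mass shift of a peptide."""
    return labeled_centroid - unlabeled_centroid


def percent_protection(uptake_free, uptake_bound):
    """Percent reduction in deuterium uptake when the binding partner is present."""
    if uptake_free <= 0:
        raise ValueError("uptake of the free protein must be positive")
    return 100.0 * (uptake_free - uptake_bound) / uptake_free


# Hypothetical centroid masses (Da) for one peptide.
unlabeled = 1500.0
free_state = 1506.0   # after labeling, protein alone
bound_state = 1503.0  # after labeling, protein with ligand or binding partner

free_uptake = deuterium_uptake(free_state, unlabeled)
bound_uptake = deuterium_uptake(bound_state, unlabeled)
print(free_uptake, bound_uptake, percent_protection(free_uptake, bound_uptake))
```

Peptides with a large protection value on binding are candidates for the interaction surface, although reduced exchange can also reflect a conformational change away from the interface.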


Figure 6. Difference in measured HDX exchange mapped onto the structures of RI and RII D/D domains when interacting with a peptide mimicking the interaction surface of the dual specificity AKAP2. A) Side view and B) View looking down onto the AKAP binding surface. The ligand-induced protected regions are highlighted in red (>50%), orange (25–50%), yellow (