A Survey of Chloroplast Protein Kinases and Phosphatases in Arabidopsis thaliana

Protein phosphorylation is a major mode of regulation of metabolism, gene expression and cell architecture. In chloroplasts, reversible phosphorylation of proteins is known to regulate a number of prominent processes, for instance photosynthesis, gene expression and starch metabolism. The complements of the chloroplast protein kinases (cpPKs) and phosphatases (cpPPs) involved are largely unknown, except for 6 proteins (4 cpPKs and 2 cpPPs) that have been experimentally identified so far. We employed combinations of programs predicting N-terminal chloroplast transit peptides (cTPs) to identify 45 tentative cpPKs and 21 tentative cpPPs. However, test sets of 9 tentative cpPKs and 13 tentative cpPPs contained only 2 and 7 genuine cpPKs and cpPPs, respectively, based on experimental subcellular localization of their N-termini fused to the reporter protein RFP. Taken together, the set of enzymes known to be involved in the reversible phosphorylation of chloroplast proteins in A. thaliana now comprises 6 cpPKs and 9 cpPPs, whose functions remain to be determined by functional genomics approaches. This set includes the calcium-regulated PK CIPK13, which we found to be located in the chloroplast, indicating that calcium-dependent signal transduction pathways also operate in this organelle.


INTRODUCTION
Phosphorylation of amino acid side chains can modulate the conformation, activity, localization and stability of proteins, and around one-third of all eukaryotic proteins are thought to be reversibly phosphorylated [1]. Therefore, protein phosphorylation plays a central role in regulating cellular functions, particularly in signal transduction pathways (reviewed in: [2,3]). The reversible phosphorylation of proteins involves two types of enzymes: protein kinases (PKs) and phosphatases (PPs). In signaling pathways, PKs can be arranged in cascades, allowing amplification, feedback, cross-talk and branching in the transduction of the signal [2,4,5].
Various types of PKs and PPs exist in plants, classified according to the presence of domains in addition to the catalytic domain mediating (de)phosphorylation (e.g. receptor kinases), their regulation (e.g. Ca2+/calmodulin-dependent kinases) or their amino acid substrate (e.g. serine/threonine, tyrosine or histidine kinases) [4,6-15]. The complete genome sequence of the model plant Arabidopsis thaliana made it possible to survey its total complement of PKs and PPs comprehensively, resulting in more than 800 PKs [16] and 112 PPs [9].
The chloroplast, the characteristic organelle of green algae and plants, is thought to contain around 3000 different proteins in the model plant species A. thaliana [20,21]. Among those, a number are reversibly phosphorylated, including proteins involved in the photosynthetic light reaction [22,23], starch metabolism [24] and transcription [25,26]. However, only a few chloroplast PKs (cpPKs) and PPs (cpPPs) have been experimentally identified in A. thaliana and other flowering plants so far. Here we summarize current knowledge on Arabidopsis cpPKs and cpPPs and present genomic approaches for the systematic identification and characterization of the chloroplast complement of PKs and PPs.

EXPERIMENTALLY CHARACTERIZED cpPKs AND cpPPs
The reversible phosphorylation of thylakoid proteins has been associated with the regulation of the migration between the photosystems of LHCII, the light-harvesting complex of photosystem II (PSII) [27,28], as well as with the turnover of PSII proteins [29,30]. Based on sequence homology to the cpPK Stt7, isolated from the green alga Chlamydomonas reinhardtii, the two A. thaliana cpPKs STN7 and STN8 have been identified [31] and their thylakoid localization has been experimentally shown [32,33] (Table 1). The analysis of loss-of-function mutants showed that STN7 is required for LHCII phosphorylation and state transitions [32,33], whereas STN8 is necessary for the reversible phosphorylation of the PSII proteins D1, D2, CP43, PSII-H and CP29 [33,34]. Further biochemical analyses will be required to clarify whether STN7 and STN8 directly phosphorylate photosynthetic proteins, or whether they operate in phosphorylation cascades involving additional thylakoid PKs. Additional members of such phosphorylation cascades might be represented by the so-called thylakoid-associated PKs TAK1, 2 and 3 [35,36] (Table 1). However, in contrast to STN7 and STN8, these proteins lack a chloroplast transit peptide (cTP).
The chloroplast subunit of casein kinase 2, cpCK2, was originally identified in mustard (Sinapis alba L.). cpCK2 phosphorylates components of the plastid transcription apparatus in vitro [37], and the corresponding orthologue has also been detected in chloroplasts of A. thaliana [38] (Table 1). Three further cpPKs have been tentatively identified in two proteomic studies [39,40], but their localization has not yet been confirmed by independent approaches (Table 1).
Only two cpPPs have been identified so far. The dual-specificity protein phosphatase DSP4/SEX4 can bind to starch granules and is involved in the regulation of starch metabolism [41-43]. AtRP1 exhibits bifunctional PK/PP properties and is capable of (de)phosphorylating the regulatory threonine residue of Arabidopsis pyruvate, orthophosphate dikinase (PPDK) [44].

GENOME-WIDE PREDICTION OF cpPKs AND cpPPs
The vast majority of chloroplast proteins are targeted to the organelle by an N-terminal signal sequence, the chloroplast transit peptide (cTP), and imported via the Tic/Toc translocon [45]. Various algorithms have been developed for the prediction of cTPs, and the accuracy of prediction can be further improved by combining several predictors. The specificity of combinatorial cTP prediction increases with the number of predictors required to agree; as expected, this gain in specificity occurs at the expense of sensitivity [21]. We employed nine different algorithms for cTP prediction, and all the cTP-containing cpPKs and cpPPs that had been unambiguously identified experimentally, namely STN7, STN8, cpCK2, AtRP1 and DSP4/SEX4, were correctly predicted by at least 6 of the 9 predictors (Table 1). When the "6 of the 9" predictor combination is applied to the entire complement of 970 PKs and 217 PPs encoded in the nuclear genome of A. thaliana, 45 PKs and 21 PPs are predicted to contain a cTP (Table 2). Remarkably, among the cpPKs predicted by the 6/9 approach, transmembrane (TM)-receptor PKs and related PKs are relatively under-represented, whereas non-TM PKs are clearly over-represented (Table 2). Among the predicted cpPPs, serine/threonine phosphatases are under-represented but PP2C-type PPs predominate (Table 2).
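The "n of 9" consensus rule described above can be illustrated with a short sketch. The predictor names are those used in this survey; the function names and the three toy proteins with their calls are hypothetical and serve only to show how raising n trades sensitivity for specificity. This is not the pipeline actually used for the survey.

```python
from typing import Dict, List

# The nine cTP predictors used in this survey.
PREDICTORS: List[str] = [
    "Predotar", "TargetP", "Protein Prowler", "AAIndexLOC", "PredSL",
    "SLP-Local", "WoLF PSORT", "MultiLOC", "PCLR",
]


def passes_consensus(calls: Dict[str, bool], n: int = 6) -> bool:
    """Return True if at least n of the nine predictors call a cTP."""
    votes = sum(1 for name in PREDICTORS if calls.get(name, False))
    return votes >= n


def count_predicted(proteins: Dict[str, Dict[str, bool]], n: int) -> int:
    """Count the proteins that pass the 'n of 9' threshold."""
    return sum(1 for calls in proteins.values() if passes_consensus(calls, n))


# Hypothetical calls (invented for illustration, not real predictor output).
toy_proteins: Dict[str, Dict[str, bool]] = {
    "protein_A": {name: True for name in PREDICTORS},                  # 9 votes
    "protein_B": {name: i < 6 for i, name in enumerate(PREDICTORS)},   # 6 votes
    "protein_C": {name: i < 3 for i, name in enumerate(PREDICTORS)},   # 3 votes
}

# Stricter thresholds admit fewer candidates.
for n in range(1, 10):
    print(f"{n} of 9: {count_predicted(toy_proteins, n)} candidate(s)")
```

With n = 6, the toy proteins A and B pass and C is rejected; lowering n admits more candidates (and more false positives), whereas raising it discards genuine but weakly predicted cTPs.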

EXPERIMENTAL VALIDATION OF A REPRESENTATIVE SET OF PREDICTED cpPKs AND cpPPs
How many of the tentative cpPKs and cpPPs found by the 6/9 approach are truly located in the chloroplast? To answer this question, the N-terminal regions of the proteins containing the tentative cTPs were fused to the red fluorescent protein (RFP) [46] and the fusions were transfected into Arabidopsis protoplasts. Comparing the position of the signals from the RFP fusions with that of chlorophyll autofluorescence made it possible to validate the chloroplast location of the tentative cpPPs and cpPKs (Fig. 1). Of the 9 predicted cpPKs tested, only two proteins actually exhibited a chloroplast location, namely At1g51170 and At2g34180, both belonging to class 4 of non-TM PKs (Table 3). Interestingly, At2g34180 is annotated as "CBL (calcineurin B-like calcium sensor) protein interacting protein kinase 13" (CIPK13) and contains a NAF domain thought to mediate the interaction with CBL proteins (Table 4). This supports the view that calcium-dependent signal transduction pathways also operate in chloroplasts, in line with the recent finding of calcium regulation of chloroplast protein translocation [47,48].
Table notes: Predictions of chloroplast targeting were performed with the following algorithms: Predotar [49], TargetP [50], Protein Prowler [51], AAIndexLOC [52], PredSL [53], SLP-Local [54], WoLF PSORT [55], MultiLOC [56] and PCLR [57]. Several combinations were tried, in which a PK was considered to be chloroplast located when predicted by at least n ('n of 9') of the nine predictors, with n ranging from 1 to 9. (a) According to the PlantsP Kinase and Phosphatase database (http://plantsp.genomics.purdue.edu/html/families.html); (b) not classified yet; (c) TAK1 does not possess a cleavable cTP; (d) chloroplast location was not reproducible (Bonardi, Pesaresi, Becker, Schleiff, Leister, unpublished data).
In contrast to cpPKs, a large fraction of the tentative cpPPs are truly located in the chloroplast: for 7 of the 13 tentative cpPPs, a chloroplast location could be confirmed (Table 3). Taken together, identification of cpPKs by cTP prediction is very error-prone, whereas cpPPs could be identified in a relatively reliable way. One can only speculate why the cTP prediction produced so many false-positive results for PKs. A possible explanation is that the kinase domain of the proteins interferes with cTP prediction; in at least one case (At4g36950), the falsely predicted cTP overlapped with the kinase domain.
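The confirmation rates reported above can be worked through explicitly. The sketch below computes the fraction of tested candidates that were confirmed (2 of 9 cpPKs, 7 of 13 cpPPs). As a purely illustrative step, it then scales the 45 and 21 predicted candidates by those fractions; this extrapolation assumes that the small test sets are representative of all predictions, which the data do not establish. Function and variable names are hypothetical.

```python
from fractions import Fraction


def confirmation_rate(confirmed: int, tested: int) -> Fraction:
    """Fraction of tested candidates whose chloroplast location was confirmed."""
    return Fraction(confirmed, tested)


# Counts taken from the validation experiment described in the text.
pk_rate = confirmation_rate(2, 9)    # tentative cpPKs
pp_rate = confirmation_rate(7, 13)   # tentative cpPPs

print(f"cpPK confirmation rate: {float(pk_rate):.0%}")
print(f"cpPP confirmation rate: {float(pp_rate):.0%}")

# ASSUMPTION (illustrative only): the test sets are representative, so the
# same rates apply to all 45 predicted cpPKs and 21 predicted cpPPs.
pk_expected = pk_rate * 45
pp_expected = pp_rate * 21

print(f"Rough estimate of genuine cpPKs among 45 predictions: {float(pk_expected):.1f}")
print(f"Rough estimate of genuine cpPPs among 21 predictions: {float(pp_expected):.1f}")
```

Under that assumption, roughly 10 of the 45 predicted cpPKs and about 11 of the 21 predicted cpPPs would be genuine; with test sets this small, both figures carry wide uncertainty.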

CONCLUDING REMARKS
Around 10% of the nuclear genes in Arabidopsis are estimated to encode chloroplast proteins [20]. If this extrapolation is extended to major protein families such as protein kinases and phosphatases, more than 100 cpPKs and 20 cpPPs should operate in the photosynthetic organelle. Although our genomic approach more than doubled the set of known enzymes involved in reversible chloroplast protein phosphorylation, the current number of 6 confirmed cpPKs and 9 cpPPs is still much smaller than the one based on the genome-wide extrapolation described above and also much smaller than expected based on the cTP predictions of PKs and PPs (see Table 2).
Although the chloroplast, as an endosymbiotic organelle derived from a prokaryotic ancestor, provided the eukaryotic cell with the two-component receptor family (which is related to the bacterial two-component histidine kinase receptors), it is tempting to speculate that the photosynthetic organelle might not, in return, have adopted the eukaryotic concept of reversible protein phosphorylation at serine and threonine residues to a comparable extent. This could mean that the actual number of PKs and PPs in the chloroplast is markedly lower than expected. Nevertheless, only approaches that combine bioinformatic prediction with experimental validation of the subcellular location of proteins, as outlined in this work, will contribute systematically to identifying the entire complement of chloroplast protein kinases and phosphatases.