Context Surrounding Processing Sites Is Crucial in Determining Cleavage Rate of a Subset of Processing Sites in HIV-1 Gag and Gag-Pro-Pol Polyprotein Precursors by Viral Protease*

Background: Processing of the HIV-1 Gag/Gag-Pro-Pol polyproteins by the viral protease is an essential step in producing infectious virus particles.
Results: Processing of the CA/SP1, SP1/NC, and SP2/p6 sites by the HIV-1 protease is affected by context.
Conclusion: Complex substrate interactions beyond the active site of the enzyme define the HIV-1 processing rate.
Significance: This study helps to understand the role of context outside cleavage sites in HIV-1 processing.
Processing of the human immunodeficiency virus type 1 (HIV-1) Gag and Gag-Pro-Pol polyproteins by the HIV-1 protease (PR) is essential for the production of infectious particles. However, the determinants governing the rates of processing of these substrates are not clearly understood. We studied the effect of substrate context on processing by utilizing a novel protease assay in which a substrate containing HIV-1 matrix (MA) and the N-terminal domain of capsid (CA) is labeled with a FlAsH (fluorescein arsenical hairpin) reagent. When the seven cleavage sites within the Gag and Gag-Pro-Pol polyproteins were placed at the MA/CA site, the rates of cleavage changed dramatically compared with those of the cognate sites in the natural context reported previously. The rate of processing was affected the most for three sites: CA/spacer peptide 1 (SP1) (≈10-fold increase), SP1/nucleocapsid (NC) (≈10–30-fold decrease), and SP2/p6 (≈30-fold decrease). One of two multidrug-resistant (MDR) PR variants altered the pattern of processing rates significantly. Cleavage sites within the Pro-Pol region were cleaved in a context-independent manner, suggesting that for these sites the sequence itself was the determinant of rate. In addition, a chimera consisting of SP1/NC P4–P1 and MA/CA P1′–P4′ residues (ATIM↓PIVQ) abolished processing by wild type and MDR proteases, and the reciprocal chimera consisting of MA/CA P4–P1 and SP1/NC P1′–P4′ residues (SQNY↓IQKG) was cleaved only by one of the MDR proteases. These results suggest that complex substrate interactions both beyond the active site of the enzyme and across the scissile bond contribute to defining the rate of processing by the HIV-1 PR.

atively regulated by the presence of the SP1 domain after release from the NC domain, which may create an unfavorable environment for cleavage by the HIV-1 protease.
The viral protease is a homodimer with the active site at the dimer interface (11-13), whereas the cleavage sites in the Gag and Gag-Pro-Pol polyproteins are structurally asymmetric, sharing little amino acid sequence homology (14). Crystallographic studies using peptides corresponding to the cleavage sites within the Gag and Gag-Pro-Pol polyproteins have revealed that the HIV-1 protease achieves its substrate specificity in part by recognizing a conserved shape rather than a particular amino acid sequence (15). In these studies, all of the peptide substrates were shown to adopt an asymmetric extended β-strand conformation when bound in the active site of the enzyme, creating a consensus volume termed the "substrate envelope." Recently, Ozen et al. (16) demonstrated that how well a particular substrate fits within the substrate envelope is influenced by both substrate dynamics and size. In that study, the volume of the CA/SP1, NC/SP2, and SP2/p6 substrates protruded beyond the substrate envelope more than expected based on their size. These substrates were highly dynamic, resulting in large deviations from the crystal structure and a worse fit within the substrate envelope. Consistent with their slow rate of cleavage, the CA/SP1 and NC/SP2 substrates scored as the most dynamic among the substrates analyzed (16).
The order of cleavage obtained from the kinetic studies using peptide substrates corresponding to the cleavage sites in Gag and Gag-Pro-Pol differs from the order of cleavage in the context of the full-length Gag polyprotein (3, 9, 17-22). Furthermore, the peptide substrates used in different studies did not reproduce the same order of cleavage (19, 21, 22), suggesting that additional determinants beyond amino acid sequence and local secondary structure of the cleavage sites are involved in Gag and Gag-Pro-Pol processing. One clear example is that in the context of full-length Gag polyprotein, the cleavage of the CA/SP1 site is negatively affected by the initial cleavage at the SP1/NC site (3). There are also reported differences in both the order and the cleavage rate of peptide substrates when different lengths of peptides were used, implying that amino acids outside the active site groove may play a role in Gag and Gag-Pro-Pol cleavage. One example of the importance of the context surrounding cleavage sites in HIV-1 Gag processing was demonstrated in the study by Tritch et al. (5) using a protein substrate derived from the Gag polyprotein. When the P4-P4′ residues of the MA/CA site were replaced with the P4-P4′ residues of the SP1/NC site, the rate of cleavage was enhanced compared with that of the MA/CA site; however, the extent of enhancement was shown to be far less than that expected compared with cleavage of this site in its normal setting, suggesting that the context outside the local cleavage site is also involved in the efficient cleavage of the SP1/NC site. It has also been reported that processing site context, specifically sequences flanking the C terminus of the protease, can affect enzyme activity (23).
Although the context outside of the cleavage site has been suggested to play a role as a determinant for the substrate specificity of the HIV-1 protease, contextual influences on HIV-1 processing have not been studied extensively using folded protein substrates. Therefore, to examine the effect of context on HIV-1 processing, we conducted a comparative analysis of the Gag and Gag-Pro-Pol cleavage sites in a heterologous environment using a sensitive HIV-1 protease assay. In this assay, the MA/CA fusion protein modified with several alterations was used as a substrate. A tetracysteine motif (CCGPCC) was introduced within the N-terminal domain of CA to allow binding of the FlAsH (fluorescein arsenical hairpin) reagent (24-26). In addition, to differentiate the size of the cleavage products from two pooled substrates, a GST tag was fused to the N terminus of MA, and the C-terminal domain of CA was truncated. By utilizing this substrate labeled with FlAsH reagent, we were able to measure specific proteolysis in gel-based assays as a function of HIV-1 protease cleavage. When the residues P4-P4′ of the cleavage sites within Gag and Gag-Pro-Pol were placed at the MA/CA site, we observed changes in the order of the cleavage rate, indicating that the context outside the P4-P4′ residues plays a crucial role for efficient Gag and Gag-Pro-Pol cleavage. Changes in the order of the cleavage rates were also seen with a multidrug-resistant (MDR) protease. Combining the P4-P1 residues from the SP1/NC site and the P1′-P4′ residues from the MA/CA site resulted in a chimeric substrate that was not significantly cleaved by any of the proteases tested in this study, whereas the reciprocal construct was selectively cleaved by one of the MDR proteases but not by the wild type protease. 
These results indicate that more complex interactions than just those of the substrate amino acids within the active site of the protease are required for efficient Gag and Gag-Pro-Pol cleavage by the HIV-1 protease and that this new series of folded protein substrates offers a tool to examine protease specificity.

EXPERIMENTAL PROCEDURES
Constructs-The primers to construct plasmids used in this study were designed so that a His tag was introduced at the N terminus of each protein. The plasmid pARKz1k1-5LTRgag, containing a fragment of the 5′ LTR and the gag region of pNL-CH, an infectious molecular clone derived from the pNL4-3 clone of HIV-1 (27), was used as a template to amplify the full-length MA/CA coding region by PCR. The PCR product was digested with NdeI, the site for which was introduced in the PCR primers, and cloned into the NdeI site of pET30b (Novagen, Madison, WI) to generate the pMA/CA precursor plasmid. A tetracysteine motif (CCGPCC) was introduced within the N-terminal domain of CA (His-87 to Ala-92) by site-directed mutagenesis following the QuikChange method (Stratagene, La Jolla, CA) to create the final pMA/CA plasmid (see Fig. 1A). For pMA/CAΔ, two overlapping PCR fragments, one containing the coding region for a GST tag amplified from pET41b (Novagen) and the other containing the full-length MA/CA coding region amplified from pMA/CA, were used in an overlapping extension PCR. The resulting PCR product was cloned into the NdeI site of pET30b, and then the C-terminal domain of CA was truncated by substituting Ser-278 with a stop codon. The pMA/CAΔ-noTC was created in the same way as pMA/CAΔ except that the full-length MA/CA coding region was amplified from pARKz1k1-5LTRgag to exclude the tetracysteine motif from the CA region. To generate pMA/CAΔ-Y132I, a Y132I mutation was introduced at the P1 position of the MA/CA cleavage site by site-directed mutagenesis using pMA/CAΔ as the template. For the constructs shown in Fig. 1B, eight codons representing the P4-P4′ positions of the MA/CA cleavage site in pMA/CAΔ were replaced with the equivalent codons from the CA/SP1, SP1/NC, NC/SP2, SP2/p6, TF/PR, PR/RT, RT/IN, or chimeric cleavage sites using overlapping extension PCR. 
The SP1/NC P4-1 was generated by overlapping extension PCR so that the P4-P1 residues are derived from the SP1/NC cleavage site and the P1′-P4′ residues are derived from the MA/CA cleavage site. The SP1/NC P1′-4′ was generated in the same way except that the P4-P1 residues are derived from the MA/CA cleavage site and the P1′-P4′ residues are derived from the SP1/NC cleavage site. The structure of each of the constructs was confirmed by DNA sequence analysis.
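The composition of the two chimeric sites can be sketched in a few lines. This is only an illustration of the naming scheme, not part of the cloning procedure: the half-site sequences are inferred from the chimeras quoted in the abstract (ATIM↓PIVQ and SQNY↓IQKG), and the names `MA_CA`, `SP1_NC`, `chimera`, and `show` are hypothetical.

```python
# Illustrative sketch only. Each site is stored as (P4-P1, P1'-P4') halves,
# inferred from the chimeric sequences given in the abstract.
MA_CA = ("SQNY", "PIVQ")
SP1_NC = ("ATIM", "IQKG")


def chimera(upstream_site, downstream_site):
    """Join the P4-P1 half of one site to the P1'-P4' half of another."""
    return upstream_site[0] + downstream_site[1]


def show(octapeptide, marker="↓"):
    """Mark the scissile bond between P1 and P1'."""
    return octapeptide[:4] + marker + octapeptide[4:]


# SP1/NC P4-1: P4-P1 from SP1/NC, P1'-P4' from MA/CA
sp1nc_p4_1 = chimera(SP1_NC, MA_CA)
# SP1/NC P1'-4': P4-P1 from MA/CA, P1'-P4' from SP1/NC
sp1nc_p1p_4p = chimera(MA_CA, SP1_NC)

print(show(sp1nc_p4_1))
print(show(sp1nc_p1p_4p))
```

Joining a site with itself returns the parental octapeptide, which gives a quick check that the halves were entered correctly.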
Expression and Purification of HIV-1 PR Substrates-Expression of recombinant proteins in Escherichia coli BL21 (DE3) was carried out by a modified version of established protocols (28, 29). Briefly, recombinant proteins were expressed in MagicMedia (Invitrogen) for 7 h, and the cells were collected by centrifugation. The cell pellet was lysed in Tris-buffered saline (25 mM Tris base, 3 mM KCl, and 140 mM NaCl) at pH 7.5 with 1 mM DTT and 1% Triton X-100 and then sonicated. Following clarification by centrifugation, the recombinant proteins containing a His6 tag at the N termini were purified from the soluble fraction by affinity chromatography using a nickel-chelating column (Novagen). The purified recombinant proteins were dialyzed against 20 mM sodium acetate (pH 7.0), 1 mM EDTA, 2 mM DTT, and 10% glycerol (30). Protein concentrations were determined by using the Bradford assay (Bio-Rad, Hercules, CA), and the purity of the purified proteins was analyzed by protein staining using SimplyBlue SafeStain (Invitrogen) after SDS-PAGE (31).
Protease Expression and Purification-The HIV-1 protease was expressed in E. coli TAP 106 cells and purified from inclusion bodies as previously described (32). Briefly, the inclusion body centrifugation pellet was dissolved in 50% acetic acid followed by another round of centrifugation to remove impurities. Size exclusion chromatography was used to separate high molecular weight proteins from the protease. The protein was refolded in 50 mM sodium acetate at pH 5.5, 5% ethylene glycol, 10% glycol, and 5 mM DTT.
FlAsH Labeling Reactions-The FlAsH reagent (Invitrogen) was incubated overnight (~15 h) with the MA/CA protein substrate in a 1:1 molar ratio at room temperature (33) in proteolysis buffer (50 mM sodium acetate, pH 7.0, 150 mM NaCl, 1 mM EDTA, 2 mM 2-mercaptoethanol, and 10% glycerol). A total concentration of 6 μM of the protein substrates was used for FlAsH labeling in a total volume of 228 μl.
Gel-based Protease Assay by HIV-1 PR-To compare the processing efficiency of the heterologous substrates (cleavage sites introduced within the MA/CA site) with that of the MA/CA substrate, 3 μM of each of two substrates (a heterologous substrate plus the MA/CA substrate) was labeled with 6 μM FlAsH reagent in a tube with a total volume of 228 μl. An aliquot of 24 μl was taken for the 0-min time point before the addition of 0.1 μM HIV-1 PR to the substrate. Proteolysis was performed at 30°C; aliquots of 24 μl were taken at various time points, and the reaction was stopped by adding 4× SDS-PAGE loading buffer. After the last time point, the protein in the aliquots was resolved by SDS-PAGE without a boiling step. The concentration of the reducing reagent, 2-mercaptoethanol, in the SDS-PAGE loading buffer was kept at 2 mM to keep the FlAsH reagent coupled to the substrate. The gels were briefly rinsed with water, and the fluorescently labeled protein bands were visualized by fluorescence imaging using a Typhoon 9400 (GE Healthcare/Amersham Biosciences) with excitation at 488 nm and emission at 526 nm. The relative quantitation of the visualized protein bands was performed by using the image analysis software ImageQuant TL (GE Healthcare). To calculate the relative rate of cleavage of an individual site, the percentage of cleavage product within the linear range was plotted as a function of time. Briefly, the time points representing 20% cleavage or less were used for linear regression, and the slope for the individual cleavage site was then used to determine its cleavage rate relative to the MA/CA site. Unless specified, all of the labeling and proteolysis reactions were performed at pH 7.0 because of the dependence of the FlAsH reagent on pH. To test the binding specificity of the FlAsH reagent for a tetracysteine motif, the labeling reaction of the 3 μM substrate (MA/CAΔ or MA/CAΔ-noTC) was performed in the absence or presence of 3 μM FlAsH reagent at pH 7.0. 
Proteolysis was performed in the absence or presence of 2 μM HIV-1 PR at 30°C for 3 h, and the resulting protein products were subjected to SDS-PAGE. The fluorescent protein bands were visualized with the Typhoon 9400, and the same gel was then stained with SimplyBlue SafeStain for Coomassie staining. To test the effect of pH on HIV-1 PR activity, 3 μM substrate (MA/CAΔ) was used for the labeling reaction at pH 5.5, 6.0, 6.5, or 7.0 following the protocol described above.
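The relative-rate calculation described above (linear regression over the time points with 20% cleavage or less, then a ratio of slopes against the MA/CA internal control) can be sketched as follows. This is a minimal illustration, assuming an ordinary least-squares fit with a free intercept; the function names and the time-course values are hypothetical and are not data from this study.

```python
def initial_slope(times_min, percent_cleaved, max_percent=20.0):
    """Least-squares slope (% cleaved per min) using only points <= max_percent."""
    pts = [(t, p) for t, p in zip(times_min, percent_cleaved) if p <= max_percent]
    if len(pts) < 2:
        raise ValueError("need at least two time points in the linear range")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_p = sum(p for _, p in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in pts)
    return sxy / sxx


def relative_rate(times_min, test_percent, reference_percent):
    """Slope of the test site divided by the slope of the reference (MA/CA) site."""
    return initial_slope(times_min, test_percent) / initial_slope(
        times_min, reference_percent
    )


# Hypothetical time course (minutes, % product) for two substrates in one tube.
times = [0, 5, 10, 20, 40]
reference_site = [0, 4, 8, 16, 30]   # the 40-min point exceeds 20% and is dropped
test_site = [0, 1, 2, 4, 8]

print(relative_rate(times, test_site, reference_site))
```

Because both substrates are read from the same lanes, any lane-to-lane variation in enzyme activity or loading affects the two slopes equally and cancels in the ratio.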
Estimation of van der Waals Potential and Hydrogen Bonding-For the structural analyses, crystal structures were used when available. When an experimentally determined structure was not available, the structure of the substrate complex that was most similar in sequence to the target substrate was used as the template. To this template, the mutations were introduced in silico using Maestro from Schrödinger Suite 2011 (Schrödinger, Portland, OR). All of the structures (experimental and modeled) were prepared for the structural analyses using the Protein Preparation Wizard of Maestro. In the preparation step, hydrogen atoms were added, and the missing side chains were filled in and refined. The hydrogen bonding network was optimized at neutral pH using the exhaustive sampling option. Orientations of the crystallographic waters were also sampled for the optimal hydrogen bonding network. Energy minimization was performed on the coordinates of the hydrogen atoms only, followed by a second restrained minimization on the whole system until the root mean square deviation converged to 0.3 Å for crystal structures and 0.5 Å for the modeled structures. After preparation, each structure was analyzed for the protease-substrate hydrogen bonds and van der Waals contacts. The total number of hydrogen bonds was counted by using the default geometric criteria in Maestro (2.5 Å maximum distance, 120° minimum donor angle, and 90° minimum acceptor angle). Protease-substrate van der Waals interactions were calculated by a simplified Lennard-Jones potential function as described in detail elsewhere (16).
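The geometric hydrogen bond criteria quoted above can be illustrated with a small stand-alone check. This is a sketch, not Maestro's implementation: it assumes that the distance is measured between the hydrogen and the acceptor, that the donor angle is D-H···A (vertex at the hydrogen), and that the acceptor angle is H···A-B (vertex at the acceptor, where B is an atom bonded to it). The function names and coordinates are hypothetical.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b as the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cos_angle))


def is_hydrogen_bond(donor, hydrogen, acceptor, acceptor_neighbor,
                     max_dist=2.5, min_donor_angle=120.0,
                     min_acceptor_angle=90.0):
    """Apply the three geometric cutoffs (assumed definitions, see lead-in)."""
    if math.dist(hydrogen, acceptor) > max_dist:
        return False
    if angle_deg(donor, hydrogen, acceptor) < min_donor_angle:
        return False
    return angle_deg(hydrogen, acceptor, acceptor_neighbor) >= min_acceptor_angle


# Hypothetical coordinates in angstroms.
donor = (0.0, 0.0, 0.0)
hydrogen = (1.0, 0.0, 0.0)
neighbor = (4.0, 1.0, 0.0)

print(is_hydrogen_bond(donor, hydrogen, (3.0, 0.0, 0.0), neighbor))  # H...A = 2.0 A
print(is_hydrogen_bond(donor, hydrogen, (4.0, 0.0, 0.0), neighbor))  # H...A = 3.0 A
```

Counting protease-substrate hydrogen bonds then amounts to applying such a test to every donor-acceptor pair that spans the two molecules.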

Design of Substrates for Gel-based HIV-1 Protease Assay Employing FlAsH Reagent-To study the effect of context/environment on the processing of the HIV-1 Gag and Gag-Pro-Pol cleavage sites by the HIV-1 PR, we developed a protease assay in which a Gag protein substrate is labeled with a FlAsH reagent, a nonfluorescent biarsenical derivative of fluorescein that becomes fluorescent upon binding to its target (26). In this assay, a folded MA/CA protein containing an intact MA/CA cleavage site was used as a prototype substrate with the following modifications. A tetracysteine motif (CCGPCC) was introduced into a surface loop region within the N-terminal domain of CA (amino acids 87-92), near the position previously used to label intact virus (34), to allow for FlAsH reagent binding, resulting in a substrate designated as MA/CA (Fig. 1A). The substrate MA/CAΔ (GST-MA-CAΔCTD) was made by fusing a GST tag to the N terminus of MA to make the substrate larger and truncating the C-terminal domain (CTD) of CA to decrease the size of the cleaved labeled product (Fig. 1A). The MA/CAΔ-noTC is the same substrate as the MA/CAΔ except that it lacks the tetracysteine motif needed for FlAsH reagent binding. To introduce a heterologous cleavage site into the context of the MA/CA cleavage site, the MA/CA cleavage site within MA/CAΔ was replaced individually with each of the seven cleavage sites from the Gag and Gag-Pro-Pol polyproteins, generating substrates with the cleavage sites CA/SP1, SP1/NC, NC/SP2, SP2/p6, TF/PR, PR/RT, and RT/IN, as shown in Fig. 1B. Because of the differences in size of the two substrates, MA/CA and MA/CAΔ, and the labeled cleavage products, CA and CAΔ, this strategy enables us to perform proteolysis reactions in the presence of the MA/CA substrate as an internal control and to resolve the cleavage products by SDS-PAGE.
Efficient Processing of MA/CAΔ by HIV-1 PR and Specific Binding of FlAsH Reagent to Tetracysteine Motif-Because the MA/CA fusion protein was modified for the assay, we first compared the efficiency of cleavage of the altered substrate MA/CAΔ (≈56 kDa) with that of the MA/CA protein (≈40 kDa) in the gel-based assay, placing equal molar amounts of each substrate into the reaction. Because the two substrates and their cleavage products differ in size and hence can be resolved by SDS-PAGE, we were able to mix the two substrates and carry out the protease assay together. Fig. 2A shows the cleavage products over a 3-h time period visualized by fluorescence imaging. The N-terminal domain of CA harbors the tetracysteine motif where the FlAsH reagent binds; thus only the proteins containing the N-terminal domain of CA are detected by fluorescence imaging. Both substrates were processed by the HIV-1 protease at a similar rate over time as assessed by the band intensity of the cleaved CA products, the full-length CA released from MA/CA, and the CAΔ (≈15 kDa) released from MA/CAΔ (Fig. 2, A and B). Thus, the alterations made to the MA/CA protein did not affect its cleavage efficiency by the protease, nor did they result in significant artifactual degradation of the protein or cleavage at fortuitous sites. We also tested cleavage efficiency of the SP2/p6 site (a site that is cleaved slowly, see below) in both substrates, MA/CA and MA/CAΔ, and we observed very similar cleavage rates with each substrate (data not shown), confirming that the alterations we made to MA/CA did not affect cleavage at the MA/CA site. The faint band running slightly slower than CA protein, noted with an arrow in Fig. 2A, seems to be an unknown contaminant specifically of the MA/CAΔ prep that is labeled with the FlAsH reagent; its nature is unclear because it is present in the uncleaved prep, and the band intensity does not change in the presence of protease.
Studies done in vivo and in vitro have previously demonstrated that the FlAsH reagent binds a tetracysteine motif with high affinity and specificity (26), and proteins labeled with the FlAsH reagent have been analyzed by SDS-PAGE (24). However, in this previous report the detection of such protein bands in a gel was not optimal, which may have been due to the use of crude E. coli lysates for FlAsH labeling. To test the specificity of the FlAsH reagent binding to the tetracysteine motif, two substrates, one harboring a tetracysteine motif (MA/CAΔ) and the other lacking a tetracysteine motif (MA/CAΔ-noTC), were visualized by both Coomassie staining and fluorescence imaging in the absence or presence of FlAsH reagent and protease cleavage (Fig. 2C). In the presence of protease, Coomassie Blue staining showed that both MA/CAΔ and MA/CAΔ-noTC generated the same two cleavage products, p40 GST-MA and the p15 truncated CA protein, CAΔ. The same gel visualized by fluorescence imaging revealed that only the bands (indicated by the asterisks) of the proteins harboring a tetracysteine motif had the FlAsH reagent. The fact that the MA/CAΔ-noTC protein did not show any nonspecific binding of the FlAsH reagent to the protein confirms that the interaction between the tetracysteine motif and the FlAsH reagent is highly specific. The band intensity of FlAsH-labeled protein visualized by fluorescence imaging depends on the number of the molecules rather than the size of the molecules, providing a broad and linear detection range. When we compared the sensitivity of detection between several methods, the gel-based assay using FlAsH-labeled protein showed sensitivity of detection comparable with that of silver staining (data not shown). Finally, to show specificity of protease cleavage, we combined the wild type substrate with a substrate where the P1 tyrosine was replaced with isoleucine to block protease cleavage. 
When these substrates were incubated together with protease, only the substrate with the wild type cleavage site was cleaved (Fig. 2D).
Lower pH (<6.0) Is Not Optimal for Protease Cleavage of Protein Substrate MA/CAΔ-The gel-based protease assay was performed at pH 7.0 because the FlAsH reagent becomes nonfluorescent at pH values below 6.0 (35). However, most enzymatic analyses of the HIV-1 protease using peptide substrates have been performed at lower pH, typically near pH 5.0, and the rates of cleavage of the MA/CA and CA/SP1 sites have been shown to be accelerated at pH 5.0 compared with pH 7.0 when peptide substrates were used (3). Although HIV-1 protease assays using full-length Gag polyprotein have been performed at pH 7.0 (3, 5), the cleavage rate of the CA/SP1 site in the context of the full-length Gag polyprotein was shown to be significantly increased at lower pH relative to cleavage of MA/CA (3). Therefore, to examine whether the processing of the MA/CA substrate by the HIV-1 protease at pH 7.0 is optimal compared with the cleavage at pH 5.5, the enzyme activity of the protease at different pH values (pH 5.5, 6.0, 6.5, and 7.0) was compared. Surprisingly, in the context of a large protein substrate, the protease cleaved the MA/CA site more efficiently at pH values higher than 6.0, although no significant difference was seen between pH 6.0 and 7.0 (Fig. 3, A and B). As expected, band intensity was uniformly decreased at pH 5.5 because the FlAsH reagent becomes nonfluorescent at lower pH (for both the precursor and the product). In contrast with previous studies using peptide substrates, the cleavage rate of the protein substrate by the protease was dramatically accelerated at higher pH (approximately a 10-fold increase at pH values above 6.0) compared with the rate of cleavage observed at pH 5.5 (Fig. 3C). Thus, the optimal pH in this protease assay using the MA/CA substrate is between 6.5 and 7.0, which is more consistent with physiological pH.
Impact of Heterologous Context on Cleavage Rates of Processing Sites within Gag and Gag-Pro-Pol Polyproteins-The processing sites in the Gag polyprotein are cleaved at different rates, resulting in sequential Gag processing during the maturation process to produce an infectious virion. In previous in vitro studies (3, 9, 10), the processing rates of the different cleavage sites in the Gag precursor gave the following relative rates of cleavage (setting the MA/CA cleavage rate to 1): SP1/NC 14×, SP2/p6 1.6×, MA/CA 1×, and CA/SP1 and NC/SP2 0.04×. In the current study, we replaced the eight amino acids (P4-P4′) at the MA/CA cleavage site within the MA/CAΔ substrate with the cleavage site sequences from the other Gag and Gag-Pro-Pol cleavage sites: CA/SP1, SP1/NC, NC/SP2, SP2/p6, TF/PR, PR/RT, or RT/IN. Although no changes were made in the amino acid sequences required for processing, the transposition of these sequences resulted in the cleavage sites being placed in a heterologous environment in the context of the Gag precursor. If the context outside of each cleavage site plays a role in determining cleavage rates, we would expect to see changes in the relative order of Gag cleavage. In this design, the initial rate of cleavage defines the specificity constant (kcat/Km), and the difference in the initial rate of the two different substrates, when present at equal concentrations, defines the difference in the specificity constant. In the following experiments the initial rate of cleavage of the wild type MA/CA substrate was set as 1.
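The link between initial rates and specificity constants stated above follows from the standard treatment of two substrates competing for one enzyme. The relation below is the textbook result for competing substrates, written out here as a reminder rather than derived from data in this study; the labels A and B (for example, a heterologous site and the MA/CA site) and the symbol [E] for free enzyme are notation introduced only for this illustration.

```latex
% Two substrates A and B compete for the same free enzyme E.
\[
  v_A = \left(\frac{k_{cat}}{K_m}\right)_{A} [E]\,[A], \qquad
  v_B = \left(\frac{k_{cat}}{K_m}\right)_{B} [E]\,[B]
\]
% The free enzyme concentration cancels in the ratio:
\[
  \frac{v_A}{v_B}
  = \frac{(k_{cat}/K_m)_{A}\,[A]}{(k_{cat}/K_m)_{B}\,[B]}
\]
% With equal substrate concentrations, [A] = [B]:
\[
  \frac{v_A}{v_B} = \frac{(k_{cat}/K_m)_{A}}{(k_{cat}/K_m)_{B}}
\]
```

Under this relation, running both substrates in the same tube at equal concentration means that the ratio of initial slopes reports the ratio of specificity constants without requiring the enzyme concentration to be known precisely.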
As noted above, these experiments were designed so that the wild type protein (MA/CA) and the protein with the alternative cleavage site had different sizes as both substrate and cleavage product, which allowed us to mix the two substrates and carry out the cleavage reaction in a single tube. Thus, the wild type substrate, MA/CA (Fig. 1A), was included in each reaction as an internal control to measure the relative cleavage rate. Fluorescence images of the cleavage reactions for the CA/SP1, SP1/NC, and RT/IN sites and the resulting time course of the cleavage for all of the substrates are shown in Fig. 4 (A and B), respectively. When the cleavage rates were compared with that of the MA/CA site, we observed significant differences in the rate of cleavage for some of the different Gag and Gag-Pro-Pol cleavage sites (Fig. 4, B and C). As can be seen in Fig. 4C, the sites can be grouped into those cleaved within 3-fold of the rate of MA/CA and those that were cleaved much more slowly. MA/CA and RT/IN have similar cleavage rates in the context of Gag-Pro-Pol (9) and also similar rates in the context of MA/CA (Fig. 4C). SP1/NC has a 10-fold higher cleavage rate in its homologous position, but when transposed the rate was slower than that of MA/CA, although cleavage was still detectable (Fig. 4C). CA/SP1 is cleaved slowly, but this rate is dictated by prior removal of the NC domain by cleavage at SP1/NC; when the SP1/NC cleavage site is blocked, CA/SP1 cleavage occurs at a rate that is similar to that of MA/CA (3), which was also the case when the site was placed at the MA/CA site (Fig. 4C). NC/SP2, TF/PR, and PR/RT are all cleaved slowly in the context of Gag and Gag-Pro-Pol (3, 9, 10) and were cleaved poorly when transposed into the MA/CA site (Fig. 4C). The most anomalous rate is that of SP2/p6, which is similar to MA/CA in the context of Gag (3) but was cleaved very poorly when the site was transposed (Fig. 4C). 
However, the magnitude of the decrease of the rate of cleavage of SP1/NC and SP2/p6 is similar compared with the rate of cleavage in Gag, suggesting that especially for these two sites their context in Gag plays a significant role in enhancing the rate of cleavage.

Inhibitor Resistance Mutations in HIV-1 Protease Can Change Processing Site Preferences-The presence of multiple mutations in the protease region can reduce binding affinity to protease inhibitors, leading to resistance to the protease inhibitors. It is possible that these protease variants also alter substrate specificity (36, 37). To see whether MDR HIV-1 protease variants can affect the order of the rates of the Gag and Gag-Pro-Pol cleavage, we tested two MDR HIV-1 protease variants in our protease assay: PR G2 containing amino acid substitutions L10I, G48V, I54V, L63P, and V82A; and PR G4 containing amino acid substitutions L10I, L63P, A71V, G73S, I84V, and L90M. As shown in Fig. 5, the PR G2 enzyme had relative rates of cleavage of the different substrates that were overall similar to the wild type enzyme with the exception of the SP1/NC site, which was cleaved six times faster than the MA/CA site, a rate more similar to its rate in full-length Gag (3), and the CA/SP1 site, which was cleaved moderately faster compared with the MA/CA site. In contrast, the PR G4 variant resulted in relative rates that were similar to those of the PR WT with the exception of the RT/IN site, which was modestly reduced (Fig. 6). Thus, the two MDR proteases showed distinct but mostly modest effects on substrate processing except for one of them
context of the Gag polyprotein or in the context of the peptide substrates, and these two sites were differentially recognized by the wild type and two MDR proteases (Figs. 4-6). Thus, to explore substrate specificity further, we investigated two chimeric substrates: SP1/NC P4-1 and SP1/NC P1′-4′. SP1/NC P4-1 harbors P4-P1 derived from the SP1/NC site and P1′-P4′ derived from the MA/CA site, whereas SP1/NC P1′-4′ contains P4-P1 derived from the MA/CA site and P1′-P4′ derived from the SP1/NC site. When these two chimeric substrates were tested for cleavage efficiency by the three different proteases, PR WT, PR G2, and PR G4, the chimeric substrates were poorly cleaved by all of the enzymes (Fig. 7, A and B). The one exception was the SP1/NC P1′-4′ chimera, which was cleaved faster than the MA/CA site but slower than the SP1/NC site when cleaved by the PR G2 enzyme. Although the P1′ to P4′ amino acids accounted for much of the enhanced efficiency of the PR G2 protease for the SP1/NC site, in most cases the chimeras were poor substrates compared with their parental sequences, implying that the amino acids on either side of the scissile bond are interdependent for optimal processing. Substrate specificity of the HIV-1 protease is determined by a number of elements responsible for the molecular interactions between protease and substrates. Thus, it is possible that changes in the molecular interactions between protease and the substrate caused by the alterations made in the SP1/NC P4-1 create a substrate that cannot be cleaved by the HIV-1 protease. To explore this possibility, the total number of hydrogen bonds and the van der Waals contacts between the protease and the chimeric substrates (P4-P4′) within the structure of the protease-substrate complexes were analyzed using previously determined structures (16) and models of the chimeric substrates derived from these structures. The van der Waals interactions reflect the favorable protease-substrate contacts caused by local packing in the hydrophobic binding groove, whereas hydrogen bonds play a crucial role in substrate specificity (15, 16). When we compared the results from the chimeric substrates to those from the parental substrate sites MA/CA and SP1/NC, the SP1/NC P4-1 had fewer total hydrogen bonds (Fig. 7C) and less van der Waals contact potential than the parental substrates (Fig. 7D). The SP1/NC P1′-4′ substrate exhibited relatively high scores in these measures of molecular interactions between the protease and the substrate compared with the other substrates (Fig. 7, C and D); however, it was cleaved poorly by the PR WT and PR G4 proteases (Fig. 7, A and B). 
The structural data seem to reflect these cleavage patterns only when both the substrate dynamics within the binding site and the impact of the residues surrounding the processing sites on the protease-substrate interactions are incorporated into the molecular modeling studies.

DISCUSSION
The HIV-1 PR has been an important target for a successful group of HIV-1 inhibitors because of its essential role in the proteolytic processing of the HIV-1 Gag and Gag-Pro-Pol polyproteins for the formation of infectious virus. However, much still remains to be learned about the determinants of HIV-1 PR specificity in the context of the processing/assembly cascade. Most of our knowledge regarding PR specificity has come from studies using peptide substrates representing the Gag and Gag-Pro-Pol cleavage sites, which cannot account for possible elements residing outside of the local cleavage site sequences, P4–P4′. The order of cleavage obtained from studies with peptide substrates disagrees with the order of cleavage in the context of the full-length Gag polyprotein. Moreover, the order of cleavage of the various sites as ranked by different studies with peptide substrates is not consistent (19, 21, 22). One difference among these studies is the length of the peptide used, which suggests that the local environment surrounding the heptapeptide cleavage site may play a role in determining substrate specificity of the HIV-1 PR, although methodological differences may also account for the reported rate differences. Thus, to explore the question of context further, we performed a comparative analysis of the effect of context on HIV-1 processing by introducing Gag and Gag-Pro-Pol cleavage sites into the heterologous context of the Gag MA/CA cleavage site. By labeling the protein substrate, MA/CAΔ, with a FlAsH reagent that binds the tetracysteine motif introduced within the NTD region of CA, we were able to detect and measure proteolysis of the substrate by the HIV-1 PR.
In addition, we were able to manipulate the length of both the upstream and downstream segments of the products to allow two substrates and their labeled products to be resolved together by SDS-PAGE, which enabled us to run the wild type sequence as an internal control/standard in each reaction. Replacing the MA/CA cleavage site with cleavage sites from the Gag and Gag-Pro-Pol polyproteins produced rates of cleavage that in some cases matched and in other cases differed from those expected, revealing that the context surrounding at least three of the cleavage sites plays a major role in determining the rate of cleavage by the HIV-1 PR. In addition, the use of MDR HIV-1 PR variants, PR G2 and PR G4, also selectively affected the rates of cleavage. The use of a fluorescent tag has also allowed us to develop a protease assay based on cleavage of a folded protein using fluorescence anisotropy, which is amenable to large scale screens directed at inhibitors of either the protease or the substrate.³ Our goal in using a folded protein substrate was to be able to examine the cleavage site requirements in the context of the larger protein. We found that with the large MA/CA protein substrate, the optimal pH for cleavage by the protease was between 6.5 and 7.0. This result differs from earlier studies using peptides as substrates, which showed an optimum pH below 6 for proteolysis (3, 17, 38–40). In an in vitro assay using radiolabeled Gag polyprotein as a substrate, Pettit et al. (3) observed that the CA/SP1 site was cleaved about 20-fold faster at pH 5 compared with a reaction run at pH 7, measured as the rate relative to the MA/CA site. For synthetic peptide substrates, the Km values were higher at pH 7 relative to lower pH, suggesting that peptides interact less well with the HIV-1 protease at pH 7 than at lower pH (38). The higher rate of cleavage at neutral pH that we observed is more consistent with the expectation that the pH in the virion should be the same as that in the cell.
Our previous work on the relative rates of cleavage in the Gag and Gag-Pro-Pol polyproteins (3, 9) provides a context for interpreting the rates of cleavage of the same sites when placed in the MA/CA site. We propose that three phenomena are at work in defining the rates of cleavage when the sites are placed in this heterologous site. First, the rate of cleavage for one set of sites is largely determined by the sequence of the cleavage site such that the rate of cleavage is similar at the homologous site and in the MA/CA site. A corollary of this assumption is that the MA/CA site itself is somewhat neutral in terms of the contribution of context. We have observed that the addition of a glycine spacer (three glycines) upstream and downstream of the MA/CA site has a uniform effect on all cleavage sites in modestly reducing the rate of cleavage, suggesting that there is no significant contribution of context to a specific sequence at the MA/CA site.⁴ Thus, we interpret the similar cleavage rates of MA/CA, RT/IN, NC/SP2, TF/PR, and PR/RT in both homologous and heterologous sites as being determined largely by the cleavage site sequence itself. Hence, it should be possible to significantly increase the rate of cleavage of the slowly cleaved sequences with changes in the cleavage site sequence itself, a possibility we are exploring.
³ S.-K. Lee, C. A. Schiffer, and R. Swanstrom, manuscript in preparation.
Second, the rate of cleavage of some sites is negatively regulated in the homologous context, as is known for the CA/SP1 site. When placed in the MA/CA site, its rate of cleavage was consistent with removal of that negative regulation. In its homologous context, the CA/SP1 site is cleaved very slowly. Placement of this site in the MA/CA context increased its rate of cleavage to ≈40% of the rate of cleavage of the MA/CA site. Similar increases in the rate of cleavage occur either when the downstream SP1/NC cleavage is blocked or when the pH is lowered (3), although in another study the enhanced rate did not result in complete cleavage at the CA/SP1 site in either transfected cells or virus particles (6). In addition, the amino acid sequences at the CA/SP1 boundary have been predicted to adopt an α-helical structure (41); however, the NMR structure of a fragment including the C-terminal domain of capsid through the NC domain did not reveal an α-helical structure at the CA/SP1 boundary because of the flexibility of this region (42). It is possible that the amino acids at the CA/SP1 boundary region undergo a conformational change upon the initial cleavage at the SP1/NC site to function as a negative regulator of cleavage at the CA/SP1 site, although the mechanism is still unclear.
Finally, we interpret two sites as being under local positive regulation in their homologous sites. Both the SP1/NC site and the SP2/p6 site are cleaved at significantly slower rates in the heterologous context of the MA/CA protein than in their homologous sites. Curiously, both are at the C-terminal end of a spacer peptide, and both play significant roles in the interaction with viral RNA (4, 43–45).
The SP2/p6 site is efficiently cleaved in its homologous site (3), and in the p15 (NC/SP2/p6) intermediate, its cleavage is strongly enhanced by binding of RNA to NC (4, 7). However, out of this context, i.e., when placed at the MA/CA site, the rate of cleavage is very slow, analogous to the rate of cleavage in the absence of RNA. It is unlikely that RNA serves as a cofactor for the protease in the cleavage of the SP2/p6 site. The alternative explanation is that RNA binding to NC promotes a conformational change that greatly enhances the presentation of the cleavage site sequence in its homologous context. It has been shown that it is the binding of RNA to NC that is important for this rate enhancement; however, the nature of the conformational change induced by RNA binding is not known. Based on this argument, the SP2/p6 site alone is suboptimal for cleavage, as seen when placed in the MA/CA site, but a local conformation is able to overcome the suboptimal sequence context to enhance recognition and/or cleavage by the viral protease when in its homologous context of NC/SP2/p6.
The SP1/NC site also appears to be under positive regulation in its homologous site. This site is the first site cleaved in Gag, and its rate of cleavage in full-length Gag is significantly faster than the rate of cleavage of the MA/CA site, in contrast to its rate of cleavage when this site is placed at the MA/CA site. It should be noted that in these experiments the sequence of the NL4-3 cleavage site was used, and it contains isoleucine at the P1′ position instead of methionine. The more common P1′ methionine increases the rate of cleavage of the SP1/NC site severalfold when placed in this heterologous site,⁵ but it is still well below the much higher relative cleavage rate seen in the homologous site. In either form the cleavage rate of the SP1/NC site in the context of the MA/CA site was far less than that in its original context, which is consistent with an earlier study by Tritch et al. (5) using in vitro translated Gag proteins. Cleavage at the SP1/NC site is important for the maturation of viral RNA from a low stability dimer to the high stability dimer associated with the mature virus particle (46). Although the failure of NC to bind RNA greatly reduces the rate of cleavage of the SP2/p6 site, the failure to bind RNA does not affect the generation of the p15 intermediate, suggesting that the rapid cleavage at the SP1/NC site is not dependent on NC binding to RNA. Thus, the nature of the positive regulator that makes the SP1/NC site the most rapidly cleaved site in either Gag or Gag-Pro-Pol is unknown.
With the MDR HIV-1 PR variants PR G2 and PR G4, we observed two types of changes with respect to the relative rates of cleavage. One type of change resulted in a modest, 2-3-fold increase or decrease in the rate of cleavage relative to the MA/CA site. In our experimental design we measure rates relative to the wild type MA/CA site. If the mutant proteases had a specific change in the rate of cleavage at the MA/CA site, we would not be able to distinguish this effect from a change at another site. Thus, it is difficult to interpret these relatively small changes with respect to mechanism. However, there was an ≈10-fold increase by PR G2 in the relative rate of cleavage of the SP1/NC site, suggesting a significantly enhanced interaction between this site and the MDR protease. An examination of the number of hydrogen bonds or van der Waals interactions did not provide an obvious explanation for this change. It is possible that the mutant protease interacts with substrates in a way that at least in part compensates for the positive regulatory element that affects the rate at the homologous site. We speculate that the P1′ to P4′ residues of the SP1/NC site contribute a significant proportion of the positive interactions with the PR G2 mutant protease (Fig. 7). The use of either SP1/NC or MA/CA half-sites also showed that neither half-site, when paired with the complementary half-site, was significantly active as a substrate for the wild type enzyme. Therefore, the chimeric substrates composed of two half-sites derived from the SP1/NC and MA/CA sites are antagonistic in their interaction with the protease, and effects across the scissile bond caused by the substitutions must alter the interaction of each half-site within the enzyme.
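The ambiguity in interpreting the modest 2-3-fold changes can be written out as a short worked expression. The notation is illustrative and is an assumption introduced here, not taken from the original analysis: k denotes an absolute cleavage rate, the subscript names the site (X for the test site), and the superscript names the enzyme (WT or MDR).

```latex
\[
r_X^{E} = \frac{k_X^{E}}{k_{\mathrm{MA/CA}}^{E}},
\qquad
\frac{r_X^{\mathrm{MDR}}}{r_X^{\mathrm{WT}}}
= \frac{k_X^{\mathrm{MDR}} / k_X^{\mathrm{WT}}}
       {k_{\mathrm{MA/CA}}^{\mathrm{MDR}} / k_{\mathrm{MA/CA}}^{\mathrm{WT}}}
\]
```

Under this notation, a 2-fold rise in the measured ratio results equally from a 2-fold gain in cleavage at site X or a 2-fold loss in cleavage at the MA/CA site, so a relative-rate measurement alone cannot assign a small change to one site or the other.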
In this report, we demonstrate the importance of the context surrounding cleavage sites for Gag processing by using a heterologous context, the MA/CA site, suggesting that interactions between the PR residues and the substrates are not confined to the active site of the protease. We suggest that context plays a different role for different sites, either being relatively neutral, providing negative regulation, or providing positive regulation. Even within a cleavage site sequence, the two half-sites can have a positive effect on each other in determining the rate of cleavage. These long range features of the regulation of cleavage rates cannot be addressed using peptide substrates and therefore point to the need to use Gag protein substrates to explore the role of context in regulating cleavage rates.
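The three roles of context proposed above can be illustrated with a minimal sketch that classifies a site by the fold change in its relative cleavage rate when moved from its homologous context to the MA/CA site. The function name, the enum, and the 5-fold threshold are hypothetical choices made for illustration only; the fold changes in the usage lines are the approximate values stated in the abstract (CA/SP1 ≈10-fold increase, SP2/p6 ≈30-fold decrease).

```python
from enum import Enum


class ContextEffect(Enum):
    NEUTRAL = "context relatively neutral"
    NEGATIVE = "negative regulation in homologous context"
    POSITIVE = "positive regulation in homologous context"


def classify_context(fold_change: float, threshold: float = 5.0) -> ContextEffect:
    """Classify the role of context for one cleavage site.

    fold_change: relative rate at the MA/CA site divided by the relative
        rate in the homologous context (both measured against MA/CA).
    threshold: hypothetical cutoff (an assumption, not from the paper)
        below which a change is treated as too small to interpret.
    """
    if fold_change <= 0 or threshold <= 1:
        raise ValueError("fold_change must be > 0 and threshold must be > 1")
    if fold_change >= threshold:
        # Faster out of context: the native context was suppressing cleavage.
        return ContextEffect.NEGATIVE
    if fold_change <= 1.0 / threshold:
        # Slower out of context: the native context was enhancing cleavage.
        return ContextEffect.POSITIVE
    return ContextEffect.NEUTRAL


if __name__ == "__main__":
    print("CA/SP1:", classify_context(10.0).value)
    print("SP2/p6:", classify_context(1.0 / 30.0).value)
```

A symmetric threshold on the fold change is only one possible design; it treats a 2-3-fold change as uninterpretable, in line with the caution expressed above about small changes in relative rates.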