Extent and Dynamics of Polymorphism in the Malaria Vaccine Candidate Plasmodium falciparum Reticulocyte–Binding Protein Homologue-5 in Kalifabougou, Mali

Abstract. Reticulocyte-binding homologues (RH) are a ligand family that mediates merozoite invasion of erythrocytes in Plasmodium falciparum. Among the five members of this family identified so far, only P. falciparum reticulocyte–binding homologue-5 (PfRH5) has been found to be essential for parasite survival across strains that differ in virulence and route of host-cell invasion. Based on its essential role in invasion and early evidence of sequence conservation, PfRH5 has been prioritized for development as a vaccine candidate. However, little is known about the extent of genetic variability of RH5 in the field and the potential impact of such diversity on clinical outcomes or on vaccine evasion. Samples collected during a prospective cohort study of malaria incidence conducted in Kalifabougou, in southwestern Mali, were used to estimate genetic diversity, measure haplotype prevalence, and assess the within-host dynamics of PfRH5 variants over time and in relation to clinical malaria. A total of 10 nonsynonymous polymorphic sites were identified in the Pfrh5 gene, resulting in 13 haplotypes encoding unique protein variants. Four of these variants have not been previously observed. Plasmodium falciparum reticulocyte–binding homologue-5 had low amino acid haplotype (h = 0.58) and nucleotide (π = 0.00061) diversity. By contrast to other leading blood-stage malaria vaccine candidate antigens, amino acid differences were not associated with changes in the risk of febrile malaria in consecutive infections. Conserved B- and T-cell epitopes were identified. These results support the prioritization of PfRH5 for possible inclusion in a broadly cross-protective vaccine.


INTRODUCTION
The efficacy of vaccines targeting Plasmodium falciparum, the deadliest human malaria parasite, has been limited in part by genetic diversity in the vaccine antigens in endemic areas, 1,2 resulting in higher protection against parasites carrying alleles identical to the vaccine genotype. [3][4][5][6] Molecular epidemiological surveys of P. falciparum genetic diversity at vaccine-testing sites can identify the most prevalent genetic variants of candidate antigens for consideration in designing broadly protective vaccines and assess the role of specific polymorphisms in strain-specific immunity. 4 Plasmodium falciparum reticulocyte-binding protein homologue-5 (PfRH5) is a promising new blood-stage vaccine antigen that is moving quickly toward clinical development. [7][8][9] The invasion of merozoites into the erythrocyte is a multistep process that starts with merozoite binding and reorientation onto the red blood cell (RBC), followed by the formation of a tight junction and culminates with parasite entry into the erythrocyte. Essential to this process is PfRH5, a 63-kDa protein encoded by a gene located in the subtelomeric region of chromosome 4. The gene (PF3D7_0424100) is 1,581 nucleotides long and is refractory to knockout in diverse strains. 10 The erythrocyte receptor for RH5 is basigin (CD147), and antibodies specific to either RH5 or basigin inhibit erythrocytic invasion by P. falciparum in a dose-dependent manner in in vitro tests. 11 Furthermore, IgG antibodies induced after PfRH5 immunization have higher potency for controlling RBC invasion by both homologous and heterologous P. falciparum strains in growth inhibition assays, as compared with IgG raised against the apical membrane antigen-1 (AMA1) and merozoite protein-1, two other leading blood-stage vaccine antigens. 12 Importantly, the pfrh5 gene exhibits limited polymorphism across multiple P. falciparum laboratory strains 10,13 and geographic isolates. 
14 Moreover, allelic exchange experiments have determined that specific PfRH5 residues are critical for host erythrocyte invasion. For example, a nonsynonymous change in codon 204 (I204K) converted a poorly invasive strain into a highly invasive one. 10 In a separate set of experiments, this invasion-enabling single nucleotide polymorphism (SNP) at position 204 changed the protein structure and greatly modified its binding specificity and infectivity. 15 Taken together, these findings suggest that the functional requirement for erythrocyte invasion exerts strong purifying selection on the locus and might constrain the genetic diversity of pfrh5. To evaluate the potential clinical implications of this hypothesis and to generate more evidence on the extent of natural diversity in PfRH5 at a single vaccine-testing field site, we investigated the prevalence and dynamics of pfrh5 polymorphisms during acute malaria episodes in Malian children and adults. Longitudinal analyses were carried out to assess whether specific pfrh5 alleles, or changes at this locus in consecutive infections, are associated with the development of malaria symptoms. This prospective study aimed to identify specific PfRH5 amino acid residues that are critical in clinical malaria, and to identify conserved B- and T-cell epitopes that should be taken into account in the design of a malaria vaccine intended to provide broader, strain-transcending protection against diverse parasites than was achieved with first-generation blood-stage malaria vaccines.

MATERIALS AND METHODS
Sample origin. Samples used in this study originated from Kalifabougou, a rural village located 45 km northwest of Bamako, Mali. Malaria transmission is seasonal (June-December) and hyperendemic, with seasonal peaks following the rains in July through October. The study population and clinical parameters have been described previously. 16 Briefly, 695 healthy children and adults, aged 3 months to 25 years, were enrolled in this ongoing study beginning in May 2011. Samples and clinical data for this study were collected from May 2011 through January 2012. Study participants were surveyed during scheduled visits occurring every 2 weeks and whenever they presented with malaria-like symptoms. During surveillance, clinical parameters were measured and malaria parasites were detected by thick smears. For this study, a clinical malaria episode was defined as any parasitemia by blood smear with a temperature greater than or equal to 37.5°C. Participants with positive blood smears were treated according to the National Malaria Control Program guidelines in Mali. Peripheral whole-blood samples were collected by finger stick in capillary collection tubes (Sarstedt, Newton, NC) for the cryopreservation of the cell pellet and/or as dried blood spots (DBS) on 903 filter paper (Whatman/GE Healthcare, Piscataway, NJ). Longitudinal polymerase chain reaction (PCR) analysis of DBS collected at scheduled visits was performed as previously described. 17 Whole-blood samples from participants with at least two malaria episodes during the season were selected for genomic DNA (gDNA) extraction as described below (Figure 1).
DNA extraction and genotyping. Whole-blood gDNA was extracted from blood samples and DBS using the QIAamp 96 DNA kit (Qiagen, Valencia, CA). To genotype polymorphisms in pfrh5, a nested PCR was used to amplify the major exon in all the samples. Briefly, primers F1 (5′-GCAATAAAAAAAACGAAGAATCAAGA-3′) and R1 (5′-TGAAAATATTCCATTTTAATTGGGAC-3′) were used in the primary PCR reaction (1,465 bp). Conditions for this amplification were 94°C for 5 minutes, followed by 35 cycles of denaturation at 94°C for 1 minute, primer annealing at 48°C for 1 minute, and extension at 60°C for 1.5 minutes, with a final extension at 60°C for 5 minutes. PCR products from the first amplification were subsequently used as a template for a second round of amplification (1,418 bp) with primers F2 (5′-GAAAATAATCTGACGTTACTACCA-3′) and R2 (5′-GACATCATTGAACTTCATTTGTAG-3′). Amplification conditions were the same as in the primary amplification, except that 10 cycles were run instead of 35. PCR reactions were carried out in a total volume of 50 μL with 47 μL of Platinum Super Mix in 96-well plates (Invitrogen, Carlsbad, CA), 1 μL of each primer (1 μM final concentration) and 1 μL of malaria parasite DNA. Secondary PCR had 47 μL of Platinum Mix, 1 μL of each primer (1 μM final concentration) and 1 μL of primary PCR product. Secondary PCR products were electrophoresed on a high-throughput precast gel system (Invitrogen), then visualized and photographed under ultraviolet light with ethidium bromide as the stain.
PCR products were purified using filter plates (Edge Biosystems, Inc., Gaithersburg, MD) attached to a vacuum manifold and eluted in water, and sequenced in a 10 μL reaction using BigDye v3.1 (Applied Biosystems, Carlsbad, CA). The sequencing products were ethanol precipitated and run in 7 μL HiDi formamide on an ABI 3730XL 96-capillary sequencer (Thermo Fisher Scientific, Waltham, MA). Following sequencing, fragments were assembled, edited, and aligned to the P. falciparum 3D7 reference sequence (GenBank accession number: XM_001351508) using Sequencher v.5.0 (Gene Codes Corporation, Ann Arbor, MI). Infections were classified as multiple allele infections if the secondary peak height was greater than 50% of the primary peak height at any polymorphic position, and as single/predominant allele infections otherwise. Sequence alignment was performed using Bioedit version 7. 18
FIGURE 1. Within-host dynamics of Plasmodium falciparum reticulocyte-binding homologue-5 (PfRH5) amino acid polymorphisms during consecutive malaria infections. The gray color represents asymptomatic infections, and the white color represents clinical malaria. Different abbreviations (Asparagine: N, Lysine: K, Glycine: G, and Glutamate: E) indicate different alleles at a polymorphic amino acid site within PfRH5. In this hypothetical example, at this amino acid position, the individual goes from being asymptomatic during his first two infections (N then K) to symptomatic during his third infection (G). The individual is later infected with variants harboring different amino acids than the previous two infections but in one instance similar to the first infection (N).
Population genetics analysis. Haplotypes were defined as an uninterrupted sequence of nucleotides or amino acid sequence variants, and the pfrh5 locus of the P. falciparum 3D7 strain (GenBank accession number: XM_001351508) was used as the reference sequence for all the comparisons. The nucleotide diversity at the locus, among sequences generated from samples collected during the study, was assessed by estimating the parameter π, the average number of nucleotide differences per site between two random sequences. 19 Bootstrap resampling was used to simulate the distribution of π and to calculate confidence intervals (CIs). These analyses were conducted using MEGA6 20 and DnaSP v5. 21 The prevalence of nonsynonymous substitutions was estimated using SAS v9.1 (SAS Institute Inc., Cary, NC). To assess whether regions of pfrh5 evolve under neutrality (Kimura 1983), the McDonald and Kreitman test was performed using Plasmodium reichenowi (PRCDC_0322700) as an outgroup, and analyses were performed using DnaSP v5. 21 The hypothesis that positive selection has occurred on one or more branches of the PfRH5 tree was assessed by applying the branch-site test 22 and the branch-site unrestricted statistical test for episodic diversification (BUSTED) implemented in the HYPHY package. 23 Whereas BUSTED identifies the effect of sporadic positive selection on a subset of branches in the lineage, the branch-site test uses a χ² distribution with one degree of freedom to compare a model that allows positive selection on one or more variants of a group with a model that does not permit positive selection (P < 0.05 for sites positively selected). Tajima's D statistic was also calculated, with a positive value of Tajima's D indicating an excess of alleles at intermediate frequency, consistent with balancing selection, and negative values consistent with directional selection or with rapid population expansion following a bottleneck. To assess recombination events in pfrh5, which may result in negative Tajima's D values and lead to a false inference of directional selection or rapid expansion, the recombination parameter (R) and the minimum number of recombination events (Rm) 24 in the data were estimated using DnaSP v5. These recombination events were confirmed by the genetic algorithm recombination detection test. 
25 Genetic association between polymorphic sites and the effect of intragenic recombination on sequence polymorphism were assessed using the Zns 26 and the ZZ 27 statistics, respectively. To group PfRH5 sequences, a sequence clustering analysis was conducted. Structurama, a Bayesian clustering algorithm software, 28 which treats the number of populations as a random variable, was used to determine the posterior number of populations (K). We identified two populations by this Dirichlet process. We then used Structure version 2.3.3 29 to assign PfRH5 sequences to haplotype groups. Ten runs of 50,000 burn-ins and 100,000 iterations were performed for K = 2 using an admixture model.
Within-host amino acid dynamics. The frequencies of protein sequence variants, or haplotypes (N = 13), and of polymorphic amino acid residues (N = 10) were compared between clinical and asymptomatic malaria cases using a χ² test while correcting for multiple comparisons (two-sided P value thresholds: 0.05/13 = 0.003 and 0.05/10 = 0.005, respectively). Generalized estimating equations were used to perform logistic regression to estimate whether specific pfrh5 alleles or shifts in pfrh5 alleles were associated with the development of clinical symptoms in an individual's consecutive infections. Both single- and multiple-clone infections were included in the analysis. The primary outcome was whether an individual's second of two paired consecutive infections was symptomatic or asymptomatic, with the presence of infection being defined based on the results of the pfrh5 PCR (Figure 1). Study participants' paired consecutive episodes were classified as follows: absence of infection to asymptomatic (OA, N = 40); absence of infection to symptomatic (OS, N = 9); asymptomatic to asymptomatic (AA, N = 58); asymptomatic to symptomatic (AS, N = 16); symptomatic to asymptomatic (SA, N = 13); and symptomatic to symptomatic (SS, N = 3). The primary predictor for the analysis was the presence of a specific pfrh5 variant or a change in amino acid sequence variant in the next consecutive infection. The models included potential confounding variables such as age group and symptoms in the first of the two infections. Analysis was performed using SAS v9.1.
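The multiple-comparison thresholds and the χ² comparison described above can be illustrated with a short Python sketch. It is not the SAS analysis used in the study: the function names are hypothetical, the 2 × 2 counts in the usage lines are invented for illustration, and the P value assumes a Pearson χ² statistic with one degree of freedom and no continuity correction.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def p_from_chi2_1df(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    # For 1 df, chi2 is the square of a standard normal deviate,
    # so the tail probability is erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, p_from_chi2_1df(chi2)


# Thresholds for 13 haplotypes and 10 polymorphic residues.
haplotype_threshold = bonferroni_threshold(0.05, 13)
residue_threshold = bonferroni_threshold(0.05, 10)

# Hypothetical counts: allele present/absent (rows) by clinical/asymptomatic (columns).
chi2, p_value = chi_square_2x2(10, 20, 20, 10)
is_significant = p_value < residue_threshold
```

Here `bonferroni_threshold(0.05, 10)` gives 0.005 and `bonferroni_threshold(0.05, 13)` gives approximately 0.0038. An allele is flagged only when its P value falls below the corrected threshold, not below 0.05.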
Human subject protections. The study protocols were reviewed and approved by the institutional review boards of the University of Sciences, Techniques, and Technology of Bamako, Mali; the University of Maryland Baltimore; and the National Institute of Allergy and Infectious Diseases. The study participants gave their consent and/or assent to participate in the study.

RESULTS
Data summary. Blood spots on filter paper and/or whole-blood specimens (N = 312) were selected from the incidence study sample repository. Samples had been collected biweekly from individuals with at least two malaria episodes during study follow-up starting at the beginning of the 2011 malaria transmission season. Of these specimens, 198 (63.5%) were successfully sequenced using primers specific to the pfrh5 gene. The dynamics of consecutive malaria infections and clinical episodes were assessed using 139 sequences generated from paired consecutive episodes as described in Table 1. Amino acid sequence variants were defined based on single amino acid polymorphisms. Based on the quality of the 198 sequences generated (GenBank accession numbers: MG012489-MG012686), each sequenced sample yielded only one haplotype.
Polymorphisms and haplotype prevalence of pfrh5 gene. Ten nonsynonymous polymorphisms were identified in pfrh5. These mutations and the amino acid residues in which they are present are as follows: D67E, N117S, Y147H, H148D, C203Y, D249Y, D305G, V371I, I407V, and I410M. The most common polymorphisms were C203Y (66.8%) followed by H148D (5%). Based on these 10 amino acid polymorphisms, 13 unique protein variants were identified (Table 2). Seven of these variants were seen only once in the current dataset. The most frequent variant, which we define by the composition at the 10 variable residue positions (variant A: DNYHYDDVII), had a prevalence of 57.3%, whereas the prevalence of the second most frequent variant (variant B: DNYHCDDVII) was 30.7% (Table 2). An analysis by the Dirichlet process using Structurama identified these two groups as the main populations in this transmission site. An analysis of the distribution and dynamics of variants during the 8 months of follow-up showed that the incidences of variants A and B differed only in July (P = 0.03, χ² = 4.5). Interestingly, the incidences of the two major haplotypes fluctuated slightly and alternated over time (Figure 2), which is characteristic of frequency-dependent selection. Also, no difference was seen in the incidence of the C203Y and the H148D mutations during follow-up (χ² = 8.01, P = 0.33; and χ² = 3.18, P = 0.86, respectively).
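The variant naming used above, in which each protein variant is written as the residues at the 10 variable positions, can be sketched in Python as follows. This is an illustrative reconstruction rather than the authors' code: the function name is hypothetical, and it assumes that the first letter of each listed substitution is the 3D7 reference residue.

```python
# Variable positions and their assumed reference residues, taken from the
# substitutions listed in the text (D67E, N117S, ..., I410M).
REFERENCE_RESIDUES = [
    (67, "D"), (117, "N"), (147, "Y"), (148, "H"), (203, "C"),
    (249, "D"), (305, "D"), (371, "V"), (407, "I"), (410, "I"),
]


def haplotype_string(substitutions):
    """Return the 10-residue haplotype string for a list such as ["C203Y"]."""
    reference = dict(REFERENCE_RESIDUES)
    residues = dict(REFERENCE_RESIDUES)
    for sub in substitutions:
        ref, pos, alt = sub[0], int(sub[1:-1]), sub[-1]
        if reference.get(pos) != ref:
            raise ValueError(f"{sub} does not match a listed variable position")
        residues[pos] = alt
    # Concatenate residues in positional order.
    return "".join(residues[pos] for pos, _ in REFERENCE_RESIDUES)


variant_b = haplotype_string([])          # no substitutions
variant_a = haplotype_string(["C203Y"])   # tyrosine at position 203
```

Under these assumptions, a sequence with no substitutions yields DNYHCDDVII (variant B) and a sequence carrying only C203Y yields DNYHYDDVII (variant A), so the two predominant variants differ at a single residue.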
Genetic diversity parameters were estimated for the whole pfrh5 gene. At this study site, pfrh5 nucleotide diversity (π ± SD) and haplotype diversity (h ± SD) were 0.00061 ± 0.00005 and 0.58 ± 0.027, respectively. All polymorphisms in the data occurred at nonsynonymous sites, with πS = 0 and πN = 0.00075. The average number of pairwise nucleotide differences between sequences in this sample set was 0.74.
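As an illustration of how these diversity parameters are defined, the following Python sketch computes the mean number of pairwise differences, π, and h from a set of aligned sequences. It is not the MEGA6 or DnaSP implementation: the function names and the toy alignment are hypothetical, sequences are assumed to be of equal length without gaps, and h is computed with Nei's unbiased estimator, which is an assumption about the formula used.

```python
from collections import Counter
from itertools import combinations


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """pi: mean pairwise differences divided by the number of aligned sites."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def haplotype_diversity(seqs):
    """h = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy alignment (hypothetical): three haplotypes among four sequences.
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
k = mean_pairwise_differences(toy)
pi = nucleotide_diversity(toy)
h = haplotype_diversity(toy)
```

Because π is the mean number of pairwise differences divided by the number of sites compared, a small number of pairwise differences spread over a gene of more than 1,000 nucleotides necessarily gives a very small π.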
To assess whether positive selection might be responsible for the low diversity observed in pfrh5 and influence its eventual use as a vaccine candidate, the McDonald and Kreitman test was conducted with P. reichenowi used as a source of the divergence estimates to be compared with polymorphism in P. falciparum. Despite the higher rate of nonsynonymous polymorphism relative to synonymous polymorphism, the test does not support a departure from neutral expectation (two-tailed P value = 0.55). This may result from the lack of power due to the small number of polymorphic sites, which was exacerbated by the elimination from the test of three codons with multiple variable positions. Moreover, the estimate of the coefficient of variation for α s generated using a maximum likelihood model (gamma distribution with shape 1) was small (< 0.000001), implying a constant evolution rate among sites. Finally, Tajima's D statistic did not show a clear departure from neutrality (Tajima's D = −1.32; as a rule of thumb, a D statistic with absolute value less than 2 is not considered significant). Although not significant, the strongly negative Tajima's D value may be suggestive of a rapid population expansion following a bottleneck.
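For readers who wish to see how Tajima's D relates to the summary statistics above, the following Python sketch implements the standard formula of Tajima (1989). It is not the DnaSP implementation; the function name is hypothetical, and the inputs in the usage lines (198 sequences, 10 segregating sites, and a mean of 0.74 pairwise differences) are an assumption about which reported values enter the calculation.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size, segregating sites, and mean pairwise differences."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # D contrasts two estimators of diversity: mean pairwise differences
    # and the number of segregating sites scaled by a1.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / math.sqrt(variance)


# Assumed inputs taken from values reported in the text.
d_value = tajimas_d(n=198, seg_sites=10, mean_pairwise_diff=0.74)
```

With these assumed inputs the sketch returns approximately −1.32, in line with the reported statistic. The negative sign arises because the mean number of pairwise differences is smaller than expected from the number of segregating sites, which is the pattern produced by an excess of rare variants.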
Recombination is a process that creates new haplotypes by reassorting variants of a locus. These novel haplotypes may differ from the vaccine allele(s), limiting vaccine efficacy. To evaluate this second source of genetic diversity in the dataset, the contribution of recombination to pfrh5 diversity was estimated. Sequence analyses suggested that the overall recombination parameter (R), which is the recombination rate per generation between the most distant sites of pfrh5, was 77.3. An estimate of R between adjacent sites of pfrh5 was 0.064, whereas the minimum number of recombination events (Rm) in the history of the dataset was one. These findings are indicative of meiotic recombination occurring within the pfrh5 locus. Values of the genetic association estimated between polymorphic sites (Zns) and the effect of intragenic recombination (ZZ) on sequence polymorphism were 0.0216 and 0.0256, respectively, indicative of low linkage disequilibrium between polymorphic sites.
Dynamics of amino acids in relation to malaria symptoms. Participants who had clinical malaria experienced one to five malaria episodes during the 9-month study period. Nearly half (48%) of these participants experienced three malaria episodes. A logistic regression model was used to estimate whether specific pfrh5 alleles were associated with the development of clinical symptoms in consecutive infections. The primary predictor for the analysis was the presence of a specific pfrh5 allele.
Pfrh5 variants and malaria clinical episodes. To investigate whether individual codons or PfRH5 variants affected the risk of clinical malaria episodes, the association between the most prevalent PfRH5 variants and residues and the incidence of clinical malaria was evaluated. Neither variant A (χ² = 2.46, P = 0.12) nor variant B (χ² = 2.45, P = 0.11) was associated with malaria risk. Furthermore, the incidences of the two most prevalent amino acid polymorphisms, H148D and C203Y, did not differ significantly (χ² = 0.001, P = 0.97 and χ² = 5.30, P = 0.02, respectively) between clinical malaria cases and asymptomatic infections after correction for multiple comparisons (significance threshold, P = 0.05/10 = 0.005).
Antigenic diversity of B-cell and T-cell epitopes within PfRH5. To ascertain whether a subset of epitopes may replace the whole PfRH5 antigen as a conserved immunogen, intrinsically unstructured regions (IURs) in addition to B- and T-cell epitopes were identified and screened for the presence of polymorphisms. Three B-cell epitopes, eight major histocompatibility complex (MHC) class I-restricted T-cell epitopes, 175 MHC class II-restricted T-cell epitopes, and three IURs (highly disordered) were identified (Figure 3, Table 3). Eight of 10 mutations in the pfrh5 gene were located in antigenic/epitope regions (χ² = 8.1, P = 0.004) (Figure 3), revealing a significant enrichment of polymorphisms in regions of immunogenic potential. A conserved region (amino acid residues 250-304) located in the middle of the gene was highly disordered (over-representation of few residues with low predicted secondary structure) but also enriched in B-cell epitopes.

DISCUSSION
Plasmodium falciparum reticulocyte-binding homologue-5 is a conserved malaria vaccine candidate that induces antibodies that are protective in in vitro assays, 32 non-human primate studies 7 and field studies. 16,33 Plasmodium falciparum reticulocyte-binding homologue-5 interacts with basigin (CD147), an erythrocyte surface protein, to allow the invasion of the malaria parasite into the erythrocyte. The goal of this study was to assess the genetic diversity, haplotype prevalence, and dynamics of the PfRH5-encoding gene to guide the selection of variants and/or B- and T-cell epitopes that are representative of the parasite population of this study site for possible inclusion in a broadly effective malaria vaccine. Examining the dynamics of antigen diversity in prospective cohort studies can provide evidence to guide the design of vaccines that will be efficacious against diverse targets. Clinical trials have shown that some malaria vaccines based on a single strain induce protection that is specific to the vaccine strain. 5,34-36 Likewise, pneumococcal 37 and influenza 38 vaccine trials have shown the importance of antigen polymorphisms for vaccine efficacy and for the selection of variants to include in vaccine formulations.
In the present study, we analyzed 198 sequences from Kalifabougou, Mali. Power calculations based on sequences extracted from MalariaGEN database show that, with this sample size, we have 83.6% power to identify 85.7% of all alleles present, suggesting that this sample size provides a reliable assessment of population diversity. Two protein variants that represent most of the circulating alleles in this region were identified. In support of previous findings based on small numbers of parasite strains, 10,11,39 our large field study confirmed that pfrh5 was highly conserved in a setting where other vaccine antigens show extensive polymorphism. The nucleotide diversity, as measured by π, was very low (0.00061 ± 0.000012) compared with the diversity observed in the locus encoding the well-known blood-stage AMA1 (0.0163) from the same geographical region. 40 Despite the high rate of polymorphism in nonsynonymous relative to synonymous sites, usually suggestive of positive selection, none of the statistical inference methods used to detect departure from neutrality was indicative of evolution under immune pressure, possibly because the tests were underpowered to detect it in this dataset.
In addition, recombination events that may introduce new protein sequence variants in the locus were rare (only one event observed), as indicated by low R, Rm, Zns, and ZZ statistics. The low effective recombination rate may also be explained by the limited number of variable sites that are not singletons, which prevents the detection of recombination events. The very low values of nucleotide diversity observed in this dataset, not only at nonsynonymous sites (πN) but also at silent sites (πS), are usually indicative of a very recent common ancestor to all sequences, as one might expect from a hard sweep or a recent, strong bottleneck. This observation is consistent with a previous study that suggests that a small chromosomal region containing pfrh5 and three other genes was recently acquired via lateral gene transfer from another Plasmodium lineage, possibly an ancestor of Plasmodium adleri, 41 an event that would impact polymorphism similarly to a very recent bottleneck.
Ten nonsynonymous SNPs were identified in pfrh5, which gave rise to 13 protein variants with frequencies ranging from 0.5% to 57%. In comparison, using sequences from the blood-stage vaccine candidate AMA1, 186 unique amino acid haplotypes were identified in Bancoumana (prevalences ranging from 0.3% to 2.54%) 40 and 214 unique haplotypes were observed in Bandiagara (prevalences from 0.2% to 3.6%), 5 two other towns in Mali. The PfRH5 amino acid polymorphism C203Y, located near the boundary of the region involved in the PfRH5-basigin interaction, was the most variable residue observed in this dataset. None of the codons within the PfRH5-basigin interface (S197, I204, N347, Y358, and E362) 42 were polymorphic. The conservation of these residues is suggestive of the functionally critical role they play in the interaction between PfRH5 and basigin to allow parasite invasion into RBCs. The overall conservation of pfrh5 is highly unusual for a gene that is highly expressed during the blood stage and whose product is exposed to the host's immune system, and that would therefore be expected to exhibit a high degree of amino acid diversity. 43,44 However, as mentioned previously, these observations may be more reflective of the unusual evolutionary history of the pfrh5 locus in P. falciparum 41 than of its interactions with the human immune system.
It is also possible that the protein has limited exposure to the immune system as it has to form a complex with P. falciparum reticulocyte-binding protein homologue-5-interacting protein 45 and P113. 46 Therefore, polymorphisms in this protein may be found at flexible or exposed sites. Findings from Kenya 47 and Mali 16 suggest that anti-PfRH5 antibody concentrations were very low in malaria-exposed subjects compared with those against other blood-stage antigens, providing support for the notion that PfRH5 is either nonimmunogenic or that its immunogenicity is undermined by mechanisms that reduce immune recognition or response. The low antigenicity of PfRH5 in natural settings will likely require the use of a potent adjuvant in an RH5-based vaccine to induce the high antibody levels needed for protection.
Using naturally occurring clinical malaria episodes, we assessed whether specific pfrh5 alleles were associated with clinical disease, which would suggest that certain alleles allow for more efficient erythrocyte invasion and potentially more virulent infections. This study did not identify amino acids that were associated with the parasite's ability to invade RBCs, at least as measured by association with clinical outcomes. All residues at position 204 in this dataset were shown to be wild type, suggesting a high relative fitness of an isoleucine (I) in this position. Although this post hoc analysis was based on a relatively small number of paired consecutive infections, the very low degree of polymorphism in PfRH5 compared with other blood-stage antigens supports the conclusion that not enough time has elapsed for genetic variants to emerge. Identifying epitopes that are conserved and localized in intrinsically unstructured regions of PfRH5 has important implications for vaccine design. Our study revealed 21 B- and T-cell epitopes that are conserved and located in disordered regions across all the sequences from the Kalifabougou site. Based on our data, the inhibition activity of these epitopes and/or fragments should be assessed by functional assays to ascertain their relevance as candidate epitopes of a PfRH5-based vaccine.
We have advocated routinely assessing whether polymorphism in vaccine candidates is immunologically important and therefore relevant to vaccine development. The data reported here on the diversity and dynamics of this antigen's alleles and their association with the hazard of clinical malaria episodes may inform the design of a PfRH5-based vaccine or, more likely, of a multivalent, multistage malaria vaccine that improves on the modest efficacy of the first-generation vaccines. 48 The results of this study suggest that polymorphism in PfRH5 is very limited, stable during consecutive infections that occur over the course of a malaria transmission season, and not associated with clinical risk. This high degree of conservation and lack of association between genetic variation and clinical disease are in sharp contrast to other leading blood-stage vaccine antigens whose efficacy was constrained by allelic escape. 3,5,34,49,50 Although encouraging for PfRH5's prospects as a vaccine, even this limited degree of polymorphism allows for the possibility of vaccine escape and allele-specific efficacy under the intense selection of vaccination, as supported by the apparent signal of frequency-dependent selection. Moreover, although changes in the P. falciparum circumsporozoite protein (CSP) were not associated with clinical risk in paired consecutive infections, 51 the CSP-based vaccine RTS,S nevertheless did have allele-specific efficacy. 6 If the predominant alleles identified in this study are confirmed to predominate in other epidemiological settings, it would be prudent to include both alleles in a vaccine containing PfRH5 to reduce the effect of allele-specific efficacy.