Molecular analysis of HLA-DQB1 alleles in childhood common acute lymphoblastic leukaemia.

Epidemiological studies suggest that childhood common acute lymphoblastic leukaemia (c-ALL) may be the rare outcome of early post-natal infection with a common infectious agent. One of the factors that may determine whether a child succumbs to c-ALL is how it responds to the candidate infection. Since immune responses to infection are under the partial control of human leucocyte antigen (HLA) genes, an association between an HLA allele and c-ALL could provide support for an infectious aetiology. To define the limit of c-ALL susceptibility within the HLA region, we have compared HLA-DQB1 allele frequencies in a cohort of 62 children with c-ALL with 76 newborn controls, using group-specific polymerase chain reaction (PCR) amplification and single-strand conformation polymorphism (SSCP) analysis. We find that a significant excess of children with c-ALL type for DQB1*05 [relative risk (RR): 2.54, uncorrected P=0.038], and a marginal excess for DQB1*0501 (RR: 2.18; P=0.095). Only 3 of the 62 children with c-ALL had the other susceptibility allele, DPB1*0201, as well as DQB1*0501, whereas 15 each had one allele or the other. This suggests that HLA-associated susceptibility may be determined independently by at least two loci, and is not due to linkage disequilibrium. The combined relative risk of the two groups of children with DPB1*0201 and/or DQB1*0501 is 2.76 (P=0.0076). Analysis of amino acids encoded by exon 2 of DQB1 reveals additional complexity, with significant (P<0.05) or borderline-significant increases in Gly26, His30, Val57 and Glu66-Val67 encoding motifs in c-ALL compared with controls. Since these amino acids are not restricted to DQB1*0501, our results suggest that, as with DPB1, the increased risk of c-ALL associated with DQB1 is determined by specific amino acid encoding motifs rather than by an individual allele. These results also suggest that HLA-associated susceptibility to c-ALL may not be restricted to the region bounded by DPB1 and DQB1.

Childhood acute lymphoblastic leukaemia (ALL) is the most frequently occurring leukaemia among children of the developed countries (Breslow and Langholz, 1983;Parkin et al., 1988;Linet and Devesa, 1991). A unique feature of ALL in white Caucasian children is the peak number of cases that occurs between the ages of 2 and 5 years (Stewart, 1961;Parkin et al., 1988;Linet and Devesa, 1991). This first appeared in the UK as a mortality peak after 1920 (Court Brown and Doll, 1961), and has remained to the present as an incidence peak, despite a marked reduction in mortality in the past 3 decades. Immunophenotyping has shown that this early peak is almost entirely due to the B-cell precursor form of common ALL (c-ALL) (Greaves, Pegram and Chan, 1985). Suggestions that the c-ALL peak was unmasked following a decline in pre-emptive childhood mortality from infectious disease between the early 1900s and 1940 (Stewart and Kneale, 1969;Kneale, 1971) have been questioned by Greaves et al. (1991) on the grounds that no similar increase in other subtypes of childhood acute leukaemia has been seen.
The role of infection in the aetiology of childhood ALL has recently been re-examined in relation to its cell biology, the timing of post-natal infection (Greaves and Alexander, 1993) and population mixing (Kinlen et al., 1990). This work suggests that childhood c-ALL may be the rare outcome of a common but unidentified neonatal infection. This poses questions about the contribution of potentially rate-limiting host factors such as genetic susceptibility (Taylor, 1994). In order to determine whether these could involve immune response genes, we previously analysed the frequency of HLA-DPB1 alleles in c-ALL, and found an increased risk in children with HLA-DPB1*0201 [relative risk (RR) 2.1-2.9] (Taylor et al., 1995).
To try to define the physical limit of genetic susceptibility within the HLA region, we have now typed the same cohort of children for alleles at the HLA-DQB1 locus. This is a related HLA class II gene approximately 420 kb telomeric to DPB1 (Trowsdale and Campbell, 1992). DQB1 alleles are associated with susceptibility and resistance to several malignant and non-malignant diseases, notably adult T-ALL (Uno et al., 1988), cervical intraepithelial neoplasia and human papilloma virus (HPV) (Apple et al., 1994), type 1 (insulin-dependent) diabetes mellitus (Todd, Bell and McDevitt, 1987) and melanoma (Lee et al., 1994). We report that children who have DQB1*0501 are at increased risk of c-ALL, and that this does not appear to be due to linkage disequilibrium with DPB1*0201. We also find that, as with DPB1, the increased risk associated with DQB1 is strongest in relation to key amino acid motifs rather than with individual alleles.

Materials and methods
Patients and controls
The patients in this study are the same cohort as that described previously (Taylor et al., 1995) and consist of 62 children (one child from the previous study is excluded) with c-ALL treated at a single centre (Royal Manchester Children's Hospital) in the north west of England, between 1990 and 1992. To minimise bias, only children in whom a diagnosis of c-ALL was confirmed by cytology and immunophenotyping according to the Medical Research Council's 11th ALL trial (UKALL XI) guidelines are included in the analysis. Patients with other leukaemia subtypes, including those with unclassified ALL, are excluded. Blood samples were obtained from children, usually in remission, either in the clinic or at home with parental consent. The blood was anticoagulated with EDTA, and frozen before DNA extraction within 12 h of donation. Control blood samples were obtained from the placental side of the clamped umbilical cord of normal full-term babies delivered at St Mary's Hospital. Preterm deliveries, caesarean sections and pregnancies with complications were excluded.
DNA extraction and amplification
Genomic DNA was extracted from fresh or frozen blood samples using established methods (Blin and Stafford, 1976), and an exon 2 fragment of DQB1 was amplified in the polymerase chain reaction (PCR; Erlich and Arnheim, 1992) using the oligonucleotide primers depicted in Table I. These DQB1 locus-specific primers do not amplify the related but non-expressed DQB2 or DQB3 genes. The primers were designed to amplify four groups of alleles (nomenclature according to WHO guidelines): the *02/*03 group (0201, 0301, 0302, 0303, 0304), the *04 group (0401, 0402), the *05 group (0501, 0502, 0503, 0504) and the *06 group (0601, 0602, 0603, 0604, 0605, 0606, 0607, 0608).
Single-strand conformation polymorphism (SSCP) typing
DQB1 alleles were detected using the SSCP method described by Orita et al. (1989), modified for HLA-DQB1 typing as described by Lo et al. (1992) and Summers et al. (1992). For this, 15 µl of each PCR product was mixed with 2 µl of loading dye (0.5% bromophenol blue, 0.5% xylene cyanol in deionised formamide), and heated to 95°C for 10 min to denature the DNA and reduce the sample volume to 4-5 µl. The samples were loaded onto non-denaturing polyacrylamide gels (PAGs) in a water-cooled vertical mini-electrophoresis unit (Cambridge Electrophoresis, Cambridge, UK). SSCP conditions for all group-specific PCR products were the same, including the ratio of acrylamide-methylbisacrylamide (39:1), which was prepared from stock Easigel mix (Scotlab, Strathclyde, UK) and solid acrylamide (BDH, Lutterworth, UK). SSCP gels were run using 0.5× TBE and 0.8 mm spacers, at 9 mA constant current. SSCP band patterns were visualised using silver staining (Qiagen, Dusseldorf, Germany) following the manufacturer's instructions and photographed using a 35 mm SLR camera without filters and Agfa Ortho film.
Data analysis
DQB1 allele frequencies in c-ALLs and controls were calculated as percentages of total alleles. Phenotype frequencies were determined as the percentage of test subjects with a given allele in either heterozygous or homozygous form. Genotype frequencies were obtained by calculating the proportion of individuals with combinations of two alleles. Relative risks (RRs) were obtained by the cross-product odds ratio method, and these were tested for significance by 2 × 2 contingency table (χ²) analysis using SIMFIT. Since this study was used for hypothesis generation, no correction for the number of alleles tested has been applied.
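The relative risk calculation described above can be sketched as follows. This is a minimal illustration, not the SIMFIT procedure itself: the function names are hypothetical, the cell counts (18 of 62 patients and 12 of 76 controls carrying DQB1*0501) are inferred here from the reported phenotype percentages of 29% and 15.8%, and the use of Yates' continuity correction is an assumption, since the text does not say whether one was applied.

```python
import math


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for the 2 x 2 table [[a, b], [c, d]].

    a, b: cases with / without the allele
    c, d: controls with / without the allele
    """
    return (a * d) / (b * c)


def chi2_yates(a, b, c, d):
    """Chi-square statistic (1 df) with Yates' continuity correction (assumed)."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Counts inferred from the reported percentages (an assumption, not source data).
cases_pos, cases_neg = 18, 62 - 18
ctrl_pos, ctrl_neg = 12, 76 - 12

rr = odds_ratio(cases_pos, cases_neg, ctrl_pos, ctrl_neg)
chi2 = chi2_yates(cases_pos, cases_neg, ctrl_pos, ctrl_neg)
p = p_value_1df(chi2)
print(f"RR = {rr:.2f}, chi2 = {chi2:.2f}, P = {p:.3f}")
```

Under these assumptions the sketch gives RR = 2.18 and P of about 0.095, in line with the values reported for DQB1*0501. The other comparisons in the paper cannot be checked in the same way unless their cell counts are known.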
Results
DQB1 molecular typing
In preliminary studies to optimise DQB1 molecular typing, we used generic primers obtained from the British Society for Histocompatibility and Immunogenetics (BSHI) to amplify exon 2 from genomic template DNA, and BSHI-derived sequence-specific oligonucleotide (SSO) probes to detect specific alleles. However, in our hands, certain DQB1 alleles could not be distinguished using this system. We therefore designed a set of nine primers as shown in Table I and Figure 1, which we used in various combinations to amplify groups of alleles, and used SSCP analysis to distinguish the alleles within each group. Using this method we were also able to confirm and clarify DQB1 allele assignments obtained on HLA homozygous and heterozygous test cells typed with generic primers and SSO probes. All pairs of group-specific primers were found to amplify in the predicted manner according to Table I. No amplification of other allelic groups was obtained, and it was not necessary to introduce deliberate mismatches into the primer sequences to improve specificity.
The results in Figure 2 show SSCP band patterns obtained with the DQB1 *02/*03 group-specific primers on homozygous and heterozygous cells. They are distinct enough to allow alleles to be assigned in heterozygotes. No DNA was available for the following rare alleles: DQB1*0304, *0605-*0608, but these alleles would have been amplified by our group-specific primers and may give distinct bands.
Table footnotes: a, phenotype frequencies in which there is a significant difference between c-ALLs and controls; b, relative risk (RR) for which there is borderline significance between c-ALL and controls; *, P-value.
DQB1 allele frequency in childhood c-ALL
Details of the children with c-ALL who were typed for DQB1 in this study are given in a previous paper (Taylor et al., 1995). Briefly, 62 children with a confirmed diagnosis of c-ALL, consisting of 36 boys and 26 girls (M/F 1.42:1) with a mean age of 5.6 years and age range of 1-13 years, constituted the patient cohort. About one-third of the girls, but nearly half of the boys, were diagnosed at 3-4 years of age. Eight of the 62 children with c-ALL were from ethnic minorities, mainly of Asian origin. The present cohort of c-ALLs comprises 83% of children with ALL ascertained by the Manchester Children's Tumour Registry during the study period. The remainder of the ALLs were excluded from the analysis as being other subtypes, or not confirmed as c-ALL. The control series used here consists of cord blood samples from 76 Caucasian newborn infants delivered in St Mary's between 1990 and 1992. The only subjects excluded from this cohort were preterm and caesarean deliveries and infants with congenital abnormalities. A total of 14 DQB1 alleles were identified by SSCP typing of the patient and control groups and, of these, 12 alleles were detected in the c-ALLs and ten in the controls. Alleles missing both from patients and controls included *0401 and *0504, whereas *0502 and *0503 were absent from the control series only.
A comparison of allele frequencies (Table II) shows a deficit of DQB1*0201 in c-ALLs compared with the controls (c-ALLs, 24.2% vs controls, 37.5%), but an excess of *0501 in c-ALLs (c-ALLs, 16.9% vs controls, 9.2%). There is no difference in *0602 frequency in c-ALLs and controls, but there is an excess of the combined *0601, *0603 and *0604 alleles in c-ALLs (c-ALLs, 17% vs controls, 9.3%). In addition, the combined frequency of *0302 and *0303 in c-ALLs is about half that in the controls (c-ALLs, 7.2% vs controls, 15.1%).
The deficit of *0201 and excess of *0501 alleles in c-ALL is seen more informatively in the phenotype frequency analysis (Table II), which compares the frequency of patients and controls who carry specific alleles. There is a deficit of 11.8% of children with c-ALL who have DQB1*0201, compared with infant controls (c-ALLs vs infants, 43.5% vs 55.3%), although this is not significant. However, 13.2% more children with c-ALL have *0501 (c-ALLs vs infants, 29% vs 15.8%), and this reaches borderline significance. Moreover, if we combine phenotype frequencies of all *05 series alleles (*0501, *0502, *0503), the cumulative RR (2.54) is significant (P=0.038). In addition, cumulative phenotype frequencies indicate that over twice as many children with c-ALL type for *0601, *0603 or *0604 (c-ALLs vs infants, 33.9% vs 15.8%), and nearly half as many type for *0302 and *0303 (14.5% vs 26.4%).
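The distinction between allele and phenotype frequencies used in Table II can be made concrete with a short sketch. The genotypes below are invented toy data, not study data, and the function names are hypothetical. The sketch only illustrates that a homozygote contributes two alleles to the allele count but is counted once as a carrier.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Percentage of all alleles (two per subject) accounted for by each allele."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * len(genotypes)
    return {allele: 100 * n / total_alleles for allele, n in counts.items()}


def phenotype_frequencies(genotypes):
    """Percentage of subjects carrying each allele, homozygotes counted once."""
    counts = Counter(allele for genotype in genotypes for allele in set(genotype))
    return {allele: 100 * n / len(genotypes) for allele, n in counts.items()}


# Invented toy genotypes for four subjects (illustration only).
toy_genotypes = [
    ("0501", "0201"),
    ("0201", "0201"),  # homozygote: two alleles, one carrier
    ("0501", "0602"),
    ("0301", "0602"),
]

allele_freq = allele_frequencies(toy_genotypes)
phenotype_freq = phenotype_frequencies(toy_genotypes)
print(allele_freq["0201"], phenotype_freq["0201"])
```

In this toy set *0201 makes up three of eight alleles but is carried by two of four subjects, which is why the two measures can diverge when homozygotes are common.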
Since there are nearly 50% more boys than girls in our c-ALL cohort, we examined whether there might be a difference in DQB1 allele and phenotype frequencies when the sexes were compared. Table III shows that the frequency of boys who type for *0501 is over twice that of girls (boys, 36.8% vs girls, 16.7%, RR, 2.92), whereas the frequency of girls who type for *0602 is over three times that of boys (girls, 33.3% vs boys, 10.5%, RR, 0.24, P=0.05).
Table footnotes: a, DQB1 phenotypes in which there is > 10% difference in frequency in females compared with males; b, relative risks for alleles showing an increased or decreased frequency in c-ALL; *, P-value.
Genotype frequencies in c-ALLs and infant controls are compared in Table IV. The most striking difference is the percentage homozygosity (c-ALL, 19.3% vs infants, 42%, P=0.0016). This is largely the result of a marked difference in the frequency of *0201 homozygotes (c-ALL vs infants, 4.8% vs 19.7%). However, the cumulative frequency of non-*0201 homozygotes is also less in c-ALL than it is in the controls (c-ALLs, 14.5% vs controls, 22.3%). Cumulative frequencies of certain heterozygotes, such as *0501/*0201 and *0501/*0301 show quite marked differences (c-ALL, 16.2% vs controls, 6.5%).
It is possible that the increased risk of c-ALL associated with DQB1*0501 could be due to linkage disequilibrium with DPB1*0201, the allele previously shown to be associated with an increased risk of c-ALL in the same cohort (Taylor et al., 1995). When we analysed the frequency of patients with both alleles, however (Table V), we found that only 3 of the 62 children with c-ALL typed for DPB1*0201 and DQB1*0501, 29 had neither allele and 15 each had either DPB1*0201 or DQB1*0501. Although there is no significant difference in the frequency of c-ALLs and controls with both alleles or DPB1*0201 alone, the increased frequency of c-ALLs with DQB1*0501 but lacking DPB1*0201 is borderline, and there is a significant decrease in patients with neither allele compared with controls. Taking the cumulative frequency of c-ALLs with one or both candidate susceptibility alleles in comparison with the controls (c-ALLs, 53% vs controls, 29%), the combined RR is 2.76 (P=0.0076).
The DQB1 phenotype analysis in Table II shows that the frequency of c-ALL children with alleles other than DQB1*0501 is also increased. It is possible that this is owing to associations with amino acids common to more than one allele, as we found for DPB1. We therefore analysed the DQB1 data by calculating the frequency of selected exon 2 encoded amino acids in the patient and control groups. Amino acids at seven positions were compared (Table VI), of which three (positions 26, 30 and 57) consist of 3-4 single amino acids, and four positions (13-14, 66-67, 70-71, 86-87) consist of two amino acids. For alleles we scored all motifs, but for phenotypes we have again expressed the results as the percentage of subjects with that motif (heterozygotes and homozygotes are counted once). This means that a person can be heterozygous for a DQB1 allele, but homozygous for an amino acid. Table VI shows that there are significant or borderline differences between c-ALLs and infant controls for all positions, with RR values ranging from 2.36 to 3.80. Taken together, the results show increases in Gly26, His30, Val57 and Glu66-Val67 in c-ALL.
Table footnotes: a, DQB1*0501 genotypes with an excess of c-ALLs; b, homozygotes.
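The motif scoring described above, in which a subject who is heterozygous at the allele level can still be homozygous for an amino acid, can be sketched as follows. The position-57 residue assignments in the lookup table are an illustrative assumption and should be checked against a published DQB1 sequence alignment before any real use. The genotypes are invented toy data and the function names are hypothetical.

```python
# Residue at position 57 for a few DQB1 alleles.
# Illustrative assumption only: verify against a curated sequence alignment.
POS57 = {"0201": "Ala", "0301": "Asp", "0501": "Val", "0602": "Asp", "0604": "Val"}


def motif_allele_frequency(genotypes, motif_of, motif):
    """Percentage of all alleles that encode the given motif."""
    hits = sum(motif_of[allele] == motif for g in genotypes for allele in g)
    return 100 * hits / (2 * len(genotypes))


def motif_phenotype_frequency(genotypes, motif_of, motif):
    """Percentage of subjects with the motif on at least one allele.

    A subject carrying the motif on both alleles is counted once.
    """
    carriers = sum(any(motif_of[allele] == motif for allele in g) for g in genotypes)
    return 100 * carriers / len(genotypes)


# Invented toy genotypes (illustration only).
toy_genotypes = [
    ("0501", "0604"),  # heterozygous for the allele, homozygous for Val57
    ("0501", "0201"),
    ("0301", "0602"),
    ("0201", "0201"),
]

val57_alleles = motif_allele_frequency(toy_genotypes, POS57, "Val")
val57_subjects = motif_phenotype_frequency(toy_genotypes, POS57, "Val")
print(val57_alleles, val57_subjects)
```

Here the first toy subject carries two different alleles that both encode Val57, so the subject adds two to the motif allele count but only one to the motif phenotype count.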

Discussion
In our previous study we found that children with HLA-DPB1*0201 are at about twice the normal risk of developing common ALL (Taylor et al., 1995). Since there is a possibility that DPB1*0201 is not itself the c-ALL susceptibility gene, but is in linkage disequilibrium with some other HLA or non-HLA gene, we carried out further molecular studies on the same cohort to try to define the physical limit of susceptibility to c-ALL in the HLA region. The results reported here show that the risk in children who type for the DQB1*05 allele series is similar to that for DPB1*0201. More importantly, this does not seem to be due to linkage disequilibrium between DPB1*0201 and DQB1*0501, since only 3 of the 62 patients typed for both alleles. Lack of linkage disequilibrium between DPB1*0201 and DQB1*0501 is supported by other studies (Baisch and Capra, 1993). Whereas it is still possible that a susceptibility allele linked to these alleles at both loci maps to the DPB1-DQB1 interval, we suggest that susceptibility is conferred independently by the two alleles. Moreover, alignment of the amino acids involved in increased DPB1 susceptibility (Val6, Asp55, Glu69) does not provide evidence for an overlap with amino acids involved in DQB1 susceptibility.
Table footnotes: a, amino acid position coded by exon 2; b, amino acid motifs significantly increased in c-ALL; c, phenotypes significantly increased in c-ALL; d, significant difference (P < 0.05).
It is not as easy to distinguish DQB1 alleles by generic PCR amplification and SSO probe hybridisation as it is for DPB1 alleles. For this reason, we designed a series of nine DQB1 group-specific primers, which were used in various combinations to amplify four allelic groups (*02/*03, *04, *05 and *06). In conjunction with SSCP analysis of the group-specific PCR products we were able to distinguish 14 DQB1 alleles in heterozygotes. Five alleles were not detected in our patient and control series, including *0304 and *0605-*0608, although it is unlikely that they would not be detected by our typing system. DQB1*0304 shares sequences with primers P3a and P3R, and should therefore be amplified by them. Since it differs at several polymorphic positions from other *03 alleles, it should have been resolved by SSCP. DQB1*0605-*0608 should also amplify with the *06 group-specific primers, since P3b is homologous to these alleles at the 5' end. Although there are no sequence data for the 3' end of these alleles, we would expect them to resemble the other alleles in the series. The possibility that they may not separate from the other alleles in the group (*0601-*0604) by SSCP would have made little difference to the overall result had they been present.
A number of molecular typing techniques based on PCR-mediated amplification have been devised to detect DQB1 alleles. In an early method (Bugawan and Erlich, 1991), a pair of DQB1 generic primers and 16 SSO probes were used to distinguish 15 DQB1 alleles. However, other authors using this method have favoured a combination of group-specific amplification and SSO-based typing (Malkentin et al., 1991) to improve resolution. The exacting conditions required for SSO probe hybridisation have led to the development of alternative typing methods, including sequence-specific primer (SSP)-based amplification (Olerup, Aldener and Fogdell, 1993;Salazar et al., 1993). While this has the benefit of being both rapid and high resolution, novel alleles within the amplified sequence could be missed, and sequence variations outside this region would go undetected. For disease association studies, in which time is less important than in donor-recipient transplant matching, PCR-SSCP using group-specific amplification combines a low-resolution PCR-SSP approach with the sequence-dependent resolving power of PAG electrophoresis. This approach was first demonstrated for DQB1 by Lo et al. (1992), and the value of group-specific amplification documented by Carrington et al. (1992). Although we have used silver staining to assign alleles, our method should lend itself to more rapid typing using fluorescent primers in conjunction with an automated sequencer.
We have not corrected our results for the number of allele comparisons, so we cannot at present completely rule out that the DQB1*05 association occurred by chance. The difficulty of providing conclusive proof of an increased genetic risk due to DQB1*05 in c-ALL is not one of ascertainment, because our patients are an unbiased prospective group from a single geographical region. The problem concerns the relative rarity of this disease, despite the fact that childhood c-ALL is itself the commonest childhood malignancy.
At the level of risk found here, we would have needed at least twice the number of patients to verify the association. However, our results provide a useful basis for hypothesis generation and a preliminary answer to the question of whether the DPB1*0201 association is due to linkage disequilibrium.
If genetic susceptibility to c-ALL is conferred by DQB1*05, it suggests some degree of specificity in the interaction with an environmental co-factor. Since there is no evidence from our studies so far that DPB1 or DQB1 alleles in childhood c-ALL are mutated in the germ line, we assume that the increased risk is conferred by the normal allele itself. This raises the question as to why a relatively common HLA allele should increase the risk of a previously lethal childhood disease such as c-ALL, a clear selective disadvantage to the allele. We previously suggested in the context of DPB1 that one explanation for this could be that, since HLA alleles are neither necessary nor sufficient to cause leukaemia, they may confer protection against some numerically more important infectious agent. Thus, c-ALL high-risk HLA alleles might be less effective in protecting against a leukaemogenic infection caused by molecular mimicry of the susceptibility allele by the infectious agent, but be effective against some other infection. Since the transformation of pre-B progenitor cells into c-ALL requires at least two mutations, it seems unlikely that HLA alleles determine whether a fetus will succumb to an in utero infection. Such an infection might be expected to be associated with birth abnormalities in children with leukaemia.
A different interpretation is suggested by the Greaves hypothesis (Greaves and Alexander, 1993). This postulates that the chance of a second mutation in a preleukaemic pre-B progenitor population is promoted by post-natal infection-induced cell division. HLA alleles associated with a risk of c-ALL might in this case be the target of, or induce, the immunostimulatory signals which lead to the increased division of the pre-leukaemic population. If an infectious agent contained immunogenic peptides capable of binding to alleles at more than one HLA class II locus, this might maximise immune recognition but also enhance pre-B stimulation. Such a phenomenon might explain the apparent independence of DPB1- and DQB1-associated susceptibility to c-ALL.
We found a difference in the percentage of boys and girls with DQB1*0501 (RR M:F, 2.92) and DQB1*0602 (RR M:F, 0.24). This sex difference in the frequency of a susceptibility allele was absent in our previous analysis of DPB1 (Taylor et al., 1995). One explanation for the sex difference could be that there are subtle male-female differences in genetic susceptibility to immune-associated disease, as previously suggested (Purtilo, 1979). Since there is an excess of males over females with c-ALL in our series, it is possible that X-linked immunoregulatory genes influence the ability of males and females to respond to HLA-associated immunogenic peptides.
Genotype analysis revealed a higher frequency of DQB1*0201 homozygosity in the controls than in the patients. We can exclude typing artifact due to the misidentification of heterozygotes as homozygotes, as both series were typed at the same time, using the same reagents and techniques. We can also exclude increased HLA homozygosity due to consanguineous ethnic minority parents, since we excluded Asian infants from the control series. The result suggests that *0201 homozygosity may have a protective effect in c-ALL. It is of interest to note that none of the amino acid sequence motifs found in *0501, or in any of the other alleles associated with an increased risk of c-ALL, is also found in *0201. More specifically, the serine residue at position 30 and arginine-lysine at positions 70-71 of *0201 both differ from the motifs associated with DQB1 susceptibility to c-ALL, and may be considered as conferring protection.
The amino acid motifs associated with an increased risk of c-ALL include glycine26, histidine30, valine57 and glutamic acid66-valine67. Of the four motifs at position 57, aspartic acid (Asp57) is present in only 48% of children with c-ALL compared with 64% of infant controls, and is negatively associated with c-ALL. It has been known for some time that Asp57 confers resistance to insulin-dependent diabetes mellitus (IDDM) (Todd et al., 1987;Morel et al., 1988;Tosi et al., 1994), although the presence or absence of this motif is not sufficient to account for inherited susceptibility (Thomson et al., 1988;Kockum et al., 1993). Although there are few similarities between c-ALL and IDDM other than their occurrence in childhood, the possibility that a common infectious agent, or a similar immunoregulatory defect with a different outcome, is involved would repay further study.