The 1995 Walter Hubert Lecture--molecular epidemiology of human cancer: insights from the mutational analysis of the p53 tumour-suppressor gene.


Classical epidemiological studies have identified populations and families at increased cancer risk. Molecular epidemiology has a more ambitious goal, i.e. the identification of individuals at high cancer risk in these cancer-prone populations and families. Achieving this goal is challenging both current molecular technologies and epidemiological designs, and exposing bioethical dilemmas.
The two facets of molecular epidemiology of human cancer risk are the assessment of carcinogen exposure and of inherited or acquired host cancer-susceptibility factors (reviewed in Harris, 1991; Perera and Santella, 1993). The interaction between these two facets determines an individual's cancer risk. This paradigm can also improve cancer risk assessment (Figure 1). When combined with carcinogen bioassays in laboratory animals and classical epidemiology, molecular epidemiology can contribute to the four traditional aspects of cancer risk assessment: hazard identification, dose-response assessment, exposure assessment and risk characterisation. Improved cancer risk assessment has broad public health and economic implications (National Research Council, 1994).
Weighty bioethical consequences follow the identification of high-risk individuals (Li et al., 1992). The bioethical issues include autonomy, privacy, justice and equity (Figure 1). One can argue that the knowledge of one's risk can be beneficial. However, more encompassing bioethical issues arise, such as an individual's responsibility to family members and psychosocial concerns regarding the genetic testing of children (Li et al., 1992). Therefore, the uncertainty of the current individual risk assessments and the limited availability of genetic counselling services dictate caution and, many argue, the restriction of genetic testing to those conditions amenable to preventative or therapeutic intervention.
This lecture will discuss cancer susceptibility genes as inherited host factors and then focus on the mutational spectrum of the p53 tumour-suppressor gene and the testing of hypotheses generated by the analysis of this gene. The discussion of the p53 gene will complement the excellent 1993 Walter Hubert Lecture presented by Arnold Levine.
Correspondence: CC Harris. Received 27 July 1995; accepted 18 August 1995.
The investigation of rare cancer-prone families has led to the identification of germline mutations in genes that are frequently somatically mutated in sporadic cancers. Examples of these syndromes are listed in Table I. The altered genes encode proteins that perform diverse cellular processes, including transcription, cell cycle control, xenobiotic metabolism and DNA repair. The increased cancer risk of an individual carrying one of these germline mutations can be extraordinarily high, e.g. more than 1000-fold in xeroderma pigmentosum (complementation groups A-G) (Figure 2). However, high-risk inherited conditions are rare in the general population and number only a few cases in 10^5 live births. More common inherited cancer-susceptibility conditions, e.g. deficiencies in the N-acetyltransferase (NAT) genes or glutathione S-transferase genes, may contribute a more substantial attributable risk in a carcinogen-exposed population.
Germline mutations in the recently identified cancer-susceptibility genes involved in breast-ovarian cancer (BRCA1) and hereditary non-polyposis colorectal cancer occur at an intermediate rate of one in several hundred live births in the general population. The frequency of these cancer-susceptibility genes and their attributable cancer risk are important considerations in developing public health policy for genetic screening of the general population. Different public health and bioethical considerations apply to the genetic screening of family members of individuals carrying a high cancer risk allele in their germ line (Li et al., 1992).
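The contrast between rare high-risk conditions and more common susceptibility conditions can be made concrete with a small calculation. The sketch below uses Levin's formula for the population attributable fraction, which is an assumption introduced here for illustration and is not taken from the lecture. The function name is hypothetical, and the input values are placeholders: the rare condition borrows the orders of magnitude quoted above (about 1 in 10^5, 1000-fold risk), while the carrier frequency of 0.5 and relative risk of 1.5 for the common condition are invented for the example.

```python
def population_attributable_fraction(carrier_frequency: float,
                                     relative_risk: float) -> float:
    """Levin's formula: PAF = p(RR - 1) / (1 + p(RR - 1)).

    carrier_frequency (p) is the fraction of the population carrying the
    susceptibility factor; relative_risk (RR) compares carriers with
    non-carriers. Both are assumed inputs, not values from the lecture.
    """
    if not 0.0 <= carrier_frequency <= 1.0:
        raise ValueError("carrier_frequency must lie between 0 and 1")
    if relative_risk < 1.0:
        raise ValueError("this sketch only handles relative_risk >= 1")
    excess = carrier_frequency * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical rare, highly penetrant condition.
rare_high_risk = population_attributable_fraction(1e-5, 1000.0)

# Hypothetical common, weakly penetrant condition.
common_low_risk = population_attributable_fraction(0.5, 1.5)

print(f"rare, high-risk condition:  {rare_high_risk:.4f}")
print(f"common, low-risk condition: {common_low_risk:.4f}")
```

Under these assumed inputs the common condition accounts for a much larger share of cases in the population, even though each carrier's individual risk is far lower.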

Mutational spectra of tumour-suppressor genes
Mutational spectrum analysis, the study of the types and locations of DNA alterations, describes the often characteristic patterns of DNA changes caused by endogenous and exogenous mutagens. Alterations of cancer-related genes found in tumours not only represent the interactions of carcinogens with DNA and cellular DNA repair processes, but also reflect the selection of those mutations that provide premalignant and malignant cells with a clonal growth advantage. Study of the frequency, timing, and mutational spectra of p53 and other cancer-related genes provides insights into the aetiology and molecular pathogenesis of cancer and generates hypotheses for future investigations. These include questions regarding carcinogen-DNA interactions, functions of the affected gene products, mechanisms of carcinogenesis in specific organs or tissues and features of general cell biology, such as DNA replication and repair.
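As a minimal sketch of what tallying a mutational spectrum involves, the following code classifies single-base substitutions as transitions or transversions and counts each class. The function names and the example list are hypothetical and carry no real data; only the G to T change is borrowed from the text, which describes the G:C to T:A transversion in codon 249.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid base substitution: {ref!r}>{alt!r}")
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def tally_spectrum(substitutions):
    """Count substitution classes in an iterable of (ref, alt) pairs."""
    return Counter(classify_substitution(ref, alt) for ref, alt in substitutions)


# Invented example list; not drawn from any p53 database.
example = [("C", "T"), ("G", "A"), ("G", "T"), ("C", "T"), ("A", "C")]
print(tally_spectrum(example))
```

A real analysis would also record the position and sequence context of each change, since the text stresses that both the type and the location of a mutation are informative.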
The types of mutations in tumour-suppressor genes are most frequently nonsense mutations, deletions and insertions that produce either an absent or a truncated protein product. These mutations are clearly 'loss of function' mutations. The p53 tumour suppressor has an unusual spectrum of mutations when compared with other suppressor genes, e.g. APC, BRCA1 or p16INK4 (Figure 3). Missense mutations, in which the encoded protein contains amino acid substitutions, are commonly found in the p53 tumour-suppressor gene. The missense class of mutations can cause both a loss of tumour-suppressor function and a gain of oncogenic function by changing the repertoire of genes whose expression is controlled by this transcription factor (Lane and Benchimol, 1990; Dittmer et al., 1993; Hsiao et al., 1994). The p53 gene was initially classified as an oncogene, until it was discovered in the late 1980s that the cDNAs cloned from murine and human tumour cells contained missense mutations; it was correctly classified when a true wild-type p53 gene construct suppressed the growth of tumour cells (Eliyahu et al., 1989; Finlay et al., 1989; Baker et al., 1990; Diller et al., 1990; Mercer et al., 1990; Chen et al., 1991; Cariello et al., 1994). This Dr Jekyll and Mr Hyde duality may be one explanation of the remarkable frequency of p53 mutations in human cancer. The p53 gene is well suited to mutational spectrum analysis for several reasons. First, since p53 mutations are common in many human cancers, a sizable database of about 5000 entries has accrued, the analysis of which can yield statistically valid conclusions. Second, the modest size of the p53 gene (11 exons, 393 amino acids) permits study of the entire coding region, and the gene is highly conserved in vertebrates, allowing extrapolation of data from animal models (Soussi et al., 1990).
Point mutations that alter p53 function are distributed over a large region of the molecule, especially in the hydrophobic midportion (Hollstein et al., 1991; Levine et al., 1991; Greenblatt et al., 1994), where many base substitutions alter p53 conformation and sequence-specific transactivation activity; thus, correlations between distinct mutants and functional changes are possible. Frameshift and nonsense mutations that truncate the protein can be located outside of these regions, so evaluation of the entire DNA sequence yields relevant data. This situation differs from that of the ras oncogenes, whose transforming mutations occur primarily in three codons, a few sequence motifs and a critical functional domain (Park and Vande Woude, 1989). The diversity of p53 mutational events permits more extensive inferences regarding mechanisms of DNA damage and mutation.

Molecular archaeology of p53 mutations
Mutations can arise by either endogenous mutagenic mechanisms or exogenous mutagenic agents and are archived in the spectrum of p53 mutations found in human cancer (Hollstein et al., 1991; Levine et al., 1991; Harris, 1993; Greenblatt et al., 1994; Soussi et al., 1994). ... more identical bases or at repeats of 2- to 8-base-pair DNA motifs, either in tandem or separated by a short intervening sequence. Several mechanisms are probably involved (Ripley, 1990). The mechanism that has been most studied is called slipped mispairing, a misalignment of the DNA strands during replication that leads to either deletion, if the nucleotides excluded from pairing are on the template strand, or insertion, if they are on the primer strand. When direct repeat sequences mispair with a complementary motif nearby, the intervening oligonucleotide sequence may form a loop between the two repeat motifs and be deleted (Jego et al., 1993; Krawczak and Cooper, 1991). Longer runs and sequence repeats are more likely to generate frameshift mutations. The detection of errors in replication of reporter genes has helped quantify this phenomenon (Kunkel, 1993). The deletions and insertions in the p53 gene found in human tumours also may be biologically selected from the broad array of such mutations occurring in human cells. When compared with the distribution of missense mutations, these types of mutations occur more frequently in exons 2-4 (54%) and 9-11 (77%) than in exons 5-8 (20%) (Figure 4).
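The sequence contexts described above (runs of identical bases and tandem repeats of 2- to 8-base-pair motifs) can be located with simple pattern matching. The sketch below is illustrative only: the function names, the `min_len` threshold and the example sequence are hypothetical, and repeats separated by a short intervening sequence are not handled.

```python
import re

# A motif of 2-8 bases followed by at least one further copy of itself.
TANDEM = re.compile(r"([ACGT]{2,8}?)\1+")


def find_runs(seq: str, min_len: int = 3):
    """Return (start, base, length) for runs of one base of at least min_len."""
    if min_len < 2:
        raise ValueError("min_len must be at least 2")
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


def find_tandem_repeats(seq: str):
    """Return (start, motif, copies) for tandem repeats of 2-8 bp motifs.

    Motifs made of a single repeated base are skipped, because those are
    reported by find_runs instead.
    """
    hits = []
    for m in TANDEM.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) > 1:
            hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


# Invented sequence, not taken from p53.
example = "GATCCCCGTACACACTG"
print(find_runs(example))
print(find_tandem_repeats(example))
```

Such a scan only flags candidate sites for slipped mispairing; it says nothing about which of them are actually mutated or selected in tumours.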
The N-terminus of the p53 protein (encoded by exons 2-4) (reviewed in Vogelstein and Kinzler, 1992a; Liu et al., 1993; Lu and Levine, 1995; Thut et al., 1995) has an abundance of acidic amino acids that are involved in the transcriptional function of p53 (Fields and Jang, 1990; Raycroft et al., 1990) and binds to transcription factors such as the TATA box-binding protein (TBP) in transcription factor complex IID (TFIID) (Seto et al., 1992; Liu et al., 1993; Mack et al., 1993; Martin et al., 1993; Truant et al., 1993). Experimental studies have shown that multiple point mutations are required to inactivate its transcriptional transactivation function (Lin et al., 1994). The carboxy terminus (encoded by exons 9-11) of the p53 protein is enriched in basic amino acids that are important in the oligomerisation and nuclear localisation of the p53 protein (reviewed in Clore et al., 1994; Lee et al., 1994; Hupp and Lane, 1995; Jeffrey et al., 1995), recognition of DNA damage (Bakalkin et al., 1994; Jayaraman et al., 1995) and induction of apoptosis (XW Wang and CC Harris, unpublished results). Multiple point mutations are infrequently found in the p53 gene, which is consistent with the target theory, i.e. exogenous mutagens target the p53 gene within the context of the entire human genome. Therefore, deletions and insertions would be a more efficient mutagenic mechanism than single point mutations in disrupting these N-terminal and C-terminal functional domains.
Deamination of DNA is a spontaneous chemical process (Figure 5). For example, 5-methylcytosine comprises about 3% of the deoxynucleotides, occurs primarily at CpG dinucleotides, and can deaminate to form thymidine. If the resulting G-T mismatch is not repaired, C to T transitions arise. Deamination of cytosine can also generate C to T transitions if uracil glycosylase and G-T mismatch repair are inefficient.
Oxyradicals can enhance the rate of the deamination reaction (Wink et al., 1991;Nguyen et al., 1992) so that the production of nitric oxide by inducible nitric oxide synthase could contribute to this endogenous mechanism of mutagenesis.
The missense mutations in the p53 gene are non-random. Five of the six mutational hotspots in the p53 gene occur at CpG dinucleotides in codons encoding the basic amino acid arginine (Figure 6). These mutational hotspots are at sites that are essential for maintaining the interface between the p53 protein and its DNA consensus site responsible for DNA binding and transcriptional activity. This structure-function relationship became readily apparent when the crystal structure of the p53 protein was compared with the p53 mutation spectrum (Cho et al., 1994; Prives, 1994).
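The deamination mechanism described above predicts transitions at CpG dinucleotides: C to T at the cytosine, or G to A at the guanine when the deaminated cytosine lies on the opposite strand. A minimal sketch of how such changes could be flagged is given below; the function name and the example sequence are hypothetical, and the sequence is not taken from p53 (it simply contains CGG, one of the arginine codons that include a CpG).

```python
def is_cpg_transition(seq: str, pos: int, alt: str) -> bool:
    """Return True if changing seq[pos] to alt is a transition at a CpG site.

    Covers C>T at the C of a CpG and G>A at the G of a CpG, the latter
    reflecting a C>T change on the complementary strand.
    """
    seq, alt = seq.upper(), alt.upper()
    if not 0 <= pos < len(seq):
        raise IndexError("pos is outside the sequence")
    ref = seq[pos]
    if ref == "C" and alt == "T":
        return seq[pos + 1:pos + 2] == "G"
    if ref == "G" and alt == "A":
        return pos > 0 and seq[pos - 1] == "C"
    return False


# Invented sequence: ATG CGG ACC (the CGG codon contains a CpG).
example = "ATGCGGACC"
print(is_cpg_transition(example, 3, "T"))  # C of the CpG changed to T
print(is_cpg_transition(example, 4, "A"))  # G of the CpG changed to A
print(is_cpg_transition(example, 5, "A"))  # G that is not part of a CpG
```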
Figure 3 Classes of mutations found in tumour-suppressor genes.
The narrow mutational spectra exhibited by some mutagens have popularised the idea that each agent might leave a specific identifying 'fingerprint' of site and type of DNA damage (Vogelstein and Kinzler, 1992b). It is probably more realistic to expect that carcinogens will produce mutation patterns that are characteristic and instructive but not as unique as fingerprints. Examples of associations between exposure to carcinogens and p53 mutational spectra in human cancer are shown in Table II. The induction of skin carcinoma by ultraviolet light is indicated by the occurrence of p53 mutations at dipyrimidine sites, including CC to TT double base changes (Brash et al., 1991; Ziegler et al., 1994). The high frequency of CC to TT transitions in the non-transcribed DNA strand is a reflection of strand-specific repair of the p53 gene (Evans et al., 1993). Since patients with xeroderma pigmentosum group C have a severe deficiency in nucleotide excision repair of the non-transcribed strand of DNA, one would expect a higher frequency of CC to TT transitions with a coding strand bias in skin carcinomas from these individuals (Evans et al., 1993). This prediction has proven to be correct (Figure 7) (Dumaz et al., 1994). Mutations of dipyrimidine sites in skin carcinomas also show a non-random distribution among sites within the p53 gene. Recent studies have shown that rates of cyclobutane dimer repair vary among codons within p53 (Tornaletti and Pfeifer, 1994). Therefore, preferential strand repair and preferential sequence repair of the actively transcribing p53 gene influence the p53 mutational spectrum of UV-induced skin carcinomas.
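The ultraviolet pattern described above rests on two sequence features: mutations at dipyrimidine sites and tandem CC to TT changes. The sketch below shows one way these could be recognised in aligned reference and tumour sequences of equal length. All names and sequences are hypothetical, and the strand handling is a simplifying assumption: a GG to AA change on the reported strand is counted as CC to TT on the complementary strand.

```python
PYRIMIDINES = {"C", "T"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def at_dipyrimidine(seq: str, pos: int) -> bool:
    """True if seq[pos] is a pyrimidine with an adjacent pyrimidine."""
    left = seq[pos - 1] if pos > 0 else ""
    right = seq[pos + 1] if pos + 1 < len(seq) else ""
    return seq[pos] in PYRIMIDINES and (left in PYRIMIDINES or right in PYRIMIDINES)


def at_dipyrimidine_either_strand(seq: str, pos: int) -> bool:
    """Check the reported strand and its complement at the same position."""
    seq = seq.upper()
    return at_dipyrimidine(seq, pos) or at_dipyrimidine(seq.translate(COMPLEMENT), pos)


def tandem_cc_to_tt_sites(ref: str, mut: str):
    """Return start indices of CC>TT changes (or GG>AA, the opposite strand)."""
    ref, mut = ref.upper(), mut.upper()
    if len(ref) != len(mut):
        raise ValueError("sequences must be aligned and of equal length")
    pairs = {("CC", "TT"), ("GG", "AA")}
    return [i for i in range(len(ref) - 1)
            if (ref[i:i + 2], mut[i:i + 2]) in pairs]


# Invented sequences, not taken from p53.
print(tandem_cc_to_tt_sites("ATCCGA", "ATTTGA"))
print(at_dipyrimidine_either_strand("AGGA", 1))
```

Assessing the strand bias discussed in the text would additionally require knowing which strand is transcribed, which this sketch does not model.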
The p53 mutational spectrum of hepatocellular carcinoma is a second example of a molecular linkage between carcinogen exposure and cancer. In liver tumours from persons ... observed in hepatocellular carcinoma from Asia, Africa and North America (Figure 8). The mutation load of 249Ser mutant cells in non-tumorous liver is also positively correlated with dietary aflatoxin B1 (AFB1) exposure (Aguilar et al., 1994). Exposure of human liver cells to aflatoxin B1 in vitro produces 249Ser (AGG to AGT) p53 mutants (Aguilar et al., 1993; K Mace, F Aguilar, CC Harris and GP Pfeifer, unpublished results). These results indicate that expression of the 249Ser mutant p53 protein provides a specific growth and/or survival advantage to liver cells and are consistent with the hypothesis that p53 mutations can occur early in liver carcinogenesis. Since cellular context may influence the pathobiological effects of specific mutants of p53, the 249Ser mutant may be especially potent in hepatocytes. The enhancement of the growth rate of p53-null HEP-3B cells by transfected 249Ser mutant p53 indicates a gain of oncogenic function and is consistent with this hypothesis (Ponchel et al., 1994). The 249Ser mutant p53 is also more effective than other p53 mutants (143Ala, 175His, 248Trp and 282His) in inhibiting wild-type p53 transcriptional transactivation activity in human liver cells (Figure 9). One hypothesis concerning the generation of liver cancers with the 249Ser mutation is: (a) aflatoxin B1 is metabolically activated to form the promutagenic N7dG adduct; and (b) enhanced cell proliferation due to chronic active viral hepatitis allows both fixation of the G:C to T:A transversion in codon 249 of the p53 gene and selective clonal expansion of the cells containing this mutant p53 gene.
In addition to producing chronic active hepatitis, hepatitis B virus (HBV) also has other important pathobiological effects. For example, hepatitis B viral gene products may form complexes with cellular transcription factors, e.g. ATF2 (Maguire et al., 1991), up-regulate transcription of cellular and viral genes (Twu and Schlozmer, 1987; Spandau and Lee, 1988 ...) ... including DNA repair and apoptosis may be another consequence of cellular protein-HBV oncoprotein complex formation. Since the HBV X gene is frequently integrated and expressed in human hepatocellular carcinomas from high-risk geographic areas (Unsal et al., 1994; Paterlini et al., 1995), we have focused our attention on the X protein, which binds to p53 (Feitelson et al., 1993; Wang et al., 1994; Ueda et al., 1995) and inhibits its sequence-specific DNA binding and transcriptional activity (Wang et al., 1994). The X protein also inhibits p53-dependent apoptosis. Based on the above results, we have speculated that the X protein may modulate p53 function in nucleotide excision DNA repair, including repair of AFB1-DNA adducts, and are currently testing this hypothesis. HBV integration also could increase genomic instability, including abnormal chromosomal segregation and increased rates of DNA recombination (Hino et al., 1989, 1991). Therefore, a second hypothesis of liver carcinogenesis emerges in which integration of the HBV gene is the initial event in these high cancer risk geographic areas and AFB1-mediated 249Ser p53 mutation is the second genetic lesion that leads to further genomic instability.
Figure 6 Schematic representation of the p53 molecule. The p53 protein consists of 393 amino acids with functional domains, evolutionarily conserved domains and regions designated as mutational hotspots. Functional domains include the transactivation region (amino acids 20-42; gold block), the sequence-specific DNA-binding region (amino acids 100-293), the nuclear localisation sequence (amino acids 316-325; dark green block) and the oligomerisation region (amino acids 319-360; green block). Cellular or oncoviral proteins bind to specific areas of the p53 protein. Evolutionarily conserved domains (amino acids 17-29, 97-292 and 324-352; magenta areas) were determined using the MACAW program. Seven mutational hotspot regions within the large conserved domain are also identified (amino acids 130-142, 151-164, 171-181, 193-200, 213-223, 234-258 and 270-286; violet blocks). Functional domains and protein binding sites (aqua bars underneath) were compiled from references. Vertical lines above the schematic, missense mutations; lines below the schematic, non-missense mutations. The majority of missense mutations are in the conserved hydrophobic mid-region, whereas non-missense mutations (nonsense, frameshift, splicing and silent mutations) are distributed throughout the protein, determined primarily by sequence context. Courtesy of Dr Curtis C Harris; artwork by Dorothea Dudek.

Conclusions
Cancer risk assessment, a highly visible discipline in public health, has relied historically on classical epidemiology, including chronic exposure of rodents to potential carcinogens, and the mathematical modelling of these findings. The field has been forced to steer a prudent course of conservative risk assessment because of limited knowledge of the complex pathobiological processes during carcinogenesis: differences in the metabolism of carcinogens, different DNA repair capacities, variable genomic stability among animal species and variation among individuals with inherited cancer predisposition have made definitive analysis of cancer risk almost impossible (Harris, 1991; Barrett and Wiseman, 1992 ...) ... the scientific basis of risk assessment continues to be, and should continue to be, actively investigated (National Research Council, 1994).
Figure 9 Dominant negative effects of p53 mutants on the transcription of wild-type p53 in a p53-null human liver cell line (HEP-3B).
The association of a suspected carcinogenic exposure and cancer risk can be studied in populations with classic epidemiological techniques. However, these techniques are not applicable to the assessment of risk in individuals. Molecular epidemiology, in contrast, is a field that integrates molecular biology, in vitro and in vivo laboratory models, biochemistry and epidemiology to infer individual cancer risk (reviewed in Harris, 1991; Shields and Harris, 1991; Perera and Santella, 1993) (Figure 1). Carcinogen-macromolecular adduct levels and somatic cell mutations can be measured to determine the biologically effective dose of carcinogen. Molecular epidemiology also explores host cancer susceptibilities, such as carcinogen metabolic activation, DNA repair, endogenous mutation rates and inheritance of tumour-suppressor genes. Substantial interindividual variation for each of these biological end points has been shown (Harris, 1991) and, therefore, highlights the need for assessing cancer risk on an individual basis. Investigations of the p53 tumour-suppressor gene are an example of the recent progress in molecular aspects of cancer research. A better understanding of molecular carcinogenesis and molecular epidemiology will eventually decrease the qualitative and quantitative uncertainties associated with the current state of cancer risk assessment and improve public health decisions concerning cancer hazards. Indeed, determination of the type and number of mutations in p53 and other cancer-related genes in tissues from 'healthy' people may allow the identification of those at increased cancer risk and their consequent protection by preventative and therapeutic measures (Figure 1).