A critical analysis of prognostic factors for survival in intermediate and high grade non-Hodgkin's lymphoma. Scotland and Newcastle Lymphoma Group Therapy Working Party.

Between 1979 and 1987 the Scotland and Newcastle Lymphoma Group registered 972 adults with Working Formulation high or intermediate grade non-Hodgkin's lymphoma. Clinical, pathological and investigational data were recorded prospectively on a computer database allowing analysis for prognostic factors. We have derived prognostically important characteristics and have tested prospectively the validity of the prognostic index on a geographically distinct sub-set of patients from the Edinburgh/Borders clinics. Multivariate analysis showed the following factors to be important, in declining order of power: advancing age, worsening performance status, CNS/liver involvement, abnormal white cell count, 'B' symptoms and advancing clinical stage. Individual patient scores allowed patients to be aggregated into one of three distinct prognostic groupings separated by arbitrary cut-points: a Best Group (39%) where the median survival exceeds 5 years (53% alive at 5 years), an Intermediate Group (30%) with median survival of 21 months (21% alive at 5 years), and a Worst Group (31%) whose median survival is 7 months (8% alive at 5 years). Similar prognostic group separations occurred when analysis was confined to: patients younger than 70 years; patients treated with initial chemotherapy; patients treated with initial radiotherapy; patients within any of the major pathological sub-groups.

In 1974 and 1975, DeVita and colleagues reported 37% prolonged disease free survival in patients with advanced stage diffuse histiocytic non-Hodgkin's lymphoma (DHL). Because relapses were rare beyond 2 years' follow-up, it was suggested that complete remissions lasting longer than 2 years could be considered cures (Schein et al., 1974; DeVita et al., 1975; Bonnadonna et al., 1976).
Since that time increasingly complex chemotherapy regimens have been designed in an attempt to achieve higher complete remission rates. The assumption has been that more complete remissions will lead to more cures, despite recent evidence that disease-free survival continues to fall with prolonged follow-up (Fisher et al., 1987; DeVita et al., 1988).
However, many of these trials were conducted in single tertiary referral centres, amongst younger patients than those seen in primary referral centres in the United Kingdom or Europe (Table I). Concern has grown that the majority of NHL patients were not in fact benefiting to the expected degree through the application of complicated, expensive and toxic regimes.
Patients were staged by standard techniques. All patients underwent physical examination and routinely had measurements of haematology and biochemistry, a chest X-ray, and either ultrasound scan, computed tomography or lymphangiogram to stage abdominal disease. Bone marrow routinely comprised trephine and aspiration examination.
Treatment included a variety of regimens. Twenty-five patients had missing treatment data. Twenty-four received surgery with curative intent, mostly for early stage GI tract disease. One hundred and eight received less than attempted curative therapy from physicians in SNLG, mostly due to very rapid demise, or to the patient moving home locality prior to completing treatment. One hundred and ninety-two patients received radical radiotherapy alone. Four hundred and ninety-five patients received 'curative' chemotherapy alone. One hundred and twenty-eight patients received combined modality therapy. Of the 623 patients who received chemotherapy, 192 received variations on the C-MOPP regime (with or without prednisone or procarbazine, cyclophosphamide exchanged for mustine), 280 received variations on the CHOP regime (with or without bleomycin or methotrexate), 99 received variations of the BACOP regime (e.g. with or without methotrexate) and 52 received alternating cycles of CHOP with an etoposide containing regimen. The majority of patients were treated at the discretion of the attending physician. A minority of patients were taking part in controlled trials of one chemotherapy regime against another (current SNLG trial).
Univariate survival analysis (Breslow, 1970) was used to select any factor showing an association with death. For example, univariate analysis of pathology by the Working Formulation showed it to be significant at P<0.05. Factors showing such an association were included in the multivariate analysis.
To provide an independent test group on which to validate our prognostic index, all patients presenting to a single centre, Edinburgh and Borders (EB), were excluded arbitrarily from multivariate analysis. This was felt by the advising statistician to be the least biased technique of providing a test group. There were 310 of these patients, leaving 662 patients from other centres available for multivariate analysis.
Multivariate analysis used Cox's proportional hazards model (Cox, 1972). Factors of least prognostic significance were eliminated one by one in a manual step-down procedure, to maximise data inclusion.
Step-down continued until all remaining variables were significant at P < 0.05. The alternative approach, an exhaustive step-up procedure, adding one variable to the model at a time until no new variable was significant at P<0.05, yielded the same final prognostic index.
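The step-down procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in the study: the function names `step_down`, `fit_p_values` and `toy_fit` are hypothetical, the model fit is abstracted behind a caller-supplied function that returns one P value per factor, and the P values in the toy example are invented purely to show the control flow.

```python
from typing import Callable, Dict, List, Sequence


def step_down(
    candidates: Sequence[str],
    fit_p_values: Callable[[List[str]], Dict[str, float]],
    alpha: float = 0.05,
) -> List[str]:
    """Backward elimination: remove the least significant factor, refit, repeat."""
    kept = list(candidates)
    while kept:
        p_values = fit_p_values(kept)  # refit the model on the factors still kept
        weakest = max(kept, key=lambda name: p_values[name])
        if p_values[weakest] < alpha:  # all remaining factors are significant
            break
        kept.remove(weakest)
    return kept


# Toy stand-in for a model fit. These P values are invented for illustration
# and are NOT results from the study; in a real fit they change at every step.
TOY_P = {"age": 0.001, "stage": 0.03, "sex": 0.40, "rashes": 0.70}


def toy_fit(factors: List[str]) -> Dict[str, float]:
    return {name: TOY_P[name] for name in factors}


selected = step_down(list(TOY_P), toy_fit)
print(selected)  # ['age', 'stage']
```

In practice `fit_p_values` would wrap a proportional hazards fit from a statistics package. Removing one factor at a time, rather than several at once, matters because each refit can change the significance of the factors that remain.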
Factors examined for prognostic significance included age, sex, performance status (ECOG rating 0 to 4), previous malignancy, centre of diagnosis, Rappaport and Working Formulation histopathology subgroups, clinical stage, number of nodal sites involved, extranodal sites of origin or involvement including all major organ systems as well as thyroid, thymus, bone and skin, nodal sites of involvement including Waldeyer's ring, B symptoms as a group and individually (weight loss, fever, night sweats), rashes, erythrocyte sedimentation rate, haemoglobin, platelet count and white blood count (differential white cell counts were not recorded in the database). Evidence of organ involvement was defined as reasonable certainty of organ involvement on clinical, radiological or pathological grounds. Clinical stage followed the Ann Arbor definitions, and does not represent pathological stage when used in this model. B symptoms were recorded as present if any one of fever, night sweats or weight loss was present. Continuous variables were treated in a number of different ways: as linear continuous variables, as transformations of the continuous variable, or as a series of discrete intervals by the use of multiple cutpoints. All of these treatments of the continuous variables were tested separately for prognostic significance, to choose the most informative approach.
Information on bulk disease was only available for patients diagnosed in the last 2 years of study and therefore the prognostic significance of bulk of disease could not be analysed. However, the number of extranodal sites of disease was recorded and analysed. A number of investigations were excluded from initial analysis because data were complete for <80% of patients. These included lymphogram, bone scan or X-rays, gallium scan, computed tomography or ultrasound scan of abdomen, staging laparotomy, marrow trephine or marrow aspiration. Our strategy for analysing these investigations was, first, to derive a basic prognostic model using more general presenting features as listed above. These investigations contributed indirectly to that analysis via their influence on clinical stage or evidence of organ involvement. Having derived this basic prognostic index, we then added the specific results of these excluded investigations, one by one, back into the model, thereby minimising data loss. We could thus identify specific investigations which would improve significantly the prognostic accuracy of our more general, basic model. There were none. This means that use of the index does not demand the performance of any single specific extra investigation.
Our strategy for analysing treatment was similar. First we excluded treatment data from analysis, thus deriving a general prognostic model on all patients, based solely on presenting features. We then analysed the significance of adding treatment data (by a step-up procedure) to the basic model. The model was not improved by the inclusion of treatment data. We also looked for statistically significant interactions between treatments and other prognostic variables which might have suggested that different models would be appropriate for different treatment groups. There were none, suggesting that the index could be effectively applied to any treatment sub-group.
As a final check of this general applicability across treatments, we validated the index on the independent group of EB patients stratified by treatment received. The results are discussed below. Response data were excluded from analysis because the aim was to identify presenting features of prognostic significance for survival, and these are often obscured by including response in a multivariate analysis.

Results
Best survival was predicted for fit young patients with stage 1 or 2 disease, no liver or CNS involvement, no B symptoms and a normal white cell count.
Independent adverse prognostic features (in declining order of strength) were advancing AGE, declining PERFORMANCE STATUS (ECOG), involvement of CNS, involvement of LIVER, abnormal WHITE CELL COUNT, B SYMPTOMS and advanced STAGE.
Prognosis worsened in linear fashion with advancing age. ECOG rating 1 or 2 was worse than ECOG rating 0. ECOG rating 3 or 4 was worse than ECOG rating 1 or 2.
Abnormal white cell count was defined as < 4 or > 11 x 10^9/l. Deviations at either extreme carried the same additional risk. Clinical stage 1 and 2 were equivalent. Clinical stage 3 or 4 carried the same additional risk over clinical stage 1 or 2. Liver involvement, or CNS involvement, carried additional prognostic significance, over and above their influence on clinical stage. Marrow involvement, as evidenced by trephine biopsy or aspiration, had no additional significance beyond its influence on clinical stage. Extranodal and nodal sites of origin were not significant prognostic features. The number of involved sites was also not a significant prognostic factor. Cox's model provided coefficients reflecting the prognostic importance of the significant factors. Using these coefficients a simple, additive, multivariate prognostic index could be constructed, and any patient assigned an index score (Table III). A high score predicted poor prognosis.
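For readers less familiar with the method, the standard form of Cox's proportional hazards model shows why an additive index arises from the coefficients. The notation here is assumed for illustration and does not appear in the original text: h_0(t) is the baseline hazard, x_i the value of prognostic factor i for a patient, and \beta_i its fitted coefficient.

```latex
h(t \mid x) = h_0(t)\,\exp\!\Big(\sum_i \beta_i x_i\Big),
\qquad
\text{index score} = \sum_i \beta_i x_i,
\qquad
\text{relative risk of feature } i = e^{\beta_i}

e^{\beta_i + \beta_j} = e^{\beta_i}\, e^{\beta_j}
```

Because the hazard depends on the exponential of the score, coefficients add while the corresponding relative risks multiply, and a higher score always corresponds to a higher risk of death at any given time.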
The coefficients themselves are difficult to interpret in real terms so, to provide a meaningful impression of the real significance of these prognostic features, we have also expressed their influence in terms of relative risk, listed in column X of Table III. Relative risk refers to the number of times a patient's risk of death is multiplied at any time, given the presence of an adverse prognostic feature. The influence of any two features is multiplicative so that, for example, the presence of ECOG rating 2 and B symptoms means risk of death is multiplied by (1.7 x 1.5) = 2.55-fold. The death rate for all patients provided the mean prognostic reference score of 1. The index was applied to all patients in the analysis group, and cutpoints chosen to separate a lowest scoring 33% (best predicted survival) from an intermediate 33% and a highest scoring 33% (worst predicted survival). These cutpoints were essentially arbitrary, and different cutpoints could be chosen to separate different proportions of patients.
Table III (footnote) Simple additive index created using coefficients: high score implies poor prognosis. E.g. a 60 year old patient, fitness 2, stage 4 with liver involved but no B symptoms and normal white count scores (60 x 0.023 for age) + (0.53 for fitness) + (0.29 for stage) + (0.48 for liver involvement) = 2.68 and falls into the worst prognostic group. Best Prognostic Group < 2.0, Intermediate Prognostic Group 2.0-2.6, Worst Prognostic Group > 2.6.
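The worked example from the Table III footnote can be reproduced with a short sketch. Several points are assumptions rather than statements from the paper: the function and constant names are hypothetical; the fitness coefficient is taken as 0.53, the value that reproduces the stated total of 2.68 and agrees with the relative risk of 1.7 quoted for ECOG rating 2 (exp(0.53) is about 1.7); that coefficient is assumed to apply to ECOG rating 1 or 2 as a group; and a score of exactly 2.0 or 2.6 is placed in the intermediate group. Coefficients for CNS involvement, abnormal white cell count, B symptoms and ECOG rating 3 or 4 are not quoted in the text and are deliberately left out, so this sketch cannot score patients with those features.

```python
import math

# Coefficients quoted in the worked example (see assumptions in the lead-in).
AGE_PER_YEAR = 0.023
ECOG_1_OR_2 = 0.53      # assumed value; reproduces the stated total of 2.68
STAGE_3_OR_4 = 0.29
LIVER_INVOLVED = 0.48


def index_score(age_years: float, ecog_1_or_2: bool,
                stage_3_or_4: bool, liver_involved: bool) -> float:
    """Additive index: sum the coefficients of the features that are present."""
    score = age_years * AGE_PER_YEAR
    if ecog_1_or_2:
        score += ECOG_1_OR_2
    if stage_3_or_4:
        score += STAGE_3_OR_4
    if liver_involved:
        score += LIVER_INVOLVED
    return score


def prognostic_group(score: float) -> str:
    """Apply the published cutpoints (boundary handling is an assumption)."""
    if score < 2.0:
        return "best"
    if score <= 2.6:
        return "intermediate"
    return "worst"


# 60 year old, fitness 2, stage 4, liver involved, no B symptoms, normal white count.
score = index_score(60, ecog_1_or_2=True, stage_3_or_4=True, liver_involved=True)
print(round(score, 2), prognostic_group(score))  # 2.68 worst

# A coefficient converts to a relative risk by exponentiation.
print(round(math.exp(ECOG_1_OR_2), 1))  # 1.7
```

The same patient at age 30 would score 1.99 and fall into the best group, which illustrates how strongly age drives the index.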
Applying the index and cutpoints to the independent group of 310 EB patients validated the index. Figure 3 shows overall survival of all patients presenting to EB and treated with conventional chemotherapy and/or radiotherapy. No plateau survival is apparent, median survival is 23 months and 5 year survival is only 31%. This overall survival curve for EB patients is similar to that for all patients from all centres (Figure 2). Figure 4 shows survival of the EB group stratified by index score. The index separates three distinct prognostic subgroups.
In particular a worst surviving group (30% of patients), with median survival of only 7 months and 5 year survival of 8%, can be identified. Characteristics of patients from all centres who fall into this group are shown in Table IV.
Figure 4 Survival of Edinburgh and Borders patients grouped by index score. Numbers of patients remaining at risk are shown below the x-axis at 10 month intervals for each group. All patients with complete data for the index are included.
In the best surviving group, 5 year survival (53%) approached that claimed in North American trials. Characteristics of patients from all centres who fall into this group are shown in Table IV. The index was similarly validated on younger patients (aged <70 years) since in future it may be used to select patients for aggressive therapies. Figure 5 shows equally good separation of three distinct prognostic groups amongst patients under 70 years.
The best group (47% of patients) had 5 year survival of 54% and did not reach median survival.
The intermediate group (27% of patients) had median survival of 26 months and 5 year survival of 28%.
The worst group (25% of patients) had median survival of 9 months and 5 year survival of 9%. Thus a very poorly surviving group of relatively fit patients younger than 70 is identified.
The index was also validated by testing it on a variety of subgroups of the independent EB patients.
Thus when restricted to 155 DLL patients the best group had 5 year survival of 54%, the intermediate group 5 year survival of 31%, and the worst group 5 year survival of 11% (Figure 6).
When restricted to 96 diffuse small cell lymphoma (DSL) patients, the best group had 5 year survival of 53%, the intermediate group 5 year survival of 17%, and the worst group 5 year survival of 0% (Figure 7). Similar patterns were seen in the other smaller histopathological groups.
When restricted to 170 stage 3 or 4 patients, the best group had 5 year survival of 44%, the intermediate group 5 year survival of 18%, and the worst group 5 year survival of 9% (Figure 8).
When restricted to 137 stage 1 or 2 patients, the best group had 5 year survival of 59%, the intermediate group 5 year survival of 23%, and the worst group 5 year survival of 8% (Figure 9).
Figure 9 Survival of Edinburgh and Borders patients with stage I and II disease: patients grouped by index score. Numbers of patients remaining at risk are shown below the x-axis at 10 month intervals for each group. All patients with complete data for the index are included.
The index was also valid when tested on subgroups of EB patients stratified by treatment received. Three discrete prognostic groups are separated amongst patients who went on to receive radiotherapy and also amongst patients who went on to receive chemotherapy. Thus of 85 patients receiving radiotherapy during first line treatment, the best group had 5 year survival of 48%, the intermediate group 5 year survival of 25%, the worst group 5 year survival of 11% (Figure 10). Of 186 patients receiving chemotherapy during first line treatment, the best group had 5 year survival of 55%, the intermediate group 5 year survival of 22% and the worst group 5 year survival of 8% (Figure 11). Further subdivision by specific chemotherapy regimen could not be performed because numbers in each group became too small. Nevertheless this broad validity across major treatment subgroups allows the identification of poorly surviving patients whatever treatment is proposed.

Discussion
In this paper we [...] al., 1984; Horning et al., 1984; Dixon et al., 1986; Coleman et al., 1988). Thus censoring patients from survival analysis when they die of 'other causes', assuming these can be accurately defined, may result in an over-optimistic estimate of disease free survival, since these deaths will occur in generally older and less fit patients, who may be the most likely to relapse. To avoid these problems, and because our interest is in the real survival expectations we can offer our patients, we decided to use as our endpoint death from any cause. Coleman has recently suggested that this is the most appropriate and reproducible endpoint to choose (Coleman et al., 1987).
Several important features are apparent in the overall survival data for all our patients, with all stages and pathologies within high and intermediate grade NHL (Figure 2). No plateau is apparent, and 5 year survival is 24%.
These data contrast unfavourably with results reported for many small therapeutic trials in specialist referral centres in the USA. Optimistic reviews of NHL therapy have summarised the apparent progress due to the application of increasingly complex chemotherapy (DeVita et al., 1988). ProMACE-MOPP, ProMACE-CytaBOM, M-BACOD, m-BACOD, COP-BLAM and its refinements, and MACOP-B are all contemporary variations on this theme (Fisher et al., 1983; Fisher et al., 1987; Skarin et al., 1983; Canellos et al., 1987; Laurence et al., 1982; Boyd et al., 1988; Klimo & Connors, 1987). However response and survival were significantly related to several patient and disease characteristics (Fisher et al., 1977; Cabanillas et al., 1978; Lenhard et al., 1978; Stein et al., 1979; Fisher et al., 1981; Armitage et al., 1982; Trump & Mann, 1982; Leonard et al., 1983; Sullivan et al., 1983; Al-Katib et al., 1984; Steward et al., 1984; Horning et al., 1984). The patients treated at different centres often differed widely as a result of selection pressures occurring in the referral process. Differences in age, stage, marrow involvement and CNS disease were all suggested as potential reasons for differing results (Stein, 1984 (letter); Fisher et al., 1984 (letter); Honegger & Cavalli, 1984; Coleman et al., 1987) [...] (Monfardini et al., 1984).
Our analysis of prognostic features provides a partial explanation for these contrasts. We have derived a multivariate prognostic index by the analysis of a large group of 662 patients, a much larger group of lymphoma patients than has previously been subjected to such an analysis. The validity of the index has been demonstrated powerfully on an independent group of 310 patients selected by geography alone, a [...]
[...] generally provide the total National Health Service resources for those regions. Thus we can with some confidence suggest that this group of patients represents an unselected sample of total NHL cases occurring within SNLG boundaries. [...]
Other causes of death will be commoner amongst our patients. Connors, in describing the MACOP-B programme, mentioned the specific exclusion of unfit patients, and patients over 70 years (Connors & Klimo, 1988). Canellos, describing m-BACOD, mentioned that only 3% of patients had ECOG rating >2 (Canellos et al., 1987). In SWOG sequential trials of m-BACOD, reduced doses were administered to elderly or unfit patients, and these patients achieved a CR rate of only 27%. The CR rate (67%) for the younger fitter group was quoted as an estimate of m-BACOD efficacy (Miller et al., 1988). In the CHOP sequential trials at SWOG age had a powerful effect on CR and survival rates (Dixon et al., 1986). In the COP-BLAM 3 and COP-BLAM 4 trials, where patients had an age distribution closer to SNLG experience, age had an important effect on CR rates or durability (Boyd et al., 1988; Coleman et al., 1988). Shipp, who also noted fitness as an important determinant of response and survival, drew attention to the well recognised importance of fitness in prognosis of solid tumours, and the lack of data concerning fitness as a prognostic factor in NHL (Shipp et al., 1986). Other multivariate analyses have detected age as an important prognostic feature (Danieu et al., 1986; Al-Katib et al., 1984; Horning et al., 1984; Lenhard et al., 1978; Kaminski et al., 1986).
Armitage has recently noted that many trials select patients on the basis of age and fitness, though often descriptions of fitness do not appear in treatment reports (Armitage & Cheson, 1988). Our analysis provides evidence that the selection of patients on the basis of age or fitness is likely significantly to influence results. This is true whether or not the selection occurs as a deliberate policy, or as the result of uncontrolled pressures in the referral process. These conclusions underline the importance of publishing good descriptions of patients entering trials, to allow comparisons to be made more readily (Carter, 1985). Ideally a prognostic index might be used to estimate the expected survival of patients in different trials. As noted above, the two most important prognostic factors, age and performance status, have also been detected by other investigators. The details of the index are also in broad agreement with other analyses. CNS disease has long been recognised as a poor prognostic feature, and indeed provided the impetus for the introduction of high dose methotrexate into many regimes. It would seem that our patients would benefit from more aggressive therapy for CNS involvement. CNS disease has been correlated with marrow disease in the past. Perhaps this explains in part our failure to detect marrow involvement as an independent adverse feature. Several multivariate analyses have implicated marrow involvement as a poor prognostic feature (Bloomfield et al., 1974; Armitage et al., 1982; Fisher et al., 1981) whereas others have not (Cabanillas et al., 1978; Leonard et al., 1983; Shipp et al., 1986; Stein et al., 1979; Horning et al., 1984). In our analysis the only organ site of involvement (other than CNS) which carried additional prognostic significance beyond its influence on stage was liver involvement.
Others have demonstrated an influence of liver involvement on prognosis, by univariate analysis (Fisher et al., 1981; Stein et al., 1979) or by multivariate analysis (Steward et al., 1984). Interestingly, Steward (1984) showed marrow involvement was a poor prognostic feature for CR but not for survival, whereas liver involvement carried a poor prognosis for survival but not for CR. The adverse significance of B symptoms has been noted in many analyses (Bloomfield et al., 1974; Leonard et al., 1983; Steward et al., 1984; Armitage & Cheson, 1988; Fisher et al., 1981; Sullivan et al., 1983; Al-Katib et al., 1984). Cabanillas showed symptoms had an adverse significance for survival but not for CR (Cabanillas et al., 1978). One previous multivariate analysis has noted both high and low white cell counts as adverse prognostic features of approximately equal weight (Leonard et al., 1983).
The failure of pathological sub-type to influence prognosis is interesting. Whilst this was a highly significant prognostic variable on univariate analysis, when it was included in our multivariate index it became non-significant. This implies that between different pathological sub-types there are important differences in the distribution of age, stage, fitness, liver or CNS involvement, white cell count, or B symptoms. These differences must account in part for the crude survival differences observed between pathological sub-types. In the future, the finer detail of pathological description afforded by immunochemistry and molecular biological analyses may help to refine the prognostic value of 'pathology'. When one examines the proportion of patients within each pathological group which fall into each prognostic group, as predicted by the index, significant differences are apparent (Table IV). This fact explains why a univariate analysis of prognosis by pathological sub-type appears significant.
The failure of treatment variables significantly to improve the index does not imply that treatment had no effect on outcome. During the study period treatment for high and intermediate grade (HIG) NHL was selected primarily on the basis of age, fitness and stage, and so the effects of treatment are allowed for in the coefficients derived for these covariates, which are all included in the index. It is important to recognise that, whilst the index is applicable across the range of treatment groups in this study, it will only remain so under the conditions that assigned our patients to these different treatment groups.
The utility of our prognostic index is demonstrated by its capacity to separate three distinct prognostic groups when applied to a range of patient and treatment subgroups of our independent EB patients. Thus as shown above it is useful when applied to patients under 70, to DLL patients, to DSL patients, to early stage patients, to advanced stage patients, and to patients stratified by treatment received.
In conclusion, this simple additive index is applicable across a range of patient and treatment groups. It uses readily available data at presentation to allow: (1) better prediction of survival for the individual patient; (2) stratification of future treatment studies, and (3) selection of poor risk younger patients (under 70 years) for novel or aggressive therapy. The application of such an index to results reported in different patient groups could facilitate better comparison of these results. Importantly, the study also demonstrates that patient selection could largely account for the variety of results in earlier treatment studies.