Construction of a North American Cancer Survival Index to Measure Progress of Cancer Control Efforts

Introduction Population-based cancer survival data provide insight into the effectiveness of health care delivery. Comparing survival for all cancer sites combined is challenging, because the primary cancer site and age distribution of patients may differ among areas or change over time. Cancer survival indices (CSIs) are summary measures of survival for cancers of all sites combined and are used in England and Europe to monitor temporal trends and examine geographic differences in survival. We describe the construction of the North American Cancer Survival Index and demonstrate how it can be used to compare survival by geographic area and by race. Methods We used data from 36 US cancer registries to estimate relative survival ratios for people diagnosed with cancer from 2006 through 2012 to create the CSI: the weighted sum of age-standardized, site-specific, relative survival ratios, with weights derived from the distribution of incident cases by sex and primary site from 2006 through 2008. The CSI was calculated for 32 registries for all races, 31 registries for whites, and 12 registries for blacks. Results The survival estimates standardized by age only versus age-, sex-, and site-standardized (CSI) were 64.1% (95% confidence interval [CI], 64.1%–64.2%) and 63.9% (95% CI, 63.8%–63.9%), respectively, for the United States for all races combined. The inter-registry ranges in unstandardized and CSI estimates decreased from 12.3% to 5.0% for whites, and from 5.4% to 3.9% for blacks. We found less inter-registry variation in CSI estimates than in unstandardized all-sites survival estimates, but disparities by race persisted. Conclusions CSIs calculated for different jurisdictions or periods are directly comparable, because they are standardized by age, sex, and primary site. A national CSI could be used to measure temporal progress in meeting public health objectives, such as Healthy People 2030.


Introduction
Progress in meeting cancer control objectives can be measured by using a combination of statistics on cancer incidence, population-based survival, and mortality (1–3). Comparing survival, in particular, among geographic areas and over time can aid in the understanding of inequities and changes in the quality and effectiveness of health care provided to population groups of people diagnosed with cancer (4). However, interpreting the results of comparisons of survival proportions for all cancer sites combined is challenging when distributions by age, sex, and primary cancer site differ by geographic area or change over time. Age standardization accounts for different age structures in the patient populations being compared (5,6). However, comparing relative survival for all cancer sites combined also requires adjusting for the case-mix of primary cancer sites. European Cancer Registry Based Study on Survival and Care of Cancer Patients (EUROCARE) researchers and others have performed age and case-mix adjustments for England and Europe to compare survival proportions between nations and among local areas and to monitor temporal trends (7–10).
In this article, we describe construction of the North American Cancer Survival Index (CSI), which standardizes for age, sex, and primary cancer site, and compare unstandardized and primary-site-standardized survival proportions for all cancer sites combined, by registry jurisdiction and race. We demonstrate its use in comparative analysis of registry- and race-specific survival and describe its use as a baseline measure for monitoring progress over time in cancer control efforts and in meeting public health objectives related to improving early cancer diagnosis and access to timely, evidence-based treatment.

Data source
All population-based cancer registries in the United States and Canada are members of the North American Association of Central Cancer Registries (NAACCR). Beginning with data from 1996, NAACCR has produced the Cancer in North America (CINA) reports (https://www.naaccr.org/cancer-in-north-america-cinavolumes/) of cancer incidence and mortality rates in the United States and Canada. Beginning in 2015, NAACCR asked member registries to provide follow-up data for the purpose of reporting survival proportions. For registries to be included in the survival analysis described in this article, they needed to provide consent, meet CINA incidence criteria for all relevant years (11), and either meet the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) standards for follow-up (12) or ascertain deaths through our study's cutoff date, December 31, 2012, through linkages with state death records and the National Death Index (13). Survival data were provided by 36 US registries (31 states and 5 metropolitan areas) on more than 6.6 million cancers diagnosed from 2006 through 2012 (6). To avoid double-counting in the survival estimates, data from metropolitan area registries in California and Georgia were not included in the United States combined statistics. The data set included malignant cases as defined by the SEER behavior recode for analysis (14) for people aged 15 to 99 years diagnosed from 2006 through 2012.

Statistical analysis
We excluded incident cases that were reported solely via death certificates or autopsy. For registries conducting active follow-up, alive cases with no survival time were excluded from analysis. By using SEER 2007 Multiple Primary and Histology Coding Rules (15), we allowed for multiple primary cancers to be included for each patient, but only the first applicable record per patient was included in each survival estimate. SEER*Stat software version 8.2.1 (Information Management Services, Inc) was used to perform survival calculations (16). The survival duration in months was calculated on the basis of complete dates. For registries meeting SEER follow-up standards (SEER registries plus Montana and Wyoming), the survival duration for alive patients was calculated through the date of last contact (or study cutoff, if earlier). For the remaining registries, survival duration for alive patients was calculated through December 31, 2012, with all patients not known to be dead presumed to be alive on this date (17).
Sixty-month age-standardized relative survival ratios (RSRs) were calculated by using the actuarial method on monthly intervals. We calculated relative survival by using the Ederer II method to compute expected survival (18). Expected survival was estimated from life tables matched to cancer patients by age, sex, year, geographic area, race, and socioeconomic status (19). Cases were censored at an achieved patient age of 100 years.
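As a rough illustration of the calculation described above, the following Python sketch computes cumulative observed survival with the actuarial (life-table) method and divides it by an expected survival value to give a relative survival ratio. It is a minimal sketch, not the SEER*Stat implementation: the function names and the interval counts are hypothetical, the expected survival is taken as a given input rather than derived from life tables with the Ederer II method, and age standardization is omitted.

```python
from typing import Sequence


def actuarial_survival(deaths: Sequence[int],
                       censored: Sequence[int],
                       n_at_start: int) -> float:
    """Cumulative observed survival over consecutive intervals.

    Actuarial assumption: cases censored during an interval are at risk
    for half of that interval on average.
    """
    at_risk = n_at_start
    survival = 1.0
    for d, c in zip(deaths, censored):
        effective_at_risk = at_risk - c / 2.0
        if effective_at_risk > 0:
            survival *= 1.0 - d / effective_at_risk
        at_risk -= d + c  # cases entering the next interval
    return survival


def relative_survival_ratio(observed: float, expected: float) -> float:
    """Observed survival divided by expected survival (supplied by the caller)."""
    if expected <= 0:
        raise ValueError("expected survival must be positive")
    return observed / expected


# Hypothetical two-interval example (not data from this study).
observed = actuarial_survival(deaths=[10, 5], censored=[0, 10], n_at_start=100)
rsr = relative_survival_ratio(observed, expected=0.95)
```

In the analysis described here, the intervals would be the 60 monthly intervals after diagnosis, and the expected survival would come from the matched life tables.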

Cancer survival index
The construction of the CSI was described in the technical notes of Cancer Survival in the United States and Canada 2006-2012 (6). Briefly, the CSI is the weighted sum of the age-standardized site-specific RSRs, with the weights derived from the proportionate distribution of North American incidence counts for diagnosis years 2006 through 2008 as reported for the November 2014 Call for Data (Table 1). This range of years was selected because the incidence data for these years are more mature in terms of reporting delay than more recent years. Case counts to derive the weights were limited to malignant behavior and urinary bladder in situ neoplasms among patients aged 15 years or older, and SEER metropolitan-area registries were excluded to avoid double-counting of incident cases for their respective states.
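The derivation of weights from a proportionate distribution of incidence counts can be sketched in a few lines of Python. This is an illustration only: the function name, the site categories, and the counts are hypothetical and are not the values in Table 1.

```python
from typing import Dict


def site_weights(incidence_counts: Dict[str, int]) -> Dict[str, float]:
    """Convert site-specific incidence counts into proportions that sum to 1."""
    total = sum(incidence_counts.values())
    if total <= 0:
        raise ValueError("incidence counts must sum to a positive number")
    return {site: count / total for site, count in incidence_counts.items()}


# Hypothetical counts for one sex, pooled over the weight-defining years.
counts = {"lung": 200, "breast": 300, "colorectal": 500}
weights = site_weights(counts)
```

Running the same function separately on counts for male patients, female patients, and both sexes combined would give one set of weights per group.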
Separate sets of weights were used for male patients, female patients, and male and female patients combined. Let $S_i$ be the age-standardized, site-specific relative survival ratio estimate and $W_i$ be the proportion of the sex-specific incidence counts for site category $i$. Treating the site-specific estimates as independent, the cancer survival index (CSI) and its standard error are:

$$\mathrm{CSI} = \sum_i W_i S_i, \qquad \mathrm{SE}(\mathrm{CSI}) = \sqrt{\sum_i W_i^2 \, [\mathrm{SE}(S_i)]^2}$$

(20). Confidence intervals for the CSI can be narrower than those for the all sites statistics because of the national replacement data and the larger numbers of cases. The standard error of the CSI was not adjusted for the potential inclusion of more than one case per person because these cases made up only 4% of the total. If more than 30% of the site-specific age-standardized RSR estimates were unavailable for a registry jurisdiction and were replaced with that of the country, the CSI estimate was suppressed to avoid unduly biasing the results.
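The calculation can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it takes the CSI as the weighted sum of site-specific RSRs and its variance as the sum of squared weights times squared standard errors (that is, independent site-specific estimates), it substitutes the national estimate and its standard error when a registry estimate is unavailable, and it suppresses the result when more than 30% of site categories were replaced. The function name, data layout, and numbers are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

Estimate = Tuple[float, float]  # (RSR estimate, standard error)


def cancer_survival_index(weights: Dict[str, float],
                          registry_rsr: Dict[str, Optional[Estimate]],
                          national_rsr: Dict[str, Estimate],
                          max_replaced: float = 0.30) -> Optional[Estimate]:
    """Return (CSI, SE) or None if too many site estimates were replaced."""
    csi = 0.0
    variance = 0.0
    replaced = 0
    for site, w in weights.items():
        estimate = registry_rsr.get(site)
        if estimate is None:
            # Assumption: fall back to the national estimate and its SE.
            estimate = national_rsr[site]
            replaced += 1
        s, se = estimate
        csi += w * s
        variance += (w * se) ** 2
    if replaced / len(weights) > max_replaced:
        return None  # suppressed
    return csi, math.sqrt(variance)


# Hypothetical inputs for four site categories.
weights = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
registry = {"a": (60.0, 1.0), "b": (80.0, 2.0), "c": (50.0, 1.0), "d": None}
national = {"a": (62.0, 0.2), "b": (78.0, 0.3), "c": (52.0, 0.2), "d": (20.0, 0.5)}
result = cancer_survival_index(weights, registry, national)
```

With these inputs, one of four site categories (25%) is replaced, so the index is reported; if two were missing, the function would return None.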

PREVENTING CHRONIC DISEASE
We used funnel plots (Figure) to show 5-year RSR estimates (vertical axes) plotted against the precision of the estimates (horizontal axes) (21). Precision was calculated as the inverse of the variance of the survival estimates. [...] Table 2 that use a function of the standard error for precision.
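The funnel-plot quantities can be illustrated with a short Python sketch. Precision is computed as the inverse of the variance, as stated above. The control limits are an assumption made for illustration: they use a normal approximation around a target value (for example, the combined estimate) with a multiplier z, which is a common funnel-plot convention and not necessarily the exact limits used for the Figure. The function names and numbers are hypothetical.

```python
import math
from typing import Tuple


def precision(se: float) -> float:
    """Precision as the inverse of the variance of an estimate."""
    return 1.0 / se ** 2


def control_limits(target: float, prec: float, z: float = 1.96) -> Tuple[float, float]:
    """Assumed limits: target +/- z standard errors at the given precision."""
    half_width = z / math.sqrt(prec)
    return target - half_width, target + half_width


def classify(estimate: float, se: float, target: float, z: float = 1.96) -> str:
    """Label an estimate relative to the assumed control limits."""
    lower, upper = control_limits(target, precision(se), z)
    if estimate < lower:
        return "low"
    if estimate > upper:
        return "high"
    return "within"


# Hypothetical registry estimate with standard error 0.5 and target 64.0.
label = classify(estimate=66.0, se=0.5, target=64.0)
```

Because the half-width shrinks as precision grows, large registries with small standard errors fall outside the limits for smaller departures from the target than small registries do.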

Results
The CSI cancer survival estimate was 63.9 for the United States combined. In the funnel plots (Figure), registry-specific values above the upper control limit were considered to be high outliers. For the all sites combined statistics, 12 registry-specific values were below the lower control limit and 13 were above the upper control limit (Figure a). For the CSI, 13 registry-specific values were below the lower control limit and 10 were above the upper control limit (Figure b). [...] (Table 3). The registry with the largest negative difference between the all sites and CSI estimates was New York (−2.9). Texas had the largest positive difference between the all sites and CSI estimates (0.3).
In the same 12 registries for which the index could be calculated for blacks, the CSI RSRs for whites varied from 63.1 in Louisiana to 65.7 in New York. Although the within-race ranges in CSI values for the 12 registries were 2.7 (whites) and 3.9 (blacks), the median white-black differences in CSI values for the 12 registries were 8.7 for male and female patients combined, 8.2 for male patients, and 9.4 for female patients.

Discussion
Cancer survival varies widely by age, sex, and site of the primary cancer. To compare overall cancer survival among registry jurisdictions, it is necessary to adjust for all 3 factors. In this article, we described construction of the North American CSI that was first used in the inaugural CINA Survival report and is, to our knowledge, the first set of site-mix adjusted cancer survival estimates for the United States (6). As expected, CSI ranges were narrower than the age-standardized all sites RSR estimates, which include different proportions of highly fatal cancers by registry jurisdiction (Figure).
The CSI is a summary measure of overall cancer survival and is intended to quantify and communicate disparities in cancer survival by race and across registry jurisdictions and to monitor progress in cancer survival over time. CSI statistics are directly comparable between registry jurisdictions and over time because they are standardized by age, sex, and primary-cancer-site distribution. This type of index has been suggested for use as an indicator for cancer control (7,22). EUROCARE routinely publishes age and case-mix standardized survival estimates that offer comparisons by country (8). Recently, the CSI has been used to demonstrate improvement in both short-term and long-term survival from all cancers combined over a 40-year period in England and Wales (22). The comparison of CSI estimates could be useful to policy makers, cancer control professionals and researchers, and other partners in population-based cancer control efforts in the United States. Although age and site-mix adjusted relative survival measures may be informative of a registry jurisdiction's performance in cancer control, the indicator values may not be easily interpreted clinically (7).
Summary measures such as the CSI offer brevity at the expense of the detail that may be found in site-specific survival estimates. As with age-adjusted incidence rates versus age-specific rates or a stock market index versus individual stock prices, the value of the CSI is in its economy for an overview of broad patterns. Likewise, 5-year relative survival is a commonly published metric (23) but may not be the best duration to measure cancer control performance for each cancer site. Comparisons among states may be different with application of the CSI weights to different survival durations. Ideally, the CSI can be used in conjunction with site-specific survival estimates and incidence rates but should be considered superior to the all sites RSRs for comparing health systems performance among registry jurisdictions.
We recommend that the weights for calculating CSI estimates for male and female patients combined and separately be used to calculate age, sex, and site-standardized RSRs for North American registry jurisdictions when a one-number summary for overall patterns of cancer survival is desired. The right-most column in Table 1 shows weights for both sexes, which were not discussed in this article, but were included for completeness. Weights for both sexes should be used only when separate survival estimates for male and female patients are not available. The resulting weighted survival measure will be adjusted for site mix but not for the proportion of male patients and female patients diagnosed with cancer.
Variation in survival by registry catchment area can be due to several factors, including but not limited to differences in demographic characteristics related to race, ethnicity, and socioeconomic status; cancer screening rates and overdiagnoses associated with screening, which affect stage distributions; access to and quality of care; and cancer registration practices that affect case ascertainment, dates of diagnosis, and follow-up (24). The 3 states with larger negative differences between the all sites and CSI estimates (Colorado, Idaho, Utah) have lower historic smoking rates than other states, which portend a lower proportion of highly fatal cancers (25). The 2 states with the largest positive differences between the all sites and CSI estimates (Kentucky and West Virginia) have higher historic smoking rates (25).

Each of the areas with all-races-combined CSI values greater than 65.0 has higher socioeconomic status than other states as measured by median income (26). Storm et al found that adjustment for case-mix is important in comparisons of relative survival across countries, and suggested additional patient characteristics such as stage, comorbidity, and risk factors might further explain such differences (9). An area for future consideration is the correlation of the North American CSI with measures of state, province, and territory-level screening and risk factor profiles, socioeconomic status, and health care access.
Our results show stark and consistent differences in survival by race for many cancer sites in the United States, a finding seen also in the first CONCORD Programme study and the latest SEER Cancer Statistics Review (23,27). Findings from our study show that sizable differences in cancer survival by race remain in the United States after adjusting for age, sex, and case mix. Of note, the CSI values for blacks spanned a narrow range from 53.0 (Mississippi) to 56.9 (North Carolina), and the disparity between whites and blacks varied little by registry. The CSI can be used to monitor progress toward eliminating disparities by race in the United States.
This study has several limitations. First, because age, sex, and case-mix standardized measures require estimates for each combination, CSI values could be calculated for only 12 of 36 US registry areas for blacks. Approximately 30,000 cases are necessary to calculate the CSI. Second, in registries for which survival time was calculated using the "presumed alive" method, survival may be biased upwards (28). However, 4 of the 5 highest CSI values among whites were in SEER registries, so this concern may not be problematic. Third, the CINA Survival reports released to date were not able to use life tables stratified by Hispanic ethnicity. Pinheiro et al have shown that in SEER data, Hispanics and Asians are more likely to have incomplete follow-up than non-Hispanic whites or blacks, and those with worse prognoses are more likely to have incomplete follow-up than those with better prognoses (29). This factor may have affected CSI values for states with high percentages of Hispanic residents, such as New Mexico and Texas. In addition, the life tables available for calculating expected survival may not reflect all factors contributing to variation in all-cause mortality, such as smoking. Finally, the US combined survival statistics may not be representative of the total national population because not all states were included.
Although the ranges in CSI values are narrower than their unadjusted variants, large disparities in cancer survival remain between blacks and whites in the United States. This summary survival measure is appropriate for interjurisdictional survival comparisons in the United States and as a baseline for monitoring progress over time in population-based cancer control efforts and in meeting public health objectives directed toward improving early diagnosis and access to evidence-based treatment. For example, the US Department of Health and Human Services has begun planning for Healthy People 2030, scheduled for release in 2020 (30). The North American CSI, using the weights described in this article, could be used to measure progress in meeting the Healthy People objective related to cancer survival from the present through 2030.

Acknowledgments
These data are based on the North American Association of Central Cancer Registries December 2015 data submission. All analyses were performed on previously collected, de-identified data. Support for cancer registries is provided by the state in which the registry is located. In the United States, registries also participate in the SEER program or the Centers for Disease Control and Prevention's (CDC's) National Program of Cancer Registries or both. This work was supported by CDC through a cooperative agreement with the Cancer Data Registry of Idaho (no. 1-U58-DP003882). The authors have no financial disclosures to report. The findings and conclusions in this report are those of the authors and do not necessarily represent the official positions of CDC or the National Cancer Institute.