Quality assessment of the registration of vulvar and vaginal premalignant lesions at the Cancer Registry of Norway

Background. A crucial factor concerning the utility of Cancer Registries is the data quality with respect to comparability, completeness, validity and timeliness. However, the data quality of the registration of premalignant lesions has rarely been addressed. High grade vulvar intraepithelial neoplasia (VIN) and vaginal intraepithelial neoplasia (VaIN) are premalignant lesions which may develop into cancer, and are often associated with infection with the human papillomavirus (HPV). The aim was to evaluate the quality of registration of VIN and VaIN at the Cancer Registry of Norway (CRN).
Material and methods. We re-collected all notifications with high grade VIN and VaIN diagnoses during 2002 to 2007 from pathology laboratories, and compared these to the data in the CRN database so as to quantitatively measure the completeness, validity and timeliness of the data.
Results. Over the period 2002 to 2007 we estimated the completeness of the 1556 VIN and 297 VaIN notifications to be 95.0% and 92.9%, respectively. The original and reabstracted topography codes showed major discrepancies for 12 of 642 (1.9%) VIN and 7 of 128 (5.5%) VaIN notifications. The original and reabstracted morphology codes for VIN and VaIN were identical for 724 out of 814 notifications. Sixteen notifications had a major discrepancy. For the period 2002 to 2007 the median time elapsed between date of diagnosis and date of registration was 436 and 441 days for VIN and VaIN cases, respectively.
Discussion. Based on the present analysis of the comparability, completeness, validity and timeliness of premalignant lesions of vulva and vagina, we conclude that the Cancer Registry of Norway is able to monitor such premalignant lesions satisfactorily.

In recent decades the role of Cancer Registries has expanded beyond the generation of descriptive statistics and may also include planning, monitoring and evaluation of cancer control activities, including cancer screening and vaccination programmes, and the follow-up of the quality of care for cancer patients [1]. A crucial factor concerning the utility of the Registry in such activities is the quality of the data with respect to comparability, completeness, validity and timeliness [2,3].
In Norway, it has been compulsory to send notifications of malignant and premalignant lesions to the Cancer Registry of Norway (CRN) since 1952 [4]. Data quality evaluation has been applied to several specific cancers, including cancer of the ovary [5], prostate [6], central nervous system [7] and pharynx [8]. A recent comprehensive evaluation of the data quality of solid and non-solid tumours concluded that the CRN yields comparable data that can be considered reasonably accurate, timely and close to complete [9]. However, such concepts have rarely been applied to the evaluation of the data quality of registration of premalignant lesions.
Norway has recently introduced a nationwide vaccination programme against HPV, the main causative agent for cervical cancer. In several clinical trials, vaccination has reduced the incidence of cervical intraepithelial neoplasia [10,11], a precursor to cervical cancer. HPV also causes premalignant lesions in the vulva and vagina [12,13]. High grade vulvar intraepithelial neoplasia (VIN) and vaginal intraepithelial neoplasia (VaIN) are premalignant lesions which may develop into cancer or spontaneously regress [14,15]. In Norway, the incidence rate of vulvar intraepithelial neoplasia increased three-fold from 1973-1977 to 1988-1992 [16].

E. Enerly et al.
The aim of the present study was to assess the data quality for the registration of premalignant lesions of high grade VIN and VaIN at the CRN. Monitoring the incidences of these premalignant lesions is important to appropriately evaluate the impact of HPV vaccination in the population. We re-collected all notifications with high grade VIN and VaIN diagnoses during 2002 to 2007 from pathology laboratories, and compared these to the data in the CRN database so as to quantitatively measure the validity and completeness of the data. We also present the comparability and timeliness of VIN and VaIN registrations at the Registry.

Source of information
The Cancer Registry of Norway (CRN) collects, codes, and stores data on patients with malignant and certain premalignant diagnoses by combining information from different sources: pathology notifications, clinical notifications, death certificates, hospital discharge diagnoses and radiation therapy data. Coding and classification of neoplasms at the CRN have been described elsewhere [9]. The major sources of information for premalignant lesions are pathology notifications from hospitals and private pathology laboratories that routinely send copies of histology reports to CRN. These notifications also carry topography and morphology codes from the Norwegian version of the Systematised Nomenclature of Medicine (SNOMED). At CRN the notifications are registered by a trained medical coder who assigns topography and morphology codes according to the rules and classification schemes used at CRN, which are mainly based on the second edition of the International Classification of Diseases for Oncology (ICD-O-2). Notifications are scanned and stored in the database and accumulated as one record for each patient. This is done using the 11-digit personal identification number (PIN) issued to every newborn Norwegian citizen and to people residing in Norway.

Histological features of high grade VIN and VaIN
Vulvar intraepithelial neoplasia (VIN) is a disease which shows histological features of disordered maturation and nuclear abnormalities, such as loss of polarity, pleomorphism, coarse chromatin, irregular nuclear membranes and mitotic figures [17]. Warty, basaloid and mixed warty-basaloid are histological subtypes of HPV-related VIN, and are referred to as high grade VIN (also known as usual VIN (classic), VIN 2-3 or undifferentiated VIN) [18]. In contrast, the differentiated VIN type is not associated with HPV. The distinction between high grade VIN and differentiated VIN is not registered at the CRN. Women with HPV-related vulvar disease have an increased risk of concurrent or subsequent cervical intraepithelial and vaginal intraepithelial neoplasia [15]. The histopathological features of VaIN are similar to those of VIN and include hyperkeratosis, nuclear enlargement and pleomorphism. We use VIN as a synonym for high grade VIN and VIN 2/3, as recommended by Sideri et al. [19], and we use VaIN as a synonym for high grade VaIN and VaIN 2/3.

Quality measurements
Comparability. The comparability of registry data can be defined as the extent to which coding and classification procedures, and the definitions for recording of specific data items, adhere to agreed international standards [2]. The topics covered here include the definition of incidence of premalignant lesions, incidence date of the lesion and definition of multiple lesions either at the same site (topography) or at other anatomically close locations.
Completeness. The completeness of cancer registry data reflects the extent to which all diagnosed incident cancer cases occurring in the population are included in the registry database [3]. To obtain a quantitative measurement of the degree of completeness, we performed independent case ascertainment by re-collecting notifications from the pathology laboratories. We contacted all of the 23 Norwegian pathology laboratories and requested them to reidentify and send to CRN all pathology notifications with histologically verified VIN and VaIN for the period 2002 to 2007. A predefined list of the Norwegian SNOMED codes for VIN and VaIN (Table I), with corresponding WHO descriptions of VIN and VaIN, was provided in the request. The period 2002 to 2007 was chosen to: 1) allow the hospital laboratories to use electronic systems to identify premalignant cases of the selected topographies and morphologies; and 2) ensure that the quality of data for estimating VIN and VaIN incidences in the prevaccination era would be satisfactory.
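The completeness estimate described above can be sketched as the share of re-collected notifications that are already present in the registry. This is a minimal illustration, not the registry's actual procedure: the function name and the use of plain notification identifiers are assumptions, and the counts in the usage example are illustrative values chosen to agree with the VIN figures reported in this paper (1556 notifications, 95.0% completeness).

```python
def completeness(recollected_ids, registry_ids):
    """Share of re-collected notifications already present in the registry.

    Both arguments are iterables of notification identifiers
    (hypothetical; any hashable key that matches across sources works).
    """
    recollected = set(recollected_ids)
    if not recollected:
        raise ValueError("no re-collected notifications to compare")
    found = recollected & set(registry_ids)
    return len(found) / len(recollected)


# Illustrative data: 1556 re-collected notifications, 78 of them
# assumed absent from the registry.
recollected = range(1556)
registry = range(78, 1556)
print(round(100 * completeness(recollected, registry), 1))  # 95.0
```

Notifications found only in the registry do not enter this ratio, which is why a surplus of registry-only notifications adds uncertainty to the estimate without changing it.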

Validity (accuracy). Validity is the proportion of cases in a dataset with a given characteristic which truly have the attribute [2]. We identified all notifications routinely sent to the registry as well as those recollected following our request. From the latter we further selected a random subset of notifications for the medical coders to reabstract. During reabstraction, the medical coders had on-screen information of the patient record available, but did not consult previously registered codes for the notification at hand. This method was chosen to mimic the setting in which the codes were originally recorded. We then compared the data registered at CRN (the original codes) with the data reabstracted from notifications re-collected from the pathology laboratories (the reabstracted codes). The accuracy analysis of morphology coding was performed on 814 (69%) randomly chosen notifications.
A total of 44 VaIN notifications had an original code of cervical topography and a reabstracted code of vaginal topography. This disparity was caused by a change of coding practice in 2009, before reabstraction, and therefore was not due to an error in coding. These notifications were excluded from the accuracy analysis of topography coding, leaving 128 VaIN notifications for analysis. The accuracy analysis of vulvar topography coding was performed on 642 notifications.
All discrepancies between original and reabstracted codes for each record were classified according to severity. We defined a major topographic discrepancy as any change in coding of sites. For a vulvar lesion a minor discrepancy was defined as variation within vulvar topographies, such as labium majus, labium minus, unspecified labium majus/minus and unspecified vulva.
We further defined a major morphologic discrepancy as any coding change between low grade lesions, high grade lesions and carcinoma. Minor morphologic discrepancies were defined as use of different codes to describe high grade VIN and VaIN. These distinctions are modified from Havener [20] with the intention of emphasising discrepancies which would affect incidence trend estimates.
Timeliness. Timeliness is considered here as the time from diagnosis to registration at CRN for each notification. The date of diagnosis is the date at which the sample was taken. We estimated timeliness based on all VIN and VaIN notifications over the registration period 2002 to 2007. Data were extracted from the CRN database in November 2009.
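The severity rules above can be expressed as a small classifier. The sketch below is illustrative only. It assumes ICD-O style topography codes in which the first three characters identify the site (C51 vulva, C52 vagina), and it uses hypothetical text labels for morphology, since the registry's actual morphology codes are not reproduced here.

```python
def classify_discrepancy(original, reabstracted, group):
    """Return 'identical', 'minor' or 'major' for one pair of codes.

    `group` maps a code to the category whose change counts as major:
    anatomical site for topography, lesion grade for morphology.
    """
    if original == reabstracted:
        return "identical"
    if group(original) == group(reabstracted):
        return "minor"
    return "major"


def topography_site(code):
    # Assumption: ICD-O style codes where the first three characters
    # give the site (C51 = vulva, C52 = vagina, C53 = cervix).
    return code[:3]


# Hypothetical morphology labels, for illustration only.
MORPHOLOGY_GRADE = {
    "VIN3": "high grade",
    "carcinoma in situ": "high grade",
    "severe dysplasia": "high grade",
    "mild dysplasia": "low grade",
    "squamous cell carcinoma": "carcinoma",
}

print(classify_discrepancy("C51.0", "C51.9", topography_site))   # minor
print(classify_discrepancy("C51.9", "C52.9", topography_site))   # major
print(classify_discrepancy("VIN3", "carcinoma in situ", MORPHOLOGY_GRADE.get))  # minor
```

Only the major category moves a notification between sites or grades, so only major discrepancies would change incidence counts.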
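As a minimal sketch of the timeliness measure, the median delay can be computed from pairs of diagnosis and registration dates. The function name and the example dates are hypothetical and are not taken from the registry.

```python
from datetime import date
from statistics import median


def median_delay_days(pairs):
    """Median number of days from diagnosis to registration.

    `pairs` is an iterable of (diagnosis_date, registration_date) tuples.
    """
    return median((registered - diagnosed).days for diagnosed, registered in pairs)


# Hypothetical notifications with delays of 10, 20 and 40 days.
example = [
    (date(2006, 1, 10), date(2006, 1, 20)),
    (date(2006, 3, 1), date(2006, 3, 21)),
    (date(2007, 5, 1), date(2007, 6, 10)),
]
print(median_delay_days(example))  # 20
```

The median is less sensitive than the mean to a few notifications with very long delays, such as a batch whose registration was postponed.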

Comparability
A patient was defined as an "incident premalignant case" upon registration of a new diagnosis of high grade VIN (or VaIN) with no history of histologically confirmed high grade lesion or cervical cancer at the same anatomical site in the past two calendar years. Women with a cancer diagnosis within four months subsequent to the high grade lesion were excluded, as these were considered "missed" cases of invasive cancer.
The CRN rule for the registration of incidence date of malignancies is to register the earliest date reported for a confirmed malignant diagnosis, and the incidence date for premalignant VIN and VaIN cases is similarly defined as the earliest date on notifications describing an incident premalignant case.
The coding of multiple primary tumours mainly follows the recommendation given by the European Network of Cancer Registries (ENCR), using groups of topography codes considered as single sites (C51 for vulva and C52 for vagina) and treating the recognition of two or more primary cancers as not time dependent. In the period 1974 to 2007 the CRN took into account the premalignant/malignant history of the woman when assigning topography to premalignant lesions of vulva and vagina. The registration of a cancer or premalignant lesion in an adjacent site prior to the new diagnosis usually resulted in the new diagnosis being assigned to the adjacent site. However, no clear rules were established at that time, leaving it up to the medical coder at CRN to interpret and assign the topography code.
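The case definition above can be sketched as a simple filter. Two points in the sketch are assumptions rather than the registry's stated rules: "the past two calendar years" is read as the two calendar years preceding the year of diagnosis (together with the year of diagnosis itself), and "four months" is approximated as 122 days. The function and parameter names are hypothetical.

```python
from datetime import date, timedelta

# Assumption: "four months" approximated as 122 days.
FOUR_MONTHS = timedelta(days=122)


def is_incident_case(diagnosis_date, prior_same_site_dates, cancer_dates):
    """Apply the incident premalignant case definition to one diagnosis.

    prior_same_site_dates: dates of earlier confirmed high grade lesions
        or cancers at the same anatomical site.
    cancer_dates: dates of cancer diagnoses for the same woman.
    """
    # Assumption: the look-back covers the two calendar years before
    # the diagnosis year, plus the diagnosis year up to the diagnosis.
    earliest_year = diagnosis_date.year - 2
    for prior in prior_same_site_dates:
        if prior < diagnosis_date and prior.year >= earliest_year:
            return False
    # Exclude women with a cancer diagnosis on or within roughly four
    # months after the high grade lesion ("missed" invasive cancer).
    for cancer in cancer_dates:
        if diagnosis_date <= cancer <= diagnosis_date + FOUR_MONTHS:
            return False
    return True


print(is_incident_case(date(2005, 6, 1), [date(2002, 12, 31)], []))  # True
print(is_incident_case(date(2005, 6, 1), [date(2003, 1, 10)], []))   # False
print(is_incident_case(date(2005, 6, 1), [], [date(2005, 8, 1)]))    # False
```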

Completeness
For the period 2002 to 2007 we performed independent case ascertainment. The records created from routinely registered pathology notifications in the CRN database were compared to records created from the notifications re-collected from the pathology laboratories. A total of 1853 notifications were analysed and showed that 78 of 1556 VIN notifications (5.0%) were missing from the CRN database, giving an estimated completeness of 95.0% for VIN; the corresponding estimate for the 297 VaIN notifications was 92.9%. We noticed that 573 more notifications (including both VIN and VaIN) were found in the registry than were received from the laboratories during re-collection (Figure 1). Of the 573 notifications, 184 were coded to the anatomical site skin by the pathology laboratories before being coded to VIN or VaIN by the medical coders at CRN (Supplementary Table I).

Validity (accuracy)
The original and reabstracted topography codes for VIN were identical for 593 of 642 notifications (92.4%). Minor and major discrepancies between the original coding and reabstracted coding were evident for 37 (5.8%) and 12 notifications (1.9%), respectively (Table II). No particular pattern of misclassification of vulvar topography codes was evident among the 37 notifications with minor discrepancy between the original and reabstracted topography (Supplementary Table II). We observed that 19 premalignant cases were registered with a topography code according to ICD-7 only, and not additionally according to ICD-O-2. These were patients with previous diagnoses coded according to ICD-7. We did not identify any missing morphology data at CRN on the notifications examined.

Discussion
To our knowledge, this is the first study to report estimates of completeness for the registration of VIN and VaIN at a Cancer Registry. We can therefore only compare our estimates to those reported for cancers. Larsen et al. [9] estimated the average completeness of invasive vulvar and vaginal cancer incidence registration at CRN to be 99.8%. Our estimates of completeness for the period 2002 to 2007 for VIN and VaIN were 95.0% and 92.9%, respectively, with some variation between years. One reason for the somewhat lower completeness of premalignant lesions than of cancer may be that some cancers are obtained from death certificates. Cancer mentioned on the death certificate allows CRN to send a reminder to the hospitals that have not yet reported a new cancer case [9]. It should also be noted that there is some uncertainty in the completeness estimates reported here, since during re-collection the pathology laboratories did not manage to identify at least 573 notifications that were previously registered as VIN or VaIN in the CRN database. It is thus possible that they may also have failed to submit notifications not already registered by the CRN. The highest percentage of notifications missing in the CRN database occurred in 2007. This may indicate that a small proportion of the 2007 notifications were not yet registered in the database at CRN at the time of data extraction. This is in line with the observation that timeliness for VIN and VaIN was 267 days in 2006, compared to 445 days in 2007. The registration of a subset of VIN and VaIN notifications from 2007 was put on hold due to internal priorities at CRN, which may explain some of the increase. The time interval from diagnosis until the notifications are sent to CRN by the pathology laboratories could be another source of variation in timeliness.
A total of 184 lesions that were assigned skin SNOMED topography codes by the pathology laboratories were coded to vulva topography by the medical coders at the CRN. This highlights the importance of a national cancer registry that can harmonise the coding practices of local pathology laboratories. Moreover, it illustrates how discrepancies in coding practices may influence completeness estimates in quality assessments of disease registration.
A VIN or VaIN lesion in a patient with a prior premalignant or malignant lesion in a nearby site has generally been assigned to the topography of the primary registration by the CRN medical coders. This has led to fewer registrations of VaIN in particular. One reason is that far more lesions occur in the cervix [21], and it is therefore more likely that a VaIN lesion has been assigned to the cervix than that a cervical lesion has been assigned to the vagina. We have changed the coding practice at CRN so that VaIN and VIN cases are assigned topography codes according to the site of the lesion, regardless of disease history at other anatomical locations. The change took effect in 2009, and we will also recode retrospectively for the years 2002 to 2008. We believe that this change of practice will give us more reliable data to monitor the results of HPV vaccinations on the incidence of HPV-related diseases of the vulva and vagina. This change of coding practice affected 44 notifications with lesions originally assigned a cervical topography code but assigned a vaginal topography code during reabstraction. These were not included in the estimation of the accuracy of topography codes assigned to notifications at CRN and reabstracted notifications.
We have modified the published definitions of the severity of misclassification of cancers to evaluate the accuracy in registration of premalignant lesions. In brief, a major discrepancy, as opposed to a minor discrepancy, affects the incidence rates of VIN and VaIN. We could not compare the accuracy of the CRN coding of VIN and VaIN to other cancer registries since no such data, to our knowledge, have been published.
We found that 9.1% of the notifications had a minor discrepancy between the original and reabstracted morphology codes. Minor discrepancies occurred even though rules for giving priority to some codes over others exist for diagnoses with more than one appropriate morphology code. The majority of minor discrepancies occurred on notifications describing VIN 3/VaIN 3, carcinoma in situ and severe atypia/dysplasia in squamous epithelial cells, for which there are separate morphology codes. The number of minor discrepancies should be reduced to a minimum by changing the coding practice. This entails using fewer morphology codes.
The fact that 1.5% of VIN and 2.2% of VaIN notifications had a major discrepancy between the original and reabstracted morphology and topography codes indicates that major misclassification is not a significant problem in the registration of high grade VIN and VaIN. The selection of morphology codes used in this evaluation of data quality was chosen to assess the feasibility of the CRN to effectively monitor how HPV vaccination programmes will impact on the incidence of HPV-related premalignant diseases of the vulva and vagina. The present analysis indicates that the comparability, completeness, validity and timeliness of high grade VIN and VaIN registration are satisfactory, and therefore that the Cancer Registry of Norway is able to monitor high grade vulvar intraepithelial and vaginal intraepithelial neoplasias with adequate precision.

Declaration of interest:
The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.