Identification of Hypertension Predictors and Application to Hypertension Prediction in an Urban Han Chinese Population: A Longitudinal Study, 2005–2010

Introduction
Research suggests that targeting high-risk, nonhypertensive patients for preventive intervention may delay the onset of hypertension. We aimed to develop a biomarker-based risk prediction model for assessing hypertension risk in an urban Han Chinese population.

Methods
We analyzed data from 26,496 people with hypertension to extract factors from 11 check-up biomarkers. Then, using a 5-year follow-up cohort, we built a Cox model for predicting hypertension development with the extracted factors as predictors. Finally, we created a hypertension synthetic predictor (HSP) by weighting each factor by its risk for hypertension and used it to develop a risk assessment matrix.

Results
After factor analysis, 5 risk factors were extracted from the data for both men and women. In the 5-year follow-up cohort, the area under the receiver operating characteristic curve (AUC) was 0.755 (95% confidence interval [CI], 0.746–0.763) for men and 0.801 (95% CI, 0.792–0.810) for women. After tenfold cross validation, the AUC remained high: 0.755 (95% CI, 0.746–0.763) for men and 0.800 (95% CI, 0.791–0.810) for women. An HSP-based 5-year risk matrix provided a convenient tool for risk appraisal.

Conclusion
Hypertension could be explained by 5 factors in a population sample of urban Han Chinese. The HSP may be useful in predicting hypertension.


Introduction
Hypertension is a worldwide public health challenge because of its high frequency and concomitant risks of cardiovascular disease and renal disease (1). Many studies have demonstrated that lifestyle modification can prevent high blood pressure, which provides a rationale for identifying high-risk people so that early lifestyle interventions can be implemented to prevent hypertension (2-4). In recent years, researchers have established hypertension prediction models for different populations, including Americans (5-8), Iranians (9), and Chinese (10,11). The 2 prediction models developed in Chinese populations (10,11) had areas under the curve (AUCs) that ranged from 71.6% to 73.5%. Their performance was limited by the small number of laboratory biomarkers they incorporated and by recall bias in questionnaire variables. This study was designed to address these limitations. The prevalence, awareness, treatment, and control of hypertension in China, a developing country, differ from those in developed countries (12,13). The purpose of this study was to develop a biomarker-based risk-prediction model for hypertension in a population of urban Han Chinese adults.

Methods
Of 95,785 people aged 18 to 88 years who received annual medical examinations from 2005 through 2010 at the Center for Health Management of Shandong Provincial QianFoShan Hospital and Shandong Provincial Hospital, 26,496 were diagnosed with hypertension at their first check-up. Of 69,289 people without hypertension at baseline, 17,471 (10,239 men and 7,232 women) who received annual clinical and laboratory examinations were selected as a 5-year follow-up cohort.
We measured the height and weight of participants, who wore light clothing and no shoes. Body mass index (BMI) was calculated as weight (kg) divided by height squared (m²). Systolic blood pressure (SBP) and diastolic blood pressure (DBP) were measured with an Omron HEM-907 monitor (QuickMedical) by the cuff-oscillometric method in the right arm of seated participants after a 5-minute rest period. Two measurements were taken, and the 2 BP values were averaged. Peripheral blood samples were obtained in the morning after a 12-hour fast to measure the following biomarkers: fasting blood glucose (FBG), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), hemoglobin (Hb), hematocrit (HCT), white blood cell count (WBC), lymphocyte count (LC), and neutrophil granulocyte count (NGC). The study was approved by the Ethics Committee of the School of Public Health, Shandong University, and written informed consent was obtained from all participants.
Hypertension was defined as diastolic blood pressure of 90 mm Hg or more, systolic blood pressure of 140 mm Hg or more, or reported use of medication known to treat hypertension. Age and biomarkers were summarized as mean (standard deviation), and Student's t test was used to compare participants with and without baseline hypertension.
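The measurement and case-definition rules above can be written out as a short Python sketch. The function names and the example readings are illustrative assumptions, not part of the study's software; only the thresholds (140/90 mm Hg), the averaging of 2 readings, and the BMI formula come from the text.

```python
from statistics import mean


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    return weight_kg / height_m ** 2


def is_hypertensive(sbp_readings, dbp_readings, on_medication=False) -> bool:
    """Apply the study's case definition to averaged blood pressure readings.

    sbp_readings / dbp_readings: the repeated measurements (mm Hg), which
    are averaged before the thresholds are applied.
    on_medication: reported use of antihypertensive medication.
    """
    sbp = mean(sbp_readings)
    dbp = mean(dbp_readings)
    return sbp >= 140 or dbp >= 90 or on_medication


# Hypothetical participant: two seated readings after a 5-minute rest
print(round(bmi(70.0, 1.75), 1))
print(is_hypertensive([138, 144], [84, 86]))
```

Averaging before thresholding matters: in the example, one systolic reading is below 140 mm Hg, but the mean of the 2 readings is not.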
First, to eliminate multicollinearity among the routine check-up biomarkers and to extract hypertension-related risk factors from them, we performed factor analysis on the correlation matrix, using principal component extraction and varimax rotation. Factors with an eigenvalue greater than 1 were retained, and biomarkers with a factor loading of at least 0.50 were used to interpret each factor. Second, on the basis of the cohort design, we built a Cox proportional hazards regression model relating the hazard function of hypertension to the extracted latent factors:
$$h_i(t) = h_0(t)\exp(\beta_1 F_1 + \beta_2 F_2 + \dots + \beta_k F_k),$$
where $h_i(t)$ is the hazard rate for the $i$th subject at time $t$, and $h_0(t)$ is the baseline hazard at time $t$. Then, a hypertension synthetic predictor (HSP) was derived by weighting each factor by its regression coefficient: $\mathrm{HSP} = \beta_1 F_1 + \beta_2 F_2 + \dots + \beta_k F_k$.
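The factor-extraction step can be illustrated with a minimal NumPy sketch, assuming principal component extraction from the correlation matrix, the eigenvalue-greater-than-1 retention rule, and a standard SVD-based varimax rotation. The data are synthetic (2 simulated latent factors driving 5 variables), and the function names are hypothetical; this is not the SAS procedure used in the study.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # SVD step on the gradient of the varimax criterion
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ (rotated ** 3 - rotated @ col_ss / p))
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break  # criterion no longer improving
        criterion = new_criterion
    return loadings @ rotation


def extract_factors(X, loading_cutoff=0.50):
    """Principal component factor extraction with the eigenvalue > 1 rule."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)       # ascending order
    order = np.argsort(eigvals)[::-1]             # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                          # retention criterion
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    rotated = varimax(loadings)
    explained = (rotated ** 2).sum(axis=0) / corr.shape[0]
    salient = np.abs(rotated) >= loading_cutoff   # loadings used for interpretation
    return rotated, explained, salient


# Synthetic data: variables 0-2 follow latent factor f1, variables 3-4 follow f2
rng = np.random.default_rng(0)
n = 2000
f1, f2 = rng.standard_normal((2, n))
X = np.column_stack([f1, f1, f1, f2, f2]) + 0.3 * rng.standard_normal((n, 5))

rotated, explained, salient = extract_factors(X)
print(np.round(rotated, 2))
print(np.round(explained, 3))
```

In this toy example each variable loads on exactly one rotated factor, which is the kind of simple structure the 0.50 loading cutoff is meant to reveal.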
After that, we calculated an HSP for each participant. Third, on the basis of the cohort design, we built a second Cox proportional hazards regression model relating the hazard function of hypertension to age and the calculated HSP:
$$h_i(t) = h_0(t)\exp(\theta_0\,\mathrm{age} + \theta_1\,\mathrm{HSP}).$$
The predicted probability of hypertension at year $t$ was calculated by the following formula:
$$P(t) = 1 - S_0(t)^{\exp(\theta)},$$
where $\theta = \theta_0\,\mathrm{age} + \theta_1\,\mathrm{HSP}$ and $S_0(t)$ is the baseline survival function at year $t$. Fourth, we used MedCalc software (23) to analyze receiver operating characteristic (ROC) curves, sensitivity, specificity, and significance (P value) to evaluate predictive performance. Finally, for each participant in the 5-year cohort study, we calculated relative absolute risk (RAR) using the following equation:
$$\mathrm{RAR} = \frac{P_j(t)}{\bar{P}_j(t)},$$
where $P_j(t)$, known as absolute risk (AR), is the probability of hypertension at year $t$, and $j$ denotes the participant's age. $\bar{P}_j(t)$ is the average probability of hypertension at year $t$ at age $j$, which can be calculated by the following model:
$$\bar{P}_j(t) = 1 - S_0(t)^{\exp(\theta_0 j + \theta_1 \overline{\mathrm{HSP}}_j)},$$
where $\overline{\mathrm{HSP}}_j$ is the mean HSP at age $j$.
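A small Python sketch may help make the AR and RAR calculations concrete. It assumes the usual Cox form for absolute risk, 1 - S0(t)^exp(theta) with theta = theta0 × age + theta1 × HSP, and assumes RAR is the ratio of a participant's AR to the AR evaluated at the mean HSP for that age. All coefficient values, the baseline survival, and the HSP values below are hypothetical placeholders, not estimates from this study.

```python
import math


def absolute_risk(age, hsp, theta_age, theta_hsp, baseline_survival):
    """Predicted probability of hypertension by year t (assumed Cox form).

    baseline_survival is S0(t), the baseline survival at the horizon t.
    """
    theta = theta_age * age + theta_hsp * hsp
    return 1.0 - baseline_survival ** math.exp(theta)


def relative_absolute_risk(age, hsp, mean_hsp_at_age,
                           theta_age, theta_hsp, baseline_survival):
    """AR of the participant divided by the AR at the mean HSP for that age."""
    ar = absolute_risk(age, hsp, theta_age, theta_hsp, baseline_survival)
    peer_ar = absolute_risk(age, mean_hsp_at_age, theta_age, theta_hsp,
                            baseline_survival)
    return ar / peer_ar


# Hypothetical parameters (NOT the study's estimates)
THETA_AGE, THETA_HSP, S0_5YR = 0.02, 1.0, 0.95

ar = absolute_risk(40, 0.5, THETA_AGE, THETA_HSP, S0_5YR)
rar = relative_absolute_risk(40, 0.5, 0.1, THETA_AGE, THETA_HSP, S0_5YR)
print(f"AR = {ar:.3f}, RAR = {rar:.2f}")
```

Under these assumptions, a participant whose HSP equals the mean for his or her age has an RAR of exactly 1, and RAR rises above 1 as HSP exceeds the age-specific mean.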
All data analyses in this study were conducted separately for men and women. We used ArcGIS 9.1 (Esri) to depict the HSP-based 5-year risk matrix for hypertension risk appraisal. All statistical analyses were performed using SAS version 9.2 (SAS Institute, Inc), and significance was set at P < .05.

Results
In our study, 26,496 of 95,785 participants had hypertension at baseline, a prevalence of 27.7% (32.6% for men and 19.5% for women). Hypertension prevalence increased with age in both men and women (Figure 1); it was higher in men than in women before age 60 and similar in the 2 groups after age 60. Of the 17,471 cohort participants without hypertension at baseline, 3,793 (2,894 men and 899 women) had hypertension at the end of year 5, a cumulative incidence of 21.7%. We calculated the distribution of age and the 11 biomarkers among participants with and without baseline hypertension (Table 1); all variables differed significantly between the 2 groups.
We then computed a correlation matrix for the 11 biomarkers (Appendix A). Exploratory factor analysis (EFA) extracted 5 latent factors from the 11 biomarkers. Table 2 shows the factor loadings obtained by principal component analysis with varimax rotation, together with the explained variance and cumulative variance. The 5 latent risk-related factors explained 72.21% of total variance for men and 72.47% for women: inflammatory factor (IF), blood viscidity factor (BVF), insulin resistance factor (IRF), blood pressure factor (BPF), and lipid resistance factor (LRF). IF was composed of WBC, LC, and NGC; BVF of Hb and HCT; IRF of FBG and TG; BPF of SBP and DBP; and LRF of BMI and HDL-C.
ROC curves for the hypertension prediction models are in Figure S1 (Appendix B). The AUC was 75.5% for men and 80.1% for women, and it was 75.5% and 80.0%, respectively, after tenfold cross validation. The HSP-based 5-year risk matrices (Figure 2, graphs A1 and A2 for men and graphs B1 and B2 for women) provide a convenient tool for hypertension prediction in clinical practice and health management. For example, if a man aged 40 has a check-up in which the 11 routine biomarkers (BMI, SBP, DBP, FBG, TG, HDL-C, Hb, HCT, WBC, LC, NGC) are measured, his HSP can be calculated using the formula in Figure 2. His absolute risk (AR) and RAR can then be read from graphs A1 and A2 using his age and his HSP. AR gives his predicted probability of hypertension over the next 5 years, and RAR shows his hypertension risk relative to that of his peers (people aged 40).
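For readers unfamiliar with the AUC, the following sketch shows one common way to compute it: as the proportion of case/non-case pairs in which the case received the higher predicted risk, with ties counted as one half. This is a generic illustration on made-up scores, not the MedCalc procedure used in the study, and the function name is an assumption.

```python
import numpy as np


def auc_rank(scores, labels):
    """AUC as the probability that a case outranks a non-case.

    scores: predicted risks (higher means more likely hypertensive).
    labels: 1 for participants who developed hypertension, else 0.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    cases = scores[labels]
    noncases = scores[~labels]
    # Compare every case with every non-case; ties count one half
    greater = (cases[:, None] > noncases[None, :]).sum()
    ties = (cases[:, None] == noncases[None, :]).sum()
    return (greater + 0.5 * ties) / (cases.size * noncases.size)


# Made-up predicted risks for 4 participants, 2 of whom developed hypertension
print(auc_rank([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1]))
```

An AUC of 0.5 corresponds to a model that ranks cases and non-cases no better than chance, and 1.0 to perfect separation; the values of roughly 0.76 to 0.80 reported above fall between these extremes.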

Discussion
In our study, the prevalence of hypertension at baseline was higher among men (32.6%) than among women (19.5%). However, this difference narrowed with age: hypertension prevalence rose more steeply in aging women than in men, perhaps because of hormonal changes during menopause (24-26).
The distributions of age and the 11 routine health check-up biomarkers differed significantly between participants with and without baseline hypertension. Factor analysis extracted 5 latent factors from the 11 biomarkers, which not only eliminated multicollinearity but also offered an interpretation in terms of the pathogenesis of hypertension. In both men and women, the 5 factors were the inflammatory factor (IF), blood viscidity factor (BVF), insulin resistance factor (IRF), blood pressure factor (BPF), and lipid resistance factor (LRF), and these factors served as the predictors in the subsequent prediction model. The cumulative explained variance of the 5 latent factors was 72.21% for men and 72.47% for women. IF and BVF in particular accounted for the largest shares of the variance (IF contributed 23.54% for men and 24.64% for women; BVF, 15.78% for men and 15.58% for women). Similar results have been found in other studies. Evidence from human and animal studies suggests that inflammation leads to the development of hypertension and that oxidative stress and endothelial dysfunction are involved in the inflammatory cascade (27). An elevation in blood viscosity could increase peripheral resistance and play a role in the pathogenesis of essential hypertension (28-30).
Two hypertension prediction models have been developed in Chinese populations. Although the power of these prediction models (AUC range: 71.6%-73.5%) was acceptable, their risk algorithms and visualization of risk assessment still had room for improvement. Our HSP-based prediction model performed better (AUC, 75.5% for men and 80.1% for women; AUC after tenfold cross validation, 75.5% for men and 80.0% for women). One reason our results were more accurate is that we used laboratory biomarkers rather than questionnaire variables, which avoided recall bias. Another reason is that we combined factor analysis with the Cox model, which produced a better model for hypertension prediction. Finally, we developed a risk matrix with AR and RAR to represent risk assessment, which was convenient for practical application (Figure 2). For example, men and women who receive a routine health check-up can learn their AR from the risk matrix. Comparing their results with those of other participants of the same age may alert patients to their risk and guide their choice of nonpharmacologic measures to prevent hypertension.
A limitation of our study was that all participants were urban Han Chinese adults; therefore, our results may not be generalizable to other populations. More validation studies for these prediction models are needed.
This is the first hypertension prediction model developed for urban Han Chinese adults. Because of the large sample size, the estimates from our prediction model were stable, as demonstrated by the tenfold cross validation. Physicians can use the HSP-based 5-year hypertension risk matrix to measure patients' risk for hypertension, inform patients of their risks, help them choose appropriate nonpharmacologic measures to prevent hypertension, and aid in clinical counselling and decision making.