A Predictive Model and Risk Factors for Case Fatality of COVID-19

This study aimed to create an individualized analysis model of the risk of intensive care unit (ICU) admission or death for coronavirus disease 2019 (COVID-19) patients as a tool for the rapid clinical management of hospitalized patients in order to improve the resilience of medical resources. This is an observational, analytical, retrospective cohort study with longitudinal follow-up. Data were collected from the medical records of 3489 patients diagnosed with COVID-19 using RT-qPCR in the period of highest community transmission recorded in Europe to date: February–June 2020. The study was carried out in two health areas of hospital care in the Madrid region: the central area of the Madrid capital (Hospitales de Madrid del Grupo HM Hospitales (CH-HM), n = 1931) and the metropolitan area of Madrid (Hospital Universitario Príncipe de Asturias (MH-HUPA), n = 1558). Using a regression model, we observed that the different patient variables had unequal importance. Among all the analyzed variables, basal oxygen saturation was found to have the highest relative importance with a value of 20.3%, followed by age (17.7%), lymphocyte/leukocyte ratio (14.4%), CRP value (12.5%), comorbidities (12.5%), and leukocyte count (8.9%). Three levels of risk of ICU/death were established: low-risk level (<5%), medium-risk level (5–20%), and high-risk level (>20%). At the high-risk level, 13% needed ICU admission, 29% died, and 37% had an ICU–death outcome. This predictive model allowed us to individualize the risk of worse outcome for hospitalized patients affected by COVID-19.


Introduction
At the end of 2019, a novel coronavirus designated severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was first reported in the city of Wuhan, the capital of Hubei, China, and caused an outbreak of unusual viral pneumonia [1]. Being highly transmissible, this novel coronavirus disease, also known as coronavirus disease 2019 (COVID-19), has spread rapidly all over the world and has posed an extraordinary threat to global public health [2]. Due to the high mortality rate and lack of optimal therapies, understanding the clinical features is crucial to responding to COVID-19. Rapid diagnoses and effective therapies are also important interventions for controlling the pandemic. The incidence, prevalence, and rapid evolution of SARS-CoV-2 infection are changing in all countries, and thus it is necessary to develop appropriate, dynamic, and transversal protocols to face the changing needs [3].
The clinical spectrum of SARS-CoV-2 infection appears to be wide, encompassing asymptomatic infection, mild upper respiratory tract illness, and severe viral pneumonia with respiratory failure and even death, with many patients being hospitalized with pneumonia [4][5][6][7]. The rapid growth in the number of cases, especially critically ill or fatal cases, has posed an enormous challenge to health systems, particularly in some countries where there has been a sustained increase in COVID-19 cases in recent months [8]. Therefore, until an effective vaccine becomes widely available, there is an urgent need to decrease patient density, along with optimal clinical management of severe COVID-19 patients, finding new methods and approaches to achieve effective results even amidst this complex situation. This point is of special importance, since it would allow better management of resources at a time when a major economic recession has shaken the world [9].
A strategic objective for clinical management of COVID-19 patients is to recognize critically ill patients in the early phases of the disease in order to adjust the treatment plan and assign medical resources rationally to improve medical efficacy and reduce the risk of in-hospital mortality. Several studies have analyzed the clinical and laboratory characteristics of critically ill COVID-19 patients [10]. These variables may provide valuable support in establishing COVID-19 prognosis, helping to distinguish between nonsevere and severe COVID-19 cases. Among these variables, serological, biochemical, immunological, and coagulation parameters have proven their importance in the stratification of COVID-19 progression [11,12]. Shen et al. [13] described the characteristic protein and metabolite changes in the sera of severe COVID-19 patients, which might be used in the selection of potential blood biomarkers for severity evaluation. However, understanding and interpreting currently available tests to recognize and hospitalize high-risk severe COVID-19 patients remains a challenge, generating uncertainty for patients, families, and healthcare professionals [14,15]. Rubio-Rivas et al. [15] have shown the existence of different phenotypes of patients on the basis of their comorbidities and their clinical characteristics, showing how this classification can support clinical management at the healthcare level from the point of view of survival. Along these lines, other studies have unified the symptoms and signs of this disease with the aim of improving and implementing an action model [16][17][18].
Early identification of high-risk COVID-19 cases remains a crucial element in the clinical management of this disease. Therefore, establishing the prognostic score of COVID-19 patients requiring hospital admission is a relevant objective to optimize the clinical management of these patients. In our study, we analyzed the clinical characteristics of COVID-19 patients with worse outcome and non-critical illness, comparing their clinical and laboratory parameters to identify each patient's risk of developing the most severe presentation of COVID-19, helping physicians to focus on patients in greatest need.

1. Extremely low level corresponds to less than or equal to 80%.
As an alternative specification for the comorbidity score, we used the Charlson Index [21] and the Elixhauser Index [22]. For each patient, there is a dummy variable that indicates if a bed was provided in the intensive care unit (ICU) and if the patient died from COVID-19 (Table S1 presents the variables considered for the analysis).

Data Analysis
Data obtained from the study were managed in a Microsoft Office Excel database (Microsoft, Redmond, WA, USA) and in R. Differences with p < 0.05 were considered statistically significant. Quantitative variables were expressed as mean (interquartile range or 95% confidence interval (CI)) and categorical variables as the number of patients and rates (%) (95% CI). Univariate analysis was performed with Fisher's test, χ² test, and Student's t-test, as appropriate.
To assess the risk of worse outcome (ICU or death) by COVID-19 at hospital admission, we followed a three-step process. First, we modeled the risk of worse outcome by specifying logistic regressions with different covariates, considering only data from the CH-HM (first cohort). Second, we validated the models trained with the CH-HM dataset by testing them on the MH-HUPA dataset (second cohort). Third, we built final versions of the models using the whole dataset (total cohort). No imputations of the data were made.

Table 1. Description of the comorbidities of the coronavirus disease 2019 (COVID-19) patients included in the study. Results include those of the Grupo Hospitales de Madrid (CH-HM) at the central area of Madrid, and those of the Hospital Universitario Príncipe de Asturias (MH-HUPA) at the metropolitan area of Madrid. We also present the results of the two cohorts together (total cohort). For each comorbidity, we present the percentage of patients (with respect to the group they belong to) suffering from the comorbidity, and the odds ratio (OR) of death for the indicated comorbidity in the studied population. The weight each comorbidity is given in the comorbidity score is estimated as a weighted average between the odds ratio of the disease (in the training set, which is the CH-HM cohort for the estimates EM-1 to EM-5, and the global (total) cohort for EM-6 to EM-8) and the odds ratio of the comorbidity group, which is included as a Bayesian prior of 30 patients. Hence, for a comorbidity C diagnosed in 100 patients, from group G, the weight w is given as $w = \frac{100\,OR_C + 30\,OR_G}{130}$, where $OR_C$ is the odds ratio of the disease and $OR_G$ is that of the group. In the table, we present the weights for the CH-HM cohort. For example, for cor pulmonale, the weight is $w = \frac{100 \times 2.37 + 30 \times 2.21}{130} = 2.33$.

We started with a basic model where the target was the dummy variable ICU or death and the covariates were variables available before any medical test: age, gender, and comorbidities. We built a second model including oxygen saturation, which is the cheapest, fastest, and most readily available variable to obtain. Our third model included the standard indicators of a blood analysis: C-reactive protein (CRP), leukocyte level, lymphocyte/leukocyte ratio, and D-dimer. Finally, to gain a complete understanding of the model beyond the significance of the parameters, we estimated the relative importance of each variable included in the model, using a methodology for model interpretation suggested by Lundberg and Lee [23]: SHAP (SHapley Additive exPlanations) values. In brief, given an observation $x = (x_1, \ldots, x_J)$, the SHAP value of feature j on instance x corresponds to the way in which the concrete value of feature j on x modifies the output of the model with respect to other instances that share some features with x but not j. Here, X is the set of observations and $E[X_j]$ is the average value of feature j on X. Then, denoting by N the total number of observations and by $\phi_j(x_i)$ the SHAP value of feature j on observation $x_i$, we can estimate the relative importance of feature j in the model, $RI_j$, as Equation (1):

$$RI_j = \frac{\sum_{i=1}^{N} \left|\phi_j(x_i)\right|}{\sum_{k=1}^{J} \sum_{i=1}^{N} \left|\phi_k(x_i)\right|} \quad (1)$$

Equation (1). Formula for calculating the relative importance of the parameters to be analyzed, where the relative weight of each variable included in the model is obtained using SHapley Additive exPlanations (SHAP) values. Concretely, the relative importance of a variable is the sum of the absolute marginal contributions of the variable to the output of the model over all the observations (i = 1, . . . , N), divided by the sum of the absolute marginal contributions of all the variables (k = 1, . . . , J) over all the observations (i = 1, . . . , N).
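The comorbidity weight described in the Table 1 caption can be illustrated with a short sketch. The function name and the generalization from 100 patients to an arbitrary patient count are assumptions made for illustration; the source only states the formula for a comorbidity diagnosed in 100 patients with a prior of 30 patients.

```python
def comorbidity_weight(or_c, or_g, n_patients, prior_n=30):
    """Weighted average of a comorbidity's odds ratio (or_c) and the odds
    ratio of its comorbidity group (or_g).

    The group odds ratio acts as a prior worth `prior_n` patients, so rare
    comorbidities are pulled towards their group value.
    Assumption: the 100 in the source formula is the number of patients
    diagnosed with the comorbidity.
    """
    return (n_patients * or_c + prior_n * or_g) / (n_patients + prior_n)


# Worked example from the Table 1 caption (cor pulmonale, CH-HM cohort).
w = comorbidity_weight(or_c=2.37, or_g=2.21, n_patients=100)
print(round(w, 2))  # 2.33
```

With this form, a comorbidity seen in very few patients would receive a weight close to its group odds ratio, which may be the motivation for including the prior.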

Ethical Approval
This study was conducted according to basic ethical principles (autonomy, non-maleficence, beneficence, and distributive justice); its development followed the standards of Good Clinical Practice and the principles enunciated in the latest revision of the Declaration of Helsinki (2013) and the Oviedo Convention (1997). The project was approved by the ethics committee of the University Hospital Príncipe de Asturias (HUPA-04062020).

Patient Characteristics
We evaluated a total of 3489 patients, distributed in two health areas in the Madrid region. The mean age was 67.6 years, with 41.7% being female. The average total number of comorbidities was 3.3, with a Charlson Index of 0.9 and an Elixhauser Index of 2.0 (Table 2). The basal oxygen saturation level was extremely low in 4.4% of patients and low in 15.7%. The highest percentage of patients (45.7%) maintained a medium basal oxygen saturation. The mean value of CRP was 74.3 µg/dL, the mean leukocyte value was 7.8 × 10³/L, the lymphocyte/leukocyte ratio was 18.7%, and the mean D-dimer value was 2753.1 mg/L (Table 2). A total of 241 patients (6.9%) were admitted to ICU and 597 patients (17.3%) passed away. In total, 774 patients (22.1%) had an episode of ICU admission or death (Table 2).

Table 2. Description of the COVID-19 patients included in the study. Results include those of the Grupo Hospitales de Madrid (CH-HM) at the central area of Madrid, and those of the Hospital Universitario Príncipe de Asturias (MH-HUPA) at the metropolitan area of Madrid. We also present the results of the two cohorts together (total cohort). Global description of the patients included in the study showing the mean (standard deviation) values of the demographic, clinical, and analytical characteristics of the patients, as well as the percentage of admissions to the intensive care unit (ICU) and deaths. CRP = C-reactive protein, n = number.

The mean age of the CH-HM patient group (n = 1931) was 68.4 years, and 41% were women. The average number of comorbidities was 2.9 (Tables 1 and 2). The Charlson Index and Elixhauser Index for the CH-HM patients' comorbidities were 0.7 and 1.8, respectively. Regarding basal oxygen saturation level, 4% of patients had extremely low levels and 16.7% had low levels, while the highest percentage (30.1%) had medium levels.
Assessing analytical values in the CH-HM patient group, we found that the mean CRP was 73.8 µg/dL, leukocytes were 7.8 × 10³/L, lymphocyte/leukocyte ratio was 18.9%, and D-dimer was 2480.9 mg/L (Table 2). A total of 133 (6.9%) patients belonging to this group were admitted to the ICU and 278 (14.4%) passed away. In total, 371 (19.2%) patients had ICU admission or death (Table 2).

With regard to the MH-HUPA patient group (n = 1558), the mean age was 66.7 years and 42.5% were female. The mean number of comorbidities in this group was 3.8, and the Charlson Index and Elixhauser Index were 0.7 and 1.0, respectively. In these patients, basal oxygen saturation was extremely low in 4.9%, low in 14.3%, and medium in 68.5%. With regard to analytical data, the mean CRP was 75.0 µg/dL, mean leukocyte values were 73.3 × 10³/L, lymphocyte/leukocyte ratio was 18.6%, and D-dimer was 3180.2 mg/L. A total of 109 (7%) patients were admitted to ICU, and 327 (20.9%) patients passed away, with 399 (25.6%) patients having been admitted to ICU or having passed away (Table 2).
The average variance inflation factor (VIF) of the covariates for the CH-HM data was 1.19, and the maximum was 1.36. For the whole dataset (CH-HM and MH-HUPA), the average VIF was 1.20, and the maximum was 1.33. Hence, there was no evidence of a multicollinearity problem.
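As an illustration of the multicollinearity check, the following minimal sketch computes VIFs from the inverse of the covariate correlation matrix, using the standard identity that the VIF of covariate j equals the j-th diagonal element of that inverse. The covariate values are invented toy numbers, not study data, and the helper names are hypothetical; the source does not state how the VIFs were computed.

```python
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    k = len(columns)
    return [
        [
            sum((columns[a][i] - means[a]) * (columns[b][i] - means[b])
                for i in range(n)) / (n * sds[a] * sds[b])
            for b in range(k)
        ]
        for a in range(k)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * c for v, c in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF_j is the j-th diagonal entry of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Toy covariates (hypothetical values): age, CRP, oxygen saturation.
toy = [
    [50.0, 60.0, 70.0, 80.0, 90.0],
    [10.0, 40.0, 20.0, 80.0, 50.0],
    [95.0, 93.0, 90.0, 85.0, 88.0],
]
values = vif(toy)
print("average VIF:", sum(values) / len(values), "max VIF:", max(values))
```

A VIF of 1 means a covariate is uncorrelated with the others; averages and maxima close to 1, as reported above, would therefore suggest little collinearity.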

Empirical Model and Results
To assess the risk of severe outcomes (ICU or death) of COVID-19 patients at hospital admission, we implemented a three-step process. Firstly, we built a model to analyze the importance of age, sex, and comorbidities using CH-HM patients (Table 3, EM-1). The parameters analyzed were statistically significant and the area under the curve was 0.7470. Secondly, the level of basal oxygen saturation was added to the model (Table 3, EM-2). Finally, analytical parameters (CRP, leukocytes, lymphocyte/leukocyte ratio, and D-dimer) were included. In the final model, all coefficients maintained their significance (Table 3, EM-3).
Additionally, we tested alternative specifications of comorbidities, such as the Charlson Index (Table 4, EM-4) or the Elixhauser Index (Table 4, EM-5). The comorbidity score adjusted for COVID-19 is slightly more informative than global scores.
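The nested specifications EM-1 to EM-3 can be written schematically as logistic regressions on the probability $p_i$ of ICU admission or death for patient i. This is only a sketch: the coefficient symbols are hypothetical, the quadratic age term follows the remark in the Discussion that age enters in a quadratic form, and coding oxygen saturation as a vector of level indicators $\mathrm{sat}_i$ is an assumption about the covariate coding.

```latex
\begin{aligned}
\text{EM-1:}\quad \operatorname{logit}(p_i) &= \beta_0 + \beta_1\,\mathrm{age}_i + \beta_2\,\mathrm{age}_i^2
   + \beta_3\,\mathrm{female}_i + \beta_4\,\mathrm{comorb}_i \\
\text{EM-2:}\quad \operatorname{logit}(p_i) &= \text{(EM-1 terms)} + \gamma^{\top}\mathrm{sat}_i \\
\text{EM-3:}\quad \operatorname{logit}(p_i) &= \text{(EM-2 terms)} + \delta_1\,\mathrm{CRP}_i + \delta_2\,\mathrm{leuk}_i
   + \delta_3\,\mathrm{ratio}_i + \delta_4\,\mathrm{Ddimer}_i
\end{aligned}
```

Here $\operatorname{logit}(p) = \log\frac{p}{1-p}$, and $\mathrm{comorb}_i$ stands for the comorbidity score, which EM-4 and EM-5 replace with the Charlson Index and the Elixhauser Index, respectively.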

Robustness Check
To validate the quality of the model for patients that were not included in the training set and to check that it was not biased by any particular medical procedure implemented at the CH-HM, we tested the constructed model in 1558 patients from the MH-HUPA. The accuracy of fit of the model when applied to the MH-HUPA patients was almost the same as when it was applied to the CH-HM. Hence, the model was shown to be robust to different hospitals and did not present any bias, as in all cases the AUC was greater than 0.7 and the difference in the AUC between the two cohorts was lower than 2.2 percentage points (Table 5).
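The comparison of discrimination between the training and validation cohorts can be sketched as follows. The AUC is computed here as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it; the labels and scores are invented toy values, not study data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks from a model fitted on the training cohort.
train_labels, train_scores = [0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80]
valid_labels, valid_scores = [0, 1, 0, 1, 1], [0.20, 0.30, 0.50, 0.70, 0.90]

auc_train = auc(train_labels, train_scores)
auc_valid = auc(valid_labels, valid_scores)
print("training AUC:", auc_train)
print("validation AUC:", auc_valid)
print("absolute difference:", abs(auc_train - auc_valid))
```

In this kind of check, a small gap between the two AUC values is what supports the claim that the model transfers to a hospital that was not used for fitting.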

Final Models and Interpretation
Once the initial models were validated, we ran the models using the complete cohort including patients from both groups, namely, the CH-HM and the MH-HUPA. Estimated coefficients, levels of significance, and AUC were almost identical to those obtained with the CH-HM dataset (Table 6).
Finally, to gain a complete understanding of the model beyond the significance of the parameters, we estimated the relative weight of each variable included in the model using SHAP values (Equation (1)). Table 7 presents the relative importance of each variable in the estimates EM-6 to EM-8. The highest relative importance for the complete model (EM-8) was for basal oxygen saturation (20.3%), followed by age (17.7%), lymphocyte/leukocyte ratio (14.4%), and CRP and comorbidities (12.5% each). Smaller values of relative importance were observed for leukocytes (8.9%), female gender (6.9%), and D-dimer levels (6.7%).
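The relative importance in Equation (1), that is, each variable's share of the summed absolute SHAP values, can be sketched as below. The SHAP values are computed with the closed form for a linear model with independent features, where the contribution of feature j is its coefficient times the deviation of the feature from its mean. That closed form is an assumption made for this illustration, since the exact expression used in the study is not recoverable from the text, and the coefficients and data are toy values.

```python
from statistics import mean


def linear_shap(coefs, rows):
    """SHAP values on the linear-predictor scale for a linear model,
    assuming independent features: phi_j(x) = beta_j * (x_j - mean_j)."""
    means = [mean(col) for col in zip(*rows)]
    return [
        [b * (x - m) for b, x, m in zip(coefs, row, means)]
        for row in rows
    ]


def relative_importance(shap_rows):
    """Share of each feature in the total absolute SHAP contribution."""
    totals = [sum(abs(v) for v in col) for col in zip(*shap_rows)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


# Toy example: two features, three observations (hypothetical values).
coefs = [1.5, -1.0]
rows = [[0, 1], [2, 3], [4, 2]]

ri = relative_importance(linear_shap(coefs, rows))
print(ri)  # [0.75, 0.25]
```

Because the shares are normalized by the grand total, they sum to 1 across variables, which is why the percentages reported for the model variables can be read as parts of a whole.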

Categories of Individualized Risk
Our results establish a clinical score that allows us to estimate the individualized risk for severe outcomes in hospitalized COVID-19 patients. Examples of this application are included in the Supplementary Materials (Table S2). Furthermore, the model allows us to categorize the risk of ICU admission or in-hospital mortality into three risk categories (Figure 1). The low-risk category (<5%), by our definition, contained 554 patients with a mean age of 53.2 years, with 53.0% being female. At this low-risk level, the mean number of comorbidities was 1.0, and the Charlson Index and Elixhauser Index were both 0.3. It is important to note that the highest percentage of patients (74.0%) had adequate basal oxygen saturation. The analytical parameters were 27.9 for CRP, 5.5 for leukocytes, 29.0% for lymphocyte/leukocyte ratio, and 798.1 for D-dimer. At this low-risk level, only 3% of patients had to be admitted to ICU, 1.0% died, and 3% had an ICU–death outcome (Figure 1).
Second, we identified a medium-risk category (5–20%) for ICU admission or in-hospital mortality that included 933 patients with a mean age of 66.3 years, with 43.0% being female. Within this medium-risk level, the mean number of comorbidities was 2.2, and the Charlson Index and Elixhauser Index were both 0.6. A total of 54.0% of the patients had a medium level of basal oxygen saturation, and analytical parameters were 53.6 for CRP, 6.7 for leukocytes, 21.0% for lymphocyte/leukocyte ratio, and 1302.5 for D-dimer. At this medium-risk level, only 5.0% needed to be admitted to ICU, 5.0% died, and 9.0% had an ICU–death outcome (Figure 1).
Finally, we identified a high-risk category (>20%) that included 1177 patients. The mean age in this group was 75.4 years, with 31.0% being female. At this high-risk level, the mean levels of comorbidities were the highest, with 4.8 for the Charlson Index and an Elixhauser Index of 1.2. It is important to note that the highest percentage of patients (48.0%) had basal oxygen saturation at medium levels and only 13.0% had adequate levels. The analytical parameters were 110.0 for CRP, 9.9 for leukocytes, 37.0% for lymphocyte/leukocyte ratio, and 4857.1 for D-dimer. At this high-risk level, 13% were admitted to ICU, 29.0% died, and 37% had an ICU–death outcome (Figure 1).

Figure 1.
Graphic representation to classify the importance of the study variables in relation to admission to the ICU and death. Three levels of risk are described in this model: low (<5%), moderate (5-20%), and high (>20%). The results are expressed as mean (standard deviation). CRP = C-reactive protein mean, ICU = intensive care unit.
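The three risk levels can be expressed as a simple mapping from a predicted probability of ICU admission or death to a category. The thresholds follow the definitions above (<5%, 5–20%, >20%); the function name is hypothetical, and treating exactly 5% and exactly 20% as medium risk is an assumption derived from the strict inequalities used for the low and high levels.

```python
def risk_category(probability):
    """Map a predicted probability of ICU admission or death to a risk level."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability < 0.05:
        return "low"
    if probability <= 0.20:
        return "medium"
    return "high"


# Hypothetical predicted risks for three patients.
for p in (0.02, 0.12, 0.37):
    print(p, risk_category(p))
```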

Discussion
According to our results, clinical and analytical parameters can be used to identify patients at high risk of severe forms of COVID-19. The use of predictive models has proven useful in the detection of high-risk patients, which may be a good tool for implementing rapid response interventions and reducing mortality [24]. At present, numerous authors have described how, in large cohorts, it is possible to define a series of standards that allow clinical decision-making [17]. In the present study, we present a predictive model for patients with COVID-19 in order to identify determinants of ICU admission and death. In our regression model, the basal oxygen saturation level was found to be the most important predictive clinical marker to detect cases in which additional support is needed due to their higher risk of death. This was followed by age, lymphocyte/leukocyte ratio, serum levels of CRP, the presence of comorbidities, and leukocytes. Our results are partly consistent with previous studies such as Allenbach et al. [25], who described that older age, respiratory insufficiency, higher CRP levels, and lower lymphocytes in blood are strongly associated with an increased risk of ICU admission and mortality. Similarly, Xie et al. [26] built a multivariable model on admission, reporting oxygen saturation, age, lymphocyte count, and lactate dehydrogenase as independent predictors of mortality.
A wide range of studies have highlighted the use of oxygen saturation as a predictor of severity [27] or hospitalization [28]. Likewise, monitoring oxygen saturation is an excellent way to reduce bed demands, thus allowing a more efficient management of the available resources [29]. Hypoxemic respiratory failure is the main cause of ICU admission [30]. Consistent with this study, Zhao et al. [31] reported that oxygen saturation could be used either as a significant variable predicting ICU admission or as a predictor of mortality in patients with COVID-19. In our study, this parameter had a relative importance of 20.3% for prognosis in patients with COVID-19, making it the most important factor to consider in the prediction of both ICU admission and subsequent mortality.
CRP is an easily and commonly measured inflammatory marker, which appears to be increased in COVID-19 patients. Related to disease severity, it has been suggested as a prognostic marker in several studies, and also as a means to evaluate ICU admission [32]. For example, a CRP concentration of 100 mg/L has been described as implying an increased risk of ICU admission, prolonged length of stay, and increased one-month mortality [33]; in other observational studies, serum levels of CRP allowed the researchers to recognize alarming cases of COVID-19 [34]. CRP also correlates with decreased levels of red blood cells, neutrophil count, and lymphocytes in the early stages of the disease [32,35]. Increased CRP levels may reflect the diameter of the lung lesion as well, as both variables have been shown to correlate positively [36]. Immunological parameters including higher leukocytes and reduced lymphocytes directly correlate with severity and worse prognosis in ICU patients [37]. Our study identified lymphocyte/leukocyte ratio (14.4%) and leukocytes (8.9%) as major predictive factors of both ICU admission and mortality, thereby denoting the relevance of immunological markers in COVID-19 patients.
Furthermore, our research also considered age (17.7%), comorbidities (12.5%), female gender (6.9%), and D-dimer value (6.7%) as independent variables of ICU admission and mortality. In this context, Larsson et al. [38] showed the central role of age (>59 years old) in ICU admission and mortality. In estimations EM-1 to EM-3, age is associated, in a quadratic form, with a higher risk of fatal outcome. Women are more resistant to COVID-19 than men, as reported in [39,40], among others. Previous comorbidities are also associated with a higher risk of mortality, which is consistent with numerous studies [41], as are low oxygen saturation levels; high levels of CRP, leukocytes, and D-dimer; and a low lymphocyte/leukocyte ratio [27,42].
Haase et al. [43] associated age, male gender, and different comorbidities such as chronic obstructive pulmonary disease (COPD) with an increased risk of death in 323 patients admitted to ICUs. Consistent with these results, Zhao et al. [31] also reported age and COPD as major variables predicting mortality. Likewise, type 2 diabetes mellitus has also been postulated as a direct risk factor for ICU admission and mortality in COVID-19, being the second most common comorbidity in these patients [44]. As in our study, most hospitalized patients are male [45]. Contrary to our study, this has also been found to be associated with higher risk of mortality [46]. Notwithstanding, the real impact of sex on COVID-19 admission and mortality remains to be elucidated, as there is some discordance between the different data [47]. We show that female gender is an important factor to consider in ICU admission and as a prognostic factor in patients with COVID-19. Coagulation disorders are a common problem in COVID-19, with an approximate incidence of one in three patients [48]. D-dimer appears to be elevated in >95% of ICU patients [49]. Thus, in other studies, D-dimer effectively predicts in-hospital mortality, showing its potential for early prognosis in patients with COVID-19 [50]. Our study showed a significant association between this parameter and mortality and ICU admission, which may provide helpful support for clinicians in the management of these patients.
Recently, different studies have attempted to classify patients by phenotypes [18]. Our study allowed us to obtain risk levels regardless of specific patient phenotypes, taking into account only rapidly available clinical parameters that allow effective decision-making to save patients' lives.
The reduced number of analytical variables of the patients included in this study may be considered a limitation. The relevance of genetic and biological factors for the susceptibility to and evolution of COVID-19 has been described. There is increasing evidence that deficiency or overexpression of various immunological molecules and cellular populations, as well as different genetic factors, condition the clinical evolution and severity of COVID-19 [51][52][53]. The inclusion of biological parameters of the immune-inflammatory and neuro-endocrine metabolic response of the patient to SARS-CoV-2 infection may improve the understanding of the pathogenesis of the disease and the clinical and therapeutic management of COVID-19 patients. However, the aim of our study was to develop a predictive and individualized model of risk of worse outcome for patients hospitalized with COVID-19. The described model allows for a quick and effective decision-making process for COVID-19 patients in medical emergency rooms and at hospital admission. This is an important point that can be considered one of the limitations of our study. However, for the first time, we managed to develop a rapid risk model for medical emergencies.

Conclusions
Our results show a model that allows for prediction of the level of risk of ICU admission or death for a patient with a diagnosis of COVID-19 (by PCR). It provides the advantage of considering simple covariates available prior to any medical examination, such as age, sex, and comorbidities, together with rapidly accessible quantitative parameters in the hospital emergency department (basal oxygen saturation and CRP levels). Our model estimated that 37.0% of patients classified as high-risk are admitted to ICU or die.
Supplementary Materials: The following are available online at https://www.mdpi.com/2075-4426/11/1/36/s1. Table S1: variables considered for the analysis. Table S2: examples of this application.
Institutional Review Board Statement: The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board HUPA (HUPA-04062020).
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: The data presented in this study are available on request from the corresponding author.