HIV infection and COVID-19 death: population-based cohort analysis of UK primary care data and linked national death registrations within the OpenSAFELY platform

Background: It is unclear whether HIV infection is associated with risk of COVID-19 death. We aimed to investigate this in a large-scale population-based study in England. Methods: Working on behalf of NHS England, we used the OpenSAFELY platform to analyse routinely collected electronic primary care data linked to national death registrations. People with a primary care record for HIV infection were compared to people without HIV. COVID-19 death was defined by ICD-10 codes U07.1 or U07.2 anywhere on the death certificate. Cox regression models were used to estimate the association between HIV infection and COVID-19 death, initially adjusted for age and sex, then adding adjustment for index of multiple deprivation and ethnicity, and finally for a broad range of comorbidities. Interaction terms were added to assess effect modification by age, sex, ethnicity, comorbidities and calendar time. Results: 17.3 million adults were included, of whom 27,480 (0.16%) had HIV recorded. People living with HIV were more likely to be male, of black ethnicity, and from a more deprived geographical area than the general population. There were 14,882 COVID-19 deaths during the study period, with 25 among people with HIV. People living with HIV had nearly three-fold higher risk of COVID-19 death than those without HIV after adjusting for age and sex (HR=2.90, 95% CI 1.96-4.30). The association was attenuated but risk remained substantially raised, after adjustment for deprivation and ethnicity (adjusted HR=2.52, 1.70-3.73) and further adjustment for comorbidities (HR=2.30, 1.55-3.41). There was some evidence that the association was larger among people of black ethnicity (HR = 3.80, 2.15-6.74, compared to 1.64, 0.92-2.90 in non-black individuals, p-interaction=0.045) Interpretation: HIV infection was associated with a markedly raised risk of COVID-19 death in a country with high levels of antiretroviral therapy coverage and viral suppression; the association was larger in people of black ethnicity.


Introduction
Since it emerged in late 2019, SARS-CoV-2, the virus that causes coronavirus disease-19 , has infected over 14 million people worldwide, causing over 600,000 deaths to date. 1 Older age and male gender have been strongly associated with more severe outcomes; several comorbidities, including those that involve immunosuppression, also appear to be associated with higher risk of COVID-19 death. 2 However, there has been limited evidence to date on how human immunodeficiency virus (HIV) infection may affect risk of poor outcomes from COVID- 19. 3 There is mixed evidence on the importance of HIV to previous respiratory virus epidemics. HIV has been associated with a higher risk of severe outcomes from respiratory infections including seasonal influenza, 4,5 and people living with HIV at any stage of infection are considered a clinical risk group in seasonal influenza vaccination guidance in the UK. 6 But the importance of HIV infection to outcomes from 2009 H1N1 pandemic influenza was unclear, with no substantive evidence that HIV-infected individuals were at higher risk of being infected or experience poor outcomes, unless at an advanced stage of immunosuppression. 7 Evidence from the present SARS-CoV-2 pandemic is limited. A large population-based cohort study in South Africa found COVID-19 mortality risk among people living with HIV to be double the risk of those without HIV. 8 A high prevalence of critical illness was observed among HIV-infected patients with COVID-19 in Madrid, though there was no non-HIV comparison group. 9 Other small studies of hospitalised patients have failed to detect any raised risk of severe outcomes in people living with HIV. 10,11 We therefore aimed to investigate the association between HIV infection and COVID-19 death using populationbased data from England. We conducted our analysis within the OpenSAFELY platform which includes largescale primary care data from 40% of the English population linked to national death registrations. Information on antiretroviral treatment and viral load was not available for this study, due to the health service delivering these treatments through specialist clinics, and restrictions on the sharing of data that are deemed potentially sensitive. 12 However, in the UK, 94% of individuals are treated with antiretroviral drugs and have undetectable viral load, 13 thus our findings should generalise to settings with similarly high levels of HIV control, and may represent a "best-case" scenario for associations between HIV infection and COVID-19 outcomes in other settings.
. CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. . https://doi.org/10.1101/2020.08.07.20169490 doi: medRxiv preprint

Study design and study population
A retrospective cohort study comparing the risk of COVID-19 death among people living with and without HIV was carried out within OpenSAFELY, a new data analytics platform in England created to address urgent COVID-19 related questions, which has been described previously. 2 We used routinely-collected electronic data from primary care practices using The Phoenix Partnership (TPP) SystmOne software, covering approximately 40% of the population in England, linked to Office of National Statistics (ONS) death registrations. We included all adults (aged 18 years or over) alive and under follow-up on 1 st February 2020, and with at least one year of continuous GP registration prior to this date, to ensure that baseline data could be adequately captured. We excluded people with missing age, sex, or index of multiple deprivation, since these are likely to indicate poor data quality.

Outcome, exposure and covariates
The outcome was COVID-19 death, defined as a record for death in linked ONS data with the ICD-10 codes U07.1 ("COVID-19, virus identified") or U07.2 ("COVID-19, virus not identified") anywhere on the death certificate. 14 The main exposure was HIV status, and covariates considered in the analysis included age (at 1 st February 2020, grouped as 18-39, 40-49, 50-59, 60-69, 70-79 and ≥80 years for descriptive analysis, and parametrised as a 4knot restricted cubic spline in regression models), sex, ethnicity (White, Mixed, South Asian, Black, Other, based on self-report using categories from the UK census), obesity (categorised as class I [body mass index 30-34.9kg/m 2 ], II [35-39.9kg/m 2 ], III [≥40kg/m 2 ]), smoking status (never, former, current), index of multiple deprivation quintile (derived from the patient's postcode at lower super output area level), and comorbidities considered potential risk factors for severe COVID-19 outcomes, namely diagnosed hypertension, chronic respiratory diseases other than asthma, asthma (categorised by use of oral steroids), chronic heart disease, diabetes (categorised according to the most recent glycated haemoglobin (HbA1c) recorded in the 15 months prior to 1 st February 2020), non-haematological and haematological cancer (both categorised by recency of diagnosis, <1, 1-4.9, ≥5 years), reduced kidney function (categorised by estimated glomerular filtration rate derived from the most recent serum creatinine measure (30-<60, <30 mL/min/1.73m 2 ), chronic liver disease, stroke or dementia, other neurological disease (motor neurone disease, myasthenia gravis, multiple sclerosis, Parkinson's disease, cerebral palsy, quadriplegia or hemiplegia, and progressive cerebellar disease), organ transplant, asplenia (splenectomy or a spleen dysfunction, including sickle cell disease), rheumatoid arthritis/lupus/psoriasis, and other immunosuppressive conditions (permanent immunodeficiency ever diagnosed, or aplastic anaemia or temporary immunodeficiency recorded within the last year).
Information on HIV and all covariates was obtained by searching TPP SystmOne records prior to 1 st February 2020 for specific coded data, based on a GP subset of SNOMED-CT mapped to Read version 3 codes. Over 70 codes for HIV were included, but >90% of people living with HIV were identified from just four codes, namely: is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. .
(4%). All codelists, along with detailed information on their compilation is available at https://codelists.opensafely.org for inspection and re-use by the wider research community.

Statistical analysis
Follow-up time for COVID-19 mortality was from 1 st February 2020 until the date of COVID-19 death or 22 nd June 2020, which was the last date for which mortality data were complete. The competing risk of death from causes other than COVID-19 was accounted for by censoring non-COVID deaths, so our analysis focussed on causespecific hazards. 15 Crude rates of COVID-19 in the HIV and non-HIV groups were calculated, and we then used Cox regression models to estimate the association between HIV infection and COVID-19 mortality, initially adjusted for age and sex; then adding adjustment for index of multiple deprivation and ethnicity; and finally adding adjustment for all comorbidities listed above to minimise confounding; it should be noted that the fully adjusted model would not capture any effect of HIV mediated through an increased risk of the included comorbidities. We additionally adjusted for history of hepatitis C infection in a post-hoc analysis, as this might be a marker of injection drug use. Models were stratified by the Sustainability and Transformation Partnership (STP, a UK National Health Service administrative region) of the patient's general practice to allow for geographical differences in baseline hazards. Multiple imputation (10 imputations) was used to account for missing ethnicity, with the imputation model including all covariates from the main model and an indicator for the outcome. Those with missing body mass index were assumed to be non-obese, and those with missing smoking data were assumed to be never-smokers; we did not use multiple imputation for these variables as they are expected to be missing not at random in UK primary care. 16 In sensitivity analyses, we excluded individuals with missing data (complete case analysis). Proportional hazards were checked by: (i) by fitting an interaction between analysis time and the HIV variable, with analysis time categorised as 0-60, 60-90, ≥90 days from 1 st February 2020, chosen to respectively capture the period before social distancing policies in the UK would have impacted mortality, the period of peak COVID-19 mortality, and the period during which restrictions began to be eased; and (ii) by testing the slope of the Schoenfeld residuals (among those with complete ethnicity only).
To investigate effect modification by variables considered a priori to be of key potential importance, we fitted interaction terms (one at a time) between HIV status and age (<60, ≥60 years), sex, ethnicity (black vs other), and presence of comorbidities (any versus none of the comorbidities included in the analysis -a sensitivity analysis excluded hypertension from this list since a past code for hypertension may not reflect any ongoing clinical disease). These variables were dichotomised due to limited power. P-values (two-sided) were from Wald tests on the interaction terms.
Cumulative mortality curves, standardised to adjust for different covariate distributions in the HIV and non-HIV groups, were generated. First, a Royston-Parmar model with the same covariates as the fully adjusted Cox model was fitted, with the baseline hazard modelled using a 3 degrees-of-freedom spline. 17 The survival function was predicted from this model for every individual with HIV, and averaged to produce the curve for the . CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. . https://doi.org/10.1101/2020.08.07.20169490 doi: medRxiv preprint HIV group. To produce the standardised comparison curve, the survival functions were predicted and averaged again for the same individuals, but with HIV status set to 0.

Information Governance and Ethics
NHS England is the data controller; TPP is the data processor; and the key researchers on OpenSAFELY are acting on behalf of NHS England. OpenSAFELY is hosted within the TPP environment which is accredited to the ISO 27001 information security standard and is NHS IG Toolkit compliant; 18,19 patient data are pseudonymised for analysis and linkage using industry standard cryptographic hashing techniques; all pseudonymised datasets transmitted for linkage onto OpenSAFELY are encrypted; access to the platform is via a virtual private network (VPN) connection, restricted to a small group of researchers who hold contracts with NHS England and only access the platform to initiate database queries and statistical models. All database activity is logged; only aggregate statistical outputs leave the platform environment following best practice for anonymisation of results such as statistical disclosure control for low cell counts. 20   is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. .

27
,480 adults living with HIV and 17,255,425 without HIV were included ( Figure 1). People with HIV had similar median age, but a narrower age distribution overall, compared to those without HIV (Table 1). People living with HIV were more likely to be male, of black ethnicity, and from a more deprived area. 921 (3.4%) individuals with HIV had chronic liver disease, compared to 99,214 (0.6%) of those in the comparison group. 1,491 individuals with HIV (5.4%) had a history of hepatitis C compared to 38,235 (0.2%) of those without HIV.
There were 25 COVID-19 deaths among people with HIV, and 14,857 among those without HIV ( Table 2, Table   3). Estimated cumulative mortality, standardised to the covariate distribution of the HIV group, was 0.087% (95% CI 0.056-0.134) and 0.038% (0.036-0.040) in people with and without HIV respectively, at the last date of mortality data coverage (22 nd June, Figure 2).
There was some evidence that the association between HIV and COVID-19 death was larger in black individuals (HR=3.80, 2.15-6.74, compared with 1.64, 0.92-2.90 in other ethnic groups, p-interaction=0.045, Table 3 and Figure 3). There was no statistical evidence of interaction by age, sex, or presence of other comorbidities; the point estimate for the HIV-mortality association was larger for those with at least one comorbidity but confidence intervals were wide, and excluding hypertension from the list of comorbidities attenuated the difference. The estimated association between HIV and COVID-19 death appeared to be larger early in the epidemic and reduced over time, suggesting possible non-proportionality of hazards, but this was not statistically significant (p=0.26).
Further assessment for non-proportional hazards based on Schoenfeld residuals showed no evidence of nonproportionality by HIV status (p=0.32) but there was evidence of non-proportional hazards in several adjustment covariates, namely ethnicity, sex, obesity, smoking, IMD, chronic respiratory disease, diabetes, chronic liver disease, stroke/dementia, other neurological conditions, reduced kidney function, so an additional sensitivity analysis was done modelling this non-proportionality by adding interaction terms between each of these variables and time-updated calendar time; this had very little effect on the hazard ratio for HIV (adjusted HR=2.20, 1.41-3.42 after adding interaction terms).
Results were similar in sensitivity analyses restricting to those with known ethnicity (fully adjusted HR for HIV = 2.21, 1.42-3.44), and those with complete BMI and smoking data (fully adjusted HR 1.93, 1.23-3.04).
. CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. . https://doi.org/10.1101/2020.08.07.20169490 doi: medRxiv preprint

Key findings
In this large population-based study using data from the Open SAFELY platform in England, we found people with HIV to be at more than twice the risk of COVID-19 death compared to people without HIV, after accounting for demographic characteristics, lifestyle-related factors and a wide range of comorbidities. The association appeared to be particularly pronounced among people of black ethnicity, with HIV associated with a 3.8-fold higher risk of COVID-19 death in this group.

Comparison with other evidence
Few comparable data have been published. Our results are similar to those from a population-based cohort study from the Western Cape in South Africa involving 3.5 million individuals, published as a preprint, which found an overall adjusted HR of 2.14 (1.98-3.43), compared with 2.30 (1.55-3.41) in the present study. 8 It is noteworthy that the association was similar in an English setting with higher levels of antiretroviral treatment and viral suppression; however, this is consistent with stratified analyses in the South African data, which found a more than doubled risk of COVID-19 death even among those with a recent prescription for antiretroviral treatment and a recent viral load measure of <1000 copies/ml. Early data from small cohorts and case series in other countries included a high proportion of patients experiencing severe disease. [22][23][24] A single-centre cohort in Madrid found that among 51 people living with HIV and diagnosed with COVID-19, 28 required hospital admission, and 13 had severe disease; the mortality rate among COVID-19 cases co-infected with HIV in this study was noted to be around double that in similar aged individuals in the general population. 9 Associations between HIV and adverse COVID-19 outcomes appear to be smaller among patients already hospitalised with COVID-19: the HR for mortality was attenuated to 1.45 (1.14-1.84) in the Western Cape study when restricted to hospitalised patients, 8 while an analysis of patients hospitalised in New York found no difference in adverse outcomes between HIV infected and demographically similar uninfected individuals. 11 However, any role that HIV may have in increasing the risk of infection and/or development of severe disease is effectively conditioned out by restriction to hospitalised individuals who are already infected with SARS-COV-2 and likely to be experiencing severe disease at the point of inclusion. 25 The larger association between HIV and COVID-19 death among people of black ethnicity has not previously been described. We have previously shown a larger overall risk of COVID-19 death in black and minority ethnic (BAME) compared with white groups in England, 2 and a systematic review suggested that BAME individuals may have a higher risk of both acquiring COVID-19 infection, and experiencing worse clinical outcomes. 26 HIV infection could plausibly exacerbate both of these, though It is not possible to delineate from our study which parts of the pathway are more or less affected by HIV status. BAME individuals are also at higher risk of negative HIV outcomes including virological rebound. 27 Over a quarter of people with HIV in our study were black, compared to just 1.9% in the general population comparison group; understanding the reasons for the disproportionately large association between HIV and COVID-19 mortality in this group will be a priority if effective policies are to be developed to mitigate any increased risks. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. . https://doi.org/10.1101/2020.08.07.20169490 doi: medRxiv preprint

Strengths and limitations
Our study was large, so that even in a setting with a relatively low prevalence of HIV, we were able to investigate the role of HIV accounting for demographic characteristics, lifestyle-related factors including BMI and smoking, and relevant comorbidities. We tested the robustness of our findings to different missing data approaches and to non-proportional hazards. We investigated effect modification by key covariates, and were able to detect differences by ethnicity. However the relatively small number of deaths among those with HIV, reflecting the young age distribution of the HIV group, prevented definitive conclusions about the role of comorbidities, and changes in the role of HIV over time. We noted a suggestion of a larger association between HIV and COVID-19 death early in the epidemic. If confirmed, this might suggest a higher risk of infection before widespread social distancing was introduced, which could have reduced if people living with HIV were highly adherent to social distancing guidance; furthermore, some people living with HIV initially received incorrect government advice that they were considered in the highest risk category and should take extra "shielding" precautions, which might have further reduced any HIV-associated risk. 28 We could not rule out that the observed trend over time reflected chance variation; analysis of pre-mortality indicators of severe disease, such as hospitalisation, would have greater statistical power to clarify this, but we could not obtain linked hospital data for people living with HIV, because HIV codes are considered sensitive in NHS data and transfer is legally restricted. 12,29 For similar reasons, we were unable to include data on antiretroviral therapy use, level of viral suppression, or CD4 count, so it was not possible to stratify results, or quantify how much of the raised risk was driven by the minority of individuals with poorly controlled HIV. HIV in the UK is commonly managed in specialist clinics, but restrictions on the sharing of HIV-related codes along with policy guidance that cautions against sharing HIVrelated information with the patient's primary care provider meant that we could not access key data from routine HIV care. Whilst stratification by such factors would have added additional insights, we do not consider the lack of these data to critically undermine the utility of our findings, given that 94% of diagnosed HIV-positive individuals in the UK are on antiretrovirals and have good viral suppression (97% on therapy, and 97% of these with undetectable viral load). 13 This high level of viral suppression in the population means that it is unlikely that the association between HIV and COVID-19 death could have been driven entirely by the 6% assumed untreated or with detectable viral load: if all deaths had been within this group, it would amount to over 1.5% becoming infected and dying from COVID-19, compared with just <0.01% of the general population. There are no available routinely-collected data on injection drug use, which could act as a confounder, though a post-hoc adjustment for history of hepatitis C infection, as a marker of possible prior injection drug use, made little difference to our results. No data are available on occupation, or any markers for contact patterns, which could have helped to evaluate the role of exposure to infection. A further limitation is that we relied on primary care records to capture HIV status. HIV diagnoses recorded in primary care are expected to be highly specific, but some is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. . care were more likely to be missed. Our outcome definition could also have been affected by some misclassification. We included clinically suspected (non-laboratory confirmed) COVID-19 deaths: due to limited testing early in the pandemic, some non-COVID-19 deaths may have been misclassified as COVID-19 deaths. We assume any such misclassification will have been non-differential with respect to HIV status, which could again have led to a degree of bias towards the null. Our study population, which was derived from one particular EHR software system, is not fully geographically representative of the broader population of England, due to substantial geographic variation in primary care provider choice of EHR software. In particular, London, which has a relatively high prevalence of HIV nationally, is under-represented in our data.

Implications for public health
Our findings suggest that people living with HIV may be a high risk group for COVID-19-death. This indicates a need to consider targeted policies for this group as universal restrictions are eased and revised social distancing and workplace policies are introduced. People living with HIV may also need priority consideration if and when a vaccine against SARS-CoV-2 becomes available. Our findings might also have significant implications globally.
Levels of antiretroviral therapy use and viral suppression in England are high, while prevalence of HIV is relatively low in global terms; 13 the impact of HIV on the progression of the pandemic in other settings will need to be carefully monitored. Future studies should prioritise better understanding the drivers of increased COVID-19 mortality among people living with HIV, the role of ethnicity, and how the association between HIV and severe COVID-19 outcomes is affected by comorbidity status, viral load, CD4 count, and antiretroviral treatment.
As well as their role in controlling HIV infection, there is also interest in whether specific antiretroviral drugs might have potential in the treatment or prevention of SARS-Cov-2 infection. 30 In England, despite a strong overall infrastructure of linkable electronic health records data, research on these and other questions that might benefit people living with HIV is hampered by policy guidance that has generally led to restrictions in the sharing and flow of HIV-related data, despite apparent support for sharing for public health purposes. 31 We propose a review of guidance to address this and ensure that people living with HIV are not unnecessarily excluded from important research."

Conclusion
In this large population-based study, people living with HIV in England had more than double the risk of COVID-19 death than people without HIV, after accounting for demographic characteristics, lifestyle-related factors and comorbidities. Monitoring the role of HIV on COVID-19 outcomes in other settings will be crucial as the pandemic progresses. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. . https://doi.org/10.1101/2020.08.07.20169490 doi: medRxiv preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. . is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. . is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. . https://doi.org/10.1101/2020.08.07.20169490 doi: medRxiv preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. . https://doi.org/10.1101/2020.08.07.20169490 doi: medRxiv preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. . https://doi.org/10.1101/2020.08.07.20169490 doi: medRxiv preprint  is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted August 7, 2020. . https://doi.org/10.1101/2020.08.07.20169490 doi: medRxiv preprint