Endemic HBV among hospital in-patients in Bangladesh, including evidence of occult infection

Bangladesh is one of the top-ten most heavily burdened countries for viral hepatitis, with hepatitis B (HBV) infections responsible for the majority of cases. Recombinant and occult HBV infections (OBI) have been reported previously in the region. We investigated an adult fever cohort (n=201) recruited in Dhaka, to determine the prevalence of HBV and OBI. A target-enrichment deep sequencing pipeline was applied to samples with HBV DNA >3.0 log10 IU ml−1. HBV infection was present in 16/201 (8 %), among whom 3/16 (19 %) were defined as OBI (HBsAg-negative but detectable HBV DNA). Whole genome deep sequences (WGS) were obtained for four cases, identifying genotypes A, C and D. One OBI case had sufficient DNA for sequencing, revealing multiple polymorphisms in the surface gene that may contribute to the occult phenotype. We identified mutations associated with nucleos(t)ide analogue resistance in 3/4 samples sequenced, although the clinical significance in this cohort is unknown. The high prevalence of HBV in this setting illustrates the importance of opportunistic clinical screening and DNA testing of transfusion products to minimise OBI transmission. WGS can inform understanding of diverse disease phenotypes, supporting progress towards international targets for HBV elimination.


INTRODUCTION
Estimates suggest that approximately a third of the world's population has been exposed to hepatitis B virus (HBV), with chronic HBV infection (CHB) affecting more than 260 million individuals worldwide, leading to 800 000 deaths annually [1]. Ambitious elimination targets have been established, linked to UN Sustainable Development Goals [2]. Bangladesh has an intermediate CHB prevalence, estimated at 2-6 % [3,4], although epidemiology varies between regions and according to sociodemographic factors [5]. In combination with the large population, this HBV prevalence puts Bangladesh in the top-ten highest-burdened countries for viral hepatitis worldwide [6], with the perinatal incidence of new infections among the highest in South Asia [7].

HBV prevalence estimates are typically based on HBsAg screening. However, this does not account for occult HBV infection (OBI), in which individuals are HBV DNA positive, but HBsAg negative (usually in combination with a positive anti-HBc antibody) [8]. OBI can arise in several contexts. Typically, when HBsAg is undetectable, corresponding viral loads (VL) are low (<200 IU ml−1), reflecting minimal production of HBsAg, or impaired egress from hepatocytes. A number of mutations have been linked to this low-HBsAg phenotype [8-10]. Occult phenotypes may also be driven by anti-HBs seroconversion [11,12]. Mutations in HBsAg affecting antibody binding in diagnostic assays have also been described, meaning HBsAg is expressed but not detected ('false-occult' infections) [8]. This is pertinent for assays that rely on detection of the 'a' determinant of the small HBsAg protein [13]; substitutions, insertions and deletions in this region can result in OBI [14].
OBI has been described in Bangladesh in a range of different contexts [15,16], with case reports of OBI transmission documented [17,18]. The majority of clinical and epidemiological studies overlook OBI, as HBV DNA screening is expensive. However, recommendations for screening blood products increasingly include HBV DNA testing in order to avoid OBI transmission [19]. Recognising the high burden of HBV infection in Bangladesh, and limited data about OBI, we set out to investigate the prevalence and molecular characteristics of HBV and OBI in a hospital cohort.

Study settings and clinical cohort
We screened a prospective observational fever cohort in Bangladesh to opportunistically study characteristics of HBV infection and OBI. This hospital cohort was recruited to examine causes of febrile illness, and it was anticipated that HBV infection would not be relevant to the diagnosis for which patients had presented. Serum samples were collected from adults (≥18 years of age) recruited between June-October 2017 at two sites: (  [20]) with an automated platform to detect and quantify HBV DNA. HBV DNA positive samples also underwent HBeAg testing at OUH, when sufficient sample was available (Abbott Architect i2000SR).

Clinical follow-up
Patients who tested HBV positive were informed about the result and referred to Hepatology services for clinical assessment, management and follow-up.

DNA extraction and sequencing
Serum volumes ≤0.5 ml were made available for the study. Samples with HBV DNA VL ≥3.0 log10 IU ml−1 underwent a target-enrichment approach for HBV whole genome sequencing (WGS) on an Illumina MiSeq platform in Oxford, UK. This threshold is determined by the sensitivity of current deep sequencing approaches for WGS [21,22]. DNA was extracted from serum using the NucliSENS magnetic extraction system (bioMérieux). A completion-ligation reaction was performed to convert the partially dsDNA genome into a fully dsDNA molecule [23], after which DNA was purified using Agencourt RNAClean XP magnetic beads (Beckman Coulter).
Sequencing libraries were generated using the NEBNext Ultra II DNA Library Prep Kit (New England Biolabs) and libraries were assessed using the TapeStation system (Agilent) and Qubit dsDNA HS Assay (Thermo Fisher Scientific). An adapted target-enrichment workflow was applied to the SeqCap EZ (Roche) protocol, using custom-designed HBV probes ordered from IDT (xGen Lockdown Probes). Samples were sequenced on an Illumina MiSeq using a v3 300 bp paired-end kit.

Analysis of sequence data
Deep sequencing read pairs were de-multiplexed using QUASR v7.01 and adapter sequences trimmed with CutAdapt v1.7.1 [24]. Short reads (<50 bp length) and reads mapping to the human genome reference sequence (identified using Bowtie v2.2.4 [25]) were discarded. Remaining reads were mapped to HBV reference sequences representing genotypes A-I using BWA mem v0.7.10 [26], to select the most appropriate reference sequence and to identify HBV reads. Reads were then mapped against the selected reference sequence using BWA mem and consensus sequences were derived. Simmonics Sequence Editor [27] was used for alignment and sequence examination, using the genotype A sequence X02763 as a numbering reference. Maximum-likelihood phylogenetic analysis of consensus sequences was performed with MEGA7 [28], using 1000 bootstrap replicates. Trees were visualised in FigTree [29]. Sequences were analysed with reference sequences [30] and 61 published full-length HBV genome sequences from Bangladesh identified in GenBank.
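As an illustration of what deriving a consensus from mapped reads involves, the following minimal Python sketch takes the most frequent base at each reference position. It is not the pipeline used in this study: the function name `consensus`, the `min_depth` parameter, the use of 'N' for uncovered positions and the toy reads are all assumptions made for the example, and the reads are treated as ungapped alignments.

```python
from collections import Counter


def consensus(reads, ref_len, min_depth=1):
    """Majority-rule consensus from ungapped alignments.

    reads: iterable of (start, sequence) pairs, where start is the 0-based
    reference position of the first base (hypothetical input format).
    Positions covered by fewer than min_depth reads are reported as 'N'.
    """
    columns = [Counter() for _ in range(ref_len)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < ref_len:
                columns[pos][base] += 1

    called = []
    for column in columns:
        depth = sum(column.values())
        if depth < min_depth:
            called.append("N")  # too little coverage to call a base
        else:
            called.append(column.most_common(1)[0][0])
    return "".join(called)


# Toy reads (invented for illustration, not study data).
toy_reads = [(0, "ACGT"), (2, "GTTA"), (2, "GATA"), (3, "TTA")]
print(consensus(toy_reads, ref_len=7))  # ACGTTAN
```

At position 3 of the toy example, three reads carry T and one carries A, so T is called at consensus while A would be a minority variant visible only in the underlying read counts.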
We looked for evidence of dual infections in the mapping of reads to HBV genotypes A-I and consensus sequences were further checked for evidence of recombination by bootscan analysis using RDP4 [31].

Identification of potential resistance associated mutations (RAMs) and vaccine escape mutations (VEMs)
We referred to previously published lists of HBV RAMs for lamivudine (3TC) [32] and tenofovir disoproxil fumarate (TDF) [33]. VEMs are most commonly reported within the HBV surface antigen (HBsAg) 'a' determinant (residues 124-147), which is the major target of neutralising antibodies. We therefore focused scrutiny on this region, searching for evidence of previously catalogued polymorphisms Q129H/R, M133L, F/Y134I/L/T, K141E, P142S and G145R/A [34]. We also added A128V, as this has been independently reported as a VEM in Bangladesh [35,36].
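The catalogue lookup described above can be sketched in Python as follows. This is a hedged illustration rather than the procedure used in the study: the names `parse_catalogue` and `flag_vems` and the input format (a mapping from HBsAg residue number to amino acid) are assumptions, and the example residues are invented.

```python
import re

# Polymorphisms listed in the text, written as wild-type(s), position, variant(s).
CATALOGUE = ["Q129H/R", "M133L", "F/Y134I/L/T", "K141E", "P142S", "G145R/A", "A128V"]

ENTRY = re.compile(r"([A-Z](?:/[A-Z])*)(\d+)([A-Z](?:/[A-Z])*)")


def parse_catalogue(entries):
    """Return {position: set of catalogued variant residues}."""
    table = {}
    for entry in entries:
        match = ENTRY.fullmatch(entry)
        if match is None:
            raise ValueError(f"unrecognised entry: {entry}")
        position = int(match.group(2))
        table.setdefault(position, set()).update(match.group(3).split("/"))
    return table


def flag_vems(residues, table):
    """Return sorted (position, residue) pairs that match the catalogue.

    residues: {position: amino acid} for the sequence under examination
    (hypothetical input format).
    """
    return sorted(
        (pos, aa) for pos, aa in residues.items() if aa in table.get(pos, ())
    )


VEM_TABLE = parse_catalogue(CATALOGUE)

# Invented residues for illustration only.
example = {128: "V", 133: "M", 145: "R"}
print(flag_vems(example, VEM_TABLE))  # [(128, 'V'), (145, 'R')]
```

A lookup of this kind only reports substitutions that are already in the catalogue; other changes at the same positions would need to be reviewed separately.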

Statistics
Data were analysed using GraphPad Prism version 7.0 (San Diego, CA, USA). Two-sided P values were calculated using the Chi-square and Fisher's exact test for dichotomous and ordinal variables. Continuous variables were compared using one-way ANOVA and the Mann-Whitney U-test. Demographic factors and clinical characteristics were summarized with counts (%) for categorical variables and median (interquartile range [IQR]) and mean (±standard deviation) for continuous variables.
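For readers unfamiliar with Fisher's exact test, the two-sided calculation for a 2×2 table can be sketched in Python using only the standard library. The analyses in this study were run in GraphPad Prism, so this is an independent illustration; the function name and the example table are assumptions and do not represent study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def weight(x):
        # Unnormalised probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / comb(total, col1)


# Invented counts for illustration only.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p_value, 4))  # 0.035
```

Working with integer weights until the final division avoids floating-point ties when deciding which tables are at least as extreme as the observed one.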

Characteristics of HBV infections and OBI
Median VL of the samples with detectable HBV DNA was 2.6 log10 IU ml−1 (IQR 1.9-5.0 log10 IU ml−1), excluding one sample with a VL below the limit of quantification (<1.0 IU ml−1). HBeAg testing was available for 12/13 HBV DNA positive samples, of which 1/12 samples (8.3 %; ID 118) was HBeAg-positive. Interestingly, this HBeAg-positive sample was also an occult infection. HBV exposure was more common amongst homemakers (P=0.006) and less frequent in students (P=0.02), but there was no difference according to age or sex (Table S1). OBI arose only in males ≥60 years of age, which is significantly older than the age of the other HBV-positive subjects (Fig. 1a, P=0.02). Median HBV VL of the OBI cases (2.6 log10 IU ml−1, IQR 1.1-5.3 log10 IU ml−1) was comparable to VL in HBsAg-positive individuals (2.4 log10 IU ml−1, IQR 1.8-4.7 log10 IU ml−1) (Fig. 1b, P=0.94). One of the OBI infections (sample ID 118) presented with VL 5.3 log10 IU ml−1, which indicates ongoing high levels of viral replication, despite the absence of HBsAg detection.

Identification of HBV genotypes A, C and D
We obtained full-length HBV genome sequences from serum from all four individuals with HBV VL ≥3.0 log10 IU ml−1, which included 1/3 of the OBI cases (ID 118) (Fig. 1c); the remaining two OBI cases had insufficient VL for WGS at 1.1 log10 IU ml−1 (ID 018) and 2.57 log10 IU ml−1 (ID 034). Phylogenetic analysis indicated two genotype D infections, and one each of genotypes A and C (Fig. 2). The OBI sequence grouped with other genotype C sequences from Bangladesh (Table S2). Based on the phylogeny of full-length sequences and bootscan analysis, we did not find evidence of dual infection or recombination in this dataset.
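The sequencing threshold can be made concrete with a short Python sketch that applies the ≥3.0 log10 IU ml−1 cut-off to the three OBI viral loads reported above and converts a log10 value back to IU ml−1. The variable and function names are assumptions chosen for illustration.

```python
WGS_THRESHOLD_LOG10 = 3.0  # log10 IU/ml threshold stated in the methods

# Viral loads of the three OBI cases as reported in the text (log10 IU/ml).
obi_vl_log10 = {"118": 5.3, "018": 1.1, "034": 2.57}


def to_iu_per_ml(log10_vl):
    """Convert a log10 viral load to IU/ml."""
    return 10 ** log10_vl


def eligible_for_wgs(log10_vl, threshold=WGS_THRESHOLD_LOG10):
    """True when the viral load meets the sequencing threshold."""
    return log10_vl >= threshold


eligible = [sample for sample, vl in obi_vl_log10.items() if eligible_for_wgs(vl)]
print(eligible)  # ['118']
print(to_iu_per_ml(WGS_THRESHOLD_LOG10))  # 1000.0
```

On the linear scale the threshold corresponds to 1000 IU ml−1, so sample 034 (2.57 log10, roughly 370 IU ml−1) falls below it.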
The consensus sequence for sample 197 indicated a truncated HBV e-antigen (HBeAg) protein, based on a G1986A mutation present in 57 % of reads, but there was insufficient sample to test for HBeAg. The OBI sequence (ID 118) and another sequence (ID 051) also carried the mutation encoding the truncated HBeAg protein as a minority variant, present in 32 % of reads in both samples. However, HBeAg was still detected in sample 118 at the time of testing. This HBeAg truncation is well described and is associated with progression to HBeAg-negativity [37].

Investigation of OBI sequence polymorphisms
We examined the surface gene of the OBI sample (ID 118) for the presence of polymorphisms previously linked to OBI [38]. There were no HBsAg (pre-S1, pre-S2 and S) OBI-associated variants in the consensus sequence. However, examining the deep sequencing reads, we identified several potentially relevant sequence changes including pre-S1 and pre-S2 deletions, evidence of truncated S gene products and possible vaccine escape mutations in the 'a' determinant [38] (Fig. 3).
Analysis indicating the most common haplotypes identified at each site of interest is shown in Fig. S2.

HBV RAMs and VEMs
We examined consensus reverse transcriptase (RT) sequences for RAMs and VEMs (Table 1). RT polymorphisms A181T and Q215P in sample 197 have been associated with 3TC resistance [39]. The overlapping reading frames in HBV mean that the A181T mutation also results in HBsAg W172*, which has been associated with progressive liver disease [40].
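The effect of overlapping reading frames can be illustrated with a minimal Python sketch in which a single G-to-A change alters two codons read in different frames. The five-nucleotide fragment is an assumption chosen for illustration (a TGG tryptophan codon in the surface frame overlapping a GCT alanine codon in the RT frame); it is not taken from the study sequences.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, frame):
    """Translate seq starting at the given 0-based offset."""
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3)
    )


# Illustrative fragment: surface frame starts at offset 0, RT frame at offset 2.
wild_type = "TGGCT"
mutant = "TGACT"  # single G-to-A change at the shared nucleotide

print(translate(wild_type, 0), translate(wild_type, 2))  # W A
print(translate(mutant, 0), translate(mutant, 2))        # * T
```

In this toy fragment the same nucleotide is the third base of the surface codon and the first base of the RT codon, so one substitution produces both a stop codon (W to *) and an alanine-to-threonine change.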
Emerging reports suggest that RT polymorphisms can be associated with reduced TDF susceptibility [41]. Among these, L217R was present as consensus in sample 118 (55 % of reads) and V191I was identified as a minority variant (28 % of reads) (Table 1). I233L was present as a minority variant (15 % of reads) in sample 076. The clinical significance of these substitutions is not well understood, but phenotypic resistance has most robustly been described in the presence of multiple RT RAMs [32,33].
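The distinction between consensus and minority variants can be expressed as a simple rule on the fraction of reads carrying a substitution. The Python sketch below applies such a rule to the read fractions reported above. The 50 % cut-off is an assumption made for illustration (a substitution carried by more than half of reads at a site is taken as consensus), and the names are hypothetical.

```python
CONSENSUS_CUTOFF = 0.5  # assumed: more than half of reads defines consensus

# (sample ID, RT substitution, fraction of reads) as reported in the text.
observations = [
    ("118", "L217R", 0.55),
    ("118", "V191I", 0.28),
    ("076", "I233L", 0.15),
]


def classify(fraction, cutoff=CONSENSUS_CUTOFF):
    """Label a substitution by the fraction of reads that carry it."""
    return "consensus" if fraction > cutoff else "minority"


labels = {(sample, sub): classify(frac) for sample, sub, frac in observations}
for (sample, sub), label in labels.items():
    print(sample, sub, label)
```

Under this rule L217R (55 %) is labelled consensus, while V191I (28 %) and I233L (15 %) are labelled minority variants, matching the description in the text.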
VEMs M133T and F/Y134S were present as minority variants in the OBI sequence. A128V, a putative VEM, was identified at consensus level in both our genotype D sequences. However, based on subgenotype (D2 for both 051 and 197), valine is consensus at this position.

Epidemiology of HBV and OBI infection
Our study identified HBV infection in Bangladeshi adults, with a prevalence of 8 % in a tertiary hospital cohort, of which 3/16 (18.8 %) were OBI. Existing sequence data from Bangladesh suggest genotypes C and D each account for ~40 % of the total burden, and genotype A for the remainder, with a high proportion of recombinants and mixed genotype infections [3,5,42]. Of note, we identified a genotype A1 isolate, more typically associated with transmission in Southern and Eastern Africa, although potentially increasingly prevalent in Asia [3,5].
Fig. 2. Four full-length HBV consensus sequences generated in this study were analysed alongside HBV genotype reference sequences (for genotypes A-J) [34] and 61 sequences originating from Bangladesh identified in online databases (unlabelled branches, Table S2). Genotype A sequences are highlighted in blue, genotype C in red and genotype D in yellow. The four sequences generated in this study are indicated with arrows; we identified one genotype C1 sequence (occult infection case, sample 118), one genotype A1 (sample 076) and two genotype D2 sequences (051 and 197). Genotype clades not containing Bangladesh sequences have been collapsed. Bootstrap replicates were repeated 1000 times, and all branches with support >70 % are indicated.
The lower exposure of students may reflect younger age and better socioeconomic backgrounds, while OBI was more common in older men. However, we cannot make generalisations about HBV distribution in this setting, based on a small cohort recruited in a very specific clinical context. Larger studies recruiting a more generalisable population are required, along with the incorporation of anti-HBs screening to determine the impact of vaccination.

Defining and characterising OBI
Definitions of OBI vary, and identifying cases is dependent on the sensitivity of the platform in use for HBV DNA detection. Sequencing data from a case of OBI demonstrated a combination of mutations and deletions, occurring in all three regions of the surface gene. None of the deletions was present at consensus level, illustrating the importance of deep sequencing data for identifying minority variants that may be driving uncommon phenotypes. Several of the mutations identified have been linked with the progression of severe liver disease, including W182*, thought to interfere with cell cycle regulation [43], typically in the context of low VL [40].
Fig. 3. Mutations identified in the surface gene of sample 118 that may be linked with the OBI phenotype. The HBV surface gene is subdivided into three domains, pre-S1, pre-S2 and S. The proportion of reads containing polymorphisms is shown at various positions throughout the gene. Wt, wild-type; aa, amino acid. Deletions at the start of pre-S1 (causing truncated L-HBs proteins) and highlighted mutations in pre-S1 have all been associated with OBI in previous studies [38,51,52]. The mutated start codon of pre-S2 has been linked with an inability to express M-HBs [53]. The 'a' determinant (marked with a box at residues 124-147) is a major target of neutralising antibodies and a widely used target for diagnostic assays, and has a strong association with OBI mutations [10,14,51,54]. W182* has been linked to truncated HBsAg products in OBI cases [38], and the mutation is mirrored in the reverse transcriptase gene as V191I (potentially linked to TDF resistance). *The short deletion observed in pre-S2 has not been reported previously and the resulting phenotype is unclear.
*Sequencing coverage in sample 051 was low (Fig. 1c), so analysis of deep sequence data in this case lacks sensitivity for detection of minor variants. †Polymorphisms associated with TDF resistance. ‡Polymorphisms associated with 3TC resistance.
Short-read sequencing provides advantages in sequencing depth and deletion detection, but the reconstruction of full-length haplotypes remains challenging, making it difficult to determine linkage between mutations. This question could potentially be addressed with longer reads that can be obtained through PCR and Sanger sequencing, but unfortunately, we were constrained by limited sample volumes. All mutations associated with OBI were identified only as minority variants. If these mutations account for the OBI phenotype, this suggests that all genomes must carry at least one of the relevant mutations, but understanding of how these mutations interact to produce OBI is limited. Developing sensitive long-read sequencing approaches for HBV remains an important aspiration, enabling improved understanding of the interactions between different polymorphisms, and their contribution to disease states [44].

Evidence for drug and vaccine resistance
A previous meta-analysis of drug resistance in Bangladesh reported a prevalence of 3TC resistance of 11 %, but no TDF resistant motifs [35]. Although our cohort is small, it is striking that we nevertheless identified mutations linked to both 3TC and TDF resistance. In these treatment-naive patients we cannot confirm the in vivo significance of these RAMs, but our data do raise concern that these mutations are circulating; further work is needed to explore their prevalence and clinical significance.
Mutations in HBsAg at positions 133 and 134 are localised within a region known to contain major B-cell epitopes, and can therefore contribute to vaccine resistance [13]. A128V is a potential VEM, although its specific contribution to vaccine escape remains uncertain [45]. Previous reports of VEMs in Bangladesh [35], and a case report of a child [36], have also identified the A128V substitution, and we identified this variant in two of our sequences (both subgenotype D2). However, a subgenotype D2 reference sequence (MF925358) [30] has 128-valine, suggesting it is the most common residue in this subgenotype. This raises the possibility that subgenotype D2 might be more susceptible to vaccine resistance, but further work is required to substantiate this. The significance of these variants in our sequence data is uncertain, as we did not collect vaccine history data or measure vaccine-mediated antibody titres.

Caveats and limitations
We have undertaken HBV screening on only a small cohort and without longitudinal follow-up. Although HBV was presumed to be incidental to the reasons for presentation to hospital, we cannot exclude a selection bias. HBV infections were likely to be chronic, but we did not distinguish between anti-HBc IgM and IgG, so acute infection is also possible. Routine liver function tests were not available (and interpretation would have been confounded by acute illness). Subjects were not routinely screened for HCV, which has been associated with occult HBV infection and has an estimated prevalence of 1 % in this region [6,46].
The study highlights the current technical challenges of generating whole genome deep sequencing data for HBV, for which a typical threshold VL is >3.0 log10 IU ml−1 [47,48]. The median VL of 2.6 log10 IU ml−1 in this cohort illustrates the high proportion of cases with VL below the current sequencing sensitivity [48,49]. For this reason, we generated full-length sequence data in only 4/13 cases, leaving the majority of HBV sequences 'under the radar', including two OBI. Previous studies have indicated there can be multiple pathways driving occult phenotypes, including low HBsAg production (often in the context of correspondingly low VL), anti-HBs seroconversion, and 'false-occult' infections, associated with the failure of diagnostic tests [8]. Whilst the high VL of the OBI we sequenced may be suggestive of a diagnostic failure, the lack of highly sensitive sequencing methods for HBV limits our insight into the mechanisms behind the other two OBI cases. HBsAg testing is potentially more sensitive for HBV detection as HBsAg can be detectable in the absence of HBV DNA. However, sensitivities vary between assay platforms, and both biomarkers are imperfect proxies for inferring disease process in the liver. The use of an alternative HBsAg detection assay (potentially lowering the HBsAg LoD down to 0.005 IU ml−1) may have been informative [8]. These technical challenges, along with a paucity of well-studied longitudinal cohorts, remain barriers in furthering our understanding of HBV disease and transmission.

Implications for policy and clinical practice
The approach we adopted demonstrates the feasibility of routine HBV screening for adults admitted to hospital in Bangladesh. Improvements in ascertainment are key to advancing progress towards international elimination goals [6], while for individual patients ascertainment facilitates referral into appropriate clinical care. Our data also highlight the importance of continued deployment of vaccination programmes with a focus on infant coverage [50]. Given the high burden of HBV infection in Bangladesh, there is an urgent need to expand our understanding of population and molecular epidemiology, in order to strengthen advocacy, education and research, and to inform investment in clinical care and public health.

Data availability
The datasets generated during and/or analysed during the current study are available in the following repositories:
• GenBank accession numbers MT114170-MT114173.
• Metadata file of demographic data and HBV screening results in the fever cohort (n=201 subjects) on Figshare, DOI: https://doi.org/10.6084/m9.figshare.11973930.
• A STROBE statement has been submitted as a supplementary file with this manuscript.