Variability in HIV-1 Integrase Gene and 3′-Polypurine Tract Sequences in Cameroon Clinical Isolates, and Implications for Integrase Inhibitors Efficacy

Integrase strand-transfer inhibitors (INSTIs) are now included in preferred first-line antiretroviral therapy (ART) for HIV-infected adults. Studies of Western clade-B HIV-1 show increased resistance to INSTIs following mutations in integrase and the nef 3′-polypurine tract (3′-PPT). With anticipated shifts in Africa (where 25.6 million HIV-infected people reside) to INSTIs-based ART, it is critical to monitor patients in African countries for resistance-associated mutations (RAMs) affecting INSTIs efficacy. We analyzed HIV-1 integrase and 3′-PPT sequences in 345 clinical samples from INSTIs-naïve HIV-infected Cameroonians for polymorphisms and RAMs that affect INSTIs. Phylogeny showed high genetic diversity, with a predominance of HIV-1 CRF02_AG. Major INSTIs RAMs T66A and N155K were found in two (0.6%) samples. Integrase polymorphic and accessory RAMs found included T97A, E157Q, A128T, M50I, S119R, L74M, L74I, S230N, and E138D (0.3–23.5% of samples). Ten (3.2%) samples carried two mutations together (I72V+L74M, L74M+T97A, or I72V+T97A); thirty-one (9.8%) had 3′-PPT mutations. The low frequency of major INSTIs RAMs shows that INSTIs-based ART can be successfully used in Cameroon. Several samples had ≥1 INSTIs accessory RAMs known to reduce INSTIs efficacy; thus, INSTIs-based ART would require genetic surveillance. The 3′-PPT mutations could also affect INSTIs. For patients failing INSTIs-based ART with no INSTIs RAMs, monitoring 3′-PPT sequences could reveal the etiology of treatment failure.


Introduction
During HIV replication, the integrase enzyme catalyzes the ligation of viral reverse-transcribed DNA into the chromosome of the host cells. This integration is a critical step for viral replication, enabling the proviral DNA to persist in the host cells and form a permanent viral template that can [...] INSTIs-based ART. Because it has been shown that mutations located outside the integrase gene, in the viral 3′-polypurine tract (PPT, located at the 3′ end of nef), can confer resistance to INSTIs [25][26][27], we also sequenced and analyzed nef and the PPT viral genome region in these samples. We further analyzed the HIV integrase gene and PPT sequences from 215 additional patient samples from Cameroon, obtained from the Los Alamos database [28], for INSTIs RAMs and natural polymorphisms that could affect INSTIs-based therapy.

Patient Demographics
We collected samples from 235 HIV-infected subjects in Cameroon, of which we successfully amplified and sequenced the integrase and/or nef genes in 130 samples (Table 1). This cohort of 130 patients included 39 males (mean age: 37.28 ± 9.94 years) and 91 females (mean age: 35.43 ± 8.75 years). The mean plasma viral load and CD4+ T-cell count at the time of specimen collection were, respectively, 4.64 ± 1.54 log copies/mL and 360.3 ± 320.4 cells/µL for males and 4.24 ± 1.50 log copies/mL and 311.2 ± 184.5 cells/µL for females. Of the samples with successfully amplified and sequenced integrase and nef genes, 33 (84.61%) males and 66 (72.52%) females were treatment-naïve, 5 (12.82%) males and 25 (27.47%) females were on ART (1st-line regimens), 1 female had discontinued treatment, and treatment information was not available for 1 male. None of the cohort patients had been exposed to INSTIs at the time of enrolment and specimen collection. We also analyzed 215 full-length Cameroon HIV-1 sequences (samples collected from HIV+ Cameroonians between 1991 and 2014, when INSTIs were not available in Cameroon and many other countries) previously deposited in the Los Alamos HIV sequence database (see Table S1). Most of the database sequence records did not include demographic, treatment status, CD4 level, or viral load data.
Table 1 (partial): ART-naïve (N): 33 males, 66 females; ART-experienced (N): 5 males, 25 females. N: sample size; SD: standard deviation; ART: antiretroviral therapy.

INSTIs Resistance-Associated Mutations in Subjects Infected with HIV-1 CRF02_AG and Non-AG Viruses
The INSTIs accessory RAMs identified in both cohort and database samples included T97A, E157Q, M50I, L74M, L74I, and S119R. The INSTIs accessory RAMs identified in both HIV-1 CRF02_AG and non-AG cohort samples included T97A, M50I, L74M, and L74I (in 1.3% to 29.3% of AG and in 4% to 16% of non-AG cohort samples). E157Q and S119R mutations were observed only in cohort samples of AG subtype. There were no significant differences in the proportions of AG and non-AG cohort samples with INSTIs RAMs (Table 4). The INSTIs accessory RAMs identified in both AG and non-AG database samples included T97A, E157Q, M50I, L74M, L74I, and S230N (in 3.5% to 60% of AG and in 1.4% to 86.5% of non-AG database samples). A significantly higher proportion of non-AG database samples (86.56%) had the M50I mutation compared to 17.6% in database AG samples (p < 0.000001, Table 4). A significantly higher proportion of AG database samples (21%) had the L74M mutation compared to 4.2% in database non-AG samples (p < 0.005, Table 4). The proportions of AG and non-AG database samples harboring other INSTIs RAMs were not significantly different. T66A, N155K, A128T, S119R, and S230N were observed only in a very small number (1.4%) of database samples of non-AG genotypes (Table 4).
Integrase polymorphisms in both cohort and database samples: Overall, T124A, G134N, I135V, K136T, T206S, and R269K polymorphisms were significantly more prevalent in AG compared to non-AG samples, whereas A21T, K136Q, D167E, and D256E polymorphisms were significantly more prevalent in non-AG compared to AG samples.

Effects of ART and Immune Function on Integrase RAMs and Natural Polymorphisms
Additional analyses of cohort patients showed no significant differences in the occurrence of gene polymorphism or gene mutations among subjects who were treatment-naïve and those on ART. Similarly, there were no significant differences in the occurrence of gene polymorphism or INSTIs RAMs in cohort patients with CD4+ T-cell counts above 200 cells/µL (66%) and those with CD4+ T-cell counts below 200 cells/µL (34%). Integrase sequences for Cameroon HIV isolates downloaded from the Los Alamos database did not have information on patient's treatment status or levels of CD4+ cells.

Analysis of 3'-PPT and 5' Terminal Nucleotides of 3' Long Terminal Repeat
Because mutations in the HIV-1 nef / 3'-PPT region can confer resistance to INSTIs [25][26][27], we also analyzed the sequences of the 3'-PPT viral genome, as well as the 8

Discussion
Despite INSTIs' high efficacy and superiority compared to other ARV drug classes, studies in high-income countries show that resistance to INSTIs can occur, associated with transmitted and/or acquired DRMs, leading to decreased susceptibility to INSTIs and treatment failure [20,21]. The integrase mutations often associated with reduced susceptibility to INSTIs include both nonpolymorphic and polymorphic mutations [1,22,23]. In 2018, ART regimens including 2nd-generation INSTIs such as DTG constituted only 4% of 1st-line regimens and 6% of 2nd-line regimens worldwide [12,16]. However, INSTIs are now part of the WHO-recommended alternative 1st-line regimens for PLWH [7,11,12]. With ongoing efforts to expand INSTIs availability worldwide, it is projected that by the year 2025, up to 57% of PLWH will be on ART regimens containing DTG or other INSTIs [16]. Therefore, it is important to monitor PLWH for the presence of mutations in the integrase gene that can affect INSTIs efficacy and therapeutic outcomes. Furthermore, our current knowledge of INSTIs RAMs mostly comes from studies in western countries; patient-derived data on INSTIs RAMs in SSA are scarce and there is a major knowledge gap on INSTIs RAMs in SSA, whereas over two-thirds of the current 38 million PLWH reside in that region. Our current study contributes to filling that gap and addresses recent WHO recommendations that HIV drug resistance surveillance be implemented for INSTIs, including pretreatment DRMs [18,19]. We analyzed integrase gene sequences from Cameroon clinical isolates for the presence of mutations known to be associated with resistance or reduced susceptibility to INSTIs. Because mutations in the nef 3′-PPT region have been associated with resistance to INSTIs [25][26][27], we also analyzed nef gene sequences, their 3′-PPT regions, and sequences adjacent to the 3′-PPT regions.
Phylogeny of integrase and nef sequences showed high genetic diversity, with a predominance of AG infections (69-75% of cohort samples). This is similar to previous studies showing that over 60% of HIV+ Cameroonians harbored AG viruses [31][32][33]. The lower proportion of AG infections seen in database samples (28%) is likely due to overrepresentation of other genotypes such as HIV-1 groups O, N, and P from Cameroon in the database, from studies that specifically analyzed sequences from these subtypes [34], rather than sequences from randomly collected patient samples as in our cohort or other studies of Cameroon isolates [31][32][33].
Several other integrase accessory RAMs and polymorphic mutations were observed in our current study, including the INSTIs accessory RAMs T97A (in 2% of cohort and 7.4% of database samples), E157Q (in 6% of cohort and 1.4% of database samples), and A128T (in 0.47% of database samples), and the polymorphic mutations M50I (in 6% of cohort and 31% of database samples), S119R (in 4% of cohort and 0.47% of database samples), L74M (in 9% of cohort and 4% of database samples), L74I (in 25% of cohort and 23% of database samples), E138D (in 0.47% of database samples), and S230N (in 0.93% of database samples). Ten of the 11 INSTIs accessory RAMs observed were in the integrase central core domain. This integrase region contains the endonuclease and polynucleotidyl transferase site and is involved in DNA substrate recognition, binding, and chromosomal integration of the newly synthesized double-strand viral DNA into the host genomic DNA [56][57][58]. The other mutation (S230N) was in the C-terminal domain, a region that helps stabilize the integrase-viral DNA complex [4][5][6]. The location of these mutations in regions involved in recognition, binding, integration, and stabilization of HIV into the host DNA shows the potential for these mutations to affect the integrase function and response to INSTIs. In fact, mutations in integrase aa residue 230 reduce DTG susceptibility by 3-fold [59]. The T97A mutation can reduce EVG susceptibility by 3-fold [36], and the combination of the T97A substitution with other INSTIs RAMs markedly reduces HIV susceptibility to RAL [60,61] and DTG [62,63]. Similarly, although L74M and E157Q individually have minimal effect on the susceptibility to INSTIs, a combination of L74M or E157Q with other INSTIs RAMs results in reduced susceptibility to INSTIs. In fact, patients harboring viruses with L74M [61] or E157Q [64] mutations in addition to other integrase RAMs showed reduced susceptibility to DTG.
Six of our subjects (3 cohort and 3 database samples) were infected with viruses that had both I72V and L74M mutations; one cohort sample had both L74M and T97A, and 3 database samples had both I72V and T97A mutations. These subjects could be at risk of developing resistance to INSTIs, as there is evidence that the presence of multiple integrase accessory RAMs or polymorphic mutations can increase viral fitness and reduce susceptibility to INSTIs, even in the absence of INSTIs major RAMs [40,65]. Furthermore, the fact that most of the integrase mutations observed in our study occurred in the central core and C-terminal domains suggests that they could alter proviral DNA binding, integration, and stabilization and affect INSTIs efficacy. Thus, vigilance and surveillance for INSTIs DRMs would be required in Cameroon and other SSA countries as they shift to the current WHO-recommended DTG/INSTIs-based ART.
Previous studies showed that polymorphic differences among HIV subtypes can affect viral fitness and resistance pathways [66] and that different viral subtypes may support different mutational pathways, which could result in subtype-based differences in drug resistance [67][68][69]. There is also evidence that natural polymorphisms associated with integrase activity and occurrence of resistance to INSTIs are subtype-dependent, and that subtype-specific polymorphic mutations in the integrase gene affect integrase DNA binding affinity when additional mutations are present and can influence INSTIs efficacy [66,68,70–72]. Thus, we compared the occurrence of INSTIs RAMs among samples from subjects infected with HIV-1 CRF02_AG and subjects harboring non-AG subtypes. Our data showed polymorphisms and mutations in the integrase in both groups, similar to previous studies showing that mutations and resistance to INSTIs can occur in clade B, CRF02_AG, and non-AG subtypes [70,71,73]. However, the frequencies of such mutations can vary based on viral genotype [74]. The aa substitutions E92Q, S119R, E138A, Y143R, G148H/R, and S230R/N are more prevalent in subtype B than in non-B subtypes [71,74–76], whereas mutations such as L74I/M, T97A, L101I, E157Q, T214A, and V201I are more prevalent in non-B subtypes compared to HIV-1 subtype B [71,74,77]. Because the vast majority of PLWH on INSTIs are people infected with HIV-1 subtype B, most INSTIs resistance-conferring mutations that have been characterized pertain to HIV-1 subtype B. It is likely that several mutations prevalent in non-B subtypes could affect the genetic barrier and INSTIs efficacy.
In fact, computational modeling of INSTIs resistance development across different HIV-1 subtypes showed that compared to subtype B, the presence of M50I in subtypes A and C, L74I in subtypes A and CRF02_AG, G163R in CRF01_AE, and V165I in subtypes F and CRF01_AE would be associated with a lower genetic barrier to resistance in these non-B clades [74]. With increased use of INSTIs by individuals infected with non-B viruses, it is important to monitor these subjects for viral escape in the context of selection pressure [74] and to identify the RAMs and polymorphic mutations that alter the genetic barrier to resistance and INSTIs efficacy, as well as the role of viral genotype.
The genetic barrier is measured by evolutionary time to viral escape in the context of a selection pressure [74]. Previous studies showed that development of HIV resistance to nucleoside reverse transcriptase inhibitors, non-nucleoside reverse transcriptase inhibitors, and protease inhibitors resulted in part from subtype-associated preferential codon usage and that different amino acid substitutions influenced the likelihood of development of drug resistance [78][79][80]. INSTIs are the newest class of anti-HIV drugs approved for use in HIV/AIDS treatment, and modeling of HIV-1 integrase sequences showed that viral subtypes and codon substitution would differentially affect the genetic barrier to the development of INSTIs resistance [74]. Most mutational pathways linked to a higher or lower genetic barrier to resistance to INSTIs come from subtype B infected subjects in resource-rich countries. The role of baseline genetic diversity and subtype-specific polymorphism among non-B subtypes in the development of resistance to INSTIs has not been systematically investigated. With the increasing use of INSTIs worldwide, including in low- and middle-income countries where non-B subtypes predominate, monitoring of nucleotide substitutions that evolve into INSTIs-resistant viruses will help better understand the factors underlying the development of resistance to INSTIs across more diverse HIV subtypes.
Recent studies of patients failing DTG therapy showed that some patients with viral failure had no DRM in the viral integrase but instead had mutations in the nef 3′-PPT region [25][26][27]. These results suggested that mutations outside the integrase, in the 3′-PPT, can confer resistance to DTG and other INSTIs. Furthermore, the 3′-PPT is closely associated with the 5′ terminus of the 3′ LTR, and mutations within the six guanine residues (G-track) of the 3′-PPT can alter RNase H-mediated cleavage at the PPT 3′ terminus, and integrase activity [81,82]. This suggests that mutations in the 3′-PPT G-track might confer an alternative pathway to resistance to DTG and other INSTIs. Thirty-one (9.8%) subjects in our study were infected with viruses that had mutations in the 3′-PPT, including 26 with mutations in the 3′-PPT G-track. The potential effects of these 3′-PPT mutations on integrase function and susceptibility to INSTIs are not known. If DTG/INSTIs-based ART is adopted in Cameroon, studies of patients failing INSTIs-based ART should include both analyses of these 3′-PPT mutations and integrase DRMs to elucidate the potential role of these mutations on the susceptibility to DTG and other INSTIs.
In summary, the current study of integrase and nef viral sequences from 345 HIV-infected Cameroonians showed only 2 subjects with INSTIs major RAMs, but several subjects with INSTIs accessory RAMs and polymorphic mutations, including 10 subjects harboring viruses that simultaneously had two different INSTIs accessory RAMs. Individually, INSTIs accessory RAMs do not have major effects on susceptibility to INSTIs, but the simultaneous presence of several accessory mutations or their presence in combination with other mutations has been associated with reduced susceptibility to INSTIs, increased viral fitness, and virologic failure [40,60–65]. With the current worldwide push to expand the use of DTG/INSTIs-based ART, it is inevitable that some patients on these regimens will experience virologic failure at some point during the course of their treatment. Genetic surveillance for management of such cases should include screening for the presence of INSTIs major RAMs, accessory and polymorphic mutations, and potential combinations of these, as well as mutations in the 3′-PPT region.

Ethics Statement
The study cohort samples were collected as part of an ongoing project aimed at analyzing the influence of HIV genetic diversity on viral neuropathogenesis in Cameroon. This study was conducted in accordance with the Declaration of Helsinki and the protocol was approved by the Cameroon National Ethics Committee (Ethical Clearance Authorization #146/CNE/SE/2012, approved on 13 June 2006 and re-approved on 2 May 2012), as well as the Institutional Review Board of the University of Nebraska Medical Center (UNMC) (IRB #307-06-FB, approved on 26 March 2007 and re-approved annually until 2018). Written informed consent was obtained from all study participants and data were processed using unique identifiers to ensure confidentiality.

Specimen Collection, HIV Serology, CD4 Cell Counts, and Viral Loads
Sample collection, serology, CD4 cell counts, and viral load analyses were performed in the Hematology laboratory of the Yaoundé University Teaching Hospital or the International Reference Center "Chantal Biya", Cameroon, between 2008 and 2016. Venous blood samples were collected and stored at room temperature in the outpatient clinic, and analyses were performed in the Hematology laboratory within 6 h of blood collection. The HIV status of each participant was determined using the rapid immunochromatographic HIV-1/2 test (Abbott Diagnostics, Chicago, IL, USA) and the Murex HIV antigen/antibody Combination ELISA (Abbott Diagnostics), according to the manufacturer's instructions. A participant was considered HIV positive if reactive in both tests, HIV negative if non-reactive in both tests, and discordant if reactive in only one test. No discordant result was observed in our study population.
CD4 T-lymphocyte counts were quantified by flow cytometry, using a Fluorescence-Activated Cell Sorting (FACS) Count Instrumentation System and the BD FACSCount CD4 reagent kit (BD Biosciences, San Jose, CA, USA), according to the manufacturer's instructions. The FACS instrument was calibrated and quality control tested before each experiment. HIV plasma viral load was quantified by reverse transcription polymerase chain reaction (RT-PCR), using the Amplicor HIV-1 Monitor Test (Roche Diagnostic Systems, Pleasanton, CA, USA), according to the manufacturer's protocol. The assay detection limit was 40 viral RNA copies/mL. For additional molecular analyses, plasma samples were stored in 1 mL aliquots at −80°C until further use.

RNA Extraction and PCR Amplification
Plasma samples were shipped on dry ice (−70°C) to UNMC, where sequencing and analyses of the integrase and nef (viral 3′-PPT region) genes were performed. HIV-1 RNA was extracted from plasma using the QIAamp Viral RNA mini kit (Qiagen Inc., Germantown, MD, USA) per manufacturer's protocol. cDNA synthesis and the 1st round of PCR amplification of integrase and nef genes were performed using the SuperScript™ III One-Step RT-PCR system (Life Technologies, Grand Island, NY, USA). The 50 µL reaction volume contained 500 ng of purified RNA, 25 µL of 2X reaction buffer, 10 pmol of forward and reverse primers (Table 6), and 1 µL of reverse transcriptase/Platinum Taq DNA polymerase mix. Reverse transcription was carried out at 50°C for 1 h, followed by PCR consisting of an initial denaturation at 94°C for 2 min; 40 cycles of 94°C, 15 s; 55°C, 30 s; 68°C, 1 min; and a final extension step at 72°C, 10 min. The nested PCR amplification of integrase and nef genes was carried out in a total volume of 50 µL containing 5 µL of the 1st-round PCR amplicon, 25 µL of 2X KAPA HiFi HotStart ReadyMix (KAPA Biosystems, Wilmington, MA, USA), and 10 pmol of forward and reverse primers. The nested PCR thermal cycling conditions were 95°C, 5 min; followed by 35 cycles of 94°C, 10 s; 60°C, 30 s; 72°C, 1 min; and a final extension step at 72°C, 10 min. Amplicons were detected by agarose gel electrophoresis (1% agarose), visualized by ethidium bromide (0.5 µg/mL) staining under ultraviolet light (260 nm), and images were captured using an automated gel documentation system (Syngene, Frederick, MD, USA). Sequences of the primers used for PCR amplifications are detailed in Table 6.
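As an illustrative sketch only (not part of the original protocol), the two thermal cycling programs described above can be written down as data and their nominal hold times summed. The class and field names are hypothetical, ramp times between steps are ignored, and the nested PCR annealing step is taken to be 60°C as read from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # hold time at that temperature


@dataclass(frozen=True)
class Program:
    name: str
    pre: tuple      # steps run once before cycling
    cycle: tuple    # steps repeated n_cycles times
    n_cycles: int
    post: tuple     # steps run once after cycling

    def hold_seconds(self) -> int:
        """Sum of programmed hold times; ramping between steps is not included."""
        once = sum(s.seconds for s in self.pre + self.post)
        return once + self.n_cycles * sum(s.seconds for s in self.cycle)


# First round: reverse transcription, initial denaturation, 40 cycles, final extension.
FIRST_ROUND = Program(
    name="one-step RT-PCR",
    pre=(Step(50, 3600), Step(94, 120)),
    cycle=(Step(94, 15), Step(55, 30), Step(68, 60)),
    n_cycles=40,
    post=(Step(72, 600),),
)

# Nested round: initial denaturation, 35 cycles, final extension.
NESTED = Program(
    name="nested PCR",
    pre=(Step(95, 300),),
    cycle=(Step(94, 10), Step(60, 30), Step(72, 60)),
    n_cycles=35,
    post=(Step(72, 600),),
)

if __name__ == "__main__":
    for prog in (FIRST_ROUND, NESTED):
        print(f"{prog.name}: {prog.hold_seconds() / 60:.1f} min of programmed holds")
```

Under these assumptions the programmed holds add up to 8520 s for the first round and 4400 s for the nested round; actual run times on an instrument would be longer because of ramping.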

Gene Sequencing
PCR products were purified using the PureLink Quick PCR Purification Kit (Invitrogen, Carlsbad, CA, USA) and subjected to double-strand DNA sequencing to cover the entire amplicon using a set of sequencing primers (Table 6). The sequencing reactions were carried out using the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA, USA) per manufacturer's instructions, followed by capillary electrophoresis performed on an Applied Biosystems PRISM 3730 Genetic Analyzer at the UNMC DNA Sequencing Core facility. Sequences of the primers used for DNA sequencing are described in Table 6.

Sequence Analysis of the Study Cohort Samples
Raw sequence data were manually edited, spliced, and assembled using Sequencher v4.9 to generate the final contig. Multiple sequence alignment of the full-length HIV-1 integrase gene was performed with all known HIV-1 group M reference sequences, using Clustal W [85]. All the reference sequences were obtained from the Los Alamos HIV Sequence Database. Phylogenetic trees were constructed using the neighbor-joining method, as well as the Maximum Likelihood method and General Time Reversible model, with 1000 bootstrap replication tests, using MEGA v6.0 software [86]. Samples' HIV subtypes were determined using the NCBI viral genotyping tool. Screening for INSTIs drug resistance-associated mutations was done using the Stanford University HIV Drug Resistance Database v8.7 [87]. The 3′-PPT and its flanking nucleotide sequences were curated from the full-length nef gene sequences, and their sequence alignments were performed using HXB2 as a reference sequence and the Clustal W program [85].
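The mutation names used throughout (for example T97A: reference residue, integrase position, observed residue) can be illustrated with a minimal sketch. This is not the Stanford database algorithm used in the study; the function names are hypothetical, the input is assumed to be an already aligned amino acid pair, and the short sequences in the usage example are toy strings rather than real HXB2 integrase residues. The accessory and polymorphic RAM names are those reported in this study.

```python
from __future__ import annotations

# Accessory and polymorphic INSTIs RAMs reported in this study.
ACCESSORY_RAMS = frozenset(
    {"T97A", "E157Q", "A128T", "M50I", "S119R", "L74M", "L74I", "S230N", "E138D"}
)


def call_substitutions(ref_aa: str, sample_aa: str, start: int = 1) -> list[str]:
    """List substitutions (e.g. 'T97A') between two pre-aligned sequences.

    `start` is the reference position of the first aligned residue.
    Gaps ('-') and ambiguous residues ('X') in the sample are skipped.
    """
    if len(ref_aa) != len(sample_aa):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    pos = start - 1
    for ref, obs in zip(ref_aa, sample_aa):
        if ref == "-":
            # Insertion relative to the reference: no reference position to report.
            continue
        pos += 1
        if obs in ("-", "X") or obs == ref:
            continue
        calls.append(f"{ref}{pos}{obs}")
    return calls


def flag_accessory(calls: list[str]) -> list[str]:
    """Keep only the calls that match the accessory RAM list above."""
    return [c for c in calls if c in ACCESSORY_RAMS]


if __name__ == "__main__":
    # Toy fragment assumed to start at position 94 (hypothetical residues).
    calls = call_substitutions("GQETAYF", "GQEAAYF", start=94)
    print(calls, flag_accessory(calls))
```

A real screen would also need to handle mixtures, frameshifts, and subtype-specific reference choices, which this sketch deliberately leaves out.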

Sequence Analysis of the Database Samples
We analyzed additional Cameroon HIV-1 sequences (integrase and nef/3′-PPT sequences) using full-length HIV-1 sequences from Cameroon previously submitted to the Los Alamos HIV sequence database [28]. We downloaded all Cameroon sequences available in the database and, after eliminating duplicate sequences from the same patient, included a total of 215 full-length sequences in the analysis. Full-length integrase gene sequences for these 215 samples were used to screen for INSTIs RAMs, using the Stanford University HIV Drug Resistance Database v8.7 [87]. The 3′-PPT and its flanking nucleotide sequences were curated from the full-length HIV-1 genomic sequences, and their sequence alignments were performed using HXB2 as the reference sequence and the Clustal W program [85].
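Once the 3′-PPT has been excised from an alignment, comparing it with a reference reduces to a position-by-position check. The sketch below is a hedged illustration of that step, not the authors' pipeline: the function name is hypothetical, the reference string is the commonly cited 15-nucleotide HIV-1 PPT ending in six guanines and should be verified against HXB2 before any real use, and positions are counted within the PPT rather than in HXB2 genome coordinates.

```python
from __future__ import annotations

# Assumed reference PPT (verify against HXB2 before use); ends in a six-G tract.
PPT_REF = "AAAAGAAAAGGGGGG"
# 0-based indices of the six trailing guanines (the G-tract).
G_TRACT = range(9, 15)


def ppt_mutations(sample: str, ref: str = PPT_REF) -> list[tuple[str, bool]]:
    """Return (mutation, in_g_tract) pairs for an aligned, gap-free PPT.

    Mutation names use 1-based positions within the PPT, e.g. 'G12A'.
    The in_g_tract flag is only meaningful for the default reference.
    """
    sample = sample.upper()
    if len(sample) != len(ref):
        raise ValueError("sample PPT must have the same length as the reference")
    found = []
    for i, (r, s) in enumerate(zip(ref, sample)):
        if r != s:
            found.append((f"{r}{i + 1}{s}", i in G_TRACT))
    return found


if __name__ == "__main__":
    # Toy inputs: one change inside the G-tract, one outside it.
    print(ppt_mutations("AAAAGAAAAGGAGGG"))
    print(ppt_mutations("AAAAGAGAAGGGGGG"))
```

Separating G-tract changes from other PPT changes mirrors the distinction drawn in this study between 3′-PPT mutations overall and those within the G-track.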

Statistical Analyses
Comparative analyses of males' and females' demographic data were performed using Student's t-tests (for continuous variables) and Fisher's exact tests (for binary variables). Descriptive statistics including counts and percentages were used to summarize gene polymorphism or mutation occurrences for both cohort and database samples. Fisher's exact tests were used to compare the proportion of gene polymorphism or mutation occurrences between groups for each gene. The false discovery rate (FDR) was controlled at no more than 0.05 to account for multiple comparisons [88]. All analyses were done using SAS version 9.4 or Prism version 7.0d.
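The following minimal sketch illustrates the two procedures named above: a two-sided Fisher's exact test on a 2x2 table, followed by FDR control across several tests. The analyses in this study were run in SAS and Prism, so this is not the authors' code. The text does not name the FDR method, so the Benjamini-Hochberg step-up procedure is assumed here, and the counts in the usage example are invented for illustration only.

```python
from __future__ import annotations

from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvals: list[float], q: float = 0.05) -> list[bool]:
    """Return one reject flag per p-value, controlling the FDR at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls under its step-up threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


if __name__ == "__main__":
    # Invented counts: mutation present/absent in two hypothetical groups.
    tables = [(12, 38, 3, 47), (5, 45, 6, 44), (20, 30, 4, 46)]
    pvals = [fisher_exact_two_sided(*t) for t in tables]
    print(pvals, benjamini_hochberg(pvals, q=0.05))
```

With one test per polymorphism or mutation, applying the step-up procedure to the whole list of p-values keeps the expected proportion of false positives among the reported differences at or below the chosen level.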

Data Availability
The full-length HIV-1 integrase and nef gene sequences generated from this study are available in the NCBI database with GenBank accession numbers: MK327828-MK327927 (integrase sequences) and MK333810-MK333910 (nef [including 3′-PPT] sequences).

Funding: This work was supported by grants from the National Institute of Mental Health MH094160 and the Fogarty International Center.

[...] Nebraska Medical Center High-Throughput DNA Sequencing and Genotyping Core Facility for assistance with gene sequencing.

Conflicts of Interest:
The authors declare no conflict of interest.