Recombination in Avian Gamma-Coronavirus Infectious Bronchitis Virus

Recombination in the family Coronaviridae has been well documented and is thought to be a contributing factor in the emergence and evolution of different coronaviral genotypes as well as different species of coronavirus. However, there are limited data available on the frequency and extent of recombination in coronaviruses in nature and particularly for the avian gamma-coronaviruses where only recently the emergence of a turkey coronavirus has been attributed solely to recombination. In this study, the full-length genomes of eight avian gamma-coronavirus infectious bronchitis virus (IBV) isolates were sequenced and along with other full-length IBV genomes available from GenBank were analyzed for recombination. Evidence of recombination was found in every sequence analyzed and was distributed throughout the entire genome. Areas that have the highest occurrence of recombination are located in regions of the genome that code for nonstructural proteins 2, 3 and 16, and the structural spike glycoprotein. The extent of the recombination observed, suggests that this may be one of the principal mechanisms for generating genetic and antigenic diversity within IBV. These data indicate that reticulate evolutionary change due to recombination in IBV, likely plays a major role in the origin and adaptation of the virus leading to new genetic types and strains of the virus.


Introduction
Avian infectious bronchitis virus (IBV) is a gamma-coronavirus in the family Coronaviridae, the order Nidovirales, and the genus Coronavirus that causes a highly contagious upper-respiratory disease of domestic chickens. In layer type birds it can cause a drop in egg production and some strains are nephropathogenic. Infectious bronchitis remains one of the most widely reported respiratory diseases of chickens worldwide despite the routine usage of attenuated live vaccines to control the disease. Control of IBV is difficult because there is little to no cross-protection between the numerous different serotypes of the virus.
Infectious bronchitis virus is an enveloped, single-stranded, positive-sense RNA virus with a genome length of approximately 27 kb. The 3' end of the genome encodes four structural proteins; spike (S), envelope (E), membrane (M) and nucleocapsid (N) as well as several non-structural proteins [1]. The S glycoprotein of IBV forms projections on the surface of the virion. Spike is post-translationally cleaved into S1 and S2 subunits with the S1 subunit forming the outermost portion and S2 forming a stalk-like structure that is embedded in the viral membrane. The S1 subunit contains hypervariable regions that play a role in attachment to host cell receptors, and it contains conformationally-dependent virus-neutralizing and serotype-specific epitopes [2,3]. Spike is also involved in membrane fusion and viral entry into the host cell. The E and M proteins are integral membrane proteins involved in assembly of the virus. The N protein is closely associated with the viral genome and plays a role in replication. The 5' two-thirds of the genome, approximately 21 kb, encodes two polyproteins 1a and 1ab. A minus one frame-shift mechanism is used to translate the 1ab polyprotein. The polyproteins are post-translationally cleaved into 15 non-structural proteins (nsps), nsp 2-16 (IBV does not have an nsp1) that make up the replication complex. Key nsps encoded, include a papain-like protease 2 (PLP2) within nsp 3, a main protease (Mpro) within nsp 5, and the RNA-dependent RNA-polymerase (RdRp) within nsps 11 and 12. Genetic diversity in coronaviruses is due to adaptive evolution driven by high mutation rates and genetic recombination [4]. High mutation rates are attributed to minimal proof reading capabilities associated with the RdRp. Recombination is thought to be due to a unique template switching "copy-choice" mechanism during RNA replication [5]. Evidence of recombination among strains of IBV has been observed both experimentally and in the field [6][7][8][9][10][11]. The emergence of several alpha-and beta-coronaviruses has been attributed to recombination [12,13] but only recently was recombination shown to be the mechanism behind the emergence of a novel gamma-coronavirus, turkey coronavirus (TCoV) [14]. Although "hot spots" of recombination in the genome of IBV have been reported [9,15], a thorough study of recombination using multiple different strains across the entire genome has not been conducted.
In this study we sequenced and analyzed the entire genome of eight IBV strains that represent different serotypes that have not been previously sequenced, and we compared these sequences with other gamma-coronavirus full-length genome sequences available in GenBank for evidence of recombination [16]. Different serotypes of field viruses and vaccine type viruses were selected to provide a wide variety of sequences potentially capable of contributing gene fragments to recombinants.

Sequence Analysis
The full-length genomes of eight isolates of IBV were sequenced at 5× to 10× coverage, and the consensus sequences were assembled. The genome size (see the end of the 3'UTR in Table 1), organization of the genome, and the location and size of the open reading frames (ORFs) are listed in Table 1 for each of the viruses. The gene order is the same for all the viruses examined; 5'UTR-1a/ab-spike-3a-3b-Envelope-Membrane-4b-4c-5a-5b-Nucleocapsid-3'UTR. In addition, the genomes for CAV/CAV56b/91, DE/DE072/92, FL/FL18288/71, Mass/H120, Iowa/Iowa97/56 and JMK/JMK/64 contain ORF 6b between nucleocapsid and the 3'UTR.
The full-length genomes were aligned and phylogenetic trees were constructed using the Neighbor-joining, Minimum Evolution, Maximum Parsimony and UPGMA programs in MEGA4 [17]. The trees all had similar topology and bootstrap support, and a representative tree is shown in Figure 1. The feline coronavirus FCoV/FIPV/WSU-79-1146 and the beluga whale virus BelugaWhaleCoV/SW1/08 were included as out-groups. The wild bird viruses isolated from a munia (MuniaCoV/HKUY13/09), thrush (ThrushCoV/HKU12/09) and bulbul (BulBulCoV/HKU11/09) formed a unique clade, which is not surprising as this group might represent a new coronavirus genus provisionally designated Deltacoronavirus [18]. The remaining viruses separated into clades consisting of IBV isolates from the US and vaccine viruses, TCoV isolates, an IBV isolate from West Africa and IBV isolates from China and Taiwan.
Vaccines for IBV used in commercial poultry include the serotypes Mass, Conn, DE and Ark. The PeafowlCcV/GD/KQ6/03, CK/CH/LSD/051/06 and CK/CH/ZJ971/97 strains from China grouped with Mass type viruses indicating that they are closely related, which is not surprising since Mass type vaccines are used in China. The overall percent similarities between the various strains are listed in Supplemental Table 1. All IBV genomes examined are greater than 80% similar at the nucleotide level.   genomes available in the database for CK/CH/EP3, CK/CH/p65, and Mass/Beaudette were excluded from the analysis because they are viruses not found in the field. The recombination programs can be used to detect recombination without reference sequences, and our analysis was conducted without regard to date of isolation because that information was not available for some of the viruses. Although the programs attempt to identify major and minor parent sequences contributing to each recombinant, the data reported herein only represents sequences in other viruses that are most closely related to the sequence surrounding the transferred fragment (major sequence) and the sequence closely related to the transferred fragment (minor sequences) and doesn't imply origin or source of the transferred fragment. In many cases, the transferred fragment has undergone mutations making it difficult to identify all the endpoints for the major and minor sequences. In addition, some of the transferred fragments overlap suggesting that recombinations have occurred between recombinant viruses.
Twenty-five IBV strains were examined and the viruses with the most transferred fragments in Table 2 are CAV/56b/91 and Mass/H52 both with 8 fragments, and CK/CH/LSD/051/06 and GA98/0470/98 both with 7 fragments. The strains with the fewest transferred fragments are Iowa/Iowa97/56 and TW/2575/98 with only 2 transferred fragments and the CK/CH/BJ/97, Holte/Holte/54, and NGA/A116E7/06 strains with only 1 transferred fragment. The Ark/Ark-DPI-p11/81 and Ark/Ark-DPI-p101/81 strains are the same virus that was passaged 11 and 101 times in embryonated eggs, respectively. Both viruses share identical transferred fragments indicating that they have identical recombination history. In addition, Conn/Conn46/66 and Conn/Conn46/91 share the same recombination history (4 identical transferred fragments). The Conn/Conn46/66 field virus was used to produce an attenuated live vaccine, which is currently used in commercial poultry. Viruses that share the same recombination history are likely derived from the same parent virus suggesting that Conn/Conn46/91 is reisolated Conn vaccine derived from the Conn/Conn46/66 virus. The FL/FL18288/71 virus also shares all 4 transferred fragments with the Conn viruses, however; FL/FL18288/71 and Conn viruses are different serotypes suggesting that FL/FL18288/71 is a field virus that emerged due to point mutations accumulating in spike over time rather than from recombination. All 6 of the transferred fragments in CK/CH/ZJ971/97 are identical to all 6 of the transferred fragments in vaccine strain Mass/H120, providing compelling evidence that CK/CH/ZJ971/97 is reisolated Mass/H120 vaccine. That observation was also reported by Zhang et al. [23]. It is interesting that Mass/H52 (8 transferred fragments) and Mass/H120 (6 transferred fragments) share only 5 identical transferred fragments. The Mass/H52 and Mass/H120 viruses were isolated circa 1955 in the Netherlands and it is widely accepted that H stands for Holland, but it actually stands for Houben, the owner of the broiler farm where the viruses were isolated [24]. It is thought that Mass/H120 was derived from Mass/H52 but the actual relationship between the viruses is not certain. Our data indicates that they are not necessarily parent and progeny but they are closely related.
The Gray/Gray/60 and JMK/JMK/64 viruses share 99.7% nucleotide similarity across the entire genome and have 4 identical transferred fragments with JMK/JMK/64 having one additional fragment located in the 5'UTR, which is not found in Gray/Gray/60. It is well known that the Gray/Gray/60 virus is nephropathogenic, whereas the JMK/JMK/64 virus is strictly respirotropic. Perhaps sequence differences in the 5'UTR, which is involved in replication of the viral genome, play a role in the different pathobiologies observed for these viruses.      There is evidence that some transferred fragments in field viruses come from vaccines. As an example, CK/CH/LSD/051/06 has 3 of 7 and 2 of 7 transferred fragments in common with vaccine strains Mass/H52 and Mass/H120, respectively. In addition, the only fragments that USA viruses have in common with the viruses from China and Taiwan are fragments also associated with Mass type vaccines, which are used in both regions, providing further evidence that some of the fragments in field viruses come from vaccines. That result and the observation in Figure 1 that the viruses separated into clades based on geographic location also supports the conclusion that USA viruses have not recombined with Asian viruses.
A difference in the order of taxa in phylogenetic trees constructed from different regions of the genome is further evidence of recombination [25]. The ordering of taxa in sequential trees [26,27] was conducted and inconsistent phylogenetic relationships were observed for all of the examined virus strains across the entire genome, indicating a substantial amount of recombination (data not shown). There is a high number of breakpoints in the 1a region of the genome and immediately upstream of the S gene, which has been previously shown to be a 'hot spot' for recombination [9]. A phylogenetic compatibility matrix constructed at the 70% bootstrap level for 250 bp sequence fragments at 100 bp intervals also showed that recombination breakpoints were distributed throughout the IBV genomes (data not shown).
To determine recombination hot and cold spots, a recombination breakpoint distribution plot (Figure 3) was generated in RDP4 using a 200 nt window and 1,000 permutations [21]. No global hot-spot regions were observed in the 95% and 99% confidence thresholds (dotted lines at the top of the graph). The detectable recombination breakpoint positions are shown at the top of the figure and were distributed throughout the genome with a relatively high number clustered just upstream of the S gene. That region also had the highest breakpoint count within the 99% local hot/cold-spot confidence interval. A high number of breakpoints were also observed in the 1a region of the genome; nsp 2, nsp 3, and nsp 16, in the envelope and matrix protein genes and in a small area near the 3'UTR. Table 3 shows that nsp2, nsp3, nsp16 and spike genes were associated with the greatest number of transferred fragments, which is consistent with the location and number of breakpoints in Figure 3.
Recombination in the 1ab ORF area, which encodes the nonstructural proteins involved in the viral replication complex, has the potential to alter the pathogenicity of the virus [28]. The nsp 2 contains hydrophobic residues that likely anchor the replication complex to the Golgi [29]. The nsp 3 encodes the protease PLP2 which cleaves nsps 2, 3, and 4 and an area with ADP-ribose 1'-phosphatase (ADRP) activity. The protease PLP2 has been shown to have deubiquinating-like activity [30] and also to be a type I interferon (IFN) antagonist [31]. Changes in the amino acid composition of this area could affect the ability of the virus to replicate in a variety of cell types. The ADRP region of nsp 3 is conserved among coronaviruses [32,33], and a recent study suggested a biological role for the coronavirus ADRP in modulating the expression of pro-inflammatory immune modulators such as tumor necrosis factor alpha and interleukin-6 [34]. Recombination in this area could alter the pathogenicity of the virus by modulating host cytokine expression. The nsp16 is reported to be an S-adenosyl-L-methionine (AdoMet)-dependent RNA (nucleoside-2'O)-methyltransferase (2'O-MTase) responsible for capping the viral mRNA nascent transcripts [32]. An alteration in the efficiency of this protein could profoundly decrease not only viral replication but also pathogenicity. The spike glycoprotein of IBV on the surface of the virus plays a role in attachment to host cell receptors, membrane fusion and entry

Conclusions
In this study, evidence was obtained that recombination is occurring among avian coronavirus IBV isolates across their entire genome. Every sequence included in the analysis was recognized as a potential recipient of horizontally acquired sequences at some point in its viral evolutionary past. The nsp2, nsp3, nsp16 were associated with the greatest number of transferred fragments. In addition, the area immediately upstream of the spike gene had the highest number of recombination breakpoints. Breakpoints in the 1ab polyprotein gene have the potential to alter pathogenicity of the virus, and breakpoints near or in spike have the potential to lead to the emergence of new serotypes of IBV or new coronaviruses. Although the spike region determines the serotype of the virus, the remainder of the genome may be a mosaic of sequence fragments from a variety of gamma-coronaviruses. The only evidence of a gamma-coronavirus possibly recombining with an alpha or beta-coronavirus was the The PCR reaction was run on the same machine as the RT step and included a one-time initial denaturation step of 94 °C for 2 min, followed by 30 cycles of 94 °C for 30 s, 60 °C for 30 s and 72 °C for 3 min.
The PCR products were agarose gel purified using the QIAquick gel extraction kit (Qiagen, Valencia, CA, USA) according to the manufacturer's protocol. The PCR products were cloned into the TOPOXL vector using the TOPOXL cloning kit (Invitrogen, Carlsbad, CA, USA) according to manufacturer's protocol to prepare cDNA libraries for sequencing.
Plasmid DNA from the libraries of the cloned cDNA fragments for each virus was isolated using an alkaline lysis method modified for the 96-well format and incorporating both Hydra and Tomtek robots. Sequencing reactions were performed using the BigDye™ Terminator® Cycle Sequencing Kit Version 3.1 (Applied Biosystems, Foster City, CA, USA) and MJ Research (Watertown, MA, USA) thermocyclers. Sephadex filter plates were used to filter each reaction into Perkin-Elmer MicroAmp Optical 96-well plates. A 1/12-strength sequencing reaction on an ABI 3730 was used to sequence each clone from both the 5' and 3' ends.
Primers for one-step RT-PCR were specifically designed for each virus (Supplemental Table 2). Viral RNA was amplified using the Titan One Tube RT-PCR kit (Roche Diagnostics, Indianapolis, IN, USA) following manufacturer's instructions. A DNA Engine Peltier Thermocycler (Bio-Rad Laboratories, Inc., Hercules, CA, USA) was used for the RT-PCR reaction, which had the following steps: one cycle of 42 °C for 60 min and 95 °C for 5 min, followed by 10 cycles of 94 °C for 30 s, 50 °C for 30 s, and 68 °C for 1 min 30 s, and then 25 cycles of 94 °C for 30 s, 50 °C for 30 s, 68 °C for 1 min and 30 s adding 5 s with each cycle.
The resulting PCR products were agarose gel purified using the QIAquick gel extraction kit (Qiagen, Valencia, CA, USA) according to the manufacturer's protocol. The resulting cDNA was sequenced using ABI Prism BigDye Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems, Foster City, CA, USA) following the manufacturer's protocol. The reactions were prepared for sequencing by centrifugation through either a Centri-Sep column (Applied Biosystems, Foster City, CA, USA) or using the Edge system (EdgeBio, Gaithersburg, MD, USA) plate. The samples were sequenced at the Georgia Genomics Facility (University of Georgia, Athens, GA, USA).

Genome Assembly and Analysis
Chromatogram files and trace data were read and assembled using SeqMan Pro, and genome annotation was conducted with SeqBuilder (DNASTAR, Inc., Madison, WI, USA). Each sequence was aligned to a representative genome; Mass/Mass41/41 (GenBank accession #AY851295), or CAL99/CAL99/99 (GenBank accession #AY514485) as a backbone for genome assembly.
Whole genome analyses were generated and phylogenetic trees constructed with the Neighbor-Joining method with 1000 bootstrap replicates as well as with Minimum Evolution, Maximum Parsimony and UPGMA methods [17].
IBV strains in phylogenetic trees generated from sequential genome fragments using TreeOrder Scan (Version 1.6, Simmonics, University of Warwick, Coventry, UK) [26,27]. Changes in the tree position of taxa supported at the 70% or greater bootstrap level for a 250 bp sequence window were examined at 100 bp intervals. In addition, a phylogenetic compatibility matrix was constructed and used to examine the frequency and location of recombinations across the entire genome.

Recombination Site Detection
Potential recombination sites were identified using the RDP4 software [22] and a breakpoint map was constructed. A breakpoint density plot was then created from this map by moving a 200 nt window 1 nt at a time along the length of the map. The number of breakpoints falling within a window was plotted at the central window position. A 99% (upper) and 95% (lower) confidence threshold for globally significant breakpoint clusters (defined as windows with more breakpoint positions than the maximum found in >95% of the 1,000 permuted plots) was calculated. In addition, 99% and 95% confidence intervals were calculated for local breakpoint clusters (defined as windows with more breakpoint positions than the maximum found in >99% of the windows at that location in 1,000 permuted plots).