Genomics-Driven Activation of Silent Biosynthetic Gene Clusters in Burkholderia gladioli by Screening Recombineering System

The genus Burkholderia is ecologically and metabolically diverse. A large number of silent biosynthetic gene clusters (BGCs) in Burkholderia genomes remain uncharacterized and represent a promising resource for the discovery of new natural products. However, exploitation of the metabolic potential of Burkholderia is limited by the absence of efficient genetic manipulation tools. Here, we screened for a bacteriophage recombinase system and identified Redγ-BAS, which is functional for genome modification in the plant pathogen Burkholderia gladioli ATCC 10248. Using this recombineering tool, constitutive promoters were precisely inserted into the genome, leading to activation of two silent nonribosomal peptide synthetase gene clusters (bgdd and hgdd) and production of the corresponding new classes of lipopeptides, burriogladiodins A–H (1–8) and haereogladiodins A–B (9–10). Structure elucidation revealed the nonproteinogenic amino acid Z-dehydrobutyrine (Dhb) in 1–8 and E-Dhb in 9–10. Notably, compounds 2–4 and 9 feature an unusual threonine tag, which makes them longer than predicted from the collinearity of the assembly lines. The structural diversity of the burriogladiodins derives from the relaxed substrate specificity of the fifth adenylation domain as well as from chain termination by water or threonine. The recombinase-mediated genome editing system is not only applicable to B. gladioli but also has great potential for mining silent gene clusters from other Burkholderia species.


Introduction
The genus Burkholderia belongs to the beta subdivision of the proteobacteria and occupies diverse ecological niches, ranging from terrestrial and aquatic habitats as free-living organisms to associations with eukaryotic hosts [1][2][3][4]. It exhibits great potential for the production of a variety of potent antibacterial, antitumor, herbicidal and insecticidal compounds [5,6]. Genome analysis has revealed a large number of natural product biosynthetic gene clusters in Burkholderia, presenting an abundant reservoir of nonribosomal peptides and polyketides, which are of particular interest due to their various biological properties [3,7]. Using genome-guided discovery approaches, diverse compounds have been discovered from Burkholderia, such as bolagladins/glidochelins, gladiofungins, and thailandepsins/burkholdacs [8][9][10][11][12]. However, many silent BGCs embedded in Burkholderia genomes still need to be investigated. During our ongoing genome mining efforts to discover new bioactive compounds from Burkholderia sensu lato, the model plant pathogen Burkholderia gladioli ATCC 10248, which potentially produces new structures based on bioinformatic analysis, attracted our interest [4][13][14][15][16]. Its genome is estimated to be 8.9 Mbp in size with a G + C content of 68%, comprising two chromosomes (NZ_CP009323.1, NZ_CP009322.1) and three free plasmids (NZ_CP009321.1, NZ_CP009320.1, NZ_CP009319.1) [17]. antiSMASH analysis predicted 19 biosynthetic gene clusters (BGCs) [18]. At present, only three types of natural products, the polyketide gladiolins, the nonribosomal peptide icosalides, and sulfazecin, have been discovered in this model strain; the remaining BGCs are likely dormant and need to be awakened through appropriate technical methods [19][20][21].
Promoter engineering has proven to be a useful strategy for the activation of silent gene clusters [22]. By substituting the native promoter of a dedicated BGC with a constitutive or inducible promoter, the transcriptional regulation of the BGC in the original producer can be bypassed. However, this methodology requires genome editing tools that work in the target microorganism. Red/ET recombineering technology, mediated by the λ phage Redα/Redβ or Rac prophage RecE/RecT recombinases, is an efficient genetic engineering method that has been used primarily in Escherichia coli for genome editing with short homology arms (40-50 bp) [23].
Because of the limitations of the Red/ET recombination system in other microorganisms, our group recently established two other recombination systems, identified by homology to these recombinases, for the genera Burkholderia and Pseudomonas, respectively [13,24]. One is Redγ-Redαβ7029, which was discovered in Schlegelella brevitalea DSM 7029 (previously known as Burkholderiales strain) and works in several Burkholderiales strains [13,15,25]. The other is BAS, a lambda Red-like recombination system from Pseudomonas aeruginosa phage Ab31 that was established to carry out genome editing in four Pseudomonas species [24]. These recombineering-mediated genome editing systems provide us with a convenient alternative for genetic manipulation of the target B. gladioli strain ATCC 10248.
In this work, we first screened an applicable genome editing recombination system for B. gladioli ATCC 10248 and used it to activate two silent nonribosomal peptide synthetase (NRPS) BGCs by insertion of potent exogenous promoters. Ten new lipopeptides, burriogladiodins A-H (1-8) and haereogladiodins A-B (9-10) (Figure 1), were identified through HRESIMS, NMR, and Marfey's analysis.

Bioinformatic Analysis and Manipulation of Silent BGCs in B. gladioli ATCC 10248
Bioinformatic analysis with the antiSMASH platform revealed nineteen putative secondary metabolite BGCs in the genome of B. gladioli ATCC 10248, including six BGCs on chromosome 1 and thirteen BGCs on chromosome 2 (Table S1) [18]. Apart from the known gladiolin, icosalide and sulfazecin BGCs [19][20][21], the remaining five NRPS, two polyketide synthase (PKS), and one NRPS-PKS hybrid clusters differed from known BGCs, indicating the potential of ATCC 10248 to produce new secondary metabolites. Among the NRPS BGCs, BGC 2 and BGC 5 on chromosome 2 (Chr2C2 and Chr2C5) attracted our attention because they contain a starter condensation (Cs) domain putatively responsible for the biosynthesis of lipopeptides, a remarkable class of pharmaceutical molecules with distinctive antibacterial, antifungal, or surfactant activities [26,27].
According to bioinformatic prediction, Chr2C2 and Chr2C5 are conserved among some plant-pathogenic Burkholderia species that produce unusual threonine-tagged lipopeptides, such as B. glumae and B. plantarii [28,29]. Based on the predicted substrate specificity of the assembly lines, Chr2C2 and Chr2C5 probably synthesize a heptapeptide and a pentapeptide with FA-Thr-Pro-Gln-Ala-X-Phe-Pro and FA-Thr-Thr-X-X-Pro backbones, respectively. LC-MS analysis of crude extracts from cultures of ATCC 10248 did not show the corresponding products under our laboratory conditions, indicating that Chr2C2 and Chr2C5 are silent.

Screening an Available Recombineering Genome Editing System for B. gladioli ATCC 10248
Because feasible genetic tools for the manipulation of ATCC 10248 were lacking, we set out to establish an efficient recombination system in the strain by introducing phage recombinases that have shown high efficiency in genome editing of other strains. Three recombination systems, Redγβα, Redγ-Redαβ7029, and Redγ-BAS, were tested for genome editing in ATCC 10248. Redγβα from E. coli and Redγ-Redαβ7029 from S. brevitalea DSM 7029 have been used for genome editing of Burkholderia species, while the BAS recombinases originate from P. aeruginosa, which is closely related to Burkholderia gladioli ATCC 10248 [13,24,30].
To investigate the applicability of the three recombination systems to genome modification in ATCC 10248, each system was first introduced into ATCC 10248 by electroporation. An apramycin resistance gene flanked by homology arms of different lengths (50 bp, 75 bp, or 100 bp) was then transformed into each of the three transformants and used to replace a 1276 bp fragment (468577-469853) of the gladiolin gene cluster (gbn) on chromosome 2 (Figure 2a). The recombinants were verified by colony PCR and by the abolishment of gladiolin production (Figure 2c). The results showed that Redγ-BAS efficiently mediated genome modification with all three homology arm lengths. Redγβα also functioned with the three arm lengths, but yielded only low numbers of colony-forming units (CFU): ~8, ~15, and ~70, respectively. Unexpectedly, Redγ-Redαβ7029 was ineffective in ATCC 10248 even when 100 bp homology arms were used (Figure 2b). Therefore, the Redγ-BAS recombination system was used for genome mining in ATCC 10248.
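The replacement experiment described above can be pictured in silico. The following is a minimal Python sketch, not the authors' protocol: it assumes 0-based, half-open coordinates, a randomly generated toy genome, and a placeholder marker sequence, and the function names `build_cassette` and `recombine` are hypothetical.

```python
import random


def build_cassette(genome: str, start: int, end: int, marker: str, arm_len: int) -> str:
    """Flank `marker` with the arm_len bases on each side of genome[start:end]."""
    if arm_len <= 0 or start - arm_len < 0 or end + arm_len > len(genome):
        raise ValueError("homology arms fall outside the genome")
    return genome[start - arm_len:start] + marker + genome[end:end + arm_len]


def recombine(genome: str, cassette: str, arm_len: int) -> str:
    """Replace the region between the two homology arms with the cassette payload."""
    left, right = cassette[:arm_len], cassette[-arm_len:]
    payload = cassette[arm_len:-arm_len]
    # Each arm must match exactly one genomic site for an unambiguous outcome.
    if genome.count(left) != 1 or genome.count(right) != 1:
        raise ValueError("homology arms must each occur exactly once")
    i, j = genome.index(left), genome.index(right)
    if j < i + arm_len:
        raise ValueError("right arm must lie downstream of the left arm")
    return genome[:i + arm_len] + payload + genome[j:]


# Toy data (assumptions): a random 3000-nt "genome" and a placeholder marker.
rng = random.Random(0)
toy_genome = "".join(rng.choice("ACGT") for _ in range(3000))
marker = "ATG" + "GCA" * 30 + "TAA"
start, end = 500, 1776  # a 1276-nt target, mirroring the fragment size in the text

# Try the three arm lengths mentioned in the text.
edited = {
    arm: recombine(toy_genome, build_cassette(toy_genome, start, end, marker, arm), arm)
    for arm in (50, 75, 100)
}
```

In this idealized model all three arm lengths give the same edited sequence; the differences in colony numbers reported in the text reflect biological efficiency, which the sketch does not attempt to model.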

Activating Two Silent Biosynthetic Gene Clusters and Structure Elucidation of Lipopeptides
To activate the silent Chr2C2 and Chr2C5 gene clusters, the original promoters of the target BGCs were replaced by the constitutive promoter Pgenta associated with the gentamicin resistance gene. To avoid interference from the high yield of gladiolin, we constructed a gladiolin-deficient mutant by inserting an apramycin resistance gene into the gladiolin gene cluster (Figures S1 and S2). Inactivation mutants of the Chr2C2 and Chr2C5 gene clusters were constructed by replacing the core domains of the target genes with the gentamicin resistance gene. The mutants were fermented, and the compounds were extracted.
According to the NMR data (Table 1), compound 1 contains seven amino acid moieties: Dhb, two Pro, Gln, Ala, Val, and Phe. It is closely related to burriogladin A isolated from B. gladioli pv. agaricicola, differing in that the p-hydroxyphenylglycine (p-Hpg) is replaced by a Val in 1 [28], as suggested by the HMBC correlations from the Val H-2 to Val C-1/C-3/C-4/C-5 and from Val H-4/H-5 to C-2/C-3, and supported by successive COSY correlations between Val H-2/H-3/H-4 or H-5. The detailed structure of 1 was further confirmed by the 2D NMR correlations (Figure 4), MS/MS fragmentations and Marfey's analysis (Figures S3 and S4). The configurations of the amino acid residues were determined to be D- and L-Pro, D-Gln, L-Ala, D-Val, and L-Phe. The D-amino acids were predicted to be generated by the corresponding Cd domains in the assembly line, which have dual epimerization and condensation activity [31]. The absolute configuration at C-3 of β-OH-decanoate (β-OH-Dec) was proposed to be 3R because the Cs domain showed high homology to the Cs domain of burriogladins (99% identity) [28].
Burriogladiodin B (2) was also isolated as a white solid with the molecular formula C50H77N9O13 (HRESIMS, m/z 1012.5693 [M + H]+, calcd 1012.5714). Preliminary NMR analysis of 2 (Table 1) showed a close similarity to 1 except for several additional signals typical of Thr. The connection of the Thr fragment to Pro2 via an amide bond was suggested by obvious changes in the chemical shifts of the Pro2 part and evidenced by 2D NMR correlations from Thr NH to Thr C-2 and Pro2 C-1 (Figure 4). The Thr in 2 was confirmed to have the L configuration by Marfey's analysis, while the other amino acids were identical to those in 1 (Table S3).
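The calculated m/z value quoted above can be reproduced from the molecular formula. The following Python sketch sums monoisotopic masses and adds a proton; the isotope masses and the electron mass are standard reference values that do not appear in the source, and the function names are illustrative only.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; standard reference values.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
}
ELECTRON_MASS = 0.000548579909


def monoisotopic_mass(formula: str) -> float:
    """Sum monoisotopic masses for a simple formula such as 'C50H77N9O13'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mz_protonated(formula: str) -> float:
    """m/z of the singly charged [M + H]+ ion (hydrogen atom minus one electron)."""
    return monoisotopic_mass(formula) + MONOISOTOPIC["H"] - ELECTRON_MASS


def ppm_error(observed: float, calculated: float) -> float:
    """Relative deviation of the observed from the calculated m/z, in ppm."""
    return (observed - calculated) / calculated * 1e6


calcd_2 = mz_protonated("C50H77N9O13")
error_2 = ppm_error(1012.5693, calcd_2)
```

Under these assumptions the calculated value for C50H77N9O13 rounds to 1012.5714, in agreement with the text, and the measured 1012.5693 deviates by roughly −2 ppm. The same function can be applied to the formula C51H79N9O13 reported for 3 and 4.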
Burriogladiodins C (3) and D (4) could not be separated on a reverse-phase C18 column and were thus obtained as a 2:1 mixture of isomers (as calculated from the 1H NMR integrals) in DMSO-d6. They have the same molecular formula, C51H79N9O13, deduced from the protonated ion peak at m/z 1026.5851 [M + H]+ in the HRESIMS spectrum. The NMR data of 3-4 (Table 2) showed a high similarity to those of 2 except for one additional methylene signal (δC 41.4 in 3, 25.4 in 4). HMBC correlations from Leu or Ile H-1 to C-1/C-3/C-4 and H-5/H-6 to C-3/C-4, together with a series of COSY correlations between Leu or Ile NH/H-2/H-3/H-4/H-5 or H-6, clearly indicated a Leu in 3 and an Ile in 4 instead of the Val in 2. Finally, the complete structures of 3 and 4 were elucidated unambiguously by 2D NMR correlations (Figure 4) and were supported by tandem MS/MS analysis and a feeding experiment (Figure S5). The absolute configurations of Leu in 3 and Ile in 4 were both determined to be D by Marfey's analysis (Table S3).
The NMR data of 5-8 (Tables 3 and 4) were similar to those of 1-4. Compared to 1, compounds 5-8 shared the common feature of the disappearance of the Pro moiety. Moreover, the main difference between 6 and 7 (Table 4), obtained as a mixture in a ratio of ca. 3:4, is the substitution of Leu in 6 by Ile in 7. A comparison of the 1D NMR data of 8 (Table 3) with those of 5 clearly demonstrated that 8 has an Ala in place of the Val in 5. The planar structures of 5-8 were further confirmed by the 2D NMR correlations (Figure 4) and MS/MS analysis as well as a feeding experiment (Figures S6 and S7). The Dhb units in 1-8 were determined to have the Z configuration by NOESY correlations, as represented by compound 5 (Figure 4), and by biosynthetic considerations.
The NMR data of 9 and 10 (Table 5) were similar to those of haereoglumins A and B, but differed in the first amino acid [28], which is a Tyr in 9 and 10. Their structures were further confirmed by the 2D NMR correlations (Figure 4) and MS/MS fragmentations (Figure S8). The E-geometry of the double bond in Dhb was determined by the NOESY correlations (Figure 4). According to Marfey's analysis, the absolute configurations of the amino acids in 9 and 10 are D-Tyr, L-Leu, and L-Thr.

Biosynthesis of Burriogladiodins and Haereogladiodins
Accurate structure determination, assisted by bioinformatic analysis, allowed us to propose the biosynthetic mechanisms of burriogladiodins and haereogladiodins (Figure 5a,b). The elucidated structure of burriogladiodin A (1) is consistent with the chemical backbone predicted by in silico BGC analysis, while the additional C-terminal threonine tag of burriogladiodins B-D (2-4) is assumed to be introduced by the TE domain. The C-terminal threonine tag has been found in the biosynthesis of burrioplantin/burriogladins/burrioglumins, whose BGCs show high homology with the bgdd gene cluster [28,29]. The structural diversity of burriogladiodins B-D (2-4) is proposed to be generated by the substrate flexibility (Val/Leu/Ile) of the A5 domain. In addition, premature termination of elongation on the assembly line led to the formation of the four truncated burriogladiodins E-H (5-8). Haereogladiodins are proposed to share similar biosynthetic mechanisms with burriogladiodins. Compared to 9, haereogladiodin B (10) is an early hydrolysis product, like 5-8 and our previously discovered holrhizins [14].
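The two proposed sources of diversity for the full-length burriogladiodins, a flexible fifth position and release by water or threonine, can be written out as a small enumeration. The Python sketch below is an illustration only: it assumes that Dhb occupies the predicted Thr position, ignores D/L configuration and the truncated compounds 5-8, and uses hypothetical names throughout. Only the four combinations labeled in `REPORTED` are described in the text; the other two are hypothetical candidates.

```python
from itertools import product

# Backbone taken from the predicted FA-Thr-Pro-Gln-Ala-X-Phe-Pro assembly line,
# with β-OH-Dec as the fatty acid and Dhb at the Thr position (assumption).
PREFIX = ["β-OH-Dec", "Dhb", "Pro", "Gln", "Ala"]
SUFFIX = ["Phe", "Pro"]
POSITION_5 = ["Val", "Leu", "Ile"]          # substrates accepted by the A5 domain
RELEASE = {"water": [], "Thr": ["Thr"]}     # nucleophile used for chain release

# Combinations that correspond to compounds reported in the text.
REPORTED = {
    ("Val", "water"): "burriogladiodin A (1)",
    ("Val", "Thr"): "burriogladiodin B (2)",
    ("Leu", "Thr"): "burriogladiodin C (3)",
    ("Ile", "Thr"): "burriogladiodin D (4)",
}


def candidate_products() -> dict:
    """Map (position-5 residue, release nucleophile) to a linear sequence string."""
    out = {}
    for residue, (nucleophile, tag) in product(POSITION_5, RELEASE.items()):
        out[(residue, nucleophile)] = "-".join(PREFIX + [residue] + SUFFIX + tag)
    return out


candidates = candidate_products()
for key, sequence in candidates.items():
    print(sequence, "|", REPORTED.get(key, "not reported in the text"))
```

The enumeration yields six candidates, which makes it easy to see that Leu- or Ile-containing full-length products released by water are not among the compounds described here.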

The Bioactivity Assays of Burriogladiodins and Haereogladiodins
Bioactivity tests showed that compounds 1-10 had no obvious activity against the four selected Gram-positive and Gram-negative bacteria (MIC > 100 µM) or against six tumor cell lines and the normal cell line 293T (IC50 > 20 µM, Table S4). Lipopeptides often mediate important processes such as biofilm formation and swarming motility, and lipopeptides with the unusual threonine tag are conserved in mushroom- and plant-pathogenic Burkholderia and in environmental Paraburkholderia, where they could promote bacterial infection of the host [15,28,29]. We therefore performed swarming and swimming assays with the wild type and mutants to examine their effects on bacterial cell motility (Figure S9). The assays showed that the activation mutant swarmed and colonized a larger area than the wild-type strain, but a smaller area than the gladiolin-deficient mutant. The gladiolins could probably inhibit the swarming ability of the wild type, while burriogladiodins and haereogladiodins may promote the swarming ability of strain ATCC 10248.
In this work, we successfully established a recombination system for genome mining in B. gladioli ATCC 10248 and activated two silent NRPS BGCs by inserting a potent exogenous promoter. Two new classes of lipopeptides, burriogladiodins (1-8) and haereogladiodins (9-10), were isolated and structurally elucidated, adding new members to the family of linear lipopeptides. According to the swarming and swimming assays, these compounds probably play a role in bacterial invasion of plant hosts.

General Experimental Procedures
Optical rotations were obtained on a JASCO P-1020 digital polarimeter (JASCO Corporation, Tokyo, Japan). UV spectra were recorded on a Thermo Scientific Dionex Ultimate 3000 DAD detector, and IR spectra were taken on a Nicolet NEXUS 470 spectrophotometer as KBr disks (Thermo Fisher Scientific, Waltham, MA, USA). 1H and 13C NMR, DEPT, and 2D NMR spectra were recorded on an Agilent 500 MHz DD2 spectrometer (Agilent Technologies Inc., Santa Clara, CA, USA) using TMS as an internal standard. HRESIMS spectra were measured on a Bruker Impact HD microTOF Q III mass spectrometer (Bruker, Rheinstetten, Germany) using the standard ESI source. UHPLC-MS was performed on a Thermo Scientific Dionex Ultimate 3000 system coupled to a Bruker amazon SL ion trap mass spectrometer, controlled by Hystar v3.2 and Chromeleon Xpress software. A Thermo Scientific™ Acclaim™ C18 column (2.1 × 100 mm, 2.2 µm) was used. The mobile phase consisted of H2O and acetonitrile (ACN), both containing 0.1% formic acid. Semipreparative HPLC (Agilent Technologies Inc., Santa Clara, CA, USA) was performed using an ODS column (Bruker ZORBAX SB-C18, 9.4 × 250 mm, 5 µm, 3 mL min−1). Vacuum-liquid chromatography (VLC) was carried out over silica gel H (Qingdao Marine Chemical Factory, Qingdao, China).

Bacterial Strains, Plasmids and Reagents
The strains, mutants and plasmids used in this study are listed in Table S5.

Knockout and Promoter Insertion of the Silent Gene Clusters on the Chromosome of B. gladioli ATCC 10248
The target genes were knocked out by replacement with the gentamicin resistance gene using the Redγ-BAS system. The BGC activation mutants were constructed by inserting a constitutive promoter (Pgenta) in place of the original promoter in front of the main biosynthetic gene of each BGC. The antibiotic resistance genes and the constitutive promoter, flanked by homology arms (50 bp), were generated by polymerase chain reaction (PCR) amplification using 2 × PrimeSTAR Max polymerase (Takara Biomedical Technology (Beijing) Co., Ltd., Beijing, China); the templates for gentaR and Pgenta, and for apraR, were plasmids R6K-lox71-genta-lox66-FleQ and RK2-apra-cm, respectively. For recombineering, the purified PCR products were transformed into B. gladioli ATCC 10248/pBBR1-Rha-Redγ-BAS-km. Recombinants were selected on CYMG plates containing gentamicin (120 µg mL−1) or apramycin (250 µg mL−1). Correct recombinants were verified by colony PCR. A list of recombinants generated in this study is provided in Table S5. Primers used for gene cluster modification are listed in Table S6.
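Adding 50 bp homology arms by PCR amounts to placing each arm at the 5' end of a primer that anneals to the cassette. The following Python sketch shows one common way to assemble such primers; it is not taken from the study, all sequences are toy placeholders, the 20-nt annealing length is an assumption, and `design_primers` is a hypothetical helper.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(upstream_arm: str, downstream_arm: str, cassette: str,
                   anneal_len: int = 20) -> tuple:
    """Build primers whose 5' tails carry the homology arms.

    Forward primer: upstream arm + first anneal_len bases of the cassette.
    Reverse primer: reverse complement of (last anneal_len bases + downstream arm).
    """
    if len(cassette) < 2 * anneal_len:
        raise ValueError("cassette is shorter than the two annealing regions")
    forward = upstream_arm + cassette[:anneal_len]
    reverse = reverse_complement(cassette[-anneal_len:] + downstream_arm)
    return forward, reverse


# Toy placeholders (assumptions): 50-nt arms and a short stand-in cassette.
upstream = ("ACGGTCAT" * 7)[:50]
downstream = ("TTGCAGCC" * 7)[:50]
cassette = "ATGAGCCATATTCAACGGGAAACG" + "GCTA" * 10 + "CCTGATGCTCTTCGTCCAGATCTAA"

fwd, rev = design_primers(upstream, downstream, cassette)

# Expected amplicon: the cassette flanked by both homology arms.
amplicon = fwd + cassette[20:-20] + reverse_complement(rev)
```

With 50-nt arms and a 20-nt annealing region, each primer is 70 nt long, and the resulting amplicon carries the cassette flanked by both arms.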

Extraction and Isolation
The recombinant strains B. gladioli ATCC 10248∆gbnPgenta-bgdd and B. gladioli ATCC 10248∆gbnPgenta-hgdd were fermented in 20 L of CYMG medium supplemented with 30 µg mL−1 kanamycin (km) at 30 °C and 200 rpm for 3 days; 2% (v/v) XAD-16 resin was then added and incubation was continued for another day. The resin was collected by sieving, washed with double-distilled H2O (ddH2O), and then extracted with methanol (5 L). The extracts were concentrated under reduced pressure. The final crude extracts were subjected to vacuum liquid chromatography (VLC) on a silica gel column using step gradient elution with CH2Cl2 and

Antibacterial and Cytotoxic Activities Assay
The antibacterial activities of the compounds were evaluated using the Kirby-Bauer disk diffusion method. The tested bacteria included the Gram-negative bacteria Escherichia coli ATCC 35218 and Pseudomonas aeruginosa ATCC 27853, and the Gram-positive bacteria Staphylococcus aureus ATCC 29213 and Bacillus subtilis ATCC 6633. The tested microorganisms were obtained from the China General Microbiological Culture Collection Center (CGMCC). The tested cell lines included human hematological disease cells K562, human breast adenocarcinoma cells MCF7, human hepatoma cells HepG-2, human lung adenocarcinoma cells A549, human cervical epithelioid carcinoma cells HeLa, human colon cancer cells HCT-116, and human lung normal cells 2B. The detailed procedures followed previously described methods [14].

Swarming and Swimming Assay
Hot CYMG-0.5% agar and hot CYMG-0.25% agar (15 mL) were poured into Petri dishes for the swarming and swimming assays, respectively [28,32], and the plates were dried. An overnight bacterial culture in CYMG was diluted to an OD600 of 0.1. The suspension (3 µL) was carefully spotted at the center of each agar plate. The plates were incubated at 30 °C for 36 h.
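The dilution step follows the usual C1V1 = C2V2 relation. The short Python sketch below computes the volumes needed to reach an OD600 of 0.1; the overnight OD600 of 2.5 and the 1 mL final volume are hypothetical example values, and the calculation assumes that OD600 scales linearly with cell density, which is only approximately true for dense cultures.

```python
def dilution_volumes(od_stock: float, od_target: float, final_volume_ml: float) -> tuple:
    """Return (culture_ml, medium_ml) needed to reach od_target via C1*V1 = C2*V2."""
    if not 0 < od_target <= od_stock:
        raise ValueError("target OD must be positive and not exceed the stock OD")
    culture_ml = od_target / od_stock * final_volume_ml
    return culture_ml, final_volume_ml - culture_ml


# Hypothetical example: overnight culture at OD600 = 2.5, 1 mL of diluted suspension.
culture_ml, medium_ml = dilution_volumes(od_stock=2.5, od_target=0.1, final_volume_ml=1.0)
print(f"{culture_ml * 1000:.0f} µL culture + {medium_ml * 1000:.0f} µL CYMG")
```

For these example values the sketch gives 40 µL of culture made up to 1 mL with 960 µL of medium, from which the 3 µL inoculum can be taken.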