Comparative Genomics Provides Insight into the Function of Broad-Host Range Sponge Symbionts

ABSTRACT The fossil record indicates that the earliest evidence of extant marine sponges (phylum Porifera) existed during the Cambrian explosion and that their symbiosis with microbes may have begun in their extinct ancestors during the Precambrian period. Many symbionts have adapted to their sponge host, where they perform specific, specialized functions. There are also widely distributed bacterial taxa such as Poribacteria, SAUL, and Tethybacterales that are found in a broad range of invertebrate hosts. Here, we added 11 new genomes to the Tethybacterales order, identified a novel family, and show that functional potential differs between the three Tethybacterales families. We compare the Tethybacterales with the well-characterized Entoporibacteria and show that these symbionts appear to preferentially associate with low-microbial abundance (LMA) and high-microbial abundance (HMA) sponges, respectively. Within these sponges, we show that these symbionts likely perform distinct functions and may have undergone multiple association events, rather than a single association event followed by coevolution.

suggesting that these symbionts may have, at one point, been acquired from the surrounding environment.
In this study, we used the dominant, conserved Tethybacterales (strain Sp02-1) symbiont of Tsitsikamma (subgenus Tsitsikamma) favus sponge species (family Latrunculiidae) (56-60) as a springboard into a deeper investigation of Tethybacterales. Here, we report a comparative study using new and existing Tethybacterales genomes and show that functional potential follows that of their taxonomic ranking rather than host-specific adaptation. We also show that the Tethybacterales and Poribacteria have distinct functional repertoires, that these bacterial families can coexist in a single host, and that the Tethybacterales may represent a more ancient lineage of ubiquitous sponge-associated symbionts.

RESULTS AND DISCUSSION
The microbiomes of sponges of the Latrunculiidae family are highly conserved and are dominated by populations of related betaproteobacterial symbionts. Like several other betaproteobacteria, these bacteria have since been reclassified within the class Gammaproteobacteria, following the proposal that genome phylogeny serve as the basis for taxonomy, an approach since incorporated into the Genome Taxonomy Database (GTDB) (61). The numerically dominant symbiont in T. (T.) favus sponges is strain Sp02-1. Based on their 16S rRNA gene sequences, the Sp02-1 strain and closely related symbionts from different latrunculid sponges are likely members of the newly described Tethybacterales order.
Genome bin 003B_4 was used as a representative of the Tethybacterales Sp02-1 symbiont. Bin 003B_4 is approximately 2.95 Mbp in size and of medium quality per MIMAG standards (62) (Table S1), and it has a notable abundance of pseudogenes (~25% of all genes), which resulted in a coding density of 65.27%, far lower than the average for bacteria (63). An abundance of pseudogenes and low coding density are usually indications that the genome in question may be undergoing genome reduction (64), similar to other genomes in the proposed order of Tethybacterales (45).
The Tethybacterales Sp02-1 genome carries all genes necessary for glycolysis and PRPP biosynthesis, and most genes required for the citrate cycle and oxidative phosphorylation were detected in the gene annotations. Also present are the genes necessary to biosynthesize valine, leucine, isoleucine, tryptophan, phenylalanine, tyrosine, and ornithine amino acids, as well as genes required for transport of L-amino acids, proline, and branched amino acids. This would suggest that this bacterium may exchange amino acids with the host, as observed previously in both insect and sponge-associated symbioses (11,65,66).
A total of 13 genes unique to the Tethybacterales Sp02-1 symbiont were identified (i.e., not identified elsewhere in the T. [T.] favus metagenomes). One gene was predicted to encode an ABC transporter permease subunit that was likely involved in glycine betaine and proline betaine uptake. A second gene encoded 5-oxoprolinase subunit PxpA (Table S5). The presence of these two genes suggests that the Tethybacterales Sp02-1 genome can acquire proline and convert it to glutamate (67) in addition to glutamate already produced via glutamate synthase. Other unique genes encode a restriction endonuclease subunit and site-specific DNA-methyltransferase, which would presumably aid in defense against foreign DNA. At least seven of the unique gene products are predicted to be associated with phages, including the antirestriction protein ArdA. ArdA is a protein that has previously been shown to mimic the structures of DNA normally bound by type I restriction modification enzymes, which prevent DNA cleavage, and effectively results in antirestriction activity (68). If functionally active in the Tethybacterales Sp02-1 symbiont, we speculate that this protein may similarly prevent DNA cleavage through its mimicry of the targeted DNA structures and protect the genome against type I restriction modification enzymes. Finally, two of the unique genes were predicted to encode an ankyrin repeat domain-containing protein and a von Willebrand factor type A (VWA) domain-containing protein. These two proteins are known to be involved in cell-adhesion and protein-protein interactions (69,70), and if active within the symbiont, they may help facilitate the symbiosis between the Tethybacterales Sp02-1 symbiont and the sponge host.
Comparison of putative Sp02-1 with other Tethybacterales. Several Tethybacterales sponge symbionts have been described to date, and these bacteria are thought to have functionally diversified following the initiation of their ancient partnership (45). To test this hypothesis, we downloaded 12 genomes/MAGs of Tethybacterales (classified as AqS2 in GTDB) from the JGI database. Additionally, we assembled and binned metagenomic data from 36 sponge SRA data sets, covering 14 sponge species, and recovered an additional 14 AqS2-like genomes. Of the total 27 bins, 10 were of low quality, so Bin 003B_4 (Sp02-1) and 16 medium-quality Tethybacterales bins/genomes were used for further analysis (Table 1).
First, the phylogeny of the Tethybacterales symbionts was determined using single-copy marker genes in autoMLST, revealing a deep-branching clade of these sponge-associated symbionts and showing that bin 003B_4 clustered within the proposed Persebacteraceae family (Fig. 1). All members of the Persebacteraceae family dominate the microbial community of their respective sponge hosts (47,56,57,71). We additionally identified what appears to be a third family, consisting of symbionts associated with Coelocarteria singaporensis and Cinachyrella sponge species (Fig. 1). Assessment of shared average amino acid identity (AAI) indicates that these genomes represent a new family, sharing an average of 80% AAI within the family (Table S6) (72). These three families share less than 89% sequence similarity with respect to their 16S rRNA sequences, with intraclade differences of less than 92% (Table S6). Therefore, they may represent novel classes (72) within the Tethybacterales order. While it is still hotly debated whether MAGs should be named at the genus level (73-76), we chose to tentatively name the additional genera and family after Oceanids of Greek mythology in keeping with Taylor and colleagues, who initially resolved the Tethybacterales order (45). We propose the family name Polydorabacteraceae, which means "many gifts." Additionally, we propose species names for the newly identified genera as follows: Bin 003B_4 is a single representative of "Candidatus Ukwabelana africanus," Bin Imet_M1_9 and Bin ImetM2_1_1 are both representatives of "Candidatus Regalo mexicanus," Bin CCyA_2_3 and CCyB_3_2 are both representatives of "Candidatus Dora taiwanensis," and all six bins from C. singaporensis are representatives of "Candidatus Hadiah malacca." In each case, the genus name means "gift from" in the local language (where possible) of the region where the host sponge was collected, and the species name reflects the region/country from which the sponge host was collected.
We identified 4,306 groups of orthologous genes between all 17 Tethybacterales genomes, with only 18 genes common to all the genomes. More shared genes were expected, but as several of the genomes investigated are incomplete, it is possible that additional common genes would be found if the genomes were complete. Hierarchical clustering of gene presence/absence data revealed that the gene pattern of Bin 003B_4 most closely resembled that of Tethybacterales genomes from Crambe crambe, Crella incrustans, and the Scopalina sp. sponges (family Persebacteraceae) (Fig. 2A). A total of 13 of the shared genes between all Tethybacterales genomes encoded ribosomal proteins or those involved in energy production. Genes encoding chorismate synthase were found across all 17 genomes and suggest that tryptophan production may be shared among these bacteria. According to a recent study, Dysidea etheria and A. queenslandica sponges cannot produce tryptophan (a possible essential amino acid), which may indicate a common role for the Tethybacterales symbionts as tryptophan producers (77). Several other shared genes were predicted to encode proteins involved in stress responses, including protein-methionine-sulfoxide reductase, ATP-dependent Clp protease, and chaperonin enzyme proteins, which aid in protein folding or degradation under various stressors (78-82). Internal changes in oxygen levels (83) and temperature changes (84-86) are examples of stressors experienced by the sponge holobiont. It is unsurprising that this clade of largely sponge-specific Tethybacterales share the ability to deal with these many stressors as they adapt to their fluctuating environment.
Alignment against the KEGG database revealed some noteworthy trends that differentiated the three Tethybacterales families (Fig. 2B; Table S7): (i) the genomes of the proposed Polydorabacteraceae family include several genes associated with sulfur oxidation; (ii) the Persebacteraceae are unique in their potential for reduction of sulfite (cysIJ), and (iii) the Tethybacteraceae have the potential for cytoplasmic nitrate reduction (narGHI), while the other two families may perform denitrification. Similarly, the families differ to some extent in what can be transported in and out of the symbiont cell (Fig. 2C). Proposed members of the Polydorabacteraceae appear exclusively capable of transporting hydroxyproline, which may imply a role in collagen degradation (87). The Tethybacteraceae and Persebacteraceae appear able to transport spermidine, putrescine, taurine, and glycine, which in combination with their potential to reduce nitrates, may suggest a role in C-N cycling (88). All three families transport various amino acids as well as phospholipids and heme. The exchange of amino acids between symbiont and sponge host has previously been observed (89) and may provide the Tethybacterales with a competitive advantage over other sympatric microorganisms (90) and possibly allow the sponge hosts to regulate the symbioses via regulation of the quantity of amino acids available for symbiont uptake (91). Similarly, the transfer of heme in the iron-starved ocean environment between sponge host and symbiont could provide a selective advantage, as heme may act as a supply of iron (92). The Tethybacteraceae were distinct from the other two families in their potential to transport sugars. As mentioned earlier, the transport of sugars plays an important role in symbiotic interactions (84,93-95), and it is possible that this family of symbionts requires sugars from their sponge hosts.
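The count of genes shared by all genomes (18 of 4,306 orthologous groups) amounts to intersecting per-genome sets of orthologous groups. The following is a minimal sketch of that intersection, not the authors' script; the function name, genome names, and group identifiers are hypothetical and serve only as an illustration.

```python
from typing import Dict, Set


def core_orthogroups(presence: Dict[str, Set[str]]) -> Set[str]:
    """Return the orthologous-group IDs detected in every genome.

    `presence` maps a genome name to the set of group IDs found in it.
    """
    genomes = list(presence.values())
    if not genomes:
        return set()
    core = set(genomes[0])
    for groups in genomes[1:]:
        core &= groups  # keep only groups also present in this genome
    return core


# Toy input: hypothetical bins and group IDs, not data from this study.
toy_presence = {
    "bin_A": {"OG1", "OG2", "OG3"},
    "bin_B": {"OG1", "OG3", "OG4"},
    "bin_C": {"OG1", "OG3"},
}
print(sorted(core_orthogroups(toy_presence)))  # ['OG1', 'OG3']
```

Because the result is a strict intersection, a group missing from a single incomplete bin drops out of the core set, which is consistent with the caveat that more shared genes might be found with complete genomes.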
Comparative analyses of functional potential between Tethybacterales and Poribacteria. We wanted to determine whether broad-host range sponge-associated symbionts have converged to perform similar roles in their sponge hosts. Accordingly, we annotated 62 Poribacteria genomes, which consisted of 24 Pelagiporibacteria (free-living) and 38 Entoporibacteria (sponge-associated) genomes, and the 17 Tethybacterales genomes against the KEGG database. We catalogued the presence/absence of 896 unique genes spanning carbohydrate metabolism, methane metabolism, nitrogen metabolism, sulfur metabolism, phosphate metabolism, and several transporter systems (Table S7). Inspection of the functional potential in the Tethybacterales and Poribacteria revealed several insights (Fig. 3). The gene repertoires of the Poribacteria and the Tethybacterales are distinct from one another (Fig. S3; Table 2), with notable differences, including the genes associated with dissimilatory nitrate reduction, thiosulfate oxidation, and transport of glycine betaine/proline, glycerol, taurine, tungstate, and lipooligosaccharides, all of which are present in at least two of the three Tethybacterales families and absent in the Poribacteria (Fig. 3). Conversely, several gene clusters were detected in the Poribacteria and absent in the Tethybacterales, including trehalose biosynthesis, galactose degradation, phosphate metabolism, assimilatory sulfate reduction, and transport of phosphonate, urea, iron complexes, molybdate, and hydroxymethylpyrimidine (Fig. 3). It has been reported that both Entoporibacteria and Pelagiporibacteria include genes associated with denitrification (36); however, we could not detect many genes associated with nitrogen metabolism in our analyses (Fig. 3).
We cross-checked gene annotations generated using Prokka (HAMAP database) and BLAST (nonredundant [nr] database). Genes associated with assimilatory nitrate reduction (narB and nirA) were identified in Poribacteria using these alternate annotations, but we could not detect genes associated with denitrification in the Poribacteria. Conversely, genes associated with denitrification (napAB and nirK) were detected in the Persebacteraceae of the Tethybacterales in Prokka, BLAST, and KEGG annotations (Fig. 3), indicating that their absence in Poribacteria genomes was not an artifact of our analyses.
Pairwise analysis of similarity (ANOSIM) (using Bray-Curtis distance) confirmed that the functional genetic repertoire (KEGG annotations) of the Tethybacterales bacteria showed a strong, significant dissimilarity to that of the sponge-associated Entoporibacteria and the free-living Pelagiporibacteria (Table 2). In addition, the Polydorabacteraceae and the Persebacteraceae were significantly different from one another, but the lower R statistic would suggest that the dissimilarity is not as strong as that between other groups in this analysis, while the Tethybacteraceae appear to be more functionally distinct from the other two Tethybacterales families.
Taken together, these data suggest that the three Tethybacterales families and the Entoporibacteria lineages may each fulfil distinct functional or ecological niches within a given sponge host. We then considered the sponge hosts themselves and found that Entoporibacteria included in this study associate exclusively with high-microbial abundance (HMA) sponges, while the Tethybacterales largely associate with low-microbial abundance (LMA) sponges (Table 3). This difference is consistent with previous findings that LMA and HMA sponges have different bacterial community structures, where the HMA sponges are associated with highly abundant, highly diverse, and similar bacterial communities (96-98). More specifically, Poribacteria have been identified as "indicator species" for HMA sponges, and Betaproteobacteria (now within the Gammaproteobacteria class) were indicator species of LMA sponges (97,99). However, exceptions to Tethybacterales associating exclusively with LMA sponges were observed. First, the Iophon methanophila sponges which harbor symbionts within the Tethybacteraceae family do not conform to the LMA/HMA dichotomy (100), and second, some sponges, such as C. singaporensis (HMA), can play host to both Tethybacterales and Entoporibacteria species (Table 3), which provides further evidence that these symbionts may serve different purposes within their sponge host. However, why and how these different sponge types select for different broad-host range symbionts remain to be discovered.
We investigated the respective approximate divergence pattern of the Tethybacterales and the Entoporibacteria and whether their divergence followed that of their sponge hosts. The 18 homologous genes shared between the Tethybacterales were used to estimate the rate of synonymous substitution, which provides an approximation for the pattern of divergence between the species (101). We found that the estimated divergence pattern of the Tethybacterales (Fig. 4A) and the phylogeny of the host sponges (Fig. 4B) were incongruent. Phylogenetic trees inferred using single-copy marker genes (Fig. 1) and the comprehensive 16S rRNA tree published by Taylor and colleagues (45) confirm this lack of congruency between symbiont and host phylogeny. Other factors, such as collection site or depth, could not explain the observed trend. Similar incongruence of symbiont and host phylogeny was observed for the Entoporibacteria (34 homologous genes used to estimate synonymous substitution rates) (Fig. 4C and D), in agreement with previous phylogenetic studies (34,36,37). This would suggest that these sponges likely acquired a free-living Tethybacterales common ancestor at different time points throughout their evolution and that the same is true for the Entoporibacteria. Evidence of coevolution of betaproteobacteria symbionts within sponge families (49,55,56,102) implies that Tethybacterales symbionts were likely acquired horizontally at various time points and may have coevolved with their respective hosts subsequent to acquisition.
Finally, the estimated rates of synonymous substitution of homologous genes were used to estimate the relative times at which the Tethybacterales and Entoporibacteria taxa began diverging. Regardless of the substitution rate used, it was found that the sponge-associated Tethybacterales genomes began diverging from one another before the Entoporibacteria began diverging from one another (Table S4). If one accepts that divergence between exclusively sponge-associated bacterial lineages began when the common ancestor first associated with a sponge host, then the earlier divergence of sponge-associated Tethybacterales (relative to the Entoporibacteria) suggests that the Tethybacterales may have associated with sponges before the Poribacteria common ancestor and represent a more ancient symbiont. However, this hypothesis may prove false if additional Entoporibacteria lineages are discovered and added to the analyses, or other factors such as mutation rates, time between symbiont acquisition, and transition to vertical inheritance of symbionts or fossil records disprove this hypothesis.
Conclusion. Here, we have shown that the family to which a broad-host range symbiont belongs dictates the functional potential of the symbiont. This work has expanded our understanding of the Tethybacterales and the possible functional specialization of the families within this new order. The Tethybacterales are functionally distinct from the Poribacteria, which would suggest that although these bacteria are both ubiquitously associated with a wide range of sponge hosts, they likely have not converged to fulfil the same role. Instead, it would appear that these symbionts were selected by the various sponge hosts for existing functional capabilities that fulfil requirements of either HMA or LMA sponges. The phylogenetic incongruence of both Tethybacterales and Entoporibacteria and their respective sponge hosts suggests that their ancestors were horizontally acquired at different evolutionary time points, and coevolution may have occurred following the establishment of the association. Estimates of when the Tethybacterales and Entoporibacteria began diverging from their respective common ancestors implied that Tethybacterales may have associated with a sponge host before the Entoporibacteria, and therefore the Tethybacterales may be an older sponge-associated symbiont. However, additional data are required to validate or disprove this hypothesis.

MATERIALS AND METHODS
Sponge collection and taxonomic identification. Sponge specimens Tsitsikamma (Tsitsikamma) favus TIC2016-050A and TIC2016-050C were collected in June 2016 at Evans Peak (33.84548°S, 25.31663°E) at a depth of 20 m via self-contained underwater breathing apparatus (SCUBA). Sponge specimens were stored on ice during collection and, thereafter, at −20°C. Subsamples collected for DNA extraction were preserved in RNALater (Invitrogen) and stored at −20°C. Sponge specimens were dissected, thin sections were generated, and spicules were mounted on microscope slides and examined to allow species identification, as done previously (103-105). Molecular barcoding (28S rRNA gene) was also performed for several of the sponge specimens (Fig. S1) as described previously (56).
Metagenomic sequencing and analysis. Small sections of each preserved sponge (approximately 2 cm³) were pulverized in 2 ml sterile artificial seawater (24.6 g NaCl, 0.67 g KCl, 1.36 g CaCl2·2H2O, 6.29 g MgSO4·7H2O, 4.66 g MgCl2·6H2O, 0.18 g NaHCO3, and distilled H2O to 1 liter) with a sterile mortar and pestle. The resultant homogenate was centrifuged at 16,000 rpm for 1 min to pellet cellular material. Genomic DNA (gDNA) was extracted using the ZR fungal/bacterial DNA miniprep kit (D6005; Zymo Research). Shotgun metagenomic sequencing was performed for four T. (T.) favus sponge specimens using Ion Torrent platforms. Shotgun metagenomic libraries with 200-bp reads were prepared for each of the four sponge samples (TIC2016-050A, TIC2018-003B, TIC2016-050C, and TIC2018-003D) using an Ion P1.1.17 chip. Additional sequence data of 400 bp were generated for TIC2016-050A using an Ion S5 530 chip. TIC2016-050A served as a pilot experiment to identify which read length best suited our investigations. Rather than discard these additional data, we included the 400-bp reads when assembling the TIC2016-050A metagenomic contigs. Metagenomic data sets were assembled into contiguous sequences (contigs) with SPAdes v3.12.0 (106) using the --iontorrent and --only-assembler options. Contigs that were classified as bacterial were selected and clustered into genomic bins using Autometa (107) and manually curated for optimal completion and purity. Validation of the bins was performed using CheckM v1.0.12 (108). Of the 50 recovered genome bins, 5 were of high quality, 13 were of medium quality, and 32 were of low quality in accordance with MIMAG standards (62) (Table S1).
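As an illustration of how bins can be sorted into quality tiers from CheckM-style estimates, the sketch below applies the commonly cited MIMAG thresholds: high quality at >90% completeness and <5% contamination with rRNA and tRNA genes present, and medium quality at ≥50% completeness and <10% contamination. It is not the authors' code. The function name and the rrna_trna_ok flag are hypothetical, and treating every remaining bin as low quality is a simplifying assumption.

```python
def mimag_quality(completeness: float, contamination: float,
                  rrna_trna_ok: bool = False) -> str:
    """Assign a MIMAG-style quality tier to a genome bin.

    completeness, contamination: percentages, e.g. as estimated by CheckM.
    rrna_trna_ok: True if the bin carries 5S, 16S, and 23S rRNA genes and at
    least 18 tRNAs, which the high-quality tier additionally requires.
    """
    if completeness > 90 and contamination < 5 and rrna_trna_ok:
        return "high"
    if completeness >= 50 and contamination < 10:
        return "medium"
    # Assumption of this sketch: everything else is reported as low quality.
    return "low"


# Hypothetical bins: (completeness %, contamination %, rRNA/tRNA criterion met)
examples = [(95.0, 2.0, True), (95.0, 2.0, False), (60.0, 8.0, False), (40.0, 3.0, False)]
print([mimag_quality(*e) for e in examples])  # ['high', 'medium', 'medium', 'low']
```

Note that a nearly complete bin lacking the rRNA or tRNA complement falls into the medium tier under these rules.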
A total of 36 raw-read SRA data sets from sponge metagenomes were downloaded from the SRA database (Table S2). Illumina reads from these data sets were trimmed using Trimmomatic v0.39 (109) and assembled using SPAdes v3.14 (106) in --meta mode. Contigs classified as bacterial were selected and used for further binning using Autometa (107). This resulted in a total of 393 additional genome bins (Table S1), the quality of which was assessed using CheckM (108) and which were taxonomically classified with GTDB-Tk (110) with database release 95. A total of 27 bins were classified as AqS2 and were considered likely members of the newly proposed Tethybacterales order (45). However, 10 of the 27 bins were low quality and were not used in downstream analyses. In addition, 59 Poribacteria genome bins were downloaded from the NCBI database for functional comparison (Table S3), and three were used from the 393 genome bins generated in this study (Geodia parva sponge hosts).
Taxonomic identification. Partial and full-length 16S rRNA gene sequences were extracted from bins using barrnap 0.9 (https://github.com/tseemann/barrnap). Extracted sequences were aligned against the nr database using BLASTn (111). Genomes were additionally uploaded individually to autoMLST (112) and analyzed in both placement mode and de novo mode (IQ tree and ModelFinder options enabled and concatenated gene tree selected). All bins and downloaded genomes were taxonomically identified using GTDB-Tk (110).
Genome annotation and metabolic potential analysis. All bins and downloaded genomes were annotated using Prokka v1.13 (113) with NCBI compliance enabled. Protein-coding amino-acid sequences from genomic bins were annotated against the KEGG database using kofamscan (114) with output in mapper format. Custom Python scripts were used to summarize annotation counts (find scripts here: https://github.com/samche42/Family_matters). Potential biosynthetic gene clusters (BGCs) were identified by uploading genome bins to the antiSMASH Web server (115) with all options enabled. Predicted amino acid sequences of genes within each identified gene cluster were aligned against the nr database using BLASTp (111) to identify the closest homologs.
Phylogeny and function of Tethybacterales species. A subset of orthologous genes common to all medium-quality Tethybacterales genomes/bins was created. Shared amino acid identity (AAI) was calculated with the aai.rb script from the enveomics package (116). 16S rRNA genes were analyzed using BLASTn (111). Functional genes were annotated against the KEGG database using kofamscan (114). Annotations were collected into functional categories and visualized in R (see https://github.com/samche42/Family_matters for all scripts). A nonmetric multidimensional scaling (NMDS) plot of the presence/absence metabolic counts was constructed using Bray-Curtis distance using the vegan package (117) in R. Analysis of similarity (ANOSIM) analyses were also conducted using the vegan package in R using Bray-Curtis distance and 9,999 permutations.
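To make the ANOSIM step concrete, the following is a minimal pure-Python sketch of the statistic on Bray-Curtis distances with a label-permutation test. It is a stand-in for illustration only and is not the vegan implementation used in this study; all function names and the toy profiles are hypothetical. It assumes at least two groups, at least one of which has two or more members.

```python
import random
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two count (or 0/1) profiles."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den if den else 0.0


def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def anosim_r(ranks, pairs, labels):
    """R = (mean between-group rank - mean within-group rank) / (n(n-1)/4)."""
    within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
    n = len(labels)
    mean_between = sum(between) / len(between)
    mean_within = sum(within) / len(within)
    return (mean_between - mean_within) / (n * (n - 1) / 4)


def anosim(profiles, labels, permutations=999, seed=0):
    """Return (R, p), where p comes from shuffling the group labels."""
    pairs = list(combinations(range(len(profiles)), 2))
    dists = [bray_curtis(profiles[i], profiles[j]) for i, j in pairs]
    ranks = average_ranks(dists)
    observed = anosim_r(ranks, pairs, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(ranks, pairs, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy presence/absence profiles for two hypothetical groups.
toy_profiles = [[1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0],
                [0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 1, 1]]
toy_labels = ["A", "A", "B", "B"]
r_stat, p_value = anosim(toy_profiles, toy_labels)
print(r_stat)  # 1.0
```

An R value near 1 means that between-group distances consistently outrank within-group distances, while values near 0 indicate no separation. This is the sense in which a lower R statistic is read as weaker dissimilarity between two families.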
Genome divergence estimates. Divergence estimates were performed as described previously (118). Briefly, homologous genes in Tethybacterales genomes were identified using OMA v2.4.2 (119). A subset of homologous genes present in all genomes was created. Homologous genes were aligned using MUSCLE v3.8.155 (120) and clustered into fasta files representing each genome using merge_fastas_for_dNdS.py (see https://github.com/samche42/Family_matters for all scripts). The corresponding nucleotide sequences were extracted from Prokka annotations using multifasta_seqretriever.py. All stop codons were removed using remove_stop_codons.py. All nucleotide sequences, per genome, were concatenated to produce a single nucleotide sequence per genome using the union function from EMBOSS (121). All amino acid sequences were similarly concatenated. This resulted in a single concatenated nucleotide sequence and a single concatenated amino acid sequence per genome. The concatenated sequences were collected into two fasta files (one nucleotide, one protein) and then aligned using PAL2NAL (122). The resultant alignment was then run in codeml to produce pairwise synonymous substitution rates (dS). Divergence estimates can be determined by dividing pairwise dS values by a given substitution rate (substitutions per year) and then by 1 million to express branch divergence in millions of years ago (mya). Pairwise synonymous substitution rates can be found in Table S4. Pairwise divergence values were illustrated as a tree using MEGA X (123). Concatenated amino acid and nucleotide sequences of the 18 orthologous genes were aligned using MUSCLE v3.8.155 (120), and the evolutionary history was inferred using the UPGMA method (124) in MEGA X (123) with 10,000 bootstrap replicates.
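The conversion from pairwise dS to a divergence estimate described above is simple arithmetic, sketched here exactly as stated in the text (dS divided by the substitution rate, then by 1 million). The function name and the example values are hypothetical and are not taken from Table S4.

```python
def divergence_mya(ds: float, rate_per_year: float) -> float:
    """Convert a pairwise dS value into a divergence estimate in mya.

    ds: pairwise synonymous substitutions per site (e.g. from codeml).
    rate_per_year: assumed substitution rate, in substitutions per year.
    """
    if rate_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ds / rate_per_year   # elapsed time in years
    return years / 1e6           # expressed in millions of years


# Hypothetical inputs chosen only to show the scale of the calculation.
print(round(divergence_mya(0.5, 1e-9), 3))  # 500.0
```

Because the estimate scales inversely with the assumed rate, comparing Tethybacterales and Entoporibacteria under the same rate preserves their relative ordering whichever rate is chosen.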
Identification of unique and host-associated genes in putative symbiont genome bins. A custom database of genes from all bacterial bins (with the exception of the putative Tethybacterales symbionts) was created using the "makedb" option in DIAMOND (125) to identify genes that were unique to the putative Tethybacterales symbionts. To be exhaustive and screen against the entire metagenome, genes from low-quality genomes (except low-quality putative Tethybacterales genomes), small contigs (<3,000 bp) that were not included in binning, and unclustered contigs (i.e., included in binning but not placed within a bin) were included in this database. Putative Tethybacterales genes were aligned using DIAMOND blast (125). A gene was considered "unique" if the aligned hit shared less than 40% amino acid identity with any other genes from the T. (T.) favus metagenomes and had no significant hits against the nr database or were identified as pseudogenes. All "unique" putative Tethybacterales genes annotated as "hypothetical" (both Prokka and NCBI nr database annotations) were removed. Finally, we compared Prokka annotation strings between the putative Tethybacterales bins and all other T. (T.) favus-associated genome bins and excluded any putative Tethybacterales genes that were found to have the same annotation as a gene in one of the other bins.
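The first filter in this procedure, the 40% amino acid identity cutoff, can be sketched as follows. The sketch assumes DIAMOND's default 12-column tabular output, in which the first column is the query ID and the third is percent identity. The function name and the toy records are hypothetical, and the subsequent nr, pseudogene, and annotation-based filters are not covered.

```python
def candidate_unique_genes(query_ids, diamond_lines, max_identity=40.0):
    """Return queries whose best hit in the custom database is below the cutoff.

    query_ids: all putative symbiont gene IDs that were searched.
    diamond_lines: tab-separated hit lines (query ID in column 1, percent
    identity in column 3); queries without any hit are also retained.
    """
    best_identity = {}
    for line in diamond_lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        query, pident = fields[0], float(fields[2])
        best_identity[query] = max(best_identity.get(query, 0.0), pident)
    return [q for q in query_ids if best_identity.get(q, 0.0) < max_identity]


# Toy hits with hypothetical gene IDs; columns after percent identity are omitted.
toy_hits = [
    "gene1\tother_bin_x\t85.0",
    "gene2\tother_bin_y\t35.2",
    "gene2\tother_bin_z\t38.0",
]
print(candidate_unique_genes(["gene1", "gene2", "gene3"], toy_hits))
# ['gene2', 'gene3']
```

Taking the maximum identity per query means that a single close homolog anywhere in the metagenome is enough to exclude a gene from the candidate set.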
Data availability. The raw 16S amplicon and metagenomic read data can be accessed from the NCBI website under BioProject PRJNA508092.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.

ACKNOWLEDGMENTS
This research was performed in part using the computer resources and assistance of the UW-Madison Center for High Throughput Computing in the Department of Computer Sciences. The CHTC is supported by UW-Madison, the Advanced Computing Initiative, the Wisconsin Alumni Research Foundation, Wisconsin Institutes for Discovery, and the National Science Foundation and is an active member of the Open Science Grid, which is supported by the National Science Foundation and the U.S. Department of Energy's Office of Science.
We also acknowledge the South African Center for High-Performance Computing for providing computing facilities for bioinformatics data analysis. We acknowledge Gwynneth Matcher (The South African Institute for Aquatic Biodiversity, Aquatic Genomics Research Platform) and Carel van Heerden and Alvera Vorster (Stellenbosch University Central Analytical Facility) for their next-generation sequencing (NGS) technical support. We thank Ryan Palmer and Koos Smith (ACEP) for technical support and expertise during sponge collections. We thank the South African Environmental Observation Network, Elwandle Coastal Node, and the Shallow Marine and Coastal