The quest for biofuels fuels genome sequencing

The list of recently completed microbial genome projects (Table 1) shows further progress in sequencing genomes of poorly studied environmental bacteria. The genome of Aquifex aeolicus, sequenced 10 years ago, has been joined by genomes of two more representatives of the phylum Aquificae. The genome of Polaribacter sp. MED152, a marine member of Bacteroidetes, revealed a combination of heterotrophic metabolism with light energy capture by proteorhodopsin. In addition, six genomes from the phylum Chlorobi more than doubled the number of sequenced genomes of green sulfur bacteria.

In eukaryotic genomics, the important news was the release by JGI scientists of a draft genome of the soft-rot ascomycete fungus Trichoderma reesei, also known as Hypocrea jecorina (Martinez et al., 2008). Trichoderma reesei is a filamentous fungus that is widely used in biotechnology as a producer of various cellulases and hemicellulases for the hydrolysis of plant cell walls. This organism has attracted renewed interest owing to its potential use in the conversion of lignocellulose to biofuel. The GenBank version of the draft genome of T. reesei consists of 2236 contigs, assembled into 170 scaffolds and containing ~34 Mbp of DNA, representing 99% of the whole genome. The current assembly did not assign the scaffolds to any of the seven chromosomes of T. reesei, but allowed identification of 9129 predicted protein-coding genes (Martinez et al., 2008). Comparison of T. reesei with Fusarium graminearum (Gibberella zeae) and Neurospora crassa revealed a certain degree of synteny among these three genomes. A surprising finding was the relatively low number of glycoside hydrolases (cellulases, hemicellulases and pectinases) encoded by the T. reesei genome. The authors suggest that the ability of T. reesei to degrade plant cell walls efficiently with its limited set of cellulolytic enzymes could be due to (i) clustering of the respective genes, which ensures co-expression of the right combination of hydrolytic enzymes, and (ii) secretion of secondary metabolites (Martinez et al., 2008).
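The assembly figures quoted above imply a few derived quantities, such as the approximate total genome size and the gene density. The following is a back-of-the-envelope sketch only: it takes the rounded values from the text (~34 Mbp, 99%, 2236 contigs, 170 scaffolds, 9129 genes) as inputs, and the function and variable names are hypothetical, so the results are rough estimates rather than figures reported by Martinez et al. (2008).

```python
# Figures quoted in the text; ~34 Mbp is a rounded value, so every
# derived number below is only approximate.
ASSEMBLED_BP = 34_000_000   # DNA contained in the draft assembly
GENOME_FRACTION = 0.99      # share of the whole genome represented
CONTIGS = 2236
SCAFFOLDS = 170
GENES = 9129                # predicted protein-coding genes


def assembly_summary(assembled_bp, genome_fraction, contigs, scaffolds, genes):
    """Return simple statistics derived from the published assembly figures."""
    return {
        "estimated_genome_bp": assembled_bp / genome_fraction,
        "mean_contig_bp": assembled_bp / contigs,
        "mean_scaffold_bp": assembled_bp / scaffolds,
        "contigs_per_scaffold": contigs / scaffolds,
        "genes_per_mbp": genes / (assembled_bp / 1e6),
    }


summary = assembly_summary(ASSEMBLED_BP, GENOME_FRACTION, CONTIGS, SCAFFOLDS, GENES)
for name, value in summary.items():
    print(f"{name}: {value:,.1f}")
```

With these inputs the gene density comes out at roughly 270 predicted genes per Mbp, and the average scaffold spans about 200 kb.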
Although phylogenetically unrelated to T. reesei, the γ-proteobacterium Cellvibrio japonicus also encodes an efficient machinery for degrading plant cell walls that includes 130 predicted glycoside hydrolases (DeBoy et al., 2008).
The current list includes two actinobacterial genomes, representing the soil bacterium Kocuria rhizophila (Takarada et al., 2008) and a new strain of the human gut symbiont Bifidobacterium longum (Lee and O'Sullivan, 2006; Lee et al., 2008). The genus Kocuria belongs to the suborder Micrococcineae and was separated from Micrococcus just a few years ago (Stackebrandt et al., 1995). Accordingly, K. rhizophila ATCC 9341, the parental strain of the sequenced K. rhizophila DC2201, was until recently classified as Micrococcus luteus and used as a standard quality control strain in a number of applications, including testing of antimicrobial compounds (Tang and Gillevet, 2003). The genus name was assigned to honour Miroslav Kocur, the Slovakian microbiologist who dedicated many years to studying M. luteus (Rosypal and Kocur, 1963; Kocur, 1986). Kocuria rhizophila is an environmental actinomycete that is often associated with plant roots. Despite its small (for a soil actinomycete) 2.7 Mbp genome, K. rhizophila appears to encode the full set of key metabolic enzymes. However, it encodes fewer proteins participating in secondary metabolism, including single genes for a non-ribosomal peptide synthetase and a polyketide synthase. The relatively high tolerance of K. rhizophila to various organic compounds correlates with the presence of a large number of genes encoding various membrane transporters, including drug efflux pumps (Takarada et al., 2008).
The two newly sequenced genomes of Aquificae represent two major families in this phylum. Hydrogenobaculum sp. YO4AAS1 belongs to the family Aquificaceae, which also includes A. aeolicus, the best-characterized member of the phylum, whereas Sulfurihydrogenibium sp. YO3AOP1 belongs to the family Hydrogenothermaceae. Both are thermophilic chemolithoautotrophs, isolated from hot springs at Yellowstone National Park at 60-75°C and capable of growing in microaerophilic conditions by using reduced sulfur compounds and/or hydrogen as electron donors and CO2 as the source of carbon (Stöhr et al., 2001; Reysenbach et al., 2005). However, the former is an acidophile, growing at or below pH 3.0, whereas the latter grows at neutral pH values. The genome size of Hydrogenobaculum sp. YO4AAS1 is very close to that of A. aeolicus, whereas Sulfurihydrogenibium sp. YO3AOP1 features a 300 kb larger genome and almost a hundred extra proteins. Availability of these new genomes should provide much-needed insight into the physiology of Aquificae, one of the earliest-branching bacterial lineages.
Of the two members of the highly diverse phylum Bacteroidetes in the current list, the first one, Candidatus Amoebophilus asiaticus, is an obligate intracellular symbiont of the amoeba Acanthamoeba sp. (Horn et al., 2001). However, it has a much larger genome and encodes far more proteins than Candidatus Sulcia muelleri, another member of the Bacteroidetes that is an endosymbiont of sharpshooters (McCutcheon and Moran, 2007). In addition, JGI scientists plan to sequence the genome of Candidatus Cardinium hertigii, a symbiont of Encarsia wasps. Comparison of Ca. A. asiaticus with Ca. S. muelleri and Ca. C. hertigii on the one hand and with free-living Bacteroidetes on the other should provide further clues to the mechanisms of bacterial adaptation to the endosymbiotic lifestyle.
The second Bacteroidetes member, Polaribacter sp. MED152, is a marine bacterium that was isolated from the surface water of the north-western Mediterranean Sea off the Catalan coast (González et al., 2008). In the original GenBank submission, it was listed as a strain of Polaribacter dokdonensis (Yoon et al., 2006), with which it shares 99.6% 16S rRNA sequence similarity. However, because of certain phenotypic differences between the two, the authors have chosen to refer to the sequenced organism simply as 'strain MED152'. Together with the previously described Gramella forsetii (Bauer et al., 2006), Polaribacter sp. MED152 represents the marine Bacteroidetes that in certain conditions may comprise up to 20% of the bacterioplankton. The physiology of these bacteria is still poorly understood, and the authors use the genome of MED152 to offer a very attractive scheme of a 'dual lifestyle' for this organism. Based on the abundance of protease and glycosidase genes, they propose that the normal modus operandi for MED152 includes gliding motility in search of suitable polymers and their subsequent degradation for carbon, nutrients and energy (González et al., 2008). However, once suitable polymeric substrates have been exhausted, MED152 must sustain itself in a nutrient-poor environment. In contrast to G. forsetii, MED152 encodes proteorhodopsin, an H+-translocating light-dependent ion pump that can use light energy to charge the membrane, generating the proton-motive force. In fact, exposure to light does not stimulate growth of MED152 but appears to stimulate bicarbonate uptake and, conceivably, assimilation of carbon dioxide (González et al., 2008). Accordingly, MED152 encodes a variety of (predicted) light sensors that have not been seen in other members of Bacteroidetes.
As noted in the accompanying insightful comment (Kirchman, 2008), the ability of marine bacteria to absorb light and use it to supplement their energy needs has important consequences for the understanding of the global carbon cycle.
In the past 2 months, JGI scientists released six complete genomes of Chlorobi (green sulfur bacteria), five of which, Chlorobaculum parvum, Chlorobium limicola, Chloroherpeton thalassium, Pelodictyon phaeoclathratiforme and Prosthecochloris aestuarii, represent new species, and one, Chlorobium phaeobacteroides, represents a new strain of a species that had its first genome sequenced 2 years earlier (Table 1). Like other green sulfur bacteria, all these strains are anoxygenic phototrophs that live in strictly anaerobic sulfide-rich environments. They gain energy from photosynthesis, which relies on type I reaction centres and uses sulfide, sulfur and/or thiosulfate as electron donors, and fix carbon through the reverse TCA cycle (Overmann and Garcia-Pichel, 2000; Frigaard and Bryant, 2004). The species differ in their ecological niches and in the relative amounts of carotene pigments and bacteriochlorophylls a, c, d and e. Green sulfur bacteria play a key role in carbon, nitrogen and sulfur turnover in anoxic freshwater aquatic environments and are a potential source of biomass for biofuels. In addition, Prosthecochloris aestuarii, which forms multilayered biofilms, has been implicated in microbial infection of coral reefs. Comparative analysis of these genomes should clarify many unanswered questions about the physiology of these interesting and important organisms.
Natranaerobius thermophilus strain JW/NM-WN-LF is an anaerobic, halophilic alkalithermophile isolated from sediments of a solar-heated, alkaline, hypersaline soda lake at Wadi An Natrun, Egypt (Mesbah et al., 2007). Its optimum growth conditions are 53°C, pH 9.5 and between 3.3 and 3.9 M Na+. It cannot grow at pH below 8.3 or above 10.8. This organism belongs to a separate lineage in the class Clostridia and is currently assigned to the separate order Natranaerobiales and family Natranaerobiaceae. A detailed analysis of its genome sequence should clarify the adaptations of N. thermophilus to its unique ecological niche, but it is already obvious that they include a Na+-dependent F1FO-type ATP synthase, very similar to the ones in the recently sequenced genomes of Alkaliphilus metalliredigens and Alkaliphilus oremlandii.
The current list also includes genomes of two mollicutes, Candidatus Phytoplasma mali and Mycoplasma arthritidis. The first one is a phytopathogen infecting apple, cherry, apricot and plum trees. It was isolated in Heidelberg, Germany, from an apple tree displaying symptoms of apple proliferation disease and is the first mycoplasma to have a linear chromosome (Kube et al., 2008a). The second one causes arthritis in rats and mice and is remarkable for carrying a lysogenic bacteriophage (Dybvig et al., 2008).
However, the greatest surprise in the mycoplasma studies came not from genome sequencing labs but from taxonomists. Although mycoplasmas have long been listed in the Division Tenericutes (International Committee on Systematic Bacteriology-Subcommittee on the Taxonomy of Mollicutes, 1995), this clade was usually considered together with Rickettsia and Chlamydia and not treated as an actual taxonomic unit. Instead, Mollicutes were considered a class in the phylum Firmicutes, which was consistent with the available phylogenetic analyses (Falah and Gupta, 1997; Ciccarelli et al., 2006). However, in the recent edition of Bergey's Manual of Systematic Bacteriology, class Mollicutes was excluded from the phylum Firmicutes and moved to the new phylum Tenericutes (Ludwig et al., 2008). While there might have been valid reasons for doing that (for example, many mycoplasmas use a non-standard genetic code in which the UGA codon codes for tryptophan instead of terminating translation), the cited reason for that move was the comparative analysis of mycoplasmal sequences by Ludwig and Schleifer (2005), published in a book to which many researchers had no access. Given that the goal of Bergey's Manual is the introduction of a 'phylogenetic framework' (Ludwig et al., 2008), it seems unfortunate that such important changes are being made without a public discussion or at least a publication in a peer-reviewed journal. After all, massive investments in microbial genome sequencing worldwide have moved bacterial taxonomy from a purely academic sphere into the realm of the biotechnological marketplace, and relatively minor changes in classification could have a serious effect on the priorities in future genome sequencing projects.
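The practical consequence of the non-standard genetic code mentioned above can be illustrated with a short sketch. It builds the standard codon table and a variant in which only TGA (UGA in mRNA) is reassigned to tryptophan, then translates the same reading frame with both. The example sequence and the function name are hypothetical and chosen only for illustration; real annotation pipelines also account for differences in start codons, which this sketch ignores.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG codon order.
BASES = "TCAG"
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]
STANDARD = dict(zip(CODONS, STANDARD_AA))

# Variant code: identical except that TGA encodes tryptophan (W).
MYCOPLASMA = {**STANDARD, "TGA": "W"}


def translate(dna, table):
    """Translate a DNA reading frame until the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical reading frame with an internal TGA codon.
orf = "ATGAAATGAGGTTAA"
standard_product = translate(orf, STANDARD)      # "MK"
mycoplasma_product = translate(orf, MYCOPLASMA)  # "MKWG"
print(standard_product, mycoplasma_product)
```

Under the standard code the frame is truncated at the internal TGA, whereas under the variant code it is read through to the TAA stop, which is why mycoplasmal genes translated with the wrong table appear to encode fragmented proteins.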