Gene Expression Analysis of Four Radiation-resistant Bacteria.

To investigate the general radiation-resistant mechanisms of bacteria, a bioinformatic method was employed to predict highly expressed genes in four radiation-resistant bacteria, i.e. Deinococcus geothermalis (D. geo), Deinococcus radiodurans (D. rad), Kineococcus radiotolerans (K. rad) and Rubrobacter xylanophilus (R. xyl). Most genes in the three reference gene sets, i.e. ribosomal proteins, transcription factors and major chaperones, are generally highly expressed in the four bacteria. Recombinase A (recA), a key enzyme in recombinational repair, is predicted to be highly or marginally highly expressed in the four bacteria. However, most proteins associated with other repair systems show low expression levels. Some genes participating in 'information storage and processing', 'cellular processes and signaling' and 'metabolism' are among the top twenty predicted highly expressed (PHX) genes in the four genomes. Many antioxidant enzymes and proteases are commonly highly expressed in the four bacteria, indicating that these enzymes play important roles in resisting irradiation. Finally, a number of 'hypothetical genes' are among the top twenty PHX genes in each genome, and some of them might contribute vitally to resisting irradiation. Some of the prediction results are supported by experimental evidence. This information not only helps to understand the radiation-resistant mechanisms but also provides clues for identifying new radiation-resistant genes in these bacteria.

The above four radiation-resistant aerobic bacteria survive doses of radiation that are lethal to most organisms. 6 D. geo was originally isolated from a hot spring and subsequently identified in other extreme environments; its complete genome is composed of a circular chromosome and two megaplasmids. 7,4 D. rad, the first deinobacterium discovered, was isolated from canned meat that had spoiled following exposure to X-rays; its genome contains two circular chromosomes and two plasmids. 1 These two bacteria belong to the extremely radiation-resistant genus Deinococcus, whose members can survive acute exposures to IR (10 kGy) and UV (1 kJ/m^2), and can grow under chronic IR (60 Gy/hour). [8][9][10] K. rad, which was isolated from a high-level radioactive environment, contains a 4.76 Mb linear chromosome and two plasmids. 3,5 The thermophilic bacterium R. xyl contains a circular chromosome. 2 Both K. rad and R. xyl belong to the phylum Actinobacteria and can withstand relatively high concentrations of metals and alkali cations, as well as exposure to extreme doses of IR close to those tolerated by D. rad. 2,3,5 The genome size, G+C content, optimal growth temperature (OGT) and other features of the four radiation-resistant bacteria are listed in Table 1.
Briefly, the prevailing opinions on radiation-resistant mechanisms are: (i) Genome reassembly and DNA repair are facilitated by numerous repeated sequences and morphological features, extended synthesis-dependent strand annealing (ESDSA) repair, nonhomologous end joining (NHEJ) repair and other DNA repair systems. 5,[11][12][13][14] (ii) A large number of 'hypothetical genes' might encode proteins related to unexplored radiation-resistant mechanisms. 15 (iii) Accumulation of non-enzymatic Mn(II) complexes helps to protect proteins from oxidation during irradiation, and elemental ratios of Mn/Fe have been proposed to indicate a cell's susceptibility to oxidative stress. 16,17 (iv) Conventional enzymatic detoxifying systems might operate with extraordinary efficiency. 13,17 (v) Genes involved in carotenoid biogenesis can confer a modest level of radiation resistance by scavenging reactive oxygen species (ROS). 18,19 The mechanisms and some related genes are illustrated in Figure 1. However, no single opinion can explain the underlying genetic complexity of the extreme resistance phenotype.
To further understand the radiation-resistant mechanisms of these bacteria, a viable strategy is to identify the highly expressed genes. Thanks to the rapid progress in bioinformatics, several theoretical indices have been proposed to predict the expression levels of prokaryotic genes, including the codon adaptation index (CAI), 20 the effective number of codons, 21 the frequency of optimal codons, 22 and the E(g) index. [23][24][25][26] Based on the E(g) index, Karlin and coworkers predicted the gene expression levels of D. rad. 25 They found that a large number of chaperone/degradation, protease and detoxification genes, and several proteins of cell envelope surface structures, are highly expressed. Many cell division proteins, ABC transporter genes and a large number of function-unknown genes are also predicted to be highly expressed. 25 The abundance and versatility of these predicted highly expressed (PHX) genes help D. rad to survive when exposed to severe radiation. However, it remains unclear whether these mechanisms are shared by other radiation-resistant bacteria, which motivated the present comprehensive analysis. Considering the validity of the E(g)-based gene expression prediction methodology, [23][24][25][26] it is employed in the present work.

Results and Discussion

Statistics of PHX genes among the four radiation-resistant bacteria
The statistics of PHX genes and their E(g) values for the four radiation-resistant bacteria and one radiation-sensitive bacterium are listed in Table 2. As shown, PHX genes account for 9.6% to 14.1% of all genes in the four radiation-resistant bacteria, and 19.1%∼31.2% of the PHX genes are function-unknown 'hypothetical genes'. However, the numbers of annotated and PHX 'hypothetical genes' in the radiation-sensitive Escherichia coli str. K12 (E. coli) are much smaller than in the four radiation-resistant bacteria. The highest E(g) value ranges from 1.17 in R. xyl to 1.66 in D. rad. Table 3 displays the top twenty PHX genes in the four radiation-resistant bacteria and their corresponding E(g) values in E. coli.
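The summary quantities reported above (share of PHX genes, share of 'hypothetical genes' among them, highest E(g) value, and a top-N list) could be derived from per-gene predictions roughly as follows. This is a minimal sketch, not the authors' pipeline: the record keys (`name`, `product`, `E`, `phx`), the function name `summarize_phx`, and the toy records are all hypothetical and do not reproduce values from Table 2 or Table 3.

```python
def summarize_phx(genes, top_n=20):
    """Summarise PHX predictions for one genome.

    `genes` is a non-empty list of dicts with hypothetical keys:
    'name', 'product' (annotation text), 'E' (E(g) value) and 'phx' (bool).
    """
    phx = [g for g in genes if g["phx"]]
    # Treat a gene as 'hypothetical' when its annotation text says so (an assumption).
    hypothetical = [g for g in phx if "hypothetical" in g["product"].lower()]
    ranked = sorted(phx, key=lambda g: g["E"], reverse=True)
    return {
        "phx_percent": 100.0 * len(phx) / len(genes),
        "hypothetical_percent_of_phx":
            (100.0 * len(hypothetical) / len(phx)) if phx else 0.0,
        "highest_E": ranked[0]["E"] if ranked else None,
        "top": [g["name"] for g in ranked[:top_n]],
    }


# Made-up toy records, not values from Table 2 or Table 3.
toy_genes = [
    {"name": "g1", "product": "30S ribosomal protein S1", "E": 1.40, "phx": True},
    {"name": "g2", "product": "hypothetical protein", "E": 1.20, "phx": True},
    {"name": "g3", "product": "DNA ligase", "E": 0.80, "phx": False},
    {"name": "g4", "product": "hypothetical protein", "E": 0.70, "phx": False},
]
summary = summarize_phx(toy_genes, top_n=2)
```

Ranking only the PHX subset keeps the top-N list consistent with the two-part PHX criterion rather than with the E(g) value alone.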
The following section analyzes these PHX genes in detail.

The Top Twenty PHX Genes in the Four Radiation-resistant Bacteria
The top twenty PHX genes in reference gene sets RPs, TFs, and CHs
It is observed that genes in the three reference groups (RPs, TFs, and CHs) are generally highly expressed (Table 3). Many 50S (L2, L3, L15, L19, L22, L24) and 30S ribosomal proteins (S1∼S4, S13) are among the top twenty PHX genes in one or more radiation-resistant bacteria. In particular, 30S ribosomal protein S1 is among the top twenty PHX genes in all four genomes. In the TF group, translation elongation factor G (fusA), elongation factors Tu and Ts, the Crp/Fnr family transcriptional regulator, and the RNA polymerase beta and beta' subunits are generally highly expressed in the four species; the RNA polymerase beta' subunit in particular is among the top twenty PHX genes in three radiation-resistant bacteria. The principal transcription/translation factors enable microbes to synthesize proteins rapidly and accurately to prevent radiation damage. Some major chaperone proteins, such as DnaK (HSP70), DnaJ, GroEL (HSP60) and the co-chaperonin GroES, are among the top twenty PHX genes in the four species. DnaK can cooperate with trigger factor in de novo protein folding by transiently binding to nascent and newly synthesized polypeptides. 27 The GroEL and GroES chaperones form a nest with a lid, which provides a secluded environment that protects folding molecules against intermolecular aggregation and facilitates the actual folding process. 28 The high expression levels of DnaK, GroEL and GroES help to prevent protein misfolding and increase thermal stability, which might be helpful for resisting irradiation. 29,30

The top twenty PHX genes in 'information storage and processing' and 'cellular processes and signaling'
According to the clusters of orthologous groups of proteins (COGs) classification, 31 all functional genes are classified into four functional groups, i.e. 'information storage and processing', 'cellular processes and signaling', 'metabolism' and 'poorly characterized'. 31 In the 'information storage and processing' category, translation-associated GTPase, recombinase A (recA), GCN5-related N-acetyltransferase and DEAD/DEAH box helicase domain-containing protein are top twenty PHX genes in one or more radiation-resistant bacteria. Among them, recA is a key enzyme in recombination and repair, which is highly expressed in three radiation-resistant bacteria and marginally highly expressed in R. xyl. As a major defense against environmental damage to cells, DNA repair is present in all organisms. It might therefore be expected that other proteins involved in DNA repair would also be highly expressed, but the prediction results differ markedly from this expectation. Both Karlin et al's and our analyses showed that most genes involved in direct repair, base excision repair, apurinic/apyrimidinic (AP) endonuclease, mismatch excision repair, nucleotide excision repair, recombinational repair (except recA) and other repair proteins are not highly expressed (data not shown). Therefore, it is speculated that recA contributes more to resisting irradiation than other repair proteins. All basal DNA repair genes in radiation-resistant bacteria are subject to positive selection. 32 It is reported that recombinational processes contribute vitally to reconstituting the cleaved DNA fragments. 25 Furthermore, experimental evidence shows that recA in D. rad is substantially up-regulated at the early phase of DNA damage, 33 and mutations in recA render D. rad as sensitive to ionizing radiation as E. coli. 5
Some genes in 'cellular processes and signaling' are among the top twenty PHX genes in the four genomes. However, most of them are not generally highly expressed (Table 3), except ATP-dependent protease and S-layer-like protein. ATP-dependent protease is described in detail in the section on PHX genes in the antioxidant system and proteolysis. Cell surface proteins envelop the cell exterior and provide protection against desiccation, thermal effects and other environmental hazards. S-layer protein (surface structure) is among the top twenty PHX genes in D. geo and D. rad, whereas the other two species have no similar genes.

The top twenty PHX genes in 'metabolism' functional category
Many genes participating in 'energy production and conversion', 'lipid metabolism', 'amino acid transport and metabolism' and 'nucleotide transport and metabolism' are among the top twenty PHX genes and generally highly expressed in the four radiation-resistant bacteria. Aconitate hydratase (acnA) of the tricarboxylic acid (TCA) cycle, the antioxidant protein thioredoxin, F0F1 ATP synthase and oxidoreductase-like protein are usually highly expressed in the four radiation-resistant bacteria. Other genes, such as aldo/keto reductase, isocitrate lyase (aceA) of the glyoxylate bypass and light-independent protochlorophyllide reductase subunit B, are PHX genes in only one or two species. In the 'lipid metabolism' category, only acyl-CoA dehydrogenase-like protein is highly expressed, in the three bacteria other than K. rad. ABC transporters can transport a wide variety of substrates, including sugars, amino acids, metal ions, peptides, proteins, and a large number of hydrophobic compounds and metabolites, across extra- and intracellular membranes, which is essential for all living organisms. 34 Some proteins in 'amino acid transport and metabolism' are commonly highly expressed in the four bacteria. However, a number of genes in this category are highly expressed in only one or two genomes. For example, histidinol phosphate aminotransferase is among the top twenty PHX genes in R. xyl, whereas it is not annotated in the other three bacteria. Some extracellular solute binding proteins are highly expressed in D. geo and D. rad. Furthermore, glycine hydroxymethyltransferase is highly expressed only in K. rad. Nucleoside-diphosphate kinase in 'nucleotide transport and metabolism' is highly expressed in the three bacteria other than R. xyl, and bifunctional 2′,3′-cyclic nucleotide 2′-phosphodiesterase is highly expressed in D. geo and K. rad.

The top twenty PHX genes with special and hypothetical function
A membrane lipoprotein is one of the top twenty PHX genes in D. geo and is highly expressed in D. rad and K. rad. Furthermore, a large number of 'hypothetical genes' (ranging from 46 in D. geo to 145 in K. rad) are highly expressed in the four radiation-resistant species, and some of them are among the top twenty PHX genes (Table 3). Since 'hypothetical genes' account for 22.6%∼47.8% of the genes in the four radiation-resistant bacteria, some of them might encode unexplored radiation-resistance proteins that are more effective than their known counterparts. The functions of these 'hypothetical genes' are of special interest for future studies.

Comparing the top twenty PHX genes with radiation-sensitive bacterium E. coli
Comparing the top twenty PHX genes with those of the well-studied radiation-sensitive bacterium E. coli shows that their expression patterns are quite different (Table 3). Although the reference gene sets RPs, TFs and chaperones are highly expressed in E. coli, none of them are among its top twenty PHX genes. In contrast, many of these reference genes are among the top twenty PHX genes in the four radiation-resistant bacteria. The relative abundance of RPs, TFs and chaperones might make protein synthesis and folding more efficient in radiation-resistant bacteria. Some tRNA synthetases are among the top twenty PHX genes in E. coli, whereas none of them are among the top highly expressed genes in the four radiation-resistant bacteria. Furthermore, most top PHX genes in radiation-resistant bacteria listed in Table 3 are not highly expressed in E. coli, indicating that gene expression patterns differ considerably between radiation-sensitive and radiation-resistant bacteria.

PHX Genes in Antioxidant System and Proteolysis
Because the formation of ROS during irradiation is extremely rapid, the antioxidant enzymes in radiation-resistant bacteria must be highly efficient in order to neutralize and remove free radicals and other toxic substances. Many antioxidant enzymes are highly expressed in the four radiation-resistant bacteria (Table 4). For example, Cu-Zn superoxide dismutase (sodC) and Mn superoxide dismutase (sodA) are highly expressed in the three radiation-resistant species other than K. rad. Many catalases and peroxidases participating in detoxification are generally highly expressed in the four species. Thioredoxin reductase (TrxR) in conjunction with thioredoxin (Trx) is a ubiquitous oxidoreductase system with antioxidant and redox regulatory roles. 35 Glutaredoxin is another important thiol-based antioxidant whose function overlaps that of thioredoxin. 35 Some thioredoxin- and glutaredoxin-related genes are commonly highly expressed in the four radiation-resistant bacteria. The multiple PHX detoxification genes help to remove free radicals generated by irradiation. In addition, a vanadium-dependent haloperoxidase is highly expressed in D. geo, whereas the other three bacteria do not have this enzyme. Some proteases participating in protein degradation are highly expressed in the three radiation-resistant bacteria other than R. xyl. Most highly expressed proteases are ATP-dependent proteases, which control diverse cellular processes by degrading specific regulatory proteins. In particular, ATP-dependent protease La is essential for cellular homeostasis because it mediates the degradation of abnormal and damaged polypeptides, as well as short-lived regulatory proteins. 36 Furthermore, two serine proteases are also highly expressed in D. rad. High expression levels of these proteases ensure that proteins damaged by irradiation can be degraded efficiently.

Comparing the in silico Prediction Result with Experimental Evidence
To evaluate the reliability of the prediction, E(g) values should be compared with experimental evidence. Although the four radiation-resistant bacteria were sequenced several years ago, only D. rad transcriptional profiling is available in gene expression databases to date. The experimental gene expression data of D. rad were downloaded from the NCBI GEO database (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE9636), and the correlation between the experimental and predicted results was analyzed. Figure 2 (a-b) shows the correlation between E(g) values and experimental transcriptional profiling in wild-type D. rad. rad following exposure to 6 kGy γ-irradiation. 37 Many PHX genes, such as a serine protease and some 'hypothetical genes', were found to be expressed at high levels and not degraded over a long period of irradiation. Some PHX genes, such as aconitate hydratase (acn), groEL, dnaK and some 'hypothetical genes', were degraded during the 4 h lag period of post-irradiation recovery (PIR) and re-synthesized to normal levels during the late phase of PIR. 37 Like DNA degradation, export and repair, degradation and resynthesis of damaged proteins can ensure rapid and smooth post-irradiation recovery of D. rad. 37 The above examples show that the prediction results are in accordance with experimental evidence. In the absence of experimental data, the in silico prediction can provide valuable information.
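The comparison between predicted E(g) values and measured expression can be illustrated with a small correlation routine. The text does not state which correlation coefficient was used, so the sketch below computes both Pearson and Spearman coefficients as an assumption; the function names and the toy gene identifiers and values are hypothetical and are not taken from GSE9636.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlate(predicted, measured):
    """Correlate two {gene_id: value} mappings over their shared genes."""
    shared = sorted(set(predicted) & set(measured))
    x = [predicted[g] for g in shared]
    y = [measured[g] for g in shared]
    return pearson(x, y), spearman(x, y)


# Made-up toy values for illustration only.
predicted_E = {"geneA": 0.5, "geneB": 1.0, "geneC": 1.5, "geneD": 2.0}
measured_signal = {"geneA": 1.0, "geneB": 4.0, "geneC": 9.0, "geneE": 5.0}
r, rho = correlate(predicted_E, measured_signal)
```

A rank-based coefficient is a reasonable companion here because E(g) is a codon-usage ratio and need not scale linearly with microarray signal intensity.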

Conclusion
By analyzing the common PHX genes in four radiation-resistant bacteria, it can be inferred that some radiation-resistant mechanisms are shared by the four species (Fig. 1). Our prediction and experimental evidence support the opinion that recA is more important than other repair proteins in resisting irradiation. Furthermore, many antioxidant enzymes, e.g. superoxide dismutase, catalase, peroxidase and the thiol-based thioredoxin and glutaredoxin, play important roles in scavenging free radicals caused by irradiation, and are commonly or uniquely PHX genes in the four radiation-resistant bacteria. Proteases are important for degrading abnormal and damaged polypeptides; some of them, especially ATP-dependent protease La, are PHX genes in most radiation-resistant bacteria. In addition to the common highly expressed genes, each species has its own unique PHX genes to resist irradiation. Another important finding is that a number of 'hypothetical genes' are PHX genes in the four radiation-resistant bacteria, and some are among the top twenty; these might be unexplored genes for resisting irradiation. All the above information is helpful for further understanding the radiation-resistant mechanisms and provides important clues for identifying new radiation-resistant genes.

Materials and Methods
The complete sequences and corresponding annotation information of the four radiation-resistant bacteria D. geo (chromosome: NC_008025), D. rad (chromosome 1: NC_001263; chromosome 2: NC_001264), K. rad (NC_009664), R. xyl (NC_008148) and one radiation-sensitive model bacterium E. coli (NC_000913) were downloaded from NCBI RefSeq on Aug 6, 2008. As a well-established formalism, Karlin's methodology selects three sets of genes, i.e. ribosomal proteins (RP), principal transcription/translation factors (TF), and the major chaperones/degradation genes (CH), as reference sets of genes tacitly assumed to be highly expressed. Qualitatively, if a gene's codon usage is similar to that of the three groups of genes but deviates strongly from that of the average gene of the genome, it is predicted to be a PHX gene. [23][24][25][26] The E values reflect how similar a gene's codon frequencies are to those of the RP, TF, and CH groups and how strongly they deviate from those of the whole genome.
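The qualitative criterion above can be sketched in code. The block below is a minimal illustration, not the authors' implementation. It assumes the standard form of Karlin's measure: the codon usage difference B(F|G) is the amino-acid-frequency-weighted sum of absolute differences between codon frequencies normalised within each amino acid family, and E(g) is B(g|C) divided by a weighted mean of B(g|RP), B(g|CH) and B(g|TF) with weights 1/2, 1/4 and 1/4. Pooling codon counts across a gene group, the function names, and the toy sequences are further assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Standard genetic code in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def codon_counts(seqs):
    """Pool sense-codon counts over in-frame coding sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if CODON_TO_AA.get(codon, "*") != "*":
                counts[codon] += 1
    return counts


def usage(counts):
    """Return amino acid frequencies and within-family codon frequencies."""
    aa_total = defaultdict(int)
    for codon, n in counts.items():
        aa_total[CODON_TO_AA[codon]] += n
    total = sum(aa_total.values())
    p = {aa: n / total for aa, n in aa_total.items()}
    f = {codon: n / aa_total[CODON_TO_AA[codon]] for codon, n in counts.items()}
    return p, f


def codon_usage_difference(F, G):
    """B(F|G): weighted absolute difference in codon usage of F relative to G."""
    p, f = usage(F)
    _, g = usage(G)
    return sum(p[aa] * abs(f.get(codon, 0.0) - g.get(codon, 0.0))
               for codon, aa in CODON_TO_AA.items() if aa in p)


def expression_levels(gene, genome, rp, ch, tf):
    """E_RP, E_CH, E_TF and the combined E(g) for one gene (assumed weights)."""
    def ratio(a, b):
        return a / b if b > 0 else float("inf")
    b_c = codon_usage_difference(gene, genome)
    b_rp = codon_usage_difference(gene, rp)
    b_ch = codon_usage_difference(gene, ch)
    b_tf = codon_usage_difference(gene, tf)
    return {"E_RP": ratio(b_c, b_rp), "E_CH": ratio(b_c, b_ch),
            "E_TF": ratio(b_c, b_tf),
            "E": ratio(b_c, 0.5 * b_rp + 0.25 * b_ch + 0.25 * b_tf)}


def is_phx(levels):
    """PHX if E(g) > 1.00 and at least two of E_RP, E_CH, E_TF exceed 1.05."""
    high = sum(levels[k] > 1.05 for k in ("E_RP", "E_CH", "E_TF"))
    return levels["E"] > 1.00 and high >= 2


# Toy sequences (made up): the reference sets prefer GCT/AAA, the genome GCC/AAG.
reference = codon_counts(["GCTGCTAAAAAA"])
genome = codon_counts(["GCCGCCAAGAAG"])
levels_like_reference = expression_levels(
    codon_counts(["GCTGCTAAAAAG"]), genome, reference, reference, reference)
levels_like_genome = expression_levels(
    codon_counts(["GCCGCCAAGAAG"]), genome, reference, reference, reference)
```

In the toy example, the first gene shares the codon preferences of the reference sets and is classified as PHX, while the second gene matches the genome-wide usage and is not.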
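Let F and G be two groups of genes. The codon usage difference of F relative to G is calculated by the formula

$$B(F|G) = \sum_{a} p_a(F) \left[ \sum_{(x,y,z)=a} \left| f(x,y,z) - g(x,y,z) \right| \right]$$

where $p_a(F)$ are the average amino acid frequencies of the genes in F, and $f(x, y, z)$ and $g(x, y, z)$ are the average codon frequencies for the codon triplet $(x, y, z)$ of each amino acid codon family for gene groups F and G, respectively. Let $B(g|G)$ denote the codon usage difference of the gene g relative to the gene group G, and let C be the totality of all genes in the genome. 25 Predicted expression levels with respect to individual standards are based on the following ratios:

$$E_{RP}(g) = \frac{B(g|C)}{B(g|RP)}, \qquad E_{CH}(g) = \frac{B(g|C)}{B(g|CH)}, \qquad E_{TF}(g) = \frac{B(g|C)}{B(g|TF)}$$

and the general expression measure is

$$E(g) = \frac{B(g|C)}{\tfrac{1}{2} B(g|RP) + \tfrac{1}{4} B(g|CH) + \tfrac{1}{4} B(g|TF)}$$

PHX genes must satisfy the following two criteria: the $E(g)$ value exceeds 1.00, and at least two of $E_{RP}(g)$, $E_{CH}(g)$, and $E_{TF}(g)$ exceed 1.05. (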