Isolation of a Δ5 Desaturase Gene from Euglena gracilis and Functional Dissection of Its HPGG and HDASH Motifs

Delta (Δ) 5 desaturase is a key enzyme for the biosynthesis of health-beneficial long chain polyunsaturated fatty acids such as arachidonic acid (ARA, C20:4n-6), eicosapentaenoic acid (C20:5n-3) and docosahexaenoic acid (C22:6n-3) via the “desaturation and elongation” pathways. A full length Δ5 desaturase gene from Euglena gracilis (EgΔ5D) was isolated by cloning the products of polymerase chain reaction with degenerate oligonucleotides as primers, followed by 5′ and 3′ rapid amplification of cDNA ends. The whole coding region of EgΔ5D was 1,350 nucleotides in length and encoded a polypeptide of 449 amino acids. BlastP search showed that EgΔ5D has about 39 % identity with a Δ5 desaturase of Phaeodactylum tricornutum. In a genetically modified dihomo-gamma-linoleic acid (DGLA, C20:3n-6) producing Yarrowia lipolytica strain, EgΔ5D had strong Δ5 desaturase activity with DGLA to ARA conversion of more than 24 %. Functional dissection of its HPGG and HDASH motifs demonstrated that both motifs were important, but not necessary in the exact form as encoded for the enzyme activity of EgΔ5D. A double mutant EgΔ5D-34G158G with altered sequences within both HPGG and HDASH motifs was generated and exhibited Δ5 desaturase activity similar to the wild type EgΔ5D. Codon optimization of the N-terminal region of EgΔ5D-34G158G and substitution of the arginine with serine at residue 347 improved substrate conversion to 27.6 %. Electronic supplementary material The online version of this article (doi:10.1007/s11745-012-3690-1) contains supplementary material, which is available to authorized users.

Since mammals lack Δ12- and Δ15-desaturases, ARA, EPA and DHA cannot be synthesized de novo and must be either obtained from the diet or synthesized through the “desaturation and elongation” pathways (Fig. 1) from the essential fatty acids linoleic acid (LNA, 18:2n-6) and/or α-linolenic acid (ALA, 18:3n-3). These LC-PUFA are important fatty acids for human growth and development. For example, ARA, a precursor of EPA, is abundant in the brain and muscles. As a lipid second messenger, ARA is involved in cellular signaling and is a key inflammatory intermediate [3]. EPA is a precursor of DHA, and induces a broad anti-inflammatory response [1][2][3]. DHA is a major ω-3 fatty acid in the mammalian central nervous system and enhances synaptic activities in neuronal cells [1][2][3][4]. EPA and DHA are the precursors of E- and D-series resolvins, respectively. These two classes of resolvins have distinct structural, biochemical and pharmacological properties [5,6]. ARA and DHA together play critical roles in neurological development and health [4,7]. Dietary EPA and DHA can effectively reduce the level of blood triglycerides in humans [8]. Increased intake of EPA-rich supplements has beneficial effects on coronary heart disease, high blood pressure, inflammatory disorders and mental illness [9,10]. Currently, the primary source of EPA and DHA is marine fish oil. Most of the EPA and DHA in fish oil originates from the cold-water oceanic microalgae on which the fish feed. More than 85 % of isolated fish oil is used for aquaculture. In the case of salmon farming, fish oils from approximately 4 pounds of fish are needed to raise one pound of salmon filet; the fish-in and fish-out ratio is about 4:1. In today's environment, wild-caught fish often contain contaminants such as methylmercury, polychlorinated biphenyls, dioxins and several other halogenated persistent organic pollutants [11].
With ever-growing human populations, and limited sources of ocean fish, there is growing concern about the quality, quantity and sustainability of fish oil.
In the last two decades, great efforts have been focused on developing different hosts for production of LC-PUFA. Wild type Mortierella alpina has been developed for commercial production of ARA [12], while Crypthecodinium cohnii and Schizochytrium have been developed for commercial production of DHA [13]. The ARA and DHA oils produced from these organisms have been largely used in infant formulas. Yarrowia lipolytica has been genetically engineered to contain an EPA biosynthesis pathway [14], allowing for the commercial production of EPA oil, NewHarvest™ (http://www.newharvest.com). The EPA oil has been used as a human nutritional supplement. Additionally, EPA-rich Yarrowia biomass has been used to feed a brand of farmed salmon, Verlasso™ (http://www.verlasso.com), with a fish-in and fish-out ratio of about 1:1. However, the current production scale and cost of ARA, EPA and DHA cannot meet the market demand.
Δ5 desaturases are known as “front-end” desaturases, wherein desaturation occurs between a pre-existing double bond and the carboxyl terminus of the fatty acid [26][27][28][29]. Like other desaturases, Δ5 desaturase is an iron-containing and membrane-bound enzyme that requires both molecular oxygen and an electron transfer to introduce double bonds into an existing acyl chain. Microsomal cytochrome b5 serves as electron donor to desaturase enzymes [26,29,30]. Fatty acid desaturation can be carried out via concerted action of multiple enzymes including NADH reductase, the desaturase enzyme, and cytochrome b5 reductase. Alternatively, many desaturases contain both a cytochrome b5 domain and a desaturase domain. For example, Δ4, Δ5, Δ6 and Δ8 desaturases have a cytochrome b5 domain at their N-terminus [26]; Δ9 desaturase has a cytochrome b5 domain at its C-terminus [30]. The cytochrome b5 domain of all desaturases has a heme-binding “HPGG” motif. Previous studies using molecular dynamics simulations suggest the “HPGG” motif is important for heme group assembly and desaturase function [31,32].
The active site of desaturases has been characterized as a diiron cluster that is bound to the enzyme by three regions of highly conserved histidine-rich (His-rich) motifs [27,30]. These three His-rich motifs, H(X)3-4H, H(X)2-3HH, and H/Q(X)2-3HH, are conserved among all front-end desaturases, and the eight histidine residues of these three motifs are essential for catalytic activity [33]. In the case of the Δ5 desaturase of M. alpina (MaΔ5D) [24,25], the exact amino acid sequence of the first His-rich motif (H(X)3-4H) is HDASH, which has been suggested as one of the characteristics of Δ5 desaturases and necessary for their function to convert DGLA to ARA [34]. Recent studies find that several Δ5 desaturases (GenBank accession #s: AAL82631, AAL13311, AAL92562, AAM09687, CAJ07076) do not contain the exact HDASH sequence. Due to the important role of Δ5 desaturases in LC-PUFA biosynthesis, a detailed understanding of the functional significance of the conserved HPGG and HDASH motifs may contribute to improvements in hosts biologically engineered to produce commercially valuable LC-PUFA.
We report the isolation of a Δ5 desaturase gene from Euglena gracilis (EgΔ5D). Expression of EgΔ5D in a genetically modified DGLA producing Y. lipolytica strain revealed that EgΔ5D had strong Δ5 desaturase activity. Functional dissection of HPGG and HDASH motifs demonstrated that neither the HPGG nor the HDASH motif is necessary in the exact form as encoded for enzyme activity of EgΔ5D. Various mutants, within the HPGG or HDASH motif alone, or within both HPGG and HDASH motifs, are functionally equivalent to, or have higher Δ5 desaturase activity than, the wild type EgΔ5D. Codon optimization of

Leu− and Ura− phenotype, also originated from the wild type strain ATCC #20362. Strain Y4036U produced approximately 18 % DGLA of total fatty acids (Fig. 2b) and carries heterologous genes encoding the Δ12 desaturase of Fusarium moniliforme [35]; the C16/18 elongase of M. alpina [36]; the Δ9 elongase of E. gracilis [37] and a synthetic mutant Δ8 desaturase [38] derived from E. gracilis. Minimal Media + Leucine (MMLeu), High Glucose Media (HGM) and YPD medium were used as required for Y. lipolytica strains, which were cultured at 30°C. MMLeu (per liter): 20 g of glucose; 1.7 g yeast nitrogen base without amino acids or ammonium sulfate; 0.1 g proline; 0.1 g leucine; pH 6.1. HGM (per liter): 80 g glucose, 2.58 g KH2PO4, 5.36 g K2HPO4, pH 7.5. YPD medium (per liter): 10 g of yeast extract, 20 g of Bacto peptone, and 20 g of glucose. Agar plates were prepared by addition of 20 g/l agar to liquid media.

General Techniques for Molecular Biology
Recombinant DNA techniques were used according to standard methods [39,40]. Site-directed mutagenesis was performed according to the manufacturer's protocol (QuikChange™, Stratagene; San Diego, CA). When PCR or site-directed mutagenesis was involved in the generation of mutants and/or cloning, DNA was sequenced to verify that no additional mutations were introduced.
Total RNA was extracted from the E. gracilis cells using the RNA STAT-60™ reagent (Amsbio LLC., Lake Forest, CA). From 1 mg of total RNA, 85 μg of mRNA was purified using the mRNA Purification Kit (Amersham Biosciences, Piscataway, NJ). Synthesis of cDNA from the E. gracilis mRNA was carried out using the adapter primer AP of the 3′-RACE kit from Invitrogen (Carlsbad, CA) and the Smart IV oligonucleotide of the BD-Clontech Creator™ Smart™ cDNA library kit (Mississauga, ON, Canada) as primers. The reverse transcription was performed with Superscript II reverse transcriptase from Invitrogen.
PCR reactions were carried out in a 50 μl total volume comprising: PCR buffer (containing 10 mM KCl, 10 mM (NH4)2SO4, 20 mM Tris-HCl (pH 8.75), 2 mM MgSO4, 0.1 % Triton X-100), 100 μg/mL BSA, 200 μM each deoxyribonucleotide triphosphate, 10 pmol of each primer, 10 ng cDNA of E. gracilis and 1 μl of Taq DNA polymerase (Epicentre Technologies, Madison, WI). The thermocycler conditions were set for 35 cycles at 95°C for 1 min, 56°C for 30 s and 72°C for 1 min, followed by a final extension at 72°C for 10 min. The DNA band of the expected size was isolated from a 1 % agarose gel and cloned into the pGEM-T Easy vector (Promega, Madison, WI).
Modified 5′ and 3′ RACE techniques were used to obtain the full length EgΔ5D. The cDNA product from E. gracilis mRNA was used as template, and all the primers used in the 5′ and 3′ RACE are listed in Supplemental Table S1. Specifically, a gene-specific primer ODMWP480 and a generic primer CDSIII 5′ were used in the first round of 5′ RACE. The PCR amplifications were carried out in a 50 μl total volume, comprising: 25 μl of LA Taq™ premix (TaKaRa Bio Inc., Otsu, Shiga, 520-2193, Japan), 10 pmol of each primer and 1 μl of Taq DNA polymerase (Epicentre Technologies, Madison, WI). The thermocycler conditions were the same as described above. One microliter of this product was directly used in a second amplification, which differed from the first only in that the primers used, ODMWP479 and the generic primer DNR CDS 5′, were internal to the first set of primers. As no translation initiation codon was found in the product of the first round of 5′ RACE, the entire modified 5′ RACE protocol was repeated using the gene-specific primers YL791 and YL792 in place of those used in the first round.
A variation of the 3′ RACE technique was used to isolate the C-terminal fragment of EgΔ5D. The primer combinations ODMW469 and AUAP, and then YL470 and AUAP, were used in the initial amplification and second round reaction, respectively. The PCR reactions were the same as those described for the 5′ RACE.

Yarrowia Expression Vector and Transformation
Yarrowia expression vector pDMW367 contains autonomous replication sequence 18 [41] and a URA3 gene (GenBank accession #: AJ306421) of Y. lipolytica. It also contains a FBAIN::EgΔ5D::Pex20 chimeric gene. FBAIN is a promoter derived from the fructose-bisphosphate aldolase gene (FBA1) of Y. lipolytica [42]. EgΔ5D is the coding region of a wild type Δ5 desaturase of E. gracilis, in which the amino acid at position 347 is arginine. Pex20 is a terminator sequence of the PEX20 gene (GenBank accession #: AF054613) of Y. lipolytica. Transformation of Y. lipolytica strain Y4036U was carried out as described by Chen et al. [43].

Cultivation of Y. lipolytica Transformants and Fatty Acid Analysis by Gas Chromatography
The Yarrowia expression plasmid and its derivatives were used to transform strain Y4036U individually. Transformants from each transformation were streaked onto new MMLeu plates and kept in a 30°C incubator for 2 days. Cells from streaked plates were cultivated in 24 well blocks with 3 mL MMLeu, and incubated for 2 days at 30°C with shaking at 200 rpm. The cells were then collected by centrifugation and resuspended in 3 mL HGM. The cells were incubated another 5 days at 30°C with shaking at 200 rpm.
Fatty acid methyl esters from 1 ml cell culture of Y. lipolytica or E. gracilis were prepared as described [44], except that the fatty acid methyl esters were extracted with 0.5 ml of heptane and separated by Agilent 7890A GC using hydrogen as carrier gas supplied by a hydrogen generator (Parker Hannifin, Cleveland, OH). The oven temperature was programmed from 200 to 240°C at a rate of 25°C/min. The proportion of each fatty acid was based on the integrated peak area of the corresponding fatty acid methyl ester as a percent relative to the sum of all integrated peaks as calculated by Agilent ChemStation Software.
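The area-percent calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the ChemStation implementation; the function name and the peak areas are hypothetical placeholders rather than measured values.

```python
def percent_composition(peak_areas):
    """Return each fatty acid's share of the summed integrated peak area, in percent."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical integrated peak areas (arbitrary units), NOT data from this study.
example_areas = {
    "16:0": 120.0,
    "18:1": 400.0,
    "18:2 (LNA)": 300.0,
    "20:3 (DGLA)": 180.0,
}

composition = percent_composition(example_areas)
for name, pct in composition.items():
    print(f"{name}: {pct:.1f} % of total fatty acids")
```

Because every peak is normalized to the same total, the reported proportions always sum to 100 %.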
After comparison of four Δ5 desaturase genes, PiΔ5D from Pythium irregulare [46], PmΔ5D from Phytophthora megasperma (GenBank accession #: CAD53323), PtΔ5D from Phaeodactylum tricornutum [47], and DdΔ5D from Dictyostelium discoideum (GenBank accession #: XP_640331), as well as two Δ8 desaturase genes, EgΔ8D from E. gracilis [16,38] and PlΔ8D from Pavlova lutheri [48], two conserved regions, GHH(I/V)YTN and N(Y/F)Q(V/I)EHH (Fig. 3), were selected to design primers to amplify a portion of EgΔ5D. To reduce the degeneracy of the primers, four primers (Supplemental Table S1: 5-1A to 5-1D) were generated for conserved region 1 and four primers (Supplemental Table S1: 5-5AR to 5-5DR) for the anti-sense strand of conserved region 2. One DNA fragment amplified with primers 5-1B and 5-5DR was cloned into the pGEM-T Easy vector to generate pT-F10-1. DNA sequencing showed that the 590 bp insert of pT-F10-1 encoded an amino acid sequence with 38 % identity and 53 % similarity to the amino acid sequence of the Δ8-sphingolipid desaturase of Thalassiosira pseudonana (TsΔ8D, GenBank accession #: AAX14502), and 37 % identity and 52 % similarity with PtΔ5D [47]. These data suggested that the 590 bp DNA fragment might be a part of a desaturase gene of E. gracilis. This gene was designated as putative EgΔ5D. 5′ and 3′ RACE techniques were used to extend the 590 bp region of the putative EgΔ5D (Fig. 4). Assembly of the 5′ region, the original 590 bp fragment and the 3′ region resulted in a 1,633 bp contig, comprising the complete coding region with additional untranslated 5′ and 3′ ends (Fig. 4). The coding region of the putative EgΔ5D is 1,350 bp in length and encodes a peptide of 449 amino acids. BlastP searches using the full length putative EgΔ5D as the query sequence showed that it shares 39 % identity and 56 % similarity with PtΔ5D [47], and 37 % identity and 55 % similarity with TsΔ8D (GenBank accession #: AAX14502).
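To illustrate why the degenerate primers were split into several pools, the sketch below counts how many distinct nucleotide sequences could encode each conserved peptide region under the standard genetic code. It is a rough upper bound under stated assumptions: the actual primers are listed in Supplemental Table S1 and are not reproduced here, and the helper names (`parse_motif`, `degeneracy`) are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def parse_motif(motif):
    """Split 'GHH(I/V)YTN' into per-position residue sets: ['G', 'H', 'H', 'IV', ...]."""
    positions, i = [], 0
    while i < len(motif):
        if motif[i] == "(":
            end = motif.index(")", i)
            positions.append(motif[i + 1:end].replace("/", "").replace(" ", ""))
            i = end + 1
        else:
            positions.append(motif[i])
            i += 1
    return positions


def degeneracy(motif):
    """Number of distinct coding sequences for a peptide motif (product of codon counts)."""
    total = 1
    for residues in parse_motif(motif):
        total *= sum(1 for aa in CODON_TABLE.values() if aa in residues)
    return total


region1 = degeneracy("GHH(I/V)YTN")
region2 = degeneracy("N(Y/F)Q(V/I)EHH")
print(region1, region2)
```

Dividing such a mixture across four primers per region lowers the degeneracy of each individual primer pool, which is the stated purpose of the design.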
Amino acid sequence alignment performed with Clustal W (MegAlign™ program of the DNASTAR software) showed that EgΔ5D has <30 % identity with some representative Δ5 desaturases such as IgΔ5D from I. galbana (AEA72469); MaΔ5D from M. alpina [24,25,34]; PiΔ5D [46]; OtΔ5D from O. tauri (GenBank accession #: XP_003082424) and TaΔ5D from T. aureum [49]. Further analyses showed that EgΔ5D has only 20 % and 25.5 % identity with the EgΔ8D [16] and EgΔ4D [45] desaturases of E. gracilis, respectively. The PCR products for the full length coding region of the putative EgΔ5D were found in two versions, with identical nucleotide sequences except at base pair positions 1,039 and 1,041. This difference resulted in a codon change from CGA to AGC. As such, one PCR product indicated arginine at position 347, whereas the second indicated serine. It was hypothesized that this discrepancy arose at the stage of PCR amplification or during cDNA generation.
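The numbering in this paragraph can be checked with a short Python sketch, assuming the coding region is numbered from the first base of the start codon and translated with the standard genetic code. The function names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codon_span(residue_number):
    """1-based (first, last) nucleotide positions of the codon for a 1-based residue."""
    first = 3 * (residue_number - 1) + 1
    return first, first + 2


def changed_positions(residue_number, codon_a, codon_b):
    """1-based nucleotide positions at which two versions of a codon differ."""
    first, _ = codon_span(residue_number)
    return [first + i for i, (a, b) in enumerate(zip(codon_a, codon_b)) if a != b]


span = codon_span(347)                            # codon 347 covers nt 1,039-1,041
diffs = changed_positions(347, "CGA", "AGC")      # first and third base differ
residues = (CODON_TABLE["CGA"], CODON_TABLE["AGC"])  # arginine vs serine
orf_codons = 1350 // 3                            # 449 sense codons plus one stop codon
print(span, diffs, residues, orf_codons - 1)
```

The two differing bases are the first and third positions of codon 347, which is consistent with the reported positions 1,039 and 1,041 and with a 1,350 bp coding region encoding 449 amino acids.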

Determination of Δ5 Desaturase Activity and Topology Model of EgΔ5D
To study the function of the putative EgΔ5D, plasmid pDMW367 was generated to express the EgΔ5D coding region under the control of the strong constitutive FBAIN promoter [42] from Y. lipolytica. The four restriction sites (i.e., BglII, EcoRI, HindIII and NcoI) inside the coding region of EgΔ5D in pDMW367 were removed by site-directed mutagenesis to generate pDMW367-M4 (Supplemental Fig. S1). The amino acid sequence of EgΔ5D is identical in the pDMW367 and pDMW367-M4 constructs, and the amino acid at position 347 is an arginine.
Like other fatty acid desaturases, EgΔ5D is a membrane-bound enzyme and belongs to a super-family of membrane di-iron proteins with three His-rich motifs: HX(3,4)H, HX(2,3)HH and (H/Q)X(2,3)HH. These His residues have been predicted to be located on the cytoplasmic face of the membrane and have been shown to be very important for enzyme activity [33]. Within EgΔ5D, these three His-rich motifs are the HDASH motif located from residue 155 to 159, the HIMRHH motif located from residue 190 to 195, and the QIEHH motif located from residue 385 to 389. The third His-rich motif contains a glutamine substitution that is common to other front-end desaturases. Based on transmembrane domain analysis (TMHMM Server v. 2.0, Center for Biological Sequence Analysis, BioCentrum-DTU, Technical University of Denmark, DK-2800 Lyngby, Denmark) and the location of the His-rich motifs, a topology model of EgΔ5D was developed (Fig. 5). The model shows that the N-terminal cytochrome b5 domain is located in the cytosol. The topology model also predicts that EgΔ5D has a total of four transmembrane regions (amino acid residues 103-125, 130-152, 280-302 and 306-328) and two hydrophobic regions (amino acid residues 165-187 and 234-256). These hydrophobic segments are not membrane-spanning, and may represent hydrophobic patches located close to the di-iron active site. Because the substrates for the desaturase are highly hydrophobic, they will likely partition into the lipid bilayer. We therefore propose that the di-iron active site assembled from these His-clusters lies at or very near the membrane surface.
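The three His-rich consensus patterns can be written as regular expressions to locate candidate motifs in a protein sequence. The sketch below is a minimal illustration; the toy sequence is invented for demonstration (it is not the EgΔ5D sequence), and the patterns are deliberately loose, so a single window may satisfy more than one pattern.

```python
import re

# Consensus patterns for the three His-rich motifs, written as regular expressions.
HIS_BOX_PATTERNS = {
    "HX(3,4)H": r"H.{3,4}H",
    "HX(2,3)HH": r"H.{2,3}HH",
    "(H/Q)X(2,3)HH": r"[HQ].{2,3}HH",
}


def find_his_boxes(protein):
    """Return (pattern name, 1-based start, 1-based end, matched text) for every hit."""
    hits = []
    for name, pattern in HIS_BOX_PATTERNS.items():
        for match in re.finditer(pattern, protein):
            hits.append((name, match.start() + 1, match.end(), match.group()))
    return hits


# Invented toy sequence carrying the three motifs reported for the enzyme.
toy_protein = "MKT" + "HDASH" + "LLL" + "HIMRHH" + "AAA" + "QIEHH"

hits = find_his_boxes(toy_protein)
for hit in hits:
    print(hit)
```

In practice, hits would be filtered by position relative to the predicted transmembrane regions, since short histidine-containing windows can match these patterns by chance.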

The HPGG Motif is Important, but not Necessary for Δ5 Desaturase Activity of EgΔ5D
It has been suggested that the highly conserved HPGG motif plays a crucial role in heme group assembly, protein folding and stabilization in cytochrome b5 proteins, with the histidine residue functioning as an axial heme ligand where a peptide chain reversal occurs [50]. Previous studies have demonstrated that the heme-binding HPGG motif, and in particular the histidine residue, is essential for the enzyme activity of desaturases with a cytochrome b5 domain [51][52][53]. Although sequence divergence in the vicinity of the HPGG motif is common, the HPGG motif itself has been conserved throughout the evolution of all the Δ5 desaturase genes [54]. Thus it was claimed to be a characteristic of Δ5 desaturases and necessary for their function to convert DGLA to ARA [34].
To assess the functional significance of the HPGG motif (positions 33-36) of EgΔ5D, we first selected the proline

Table 1 shows that the P34 residue could be substituted with several different amino acids without significantly impacting the Δ5 desaturase activity of EgΔ5D. The EgΔ5D-34A, EgΔ5D-34C, EgΔ5D-34K and EgΔ5D-34W mutants exhibited >90 % of the wild type EgΔ5D activity. The EgΔ5D-34G mutant was functionally equivalent to the wild type EgΔ5D. Next, we studied the significance of the second glycine residue at position 36 (G36) within the HPGG motif of EgΔ5D using the same approach as that for P34. Table 2 shows that the G36 residue within the HPGG motif could be substituted with several different amino acids without significantly impacting the Δ5 desaturase activity of EgΔ5D. The EgΔ5D-36S and EgΔ5D-36D mutants had about 100.8 and 99.2 % of the Δ5 desaturase activity of EgΔ5D, respectively.
The above functional studies at the P34 and G36 positions within the HPGG motif of EgΔ5D demonstrated that the HPGG motif could be changed without impacting the Δ5 desaturase activity. Specifically, the EgΔ5D-34G, EgΔ5D-36S and EgΔ5D-36D mutants were functionally equivalent to the wild type EgΔ5D. Comparison of the ARA in unesterified fatty acid (FFA), phospholipid (PL) and neutral lipid (NL) pools of Yarrowia transformants with EgΔ5D and EgΔ5D-34G showed that the P34G mutation did not affect the ARA distribution in these pools (Supplemental Fig. S2).

Improvement of Δ5 Desaturase Activity of EgΔ5D by Amino Acid Substitution Within the HDASH Motif
The HDASH motif was also claimed to be one of the characteristics of Δ5 desaturases and necessary for their function to convert DGLA to ARA [34]. To test the hypothesis that the exact sequence of the HDASH motif (positions 155-159) of EgΔ5D was required, we first selected the alanine residue at position 157 (A157) as a target. The Δ5 desaturase activity attributed to each mutation at A157 is summarized in Table 3. The data showed that almost all mutations at A157 greatly reduced the Δ5 desaturase activity of EgΔ5D. However, the EgΔ5D-157G and EgΔ5D-157S mutants retained about 96 and 94 % of the activity of wild type EgΔ5D, respectively.
We also studied the significance of the serine residue at position 158 (S158) within the HDASH motif of EgΔ5D. Table 4 shows that S158 could be substituted with either an alanine or a glycine without substantially

The proline residue within the HPGG motif can be substituted with glycine with simultaneous substitution of either (1) the alanine residue within the HDASH motif with glycine or (2) the serine residue within the HDASH motif with alanine or glycine. The proline residue within the HPGG motif can also be substituted with histidine with simultaneous substitution of the serine residue within the HDASH motif with either alanine or glycine. In addition, the second glycine residue within the HPGG motif can be substituted with serine with simultaneous substitution of the serine residue within the HDASH motif with either alanine or glycine. Specifically, the EgΔ5D-34G/157G, EgΔ5D-34G/158A and EgΔ5D-34H/158G double mutants had more than 80 % of the Δ5 desaturase activity of EgΔ5D, while EgΔ5D-34G/158G had about 97 % of the Δ5 desaturase activity of EgΔ5D. Further analyses showed that the ARA distribution in the FFA, PL and NL pools of Yarrowia transformants with EgΔ5D-34G/158G was similar to that of Yarrowia transformants with EgΔ5D-34G or EgΔ5D (Supplemental Fig. S2), suggesting that the simultaneous substitutions (P34G and S158G) within the HPGG and HDASH motifs did not change desaturase substrate specificity.

Increased Substrate Conversion of EgΔ5D-34G/158G with Double Mutations in HPGG and HDASH Motifs
In order to increase the substrate conversion of EgΔ5D-34G/158G, we optimized the codon usage of the N-terminal portion of the gene for expression in Y. lipolytica. The codon-optimized EgΔ5D-34G/158G, designated as “EgD5M”, had 48 bp changed in the first 204 bp of the coding region (23.5 %; Fig. 6), which resulted in optimization of 43 codons of the first 68 amino acids within the N-terminus of the protein (63.2 %). The amino acid sequence encoded by the codon-optimized EgD5M was identical to that of EgΔ5D-34G/158G. EgD5M was used to replace the EgΔ5D of pDMW367-M4 to generate pDMW367-5M, containing a FBAIN::EgD5M::Pex20 chimeric gene. We then studied the importance of the arginine or serine at position 347 found in the original clones of EgΔ5D. Starting from EgD5M, the CGA codon for arginine at position 347 was changed to the AGC codon for serine, and the resulting gene was designated as EgD5M1. The synthetic EgD5M1 was used to replace the EgΔ5D of pDMW367-M4 to generate pDMW367-5M1, containing a FBAIN::EgD5M1::Pex20 chimeric gene.
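The two percentages quoted for the codon-optimized region follow directly from the counts given in the text, as the short check below shows. The helper name is illustrative only.

```python
def percent_changed(changed, total):
    """Share of changed items (bases or codons) in a region, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * changed / total


optimized_bp = 204                      # length of the optimized N-terminal region
optimized_codons = optimized_bp // 3    # 204 bp correspond to 68 codons

bases_pct = round(percent_changed(48, optimized_bp), 1)       # 48 of 204 bp
codons_pct = round(percent_changed(43, optimized_codons), 1)  # 43 of 68 codons
print(bases_pct, codons_pct)
```

The codon fraction is much larger than the base fraction because most synonymous changes alter only one base, typically the third, of a codon.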
The Δ5 desaturase activity of EgΔ5D, EgD5M and EgD5M1 is summarized in Table 6. GC analyses determined that there were about 3.6 % ARA and 10.8 % DGLA, 4.0 % ARA and 11.2 % DGLA, and 4.1 % ARA and 10.8 % DGLA of total fatty acids produced in the Yarrowia transformants with pDMW367-M4, pDMW367-5M, and pDMW367-5M1, respectively. These values indicate that the wild-type EgΔ5D converted about 24.8 % of DGLA to ARA, EgD5M converted 26.7 %, and EgD5M1 converted 27.6 %. The fatty acid profile of Yarrowia transformants with EgD5M1 was almost identical to the profile of Yarrowia transformants with pDMW367-M4 as shown in Fig. 2c, except that more ARA was produced. These data demonstrated that the codon optimization of EgΔ5D improved its substrate conversion efficiency. Further, the amino acid at position 347 did affect the Δ5 desaturase activity of EgΔ5D, with a serine residue preferred over an arginine residue.
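Substrate conversion can be approximated from the fatty acid percentages with the commonly used definition product / (substrate + product) × 100. That definition is an assumption here, since the exact formula is not stated in this section, and the inputs are the rounded averages quoted above, so the results differ slightly from the reported conversions, which were presumably calculated from unrounded per-sample data.

```python
def substrate_conversion(product_pct, substrate_pct):
    """Percent conversion, assumed here to be product / (substrate + product) * 100."""
    total = product_pct + substrate_pct
    if total <= 0:
        raise ValueError("substrate plus product must be positive")
    return 100.0 * product_pct / total


# (ARA %, DGLA %) of total fatty acids, rounded values quoted in the text.
constructs = {
    "pDMW367-M4": (3.6, 10.8),
    "pDMW367-5M": (4.0, 11.2),
    "pDMW367-5M1": (4.1, 10.8),
}

conversions = {
    name: round(substrate_conversion(ara, dgla), 1)
    for name, (ara, dgla) in constructs.items()
}
print(conversions)
```

With these rounded inputs the sketch gives about 25.0, 26.3 and 27.5 %, which preserves the reported ranking of the three constructs (24.8, 26.7 and 27.6 %).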

Discussion
Y. lipolytica has an established history of robust fermentation performance at commercial scale for processes including the production of food-grade citric acid for human consumption and single-cell protein for animal feeds [55]. Recently, Y. lipolytica has been used as a host for production of lipid-based compounds [14,56,57]. Some Y. lipolytica strains are oleaginous organisms that are able to accumulate up to 40 % dry cell weight as oil when starved for nitrogen in the presence of excess glucose as carbon source. However, LNA is the only PUFA that Y. lipolytica can synthesize de novo (Fig. 2a). Therefore, it is necessary to isolate genes encoding enzymes for every step of the “desaturation and elongation” pathways (Fig. 1) before genetically engineering Y. lipolytica to produce ARA, EPA and DHA oil. Δ5 desaturase is the enzyme responsible for the conversion of DGLA to ARA, and ETA to EPA. Although several Δ5 desaturase genes have been isolated from various organisms [26], more effective enzymes may help to improve the production of commercially important LC-PUFA. Previous studies have indicated that Euglena is able to synthesize ARA, EPA and DHA through the “Δ9 elongase/Δ8 desaturase” pathway [16,38,45]. In this report, the gene encoding a Δ5 desaturase from E. gracilis was isolated and characterized. Our results indicated two nucleotide sequences with a difference of two base pairs that would result in either arginine or serine at position 347. This discrepancy was most likely generated during PCR amplification or cDNA generation. BlastP searches showed that the amino acid sequence of EgΔ5D shares <40 % identity with any Δ5 desaturase found in GenBank; PtΔ5D [47] was the most similar one (about 39 %), suggesting that the primary structure of EgΔ5D is quite different from those of previously isolated Δ5 desaturases.
Amino acid sequence alignment also shows that EgΔ5D has about 20 % identity with EgΔ8D [16,38] and about 25.5 % identity with EgΔ4D [45]. These data suggest that EgΔ5D is evolutionarily closer to Δ5 desaturases than to the Δ4 or Δ8 desaturases. Functional analyses of EgΔ5D in Y. lipolytica strains Y4036U and Y2224 revealed that it has strong Δ5 desaturase activity, with more than 24 % substrate conversion of DGLA to ARA, and that it is not a Δ5/Δ6 bifunctional enzyme.
The HPGG motif is expected to be on the cytochrome b5 surface, in contact with the heme through van der Waals interactions [27]. The conserved HPGG motif was thought to be essential in maintaining cytochrome b5 electron transfer function, with the histidine serving as a heme axial ligand [50]. Substitution of the histidine residue of the HPGG motif with alanine in the Δ6 desaturase cytochrome b5 domain of starflower, rat and algae abolished Δ6 desaturase activity [51][52][53]. It is expected that H33 of EgΔ5D should also be essential for its function.
The HPGG motif itself has a unique structure. The proline residue of the HPGG stretch is located in a turn between two consecutive helices (Fig. 5) and was thought to be important in protein folding and in maintaining cytochrome b5 protein stability [32]. The three-carbon side chain of proline is bonded to both the nitrogen and the carbon of the peptide backbone to form a five-membered ring that greatly restricts its conformational freedom. The nonpolar characteristic of this ring structure may create a hydrophobic spot within the HPGG motif. On the other hand, glycine possesses the smallest amino acid side chain, a hydrogen, which allows for greater flexibility in local structure. It is likely that the combination of proline and glycine residues within the cytochrome b5 HPGG motif is an important factor affecting both the structural position of the hydrophobic heme pocket and the appropriate orientation of the heme group within the heme pocket relative to the desaturase catalytic site. Surprisingly, our results indicate that the proline residue of the HPGG motif is not essential for electron transfer from the heme group of the cytochrome b5 domain to the catalytic diiron cluster of EgΔ5D. Most substitution mutants at P34 displayed at least 70 % of the wild type EgΔ5D activity. The EgΔ5D-34G (HgGG) mutant had greater than 98 % of the wild type EgΔ5D activity, demonstrating that the proline residue of the HPGG motif is not required for the enzyme activity of EgΔ5D (Table 1).
It is noteworthy that the aspartate substitutions in EgΔ5D-34D and EgΔ5D-36D exhibit different effects on desaturase activity. Although several acidic amino acids are conserved, the cytochrome b5 domain of desaturases shows a characteristic reduction in the number of aspartate and glutamate residues in the vicinity of the HPGG motif compared to free cytochrome b5. This reduced number of acidic residues around the heme pocket is thought to contribute to stabilizing nonpolar intermolecular interactions between the cytochrome b5 and desaturase domains [27,54]. Substitutions involving aspartate or glutamate residues in the HPGG motif may affect the interface geometry of electron donor/acceptor docking, which would be reflected in desaturase activity through altered electron transfer. We also found that substitutions for G36 of EgΔ5D resulted in mutants with strong Δ5 desaturase activity. The most functional mutants were the small, slightly polar serine replacement, EgΔ5D-36S, and the acidic substitution with aspartate, EgΔ5D-36D. The activities of these two mutants are about the same as that of the wild type EgΔ5D (Table 2).
The first His-rich motif, HX(3,4)H, of EgD5D has the amino acid sequence HDASH and is located at residues 155 to 159. The HDASH motif has been suggested to be characteristic of D5 desaturases and necessary for their function of converting DGLA to ARA in transformed organisms [34]. Sequence analyses showed that natural variants of the HDASH motif exist in D5 desaturases. For example, at the corresponding location, PiD5D [46] has the sequence HDsSH, the D5 desaturase from Thraustochytrium sp. ATCC 21685 (GenBank accession #: AAM09687) has the sequence HemgH, the D5 desaturase from Leishmania major strain Friedlin (GenBank accession #: CAJ07076) has the sequence HeAgH, the D5 desaturase from Atlantic salmon (GenBank accession #: AAL82631) has the sequence HDygH, and PtD5D [47] has the sequence HDAnH. This suggests that the HDASH motif is not an invariant characteristic of D5 desaturases and may not be required for D5 desaturase activity. We suggest that the two His residues of the HDASH motif participate in the coordination of the diiron center (Fig. 5), but that the three residues between them (DAS) can be modified.
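The variation among these natural motifs can be tabulated with a short script. The following is a minimal Python sketch, not part of the original analysis: it checks that each variant quoted above still fits the HX(3,4)H pattern and reports which positions within the motif (numbered 1 to 5) differ from HDASH. The function name `differing_positions` and the dictionary `VARIANTS` are hypothetical names introduced only for illustration.

```python
import re

# First His-box pattern from the text: His, then 3 or 4 residues, then His.
HIS_BOX = re.compile(r"H[A-Z]{3,4}H")
REFERENCE = "HDASH"  # motif of EgD5D at residues 155-159

# Motif variants exactly as quoted in the text.
VARIANTS = {
    "EgD5D": "HDASH",
    "PiD5D": "HDsSH",
    "Thraustochytrium sp. ATCC 21685": "HemgH",
    "Leishmania major strain Friedlin": "HeAgH",
    "Atlantic salmon": "HDygH",
    "PtD5D": "HDAnH",
}


def differing_positions(motif, reference=REFERENCE):
    """Return the 1-based motif positions where `motif` differs from `reference`."""
    seq = motif.upper()
    if HIS_BOX.fullmatch(seq) is None:
        raise ValueError(f"{motif!r} does not fit the HX(3,4)H pattern")
    if len(seq) != len(reference):
        raise ValueError("motifs must have the same length to be compared")
    return [i + 1 for i, (a, b) in enumerate(zip(seq, reference)) if a != b]


if __name__ == "__main__":
    for source, motif in VARIANTS.items():
        print(f"{source}: {motif} differs at positions {differing_positions(motif)}")
```

In every variant listed, the differences fall only at motif positions 2 to 4, which is consistent with the suggestion that the two flanking His residues are the conserved, functionally constrained part of the motif.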
Systematic substitution studies (Tables 3, 4) at positions A157 and S158 within the HDASH motif of EgD5D demonstrated that these two residues could be replaced and that the mutants retained good D5 desaturase activity. The EgD5D-157G and EgD5D-157S mutants had about 96 and 94 % of the wild type EgD5D activity, respectively. Since PiD5D has an HDsSH motif [46], it is not surprising that EgD5D-157S, with the sequence HDsSH, functioned well in Yarrowia. We also found that S158 could be substituted with either alanine or glycine. Alanine, glycine and serine are thus interchangeable within the HDASH motif of EgD5D; furthermore, the enzyme activity of EgD5D could be improved with a motif of HDAgH instead of HDASH (Table 4).
To determine whether at least one motif, HPGG or HDASH, is required for the enzyme activity of EgD5D, a series of mutants with mutations in both the HPGG and HDASH motifs was generated (Table 5). Some double mutants such as EgD5D-34G/157G, EgD5D-34G/158A and EgD5D-34H/158G had more than 80 % of wild type EgD5D activity, while EgD5D-34G/158G had almost the same activity as wild type EgD5D. Therefore, neither the HPGG nor the HDASH motif is necessary in the exact form as encoded for the activity of EgD5D.
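The mutant labels used here encode the substituted position and the new residue. As a minimal Python sketch of that naming convention (the function name `apply_mutations` is hypothetical, and the start of the HPGG motif at residue 33 is an assumption inferred from the labels P34 and G36; the HDASH start at residue 155 is stated in the text), a label such as 34G/158G can be expanded into the resulting motif sequences, with lower case marking the substituted residue:

```python
# Start residue of each motif. HDASH at 155 is given in the text; HPGG at 33
# is an assumption inferred from the residue labels P34 and G36.
MOTIFS = {"HPGG": 33, "HDASH": 155}


def apply_mutations(label):
    """Expand a label such as '34G/158G' into the mutated motif sequences.

    Substituted residues are written in lower case, matching the notation
    used in the text (for example HgGG and HDAgH).
    """
    result = {motif: list(motif) for motif in MOTIFS}
    for token in label.split("/"):
        position, residue = int(token[:-1]), token[-1]
        for motif, start in MOTIFS.items():
            offset = position - start
            if 0 <= offset < len(motif):
                result[motif][offset] = residue.lower()
                break
        else:
            # Positions outside both motifs (such as 347) are not handled here.
            raise ValueError(f"position {position} lies outside both motifs")
    return {motif: "".join(chars) for motif, chars in result.items()}


if __name__ == "__main__":
    for label in ("34G", "157S", "34G/157G", "34G/158A", "34G/158G"):
        print(label, apply_mutations(label))
```

For the single mutants, the sketch reproduces the motif strings quoted in the text: 34G gives HgGG, 157S gives HDsSH and 158G gives HDAgH.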
Distribution analyses (Supplemental Fig. S2) of ARA in the FFA, PL and NL pools of Yarrowia transformants expressing EgD5D show that more ARA was present in the PL pool than in the FFA pool, suggesting that EgD5D is also an acyl-lipid desaturase, like other front-end desaturases from lower plants, fungi and algae [26]. Comparison of the ARA distribution in Yarrowia transformants expressing EgD5D-34G or EgD5D-34G/158G with that in transformants expressing wild type EgD5D shows that neither the single mutation within the HPGG motif (P34G) nor the simultaneous mutations within the HPGG (P34G) and HDASH (S158G) motifs change the fatty acid distribution pattern (Supplemental Fig. S2); these changes in EgD5D therefore do not affect its substrate specificity.
An effective D5 desaturase is required for efficient conversion of DGLA to ARA or ETA to EPA (Fig. 1) in engineered Y. lipolytica or other organisms to produce commercially valuable LC-PUFA. We employed two approaches to improve the enzyme activity of the double mutant EgD5D-34G/158G. First, optimization (Fig. 6) of 43 of the 68 codons within the N-terminal portion of EgD5D-34G/158G, yielding EgD5M, improved substrate conversion (Table 6). This improvement may relate to the rate of translation. Recent reports suggest that the sequence at the beginning of a gene can influence translation, and that the structure at the 5′ end of an mRNA can affect protein levels [58,59]. Next, we substituted the arginine residue at position 347 with the serine that was identified in our original PCR products. Surprisingly, this R347S substitution in codon-optimized EgD5M1 further improved substrate conversion (Table 6). These data suggest that some non-conserved amino acids among different D5 desaturases may be good targets for protein evolution to generate improved enzymes. At this stage, the improved versions of EgD5D, EgD5M and EgD5M1, should enable us to engineer Yarrowia and other organisms to produce high levels of ARA, EPA and DHA.
In conclusion, our studies suggest that the exact sequences of the HPGG and HDASH motifs are not necessary for the function of EgD5D. Several amino acids within these two motifs can be changed individually, or simultaneously, without significantly reducing the enzyme activity or altering its substrate specificity. In some cases, such as the EgD5D-36D, EgD5D-36S and EgD5D-158G mutants, the D5 desaturase activity can be improved. In order to fully understand the function of EgD5D, the roles of the HX(2,3)HH and (H/Q)X(2,3)HH motifs as well as the non-conserved amino acids need to be studied in the future.