Construction, expression, and characterization of AG1 1-843 and AG1 1-1581

This data article contains descriptive and experimental data on the construction, expression, and basic characterization of AG1 1-843 and AG1 1-1581. AG1 is an important member of the DUF1220 protein family. The recombinant protein is difficult to obtain because its DNA sequence is highly repetitive. The DNA sequences were optimized by design, assembled by overlap PCR, and cloned into an expression vector. AG1 1-843 and AG1 1-1581 were overexpressed in Escherichia coli, purified, and analyzed by dynamic light scattering and gel filtration. The data provide an effective technique for constructing and expressing proteins with complicated sequences.

The data are available with the article. Related research article: Construction, expression, and characterization of AG1 1-843 and AG1  .

Value of the data
A method for amino acid and DNA sequence optimization, gene synthesis, and recombinant expression of proteins with complicated sequences is provided.
Sequence analysis and synonymous codon substitution were used for sequence optimization.
Overlap PCR was used for sequence synthesis.
The recombinant proteins AG1 1-843 and AG1 1-1581 were expressed and purified for further analysis.
The states of the two recombinant proteins in solution were analyzed by DLS and gel filtration.

Gene sequence analysis
The gene synthesis product in this study is the 1862 bp human AG1 gene (GenBank accession no. AF380580.1), which encodes the 615 amino acid DUF1220 AG1 protein fragment (http://www.uniprot.org/uniprot/Q8IX72). The AG1 gene and amino acid sequences are highly repetitive (Fig. 1). To increase the speed and efficiency of gene synthesis, we modified the AG1 nucleotide sequence without changing the amino acid sequence. This codon optimization uses the codons most frequently used in Escherichia coli to obtain high-level gene expression. Moreover, to improve the efficiency of gene transcription and RNA stability, the GC content of the synthetic gene was held at 52.9% (Tables 2 and 3).
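The two properties discussed above, that synonymous substitution leaves the encoded protein unchanged and that GC content can be tracked during redesign, can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' design pipeline: the function names and the `preferred` codon map are hypothetical, and the short sequence and codon choices in the usage lines are placeholders rather than the substitutions actually made in the AG1 gene.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(dna):
    """Split a coding sequence into codons; its length must be a multiple of 3."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(dna))


def gc_content(dna):
    """Return the percentage of G and C bases in the sequence."""
    dna = dna.upper()
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


def recode(dna, preferred):
    """Swap each codon for the preferred synonymous codon, where one is given.

    `preferred` maps an amino acid letter to a codon (hypothetical, user supplied).
    """
    out = []
    for codon in split_codons(dna):
        aa = CODON_TABLE[codon]
        new = preferred.get(aa, codon)
        if CODON_TABLE[new] != aa:
            raise ValueError(f"{new} does not encode {aa}")
        out.append(new)
    return "".join(out)


# Usage with a placeholder sequence and placeholder codon choices.
original = "ATGCTAGCAAAG"
optimized = recode(original, {"L": "CTG", "A": "GCG", "K": "AAA"})
assert translate(optimized) == translate(original)  # protein is unchanged
print(optimized, round(gc_content(optimized), 1))
```

Rejecting any non-synonymous entry in `preferred` keeps the amino acid sequence fixed by construction, which is the constraint the text places on the redesign.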
The design of the synthetic assembly oligonucleotides was similar to that of Xiong et al. [1], whereby each optimized DNA sequence was divided into economically sized oligonucleotides approximately 57-59 bases long with 17-19 overlapping bases at both the 5′ and 3′ ends, leaving a 21 base gap between the overlapping regions. In addition, two outer amplification primers containing different restriction enzyme sites were designed for each gene to facilitate cloning. Both the AG1 1-843 and AG1 1-1581 sequences contained 23 oligonucleotides. The oligonucleotides listed in Table 1 were obtained from Sangon Biotech (Shanghai) Co., Ltd.
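The tiling geometry described above (oligonucleotides of about 59 bases, 19 base overlaps at each end, a 21 base unpaired middle) can be sketched as follows. This is an illustrative reconstruction, not the authors' design tool: `design_oligos` and its parameters are hypothetical names, fixed lengths are used instead of the 57-59 and 17-19 ranges, and the template is a placeholder sequence, so the number of oligonucleotides it produces is not expected to match Table 1.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_oligos(seq, oligo_len=59, overlap=19):
    """Tile `seq` with oligonucleotides on alternating strands.

    Neighbouring oligos share `overlap` bases, so each full-length oligo has
    an unpaired middle of oligo_len - 2 * overlap bases (59 - 2 * 19 = 21).
    Returns (start, strand, oligo) tuples; the last oligo may be shorter.
    """
    if not 0 < overlap < oligo_len / 2:
        raise ValueError("overlap must be positive and less than half the oligo length")
    seq = seq.upper()
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        fragment = seq[start:start + oligo_len]
        # Even-numbered oligos are sense strand, odd-numbered are antisense.
        if len(oligos) % 2 == 0:
            oligos.append((start, "+", fragment))
        else:
            oligos.append((start, "-", reverse_complement(fragment)))
        if start + oligo_len >= len(seq):
            break
        start += step
    return oligos


# Usage with a placeholder 139-base template.
template = ("ACGTTGCA" * 18)[:139]
oligos = design_oligos(template)
for start, strand, oligo in oligos:
    print(start, strand, len(oligo))
```

Alternating strands means each overlap is a short double-stranded region that primes extension across the neighbouring gap, which is why the overlaps sit at both ends of every internal oligonucleotide.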

Rapid preparation of DUF1220 AG1 1-843
We used the single overlap extension method as well as the two-step successive PCR method to synthesize the duplicated DUF1220 AG1 gene. First, we mixed 14 (DUF1-DUF12, DUF-F, and Kpn1-R) and 12 (DUF13-DUF19, DUF-R, and Kpn1-F) chemically synthesized single-stranded oligonucleotides (1 μM) in separate reaction tubes, followed by hybridization and extension to form the long dsDNA in the presence of 1× Pfu buffer and 200 μM dNTP. The PCR conditions were 10 s at 90°C, 10 s at 60°C, and 50 s at 72°C for each cycle, followed by extension for 10 min at 72°C, unless stated otherwise. The AG1 1-450 and AG1 450-843 gene fragments were then cloned into a simple pMD19-T vector and sequenced. The pMD19-T-AG1 1-450 plasmid was then digested with EcoRI and KpnI, while pMD19-T-AG1 450-843 was digested with KpnI and XhoI, followed by separation on a 1% agarose gel. The digestion products (AG1 1-450 and AG1 450-843) were excised from the gel with a blade and purified with a purification kit (CoWin Biosciences) according to the manufacturer's instructions. The AG1 1-450 and AG1 450-843 gene fragments were then cloned together into a pET-15b-sumo vector, which carries 6×His and SUMO fusion tags. Molecular cloning of the synthesized DNA fragments was performed according to standard procedures [1].
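Directional cloning of this kind relies on each enzyme cutting only at the site introduced near the fragment end. The source does not describe how internal sites were screened, so the following Python sketch is an assumption about one way to check a designed fragment. The recognition sequences for EcoRI, KpnI, PstI, and XhoI are the standard ones; the function names, the `margin` parameter, and the example fragments are hypothetical.

```python
# Recognition sequences of the enzymes named in the text (all palindromic).
SITES = {
    "EcoRI": "GAATTC",
    "KpnI": "GGTACC",
    "PstI": "CTGCAG",
    "XhoI": "CTCGAG",
}


def find_sites(seq, enzymes=SITES):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    seq = seq.upper()
    hits = {}
    for name, site in enzymes.items():
        positions = []
        pos = seq.find(site)
        while pos != -1:
            positions.append(pos)
            pos = seq.find(site, pos + 1)
        if positions:
            hits[name] = positions
    return hits


def sites_only_at_ends(seq, left, right, margin=12):
    """Check that `left` cuts once near the 5' end, `right` cuts once near
    the 3' end, and no other listed enzyme cuts anywhere in the fragment."""
    hits = find_sites(seq)
    if set(hits) != {left, right}:
        return False
    if len(hits[left]) != 1 or len(hits[right]) != 1:
        return False
    right_limit = len(seq) - len(SITES[right]) - margin
    return hits[left][0] <= margin and hits[right][0] >= right_limit


# Usage with placeholder fragments (not AG1 sequences).
good = "GAATTC" + "ATGGCTAAAGACCTG" + "GGTACC"
bad = "GAATTC" + "ATGCTCGAGAAA" + "GGTACC"  # internal XhoI site
print(find_sites(good), sites_only_at_ends(good, "EcoRI", "KpnI"))
print(find_sites(bad), sites_only_at_ends(bad, "EcoRI", "KpnI"))
```

Because synonymous codon substitution leaves freedom in the nucleotide sequence, an internal site flagged by such a check could in principle be removed without changing the protein.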

High efficiency preparation of DUF1220 AG1 1-1581
Gene AG1 1-1581 is composed of two repeats of the AG1 1-843 fragment, meaning it can be built from the AG1 1-450 and AG1 450-843 fragments contained in the pET-15b-sumo-AG1 1-843 plasmid. First, the AG1 450-843 and AG1 1-450 fragments and the pET-15b-sumo-AG1 1-843 plasmid were used to assemble the templates for the third, fourth, and sixth PCR reactions. The two outermost oligonucleotide primers used were Kpn1-F and DUF1, DUF2 and DUF6, and DUF5 and DUF-R, respectively. Second, the DNA segments from the Kpn1-DUF1 and DUF2-DUF6 reactions were mixed and used to assemble the template for the fifth PCR reaction, which was carried out using the Kpn1-F and DUF6 oligonucleotides as the two outermost primers. All of the PCR reactions used 5 U Pfu polymerase and 200 μM dNTP and were performed with the following program: 98°C for 1 min, then 25 cycles of 10 s at 90°C, 10 s at 58°C, and 50 s at 72°C. Third, the AG1 1-450 fragment was digested with EcoRI and KpnI, while the AG1 450-917 and AG1 917-1581 fragments were digested with KpnI/PstI and PstI/XhoI, respectively. The digested products were then purified as described above. The three purified DNA fragments were mixed with the pET-15b-sumo vector, and the four DNA fragments were joined at their sticky ends by DNA ligase to form the recombinant plasmid (pET-15b-sumo-AG1 1-1581). The pET-15b-sumo-AG1 1-843 and pET-15b-sumo-AG1 1-1581 constructs were then verified by PCR, double enzyme digestion, and sequencing.
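The ligation step works because each fragment's right-hand sticky end matches only the left-hand end of the next fragment, which fixes the order of assembly. The Python sketch below illustrates that logic only, under simplifying assumptions: fragments are reduced to the pair of enzymes used to cut them, `assembly_order` is a hypothetical helper, and the vector is assumed to be opened with EcoRI and XhoI, which the text implies but does not state.

```python
def assembly_order(fragments, vector_left, vector_right):
    """Chain fragments so that each right-hand enzyme matches the next
    left-hand enzyme, starting and ending at the vector's two cut sites.

    fragments: dict mapping name -> (left_enzyme, right_enzyme)
    Returns the fragment names in ligation order, or raises ValueError.
    """
    by_left = {}
    for name, (left, right) in fragments.items():
        if left in by_left:
            raise ValueError(f"two fragments start with {left}; order is ambiguous")
        by_left[left] = (name, right)

    order = []
    end = vector_left
    while end in by_left:
        # Popping each entry guarantees termination even for circular input.
        name, end = by_left.pop(end)
        order.append(name)

    if end != vector_right or by_left:
        raise ValueError("fragments do not form a single chain that closes the vector")
    return order


# Usage: the three digests described in the text.
fragments = {
    "AG1 917-1581": ("PstI", "XhoI"),
    "AG1 1-450": ("EcoRI", "KpnI"),
    "AG1 450-917": ("KpnI", "PstI"),
}
order = assembly_order(fragments, vector_left="EcoRI", vector_right="XhoI")
print(order)
```

Using a different enzyme at every junction is what makes a four-piece ligation directional: no fragment can insert in reverse or out of order without leaving a mismatched end.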

Protein expression and purification
All proteins were expressed in E. coli BL21(DE3) cells. Cells were inoculated into 10 ml of LB containing 100 μg/ml of ampicillin. Cultures were grown with shaking at 200 rpm at 37°C until the absorbance at 600 nm (A600) was approximately 1.0. This starter culture was then inoculated into 1.5 L of the same LB medium and grown as above until A600 = 0.8-1. Then, 0.3 mM IPTG was added, and incubation was continued for 18 h at 18°C. Cells were then pelleted by centrifugation and resuspended in lysis buffer (20 mM Tris-HCl, pH 7.5, 1000 mM NaCl, 10% glycerol (v/v)). The cells were sonicated and then centrifuged at 12,000 rpm for 30 min. The samples were loaded onto a Ni²⁺-charged IMAC column (GE Healthcare), bound with 120 ml of lysis buffer, and washed with 240 ml of washing buffer (20 mM Tris-HCl, pH 7.5, 300 mM NaCl, 10% glycerol (v/v), 50 mM imidazole). Then, the protein was eluted from the Ni²⁺ affinity column with elution buffer (20 mM Tris-HCl, pH 7.5, 300 mM NaCl, 10% glycerol (v/v), 500 mM

Table 3 The characteristic constants of the AG1 1-843 and AG1 1-1581 recombinant proteins.