Enhancing the cell-free expression of native membrane proteins by in-silico optimization of the coding sequence – an experimental study of the human voltage-dependent anion channel

The investigation of membrane proteins, key constituents of cells, is hampered by the difficulty and complexity of their in vitro synthesis, whose yield is unpredictable. Cell-free synthesis is employed herein to unravel the impact of the expression construct on gene transcription and translation, free of the complex regulatory mechanisms of cellular systems. Through the systematic design of plasmids in the immediate vicinity of the start of the target gene, it was possible to identify translation initiation and the conformation of mRNA as the main factors governing the cell-free expression efficiency of the human voltage-dependent anion channel (VDAC), a membrane protein relevant to drug-based therapy. A simple translation initiation model was developed to quantitatively assess the expression potential of the designed constructs. A scoring function is proposed that quantifies the feasibility of forming the translation initiation complex through the ribosome-mRNA hybridization energy and the accessibility of the mRNA segment that binds to the ribosome. The scoring function makes it possible to optimize plasmid sequences and to predict protein expression efficiencies semi-quantitatively.


The production of membrane proteins outside living cells circumvents many of the issues of in-cell synthesis. [1,2] Cell-free synthesis uses cell lysates to generate, in situ and from exogenous mRNA or DNA, correctly folded membrane proteins [3,4] that can be directly incorporated into artificial membranes. [5]

Yet cell-free and in-cell synthesis face a common challenge: in both, the design of the plasmid vector is crucial. This genetic construct carries the sequences of the transcription promoter, of the ribosomal binding site (RBS), and occasionally of translation enhancers, in addition to the target gene. [1,6] The sequence layout, particularly in the vicinity of the gene's initiation or start codon, has become central to cell-free protein expression, and yet it has not been fully exploited in optimizing constructs for protein expression. The coding region adjacent to the start codon remains untapped in both in-silico [7,8] and wet-bench design of constructs, and finding a working construct is to date mainly based on trial and error.

Herein we present a rational approach to the generation of constructs for the expression of wild-type human membrane proteins in prokaryotic cell-free systems that includes alterations in the coding sequence proximal to the start codon. As a relevant case example, we chose the human voltage-dependent anion channel, or VDAC, a small, 285-amino-acid-long protein

Cloning and purification of plasmids

Invitrogen, Thermo Fisher) was added to the mixture. Protein denaturation in the diluted samples was conducted at 70 °C for 10 min before electrophoresis.

Electrophoresis was conducted at a constant potential of 200 V for 45 minutes and imaged

RNA detection and quantitative PCR (qPCR)
Levels of RNA were measured with an ND-10000 Spectrophotometer (Nanodrop Technologies, Wilmington, USA) on RNA-isolated samples [14]. For qPCR, 650 ng of isolated RNA was reverse-transcribed into cDNA with the iScript™ Select cDNA synthesis kit and random primers (Bio-Rad, Hercules, USA). qPCR was performed in a 48-well, MiniOpticon Real-Time

Universal SYBR Green Supermix (Bio-Rad) was used to prepare the master mix for each primer.

The constructs are designed not only to reveal and assess the influence of expression modulators on protein expression efficiency, but also to determine their optimal location upstream and downstream of the initiation codon. The constructs were generated by recombining a PCR product into commercially available plasmids. The PCR product consists of a specific nucleotide sequence, or primer, and the VDAC-encoding sequence. By introducing self-designed primers, we are able to modify the nucleotide sequence in a controlled fashion and hence assess the effect of these modifications on protein expression.

The original pDEST17 plasmids provide sequences before (upstream) and after (downstream) the ATG codon in the untranslated and translated regions, 5'UTR and TR, respectively (figure 1a). The UTR is preceded by the T7 promoter and lodges a prokaryotic RBS in the form of a [...] VDAC sequences, which proves sufficient to confirm the primary structure of VDAC [19]. The

insertion of the chloramphenicol acetyltransferase (CAT)-enhancer sequence, [20] as in VDAC-I- [...] we thus employed the plasmid pDEST14 to gain better control over this region, while aiming to express native, tag-free VDAC at levels comparable to those attained through pDEST17-based constructs. pDEST14 provides the T7 promoter, as its pDEST17 counterpart does, but, unlike the latter, it allows the insertion of self-designed primers at desired locations upstream and downstream of the start codon. [...] Western Blot [18]. This evinces the enhancer role of the pre-VDAC sequence and the 5'UTR in pDEST17-based plasmids.

In view of these results, we directed our efforts toward investigating the role of the 5'UTR and the adjacent TR in protein expression. Starting at the Shine-Dalgarno (SD) sequence, we inserted the 5'UTR of the pDEST17 vector into pDEST14-based constructs at the same location. The resulting construct, VDAC-II-C, enables marginal protein expression, as evinced by the appearance of a weak band above 36 kDa (figure 2c) [21]. So does the construct VDAC-II-D, with the same first three-codon-

In contrast to eukaryotic expression systems, the prokaryotic machinery is not capable of clearing conformational elements of mRNA that may hamper the correct assembly of the ribosome and hence of the initiation complex [22,23]. Although the specificity of the interaction between ribosome and mRNA is mediated by hybridization of the SD sequence and strengthened by the coupling of the first transfer RNA (tRNA^Met) to the start codon, the whole initiation complex extends over a much longer nucleotide segment. This segment, or ribosome docking site (RDS), extends over 30 nucleotides downstream of the SD sequence [7]. Since SD sequences are usually positioned 5-13 nucleotides before the start codon [22], the RDS extends into the coding sequence. Based on this fact, we changed our strategy for improving constructs and opted for a quantitative approach. Inspired by the work of Na et al., [7] we [...] Figure 4a shows E_open as a function of the position of the SD sequence relative to the start [...] (Figure 4a). Figure 2c shows that VDAC-II-G enables protein expression to a degree comparable to that attained with enhancer-containing sequences.
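The overlap between the RDS and the coding sequence follows from simple arithmetic, sketched below. This is a minimal illustration and not the authors' model: it assumes that the RDS is counted from the nucleotide immediately after the 3' end of the SD sequence, and the function name `rds_coding_overlap` and its parameters are hypothetical.

```python
def rds_coding_overlap(sd_to_start: int, rds_length: int = 30) -> int:
    """Number of RDS nucleotides that fall inside the coding sequence.

    sd_to_start: nucleotides between the 3' end of the SD sequence and the
                 first nucleotide of the start codon (assumed convention).
    rds_length:  nucleotides the RDS extends downstream of the SD sequence
                 (30 in the text above).
    """
    if sd_to_start < 0 or rds_length < 0:
        raise ValueError("distances must be non-negative")
    # The spacer consumes the first part of the RDS; whatever remains
    # lies in the coding sequence, start codon included.
    return max(0, rds_length - sd_to_start)


if __name__ == "__main__":
    for spacing in (5, 13):
        overlap = rds_coding_overlap(spacing)
        print(f"spacing {spacing:2d} nt: {overlap} nt in the coding sequence, "
              f"{overlap // 3} complete codons")
```

Under this convention, spacings of 5 and 13 nucleotides leave 25 and 17 nucleotides of the RDS inside the coding sequence, that is, between five and eight complete codons counting the start codon.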

Cell-free protein synthesis is governed by the biochemical conditions and the template DNA sequence. The E. coli-based system used in this study requires high concentrations of phage T7 RNA polymerase and a surplus of rapidly degraded amino acids, such as arginine, cysteine, tryptophan, glutamate, aspartate, and methionine [24,25]. Though necessary, these conditions are not as crucial in protein expression as the mRNA sequence, or rather, the mRNA conformational structure. Sequence elements in the proximity of the start codon, either upstream or downstream, are known to significantly affect translation efficiency [26,27,28]. This implies not only finding the optimal location for the RBS [29], but also properly tailoring the whole RDS. Our findings are based on the design of several plasmids in which the sequence in the immediate vicinity of the start codon has been altered to accommodate the RBS and the gene of a membrane protein at varying distances upstream and downstream of the start codon, respectively.
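As an illustration of how the RBS position relative to the start codon could be checked in a construct, the sketch below scans the upstream region for the best match to an SD-like motif and reports its spacing. It is a simplified stand-in, not the procedure used in this study: the motif `AGGAGG` (the canonical E. coli SD core), the mismatch threshold, the spacing convention, and all names are assumptions.

```python
SD_CORE = "AGGAGG"  # assumed SD core motif, written in the DNA alphabet


def find_sd_spacing(upstream, motif=SD_CORE, min_match=4):
    """Locate the best SD-like window in the region preceding the start codon.

    upstream:  sequence ending immediately before the ATG start codon.
    min_match: minimum number of identical positions to accept a window.

    Returns (spacing, matches, window), where spacing is the number of
    nucleotides between the 3' end of the window and the start codon,
    or None if no window reaches min_match. Ties are resolved in favour
    of the window closest to the 5' end.
    """
    upstream = upstream.upper().replace("U", "T")
    k = len(motif)
    best = None
    for i in range(len(upstream) - k + 1):
        window = upstream[i:i + k]
        matches = sum(a == b for a, b in zip(window, motif))
        if matches >= min_match and (best is None or matches > best[1]):
            spacing = len(upstream) - (i + k)
            best = (spacing, matches, window)
    return best


if __name__ == "__main__":
    # Toy upstream region (not a sequence from this study).
    print(find_sd_spacing("CCCAGGAGGTTAACAT"))
```

A plain identity count is used here only for brevity; a hybridization free energy against the 3' end of the 16S rRNA would be the physically meaningful measure.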

The results so far indicate that the best strategy to elicit tag-free protein expression from constructs with off-the-shelf RBSs in prokaryotic cell-free expression systems entails proper engineering of the TR proximal to the initiation codon.
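One way to engineer the proximal TR without changing the encoded protein is to enumerate synonymous codon variants of the first few codons and then rank them with a scoring function of choice. The sketch below covers only the enumeration step. It assumes the standard genetic code and a fixed ATG start codon; the function names and the toy input sequence are hypothetical and are not taken from this study.

```python
from itertools import product

# Standard genetic code in the usual compact TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def synonymous_variants(cds, n_codons=3):
    """Yield every sequence that differs from cds only by synonymous
    substitutions in the n_codons codons following the start codon."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    head = codons[:1]                      # start codon, kept unchanged
    window = codons[1:1 + n_codons]        # codons allowed to vary
    tail = codons[1 + n_codons:]           # remainder, kept unchanged
    choices = [AA_TO_CODONS[CODON_TO_AA[c]] for c in window]
    for combo in product(*choices):
        yield "".join(head + list(combo) + tail)


if __name__ == "__main__":
    toy = "ATGGCTGTGCCA"  # toy sequence encoding M-A-V-P
    variants = list(synonymous_variants(toy, n_codons=3))
    print(len(variants), "variants, all encoding", translate(toy))
```

Each variant can then be scored for hybridization energy and mRNA accessibility, and the wild-type amino acid sequence is preserved by construction.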

Translation initiation in prokaryotes differs from that of eukaryotes in that it involves much less [...] Indeed, E_open at i = -11 is significantly lower for transcripts of the E. coli genome than for those of the human genome [30]. On the other hand, upregulation mechanisms for protein expression in prokaryotic cells are not present in cell-free systems, and may be responsible for the in-cell expression of recombinant VDAC from plasmids that do not elicit expression otherwise [31].

Hence, the mRNA sequence is crucial in the cell-free context. Since the ribosome footprint on the mRNA is larger than the RBS and extends well into the TR, a correspondingly long mRNA segment should be accessible for the ribosome to dock properly and initiate translation. It therefore makes sense to modify the mRNA sequence within the proximal TR so as

The current study demonstrates that prokaryotic cell-free expression of VDAC is determined by the mRNA sequence in the immediate vicinity of the start codon and its impact on translation initiation.

Provided the RBS is optimally placed, i.e., 11 nucleotides upstream of the ATG codon, the efficiency of