6
Overproduction, Separation and Purification of Affinity‐Tagged Proteins from Escherichia coli

Finbarr Hayes1 and Daniela Barillà2

1 Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9PL, UK

2 Department of Biology, University of York, York, YO10 5DD, UK

6.1 Introduction

Understanding protein function is a central goal of modern molecular biology. This aim typically is achieved using a combination of in vivo and in vitro approaches that together provide an integrated overview of protein features, interactions and regulation. The availability of a sufficient quantity of pure protein that can be isolated rapidly and efficiently is crucial for its biochemical, physical and structural characterisation in vitro. Moreover, in vitro studies of purified proteins are key to the development of drugs that affect proteins implicated in disease formation and progression, as well as to the optimisation of proteins produced for biotechnological, bioprocessing and other industrial and commercial purposes.

The Gram‐negative bacterium Escherichia coli remains the primary cell factory for protein production: approximately 90% of the tertiary structures that have been deposited in the Protein Data Bank have been generated from proteins that were purified from E. coli [1]. E. coli is especially suitable for protein production due to the bacterium's rapid growth kinetics in inexpensive cultivation media, high cell density, simplicity of handling and storage and ease of genetic manipulation [2]. E. coli has a doubling time of less than 30 minutes when growing maximally in low cost growth medium and under optimal conditions can achieve densities of more than 10^10 cells per millilitre of culture medium. Most laboratory strains of E. coli are considered to be non‐pathogenic, which makes handling of these strains relatively non‐hazardous. Moreover, these derivatives also have been engineered to be less fit genetically than wild‐type strains and therefore are less likely to survive should an inadvertent escape occur from the research environment. Nevertheless, monocultures of E. coli laboratory strains remain viable for extended periods at room temperature and survive for decades when stored correctly at −80 °C [3]. Finally, the sophisticated genetic manipulation systems that have been developed for E. coli over many years have made this bacterium the most tractable and well‐studied of all model organisms [4].

Prior to the advent of molecular gene cloning techniques, proteins were purified from the multitude of other proteins and cellular components in E. coli using a series of sequential fractionation and chromatographic separation techniques that utilised the biochemical and biophysical features of the target protein. For example, the LacI repressor was one of the first DNA binding proteins to be isolated and characterised [5]. In one instance, purification of the repressor began with 350 l of E. coli culture, which was harvested and then was lysed by freezing and thawing. This step was followed by a set of enzymatic treatment, fractionation, ammonium sulphate precipitation, dialysis and chromatography steps that resulted in 350–1000 mg of purified, functional LacI protein [6]. In another example, an acetyltransferase that confers resistance to the antibiotic chloramphenicol was purified from 18 l of Staphylococcus aureus culture by successive steps of lysis, centrifugation, ammonium sulphate precipitation, dialysis, ion‐exchange chromatography, ultrafiltration under nitrogen, gel filtration and concentration [7]. These laborious and time‐consuming purification procedures, although still necessary in certain cases, largely have been supplanted by the use of affinity tags when purifying recombinant proteins from E. coli. Indeed, the use of tags to purify proteins from E. coli by affinity chromatography has revolutionised the ease with which proteins can be isolated, studied and manipulated [8].


Figure 6.1 Overproduction and purification of His6‐tagged proteins. (a) The gene for the target protein (blue) is cloned in a plasmid expression vector so that the locus is fused to a sequence that encodes a His6 affinity tag (yellow). Here the tag is at the 5′ end of the gene but may also be placed at the 3′ end depending on the choice of vector and target gene. The fusion is under the control of a promoter (bent red arrow) that is recognised by the RNA polymerase encoded by bacteriophage T7 (red). The gene for the latter is engineered into the E. coli chromosome and is under the control of the lac promoter (bent black arrow). This promoter is blocked by binding of the LacI repressor protein (white hexagon). Thus, the gene for T7 RNA polymerase is not expressed, T7 RNA polymerase is not produced and the target gene is not expressed. (b) Binding of IPTG (magenta) alters the conformation of the LacI repressor which no longer binds to the lac promoter. The native RNA polymerase in E. coli transcribes the gene for T7 RNA polymerase which binds to the promoter upstream of the target gene fusion, which then is expressed to high levels. The His6 affinity tag allows specific capture of the target protein on a nickel matrix (green) to which other cellular proteins and components (not shown) do not bind. The matrix is washed to remove impurities and the target protein is eluted by altering the conditions on the matrix. Specifically, a solution with a high concentration of imidazole is passed over the column. Imidazole competes with the His6‐tagged protein for the nickel matrix thereby displacing the protein. The eluted protein is transferred to appropriate buffer conditions for assay and storage, and protein purity is assessed by SDS‐PAGE. The identity of the protein may be verified by mass spectrometry.

Current techniques for protein purification typically involve, first, cloning of the cognate gene in a plasmid expression vector so that an affinity tag is fused to the target gene. The reader is referred to Chapter 4 for an in‐depth description of molecular cloning strategies. The tagged gene is inserted downstream of a strong, regulatable promoter on the vector (Figure 6.1a). Controlled overexpression of the cloned, tagged gene is achieved by addition of a small molecule inducer that directly or indirectly activates the regulatable promoter. Induction is sufficiently potent that the target protein is highly overproduced and often is the principal protein observed when cell lysates are analysed by sodium dodecyl sulphate‐polyacrylamide gel electrophoresis (SDS‐PAGE). The lysate is passed over a matrix for which the tagged, target protein has high affinity, whereas the plethora of other proteins and cellular material in the lysate passes unhindered through the column (Figure 6.1b). Thus, the affinity tag permits single‐step enrichment of the target protein. As the target protein is bound non‐covalently to the affinity matrix, the protein is released from the column by addition of an elution buffer that alters the conditions on the column. The eluted target protein often is >90% pure at this stage and may be suitable directly for functional analysis or characterisation (Figure 6.1b). If a higher degree of purity is required, one or more gel filtration or other separation techniques may be used as outlined briefly below (also see Chapter 7).

6.2 Selecting an Affinity Tag: Glutathione‐S‐Transferase, Maltose‐Binding Protein and Hexa‐Histidine Motifs

Affinity purification of tagged proteins typically begins with the overexpression of the cognate gene that is cloned under the control of a regulatable promoter in a plasmid expression vector. The cloning strategy involves the genetic tagging of the target protein with an affinity moiety that markedly eases subsequent purification on a matrix that non‐covalently captures the tagged protein [9,10]. Numerous affinity tags may be used, including glutathione‐S‐transferase (GST), maltose‐binding protein (MBP) and hexa‐histidine motifs (His6), among others [11,12]. Fusion proteins with these tags are purified by affinity chromatography using glutathione‐sepharose, amylose and Ni2+ matrices, respectively. The use of His6‐tags is particularly widespread for protein capture from E. coli as many proteins tolerate the attachment of a string of six histidine residues (approx. 1 kDa) at the N‐terminus and/or C‐terminus without perturbation of protein function. GST (26 kDa) and MBP (42.5 kDa) are bulkier entities but are thought to fold independently of proteins to which they are fused. MBP is a highly soluble protein that has been used extensively to improve the solubility of fusion proteins [13,14]. This approach can be especially useful for target proteins that are prone to aggregation when overproduced. For example, elastin‐like polypeptides are intrinsically disordered peptides that are highly susceptible to aggregation above defined transition temperatures. These polypeptides were purified more readily as MBP fusions from E. coli. An intervening protease site allowed cleavage and subsequent separation of the functional polypeptides from the MBP‐tag [15]. In another example, murine leukaemia inhibitory factor was purified as an MBP fusion protein. The fusion protein was bioactive, demonstrating that the MBP‐tag did not block the functionality of the target protein [16]. The mechanism by which MBP enhances the solubility of proteins to which it is fused is uncertain [17].
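The tag masses quoted above give a quick estimate of where a fusion protein should migrate on SDS‐PAGE. The following is a minimal sketch of that arithmetic only; the function name is illustrative, and linker or protease‐site residues are ignored, so the values are approximate.

```python
# Approximate tag masses in kDa, as quoted in the text above.
TAG_MASS_KDA = {"His6": 1.0, "GST": 26.0, "MBP": 42.5}


def fusion_mass_kda(target_kda: float, tag: str) -> float:
    """Return the approximate mass (kDa) of a tag-target fusion.

    Linker and protease-site residues are ignored, so this is only a
    rough guide to the expected band position on SDS-PAGE.
    """
    return target_kda + TAG_MASS_KDA[tag]


if __name__ == "__main__":
    # 22 kDa is the size given for ParF in Section 6.4.
    for tag in TAG_MASS_KDA:
        mass = fusion_mass_kda(22.0, tag)
        print(f"{tag}-tagged 22 kDa target: approx. {mass:.1f} kDa")
```

For a 22 kDa target such as ParF (Section 6.4), the His6 fusion is estimated at approx. 23 kDa, whereas the GST and MBP fusions are two to three times the size of the untagged protein, which is worth bearing in mind when choosing gel percentages.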
Analogously, GST may improve the solubility of proteins to which it is fused and which otherwise are recalcitrant to purification. GST has a high affinity for the reduced form of glutathione (GSH). Fusions between GST and the target protein of interest can be readily immobilised on sepharose beads to which GSH is covalently attached. Other proteins and cellular materials pass through the column. The subsequent addition of an excess of free GSH displaces the fusion protein from the matrix. The fusion protein often is sufficiently pure at this stage that it is suitable for further characterisation. Moreover, the target protein may be liberated from the fusion using a protease that cleaves a linker sequence that is engineered between GST and the target protein [18]. Protease cleavage sites often are designed between the target and tag moieties to provide the option of removing the latter from the fusion protein [19]. An example of the production and purification of a GST fusion protein is provided below.

Purification of proteins with His6‐tags is achieved by immobilised metal ion affinity chromatography (IMAC). This technique uses a chelating matrix that is impregnated with soft metal ions such as Cu2+ or Co2+, but most commonly Ni2+ (Figure 6.1b). Protein surfaces contain electron‐donating groups, notably the imidazole side chain of histidine, which specifically recognise the non‐coordinated sites of the metal ions with high affinity. Thus, proteins that are rich in histidine residues are non‐covalently immobilised on matrices that contain Ni2+ ions. However, the interaction between the histidine side chains and the metal can be reversed by washing with a buffer that contains a high concentration of free imidazole, thereby eluting the protein (Figure 6.1b). Imidazole competes effectively with histidine for the nickel matrix as histidine possesses an imidazole side chain (Figure 6.2). In short, a protein with electron‐donating groups, in particular histidine, can be purified by reversible interactions with a metal complex [20,21].

Skeletal formulas of histidine (left) and imidazole (right).

Figure 6.2 Illustration of the imidazole side chain (right) in histidine (left), which, as a canonical amino acid, additionally possesses α‐amino and carboxylic acid groups.


Figure 6.3 An example of target gene cloning in an expression vector. (a) Example primers for PCR amplification of a target gene. The forward primer anneals to the 5′ end of the target gene via residues N1–N15 that represent the five codons that follow the ATG start codon of the gene. The eight bases at the 5′ end generate a tail that allows the NdeI restriction enzyme to bind stably to its recognition sequence in the final PCR product. NdeI cleaves the sequence 5′‐CATATG‐3′ that fortuitously includes an ATG start codon. The reverse primer anneals to the target gene via residues N16–N30 that represent the five codons at the 3′ end of the gene. The lower case letters indicate the translational stop codon. The six bases at the 5′ end of the reverse primer generate a tail that allows the XhoI enzyme to bind stably to its recognition sequence in the final PCR product. (b) The forward and reverse primers are used to amplify the target gene from an appropriate template, e.g. genomic DNA or a plasmid that harbours the gene. The PCR product is digested with NdeI‐XhoI and ligated into an expression vector that is digested with the same enzymes. The resultant recombinant plasmid contains an in‐frame translational fusion of the cloned gene and the desired tag.

As most proteins do not contain a sufficient number of contiguous histidine residues for efficient IMAC, His6‐tags are engineered at the C‐ or N‐terminus of the target protein by cloning the corresponding gene in‐frame with six consecutive histidine codons in specialised expression vectors. The correct frame is achieved by using the polymerase chain reaction (PCR) and specially designed primers (Figure 6.3a), which generate a product that, when cleaved with appropriate restriction enzymes, is inserted into the expression vector that is digested with the same enzymes (Figure 6.3b). The initial His6‐tag vectors [22] were improved by the addition of expression signals derived from bacteriophage T7 [23] to produce a series of plasmid derivatives (pET vectors) and host strains that are used widely for protein production in E. coli. The host strains were engineered to include a chromosomal gene for T7 RNA polymerase. This polymerase initiates transcription from a distinctive promoter sequence introduced into the pET vectors that is not recognised by E. coli RNA polymerase (Figure 6.4). Thus, the use of T7 expression signals allows for tightly regulatable and specific expression of the target gene cloned downstream of the T7 promoter. The pET vector series is available under the Novagen brand from Merck Biosciences/EMD Biosciences, Inc.
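The primer layout described in Figure 6.3a can be sketched in a few lines of Python. This is an illustrative sketch rather than a validated design tool: the 5′ tail bases are arbitrary placeholders (the actual bases in the figure are not reproduced here), TAA is an assumed choice of stop codon, and melting temperature and secondary structure are not considered. The reading frame across the cloning sites should always be checked against the vector map.

```python
NDEI_SITE = "CATATG"  # NdeI recognition sequence; includes the ATG start codon
XHOI_SITE = "CTCGAG"  # XhoI recognition sequence
STOP_CODONS = {"TAA", "TAG", "TGA"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(orf, fwd_tail="GCGCGCGC", rev_tail="GCGCGC",
                   include_stop=True, anneal_len=15):
    """Return (forward, reverse) primers, both written 5' to 3'.

    orf          -- coding sequence starting at ATG; a native stop codon,
                    if present, is stripped before the primers are built
    fwd_tail     -- 8-base 5' tail (placeholder bases, an assumption)
    rev_tail     -- 6-base 5' tail (placeholder bases, an assumption)
    include_stop -- True when translation must end at the XhoI site
                    (N-terminal tag only); False for a C-terminal tag
    anneal_len   -- number of gene-specific bases (15 = five codons)
    """
    orf = orf.upper()
    if not orf.startswith("ATG") or len(orf) % 3:
        raise ValueError("ORF must start with ATG and contain whole codons")
    if orf[-3:] in STOP_CODONS:
        orf = orf[:-3]
    if len(orf) < 3 + anneal_len:
        raise ValueError("ORF is too short for the requested annealing length")
    for site in (NDEI_SITE, XHOI_SITE):
        if site in orf:
            raise ValueError(f"ORF contains an internal {site} site")

    # Forward primer: tail + CATATG + the bases that follow the start codon.
    forward = fwd_tail + NDEI_SITE + orf[3:3 + anneal_len]

    # Reverse primer: tail + CTCGAG + stop codon (lower case, as in the
    # figure) + reverse complement of the last codons of the gene.
    stop = reverse_complement("TAA").lower() if include_stop else ""
    reverse = rev_tail + XHOI_SITE + stop + reverse_complement(orf[-anneal_len:])
    return forward, reverse
```

Calling `design_primers(orf)` yields primers for an N‐terminal tag, whereas `design_primers(orf, include_stop=False)` omits the stop codon so that translation can continue into vector‐encoded histidine codons. The check for internal NdeI and XhoI sites is included because a gene that contains either site would be cut during digestion of the PCR product.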


Figure 6.4 Organisation of an example expression vector. The pET33b(+) vector (Novagen) facilitates the overproduction of proteins with N‐ or C‐terminal His6‐tags. The former may be cleaved from the purified protein. A different tag also may be included that allows detection by antibodies that recognise the tag. The vector includes the pMB1 plasmid replication origin (red arc) which allows replication at a moderate copy number in E. coli. The replication origin of bacteriophage f1 (grey arc) permits the production of single‐stranded DNA species that are useful for sequencing purposes, although this feature has been superseded by routine sequencing of double‐stranded DNA. A gene for kanamycin resistance (green arrow) allows plasmid selection. The lacI gene (blue arrow) encodes the LacI repressor protein that binds to lac operator sites both on the plasmid (boxed) and upstream of a chromosomally located gene for T7 RNA polymerase. Thus, leaky expression of genes cloned downstream of the T7 promoter in pET33b(+) is repressed and expression of the chromosomal gene for T7 RNA polymerase is also blocked. Repression is relieved by addition of IPTG to the growth medium. IPTG alters the conformation of the Lac repressor, which no longer binds to the plasmid or chromosomal lac operator sites that thereby allow expression of the gene for T7 RNA polymerase. The polymerase binds to the T7 promoter (red arrow) and induces expression of cloned downstream genes. The DNA sequence above the pET33b(+) vector map illustrates additional salient features involved in the expression of cloned genes. The ribosome binding site (RBS) and T7 transcriptional terminator are boxed. His6 affinity tags are indicated by horizontal black lines. The T7 tag (horizontal red line) is an 11 amino acid sequence from T7 bacteriophage gene 10 that permits detection of fusion proteins by using antibodies against the tag. Selected restriction enzyme sites are marked.
These sites are used to clone the gene of interest by different strategies. For example, insertion of a target gene in‐frame between the NdeI and XhoI sites produces a fusion protein that harbours a His6‐tag at the N‐terminus. The tag allows affinity purification of the corresponding fusion protein. The His6‐tag subsequently may be removed from the protein by digestion with thrombin serine protease that cleaves at the site indicated by the vertical arrow (see Figure 6.7). In this example, cloning into the XhoI site introduces a stop codon that avoids addition of a His6‐tag at the C‐terminus of the fusion protein. In another example, the target gene is cloned between NcoI and XhoI sites so that the target protein is produced with a non‐cleavable C‐terminal His6‐tag. In contrast, use of the BamHI or nearby sites allows for a gene fusion that includes the gene 10 tag.

An extensive array of expression vectors in addition to the pET series is available for production of recombinant proteins in E. coli [11]. These vectors include, for example, plasmids that replicate using diverse replicons that provide different copy numbers and with different antibiotic resistance markers [24–26]. Other expression vectors employ arabinose‐responsive promoters [27,28] or permit protein fusions with GST, MBP and other solubility tags [29–34]. Expression vectors also have been developed that, for example, utilise recombination‐based cloning techniques [35,36] or ligation‐independent cloning strategies [37]. General approaches for molecular cloning are described in Chapter 4.

6.3 The pET Vector Series: Archetypal Expression Vectors in E. coli

As the pET series is among the most commonly used expression vectors and illustrates key principles that are used for high‐level gene expression in E. coli, the features of a typical pET vector are presented here. The pET33b(+) vector encodes kanamycin‐resistance to allow selection of E. coli transformants (Figure 6.4). Other pET vectors possess different resistance genes, which permit the cotransformation of more than one plasmid with different markers into the expression host and/or the use of E. coli hosts that already carry a chromosomal resistance marker similar to that encoded by one of the vectors. Replication of pET33b(+) occurs via the pMB1 origin which provides a copy number of approx. 15–20 plasmids per cell. In practice this means that the vector is easy to isolate from small volumes of E. coli K‐12 laboratory strains using standard plasmid purification procedures. The plasmid, like many other pET vectors, also contains the replication origin of bacteriophage f1. This element allows the production of a single‐stranded form of the plasmid when E. coli is infected with the appropriate helper bacteriophage. Single‐stranded DNA can be used for the sequencing of cloned genes although this feature largely has been superseded by routine sequencing of double‐stranded plasmid DNA [38].

The lacI gene on pET33b(+) encodes the LacI transcriptional repressor (Figure 6.4). This protein recognises the lac operator site on the plasmid which prevents leaky expression of genes inserted downstream of the T7 promoter. LacI also binds upstream of a chromosomally‐located gene for T7 RNA polymerase in E. coli BL21‐type strains thereby blocking production of this polymerase. Repression is relieved by addition of isopropyl β‐D‐1‐thiogalactopyranoside (IPTG) to the growth medium. IPTG is a non‐hydrolysable analogue of allolactose which is the natural inducer of the lac operon in E. coli. IPTG binds to and alters the conformation of the LacI repressor which consequently no longer interacts with either the plasmid or chromosomal lac operator sites. Therefore, the chromosomal gene for T7 RNA polymerase is expressed, and the polymerase binds to the T7 promoter on pET33b(+) and induces expression of the cloned downstream gene (Figure 6.1).
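The two layers of LacI control described above can be summarised as a small piece of Boolean logic. The sketch below is deliberately idealised: it treats repression as all‐or‐nothing and therefore ignores the leaky basal expression that occurs in practice. The function and parameter names are illustrative.

```python
def t7_polymerase_made(iptg_present: bool, host_has_t7_gene: bool = True) -> bool:
    """T7 RNA polymerase is made only if the host carries the gene and
    LacI has been released from the chromosomal lac operator by IPTG."""
    laci_on_chromosomal_operator = not iptg_present
    return host_has_t7_gene and not laci_on_chromosomal_operator


def target_gene_expressed(iptg_present: bool, host_has_t7_gene: bool = True) -> bool:
    """The cloned gene is transcribed only if T7 RNA polymerase is present
    and LacI has also been released from the plasmid lac operator."""
    laci_on_plasmid_operator = not iptg_present
    polymerase = t7_polymerase_made(iptg_present, host_has_t7_gene)
    return polymerase and not laci_on_plasmid_operator


if __name__ == "__main__":
    for iptg in (False, True):
        for t7_host in (False, True):
            state = "ON" if target_gene_expressed(iptg, t7_host) else "off"
            print(f"IPTG={iptg!s:5}  T7 RNAP gene in host={t7_host!s:5}  target gene {state}")
```

The only combination that switches the target gene on is IPTG together with a host that carries the T7 RNA polymerase gene, which is why the expression plasmid remains silent in a cloning strain that lacks this gene.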

The His6‐tag may be engineered either at the amino‐ or carboxy‐terminal end of the target protein in the case of the pET33b(+) vector (Figure 6.4). For the former the target gene is cloned between the NdeI and XhoI restriction enzyme sites with a translation stop codon placed after the XhoI site. The stop codon is introduced during PCR amplification of the target gene (Figure 6.3). This strategy results in a target protein with a 25 residue peptide at the amino‐terminus (MGSSHHHHHHSSGLVPR↓GSRRASVH). If required, 17 of these amino acids, including the His6‐tag, may be removed from the purified protein by treatment with the thrombin protease that cleaves between the arginine and glycine residues in the sequence Leu‐Val‐Pro‐Arg‐Gly‐Ser as outlined below [19,39]. If the His6‐tag is required at the carboxy‐terminal end of the target protein, the cognate gene is cloned between the NcoI restriction enzyme site and any of the sites that are located between BamHI and XhoI in pET33b(+) (Figure 6.4). The XhoI site is used frequently as it ensures that all of the intervening sequences between NcoI and XhoI, including the 25 codons that specify the amino‐terminal His6‐tag, thrombin cleavage site and the T7 tag, are removed. The T7 tag is an 11 amino acid motif derived from T7 bacteriophage gene 10 which allows detection in Western blots of fusion proteins with an amino‐terminal His6‐tag by using commercially available antibodies against the T7 tag. However, anti‐His‐tag antibodies also are available which allow the detection of fusion proteins with His6‐tags irrespective of whether the tag is located at the amino‐ or carboxy‐terminus. Note that a stop codon is not introduced after the XhoI site when a carboxy‐terminal His6‐tag is required. Instead the stop codon at the 3′ end of the six histidine codons will terminate translation.
Because the T7 promoter drives very effective gene expression [40], the pET33b(+) vector contains a strong transcriptional terminator that inhibits undesirable read‐through by T7 RNA polymerase into other vector sequences (Figure 6.4).
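The residue counts given above for the amino‐terminal tag of pET33b(+) can be checked with a short script. The sketch below simply locates the Leu‐Val‐Pro‐Arg‐Gly‐Ser motif and cuts between Arg and Gly; it does not model thrombin specificity more generally, and the function name and the example target sequence are illustrative.

```python
N_TERMINAL_TAG = "MGSSHHHHHHSSGLVPRGSRRASVH"  # 25-residue leader from the text
THROMBIN_SITE = "LVPRGS"                      # cleaved between R and G


def thrombin_cleave(protein: str):
    """Split a protein at the first LVPR|GS site.

    Returns (released_peptide, retained_protein). Raises ValueError if
    the recognition motif is absent.
    """
    start = protein.find(THROMBIN_SITE)
    if start == -1:
        raise ValueError("no thrombin recognition site found")
    cut = start + 4  # position immediately after the arginine
    return protein[:cut], protein[cut:]


if __name__ == "__main__":
    fusion = N_TERMINAL_TAG + "TARGET"  # 'TARGET' stands in for a real sequence
    released, retained = thrombin_cleave(fusion)
    print(f"released ({len(released)} residues): {released}")
    print(f"retained: {retained}")
```

With the 25‐residue leader, the released peptide is the 17‐residue MGSSHHHHHHSSGLVPR, and eight vector‐derived residues (GSRRASVH) remain at the amino‐terminus of the target protein after cleavage.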

6.4 IMAC of a His6‐Tagged Protein: Example Methodology with the ParF DNA Segregation Protein

The ParF protein (22 kDa) that is encoded by a bacterial multidrug resistance plasmid provides a valuable illustration of the step‐by‐step procedures for purification of a His6‐tagged protein in E. coli. ParF is representative of a widespread class of ATPases that mediate the stable segregation of plasmids during bacterial cell division [41]. The function of these and analogous NTPases is vital for the maintenance of antibiotic resistance and other plasmids in bacterial populations [42]. The parF gene was cloned as a PCR product between NdeI and XhoI sites in the pET22b(+) vector (Figure 6.3), thereby generating a protein fusion in which the His6‐tag is situated at the carboxy‐terminus of ParF [43].

6.4.1 Testing for ParF Induction and Solubility

Different proteins exhibit different characteristics when overproduced in E. coli. Therefore, it is crucial to determine the optimal expression conditions when a target protein is to be produced for the first time. For example, preliminary, small‐scale experiments initially were performed in which the optimal concentration of IPTG and the most appropriate time and temperature for parF induction were assessed. IPTG concentrations of 0.1–5 mM and induction times from one to five hours were tested, as well as expression at 30 and 37 °C. Induction at lower temperatures also may be tested as the formation of insoluble inclusion bodies (see below) may be ameliorated by the reduction in the rate of protein synthesis that occurs when the post‐induction temperature is lowered [44]. Trials for parF expression were conducted in E. coli BL21(DE3), which harbours the gene for T7 RNA polymerase under control of a modified lac promoter. E. coli BL21(DE3) and its derivatives are widely used strains for induction with the pET vector series. Among these derivatives are BL21(DE3) strains that contain the pLysS plasmid. This plasmid constitutively produces low levels of T7 lysozyme, which reduces the expression of recombinant genes cloned in pET22b(+) and analogous vectors by inhibiting basal levels of T7 RNA polymerase. The use of expression strains that are pre‐transformed with pLysS is especially useful when expressing recombinant genes whose products are toxic in E. coli as induction of these genes is repressed more effectively than without pLysS. Analysis of induced samples by SDS‐PAGE revealed that induction in the BL21(DE3) strain with 1 mM IPTG for three hours at 30 °C produced the highest concentration of the approx. 23 kDa ParF‐His6 protein from the pET22b(+)‐parF plasmid.
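When planning induction trials of this kind it can help to enumerate the full grid of conditions in advance. The sketch below does so with `itertools.product`. The ranges (0.1–5 mM IPTG, one to five hours, 30 and 37 °C) are those stated above, but the individual IPTG levels and hourly time points are assumptions chosen for illustration, as the intermediate values that were tested are not specified.

```python
from itertools import product

# Ranges are taken from the text; the individual levels are assumed.
IPTG_MM = (0.1, 0.5, 1.0, 5.0)
HOURS = (1, 2, 3, 4, 5)
TEMPS_C = (30, 37)


def induction_trials(iptg=IPTG_MM, hours=HOURS, temps=TEMPS_C):
    """Return one dict per (temperature, IPTG, time) combination."""
    return [
        {"temp_C": t, "iptg_mM": c, "hours": h}
        for t, c, h in product(temps, iptg, hours)
    ]


def cultures_needed(iptg=IPTG_MM, temps=TEMPS_C):
    """Time points can be sampled from one flask, so only the
    temperature x IPTG combinations need separate cultures."""
    return len(iptg) * len(temps)


if __name__ == "__main__":
    trials = induction_trials()
    print(f"{len(trials)} samples from {cultures_needed()} cultures")
```

With these assumed levels the grid contains 40 samples for SDS‐PAGE but requires only eight cultures, because the time course for each combination of temperature and IPTG concentration can be sampled from a single flask.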

Proteins vary in their solubility following overproduction. Indeed it has been estimated that up to 30% of recombinant proteins are insoluble when overproduced in E. coli [45]. These proteins typically form inclusion bodies, which are densely packed, denatured, insoluble protein aggregates [46]. Although proteins can be recovered from inclusion bodies by denaturation and refolding of pelleted material in the presence of denaturing agents such as urea or guanidine hydrochloride [47], it is more convenient if the majority of the overproduced protein remains in the soluble (supernatant) fraction of a cell extract. If a target protein is insoluble under one set of induction conditions, different induction temperatures, growth media, inducer concentrations and induction times can be tested. Insertion and testing of the target gene in different expression vectors with different affinity tags also may be helpful, as may the co‐expression of chaperone proteins that assist in folding of the target protein [48,49]. Note that even if a protein is principally in the soluble fraction when overproduced, mutated versions of the same protein may behave differently and the solubility of these mutated proteins should be assessed independently. The following is a step‐by‐step protocol that was used to determine ParF‐His6 solubility.

  1. E. coli BL21(DE3) was transformed with approx. 0.1 μg of the pET22b(+)‐parF expression plasmid and the transformation mix was plated on Luria–Bertani (LB) agar plates containing ampicillin (100 μg/ml) for plasmid selection. Plates were incubated at 37 °C for 12–16 hours.
  2. Three colonies from the transformation plate were inoculated in 10 ml of LB broth with ampicillin in a 125 ml conical flask.
  3. Cultures were grown at 30 °C with shaking at 180 rpm until OD600 was 0.65–0.75 (approx. three hours).
  4. A 100 μl aliquot of the culture was removed to a 1.5 ml microcentrifuge tube and centrifuged at approx. 10 000g for one minute. The supernatant was discarded and the cell pellet was resuspended in 20 μl SDS‐PAGE loading buffer (50 mM Tris‐HCl, pH 6.8, 100 mM dithiothreitol (DTT), 2% SDS, 0.1% bromophenol blue, 10% glycerol).
  5. The sample was denatured by boiling for three minutes and was stored at −20 °C. This sample is the uninduced control for later SDS‐PAGE analysis.
  6. The remaining culture was induced with 1 mM IPTG at 30 °C, shaking at 180 rpm for three hours.
  7. A 100 μl aliquot was removed and treated as in step 4 to check by SDS‐PAGE for proper induction. This sample is the induced control.
  8. The remaining induced culture was harvested by centrifugation at approx. 5000g for five minutes at 4 °C. The supernatant was discarded and the cell pellet was analysed immediately or was stored at −80 °C for subsequent use.
  9. The pellet was thawed at room temperature and resuspended in 400 μl of binding buffer (20 mM Tris‐HCl, pH 7.9, 500 mM NaCl, 5 mM imidazole, 10% glycerol).
  10. 10 μl of lysozyme (10 mg/ml) and 10 μl of phenylmethylsulphonyl fluoride (100 mM) were added to assist cell lysis and inhibit protease activity, respectively. The sample was incubated at 30 °C for 15 minutes.
  11. The sample was sonicated on ice six times for 15 seconds each with pause intervals of 30 seconds. Sonication causes cell lysis and also shears DNA, which aids in subsequent handling of the lysate. Pause intervals between sonication bursts avoid overheating of the sample.
  12. The lysate was centrifuged at approx. 4000g for 60 minutes at 4 °C.
  13. The supernatant was decanted and the pellet was resuspended in 400 μl of binding buffer.
  14. 5 μl of 4× SDS‐PAGE loading buffer were added to both 15 μl supernatant and 15 μl of pellet, and both fractions were denatured by boiling for three minutes.
  15. The uninduced (step 5) and induced (step 7) controls as well as the supernatant and pellet fractions (step 14) were analysed by SDS‐PAGE. The relative amounts of the 23 kDa induced protein in the supernatant and pellet fractions were inspected to assess the solubility of the overproduced ParF‐His6 protein. In this case, >90% of the protein was detectable in the supernatant fraction, indicating that ParF is highly soluble under these induction conditions.
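Several steps in the protocol above involve adding a small volume of a concentrated stock to a sample. The sketch below calculates the resulting working concentrations from the volumes stated in steps 9, 10 and 14. It assumes that volumes are additive and ignores the volume of the cell pellet itself, so the figures are approximate.

```python
def final_concentration(stock_conc: float, stock_vol: float, other_vol: float) -> float:
    """Concentration after adding stock_vol of stock to other_vol of sample.

    The result has the same units as stock_conc; the two volumes must
    share a unit (microlitres here).
    """
    return stock_conc * stock_vol / (stock_vol + other_vol)


if __name__ == "__main__":
    # Step 14: 5 ul of 4x loading buffer added to 15 ul of sample.
    print(f"loading buffer: {final_concentration(4, 5, 15):.1f}x")

    # Step 10: 10 ul lysozyme (10 mg/ml) and 10 ul PMSF (100 mM) added to
    # 400 ul of resuspended cells, giving approx. 420 ul in total.
    print(f"lysozyme: {final_concentration(10, 10, 410):.2f} mg/ml")
    print(f"PMSF: {final_concentration(100, 10, 410):.1f} mM")
```

Under these assumptions the loading buffer is at 1× in the gel sample, and lysozyme and PMSF are present at approx. 0.24 mg/ml and 2.4 mM, respectively, during lysis.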

6.4.2 Identifying the Correct Imidazole Concentration for ParF Protein Elution

Different proteins elute at different concentrations of imidazole during IMAC. To avoid perturbation of the protein conformation or quaternary structure, the lowest imidazole concentration possible should be used during elution. To identify this concentration, the protein first can be purified from a 10 ml culture using an imidazole gradient to ascertain how much imidazole should be used in the wash buffer and in the elution buffer during larger‐scale purifications. To determine the most appropriate imidazole concentration for ParF‐His6 elution from an Ni2+ column, steps 1 to 12 in the preceding induction protocol were followed to produce a cleared E. coli lysate containing overproduced ParF‐His6. This lysate was then treated as follows.

  1. A 15 μl aliquot of the cleared lysate was removed to a microcentrifuge tube, 5 μl of 2× SDS‐PAGE loading buffer were added and the sample was stored at –20 °C for later use. This sample was the induced, cleared extract control.
  2. 200 μl settled bed volumes of commercially available Ni2+ resin can be handled in a 1.5 ml microcentrifuge tube. Non‐stick tubes were used so that the resin did not adhere to the tube walls. The following steps were done at 4 °C, although it may not be strictly necessary here as protein activity was not a concern during these trials.
  3. 400 μl of Ni2+ resin slurry were transferred to a 1.5 ml non‐stick microcentrifuge tube, which then was centrifuged at approx. 500g at 4 °C for five minutes. It is important to centrifuge only at 400–1000g as higher centrifugal forces may break the beads. The supernatant was discarded.
  4. The following sequence of washes was used to charge and equilibrate the resin with Ni2+. For each step the appropriate buffer was added, the tube was inverted several times to mix and then was centrifuged at approx. 500g at 4 °C for one minute.
    1. Twice with 400 μl sterile deionised water
    2. Three times with 400 μl 1× charge buffer (NiSO4 solution stored at 8× concentration [400 mM NiSO4])
    3. Twice with 400 μl binding buffer (20 mM Tris, pH 7.9, 500 mM NaCl, 5 mM imidazole, 10% glycerol).
  • For each step, 400 μl volumes were added with a 1 ml pipette tip, but were removed with a 200 μl tip in order not to disturb the resin.
  5. The charged resin was stored at 4 °C with 400 μl 1× binding buffer until ready for use. The resin is usable for many weeks when stored under these conditions.
  6. The cleared extract (step 1) was added to the resin. The sample was mixed gently by inversion several times and incubated for one hour at 4 °C on a rotary device to allow binding of ParF‐His6 to the resin.
  7. The sample was centrifuged at approx. 500g at 4 °C for one minute.
  8. 15 μl of the supernatant were retained, 5 μl of 2× loading buffer were added and the sample was stored at −20 °C. This sample was the ParF‐His6‐depleted extract.
  9. The following sequence of steps was used to wash and elute the ParF protein from the resin. Concentrations of imidazole from 100 to 400 mM were used successively to determine the minimum concentration that was sufficient for elution of the protein. For each step the appropriate buffer was added, the tube was inverted several times to mix, incubated for five minutes at 4 °C on a rotary device and then centrifuged at approx. 500g at 4 °C for one minute.
    1. Three times with 600 μl of binding buffer
    2. Twice with 600 μl of wash buffer (20 mM Tris, pH 7.9, 500 mM NaCl, 85 mM imidazole, 10% glycerol)
    3. The bound protein was eluted twice with 300 μl of elution buffer (20 mM Tris‐HCl, pH 7.9, 500 mM NaCl, 10% glycerol) containing in turn 100, 200, 300 or 400 mM imidazole.
  10. 15 μl of each fraction were retained, 5 μl of 2× loading buffer were added and the samples were analysed immediately by SDS‐PAGE or were stored at −20 °C for later processing.
  11. Visual comparison by SDS‐PAGE of ParF‐His6 concentrations in the cleared extract control (step 1), the depleted extract (step 8) and the wash and elution fractions (step 9) revealed that 300 mM was the optimal concentration of imidazole required for elution of the protein from the Ni2+ resin.
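
The buffer volumes used in this trial follow from the dilution relationship C1V1 = C2V2. The sketch below is an illustration only: it works out how much of the 8× charge buffer stock gives 400 μl of 1× charge buffer, and how the four elution buffers might be assembled. The 2 M imidazole stock is a hypothetical value that is not given in the text, and the stock is assumed to be made up in the imidazole‐free elution buffer so that the Tris, NaCl and glycerol concentrations are unchanged on mixing.

```python
def stock_volume(final_conc, final_volume, stock_conc):
    """Volume of stock needed so that final_volume has final_conc (C1*V1 = C2*V2)."""
    if final_conc > stock_conc:
        raise ValueError("final concentration exceeds stock concentration")
    return final_conc * final_volume / stock_conc


# 1x charge buffer from the 8x stock (400 mM NiSO4) named in the protocol.
charge_stock_ul = stock_volume(1, 400, 8)    # ul of 8x stock per 400 ul wash
charge_water_ul = 400 - charge_stock_ul      # ul of deionised water
charge_final_mM = 400 / 8                    # mM NiSO4 in the 1x buffer

# Elution buffers: two 300 ul elutions at each imidazole concentration.
IMIDAZOLE_STOCK_MM = 2000   # ASSUMPTION: hypothetical 2 M stock, not from the text
ELUTION_VOLUME_UL = 2 * 300

elution_recipes = {}
for target_mM in (100, 200, 300, 400):
    imidazole_ul = stock_volume(target_mM, ELUTION_VOLUME_UL, IMIDAZOLE_STOCK_MM)
    # (ul of imidazole stock, ul of imidazole-free elution buffer)
    elution_recipes[target_mM] = (imidazole_ul, ELUTION_VOLUME_UL - imidazole_ul)

print(f"1x charge buffer: {charge_stock_ul:.0f} ul stock + {charge_water_ul:.0f} ul water")
for target_mM, (imidazole_ul, base_ul) in elution_recipes.items():
    print(f"{target_mM} mM imidazole: {imidazole_ul:.0f} ul stock + {base_ul:.0f} ul buffer")
```

The same function can be reused for the wash buffer or for any other stock concentration that is available in the laboratory.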

6.4.3 Optimised Protocol for ParF Purification

After determination of the optimal induction and imidazole elution conditions for ParF‐His6 in IMAC as described above, the protocol below was employed to purify the protein on a larger scale. Although protein purification trials with small culture volumes generally are scalable to larger volumes, caution should be exercised. An aliquot of the large‐scale culture after induction always should be analysed by SDS‐PAGE to verify correct overproduction before proceeding to purification of the target protein. In the event that expression at a larger scale does not replicate fully the expression that was optimised using smaller culture volumes, the induction time, inducer concentration and other parameters may need to be modified to improve the overproduction that is obtained when employing larger culture volumes. It is important also to transform the expression plasmid freshly into the E. coli host strain for each induction procedure. Long‐term maintenance of the plasmid in the expression strain, either on agar plates or as frozen stock cultures, is not recommended. E. coli BL21(DE3) and other strains that are commonly used for high level gene expression are proficient in homologous recombination that may induce plasmid rearrangements and also possess endonuclease activity that potentially results in plasmid loss. Instead the plasmid should be stored at −80 °C in a standard E. coli K‐12 strain that is used for molecular cloning. These strains have been mutated both to be deficient in homologous recombination and in endonuclease activity and provide a stable environment in which to avoid genetic alterations to the expression plasmid. The strain that contains the plasmid should be streaked for single colonies on appropriate selective agar plates at 37 °C. The strain subsequently can be propagated as a broth culture from which plasmid DNA may be prepared freshly using standard isolation procedures (see Chapter 4).

  1. E. coli BL21(DE3) was transformed with the pET22b(+)‐parF expression plasmid, selecting on LB plates containing ampicillin (100 μg/ml) at 37 °C for 12–16 hours.
  2. Approximately five colonies from this transformation were inoculated in 6 ml of LB broth with ampicillin in a 125 ml conical flask.
  3. The culture was grown at 30 °C, shaking at 180 rpm, until OD600 was 0.65–0.75 (approx. three hours).
  4. The 6 ml culture was used to inoculate 600 ml of LB with ampicillin in a 2 l flask.
  5. The culture was incubated at 30 °C, shaking at 180 rpm, until OD600 was 0.65–0.75 (approx. three hours).
  6. A 100 μl aliquot was removed to a microcentrifuge tube, centrifuged at approx. 10 000g for one minute, the supernatant was discarded and the cell pellet was resuspended in 20 μl 2× SDS‐PAGE loading buffer. The sample was denatured and stored at −20 °C as the uninduced control.
  7. The remaining culture was induced with 1 mM IPTG with incubation at 30 °C, shaking at 180 rpm, for three hours.
  8. A 100 μl aliquot was removed to check for correct overproduction (induction control) (Figure 6.5a, lane O).
  9. The induced culture was dispensed in 100 ml aliquots and centrifuged at approx. 5000g for five minutes. The pellets were stored at −80 °C. Cell pellets retained for many months at −80 °C can be used for protein recovery in the case of ParF. The long‐term viability of other recombinant proteins under these conditions may vary.
  10. A 100 ml cell pellet was thawed at room temperature.
  11. The pellet was resuspended in 15 ml of 1× binding buffer and the following were added: 150 μl of soybean trypsin inhibitor (1 mg/ml), 150 μl of lysozyme (10 mg/ml) and 150 μl of phenylmethylsulphonyl fluoride (100 mM).
  12. The sample was incubated for 15 minutes at 30 °C, an additional 150 μl of lysozyme solution were added and incubation was continued for another 15 minutes.
  13. The sample was sonicated on ice 5 times for 30 seconds each, with pause intervals of 1 minute, to disrupt the cells.
  14. The lysate was centrifuged at approx. 5000g for 40 minutes at 4 °C. This low‐speed centrifugation step maintains ParF‐His6 in the supernatant. Other overproduced proteins may be more robust and may tolerate higher centrifugation speeds.
  15. The supernatant extract was collected and loaded on to a pre‐equilibrated, commercially available Ni2+ column (2.5 ml settled bed resin) at 4 °C.
  16. The loading–elution circle was closed and ParF‐His6 was loaded for 90 minutes using a peristaltic pump at a flowrate = 3.0.
  17. The column was washed first with 30 ml of binding buffer followed by 70 ml of wash buffer (20 mM Tris, pH 7.9, 500 mM NaCl, 85 mM imidazole, 10% glycerol); 50–100 μl volumes of this flowthrough were retained in each case for later SDS‐PAGE analysis (Figure 6.5a, lane F).
  18. ParF‐His6 was eluted with 12 ml of elution buffer using the optimal imidazole concentration that was determined above (20 mM Tris‐HCl, pH 7.9, 500 mM NaCl, 10% glycerol, 300 mM imidazole).
  19. DTT was added to a final concentration of 2 mM to all the elution fractions to stabilise the protein. DTT is a reducing agent that protects proteins from oxidation and inhibits the formation of undesirable disulfide bonds. Although ParF‐His6 lacks cysteine residues that might form disulfide bonds, the protein proved to be more stable with DTT in the buffer. The effects of DTT on other recombinant proteins need to be assessed empirically.
  20. The Bradford assay [50] and SDS‐PAGE were performed to identify the elution fractions in which ParF‐His6 was most concentrated (Figure 6.5a, lanes 1–12). Protein concentrations also can be determined using commercially available kits. Fractions 3–6 were retained in this case. The protein was estimated to be >90% pure (Figure 6.5).
  21. The selected fractions were exchanged into a storage buffer (30 mM Tris, pH 7.5–8.0, 100 mM KCl, 10% glycerol, 2 mM DTT) by a buffer exchange using 5 ml HiTrap desalting columns pre‐packed with Sephadex® G‐25 Superfine (GE Life Sciences). The storage buffer was used to equilibrate the column and to elute the protein. Alternatively, successive dialysis into the storage buffer was performed.
  22. Purified ParF‐His6 was aliquoted in 100 μl volumes, flash‐frozen with liquid nitrogen and stored at −80 °C. Each 100 ml volume of induced culture (step 9) produced several milligrams of purified protein. The protein was stable for many months under these storage conditions. The stability of other recombinant proteins needs to be assessed by trial‐and‐error.
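
The working concentrations of the lysis additives in steps 11 and 12 are not stated explicitly but can be estimated from the volumes given. The following sketch is a rough calculation rather than part of the published protocol: it ignores the volume contributed by the cell pellet itself and assumes that the additives are mixed evenly.

```python
def final_concentration(stock_conc, stock_volume_ml, total_volume_ml):
    """Concentration of an additive after dilution into total_volume_ml."""
    return stock_conc * stock_volume_ml / total_volume_ml


RESUSPENSION_ML = 15.0   # 1x binding buffer (step 11)
ADDITIVE_ML = 0.150      # each additive is added as 150 ul

# Volume after the three additions in step 11 (pellet volume ignored).
volume_step11 = RESUSPENSION_ML + 3 * ADDITIVE_ML
# Volume after the second lysozyme addition in step 12.
volume_step12 = volume_step11 + ADDITIVE_ML

# Stocks from the protocol: 1 mg/ml (= 1000 ug/ml) trypsin inhibitor,
# 100 mM phenylmethylsulphonyl fluoride (PMSF), 10 mg/ml lysozyme.
trypsin_inhibitor_ug_ml = final_concentration(1000, ADDITIVE_ML, volume_step11)
pmsf_mM = final_concentration(100, ADDITIVE_ML, volume_step11)
lysozyme_step11_mg_ml = final_concentration(10, ADDITIVE_ML, volume_step11)
lysozyme_step12_mg_ml = final_concentration(10, 2 * ADDITIVE_ML, volume_step12)

# Dilution of the 6 ml starter culture into 600 ml of LB (step 4).
inoculum_dilution = (6 + 600) / 6

print(f"Soybean trypsin inhibitor: {trypsin_inhibitor_ug_ml:.1f} ug/ml")
print(f"PMSF: {pmsf_mM:.2f} mM")
print(f"Lysozyme: {lysozyme_step11_mg_ml:.2f} mg/ml, then {lysozyme_step12_mg_ml:.2f} mg/ml")
print(f"Starter culture dilution: {inoculum_dilution:.0f}-fold")
```

On these assumptions the additives are present at roughly 10 μg/ml soybean trypsin inhibitor, 1 mM phenylmethylsulphonyl fluoride and 0.1 mg/ml lysozyme, with the lysozyme concentration approximately doubling after the second addition.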

Figure 6.5 Purification and chemical cross‐linking of ParF‐His6. (a) The ParF‐His6 protein was overproduced in E. coli BL21(DE3) as described in the text and aliquots from the purification steps were analysed by SDS‐PAGE and stained with Coomassie Blue. Lane M, protein size markers; lane O, overproduced ParF‐His6; lane F, flowthrough after binding of ParF‐His6 to the Ni2+ column; lanes 1–12, fractions collected after release of ParF‐His6 from the column with elution buffer. The open arrow indicates the position of ParF‐His6 in the overproduced and elution fractions. (b) Chemical cross‐linking of purified ParF‐His6. A series of reactions was assembled that contained ParF‐His6 (2.5 μg), ATP (1 mM), and DMP (10 mM). The reactions were incubated at 23 °C for 0, 0.5, 1, 3, 5, 15, 30, 60, and 120 minutes (left to right). Reactions were quenched by addition of 33 mM Tris‐HCl, pH 7.5 and an equal volume of SDS‐PAGE loading buffer, placed on ice, and then analysed by SDS‐PAGE and stained with Coomassie Blue. Open and closed arrows indicate the positions of monomeric and dimeric ParF‐His6, respectively. Lane M, protein size markers. Selected size markers (kDa) are indicated on both panels.

6.4.4 Demonstrating the Functionality of Purified ParF‐His6 Protein

ParF is a monomer when bound to ADP but forms higher‐order complexes in the presence of ATP [51]. One example assay of the purified ParF‐His6 protein involved examination of the formation of these higher‐order species after covalent cross‐linking with the cross‐linking reagent dimethyl pimelimidate (DMP). DMP reacts with amine‐containing moieties such as those in proteins to form covalent amidine linkages. These linkages are insensitive to SDS‐PAGE and thus treatment with DMP permits the visualisation of dimeric and other oligomeric forms of proteins. Aliquots of purified ParF‐His6 were incubated with ATP and DMP for up to 120 minutes and the reactions were analysed by SDS‐PAGE. In addition to the monomeric form of the protein that was evident at approx. 23 kDa, a dimeric product became visible as the incubation time increased (Figure 6.5b). Thus, purified ParF‐His6 forms higher‐order species that are detectable by chemical cross‐linking. The ParF‐His6 protein also displayed ATPase activity, interacted with a partner protein and exhibited other properties in vitro, which verified that the purified protein was functional [51].

6.5 Production and Purification of a GST‐Tagged Protein: Example Methodology with the C‐Terminal Domain of Yeast RNA Polymerase II

RNA polymerase II is one of three RNA polymerases in eucaryotic cells. This multisubunit complex is responsible for the bulk of gene transcription and therefore is a fulcrum for regulation of gene expression. The C‐terminal domain (CTD) of the largest subunit of RNA polymerase II interacts with numerous RNA processing factors and also is a site for phosphorylation that profoundly influences the domain's function [52]. CTD contains repeats of a heptapeptide motif, which in the yeast Saccharomyces cerevisiae comprises 26 copies of the consensus sequence Tyr‐Ser‐Pro‐Thr‐Ser‐Pro‐Ser [53,54]. Yeast CTD has been purified by various strategies including as a GST‐tagged protein in E. coli [55].
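
The repeat structure of the CTD can be illustrated with a short script. The sequence used below is an idealised construct assembled from 26 exact copies of the consensus heptapeptide in one‐letter code (YSPTSPS). It is not the actual yeast sequence, in which individual repeats may deviate from the consensus, so the function should be read as a sketch of how consensus repeats could be counted in any protein sequence of interest.

```python
import re

HEPTAD = "YSPTSPS"  # Tyr-Ser-Pro-Thr-Ser-Pro-Ser in one-letter code


def count_consensus_heptads(sequence):
    """Count non-overlapping exact matches to the consensus heptapeptide."""
    return len(re.findall(HEPTAD, sequence))


# ASSUMPTION: idealised CTD built from exact consensus repeats only.
idealised_ctd = HEPTAD * 26

repeat_count = count_consensus_heptads(idealised_ctd)
ctd_length = len(idealised_ctd)

print(f"{repeat_count} consensus repeats in {ctd_length} residues")
```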

The gene segment that encodes CTD was amplified by PCR from yeast genomic DNA and inserted in the pGEX‐4T‐1 vector (GE Healthcare Life Sciences) using EcoRI and XhoI restriction enzymes [55] (Figure 6.6a). The primers used for amplification were designed analogously to those illustrated in Figure 6.3a. As outlined above for the pET vector series, the cloning generated an in‐frame N‐terminal fusion of the genes for GST and CTD. Unlike the pET vector series, expression of the fused gene from pGEX‐4T‐1‐CTD is driven by the tac promoter, which is a hybrid between the lac and trp promoters that is stronger than either of the parental promoters [56]. The pGEX‐4T‐1 vector includes the lacIq gene, which overproduces the Lac repressor. As described above, IPTG inhibits binding of the repressor at the tac promoter, which therefore is available for recognition by E. coli RNA polymerase that directs the high‐level expression of the fusion gene for GST‐CTD.


Figure 6.6 Cloning and purification of GST‐CTD. (a) Organisation of the multiple cloning site in vector pGEX‐4T‐1. This vector (GE Healthcare Life Sciences) facilitates the overproduction of proteins with an N‐terminal GST tag. The gene for GST (blue) is followed in‐frame by the expanded DNA sequence. Selected restriction enzyme sites are marked. These sites are used to clone the gene of interest as an in‐frame PCR product between, for example, the EcoRI and XhoI sites to produce a fusion protein that harbours the GST tag at the N‐terminus. The tag allows affinity purification of the corresponding fusion protein. If desired, the GST tag subsequently may be removed from the protein by digestion with thrombin serine protease that cleaves at the site indicated by the vertical arrow. Stop codons are underlined. Expression of the fusion gene is driven by the tac promoter. (b) SDS‐PAGE analysis of purified GST‐CTD. The GST‐CTD fusion protein was purified as detailed in the text. Aliquots of the elution fractions 1, 2, and 3 were analysed by SDS‐PAGE and stained with Coomassie Blue. Lane M, protein size markers. Selected size markers (kDa) are indicated.

Production of the GST‐CTD protein in E. coli was optimised as described above for the ParF‐His6 protein: the most suitable incubation temperature, IPTG concentration, induction time and solubility characteristics were determined by tests with small‐scale volumes of cultures that harboured the pGEX‐4T‐1‐CTD expression plasmid. Samples were analysed by SDS‐PAGE to visualise production of the GST‐CTD protein fusion. Unlike the pET system detailed above, the pGEX‐4T‐1 expression vector does not involve regulation by T7 RNA polymerase and therefore does not require specialised strains such as E. coli BL21(DE3). Therefore different laboratory derivatives of E. coli K‐12 were assessed for GST‐CTD production from pGEX‐4T‐1‐CTD. The following is the optimised protocol for production and purification of the GST‐CTD fusion protein and is a modified version of the procedure described elsewhere [57].

  1. E. coli DH5α was transformed with the pGEX‐4T‐1‐CTD expression plasmid (0.1 μg) with selection on LB plates containing ampicillin (100 μg/ml) at 37 °C for 12–16 hours.
  2. Approximately five colonies from this transformation were inoculated in 6 ml of LB broth with ampicillin in a 125 ml conical flask.
  3. The culture was grown at 30 °C, shaking at 180 rpm, until OD600 was approx. 0.6 (approx. four hours).
  4. The 6 ml pre‐culture was used to inoculate 1 l of LB with ampicillin in a 3 l flask.
  5. The culture was incubated at 30 °C, shaking at 180 rpm, until OD600 was approx. 0.6 (approx. four hours).
  6. A 100 μl aliquot was removed to a microcentrifuge tube, centrifuged at approx. 10 000g for one minute, the supernatant was discarded and the cell pellet was resuspended in 20 μl 2× SDS‐PAGE loading buffer. The sample was denatured and stored at −20 °C as the uninduced control for subsequent SDS‐PAGE analysis.
  7. The remaining culture was induced with 1 mM IPTG with incubation at 30 °C, shaking at 180 rpm, for 12 hours.
  8. A 100 μl aliquot was removed to check for correct induction (induction control) in SDS‐PAGE analysis.
  9. The induced culture was dispensed in 500 ml aliquots and centrifuged at approx. 5000g for five minutes. The pellets were stored at −80 °C.
  10. A 500 ml cell pellet was thawed on ice (approx. one hour).
  11. The pellet was resuspended in 20 ml of TZ buffer (50 mM Tris‐HCl, pH 7.9, 12.5 mM MgCl2, 0.5 mM ethylenediaminetetraacetic acid (EDTA), 100 mM KCl, 20% glycerol, 1 mM β‐mercaptoethanol, 10 μM ZnCl2).
  12. The protease inhibitors leupeptin, phenylmethylsulphonyl fluoride and soybean trypsin inhibitor were added to the resuspended cell pellet to final concentrations each of 1 μg/ml, along with the detergent Triton X‐100 (1%) and 1 ml of lysozyme (20 mg/ml).
  13. The sample was incubated on ice for 30 minutes and then was sonicated on ice 6 times for 30 seconds each with pause intervals of one minute.
  14. The lysate was centrifuged at approx. 8000g for 60 minutes at 4 °C.
  15. The supernatant extract (approx. 40 ml) was collected and mixed with one bed volume of GSH‐sepharose resin. The resin was prepared by centrifuging 1.33 ml of resin slurry, resuspending in 10 ml of TZ buffer and washing twice with this buffer.
  16. The sample was incubated with gentle agitation for 30 minutes at room temperature to permit binding of the GST‐CTD protein to the resin.
  17. The lysate–resin mix was centrifuged for five minutes and the supernatant was removed and discarded.
  18. The resin with the bound GST‐CTD was washed with 20 ml of 50 mM Tris‐HCl, pH 7.9, 1 M NaCl, 1 mM β‐mercaptoethanol, and soybean trypsin inhibitor, leupeptin and phenylmethylsulphonyl fluoride at final concentrations of 1 μg/ml.
  19. The sample was centrifuged and washed with 20 ml of phosphate buffered saline (137 mM NaCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, 2.7 mM KCl, pH 7.4) that contained Triton X‐100 (1%), 1 mM β‐mercaptoethanol and the protease inhibitors described in the preceding step.
  20. GST‐CTD was eluted from the glutathione sepharose resin with 2 ml of 50 mM Tris‐HCl, pH 7.5 that contained 15 mM reduced glutathione with incubation at room temperature for 10 minutes.
  21. The sample was centrifuged and approx. 1.5 ml of the supernatant was collected (fraction 1).
  22. Steps 20 and 21 were repeated twice to produce fractions 2 and 3.
  23. Fractions 1–3 were dialysed against 500 ml of TZ buffer, aliquoted in 100 μl volumes, flash‐frozen with liquid nitrogen and stored at −80 °C.
  24. The Bradford assay and SDS‐PAGE were performed to determine protein concentration and purity (Figure 6.6b), respectively. The protein was estimated to be >90% pure.
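
The dialysis in step 23 removes most of the reduced glutathione that was used for elution. The extent of this dilution can be approximated as shown below. The calculation is a sketch that assumes the three fractions of approx. 1.5 ml are pooled, that dialysis proceeds to complete equilibrium and that a single 500 ml volume of TZ buffer is used without a buffer change. None of these details is specified in the protocol.

```python
def equilibrium_concentration(initial_conc, sample_ml, dialysis_buffer_ml):
    """Solute concentration once sample and dialysis buffer have fully equilibrated."""
    return initial_conc * sample_ml / (sample_ml + dialysis_buffer_ml)


GLUTATHIONE_MM = 15.0        # reduced glutathione in the elution buffer (step 20)
FRACTION_ML = 1.5            # approx. volume of each elution fraction (step 21)
DIALYSIS_BUFFER_ML = 500.0   # TZ buffer (step 23)

# ASSUMPTION: fractions 1-3 are pooled before dialysis.
sample_ml = 3 * FRACTION_ML

glutathione_after_mM = equilibrium_concentration(
    GLUTATHIONE_MM, sample_ml, DIALYSIS_BUFFER_ML
)
fold_dilution = GLUTATHIONE_MM / glutathione_after_mM

print(f"Glutathione after dialysis: {glutathione_after_mM:.2f} mM")
print(f"Dilution: approx. {fold_dilution:.0f}-fold")
```

Under these assumptions the glutathione concentration falls by roughly two orders of magnitude. A second change of dialysis buffer would reduce it further by a similar factor.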

6.5.1 Demonstrating the Functionality of Purified GST‐CTD Protein

The interaction of CTD with a plethora of RNA processing factors is detectable both in vivo and in vitro using a variety of experimental approaches. These interactions are modulated by the phosphorylation state of the CTD [52]. The GST‐CTD purified above was phosphorylated in vitro. Both the phosphorylated and unphosphorylated forms were tested for interactions with proteins in yeast whole‐cell extracts using affinity chromatography assays (also known as pull‐down assays). These experiments revealed diverse interacting partners with the CTD including the Pcf11 protein that links mRNA 3′‐end processing, transcriptional elongation and termination of transcription in S. cerevisiae [55]. As the tag was not cleaved from the GST‐CTD fusion protein in this case, the tagged protein was demonstrated to be functional in pull‐down assays and the GST tag did not interfere detectably with the interaction of CTD with partner proteins. An important negative control experiment in these assays was to test that purified GST alone did not interact with the proteins to which GST‐CTD is bound.

6.6 Further Purification of Tagged Proteins

The preceding examples illustrated cases in which His6 and GST affinity tags did not perturb the behaviours of recombinant proteins in vitro. Nevertheless, the potential impact of the tag on the function and structure of the purified protein should always be considered. Although bulky tags such as GST and MBP are considered to fold independently of the target protein to which they are fused, these tags are more likely to affect protein behaviour than compact tags such as His6, which generally are considered innocuous [58]. Nevertheless, there are numerous reports that His6‐tags also can affect protein activity [59–61]. Therefore the production, purification and comparison of proteins with N‐ or C‐terminal tags may be appropriate. In practice the behaviour of a protein with one of these tags, e.g. a C‐terminal tag, may be examined initially and the second tagged version of the protein, e.g. with an N‐terminal tag, may be characterised only if the first protein either is not produced properly or displays aberrant properties in vitro. Alternatively, certain expression vectors are designed specifically to allow proteolytic cleavage of the His6‐tag and other tags after protein purification (Figure 6.4) [19]. In this case the tagless protein and cleaved tag are separated by passing the reaction through an Ni2+ column to which the tag and uncleaved protein will bind but through which the tagless protein will pass unhindered (Figure 6.7). Co‐purified proteins that have affinity for the Ni2+ matrix will also be retained on the column. The cleavage reaction typically leaves a ‘stub’ of a few amino acids between the cleavage site and the target protein. For example, cloning of a target gene between NdeI and XhoI restriction sites in pET33b(+) generates a fusion protein that harbours the amino acid sequence MGSSHHHHHHSSGLVPR↓GSRRASVH at the N‐terminus (Figure 6.4). Digestion of the purified protein with the thrombin protease removes 17 residues, including the His6 motif, but leaves the eight amino acid tail GSRRASVH. Ion exchange chromatography also may be used to purify the untagged protein from the protease cleavage reaction. Strategies have been developed for purification of overproduced tagless proteins by single‐step procedures, although these approaches have yet to gain widespread popularity [62,63].
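
The residue counts quoted for the thrombin digestion can be checked with a few lines of code. The sketch below splits the N‐terminal sequence given above at the thrombin site, cutting between the Arg and Gly residues of LVPR↓GS. The short target sequence appended to the tag is a hypothetical placeholder that is included only to show where the protein of interest would begin.

```python
THROMBIN_SITE = "LVPRGS"   # thrombin cleaves between R and G
CUT_OFFSET = 4             # number of site residues that stay with the tag

N_TERMINAL_TAG = "MGSSHHHHHHSSGLVPRGSRRASVH"   # sequence from the text
HYPOTHETICAL_TARGET = "MKTAYIAK"               # ASSUMPTION: placeholder target


def thrombin_cleave(sequence):
    """Return (released tag peptide, remaining protein) for the first thrombin site."""
    site_start = sequence.find(THROMBIN_SITE)
    if site_start == -1:
        raise ValueError("no thrombin site found")
    cut = site_start + CUT_OFFSET
    return sequence[:cut], sequence[cut:]


fusion = N_TERMINAL_TAG + HYPOTHETICAL_TARGET
released, product = thrombin_cleave(fusion)
stub = product[: len(product) - len(HYPOTHETICAL_TARGET)]

print(f"Released peptide: {released} ({len(released)} residues)")
print(f"Stub left on the target: {stub} ({len(stub)} residues)")
```

The released peptide comprises 17 residues, including the His6 motif, and the eight‐residue stub GSRRASVH remains attached to the target, in agreement with the values stated above.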


Figure 6.7 Separation and purification of a tagless target protein. The overproduced fusion protein comprises the target protein (blue) with a His6 or other affinity tag (yellow). The tagged protein is purified from other proteins and cellular components (open symbols) in E. coli by IMAC or equivalent affinity chromatography. The fusion protein binds non‐covalently to the affinity matrix (green) whereas other proteins pass through the column. The tagged protein is eluted and purified by altering the buffer conditions on the matrix, e.g. with a high concentration of imidazole in the case of His6‐tagged proteins. The purified protein is treated with an appropriate protease (not shown) that cleaves a linker sequence that is engineered between the target protein and the tag. The cleavage reaction is applied to a fresh affinity column to which the liberated tag binds whereas the purified tagless target protein emerges in the flowthrough. The commercially available protease may be engineered to possess the same tag as the target fusion protein so that it also is captured on the matrix.

The example purification procedures for ParF‐His6 and GST‐CTD described above typically generate proteins that are >90% pure based on inspection by SDS‐PAGE. This level of purity often is sufficient for in vitro analyses. However, if the presence of contaminating proteins is a concern, a mutated non‐functional version of the tagged protein can be purified by an identical procedure and tested in the relevant assay(s) in parallel with the wild‐type protein. The mutated protein can be generated by site‐directed mutagenesis of a codon, which specifies an amino acid that is known to be key for protein function [64]. If the mutated protein displays no activity then the activity observed with the wild‐type protein can be attributed more confidently to the latter and not to contaminating proteins. Alternatively, the wild‐type protein can be purified further using additional chromatographic techniques, which are discussed in detail in Chapter 7. Principal among these techniques are ion exchange chromatography that utilises the reversible interaction between a charged protein and a chromatographic matrix that possesses an opposite charge, gel filtration chromatography that separates protein species based on molecular size and hydrophobic interaction chromatography that purifies proteins based on hydrophobicity [65–67].

6.7 Alternative Hosts for Protein Production

Although E. coli will continue to be the workhorse for recombinant protein production for the foreseeable future, expression platforms also have been developed for protein production in other bacterial species [68]. However, one noteworthy drawback with protein purification from E. coli and other bacteria is the absence of extensive post‐translational modification processes in these hosts. Proteins in eucaryotic species often are subject to glycosylation, phosphorylation, acetylation and other modifications after translation [69]. Post‐translational modifications may modulate the structure and/or function of eucaryotic proteins. Proteins that lack these modifications when produced and purified from E. coli may not exhibit all of the characteristics of the equivalent modified versions. Moreover, numerous eucaryotic proteins and protein complexes have also proven to be recalcitrant to overproduction in E. coli. However, several alternative hosts have been developed for the purification of post‐translationally modified proteins as well as mammalian and other recombinant proteins that are refractory to production in bacteria [1,70–73]. Pichia pastoris is a methylotrophic yeast that can utilise methanol as a sole carbon and energy source. The genes involved in methanol utilisation are highly inducible, which led to the development of methanol‐inducible expression systems in this host [74]. Recent developments in genome analysis of P. pastoris, along with promoter engineering, the optimization of codon usage and gene dosage, and enhancements to protein secretion and methanol metabolic pathways, make this organism an attractive host for the production of heterologous proteins [75].

The yeast S. cerevisiae is generally recognised as safe, is genetically tractable and is used extensively for the industrial production of certain biochemicals and biofuels due to its tolerance of alcohols and harsh conditions [76,77]. In addition, a range of expression plasmids that possess inducible or constitutive promoters and different selectable markers have been developed for recombinant protein production in S. cerevisiae [78]. Glycosylation is one of the most important post‐translational modifications of eucaryotic proteins. Glycosylation patterns in P. pastoris and S. cerevisiae differ both from each other and from those that occur in human cells. Therefore, caution needs to be exercised in choosing which of these hosts is more appropriate to produce a recombinant target protein [79]. This issue has been alleviated in part by the engineering of P. pastoris derivatives that mimic the glycosylation patterns in human cells [80].

Baculovirus is an insect virus that has been adapted for protein production in cell lines. Notably, baculovirus‐infected insect cell lines have been used extensively for the production of a range of recombinant proteins, including proteins that do not fold properly in bacterial expression systems, glycoproteins, membrane proteins, vaccines and multiprotein complexes [81]. Baculovirus expression systems avail particularly of the viral polyhedrin and p10 promoters, which are the primary promoters that are active during the late phase of virus infection. Baculovirus also has been engineered with mammalian promoters for the production of recombinant proteins in mammalian cell lines [82].

6.8 Concluding Remarks

The production and simplified purification of tagged recombinant proteins from E. coli has been central to the success of molecular biology in recent decades [83]. Proteins now are purified routinely for fundamental functional studies, as reagents and targets for biomedical and pharmaceutical use, and as industrial enzymes. Continued efforts to improve further the protein production capacity of E. coli centre on the enhancement of expression vectors, host strains and induction systems [84,85], as well as the use of genomic, proteomic, and metabolic engineering strategies to modulate gene expression and control [86,87]. A wide array of sophisticated expression systems that incorporate diverse plasmid vectors, different affinity purification tags and tightly regulated promoters are available for use in E. coli [11]. The breadth of expression platforms may seem daunting when embarking on a first venture into recombinant protein production and purification. However, many of the available expression systems have been tried, tested and honed by a plethora of researchers over many years using multifarious proteins from sundry organisms [88]. The simplest start point for a researcher who wishes to delve into the realm of protein production potentially is to use a His6‐tag vector of the pET or a related series that has been proven to be robust for the expression of a variety of disparate genes. Investigation of the relevant literature concerned with overproduction, separation and purification of proteins that are related to the protein of interest is also highly recommended as nihil novi sub sole.

Acknowledgements

Work in the laboratory of DB is supported by the Biotechnology and Biological Sciences Research Council.

References

  1. Nettleship, J.E., Assenberg, R., Diprose, J.M. et al. (2010). Recent advances in the production of proteins in insect and mammalian cells for structural biology. J. Struct. Biol. 172: 55–65.
  2. Huang, C.J., Lin, H., and Yang, X. (2012). Industrial production of recombinant therapeutics in Escherichia coli and its recent advancements. J. Ind. Microbiol. Biotechnol. 39: 383–399.
  3. Son, M.S. and Taylor, R.K. (2012). Growth and maintenance of Escherichia coli laboratory strains. Curr. Protoc. Microbiol. Chapter 5:Unit 5A.4.
  4. Blount, Z.D. (2015). The unexhausted potential of E. coli. eLife 4: e05826.
  5. Gilbert, W. and Müller‐Hill, B. (1966). Isolation of the Lac repressor. Proc. Natl. Acad. Sci. U.S.A. 56: 1891–1898.
  6. Rosenberg, J.M., Khallai, O.B., Kopka, M.L. et al. (1977). Lac repressor purification without inactivation of DNA binding activity. Nucleic Acids Res. 4: 567–572.
  7. Winshell, E. and Shaw, W.V. (1969). Kinetics of induction and purification of chloramphenicol acetyltransferase from chloramphenicol‐resistant Staphylococcus aureus. J. Bacteriol. 98: 1248–1257.
  8. Amarasinghe, C. and Jin, J.P. (2015). The use of affinity tags to overcome obstacles in recombinant protein expression and purification. Protein Pept. Lett. 22: 885–892.
  9. Raran‐Kurussi, S. and Waugh, D.S. (2017). Expression and purification of recombinant proteins in Escherichia coli with a His6 or dual His6‐MBP tag. Methods Mol. Biol. 1607: 1–15.
  10. Wood, D.W. (2014). New trends and affinity tag designs for recombinant protein purification. Curr. Opin. Struct. Biol. 26: 54–61.
  11. Rosano, G.L. and Ceccarelli, E.A. (2014). Recombinant protein expression in Escherichia coli: advances and challenges. Front. Microbiol. 5: 172.
  12. Zhao, X., Li, G., and Liang, S. (2013). Several affinity tags commonly used in chromatographic purification. J. Anal. Methods Chem. 2013: 581093.
  13. Lebendiker, M. and Danieli, T. (2017). Purification of proteins fused to maltose‐binding protein. Methods Mol. Biol. 1485: 257–273.
  14. Waugh, D.S. (2016). The remarkable solubility‐enhancing power of Escherichia coli maltose‐binding protein. Postepy Biochem. 62: 377–382.
  15. Bataille, L., Dieryck, W., Hocquellet, A. et al. (2015). Expression and purification of short hydrophobic elastin‐like polypeptides with maltose‐binding protein as a solubility tag. Protein Expression Purif. 110: 165–171.
  16. Guo, Y., Yu, M., Jing, N., and Zhang, S. (2018). Production of soluble bioactive mouse leukemia inhibitory factor from Escherichia coli using MBP tag. Protein Expression Purif. 150: 86–91.
  17. Raran‐Kurussi, S. and Waugh, D.S. (2012). The ability to enhance the solubility of its fusion partners is an intrinsic property of maltose‐binding protein but their folding is either spontaneous or chaperone‐mediated. PLoS One 7: e49589.
  18. Harper, S. and Speicher, D.W. (2011). Purification of proteins fused to glutathione S‐transferase. Methods Mol. Biol. 681: 259–280.
  19. Waugh, D.S. (2011). An overview of enzymatic reagents for the removal of affinity tags. Protein Expression Purif. 80: 283–293.
  20. Block, H., Maertens, B., Spriestersbach, A. et al. (2009). Immobilized‐metal affinity chromatography (IMAC): a review. Methods Enzymol. 463: 439–473.
  21. Porath, J., Carlsson, J., Olsson, I., and Belfrage, G. (1975). Metal chelate affinity chromatography, a new approach to protein fractionation. Nature 258: 598–599.
  22. Hochuli, E., Bannwarth, W., Döbeli, H. et al. (1988). Genetic approach to facilitate purification of recombinant proteins with a novel metal chelate adsorbent. Nat. Biotechnol. 6: 1321–1325.
  23. Studier, F.W., Rosenberg, A.H., Dunn, J.J., and Dubendorff, J.W. (1990). Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol. 185: 60–89.
  24. Bartosik, A.A., Markowska, A., Szarlak, J. et al. (2012). Novel broad‐host‐range vehicles for cloning and shuffling of gene cassettes. J. Microbiol. Methods 88: 53–62.
  25. Santos, P.M., Di Bartolo, I., Blatny, J.M. et al. (2001). New broad‐host‐range promoter probe vectors based on the plasmid RK2 replicon. FEMS Microbiol. Lett. 195: 91–96.
  26. Scott, H.N., Laible, P.D., and Hanson, D.K. (2003). Sequences of versatile broad‐host‐range vectors of the RK2 family. Plasmid 50: 74–79.
  27. Chakravartty, V. and Cronan, J.E. (2015). A series of medium and high copy number arabinose‐inducible Escherichia coli expression vectors compatible with pBR322 and pACYC184. Plasmid 81: 21–26.
  28. Guzman, L.M., Belin, D., Carson, M.J., and Beckwith, J. (1995). Tight regulation, modulation, and high‐level expression by vectors containing the arabinose PBAD promoter. J. Bacteriol. 177: 4121–4130.
  29. Bedouelle, H. and Duplay, P. (1988). Production in Escherichia coli and one‐step purification of bifunctional hybrid proteins which bind maltose. Eur. J. Biochem. 171: 541–549.
  30. 30 Bird, L.E. (2011). High throughput construction and small scale expression screening of multi‐tag vectors in Escherichia coli. Methods 55: 29–37.
  31. 31 Cabrita, L.D., Dai, W., and Bottomley, S.P. (2006). A family of E. coli expression vectors for laboratory scale and high throughput soluble protein production. BMC Biotechnol. 6: 12.
  32. 32 Correa, A., Ortega, C., Obal, G. et al. (2014). Generation of a vector suite for protein solubility screening. Front. Microbiol. 5: 67.
  33. 33 Guan, C., Li, P., Riggs, P.D., and Inouye, H. (1988). Vectors that facilitate the expression and purification of foreign peptides in Escherichia coli by fusion to maltose‐binding protein. Gene 67: 21–30.
  34. 34 Smith, D.B. and Johnson, K.S. (1988). Single‐step purification of polypeptides expressed in Escherichia coli as fusions with glutathione S‐transferase. Gene 67: 31–40.
  35. 35 Jia, B. and Jeon, C.O. (2016). High‐throughput recombinant protein expression in Escherichia coli: current status and future perspectives. Open Biol. 6: 160196.
  36. 36 Salim, L., Feger, C., and Busso, D. (2016). Construction of a compatible Gateway‐based co‐expression vector set for expressing multiprotein complexes in E. coli. Anal. Biochem. 512: 110–113.
  37. 37 Schmid‐Burgk, J.L., Schmidt, T., Kaiser, V. et al. (2013). A ligation‐independent cloning technique for high‐throughput assembly of transcription activator‐like effector genes. Nat. Biotechnol. 31: 76–81.
  38. 38 Shendure, J., Balasubramanian, S., Church, G.M. et al. (2017). DNA sequencing at 40: past, present and future. Nature 550: 345–353.
  39. 39 Jenny, R.J., Mann, K.G., and Lundblad, R.L. (2003). A critical review of the methods for cleavage of fusion proteins with thrombin and factor Xa. Protein Expression Purif. 31: 1–11.
  40. 40 Tang, G.Q., Bandwar, R.P., and Patel, S.S. (2005). Extended upstream A‐T sequence increases T7 promoter strength. J. Biol. Chem. 280: 40707–40713.
  41. 41 McLeod, B.N., Allison‐Gamble, G.E., Barge, M.T. et al. (2017). A three‐dimensional ParF meshwork assembles through the nucleoid to mediate plasmid segregation. Nucleic Acids Res. 45: 3158–3171.
  42. 42 Hayes, F. and Barillà, D. (2010). Extrachromosomal components of the nucleoid: recent developments in deciphering the molecular basis of plasmid segregation. In: Bacterial Chromatin (ed. C.J. Dorman and R.T. Dame), 49–70. Dordrecht: Springer Publishing.
  43. 43 Barillà, D. and Hayes, F. (2003). Architecture of the ParF*ParG protein complex involved in prokaryotic DNA segregation. Mol. Microbiol. 49: 487–499.
  44. 44 Papaneophytou, C.P. and Kontopidis, G. (2014). Statistical approaches to maximize recombinant protein expression in Escherichia coli: a general review. Protein Expression Purif. 94: 22–32.
  45. 45 Leibly, D.J., Nguyen, T.N., Kao, L.T. et al. (2012). Stabilizing additives added during cell lysis aid in the solubilization of recombinant proteins. PLoS One 7: e52482.
  46. 46 Ramón, A., Señorale‐Pose, M., and Marín, M. (2014). Inclusion bodies: not that bad…. Front. Microbiol. 5: 56.
  47. 47 Singh, A., Upadhyay, V., Upadhyay, A.K. et al. (2015). Protein recovery from inclusion bodies of Escherichia coli using mild solubilization process. Microb. Cell Fact. 14: 41.
  48. 48 Correa, A. and Oppezzo, P. (2015). Overcoming the solubility problem in E. coli: available approaches for recombinant protein production. Methods Mol. Biol. 1258: 27–44.
  49. 49 Saccardo, P., Corchero, J.L., and Ferrer‐Miralles, N. (2016). Tools to cope with difficult‐to‐express proteins. Appl. Microbiol. Biotechnol. 100: 4347–4355.
  50. 50 Bradford, M.M. (1976). A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein‐dye binding. Anal. Biochem. 72: 248–254.
  51. 51 Barillà, D., Rosenberg, M.F., Nobbmann, U., and Hayes, F. (2005). Bacterial DNA segregation dynamics mediated by the polymerizing protein ParF. EMBO J. 24: 1453–1464.
  52. 52 Buratowski, S. (2009). Progression through the RNA polymerase II CTD cycle. Mol. Cell 36: 541–546.
  53. 53 Babokhov, M., Mosaheb, M.M., Baker, R.W., and Fuchs, S.M. (2018). Repeat‐specific functions for the C‐terminal domain of RNA polymerase II in budding yeast. G3 8: 1593–1601.
  54. 54 Hsin, J.P. and Manley, J.L. (2012). The RNA polymerase II CTD coordinates transcription and RNA processing. Genes Dev. 26: 2119–2137.
  55. 55 Barillà, D., Lee, B.A., and Proudfoot, N.J. (2001). Cleavage/polyadenylation factor IA associates with the carboxyl‐terminal domain of RNA polymerase II in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. U.S.A. 98: 445–450.
  56. 56 de Boer, H.A., Comstock, L.J., and Vasser, M. (1983). The tac promoter: a functional hybrid derived from the trp and lac promoters. Proc. Natl. Acad. Sci. U.S.A. 80: 21–25.
  57. 57 Peterson, S.R., Dvir, A., Anderson, C.W., and Dynan, W.S. (1992). DNA binding provides a signal for phosphorylation of the RNA polymerase II heptapeptide repeats. Genes Dev. 6: 426–438.
  58. 58 Carson, M., Johnson, D.H., McDonald, H. et al. (2007). His‐tag impact on structure. Acta Crystallogr. D Biol. Crystallogr. 63: 295–301.
  59. 59 Booth, W.T., Schlachter, C.R., Pote, S. et al. (2018). Impact of an N‐terminal polyhistidine tag on protein thermal stability. ACS Omega 3: 760–768.
  60. 60 Mohanty, A.K. and Wiener, M.C. (2004). Membrane protein expression and production: effects of polyhistidine tag length and position. Protein Expression Purif. 33: 311–325.
  61. 61 Sabaty, M., Grosse, S., Adryanczyk, G. et al. (2013). Detrimental effect of the 6 His C‐terminal tag on YedY enzymatic activity and influence of the TAT signal sequence on YedY synthesis. BMC Biochem. 14: 28.
  62. 62 Cooper, M.A., Taris, J.E., Shi, C., and Wood, D.W. (2018). A convenient split‐intein tag method for the purification of tagless target proteins. Curr. Protoc. Protein Sci. 91: 5.29.1–5.29.23.
  63. 63 Guan, D. and Chen, Z. (2014). Challenges and recent advances in affinity purification of tag‐free proteins. Biotechnol. Lett. 36: 1391–1406.
  64. 64 Peracchi, A. (2001). Enzyme catalysis: removing chemically ‘essential’ residues by site‐directed mutagenesis. Trends Biochem. Sci. 26: 497–503.
  65. 65 O'Fágáin, C., Cummins, P.M., and O'Connor, B.F. (2011). Gel‐filtration chromatography. Methods Mol. Biol. 681: 25–33.
  66. 66 Jungbauer, A. and Hahn, R. (2009). Ion‐exchange chromatography. Methods Enzymol. 463: 349–371.
  67. 67 McCue, J.T. (2009). Theory and use of hydrophobic interaction chromatography in protein purification applications. Methods Enzymol. 463: 405–414.
  68. 68 Gómez, S., López‐Estepa, M., Fernández, F.J., and Vega, M.C. (2016). Protein complex production in alternative prokaryotic hosts. Adv. Exp. Med. Biol. 896: 115–133.
  69. 69 Khoury, G.A., Baliban, R.C., and Floudas, C.A. (2011). Proteome‐wide post‐translational modification statistics: frequency analysis and curation of the swiss‐prot database. Sci. Rep. 1: 90.
  70. 70 Contreras‐Gómez, A., Sánchez‐Mirón, A., García‐Camacho, F. et al. (2014). Protein production using the baculovirus‐insect cell expression system. Biotechnol. Prog. 30: 1–18.
  71. 71 Fernández, F.J. and Vega, M.C. (2016). Choose a suitable expression host: a survey of available protein production platforms. Adv. Exp. Med. Biol. 896: 15–24.
  72. 72 Gasser, B., Prielhofer, R., Marx, H. et al. (2013). Pichia pastoris: protein production host and model organism for biomedical research. Future Microbiol. 8: 191–208.
  73. 73 Wang, G., Huang, M., and Nielsen, J. (2017). Exploring the potential of Saccharomyces cerevisiae for biopharmaceutical protein production. Curr. Opin. Biotechnol. 48: 77–84.
  74. 74 Zahrl, R.J., Peña, D.A., Mattanovich, D., and Gasser, B. (2017). Systems biotechnology for protein production in Pichia pastoris. FEMS Yeast Res. 17: fox068.
  75. 75 Byrne, B. (2015). Pichia pastoris as an expression host for membrane protein structural biology. Curr. Opin. Struct. Biol. 32: 9–17.
  76. 76 Kutyna, D.R. and Borneman, A.R. (2018). Heterologous production of flavour and aroma compounds in Saccharomyces cerevisiae. Genes 9: E326.
  77. 77 Turner, T.L., Kim, H., Kong, I.I. et al. (2018). Engineering and evolution of Saccharomyces cerevisiae to produce biofuels and chemicals. Adv. Biochem. Eng. Biotechnol. 162: 175–215.
  78. 78 Darby, R.A., Cartwright, S.P., Dilworth, M.V., and Bill, R.M. (2012). Which yeast species shall I choose? Saccharomyces cerevisiae versus Pichia pastoris. Methods Mol. Biol. 866: 11–23.
  79. 79 Vieira Gomes, A.M., Souza Carmo, T., Silva Carvalho, L. et al. (2018). Comparison of yeasts as hosts for recombinant protein production. Microorganisms 6: E38.
  80. 80 Hamilton, S.R., Davidson, R.C., Sethuraman, N. et al. (2006). Humanization of yeast to produce complex terminally sialylated glycoproteins. Science 313: 1441–1443.
  81. 81 Kost, T.A. and Kemp, C.W. (2016). Fundamentals of baculovirus expression and applications. Adv. Exp. Med. Biol. 896: 187–197.
  82. 82 Mansouri, M. and Berger, P. (2018). Baculovirus for gene delivery to mammalian cells: past, present and future. Plasmid 98: 1–7.
  83. 83 Gileadi, O. (2017). Recombinant protein expression in E. coli: a historical perspective. Methods Mol. Biol. 1586: 3–10.
  84. 84 Gupta, S.K. and Shukla, P. (2016). Advanced technologies for improved expression of recombinant proteins in bacteria: perspectives and applications. Crit. Rev. Biotechnol. 36: 1089–1098.
  85. 85 Schlegel, S., Genevaux, P., and de Gier, J.W. (2017). Isolating Escherichia coli strains for recombinant protein production. Cell. Mol. Life Sci. 74: 891–908.
  86. 86 Liu, M., Feng, X., Ding, Y. et al. (2015). Metabolic engineering of Escherichia coli to improve recombinant protein production. Appl. Microbiol. Biotechnol. 99: 10367–10377.
  87. 87 Mahalik, S., Sharma, A.K., and Mukherjee, K.J. (2014). Genome engineering for improved recombinant protein expression in Escherichia coli. Microb. Cell Fact. 13: 177.
  88. 88 Konczal, J. and Gray, C.H. (2017). Streamlining workflow and automation to accelerate laboratory scale protein production. Protein Expression Purif. 133: 160–169.

Further Reading

  1. Bonner, P.L.R. (2018). Protein Purification, 2e. CRC Press Inc.
  2. Janson, J.C. (ed.) (2011). Protein Purification: Principles, High Resolution Methods, and Applications, 3e. Wiley.
  3. Rosenberg, I.M. (2005). Protein Analysis and Purification: Benchtop Techniques. Birkhäuser.
  4. Simpson, R.J., Adams, P.D., and Golemis, E.A. (eds.) (2009). Basic Methods in Protein Purification and Analysis: A Laboratory Manual. Cold Spring Harbor Laboratory Press.
  5. Scopes, R.K. (2010). Protein Purification: Principles and Practice. New York: Springer.