8
Synthetic Methodology in Chemical Biology

Richard C. Brewster and Stephen Wallace

Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK

8.1 Introduction

Chemical biology is a diverse field encompassing a wide range of techniques and processes. One challenging area in this field is the study of biomolecules in their native environments to determine structure, function and dynamics.

This chapter looks to introduce some of the different approaches to biomolecule modification in this field. The first section will look at in vitro methods, starting with an introduction to peptide synthesis and how these peptides can be ligated to produce synthetic proteins. We will then consider some of the different chemical reactions for modifying endogenous amino acids on proteins in vitro.

The second section will discuss the use of bioorthogonal reactions for protein modification in vivo. We will look at how design principles can be used to improve reaction kinetics in vivo and how the judicious choice of a reaction manifold can lead to improved properties in single cells and whole animals. We will then provide an overview of different methods for the introduction of new functional groups into living cells using both metabolic and synthetic biology approaches.

Finally, a case study will consider the field of histone post‐translational modifications (PTMs). Three very different methods for incorporating unnatural amino acid (UAA) modifications into a histone protein are considered and their differences and complementarities discussed.

8.2 Peptide Synthesis

Peptides play a fundamental role in essentially all physiological and biochemical processes. They form the basis of a number of pharmaceutical agents and their unique chemistry has facilitated the study of many biological systems. By definition, peptides are short chains of amino acids linked by amide (peptide) bonds. They typically vary in length from 2 to 50 amino acids, with anything longer generally denoted as a protein. Nature has evolved remarkably efficient methods for constructing polypeptide chains, as proteins are required for almost all cellular functions. As dictated by the Central Dogma – ‘DNA makes RNA, makes protein’ – proteins are synthesised in vivo by first decoding genomic DNA into mRNA via transcription, which is then translated into a polypeptide at the ribosome. Here, tRNA–amino acid pairs (called aminoacyl‐tRNAs) are sequentially polymerised according to their cognate messenger RNA in a rapid and sequence‐specific manner.

In the field of biochemistry, methods have been optimised in recent years for producing recombinant proteins in cells using modern molecular biology and this can yield large amounts of a specific protein in a range of microbial hosts.

However, producing recombinant peptides is a much greater challenge and often requires extensive optimisation, especially for shorter peptides (<100 amino acids in length) as these are often degraded by proteases in the host cell [1]. It is also difficult to incorporate unnatural modifications using recombinant techniques, although advances in this field are discussed in section 8.6.

The requirement for peptides of any length, sequence and bearing unnatural modifications from both industrial and research fields has led to the development of extremely robust methods for peptide synthesis in the field of synthetic organic chemistry.

8.3 Amide Bond Synthesis

The basic reaction behind peptide synthesis is deceptively simple: an electrophilic carboxylic acid on one amino acid undergoes nucleophilic attack by an amine from another amino acid, forming an amide bond and eliminating water. The energy barrier for this process, however, is extremely high and temperatures >100 °C are required for the uncatalysed reaction (Figure 8.1) [2]. This low reactivity can be attributed to the stability of carboxylate and ammonium salts – the predominant structure of amino acids in water – which are significantly less electrophilic and nucleophilic, respectively.

Figure 8.1 Amide bond synthesis by thermal reaction or by carboxylic acid activation via an acyl chloride.

Early methods to synthesise amide bonds first converted the carboxylic acid to an acid chloride, which can then react with amines in the presence of excess base. Acid chlorides are highly reactive and therefore not very stable, and most methods to synthesise them require harsh conditions. Most modern amide bond formation reactions now use coupling reagents, as shown in Figure 8.2. Typically the carboxylic acid reacts with a carbodiimide to form an O‐acylisourea, an extremely unstable intermediate. Nucleophilic addition of the amine gives the amide product and a urea by‐product. The highly unstable nature of the O‐acylisourea can, however, lead to side products, including the rearranged N‐acylurea, and to epimerisation at the α‐position of the L‐amino acid (Figure 8.3). Several additives have been developed that intercept the O‐acylisourea to give a more stable activated ester, which reduces this unwanted epimerisation. HOBt (1‐hydroxy‐1H‐benzotriazole) is one such additive and has been shown to significantly reduce side reactions and epimerisation during amide bond formation. Many modern coupling reagents now combine the activating group and an HOBt‐type motif in the same molecule (see HATU and PyBOP) (Figure 8.2).

Figure 8.2 Common coupling reagents (e.g. DCC, DIC and EDC) used in synthetic organic chemistry for amide bond synthesis and the mechanism of carbodiimide‐mediated coupling reactions.

Figure 8.3 Mechanisms of base‐catalysed racemisation of amino acid residues in SPPS.

Now assume we want to synthesise a simple dipeptide and were to mix two amino acids together with a coupling reagent. The result may produce a small amount of the dipeptide, but there would be a multitude of different length peptide chains with a random sequence of amino acids as each of our amino acids can react with another equivalent of itself (Figure 8.4).

Figure 8.4 Synthesis of a simple dipeptide on unprotected amino acids will lead to a complex mixture of products formed due to competing side reactions from other reactive functional groups in the molecule. If these other functional groups are protected then only one product is formed, giving a higher yielding reaction. If one of the protecting groups can be selectively removed then the peptide can be elongated.

This is where the concept of protecting group chemistry is important. If the amine on one amino acid is protected and the carboxylic acid on the other is protected, then the product of the reaction will be a single dipeptide. If orthogonal protecting groups are used on the C‐ and N‐termini, then one can be selectively deprotected and another amino acid with the corresponding C/N protecting groups in place can be added to extend the peptide. Any functional group that is nucleophilic or electrophilic can interfere with the amide coupling reactions and so must also be orthogonally protected, both to prevent side reactions and to resist the conditions used to deprotect the C‐/N‐terminus. This includes the side chains of the naturally occurring amino acids; e.g. lysine has an amine and aspartic/glutamic acid have a carboxylic acid that can participate in amide bond formation reactions and so must be protected.

8.3.1 Solid Phase Peptide Synthesis (SPPS)

Each step in a peptide synthesis requires purification, so that coupling or deprotection reagents and by‐products can be removed to allow the next step of the reaction to proceed. This can be particularly time‐consuming and generates large volumes of chemical waste. Realising this limitation, Bruce Merrifield developed ‘solid phase peptide synthesis’ (SPPS) in 1963, for which he won a Nobel Prize in 1984 [3]. Using this approach, the first amino acid is coupled through its C‐terminus to an insoluble solid (polymer) support or ‘resin’ and, instead of a separate purification step, excess amino acid and coupling reagent can simply be washed away, leaving the amino acid functionalised solid support. The N‐terminus can then be deprotected in the solid phase, the reagents washed away and another amino acid added using a coupling reagent to extend the peptide. The cycle is then simply repeated until the desired amino acid sequence has been synthesised. The peptide can then be deprotected and cleaved from the resin concurrently (normally with trifluoroacetic acid [TFA] or hydrogen fluoride [HF]) and then further purified, if required, by reverse phase high performance liquid chromatography (RP‐HPLC).

Introduction of the Wang linker (para‐alkoxybenzyl alcohol) on to Merrifield's resin increases the lability of the peptide to acid cleavage, which reduces the side reactions observed during the acid‐catalysed cleavage step of SPPS (Figure 8.5). Other linkers now exist that are even more acid labile, e.g. the 2‐chlorotrityl linker. The ability to use more acid‐labile linkers allows peptides to be removed from the solid support still fully protected, which can be useful for further functionalisation, e.g. cyclisation. Primary amide incorporation at the C‐terminus can be achieved using a Rink amide linker; this modification can increase peptide stability to proteases and alter the charge of the peptide.

Figure 8.5 SPPS using a Wang functionalised resin and Fmoc protecting groups. The first amino acid is coupled to the solid support via an esterification reaction. Fmoc deprotection of the amine using piperidine allows subsequent amide couplings and this procedure is repeated until the target peptide is made. Cleavage from the resin is achieved using TFA, which also removes side chain protecting groups giving the unprotected peptide product.

In general, SPPS has many advantages over solution phase synthesis. For example, excess reagent can be used at each step to drive reactions to completion and then simply washed away, bypassing any laborious liquid–liquid separation, solvent evaporation and chromatography steps. SPPS is also easily automated and peptide synthesisers are available that can synthesise peptide sequences rapidly with minimal user input.

Two amine protecting group strategies now dominate SPPS: tert‐butyloxycarbonyl (Boc) and fluorenylmethoxycarbonyl (Fmoc). Boc groups are acid labile and can be rapidly cleaved using 25–50% TFA in DCM. The Fmoc protecting group is base labile and is normally cleaved using 20% piperidine in DMF. For each method the amino acid side chain protecting groups must be stable under the Fmoc or Boc deprotection conditions. Fmoc is used more commonly than Boc as the deprotection conditions are milder, leading to fewer side reactions, and the final resin cleavage uses TFA instead of HF, which is extremely toxic.

The key to SPPS is good conversion for each step. Consider the synthesis of a 10 amino acid peptide, which requires a total of 18 reaction steps. If each of these reactions yields 90% of the product, the final yield will be ∼15%. If this is improved to 95% for each reaction, then the overall yield will be 40%. Similarly, a 99% yield for each reaction will result in an overall yield of 83%. These differences may seem small, but their impact on the overall efficiency of polypeptide synthesis via SPPS can be dramatic. It is for this reason that optimisation of coupling reagents and protecting groups is so important in the field of peptide synthesis.
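The arithmetic behind these figures can be reproduced with a short sketch. It assumes, as the paragraph above does, that all 18 steps proceed with the same yield; the function name is illustrative only.

```python
def overall_yield(step_yield: float, n_steps: int) -> float:
    """Overall yield of a linear sequence in which every step has the same yield.

    step_yield is a fraction between 0 and 1 (e.g. 0.95 for 95%).
    """
    if not 0.0 <= step_yield <= 1.0:
        raise ValueError("step_yield must be a fraction between 0 and 1")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return step_yield ** n_steps


# 18 reaction steps, as quoted in the text for a 10 amino acid peptide
N_STEPS = 18

for y in (0.90, 0.95, 0.99):
    print(f"{y:.0%} per step -> {overall_yield(y, N_STEPS):.0%} overall")
```

Because the overall yield scales as the per‐step yield raised to the number of steps, the penalty for a modest drop in coupling efficiency grows rapidly as the peptide gets longer.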

Typical Procedure for Fmoc SPPS

  1. Wang resin (1 g, 1 mmol/g loading) is swollen in CH2Cl2/DMF (10 ml; 8 : 2) for 1.5 hours and drained.
  2. Fmoc amino acid (4 mmol) is dissolved in the minimum amount of DMF with HOBt (4 mmol) and DMAP (0.4 mmol), and DIC (4 mmol) is added. The mixture is added to the resin, which is mixed for 12–20 hours.
  3. The resin is drained and washed with DMF (2 × 10 ml), CH2Cl2 (2 × 10 ml) and MeOH (2 × 10 ml), dried and 10–20 mg are weighed into a flask. 20% Piperidine in DMF (1 ml) is added and mixed for 30 minutes, the solution is filtered off and its absorbance is measured at 301 nm. The Fmoc deprotection product is quantified using the extinction coefficient (ε = 7800 dm³ mol⁻¹ cm⁻¹). If the loading percentage is acceptable the resin is capped using Ac2O (20 mmol) and DIPEA (20 mmol) in DMF (3 ml). If loading is not acceptable, step 2 is repeated.
  4. The resin is drained and washed with DMF (2 × 10 ml), CH2Cl2 (2 × 10 ml) and DMF (10 ml). A solution of 20% piperidine in DMF is added (5 ml) and mixed for 5 minutes. The solution is drained and washed as before and 20% piperidine (5 ml) is added and mixed for 10 minutes, drained and washed again.
  5. Fmoc‐amino acid (3 mmol) and HOBt (3 mmol) are dissolved in the minimum amount of DMF and DIC is added. The solution is added to the resin and mixed for 40 minutes at room temperature (r.t.). The resin is drained and washed and Fmoc deprotection is done by repeating step 4.
  6. A cleavage cocktail is prepared based on 95% TFA and 5% H2O (additional scavengers are added depending on the side chain protecting groups used), which is added to the resin and mixed for 1–2 hours. The resin is drained into a flask and washed with TFA (3 × 3 ml). The peptide is precipitated with cold ether and collected by vacuum filtration; purity is checked by high‐performance liquid chromatography (HPLC) and the peptide is purified if required.
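The loading check in step 3 is a Beer–Lambert calculation, which can be sketched as follows. The extinction coefficient is the value quoted in the procedure; the 1 cm path length, the dilution factor and the example reading are assumptions made for illustration and are not part of the published protocol.

```python
EPSILON_301 = 7800.0  # dm3 mol-1 cm-1 at 301 nm, value quoted in step 3


def fmoc_loading_mmol_per_g(absorbance: float,
                            volume_ml: float,
                            resin_mg: float,
                            path_cm: float = 1.0,
                            dilution: float = 1.0) -> float:
    """Estimate resin loading (mmol/g) from the absorbance of the Fmoc cleavage solution.

    absorbance : reading at 301 nm (after any dilution)
    volume_ml  : volume of piperidine/DMF used for the cleavage
    resin_mg   : mass of dry resin weighed out
    path_cm    : cuvette path length (assumed 1 cm)
    dilution   : factor by which the solution was diluted before reading (assumption)
    """
    if resin_mg <= 0 or volume_ml <= 0 or path_cm <= 0:
        raise ValueError("resin_mg, volume_ml and path_cm must be positive")
    # Beer-Lambert: c = A / (epsilon * l), corrected for dilution (mol/dm3)
    conc_molar = absorbance * dilution / (EPSILON_301 * path_cm)
    # mol/dm3 multiplied by ml gives mmol
    fmoc_mmol = conc_molar * volume_ml
    return fmoc_mmol / (resin_mg / 1000.0)


# Hypothetical reading: 15 mg resin, 1 ml cleavage solution, diluted 100-fold, A = 1.17
loading = fmoc_loading_mmol_per_g(absorbance=1.17, volume_ml=1.0,
                                  resin_mg=15.0, dilution=100.0)
print(f"Estimated loading: {loading:.2f} mmol/g")
```

In practice a fully loaded 1 mmol/g resin gives a solution far too concentrated to read directly, which is why a dilution factor is included as a parameter here.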

For more information about SPPS several comprehensive reviews have been written [4]. Further information on amino acid protecting groups used can be found in the review by Isidro‐Llobet et al. [5] and on coupling reagents in the review by El‐Faham and Albericio [6].

The major advantages of SPPS are that it is quick, predictable and scalable. Peptides can be readily synthesised on a milligram‐to‐gram scale in most laboratories, or up to multikilogram scale industrially [7]. The iterative nature of this method also means that peptides can be easily modified to include unnatural motifs or functional groups that cannot be introduced using cellular methods. Common modifications of this type include acetylation of the N‐terminus and primary amide synthesis at the C‐terminus, cyclic peptides, incorporation of UAAs, amino acids bearing PTMs or peptides bearing fluorophores, ligands or other small organic compounds.

However, while SPPS has made peptide synthesis routine for most labs, this technique is extremely wasteful and environmentally irresponsible. Excess chemical reagents are commonly used at each step to increase coupling efficiencies and the solvents and reagents used are often toxic and non‐renewable. While the cost of reagents has reduced in recent years due to mass scale production to meet a growing demand for this technology in industry, SPPS remains a cost ineffective way of synthesising polypeptides compared to recombinant DNA methods [8].

There is also a limitation on the size of peptides that can be synthesised on resin, which is normally about 50 amino acids, although through optimisation of conditions, resin and loading values this can be increased to >100 amino acids for certain sequences. However, purification of these large compounds can also be challenging. Although these improvements have enabled the synthesis of small proteins (e.g. RNase A, 124 amino acids) [9], development of other techniques such as native chemical ligation (NCL) has since been favoured.

8.3.2 Native Chemical Ligation (NCL)

SPPS has made the synthesis of peptides very tractable, but protein synthesis remains a challenge using this technique. Conceptually, if we could couple multiple peptides together in an ordered fashion, a protein could be synthesised in vitro from entirely synthetic peptide precursors. One technique for achieving this is NCL [10]. Using this method, two unprotected peptides, one bearing a thioester at the C‐terminus and the other containing a cysteine residue at the N‐terminus, can be ligated. The reaction proceeds via a transthioesterification followed by an S‐to‐N acyl shift, giving the amide product as shown in Figure 8.6. Sequential NCL coupling reactions can lead to the synthesis of large proteins bearing multiple synthetic modifications. As the cysteine residue may not be required in the final peptide/protein sequence, desulphurisation of cysteine can be achieved in a separate step using tris(2‐carboxyethyl)phosphine (TCEP) and a radical initiator, converting the cysteine into an alanine residue. The drawback here is that if a cysteine residue is required elsewhere in the sequence it must be protected.

Figure 8.6 Mechanism of the native chemical ligation (NCL) reaction on an unprotected peptide/protein. Transthioesterification is followed by an S‐to‐N acyl shift to afford the new amide bond. Expressed protein ligation (EPL) relies on intein excision, where an N‐to‐S acyl shift gives the thioester product [11]. (An intein, or intervening protein, is a natural protein segment that undergoes ‘protein splicing’, in which a section of the protein removes itself through generation of a thioester [10].) Addition of a free thiol, typically mercaptophenylacetic acid (MPAA), causes transthioesterification, giving a protein thioester, which can undergo an NCL reaction with a synthetic peptide.

NCL produces a single ligated peptide product and therefore can be considered as highly chemoselective. Purification by RP‐HPLC or size exclusion chromatography means that products are normally separable from starting materials. However, the applicability of this reaction has been limited by slow reaction rates and difficulties in synthesising thioesters, although the development of new reagents in recent years has significantly improved these issues [12].

A significant enhancement in NCL reactivity comes from increasing the reactivity of the thioester. Mercaptophenylacetic acid (MPAA) is often added to the reaction mixture, causing an additional transthioesterification to occur. The intermediate aryl thioester (which is often too reactive to isolate) reacts much faster than an alkyl thioester, increasing the overall rate of reaction. A drawback to the use of excess MPAA, however, is that it can complicate purification and must be removed before desulphurisation [13].

Trifluoroethanethiol (TFET) has been proposed as an alternative to MPAA, as the electron‐withdrawing nature of the CF3 group gives a pKa of 7.3, allowing efficient transthioesterification at pH 6.8. The benefit of TFET is that it is volatile (bp = 35 °C) and so can be easily removed, allowing desulphurisation and ligation to be done in a ‘one‐pot’ reaction. An example is shown in Figure 8.7 for the synthesis of the tick‐derived protein Chimadanin, where two sequential ligations and desulphurisation are all done in ‘one pot’. Of particular note is the authors' use of a modified glutamic acid bearing a thiol. This approach allowed the first ligation site to be at Glu rather than Ala [13].

Figure 8.7 The synthesis by Payne et al. of Chimadanin using an NCL strategy [13].

Example procedure for Chimadanin synthesis by NCL [13]: (NCL is a more complex technique than some of the other examples given, so a general protocol is not really applicable. The procedure below is intended to give readers an idea of the experimental method, using the example shown in Figure 8.7.)

  1. Ligation. A solution of peptide Chimadanin (43–70) (3.8 mg, 1.13 μmol, 1.2 eq.) in ligation buffer (370 μl; 6 M Gn·HCl, 100 mM NaPi, 25 mM TCEP, pH 6.8) was added to peptide Chimadanin (22–40) (2.6 mg, 0.94 μmol, 1.0 eq.) with TFET (7.5 μl, 2 vol.%). The pH was readjusted to 6.8 with NaOH (3 M) and the solution incubated for 2 hours at 30 °C. Analysis by HPLC‐MS confirmed complete conversion.
  2. Thiazolidine deprotection. A solution of MeONH2·HCl (390 μl; 0.2 M in 6 M Gn·HCl, 100 mM NaPi pH 3.4) was added to the reaction mixture and incubated for three hours at 30 °C. Analysis by HPLC‐MS confirmed complete conversion.
  3. Ligation. The pH of the reaction mixture was adjusted to 7.0 using NaOH (3 M) followed by addition of peptide Chimadanin (1–19) (3.2 mg, 1.2 μmol, 1.3 eq. in 6 M Gn·HCl, 100 mM NaPi) and TFET (18 μl, 2 vol.%) and incubated for 18 hours at 30 °C. Analysis by HPLC‐MS confirmed complete conversion.
  4. Desulphurisation. A solution of TCEP (1 M) and glutathione (100 mM) in buffer (925 μl; 6 M Gn·HCl, 100 mM NaPi) was added to give a solution containing 500 mM TCEP, 40 mM glutathione and 0.5 mM Chimadanin intermediate. The solution was adjusted to pH 6.2 and degassed by sparging with Ar(g) for 10 minutes, which also removed TFET. VA‐044 was added to give a concentration of 20 mM and the mixture was incubated at 37 °C for 5 hours. Analysis by HPLC‐MS confirmed complete conversion. The product was purified by reversed‐phase semipreparative HPLC, giving Chimadanin (3.1 mg, 35% yield).

NCL is an extremely powerful strategy for synthesising proteins bearing UAAs and PTMs. It has been used to synthesise an enzyme using entirely D‐amino acids to create a protein mirror image [14], for the synthesis of glycoproteins [15] and for protein synthesis of up to 304 amino acids [16]. In these examples NCL has been successfully used to access protein targets that are not yet possible to synthesise using purely biological methods.

One way of combining the benefits of biological and chemical protein synthesis methods is through the use of expressed protein ligation (EPL), where a protein expressed using recombinant DNA techniques can be further extended using synthetic peptide fragments [17]. The protein is first expressed bearing a thioester at the C‐terminus through a separate intein excision step. The thioester is then reacted with a peptide bearing a cysteine residue at the N‐terminus, which in turn induces the desired transthioesterification and S‐to‐N acyl shift, as for NCL. Alternatively, the protein can be genetically modified to contain an additional N‐terminal Cys residue that can then react with a peptide thioester in an analogous fashion [18].

The major advantage of EPL is that the amount of peptide synthesis and NCL reactions can be significantly reduced, enabling larger proteins/polypeptides to be constructed in a more efficient manner.

8.3.3 Reactions of Endogenous Amino Acids

In the first section we focused on how synthetic chemistry can be used to generate short peptide sequences and then how these can be coupled to synthesise larger proteins. The major benefit of this approach to protein synthesis is that a large amount of chemical divergence can be incorporated site selectively into the protein/peptide chain. The disadvantages are that the process is expensive, time consuming and requires multiple purification steps. This in turn generates large volumes of chemical waste. Protein synthesis can also be achieved via plasmid‐based overexpression in microorganisms such as Escherichia coli and Saccharomyces cerevisiae.

In addition to these approaches, the functional groups on amino acids can also be reacted selectively to introduce synthetic modifications. Although these reactions can be selective for a particular amino acid, they generally do not give site selectivity; e.g. if a lysine‐selective reagent is used, the reagent could react with any Lys residue in the protein, not just at one particular site. Some selectivity can be obtained if a particular residue is solvent exposed or if the surrounding amino acids in the protein's tertiary structure increase the reactivity of one particular residue, but these effects are not generalisable and vary a lot between proteins and residues [19].

Within this field, a wide range of reagents exist to target almost all canonical amino acids (apart from the aliphatic side chains, and Ser and Thr), which have been recently reviewed [19, 20].

The most common compounds that are attached to proteins (represented by ‘R’ in Figures 8.8 and 8.9) using these techniques are fluorophores or biotin. The addition of a fluorophore to a protein has many benefits for protein imaging using microscopy. Biotinylation is commonly used for affinity purification, a technique that takes advantage of the extremely high binding affinity (KD < 10⁻¹² M) of biotin for the proteins avidin and streptavidin.

Figure 8.8 Common commercial reagents for selective reaction of Lys on proteins and peptides, such as (sulpho)NHS esters, iso(thio)cyanates and sulphonyl chlorides.

Figure 8.9 Common commercial reagents for the selective modification of Cys on proteins and peptides, such as iodoacetamide, maleimide and HPDP.

Some of the most common commercially available reagents for amino acid conjugation are shown in Figures 8.8 and 8.9; these reagents typically target Lys and Cys, the most nucleophilic amino acids.

The electrophilic NHS esters and the water‐soluble sulpho‐NHS derivative are reactive towards primary amines, giving an amide bond product. The reaction is selective for Lys and N‐terminal primary amines in proteins and is pH dependent, as the nucleophilic free amine (–NH2) rather than the non‐nucleophilic ammonium ion (–NH3+) is required for a reaction to occur. Due to the pKa of Lys (10.5), a pH of 7–9 is generally required for this reaction to occur. It should be noted that the N‐terminus of peptide chains has a lower pKa, typically around 8, which allows for some selectivity over Lys [21].
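The pH dependence described above follows from the Henderson–Hasselbalch relationship, and a short sketch makes the Lys versus N‐terminus difference concrete. It uses the pKa values quoted in the text and assumes, as a simplification, that reactivity tracks the fraction of amine present as the free base; real residues also differ in accessibility and intrinsic nucleophilicity.

```python
def free_amine_fraction(ph: float, pka: float) -> float:
    """Fraction of an amine present as neutral -NH2 at a given pH (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PKA_LYS = 10.5    # Lys side chain, value quoted in the text
PKA_NTERM = 8.0   # N-terminal amine, "typically around 8" in the text

for ph in (7.0, 8.0, 9.0):
    lys = free_amine_fraction(ph, PKA_LYS)
    nterm = free_amine_fraction(ph, PKA_NTERM)
    print(f"pH {ph:.1f}: Lys {lys:.2%} free amine, N-terminus {nterm:.1%} free amine")
```

At pH 8, for example, half of the N‐terminal amines are deprotonated compared with roughly 0.3% of Lys side chains, which is the basis of the partial N‐terminal selectivity mentioned above.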

  • Typical procedure for lysine modification with NHS esters: To protein (50–100 μM) in amine‐free buffer (100 mM, pH 7–9) is added a 10‐fold molar excess of NHS‐ester (5–50 mM dissolved in water or DMSO). The solution is mixed and incubated at room temperature for 4–6 hours or at 4 °C overnight. The labelled protein is normally purified by size exclusion chromatography or buffer exchange.

L‐cysteine (Cys) is another amino acid that can be targeted for protein bioconjugation. Cys is the most nucleophilic amino acid, especially when deprotonated (pKa = 8.5). The low abundance of Cys in proteins (<2%) can enable protein modification via site‐directed mutagenesis, where a point mutation can be introduced to give a recombinant protein with a single Cys at a chosen location. Modification at Cys is normally achieved using an alkylating reagent, a Michael addition reaction or via a dehydroalanine (Dha) intermediate. All of these methods show excellent selectivity at carefully controlled pH [22].

  • Typical procedure (maleimide and iodoacetamide): To protein (50–100 μM) in buffer (100 mM phosphate or Tris, must be free of thiols) at pH 7 (maleimides) or pH 8 (iodoacetamide) is added TCEP (10‐fold molar excess) and the solution is incubated in the dark for one hour. The reagent is dissolved in water or DMSO (1–10 mM) and a 10–20‐fold molar excess is added. The solution is incubated for one hour at room temperature in the dark or overnight at 4 °C. The reaction can be quenched by addition of excess thiol (e.g. β‐mercaptoethanol or glutathione [100 eq.]) and purified by buffer exchange or gel filtration.
  • Typical procedure (HPDP): Protein must be reduced using TCEP or dithiothreitol (DTT) and thoroughly desalted to remove any excess reagent. HPDP reagent is dissolved in DMF to give a concentration of ∼4 mM and 100 μl of this is added to the reduced protein (10–50 μM) in buffer (1 ml 1 mM EDTA in phosphate‐buffered saline [PBS], pH 7–8), giving a final concentration of HPDP of ∼0.4 mM. The solution is mixed well and incubated at room temperature for two hours. Excess reagent is removed by desalting or gel filtration.
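The quantities in these procedures reduce to two small calculations: the volume of reagent stock needed for a given molar excess, and the final concentration after mixing. A minimal sketch is given below; the function names and the example protein volume are illustrative assumptions, not part of the published procedures.

```python
def reagent_volume_ul(protein_um: float, protein_volume_ul: float,
                      fold_excess: float, stock_mm: float) -> float:
    """Volume of reagent stock (µl) that delivers a given molar excess over protein."""
    if stock_mm <= 0:
        raise ValueError("stock_mm must be positive")
    protein_pmol = protein_um * protein_volume_ul   # µM x µl = pmol
    reagent_pmol = fold_excess * protein_pmol
    return reagent_pmol / (stock_mm * 1000.0)       # 1 mM = 1000 pmol per µl


def final_concentration(stock_conc: float, stock_volume: float,
                        sample_volume: float) -> float:
    """Concentration of the stock component after mixing (same units as stock_conc)."""
    return stock_conc * stock_volume / (stock_volume + sample_volume)


# Maleimide example: 1 ml of 100 µM protein (assumed volume), 10-fold excess, 10 mM stock
print(f"Maleimide stock to add: {reagent_volume_ul(100, 1000, 10, 10):.0f} µl")

# HPDP example from the text: 100 µl of ~4 mM stock added to 1 ml of protein solution
print(f"Final HPDP: {final_concentration(4.0, 100, 1000):.2f} mM")
```

For the HPDP case this gives about 0.36 mM, which rounds to the ∼0.4 mM quoted in the procedure once the added volume is taken into account.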

Most Cys selective reactions are unreactive towards disulphides, which can form between two Cys residues via oxidation in air. For this reason, the first step of any Cys reaction normally involves a reduction step. Reduction is normally achieved through incubation of the protein with DTT or tris(2‐carboxyethyl)phosphine (TCEP). DTT must normally be removed before the conjugation as it contains free thiols that can react and reduce the effective concentration of the reagent. For this reason, the use of TCEP is recommended for most procedures.

Iodoacetamide has been used to alkylate Cys residues since 1935 and is still used as a technique to ‘cap’ reactive Cys side chains in proteomic analysis, such as tryptic digestion. Iodoacetamides react rapidly and selectively at pH 8.0 via an SN2 mechanism.

Maleimides are more reactive than iodoacetamides and react through a Michael‐type addition to the maleimide alkene. The highly electrophilic nature of these reagents means that the deprotonation of Cys‐SH is not required for the reaction to occur, so these reagents can be used at physiological pH (6.5–7.5). For both reagents the rate of reaction will increase at higher pH, but this is often accompanied by increased side reactions with other amino acids.

HPDP reagents are ‘activated’ disulphides that react with free thiols at near‐neutral pH to give a disulphide linked product. The disulphide bond is cleavable by reduction with DTT or TCEP, so these reagents are favoured for applications where the modification can be later removed.

8.3.4 Dehydroalanine (Dha) Procedure

Dehydroalanine (Dha) can also be used to introduce UAAs and PTM mimics into proteins using a two‐step ‘tag and modify’ approach. The chemistry of Dha is an interesting example of umpolung chemistry, where a nucleophilic sulphur atom is eliminated using a reagent such as 2,5‐dibromohexanediamide to afford an electrophilic alkene [23] (Figure 8.10). This effectively reverses the electronic properties of the amino acid. Conjugation to Dha is then achieved by nucleophilic addition to the alkene using thiol‐based reagents or by means of radical addition using alkyl halides [24].

Figure 8.10 Formation of dehydroalanine (Dha) from Cys converts a good nucleophile into an excellent electrophile. Dha can be selectively reacted with nucleophiles, e.g. thiols, or alkyl iodides in a radical addition reaction.

Although the reactions of Dha are extremely powerful for creating modified proteins, the stereocentre at the α‐position of the parent amino acid is epimerised and therefore the products are formed as an approximately 1 : 1 mixture of diastereomers [24]. This stereoablative behaviour can limit the utility of the reaction in some experiments. The Dha procedure is given as follows:

  1. A cysteine containing protein (60 μM) in PBS (10 mM; pH 8) is reduced using DTT (300‐fold molar excess) by shaking at room temperature for one hour. DTT is removed using size exclusion chromatography or buffer exchange.
  2. Dibromohexanediamide (150‐fold molar excess) is weighed into an Eppendorf and reduced protein solution is added. The reaction is shaken at 37 °C for four hours or until the reaction is complete by HPLC‐MS.
  3. Excess dibromohexanediamide is pelleted by centrifugation (1 minute, 16 000 × g), giving the Dha product.
  4. For nucleophilic addition, a thiol (twofold molar excess) is added to the Dha protein solution, which is incubated at 37 °C with shaking. Additional thiol (twofold molar excess) is added every 20 minutes, 10 times in total. The reaction is incubated for 2–10 hours, with completion determined by HPLC‐MS, and the product is purified by dialysis or size exclusion chromatography.

8.4 Bioorthogonal Chemistry

All of the techniques mentioned so far in this chapter are limited to the reactions of functional groups present in the 20 canonical amino acids. This has led to the development of methods to specifically modify proteins in vitro; however, for in vivo modifications achieving selectivity for one protein is extremely difficult. This is largely due to the density and functional complexity of the cell interior, where one must first intercept the correct protein but then also achieve selectivity over other small molecule cofactors, lipids and metabolites.

This challenge has led to the development of the field of bioorthogonal chemistry. First introduced by Bertozzi et al. in 2003, a bioorthogonal reaction can be defined as a non‐enzymatic reaction that occurs in vivo yet neither interacts nor interferes with the underlying biological system. These reactions must be fast enough under physiological conditions to enable rapid bioconjugation, and the substrate(s) and product(s) of the reaction must also be non‐toxic to the cell [25].

Research in this field has developed rapidly in recent years and it is now possible to perform bioorthogonal reactions in a range of cellular environments at reaction rates approaching those of native enzymes. Despite this success, the selection of the ‘right’ bioorthogonal reaction for a given application remains a challenging process of elimination. For example, the second‐order rate constants of these reactions generally range from 10⁻² to 10⁵ M⁻¹ s⁻¹. For a reaction in which both partners are present at 1 μM, this equates to a half‐life ranging from 10 seconds to 3 years. Increasing the rate of reactions is achieved by increasing the reactivity of the reaction components, and this often makes them more difficult to prepare and therefore more expensive. In general, higher levels of reactivity are also accompanied by an increased susceptibility to nucleophilic attack by free thiols in the cell. These liabilities make many of the most sophisticated bioorthogonal chemistries unsuitable for use in multicellular organisms and whole animals (e.g. live cell imaging in nematodes and mice) where factors such as tissue penetration, circulatory and immunological effects are important considerations. Slower reactions use more stable reagents, but more reagent must then be used to achieve appreciable reaction kinetics, which can increase toxicity. Navigating the balance between reactivity, deactivation and toxicity is the art of bioorthogonal reaction selection.
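
The half‐lives quoted above can be reproduced from the integrated second‐order rate law. The sketch below assumes that both reaction partners start at the same concentration, in which case the first half‐life is t½ = 1/(k[A]₀); the function and variable names are illustrative only.

```python
# Minimal sketch: first half-life of a second-order reaction A + B -> P
# when both partners start at the same concentration (an assumption).

def second_order_half_life_s(k_per_M_per_s: float, conc_M: float) -> float:
    """First half-life in seconds: t_half = 1 / (k * [A]0)."""
    return 1.0 / (k_per_M_per_s * conc_M)

SECONDS_PER_YEAR = 365.25 * 24 * 3600
CONC_M = 1e-6  # 1 uM of each partner

fast_s = second_order_half_life_s(1e5, CONC_M)   # fastest reactions
slow_s = second_order_half_life_s(1e-2, CONC_M)  # slowest reactions
slow_years = slow_s / SECONDS_PER_YEAR

print(f"k = 1e5 M^-1 s^-1:  t_half = {fast_s:.0f} s")
print(f"k = 1e-2 M^-1 s^-1: t_half = {slow_years:.1f} years")
```

Under these assumptions the two limits come out at about 10 seconds and a little over 3 years, which shows why the rate constant dominates reagent choice at the low concentrations typical of cellular experiments.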

By and large, pericyclic reactions have seen the most success in this field. The majority of the reactions developed have focused on the use of azides. Azides are yet to be discovered in nature and possess weak electrophilic properties and a dipolar character, making them ideal for cycloaddition reactions. They are also stable under biological conditions. Due to their small size, azides can also be readily incorporated into cells through metabolic, genetic or activity‐based pathways. Alkynes are also popular dienophiles and dipolarophiles due to their appropriate reactivity and compact size.

8.4.1 The Staudinger Ligation

One of the first bioorthogonal reactions to be widely used is the Staudinger ligation reaction between an azide and a phosphine [26]. The mechanism (Figure 8.11) follows the classic Staudinger reduction of azides, where the azide reacts with a phosphine followed by elimination of N2. During the reaction the aza‐ylide can undergo hydrolysis, yielding the free amine or, through displacement of a proximal ester, intramolecular cyclisation to give the oxaphosphetane, which then hydrolyses to give an amide product with the phosphine oxide attached.

Figure 8.11 Mechanism for Staudinger ligation reaction of azides and phosphines bearing an electrophilic trap. Initial Staudinger reduction of the azide releases nitrogen gas and forms the aza‐ylide, which undergoes cyclisation giving the oxaphosphetane which is hydrolysed forming the amide bond.

The reaction was modified by Bertozzi et al. and Raines et al. in 2000 to give the traceless Staudinger ligation reaction [27]. In this variant of the classic Staudinger ligation the product does not contain a phosphine oxide, which is eliminated during the reaction in place of methanol. Staudinger ligation reactions have been used extensively to couple fluorophores to biomolecules, but the reaction has been limited by unacceptably slow reaction kinetics [28]. Attempts to increase the rate of the ligation reaction have been made by adding electron‐donating groups to the phosphine, but this has also been found to increase the rate of phosphine oxidation (the major side reaction in the Staudinger ligation), leading to poor product conversions.

Typical procedure for Staudinger protein labelling

Azido protein (50–500 μM) is mixed with phosphine probe (10‐fold molar excess) in buffer (50–100 mM Tris or PBS; pH 7–8, 6 M guanidine‐HCl). Reaction mixtures are incubated for 15 hours at 37 °C and purified by gel chromatography or buffer exchange.

8.5 The Copper‐Catalysed Azide‐Alkyne Cycloaddition Reaction (CuAAC)

The azide‐alkyne 1,3‐dipolar cycloaddition reaction is an extremely popular method for bioconjugation. The thermally initiated Huisgen reaction was discovered in 1893 and showed that heating azides and alkynes at high temperatures yielded a mixture of 1,4‐ and 1,5‐triazoles. Despite the reaction being exothermic (i.e. thermodynamically favourable), the activation energy between terminal alkynes and azides is too high for the reaction to occur at room temperature, significantly limiting its potential as a bioorthogonal reaction [29].

Meldal et al. and Sharpless et al. simultaneously discovered that Cu(I) could be used to catalyse the reaction, allowing it to proceed at room temperature; this is now known as the copper‐catalysed azide‐alkyne cycloaddition (CuAAC) reaction [29]. This extremely powerful reaction gives exclusively 1,4‐triazoles and epitomises the concept of ‘click’ chemistry – reactions that are high yielding, simple to perform, wide in scope, conducted in benign solvents and create by‐products that are easily removed. One of the key drivers in the uptake of CuAAC chemistry is that both the azide and alkyne are small, stable and easy to introduce into a wide variety of chemical probes and biomolecules. This has led to a wide range of commercially available CuAAC reagents, making this reaction a first port‐of‐call for many applications in chemical biology.

CuAAC reactions are typically performed in water in the presence of an organic solvent (if required) to aid the solubility of reagents. The reactions are sensitive to air since oxygen can readily oxidise Cu(I) to Cu(II), producing reactive oxygen species (ROSs), which are toxic to cells. Sodium ascorbate is a mild reducing agent that is compatible with most proteins and is often used to reduce any Cu(II), giving Cu(I) and dehydroascorbic acid [30].

The mechanism of the CuAAC reaction has been somewhat debated in recent years, but Fokin et al. published the widely accepted mechanism in 2013 (shown in Figure 8.12) which involves two moles of Cu(I) species in the catalytic cycle [31].

Figure 8.12 The copper‐catalysed azide‐alkyne cycloaddition (CuAAC) reaction and its mechanism. (a) The thermal Huisgen reaction producing a mix of 1,4‐ and 1,5‐triazoles; (b) CuAAC reaction, which is solvent and functional group tolerant; (c) picolyl azide chelation of copper, which allows for lower use of Cu(I) concentration; (d) mechanism of the CuAAC reaction [31]; (e) common Cu(I) chelating ligands to reduce toxicity, side reactions and increase the rate of reaction [30].

Although CuAAC chemistry is orthogonal to many biological reactions and is widely used for coupling to peptides and proteins, it cannot be described as bioorthogonal as the Cu(I) catalyst is extremely toxic to cells. Cu(I) is readily chelated by many proteins, which can alter their structure and function, and it can also produce ROSs. Chelating ligands such as TBTA stabilise the Cu(I) and limit its interactions with proteins, while Click‐iT® Plus reagents chelate Cu(I), giving enhanced rates and allowing lower Cu(I) concentrations to be used; both solutions have been shown to mitigate toxicity. However, the best solution to this problem is to remove the need for copper entirely. This has been achieved in recent years through the elegant design of new reagents that contain pre‐installed molecular strain.

Typical procedure for chelation assisted CuAAC [32]

  1. A biomolecule bearing an azide or alkyne (2–100 μM) is dissolved in degassed PBS buffer (100 mM) and alkyne/azide reagent is added in 2–10‐fold molar excess.
  2. CuSO4 (2.5 μl; 20 mM in H2O) and chelating ligand THPTA (5 μl; 50 mM in H2O) are pre‐mixed and added to the protein solution followed by freshly prepared sodium ascorbate (25 μl; 5 mM in H2O).
  3. The solution is mixed, the reaction sealed to prevent oxygen diffusion and left at room temperature for one hour. The reaction can be quenched by addition of excess ethylenediamine tetraacetic acid (EDTA) solution.
  4. Copper removal can be achieved by dialysis with EDTA.
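
The stock volumes in step 2 fix the relative amounts of copper, ligand and reductant. As a rough check, and using only the volumes and stock concentrations quoted above (helper names are hypothetical), the amounts can be worked out as follows:

```python
# Illustrative check of the reagent amounts in step 2; names are hypothetical.

def amount_nmol(volume_ul: float, conc_mM: float) -> float:
    """Amount of substance in nmol, since 1 ul x 1 mM = 1 nmol."""
    return volume_ul * conc_mM

cu_nmol = amount_nmol(2.5, 20.0)         # CuSO4 stock
thpta_nmol = amount_nmol(5.0, 50.0)      # THPTA stock
ascorbate_nmol = amount_nmol(25.0, 5.0)  # sodium ascorbate stock

ligand_to_cu = thpta_nmol / cu_nmol
ascorbate_to_cu = ascorbate_nmol / cu_nmol

print(f"Cu: {cu_nmol:.0f} nmol, THPTA: {thpta_nmol:.0f} nmol, "
      f"ascorbate: {ascorbate_nmol:.0f} nmol")
print(f"THPTA : Cu = {ligand_to_cu:.1f} : 1")
print(f"ascorbate : Cu = {ascorbate_to_cu:.1f} : 1")
```

With these stocks the ligand is present in fivefold excess over copper, consistent with its role in sequestering Cu(I) away from the protein.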

8.5.1 The Strain‐Promoted Azide‐Alkyne Cycloaddition Reaction (SPAAC)

In 1953 it was discovered that cyclooctynes react with azides at room temperature to form triazoles. The reason for this enhanced reactivity over aliphatic alkynes is increased ring strain, which promotes the alkyne‐azide cycloaddition in the absence of a metal catalyst. For cyclooctyne the ring strain has been measured at ∼18 kcal/mol, compared to 12.1 kcal/mol for cyclooctane [33]. In general, ring strain increases as the ring becomes smaller. However, cyclooctyne is the smallest cyclic alkyne known to be stable at ambient temperature.

Bertozzi et al. first demonstrated the potential of cyclooctynes for cell and protein labelling in what is now known as the ‘strain promoted azide‐alkyne cycloaddition’ (SPAAC) reaction, shown in Figure 8.13 [34]. Bertozzi went on to demonstrate that cells decorated with sialic acid azides in their membrane glycoproteins via the metabolic incorporation of peracetylated ManAz could be labelled with a biotin‐cyclooctyne reagent and imaged using an avidin bound fluorophore with no toxic effects to the cells [31].

Figure 8.13 Seminal example of SPAAC reaction on live cells. (a) Azides are incorporated into cell surface glycoproteins by incubation of Jurkat cells with peracetylated N‐azidoacetylmannosamine. SPAAC reaction with a biotin cyclooctyne probe (b) conjugates the biotin probe through a 1,2,3‐triazole (c). Fluorescently tagged avidin is then added and binds to the biotin, fixing the fluorophore to the cell surface [34].

This paper showcased the potential of SPAAC chemistry, but the non‐optimised cyclooctyne reagent used showed extremely slow reaction kinetics. This has led to the development of new ultra‐fast SPAAC reagents where reactivity has been enhanced by reagent design. Dibenzoazacyclooctynes (DIBACs) and bicyclo‐[6.1.0]‐nonynes (BCNs) (Figure 8.14) are two classes of reagents that are now widely used for SPAAC reactions that are 130 and 58 times more reactive, respectively, than the cyclooctyne used by Bertozzi et al. [35]. However, as mentioned previously, an increase in reaction rate is generally accompanied by decreased stability in vivo.

Skeletal formulas for cyclooctyne, DIFO, BCN and DIBAC, with rates of reaction of 2.4 (a), 76 (a), 140 (b) and 310 (b), respectively; (a) and (b) refer to the conditions given in the caption.

Figure 8.14 Bertozzi et al.'s cyclooctyne compound and a selection of commercial cyclooctynes and their rate of reaction with BnN3. Conditions: (a) CD3CN; (b) CD3CN:D2O (3 : 1) [35].
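
The relative reactivities quoted in the text follow directly from the rates shown in Figure 8.14. The short sketch below divides each rate by that of the parent cyclooctyne; it assumes all four values are expressed in the same units and ignores the fact that two solvent systems were used, so the ratios should be read as approximate.

```python
# Relative SPAAC reactivity from the rates shown in Figure 8.14.
# Assumes all four rates share the same units; solvent differences are ignored.

rates = {
    "cyclooctyne": 2.4,
    "DIFO": 76.0,
    "BCN": 140.0,
    "DIBAC": 310.0,
}

reference = rates["cyclooctyne"]
relative = {name: rate / reference for name, rate in rates.items()}

for name, ratio in relative.items():
    print(f"{name}: {ratio:.0f}x the parent cyclooctyne")
```

The ratios for DIBAC and BCN come out at roughly 130 and 58, matching the values given in the text.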

Another limitation of these cyclooctyne reagents is the hydrophobic nature of the cyclooctyne ring, especially in the dibenzo‐fused DIBAC compound, which has led to complications in their use in protein labelling. In particular, these reagents have been shown to partition into the hydrophobic environment of the cell membrane, inhibiting their diffusion into the cytosol and making intracellular targets difficult to visualise during labelling experiments. For this reason, even though DIFO and BCN have slower rates of reaction, they are sometimes preferred to DIBAC.

8.5.2 Tetrazine Ligations

Although the reactions of azides for biological labelling have provided selectivity and allowed genetic incorporation of reactive handles, azides have become the limiting factor in the search for ever faster reactions. Other reactive groups such as nitrones and sydnones have also been shown to react with cyclooctynes, but are also limited by their slow reaction kinetics.

However, 1,2,4,5‐tetrazines also react with strained alkynes and alkenes via an inverse electron demand Diels‐Alder (IEDDA) cycloaddition (Figure 8.15a) and these typically have much faster rates than any other bioorthogonal reaction. Rate constants for cycloadditions with BCNK range between 3 and 1245 M⁻¹ s⁻¹, 10–100 times faster than the DIBAC SPAAC reaction with azides [36] (Figure 8.15b). Reactions with trans‐cyclooctene (TCO) derivatives are truly exceptional, with second‐order rate constants reported up to 10⁶ M⁻¹ s⁻¹. The rate of cycloaddition is dependent on a number of factors, including the electronic nature of the substituents on the tetrazine, the ring strain of the dienophile, sterics and solvent choice (Figure 8.15) [37].
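
To give a feel for what these rate constants mean in a labelling experiment, the sketch below estimates the half‐life of a tagged biomolecule treated with a large excess of tetrazine probe, where the kinetics become pseudo‐first‐order and t½ = ln 2/(k[probe]). The 10 μM probe concentration is an assumption chosen purely for illustration, and the function name is hypothetical.

```python
import math

# Pseudo-first-order half-life of a tagged biomolecule in excess probe.
# The probe concentration is an illustrative assumption, not from the text.

def pseudo_first_order_half_life_s(k_per_M_per_s: float, probe_M: float) -> float:
    """t_half = ln(2) / (k * [probe]) when the probe is in large excess."""
    return math.log(2) / (k_per_M_per_s * probe_M)

PROBE_M = 10e-6  # 10 uM tetrazine probe (assumed)

half_lives = {
    "BCNK, slowest (3 M^-1 s^-1)": pseudo_first_order_half_life_s(3.0, PROBE_M),
    "BCNK, fastest (1245 M^-1 s^-1)": pseudo_first_order_half_life_s(1245.0, PROBE_M),
    "TCO, upper limit (1e6 M^-1 s^-1)": pseudo_first_order_half_life_s(1e6, PROBE_M),
}

for label, t_half in half_lives.items():
    print(f"{label}: t_half = {t_half:.3g} s")
```

At this probe concentration the estimated half‐life falls from several hours for the slowest BCNK pairing to well under a second for the fastest TCO derivatives.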

Figure 8.15 The IEDDA reaction of tetrazines. (a) Mechanism of the tetrazine IEDDA reaction. (b) Rates of reaction for selected alkynes and alkenes with tetrazine Tz [37].

Once again, the high reactivity of trans‐cyclooctene is accompanied by high levels of instability and short half‐lives in biological serum (3.26 hours for TCO). Although the exact mechanism of its deactivation has yet to be confirmed, it has been shown that TCO readily isomerises to the less reactive cis‐cyclooctene in vivo. Further increasing the ring strain of TCO by fusing a cyclopropane (s‐TCO) or dioxolane (d‐TCO) increases the rate of reaction even more and, in the case of d‐TCO, also increases the stability of this reagent in serum for up to four days. For biological labelling studies in more complex environments, cyclopropenes have recently emerged as suitable dienophiles for the IEDDA reaction with tetrazines. Despite their slower reaction kinetics these reagents perform well in whole animals, while also retaining their bioorthogonal properties.

Overall, the exceptional properties of the IEDDA reaction mean that it is the state of the art in bioorthogonal chemistry and has found widespread use in chemical biology, especially in situations where high labelling efficiencies and low reagent toxicity in vivo are of paramount importance.

8.6 Unnatural Amino Acid Incorporation

One of the major advancements that has allowed the concept of bioorthogonal chemistry to flourish is the ability to incorporate unnatural functional groups into living cells. Three main methods are generally used to achieve this: metabolic incorporation, selective pressure incorporation of UAAs in auxotrophic cells, and genetic code expansion using orthogonal translation.

Incorporation of UAAs can be achieved using amino acids that are similar in structure to natural amino acids. Using this approach, incorporation can simply be achieved via the addition of the UAA to the culture medium and relies on the promiscuity of the cognate aminoacyl‐tRNA synthetase. However, the efficiency of this system is greatly improved by making the cells auxotrophic for one or more amino acids, as the activation of UAAs by aminoacyl‐tRNA synthetases is much slower than for natural amino acids.

This was first demonstrated using selenomethionine (Se‐Met) for use in X‐ray crystallography due to its increased diffraction properties relative to L‐Met [38]. This technique has been predominantly used in recombinant protein expression, where cells are first grown in the presence of all 20 natural amino acids and then transferred into media containing 19 amino acids and the UAA before protein expression is induced. However, the survival rate of auxotrophic cells cultured exclusively in the presence of UAA is typically very low, as the proteome‐wide incorporation of the UAA at high concentration can lead to toxicity effects. For this reason, labelling efficiencies are generally very low using this approach.

The substrate scope has been improved by targeting the inherent promiscuity of the methionyl‐tRNA synthetase (Met‐RS), which has allowed the metabolic incorporation of azidohomoalanine (Aha), homopropargylglycine (Hpg) and homoallylglycine (Hag) UAAs (Figure 8.16). These UAAs have been applied in a range of bioconjugation experiments. The metabolic incorporation of unnatural Trp, Phe and Leu‐derived amino acids has also been achieved, but this has required a degree of protein engineering to alter the natural specificity of the respective tRNA synthetase (Figure 8.16) [39].

Skeletal formulas of Se-Met, Aha, Hpg, Hag, Tcg, p-Brf, p-IF, p-AzF, p-EtF, Anl, Aoa, Pra, Onv, and p-AcF. The first four skeletal formulas are enclosed in a box labeled Incorporation by MetRS.

Figure 8.16 UAAs that can be incorporated using selective pressure incorporation. Se‐Met, Aha, Hpg, and Hag can all be incorporated using native MetRS [39].

Perhaps the most elegant approach to UAA incorporation is via genetic code expansion. In this approach, the amber STOP codon (TAG) is repurposed to direct the incorporation of a UAA using an amber‐decoding tRNA (CUA) and an orthogonal aminoacyl tRNA synthetase (aaRS) (Figure 8.17). A suitable orthogonal aaRS can be identified from microorganisms that use non‐canonical amino acids (e.g. the pyrrolysyl aaRS from Methanosarcina mazei) or via directed evolution. Suppression of a target amber codon via this approach therefore allows for the site‐specific incorporation of a UAA at any position in a protein in a genetically directed manner. Although protein yields in these experiments are often lower than via more traditional methods, this technique has undeniable value. Since its initial development in E. coli, this technology has been expanded for use in a variety of microorganisms, mammalian cells, and more recently in whole animals [40].
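
The logic of amber suppression can be illustrated with a toy translation routine. This is a deliberately simplified sketch rather than a model of the real system: the gene sequence, the function name and the use of ‘X’ to denote the UAA are all hypothetical, and competition with release factors, which lowers yields in practice, is ignored.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna: str, amber_suppression: bool = False, uaa_symbol: str = "X") -> str:
    """Translate a coding sequence; optionally decode TAG as a UAA."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon == "TAG" and amber_suppression:
            # The orthogonal aaRS/tRNA(CUA) pair reads the amber codon as the UAA.
            protein.append(uaa_symbol)
            continue
        aa = CODON_TABLE[codon]
        if aa == "*":
            break  # stop codon terminates translation
        protein.append(aa)
    return "".join(protein)

gene = "ATGGCTTAGAAATAA"  # hypothetical gene: Met-Ala-(amber)-Lys-stop

print(translate(gene))                          # truncated product
print(translate(gene, amber_suppression=True))  # full-length product with UAA
```

Without the orthogonal pair the amber codon terminates translation and a truncated product results; with it, the full‐length protein carries the UAA at the chosen site while the TAA stop codon still terminates normally.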

Skeletal formulas of azides (p-AzF, o-AzbK, AzK, and ACPK), terminal alkynes (PrgF, EtcK, AlkK, and AlkK2), strained alkenes (CpK, NorK, TCOK, and TetF), and strained alkynes (CoK1, CoK2, and BCNK).

Figure 8.17 Unnatural amino acids that can be introduced into proteins through: (a) selection pressure incorporation or (b) amber stop codon techniques [37, 39].

While expansion of the genetic code has allowed UAAs to be incorporated into proteinaceous material in a variety of organisms, the stochastic labelling of other cellular structures (e.g. oligosaccharides and fatty acids) requires a different approach. Metabolic incorporation of metabolite mimics bearing unnatural functional groups has been shown to be an excellent way to achieve this. When a cell is incubated with an unnatural, yet structurally related, metabolite, the mimic can be incorporated into the cell metabolome via native metabolic pathways.

Cell‐surface glycans were some of the first substrates to be targeted by this approach. Bertozzi et al. first demonstrated the incorporation of azido sialic acid (Sia) by incubation of Jurkat and HeLa cells with peracetylated N‐azidoacetylmannosamine (ManAz; Figure 8.18 and Figure 8.13) through a de novo biosynthesis pathway [23]. Further work has since incorporated azido derivatives of N‐acetylglucosamine (GlcNAc) and N‐acetylgalactosamine (GalNAc), GlcNAz and GalNAz, respectively, using an appropriate salvage pathway in mammalian cells. Prescher et al. recently expanded the versatility of this approach by incorporating various cyclopropene‐labelled glycans for IEDDA coupling [41]. Surprisingly, Bertozzi et al. have shown that larger groups such as BCN could also be incorporated using this approach [42].

Figure 8.18 Unnatural metabolites for (a) incorporation of azido and cyclopropyl groups into carbohydrates and incorporation of BCN into sialic acid; (b) incorporation of azide and alkyne groups into fatty acids; (c) incorporation of an alkyne into DNA via EdU.

Finally, lipid biosynthesis pathways and nucleic acids have also been targeted for incorporation of unnatural functional groups. Myristoylation and palmitoylation have been studied through the metabolic incorporation of azide and alkyne‐modified fatty acids, and DNA has also been labelled via the metabolic incorporation of 5‐ethynyl‐2′‐deoxyuridine (EdU) in cells.

8.7 Case Studies

8.7.1 Post‐Translational Modification of Histone Tails

Histones are the proteins responsible for packaging DNA inside the nucleus of eukaryotic cells. The histone core is composed of four components called H2A, H2B, H3 and H4, which form the histone octamer, the major repeating unit of chromatin. PTMs are covalent modifications to the side chains of amino acids that alter their chemical properties. There are around 20 confirmed PTMs in histones, most of them on Lys residues, although Ser, Thr, Arg, His and Glu are also modified.

Histone PTMs are thought to act by altering interactions between amino acid side chains on histones and DNA. Acetylation of the Lys ε‐NH2 group is one of the most commonly observed histone PTMs and can be installed on histones by the enzyme histone acetyltransferase (HAT) or removed by histone deacetylases (HDACs). Acetylation converts a positively charged amine (the ε‐NH2 of Lys is protonated at physiological pH) to a neutral amide. This change in charge state is thought to reduce the interaction between DNA and the histone, causing the DNA to be coiled less tightly and leaving it more available for transcription.

The ability to synthesise synthetic or semi‐synthetic histones bearing PTMs at specific sites has greatly facilitated understanding the effect PTMs have and whether PTMs at specific sites act as markers for protein–protein interactions.

This section will compare three methods to generate histone PTMs: through alkylation of cysteine residues, use of NCL to create synthetic proteins and through amber stop codon techniques.

8.7.1.1 Cysteine Alkylation

ThiaLys (shown in Figure 8.19) is a lysine mimic obtained by alkylation of cysteine using 2‐bromoethylamine. Replacing a side‐chain carbon atom with sulphur (giving a thioether) causes a slight bond lengthening (+0.28 Å) and a shift in pKa (−1.1) compared to Lys, but the mimic is still recognised by the lysine‐specific protease trypsin. Shokat et al. employed a cysteine alkylation strategy to synthesise lysine methylation mimics in histone proteins [43]. Modification of Lys residues in histones is thought to control DNA transcription, replication and repair.
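
The practical consequence of the pKa shift can be estimated with the Henderson–Hasselbalch relationship. The sketch below assumes a side‐chain pKa of about 10.5 for Lys and a physiological pH of 7.4; both are illustrative textbook values rather than figures from the study cited above, and the function name is hypothetical.

```python
# Fraction of a side-chain amine that is protonated at a given pH
# (Henderson-Hasselbalch). pKa and pH values are assumed for illustration.

def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of the amine present as the ammonium ion."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

PH = 7.4                     # assumed physiological pH
LYS_PKA = 10.5               # assumed typical Lys side-chain pKa
THIALYS_PKA = LYS_PKA - 1.1  # shift quoted in the text

lys_charged = fraction_protonated(LYS_PKA, PH)
thialys_charged = fraction_protonated(THIALYS_PKA, PH)

print(f"Lys:     {lys_charged:.1%} protonated")
print(f"ThiaLys: {thialys_charged:.1%} protonated")
```

Under these assumptions both side chains remain about 99% protonated, which is consistent with ThiaLys acting as a reasonable charge mimic of Lys.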

Figure 8.19 Alkylation of Cys to give ThiaLys, a lysine mimic, using 2‐bromoethylamine, which reacts with Cys via an aziridine intermediate. By using (methyl)‐2‐bromoethylamine, methyl Lys mimics were synthesised, facilitating the study of the role Lys methylation has in histone–DNA interactions.

A point mutation of Lys to Cys allowed introduction of a single Cys residue into the histones, which were expressed recombinantly in E. coli. Incubation of the histones with bromoethylamines bearing a methylation mark gave synthetic methyl lysine products that have facilitated the study of the role Lys methylation plays. A similar procedure has been used successfully to generate dimethyl and trimethyl Lys mimics [43].

  • Procedure: Lyophilised histones (5–10 mg) were dissolved in alkylation buffer (900 μl, 1 M HEPES pH 7.8, 4 M guanidinium chloride, 10 mM D/L‐methionine). DTT (20 μl, 1 M) was added to the solution and incubated at 37 °C for one hour. Bromoethylamine (100 μl, 1 M in H2O) was added and incubated for two hours in the dark at room temperature. DTT (10 μl, 1 M in H2O) was added followed by bromoethylamine (50 μl, 1 M in H2O) and the reaction incubated for a further two hours at room temperature. The reaction was quenched by addition of β‐mercaptoethanol (50 μl) and purified using a PD‐10 size exclusion column and lyophilised.

8.7.1.2 Native Chemical Ligation (NCL)

NCL and EPL are extremely powerful tools for synthesising modified histones, as most histone modifications are found at the C‐ or N‐terminus. For NCL, sequential ligation and N‐terminal deprotection strategies have been used to introduce histone PTMs. For the synthesis of histone H2A containing a phosphorylated tyrosine, Brik et al. used the method shown in Figure 8.20 [44]. Three peptides were prepared using Fmoc SPPS and the sequences were designed so that each ligation site was at the position of an Ala residue in the native sequence. By doing this the Cys left after ligation could be desulphurised to Ala, leaving the native sequence. The N‐terminal Cys of peptide H2A(49‐86) was protected as a thiazolidine, which can be removed in the presence of [Pd(allyl)Cl]2 and MgCl2. The C‐terminus was activated using an N‐acyl‐benzimidazolinone group to increase the rate of ligation.

  • Procedure: Ligation of fragments H2A(49‐86) and H2A(88‐130) was achieved in the presence of 20 eq. 4‐mercaptophenylacetic acid (MPAA; a catalyst) and 10 eq. of TCEP at pH 7.3 for three hours. Thiazolidine deprotection was achieved using 100 eq. MgCl2 and 15 eq. [Pd(allyl)Cl]2 at 37 °C in one hour. Addition of the third peptide fragment H2A(1‐48) with 75 eq. MPAA and 37.5 eq. TCEP gave complete ligation after a further seven hours. The reaction was quenched with 75 eq. DTT and dialysed overnight. Desulphurisation using 100 eq. VA‐044 and TCEP for six hours gave the product, which was purified by RP‐HPLC, giving the protein in 16% yield over the four steps. For a detailed protocol see Brik et al. [44].

Figure 8.20 One pot sequential NCL reaction, followed by desulphurisation to give a synthetic histone H2AY57p from three peptide fragments prepared using Fmoc SPPS [44].

8.7.1.3 Genetically Encoded Methyl‐Lys Incorporation

The incorporation of a UAA using amber stop codon techniques relies on having a tRNA‐synthetase/tRNACUA pair that is specific for the UAA. A major challenge for the incorporation of Me‐Lys is that, although a synthetase that accepts Me‐Lys is possible, achieving discrimination between Me‐Lys and Lys is difficult since Lys is smaller in size and similarly charged. To circumvent this issue, Chin et al. developed a two‐step method to incorporate Me‐Lys into histones (Figure 8.21). Nε‐tert‐butyloxycarbonyl‐L‐lysine has previously been shown to be incorporated into myoglobin in E. coli using an evolved mutant of the pyrrolysyl‐tRNA synthetase/tRNACUA pair [44]. Using this technique Chin et al. were able to incorporate Nε‐tert‐butyloxycarbonyl‐Nε‐methyl‐L‐lysine (BocMeLys) into a His‐tagged H3K9BocMe, which was purified using a Ni‐affinity column. Boc groups are acid labile and treatment of the protein with 1% TFA for 4 hours at 37 °C gave quantitative removal or ‘deprotection’ of the Boc protecting group, yielding H3K9me1, which was confirmed by electrospray ionisation mass spectrometry (ESI‐MS) and western blot using an anti‐H3K9me1 antibody.
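
When following such a deprotection by ESI‐MS, it helps to know the mass shifts to expect. The sketch below computes them from standard average atomic masses: removing a Boc group corresponds to a net loss of C5H8O2, and a single N‐methyl group adds CH2 relative to unmodified Lys. It illustrates the arithmetic only and does not reproduce the analysis in the cited study; the function and variable names are hypothetical.

```python
# Expected average-mass shifts for Boc removal and Lys monomethylation.
# Standard average atomic masses; names are illustrative only.

AVERAGE_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

def formula_mass(composition: dict) -> float:
    """Average mass (Da) of an elemental composition such as {'C': 5, 'H': 8}."""
    return sum(AVERAGE_MASS[element] * count for element, count in composition.items())

boc_loss = formula_mass({"C": 5, "H": 8, "O": 2})  # N-Boc -> N-H
methyl_gain = formula_mass({"C": 1, "H": 2})       # N-H -> N-CH3

# Mass shifts relative to the unmodified (Lys-containing) protein.
shift_boc_me = methyl_gain + boc_loss  # H3K9BocMe before TFA treatment
shift_me1 = methyl_gain                # H3K9me1 after TFA treatment

print(f"Boc removal: -{boc_loss:.2f} Da")
print(f"H3K9BocMe vs unmodified H3: +{shift_boc_me:.2f} Da")
print(f"H3K9me1 vs unmodified H3:   +{shift_me1:.2f} Da")
```

Complete deprotection should therefore appear as a loss of about 100 Da from the BocMeLys‐containing protein, leaving a product about 14 Da heavier than unmodified H3.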

  • Procedure: Expression of Histone H3K9Boc‐me1 was achieved by transformation of E. coli BL21(DE3) cells with pBKPylS (which encodes the Methanosarcina barkeri pyrrolysyl‐tRNA synthetase, MbPylRS) and pCDF‐PylT‐H3K9TAG (which encodes histone H3 bearing an amber codon at position 9 and an N‐terminal His6‐tag followed by a TEV protease cleavage site sequence, as well as MbtRNACUA on an lpp promoter and rrnC terminator, where the plasmid has a spectinomycin resistance marker). Cells were recovered in 1 ml of SOC media for one hour at 37 °C, before incubation (16 hours, 37 °C, 250 rpm) in 100 ml of 2 × TY containing kanamycin (50 μg/ml) and spectinomycin (70 μg/ml); 25 ml of this overnight culture was used to inoculate 500 ml of 2 × TY supplemented with kanamycin (25 μg/ml), spectinomycin (35 μg/ml) and 2 mM of BocMeLys. Cells were grown (37 °C, 250 rpm) and protein expression was induced at OD600 ∼0.9 by addition of isopropyl β‐D‐1‐thiogalactopyranoside (IPTG) to a final concentration of 1 mM. After five hours of induction, cells were harvested and resuspended in 50 ml of 1 × PBS containing 1 mM DTT, lysozyme (1 mg/ml), DNaseI (100 μg/ml), 1 mM PMSF and Roche protease inhibitor cocktail. The cells were disrupted by sonication. The cell lysates were centrifuged at 17 000 rpm for 20 minutes at 4 °C. The supernatant was discarded and the pellet was retained as the insoluble fraction. The pellet was resuspended in 25 ml of 1 × PBS supplemented with 1 mM DTT and 1% Triton‐X, and centrifuged at 17 000 rpm for 20 minutes at 4 °C. The pellet was subsequently resuspended in 25 ml of 1 × PBS containing 1 mM DTT and centrifuged at 17 000 rpm for 20 minutes at 4 °C. The insoluble fraction was incubated in 350 μl of DMSO for 30 minutes at room temperature and dissolved in 25 ml of 20 mM Tris‐HCl buffer (pH 8.0) containing 6 M guanidinium hydrochloride and 1 mM DTT.
The solution was incubated with vigorous shaking at 37 °C for one hour and centrifuged at 17 000 rpm for 20 minutes at 4 °C. The supernatant was equilibrated with 1 ml of 50% Ni‐NTA beads (Qiagen) for one hour at room temperature. The beads were collected by centrifugation at 2400 rpm for five minutes. The beads were washed with 15 ml of 100 mM sodium phosphate buffer (pH 6.2) containing 8 M urea and 1 M DTT. The protein was eluted with 20 mM sodium acetate buffer (pH 4.5) supplemented with 7 M urea, 200 mM NaCl and 1 mM DTT in 500 μl fractions. The fractions of the purified proteins were analysed by 4–12% sodium dodecyl sulphate‐polyacrylamide gel electrophoresis (SDS‐PAGE). The protein‐containing fractions were combined, dialysed overnight in 1 mM DTT solution and stored at −20 °C. Heterochromatin protein 1 homologue beta (HP1b) from mouse, cloned into pET‐16 (Novagen) expression vector, was expressed in E. coli C41(DE3) and purified by Ni affinity, anion exchange chromatography and gel filtration. For the preparation of monomethylated histones, the protein H3K9boc‐me1 (40 nmol) was incubated with shaking (800 rpm) in 1 ml of 1% TFA for four hours at 37 °C to produce H3K9me1. The protein was rebuffered to 1 mM DTT (1.5 ml) using a sephadex G25 column. The hexahistidine tag was removed by incubating with TEV protease (1.5 mg/ml, 100 μl) in 50 mM Tris buffer (pH 7.4) for five hours at 30 °C and overnight dialysis in 1 mM DTT.

Figure 8.21 Genetic incorporation of a Boc protected methyl lysine analogue using the amber stop codon technique. After protein expression and purification, the Boc group is removed using a low concentration of strong acid, giving native H3K9me1 [45].

8.8 Conclusion

This case study has showcased three different techniques that can be used to produce a histone containing a PTM. The alkylation method has the benefit of being the simplest of the three methods to perform, and it can be easily performed in conjunction with the introduction of a Cys residue via site‐directed mutagenesis. These can be alkylated using commercial reagents without the requirement for specialist chemical handling equipment and knowledge. It also provides a method to generate three different states of Lys methylation. The modification is a mimic of Me‐Lys, however, and the true impact this can have on results generated from these compounds can vary for each individual application.

UAA incorporation is an extremely powerful technique and, in this case, produced a ‘native’ H3K9me1 protein, although it should be noted that the strong acid required to remove the Boc group would not be tolerated by all proteins. This method allows Me‐Lys to be introduced at any point in the sequence, but no examples yet exist of introducing dimethyl‐ or trimethyl‐Lys residues using this technique.

NCL is ultimately the most flexible technique of the three, as almost any PTM can be incorporated into any position. As this technique develops, ligation reactions are becoming higher yielding and faster, and new techniques in SPPS have allowed the use of milder Fmoc strategies that avoid extremely toxic hydrofluoric acid‐mediated deprotection steps. However, each sequence is different and this technique can require extensive optimisation of reaction conditions to achieve acceptable results.

8.8.1 Overall Conclusion

Overall, this chapter has explored a variety of modern methodologies that have been developed in recent years to synthesise, modify and intercept polypeptides and proteins in vitro and in whole organisms. Through years of innovation in the fields of synthetic organic chemistry, chemical biology and synthetic biology, bioconjugation methods now exist that enable the learned practitioner to deploy a suitable methodology for almost any given experimental question. With these tools available and now well understood, the field of chemical biology looks set to remain a powerful approach to exploring the biological sciences.

References

  1. (a) Li, Y. (2011). Protein Expression Purif. 80: 260–267. (b) Palm, C., Jayamanne, M., Kjellander, M., and Hällbrink, M. (2007). Biochim. Biophys. Acta Biomembr. 1768: 1769–1776.
  2. Marcia de Figueiredo, R., Suppo, J.‐S., and Campagne, J.‐M. (2016). Chem. Rev. 116: 12029–12122.
  3. Merrifield, R.B. (1963). J. Am. Chem. Soc. 85: 2149–2154.
  4. (a) Behrendt, R., White, P., and Offer, J. (2016). J. Pept. Sci. 22: 4–27. (b) Paradis‐Bas, M., Tulla‐Puche, J., and Albericio, F. (2016). Chem. Soc. Rev. 45: 631–654. (c) Palomo, J.M. (2014). RSC Adv. 4: 32658–32672. (d) Mäde, V., Els‐Heindl, S., and Beck‐Sickinger, A.G. (2014). Beilstein J. Org. Chem. 10: 1197–1212.
  5. Isidro‐Llobet, A., Álvarez, M., and Albericio, F. (2009). Chem. Rev. 109: 2455–2504.
  6. El‐Faham, A. and Albericio, F. (2011). Chem. Rev. 111: 6557–6602.
  7. Bray, B.L. (2003). Nat. Rev. Drug Discovery 2: 587–593.
  8. Guzman, F., Barberis, S., and Illanes, A. (2007). Electron. J. Biotechnol. 10 (2): 279–314.
  9. Gutte, B. and Merrifield, R.B. (1971). J. Biol. Chem. 246: 1922–1941.
  10. Dawson, P.E., Muir, T.W., Clark‐Lewis, I., and Kent, S. (1994). Science 266: 776–779.
  11. Shah, N.H. and Muir, T.W. (2014). Chem. Sci. 5: 446–461.
  12. Malins, L.R. and Payne, R.J. (2014). Curr. Opin. Chem. Biol. 22: 70–78.
  13. Thompson, R.E., Liu, X., Alonso‐Garcia, N. et al. (2014). J. Am. Chem. Soc. 136: 8161–8164.
  14. Milton, R.C. and Kent, S.B. (1992). Science 256: 1445–1448.
  15. Unverzagt, C. and Kajihara, Y. (2013). Chem. Soc. Rev. 42: 4408–4420.
  16. Kumar, K.S.A., Bavikar, S.N., Spasser, L. et al. (2011). Angew. Chem. Int. Ed. 50: 6137–6141.
  17. Muir, T.W., Sondhi, D., and Cole, P.A. (1998). Proc. Natl Acad. Sci. USA 95: 6705–6710.
  18. Berrade, L. and Camarero, J.A. (2009). Cell. Mol. Life Sci. 66: 3909–3922.
  19. Koniev, O. and Wagner, A. (2015). Chem. Soc. Rev. 44: 5495–5551.
  20. deGruyter, J.N., Malins, L.R., and Baran, P.S. (2017). Biochemistry 56: 3863–3873.
  21. Chen, D., Disotaur, M.M., Xiong, C. et al. (2017). Chem. Sci. 8: 2717–2722.
  22. Chalker, J.M., Bernardes, G.J.L., Lin, Y.A., and Davis, B.G. (2009). Chem. Asian J. 4: 630–640.
  23. Chalker, J.M., Gunnoo, S.B., Boutureira, O. et al. (2011). Chem. Sci. 2: 1666–1676.
  24. Wright, T.H., Bower, B.J., Chalker, J.M. et al. (2016). Science 354: aag1465.
  25. Sletten, E.M. and Bertozzi, C.R. (2009). Angew. Chem. Int. Ed. 48: 6974–6998.
  26. Saxon, E. and Bertozzi, C.R. (2000). Science 287: 2007–2010.
  27. (a) Saxon, E., Armstrong, J.I., and Bertozzi, C.R. (2000). Org. Lett. 2: 2141–2143. (b) Nilsson, B.L., Kiessling, L.L., and Raines, R.T. (2000). Org. Lett. 2: 1939–1941.
  28. Schilling, C.I., Jung, N., Biskup, M. et al. (2011). Chem. Soc. Rev. 40: 4840–4871.
  29. Meldal, M. and Tornøe, C.W. (2008). Chem. Rev. 108: 2952–3015.
  30. McKay, C.S. and Finn, M.G. (2014). Chem. Biol. 21: 1075.
  31. Worrell, B.T., Malik, J.A., and Fokin, V.V. (2013). Science 340: 457–460.
  32. Presolski, S.I., Hong, V.P., and Finn, M.G. (2011). Curr. Protoc. Chem. Biol. 3: 153–162.
  33. Bach, R.D. (2009). J. Am. Chem. Soc. 131: 5233–5243.
  34. Agard, N.J., Prescher, J.A., and Bertozzi, C.R. (2004). J. Am. Chem. Soc. 126: 15046–15047.
  35. Dommerholt, J., Rutjes, F.P.J.T., and van Delft, F.L. (2016). Top. Curr. Chem. 374: 16.
  36. (a) Chen, W., Wang, D., Dai, C. et al. (2012). Chem. Commun. 48: 1736–1738. (b) Lang, K., Davis, L., Wallace, S. et al. (2012). J. Am. Chem. Soc. 134: 10317–10320.
  37. Oliveira, B.L., Guo, Z., and Bernardes, G.J.L. (2017). Chem. Soc. Rev. 46: 4895–4950.
  38. Barton, W.A., Tzvetkova‐Robev, D., Erdjument‐Bromage, H. et al. (2009). Protein Sci. 15: 2008–2013.
  39. Lang, K. and Chin, J.W. (2014). Chem. Rev. 114: 4764–4806.
  40. (a) Schmied, W.H., Elsässer, S.J., Uttamapinant, C., and Chin, J.W. (2014). J. Am. Chem. Soc. 136: 15577–15583. (b) Davis, L. and Greiss, S. (2018). Genetic Encoding of Unnatural Amino Acids in C. elegans. In: Noncanonical Amino Acids, Methods in Molecular Biology, vol. 1728 (ed. E. Lemke). New York, NY: Humana Press.
  41. Patterson, D.M., Nazarova, L.A., Xie, B. et al. (2012). J. Am. Chem. Soc. 134 (45): 18638–18643.
  42. Agarwal, P., Beahm, B.J., Shieh, P., and Bertozzi, C.R. (2015). Angew. Chem. Int. Ed. 54 (39): 11504–11510.
  43. Simon, M.D., Chu, F., Racki, L.R. et al. (2007). Cell 128: 1003–1012.
  44. Maity, S.K., Jbara, M., Mann, G. et al. (2017). Nat. Protoc. 12: 2293–2322.
  45. Nguyen, D.P., Garcia Alai, M.M., Kapadnis, P.B. et al. (2009). J. Am. Chem. Soc. 131: 14194–14195.

Further Reading

  1. Algar, W.R., Dawson, P., and Medintz, I.L. (2017). Chemoselective and Bioorthogonal Ligation Reactions: Concepts and Applications. Wiley VCH.
  2. Dobson, C.M., Gerrard, J.A., and Pratt, A.J. (2002). Foundations of Chemical Biology. USA: Oxford University Press.
  3. Hermanson, G.T. (2013). Bioconjugate Techniques. Academic Press.
  4. Jones, J. (2002). Amino Acid and Peptide Synthesis. USA: Oxford University Press.