16
Principles and Practice in Macromolecular X‐Ray Crystallography

Arnaud Baslé and Richard J. Lewis

Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK

16.1 Significance and Short Background

To understand the function of proteins we need to observe them on the atomic scale in 3D. Proteins can be as small as 5 nm and even large complexes, like ribosomes, are only 30 nm in diameter. The length of covalent bonds in proteins, ∼0.15 nm, is simply too short to be observed using the visible portion, 400–700 nm, of the electromagnetic spectrum. A form of illumination has to be used that is better matched to the dimensions of the object under study, and that means X‐rays. It was first established over a century ago that X‐rays could be used to determine molecular structures and X‐rays have since been applied to molecules ranging in size and complexity from table salt to the ribosome, the molecular machine powering protein synthesis in all cells. The first protein structures, myoglobin and haemoglobin, were solved ∼60 years ago. There were just 13 structures in 1976 and 1994 was the first year that over 1000 new structures were deposited in a calendar year. At the time of writing there are over 132 000 PDB entries based on X‐ray data. The explosion in crystallographic analysis can be traced to the development of molecular biology in the 1980s, personal computing and third generation synchrotron light sources in the 1990s.

Our molecular understanding of the fundamentals of life (DNA replication and transcription; RNA translation and protein synthesis; trans‐membrane trafficking and transport; ATP generation and metabolism) has depended upon the application of crystallography. X‐ray crystallography not only provides critical information on the structure, function and mechanism of proteins but also underpins the molecular understanding of disease, and drug discovery and development. Of course, there are other methods that can be applied to the problem of protein structure determination: cryo‐electron microscopy (cryoEM) is currently in vogue for its applicability to large proteins and macromolecular complexes, and X‐ray free electron laser (XFEL) facilities are likely to revolutionise structural biology because of their time‐resolved illumination sources six orders of magnitude brighter than synchrotrons. Here we concern ourselves only with X‐ray crystallography, with a focus on practical application.

16.2 Theory and Principles: Overview

Protein crystallography is underwritten by physics and mathematics, but nowadays most practitioners are life scientists who seek to understand biological phenomena. X‐rays are used as their wavelengths are better matched to the lengths of covalent bonds in proteins, which are simply too short to be seen by white light. However, there is no microscope that can focus X‐rays to provide the resolution required for determining atomic positions in proteins. The scattered X‐rays from the electron orbitals of atoms in an individual protein molecule are too weak to be measured but the scattering signal can be amplified by measuring diffraction from protein crystals. The recorded diffraction results from the interaction of the incident X‐ray beam with electrons from all the protein's atoms, averaged across all the molecules in the path of the X‐ray beam. For instance, tens of thousands of these measurements, called reflections, are required to solve the structure of a 43 kDa protein, which typically contains ∼3000 non‐hydrogen atoms. The very high data‐to‐parameter ratio of X‐ray crystallography is one of the reasons why it is a very powerful and robust technique. The diffracted reflections are X‐ray waves and thus contain three essential elements for calculating an electron density map: (i) the wavelength of the incident beam; (ii) the amplitude of the wave, which is calculated directly from the measured intensities; (iii) the phase of each wave. However, the phase information cannot be recorded directly and has to be determined experimentally or inferred from related structure determinations. The loss of the phase information is problematic because the phase is an exponential term in the mathematical equation that defines the electron density map and as such it dominates the equation product. Only once the phase of each reflection is determined or inferred can an electron density map be calculated from which an atomic model can be built (Figure 16.1). 
Therefore, each model deposited at the PDB is an interpretation of its electron density. The model may contain errors, some of them substantial, but the data collected are a physical phenomenon resulting from the exposure of the crystal to X‐rays. Modern tools allow careful inspection and validation of both model and data to avoid misinterpretation.

Figure 16.1 Schematic of experimental setup. The typical setup comprises an X‐ray source, a diffractometer, a beam stop and a detector, all under computer control. The reflections are indexed, integrated and scaled, and once the phase problem is solved, a model can be built and refined against the collected data.

16.2.1 Basics of Crystallisation

Protein crystallisation was first developed in the late nineteenth century as a method to demonstrate protein purity, and crystallisation is still used during the chemical purification of organic compounds. However, most proteins do not tolerate the extremes of temperature, pH and solvents used for crystallising small molecules and predicting how any protein will crystallise is effectively impossible.

A crystal is an ordered repetition of a building block called the asymmetric unit, which is the smallest packing arrangement of the protein before crystallographic symmetry operators are applied to generate the effectively infinite crystal lattice. The packing of protein molecules to build a three‐dimensional crystal is achieved by a series of specific salt bridges, hydrogen bonds and van der Waals' contacts between protein molecules. It is generally accepted that proteins destined for crystallisation should be as chemically pure as possible, which is trivial to assess by SDS polyacrylamide gel electrophoresis and mass spectrometry. The chances of growing crystals are improved by reducing conformational heterogeneity, though this is harder to monitor and achieve. Occasionally the presence of serendipitous impurities in the preparation or deliberately included ligands will aid crystal lattice formation but, on the whole, impurities decrease crystal quality by intercalating between asymmetric units, which may hinder crystal growth completely. It is important to have a monodisperse (composed of particles of uniform size), stable sample as crystallisation can take up to a few months. Proteins that fold poorly, have low thermal stability or are prone to degradation are less likely to crystallise. Time invested in protein quality control (e.g. circular dichroism for protein folding, thermal melts for protein stability) can save time overall. Protein flexibility, instability, denaturation and polydispersity are most likely to hinder crystallisation.

16.2.1.1 Crystallisation

Crystallisation is a two‐step process. First, a supersaturated solution of the molecule must form and be induced to form critical nuclei, microscopic clusters of the molecule in solution. Second, the nuclei must grow into a three‐dimensional crystal by the deposition of additional molecules on to the growing surface. The formation of nuclei is a thermodynamic process [1] that occurs once the protein reaches supersaturation, but exceeding the supersaturation limit results in protein precipitation. Supersaturation can be achieved by mixing a highly concentrated solution of the protein with crystallisation solutions, usually at a 1 : 1 ratio, and this mixture is left to equilibrate. In the case of vapour diffusion, the most popular protein crystallisation technique, the concentrations of the components of the crystallisation drop reach equilibrium with those in the crystallisation well solution. Water is drawn from the drop to the well solution and eventually the protein reaches supersaturation (Figure 16.2).
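
As a rough illustration of the equilibration described above, the following sketch estimates how far a vapour diffusion drop shrinks and how much the protein concentrates. It assumes, as simplifications that are not part of the text, that only water leaves the drop, that the reservoir is so much larger than the drop that its precipitant concentration stays constant, and that equilibrium is reached when the precipitant concentration in the drop matches the reservoir. The function name and the example values are hypothetical.

```python
def equilibrate_drop(protein_mg_ml, reservoir_conc, protein_ul=1.0, reservoir_ul=1.0):
    """Idealised end point of a vapour diffusion drop.

    Assumes only water leaves the drop and that the reservoir is large
    enough for its precipitant concentration to stay constant.
    """
    start_volume = protein_ul + reservoir_ul
    # Mixing dilutes both the protein and the precipitant.
    start_protein = protein_mg_ml * protein_ul / start_volume
    start_precipitant = reservoir_conc * reservoir_ul / start_volume
    # Water leaves until the drop precipitant matches the reservoir.
    shrink_factor = start_precipitant / reservoir_conc
    return {
        "start_protein_mg_ml": start_protein,
        "final_volume_ul": start_volume * shrink_factor,
        "final_protein_mg_ml": start_protein / shrink_factor,
    }


# Hypothetical experiment: 10 mg/ml protein mixed 1:1 with a 2.0 M salt reservoir.
result = equilibrate_drop(protein_mg_ml=10.0, reservoir_conc=2.0)
print(result)
```

Under these assumptions a 1 : 1 drop ends at roughly half its starting volume, so the protein returns to approximately its stock concentration while the precipitant reaches the reservoir concentration.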

Figure 16.2 Phase diagram. The solid lines delineate the protein and precipitant concentration required to reach the metastable zone and the concentration of either should be increased to migrate from the undersaturated region (1) to the nucleation zone (2) in order to obtain crystals. The grey, shaded region delineates the precipitation zone to avoid where protein will form amorphous precipitates. Crystals grow in size in the path from (2) to (3) and when this point is reached crystal growth will stop (3).

There are three variants of vapour diffusion, hanging, sitting and sandwich, where water equilibrates between the drop and reservoir (Figure 16.3). Crystals can be obtained by dialysis where water and small molecule components of the crystallisation solution equilibrate between two compartments separated by a semi‐permeable membrane. In the batch method, the crystallisation reagent and protein mixture is covered by oil and water molecules evaporate slowly through the oil. Proteins can also be crystallised in capillaries where the protein and crystallisation solution are not mixed but are just in contact and the two solutions counterdiffuse; somewhere in this dynamic arrangement a supersaturated zone may form. For a review of other common methods, see McPherson and Gavira [2]. Finally, crystallisation of membrane proteins forms a distinct challenge as substantial parts of these protein surfaces are highly hydrophobic as they would ordinarily be embedded in a lipid membrane. Membrane proteins therefore need to be solubilised and crystallised in the presence of detergents, or lipidic cubic phases, introducing further important variables to crystallisation. Membrane protein crystallisation has been reviewed recently by Parker and Newstead [3].

Figure 16.3 Sitting and hanging drop vapour diffusion. Protein and reservoir are mixed and are represented here as a dark drop. The mixture is left to either sit on a pedestal or hang by surface tension on a cover slip and the experiment is sealed with either tape or a cover slip. Over time water evaporates from the drop to equilibrate with the reservoir, represented in light grey, and the drop shrinks in volume.

16.2.2 Basics of X‐Ray Crystallography

A single protein molecule in solution will diffract X‐rays but will generate a scattering signal below the detection capabilities of any detector; even if a suitably sensitive detector did exist, the protein would likely be destroyed by absorbing the applied radiation before a useful signal was recorded. Protein crystals thus amplify the scattering signal and mitigate the damaging effects of the radiation. Whilst crystals of small molecules (Mr ∼ < 500 Da) typically do not contain solvent, proteins are highly solvated molecules and consequently macromolecular crystals have solvent contents in the 20–80% range, which can be exploited in soaking experiments of inhibitors, effectors and other ligands, or compounds for experimental phasing. High solvent content is advantageous in phase improvement strategies but creates problems in cryocrystallography because of potential ice formation.

In order to interpret the X‐ray data, we must first understand the principles by which molecules assemble into crystalline three‐dimensional repeating arrays. The unit cell (Figure 16.4), the smallest volume to describe a crystal lattice, repeats identically in three dimensions to form the crystal and is characterised by three distances (measured in Å) a, b, c and three angles α, β, γ (measured in °). There are 230 unique ways that molecules can be arranged symmetrically within a unit cell and still obey the translational symmetry in all three directions. Hence there are 230 possible individual space groups that describe the symmetry within the unit cell. If the space group of the crystal cannot be determined correctly, the crystal structure cannot be solved. Each space group is formed by a unique combination of 7 lattice systems, 14 Bravais lattices and 32 crystallographic point groups. Each of these terms is described briefly in the next two paragraphs, but for a comprehensive description the reader should consult the International Tables for Crystallography [4].
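
The six unit cell parameters fix the cell volume, which is needed in several later calculations. The sketch below uses the standard general (triclinic) volume formula; the function name is illustrative and the example cells are hypothetical.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic angstroms from lengths (angstroms) and angles (degrees)."""
    cos_a, cos_b, cos_g = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # General triclinic formula; it reduces to a*b*c when all angles are 90 degrees.
    return a * b * c * math.sqrt(
        1.0 - cos_a**2 - cos_b**2 - cos_g**2 + 2.0 * cos_a * cos_b * cos_g
    )


# Hypothetical cells: an orthogonal cell and a cell with gamma = 120 degrees.
print(unit_cell_volume(60.0, 80.0, 90.0, 90.0, 90.0, 90.0))
print(unit_cell_volume(10.0, 10.0, 10.0, 90.0, 90.0, 120.0))
```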

Figure 16.4 Unit cell parameters. The lengths (a, b, c) and angles (α, β, γ) of a typical unit cell are represented.

Symmetry rules define the building block of each unit cell, called the asymmetric unit, though the asymmetric unit may contain more than one copy of the molecule. The seven lattice systems impose symmetry rules on the crystal in defining its space group. These lattice systems include triclinic, which imposes no symmetry restraints on the unit cell parameters, and tetragonal, with one fourfold rotation axis, requiring that unit cell dimensions a = b and all three angles are 90°. The most complex lattice system is cubic, in which a = b = c and all three angles are 90°. The 14 Bravais lattices are formed by combining a lattice system with fixed lattice points, or origins, found at the corners of each unit cell only, called primitive (or P for short). Additional lattice points can be accommodated at the centre of each face of the unit cell, called face‐centred (F), at the centre of the unit cell, called body‐centred (I) or on one parallel face, called C‐centred (C). Crystallographic point groups are a set of symmetry operators applied to the asymmetric unit to generate the unit cell and leave a lattice point fixed while moving other parts of the crystal to symmetry‐equivalent positions. Other than in the triclinic space group P1, each unit cell comprises more than one asymmetric unit. In the tetragonal space group P422 there are 8 asymmetric units and in one of the most extreme cases, the cubic space group F432, there are 96.

Four symmetry operators define point groups: rotation and screw axes, mirror and glide planes. However, because amino acids (except glycine) contain chiral centres, mirror and glide planes are incompatible with chiral protein crystals and consequently there are just 65 macromolecular space groups. Rotation axes are n‐fold where n is 2, 3, 4 or 6, because only these rotations can be applied to generate infinite repeats with no gaps, called the crystallographic restriction. Crystallographic and macromolecular symmetry axes can coincide: OmpF crystallises with one molecule in the asymmetric unit of the trigonal space group P321 (PDBid 2OMF) and the biological, trimeric structure of OmpF is reassembled by application of the crystallographic threefold axis (Figure 16.5). Macromolecules with symmetries distinct from two‐, three‐, four‐, and sixfold can still crystallise: GroEL contains sevenfold symmetry [5] and the RNA‐binding protein TRAP has 11‐fold symmetry [6]. The incompatibility of these symmetries with the crystallographic restriction means that entire assemblies are present within crystallographic asymmetric units. Screw axes combine rotations of 360°/n with translations along the crystallographic axis and are denoted with a subscript number related to the fraction of the unit cell dimension of the translation. For instance, the space group P2₁ means that there is one twofold axis of rotation combined with a translation of half a unit cell length along the same crystallographic axis.
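
A symmetry operator can be written as a rotation matrix plus a translation acting on fractional coordinates. The sketch below applies the twofold screw axis of P2₁, assuming the conventional setting with the screw axis along b (operator −x, y + 1/2, −z); the function name and the test coordinate are illustrative.

```python
from fractions import Fraction


def apply_operator(rotation, translation, xyz):
    """Apply x' = R x + t in fractional coordinates and wrap back into the unit cell."""
    return tuple(
        (sum(r * x for r, x in zip(row, xyz)) + t) % 1
        for row, t in zip(rotation, translation)
    )


# Twofold screw axis along b (assumed conventional setting of P2_1).
SCREW_ROTATION = ((-1, 0, 0), (0, 1, 0), (0, 0, -1))
SCREW_TRANSLATION = (0, Fraction(1, 2), 0)

atom = (Fraction(1, 10), Fraction(1, 5), Fraction(3, 10))
mate = apply_operator(SCREW_ROTATION, SCREW_TRANSLATION, atom)
back = apply_operator(SCREW_ROTATION, SCREW_TRANSLATION, mate)
print(mate, back)
```

Applying the screw operator twice moves the atom by one full unit cell along b, which is the same position after wrapping; this is why the translation must be exactly half a cell length for a twofold screw.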

Figure 16.5 OmpF (PDBid 2OMF) unit cell construction. In this example OmpF has crystallised in space group P321. The unit cell represented in this figure contains six asymmetric units. Only one molecule (black) is built in the asymmetric unit and five symmetry operations permit the building of the unit cell. The threefold molecular axes are marked with a solid triangle and are parallel to the crystallographic threefold axis, going into the face of the page.

16.2.2.1 Bragg's Law

Father and son, William Henry and William Lawrence Bragg, are generally viewed as the Godfathers of crystallography and shared the Nobel Prize for Physics in 1915 – when Lawrence was just 25 years old. The Braggs observed that crystalline materials diffracted X‐rays in an ordered manner and William Lawrence proposed that crystals could be viewed as a set of parallel planes separated by the distance d. The electrons interfering with incident X‐rays would produce constructive interference only if the phase shift of waves reflected by different planes was a multiple of 2π. William Henry built a diffractometer to record each reflection on a photographic plate enabling atomic distances in simple salt crystals, such as NaCl, to be determined. Bragg's law is normally expressed as

$$n\lambda = 2d\sin\theta$$

where d is the interplanar distance, θ is the incident or scattering angle, n is a positive integer and λ is the wavelength of the incident beam (Figure 16.6). Note that Bragg's law also applies to electron and neutron diffraction, such is its applicability to the fundamentals of diffraction.
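
Bragg's law can be evaluated directly in either direction. The following minimal sketch converts between interplanar spacing and scattering angle; the function names are illustrative, and distances and wavelengths are assumed to share the same unit (Å here).

```python
import math


def bragg_angle_deg(d_spacing, wavelength, order=1):
    """Scattering angle theta (degrees) satisfying n*lambda = 2*d*sin(theta)."""
    sin_theta = order * wavelength / (2.0 * d_spacing)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("no diffraction possible: n*lambda exceeds 2*d")
    return math.degrees(math.asin(sin_theta))


def d_spacing_from_angle(theta_deg, wavelength, order=1):
    """Interplanar spacing d for a reflection observed at angle theta."""
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


theta = bragg_angle_deg(d_spacing=2.0, wavelength=1.0)
print(theta, d_spacing_from_angle(theta, wavelength=1.0))
```

Because sin θ cannot exceed 1, the smallest spacing that can be measured at a given wavelength is λ/2.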

Figure 16.6 Bragg's law (nλ = 2d sin θ). Three lattice planes are represented with the interplanar distance d. X‐rays of wavelength λ are represented as a dotted line with an incident angle θ. For the diffracted X‐rays to arrive in phase on the detector, the extra distance travelled by the lower of the two reflections must be an integral number of wavelengths, nλ, which must equal 2d sin θ.

Waves that traverse larger interplanar spacings are diffracted by smaller θ angles and are usually observed towards the centre of the detector. These waves convey low resolution information on larger features such as the overall solvent distribution and the protein envelope. These low resolution reflections are essential for the structure solution process and have higher than average intensities. Reflections with high θ angles are observed towards the perimeter of the detector due to the small interplanar spacing d (Figure 16.7). These reflections carry information about spatially close features and are the high resolution data essential for building an atomic model of the protein. However, the high resolution reflections can be difficult to measure accurately over background and are lost first if the crystal suffers from radiation damage.
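
The link between position on the detector and resolution follows from Bragg's law. The sketch below assumes a flat detector perpendicular to the beam, with the direct beam at the detector centre, so that a spot at radius r and crystal‐to‐detector distance D has a scattering angle 2θ = arctan(r/D). The function name and geometry values are hypothetical.

```python
import math


def resolution_at_radius(radius_mm, distance_mm, wavelength):
    """Resolution d (same unit as wavelength) for a spot at a given detector radius."""
    two_theta = math.atan2(radius_mm, distance_mm)
    return wavelength / (2.0 * math.sin(two_theta / 2.0))


# Hypothetical geometry: 200 mm crystal-to-detector distance, 1.0 angstrom X-rays.
for radius in (20.0, 100.0, 200.0):
    print(radius, round(resolution_at_radius(radius, 200.0, 1.0), 2))
```

Spots close to the beam centre correspond to large d (low resolution) and spots towards the edge to small d (high resolution), as described above.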

Figure 16.7 Diffraction pattern. (a) Macromolecular diffraction recorded on a Pilatus 6M detector (0.5° oscillation). Two rings have been displayed with the corresponding resolution. The slightly darker background outside of the 4.89 Å ring is due to the solvent scatter between 3.7 and 2.5 Å resolution. (b) Magnification of panel (a) with contrast adjustment; a barely visible reflection at 1.4 Å is circled. (c) The sample mounted in a nylon loop with the beam shape (80 μm × 20 μm) and its position marked with an ellipse. Please note that this dataset was integrated successfully to 1.18 Å, with an I/σ(I) of 1.4 and a CC1/2 of 0.6 for the highest resolution shell (1.2–1.18 Å).

The pattern of reflections recorded is derived from the diffraction of X‐rays from points in the reciprocal lattice, a coordinate system related to real space by the following operations:

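$$\mathbf{a}^* = \frac{\mathbf{b}\times\mathbf{c}}{V},\qquad \mathbf{b}^* = \frac{\mathbf{c}\times\mathbf{a}}{V},\qquad \mathbf{c}^* = \frac{\mathbf{a}\times\mathbf{b}}{V},\qquad V = \mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})$$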

We record the intensity of each reflection on the detector in real terms (i.e. a detector coordinate in x and y, and an intensity for each pixel) from which their coordinates in reciprocal space (h, k, l), called Miller indices, can be determined during the indexing and initial processing of the diffraction data. In short, low resolution reflections have low Miller indices and higher resolution reflections have one or more high Miller indices.
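
The relationship between Miller indices and resolution is easy to see for a unit cell with all angles equal to 90°, where 1/d² = h²/a² + k²/b² + l²/c². The sketch below is restricted to that orthogonal case for simplicity; the function name and cell dimensions are hypothetical.

```python
import math


def d_spacing_orthogonal(h, k, l, a, b, c):
    """Resolution of reflection (h, k, l) for a cell with all angles equal to 90 degrees."""
    inv_d_squared = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    if inv_d_squared == 0:
        raise ValueError("the (0, 0, 0) reflection has no finite d-spacing")
    return 1.0 / math.sqrt(inv_d_squared)


# Hypothetical orthogonal cell of 50 x 60 x 70 angstroms.
for hkl in [(1, 0, 0), (2, 3, 1), (25, 10, 30)]:
    print(hkl, round(d_spacing_orthogonal(*hkl, 50.0, 60.0, 70.0), 2))
```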

The Ewald sphere (Figure 16.8) is a geometrical construction that helps to explain diffraction. The crystal is located at the centre of the Ewald sphere, which has a radius of 1/λ, where λ is the radiation wavelength. Diffraction can only be recorded when reciprocal lattice points coincide with the sphere surface, but only a small number of points fulfil this condition at any given time. To record the data required to calculate the structure of the crystallised molecule it is necessary to rotate the crystal around a rotation axis whilst it is illuminated by the X‐ray beam.
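
The Ewald construction can be turned into a simple numerical test. In the sketch below the crystal sits at the sphere centre, the beam travels along +z, and the reciprocal lattice origin lies where the beam leaves the sphere; a reciprocal lattice point diffracts when its distance from the sphere centre equals 1/λ. The function names, the tolerance and the choice of a rotation axis along y are assumptions made for illustration.

```python
import math


def excitation_error(recip_point, wavelength):
    """Distance of a reciprocal lattice point from the Ewald sphere surface (1/angstrom)."""
    radius = 1.0 / wavelength
    x, y, z = recip_point
    # The sphere centre lies at (0, 0, -radius) relative to the reciprocal lattice origin.
    return math.sqrt(x * x + y * y + (z + radius) ** 2) - radius


def is_diffracting(recip_point, wavelength, tolerance=1e-6):
    return abs(excitation_error(recip_point, wavelength)) <= tolerance


def rotate_about_y(point, angle_deg):
    """Rotate a reciprocal lattice point about the y axis, mimicking crystal rotation."""
    x, y, z = point
    ang = math.radians(angle_deg)
    return (x * math.cos(ang) + z * math.sin(ang), y, -x * math.sin(ang) + z * math.cos(ang))


start = rotate_about_y((1.0, 0.0, -1.0), -30.0)
print(is_diffracting(start, 1.0), is_diffracting(rotate_about_y(start, 30.0), 1.0))
```

The point is off the sphere in its starting orientation and is brought into the diffracting condition only after the crystal has been rotated, which is why a rotation series is needed to record a complete dataset.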

Figure 16.8 Ewald sphere. The Ewald sphere is a sphere with a radius of 1/λ centred on the crystal. The reciprocal lattice is represented with its origin (0, 0, 0) where the incident beam crosses the Ewald sphere. Only the points where the surface of the Ewald sphere coincide with a reciprocal lattice point (e.g. (−2, 0, 2) and (−2, 0, –2)) will result in a reflection recorded on the detector.

The symmetry relationships within the unit cell dictate that symmetry‐related reflections exhibit the same intensities. Therefore, symmetry within the crystal reduces the quantity of data needed to solve the structure. Some special relationships require further explanation. Friedel pairs are reflections that are related by inversion through the origin such that the intensity of any reflection, I(h,k,l), is normally equal to the intensity of its Friedel pair, I(−h,−k,−l). Friedel's law breaks down in the presence of heavier atoms (S, Se or any transition metal) that are anomalous scatterers when the energy of the incident radiation is at or above an atomic absorption edge of the heavy atom. Other reflections may be related to one of the Friedel pairs, for example in the presence of a twofold symmetry axis; such reflections are called Bijvoet pairs and can be used to strategically orient the crystal for data collection.

Finally, all X‐ray detectors are capable of recording the intensity of the diffracted X‐rays, which can be considered as wave forms. A simple wave is a sinusoidal function (Figure 16.9) and though diffracted X‐rays are complex waves, they are repetitive. The mathematical function that relates the electron density of the crystallised molecule to the diffracted waves is the Fourier transform, named after the eighteenth century French mathematician Joseph Fourier, and the operation from reciprocal space to real space is an inverse Fourier transform. In brief, two parameters are needed for the Fourier transform. The first is the amplitude of the wave, which is proportional to the square root of the intensity, and the second is the phase, which is the angle at which the wave peaks.
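
A one‐dimensional toy model illustrates why both amplitude and phase are needed. The sketch below places two point atoms in a 1D unit cell, calculates their structure factors, and then performs a Fourier synthesis twice: once with the true phases and once with the amplitudes alone (all phases set to zero). The point‐atom model, the sign convention, the function names and the atom positions are assumptions made for illustration, and the density is on an arbitrary scale.

```python
import cmath
import math


def structure_factors_1d(atoms, h_max):
    """Structure factors F(h) for point atoms given as (fractional x, weight)."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }


def density_1d(factors, x):
    """Fourier synthesis of the density at fractional position x (arbitrary scale)."""
    total = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
    return total.real


atoms = [(0.2, 6.0), (0.7, 8.0)]
factors = structure_factors_1d(atoms, h_max=10)

# What the detector gives us: intensities, hence amplitudes, but no phases.
amplitudes = {h: abs(F) for h, F in factors.items()}
zero_phase = {h: complex(a, 0.0) for h, a in amplitudes.items()}

for x in (0.0, 0.2, 0.45, 0.7):
    print(x, round(density_1d(factors, x), 1), round(density_1d(zero_phase, x), 1))
```

With the true phases the synthesis peaks at the atomic positions (0.2 and 0.7); with the phases discarded the strongest peak moves to the origin and the atomic positions are no longer recovered.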

Figure 16.9 Sine wave. The amplitude and the wavelength are represented on a typical sine wave.

16.2.3 Scaling

Symmetry‐equivalent reflections may not experience the same environment – for instance, variations in crystal quality, shape and size, and the path of the X‐ray beam through the loop and cryoprotectant can affect the signal‐to‐noise ratio of the recorded reflections. The crystal may have been centred poorly, and the X‐ray dose received may vary during the experiment. Many modern detectors are mosaic assemblies and each tile in the detector can have slight variations in sensitivity. The beam flux may also reduce during the time of the data collection. Therefore, each reflection should be recorded multiple times before merging, averaging and outputting a single mean intensity, a process called scaling. Data for multidataset techniques also need to be on the same relative scale and cross‐dataset scaling must be done after internal scaling has been completed. The amplitude of each reflection, which is proportional to the square root of its intensity, needs to be calculated to obtain the structure factors necessary for Fourier syntheses. Typically, 5% of the measured data are randomly flagged at the end of scaling for cross‐validation; this subset is used to calculate Rfree, which provides an unbiased metric of model quality.
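
The merging and flagging steps at the end of scaling can be sketched as follows. This is a deliberately simplified illustration, not the algorithm of any particular scaling program: it assumes the observations have already been placed on a common scale and reduced to the asymmetric unit, takes an unweighted mean, and crudely clips negative mean intensities to zero before taking the square root. All names and values are hypothetical.

```python
import math
import random
from collections import defaultdict


def merge_observations(observations):
    """Average repeated measurements of each reflection and derive an amplitude."""
    grouped = defaultdict(list)
    for hkl, intensity in observations:
        grouped[hkl].append(intensity)
    merged = {}
    for hkl, values in grouped.items():
        mean_intensity = sum(values) / len(values)
        merged[hkl] = {
            "intensity": mean_intensity,
            # Amplitude is proportional to the square root of the intensity.
            "amplitude": math.sqrt(max(mean_intensity, 0.0)),
            "multiplicity": len(values),
        }
    return merged


def assign_free_flags(hkls, fraction=0.05, seed=0):
    """Randomly flag a fraction of the unique reflections for cross-validation."""
    rng = random.Random(seed)
    unique = sorted(hkls)
    n_free = max(1, round(fraction * len(unique)))
    return set(rng.sample(unique, n_free))


observations = [((1, 0, 0), 100.0), ((1, 0, 0), 96.0), ((0, 1, 0), 25.0)]
merged = merge_observations(observations)
free_set = assign_free_flags([(h, 0, 0) for h in range(1, 101)])
print(merged[(1, 0, 0)], len(free_set))
```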

16.2.4 Radiation Damage

All biological materials suffer from radiation damage when exposed to ionising radiation and radiation damage in protein crystallography is a significant challenge, especially considering third generation synchrotron beamlines where fluxes of 10¹² photons per second and beam cross‐sections of 20–50 μm are not uncommon. One of the first properties of the crystal that is lost during radiation damage is diffraction strength. The higher resolution reflections fade quickly and disappear, a phenomenon first described in the 1960s [7]. The diffraction lifetime is related to the X‐ray dose and Henderson has proposed a generalised limit for protein crystallography of 2 × 10⁷ Gy (J/kg), the X‐ray dose that can be absorbed by a cryocooled protein crystal before the mean reflection intensity is halved [8]. The dose constraint has been refined subsequently to 3–4 × 10⁷ Gy [9]. The radiation absorbed by the sample leads to crystal decay, increasing disorder within the sample, and a consequent reduction in reflection intensity through two phenomena. Primary damage is the ionisation of an atom due to photoelectric absorption or Compton scattering and only a reduction in the total applied dose can reduce the effects of primary damage. Secondary damage occurs from the formation of secondary electrons that diffuse and induce further damage [10] but its accrual rate can be reduced by maintaining the sample at cryogenic temperatures. A constant stream of dry nitrogen gas at 100 K is directed over the crystal during data collection to slow secondary radiation damage. Covalent bonds break on absorbing sufficient energy, resulting in the destruction of the protein. The overall loss of diffraction intensity affects the whole experiment but localised specific damage can also occur, such as disulphide bond destruction, leading to artefacts in the crystallographic analysis.
Radiation damage should be avoided as much as possible as even mild damage can complicate structure solution and analysis unnecessarily.
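
A dose limit translates into an exposure budget once the dose rate is known. The sketch below is simple arithmetic around the 2 × 10⁷ Gy limit quoted above; the dose rate itself depends on flux, beam size and crystal composition and must be estimated separately, so the value used here is purely hypothetical, as are the function names.

```python
import math

HENDERSON_LIMIT_GY = 2e7  # dose limit quoted in the text, in Gy (J/kg)


def exposure_budget_s(dose_rate_gy_per_s, dose_limit_gy=HENDERSON_LIMIT_GY):
    """Total exposure time (s) before the chosen dose limit is reached."""
    if dose_rate_gy_per_s <= 0:
        raise ValueError("dose rate must be positive")
    return dose_limit_gy / dose_rate_gy_per_s


def images_within_budget(dose_rate_gy_per_s, exposure_per_image_s,
                         dose_limit_gy=HENDERSON_LIMIT_GY):
    """Number of whole images that fit inside the dose budget."""
    budget = exposure_budget_s(dose_rate_gy_per_s, dose_limit_gy)
    return math.floor(budget / exposure_per_image_s)


# Hypothetical dose rate of 1e5 Gy/s and 0.5 s exposure per image.
print(exposure_budget_s(1e5), images_within_budget(1e5, 0.5))
```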

16.2.5 Matthews' Coefficient

Matthews' coefficient (VM), the crystal volume per unit of protein molecular mass, is a measure of the solvent content of protein crystals. Solvent content typically ranges between 27 and 65%, corresponding to VM values of 1.62 and 3.53 Å³/Da, respectively [11]. Crystals with low VM values tend to diffract well and those with high VM values tend to diffract modestly. The most likely number of copies of the crystallised molecule in the crystallographic asymmetric unit can be calculated in several ways (e.g. http://csb.wfu.edu/tools/vmcalc/vm.html, http://www.ruppweb.org/mattprob/default.html and the CCP4 suite), knowledge of which is imperative for molecular replacement and density modification routines.
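
The calculation behind these tools can be sketched in a few lines. VM is the unit cell volume divided by the total protein mass in the cell, and the solvent fraction is estimated here with the widely used approximation 1 − 1.23/VM, which assumes a typical protein density and is not taken from the text. The function names, the cell volume and the choice of four asymmetric units per cell are hypothetical.

```python
def matthews_coefficient(cell_volume_a3, asu_per_cell, copies_per_asu, molecular_mass_da):
    """V_M in cubic angstroms per dalton."""
    return cell_volume_a3 / (asu_per_cell * copies_per_asu * molecular_mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction, assuming a typical protein density (1.23 / V_M rule)."""
    return 1.0 - 1.23 / vm


# Hypothetical crystal: 432 000 cubic angstrom cell, 4 asymmetric units, 43 kDa protein.
for copies in (1, 2, 3):
    vm = matthews_coefficient(432000.0, 4, copies, 43000.0)
    print(copies, round(vm, 2), round(solvent_fraction(vm), 2))
```

In this example only one copy per asymmetric unit gives a VM and solvent content inside the usual range; two or more copies would leave essentially no room for solvent.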

16.2.6 Non‐crystallographic Symmetry

Proteins, especially symmetric ones, can pack with multiple copies within the asymmetric unit. The spatial relationships between each molecule do not obey crystallographic symmetry rules; instead they follow non‐crystallographic symmetry (NCS) operators. Detecting and defining NCS operators can be critical to successful structure solution and in the special case of icosahedral virus crystallography, NCS can be exploited to solve new structures without recourse to experimental phasing [12].

There are two forms of NCS. First, protein molecules orientated in the same way within the asymmetric unit can be separated by a simple translation, and the translational symmetry is detected by a native Patterson function. Alternatively, and more commonly, molecules within the asymmetric unit are related by a set of rotations alone, as first detected for haemoglobin [13] in a calculation called a self‐rotation function. Both types of NCS are detected by different applications of the Patterson function, which is calculated directly from the intensities of the diffraction data and is thus independent of phase information. The resultant Patterson map contains peaks corresponding to the vector between every atom in the asymmetric unit and every other atom. A simple structure containing three atoms will yield a Patterson map containing just six peaks, but a protein of 43 kDa, with ∼3000 non‐hydrogen atoms, will yield an uninterpretable Patterson map containing ∼9 million peaks. Subsets of these peaks will superimpose by correctly applying NCS to determine the spatial relationships between molecules in the asymmetric unit.
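
The peak counts quoted above follow from the fact that N atoms give N² − N non‐origin interatomic vectors. The sketch below enumerates these vectors for a small set of atoms in fractional coordinates; the function names and coordinates are illustrative, and overlapping vectors are not merged.

```python
from itertools import permutations


def patterson_vectors(atoms):
    """All interatomic vectors r_j - r_i (i != j), wrapped into the unit cell."""
    return [
        tuple((b - a) % 1.0 for a, b in zip(r_i, r_j))
        for r_i, r_j in permutations(atoms, 2)
    ]


def n_patterson_peaks(n_atoms):
    """Number of non-origin Patterson peaks for n_atoms atoms."""
    return n_atoms * n_atoms - n_atoms


three_atoms = [(0.10, 0.20, 0.30), (0.40, 0.10, 0.25), (0.70, 0.60, 0.05)]
print(len(patterson_vectors(three_atoms)), n_patterson_peaks(3000))
```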

16.2.7 Structure Factors

While the amplitude of each reflection is calculated from its measured intensity, this information is insufficient to calculate an electron density map. The Fourier synthesis to calculate the electron density map depends upon the structure factor, F(h,k,l), of each measured reflection. Structure factors encompass both amplitude and phase information of each reflection and each F(h,k,l) is thus related to the corresponding reflection intensity, I(h,k,l). Once phase information has been obtained, each F(h,k,l) can be derived and an electron density map calculated.
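
For reference, these relationships can be written out in their standard form. The symbols introduced here are not defined in the text: φ(h,k,l) is the phase of the reflection, V is the unit cell volume and (x, y, z) are fractional coordinates within the unit cell; the sign convention in the exponent is the one most commonly used.

```latex
\begin{align}
F(h,k,l) &= |F(h,k,l)|\, e^{i\varphi(h,k,l)}, \qquad I(h,k,l) \propto |F(h,k,l)|^{2} \\
\rho(x,y,z) &= \frac{1}{V} \sum_{h,k,l} |F(h,k,l)|\, e^{i\varphi(h,k,l)}\, e^{-2\pi i (hx + ky + lz)}
\end{align}
```

The first line shows why the measured intensity yields only the amplitude; the second shows that every term of the synthesis requires the phase as well.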

16.2.8 Phase Problem

It is impossible to measure reflection phases directly and therefore structure factors cannot be calculated for the Fourier transform to generate an electron density map. This is the crystallographic phase problem, which can be solved either by molecular replacement (modern software pipelines perform initial searches automatically) or by experimental phasing (see [14,15]). We introduce both in the next sections but do not discuss direct methods that are used mostly to solve small molecule structures and sometimes to solve atomic resolution structures of small proteins.

16.2.8.1 Molecular Replacement

Molecular replacement relies on similar protein sequences having similar 3D folds. The amino acid sequence of the target is used to find a potential molecular replacement search model from the PDB based on sequence homology only. If the target and search model share at least 50% sequence identity, molecular replacement will likely succeed. In exceptional cases molecular replacement can still succeed even if the identity drops to 13% [16]. In molecular replacement the search model is repositioned so that it matches the packing arrangement of the target protein in its crystal form. The procedure has two components, a rotation search to orient the search model correctly and a translation search that slides the rotated search model in three orthogonal directions until it occupies the same space and orientation as the target structure. If molecular replacement has worked, the calculated phases will be a reasonable approximation, enabling meaningful electron density maps to be calculated, and model building and refinement can commence.

16.2.8.2 Experimental Phasing: SAD

The current method of choice for experimental phasing is single‐wavelength anomalous dispersion (SAD). It relies upon collecting data from a sample containing an anomalous scatterer, using incident radiation at or above the scatterer's atomic absorption edge. Selenomethionine is most commonly used for anomalous scattering measurements, with data collected above the Se K edge, 12 657 eV (0.9795 Å). With radiation at or above the electron transition energy, the diffracted X‐rays no longer scatter in phase with the incident beam, I(h,k,l) no longer equals I(−h,−k,−l) and Friedel's law breaks down. Phase information can be obtained by careful measurement of the intensity differences between Friedel pairs, called anomalous differences (Figure 16.10).
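
Two small calculations underlie this paragraph: converting a photon energy to a wavelength, and forming an anomalous difference from a Friedel pair. The sketch below uses the standard conversion λ(Å) ≈ 12 398.42/E(eV); the function names and the pair of intensities are hypothetical, and the difference is expressed on amplitudes for illustration.

```python
import math

HC_EV_ANGSTROM = 12398.42  # product of Planck's constant and the speed of light, eV * angstrom


def energy_to_wavelength(energy_ev):
    """Wavelength in angstroms for a photon energy in eV."""
    return HC_EV_ANGSTROM / energy_ev


def anomalous_difference(intensity_plus, intensity_minus):
    """Amplitude difference |F+| - |F-| between the two members of a Friedel pair."""
    return math.sqrt(intensity_plus) - math.sqrt(intensity_minus)


print(round(energy_to_wavelength(12657.0), 4))
# Hypothetical Friedel pair whose intensities differ by a few per cent.
print(round(anomalous_difference(104.04, 96.04), 3))
```

The anomalous difference is small relative to the amplitudes themselves, which is why accurate, highly redundant measurements are needed.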

Figure 16.10 Harker diagram. Reflections are presented as vectors. A given reflection FPH+ results from the normal contribution of the protein (FP+) as well as the normal (FH+) and imaginary (F″H+) contributions of the heavy atom. Two circles can be drawn with a radius of FPH+ (solid line) and, for the opposite Friedel pair, a radius of FPH− (dashed line), having their origins shifted by the anomalous contributions F″H+ and F″H−. The positions at which the circles intersect give two possible phase angles (ϕ+ and ϕ−) for this reflection. Only one of the two is correct and the phase ambiguity will have to be solved by a subsequent step.

The heavy atom positions are found by calculating a Patterson function from the anomalous differences. The Patterson peaks represent vectors between anomalous scatterers within and between asymmetric units. As anomalous differences are minuscule in comparison to normal intensities, the corresponding intensities I h,k,l and I −h,−k,−l must be measured accurately; highly redundant data collection is one solution to this problem. The Patterson map is complicated by the presence of multiple heavy atom positions (there are N² − N peaks in the Patterson function, where N is the number of heavy atoms), and partial occupancy and/or mobile atoms reduce Patterson map peak heights. If the position(s) of the heavy atom(s) can be deduced, phase estimates can be calculated. The phase problem is represented geometrically with the elegant Harker diagram (Figure 16.10): here the structure factor FPH+ and its Bijvoet pair FPH− result from the real contributions of the protein (FP) and heavy atom (FH) and the imaginary contributions F″H+ and F″H−, which are oriented 90° away from the normal contribution FH but in opposite directions. Two circles of radius FPH+ and FPH− can be drawn, each with its origin shifted by the corresponding imaginary contribution (Figure 16.10).

SAD is very effective at finding heavy atom positions when data are collected carefully but initial phases are dependent solely upon the anomalous differences and tend to be poorer than phases from MAD experiments. Single and multiple isomorphous replacement can be combined with anomalous scattering (SIRAS and MIRAS, see section 16.3.5.4), but unlike SAD and MAD these experiments rely upon having multiple isomorphous crystals available and the hit‐and‐miss approach of heavy atom derivatisation.

16.2.8.3 Experimental Phasing: MAD

The normal and anomalous scattering changes, which depend on the wavelength of the incident beam, can also be used to determine phases. This method is called multiple‐wavelength anomalous dispersion (MAD) and unlike SAD addresses phase ambiguity directly [17]. However, multiple datasets are required, ideally from a single crystal, and only heavy atoms with a transition edge in the range of a tunable synchrotron beamline can be used. Three datasets are typically collected: (i) the peak, where the anomalous scattering is at its highest; (ii) the high energy remote, where the normal scattering is close to its native value; (iii) the inflexion, where the normal scattering is at its lowest (Figure 16.11). Two‐wavelength experiments can mitigate radiation damage when collecting multiple datasets from a single crystal [18]. Heavy atom positions are found as in SAD and the third dataset resolves phase ambiguity because three circles in a Harker diagram can only intersect at one point.

Figure 16.11 Edge plot. Selenium normal (f′) and anomalous (f″) edge plots are represented as solid lines. For comparison sulphur is represented with dashed lines but does not have an edge between 5500 and 17 000 eV. Most standard synchrotron beamlines operate around the Se atomic absorption edge.

Source: The scattering factor data were downloaded from http://skuld.bmsc.washington.edu/scatter/AS_form.html.

16.2.8.4 Experimental Phasing: SIR and MIR

Single isomorphous replacement (SIR) and multiple isomorphous replacement (MIR) were the methods of choice to solve the phase problem before the advent of tunable third generation synchrotron beamlines. Both require multiple datasets: a native and at least one heavy atom derivative. Heavy atom derivatives arise when protein crystals are soaked in solutions of salts of gold, mercury, platinum, etc., provided that the heavy atom binds at specific points and the crystal remains isomorphous with the native. The Patterson function is again used to identify the heavy atom positions, which are used to estimate initial phases. All MIR datasets have to be in the same space group and highly isomorphous, with cell parameters that differ by less than 0.5%. Phase ambiguity will be resolved in MIR with two or more derivatives.

16.2.9 Model Building and Refinement

Atomic models can be built after successful molecular replacement or the calculation of initial experimental phases. The protein model is manipulated using a computer graphics programme (e.g. FRODO and O [19], but nowadays Coot [20]) by reference to the electron density map, followed by rounds of model refinement. The changes to the model are restrained to known stereochemical parameters, such as the covalent bond lengths and angles, and non‐covalent interactions such as van der Waals' contacts. A restraint is an ideal target value to which a function may converge. Refinement is an iterative process with the ambition of interpreting correctly every feature in the electron density map including protein atoms, bound ligands, ions and solvent. During refinement the observed data will be compared to calculated data from the model (atomic coordinates in x, y and z, atomic occupancies and a model for the positional error of each atom [or group of atoms] called the B factor). The two most important reliability factors, R work and R free, are calculated; the R work is calculated using 95% of the data used in refinement and R free is calculated on the residual 5% that are set aside for cross‐validation [21].
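
As a hedged illustration of how the two reliability factors relate to one another, the following Python sketch computes the conventional R factor, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, separately for a working set and a 5% test set. The structure factor amplitudes are invented numbers, and the function and variable names are assumptions made for this example; the scaling between observed and calculated amplitudes that refinement programmes apply is omitted.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional R factor for paired observed/calculated amplitudes."""
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(f_obs)
    return numerator / denominator


def split_free_set(n_reflections, free_fraction=0.05, seed=0):
    """Randomly flag a fraction of reflections for cross-validation."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = max(1, round(n_reflections * free_fraction))
    free = set(indices[:n_free])
    work = [i for i in indices if i not in free]
    return sorted(work), sorted(free)


# Invented amplitudes: the 'model' reproduces each observation to within 20%.
rng = random.Random(1)
f_obs = [rng.uniform(10.0, 500.0) for _ in range(2000)]
f_calc = [fo * rng.uniform(0.8, 1.2) for fo in f_obs]

work, free = split_free_set(len(f_obs))
r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
print(f"R work = {r_work:.3f} ({len(work)} reflections)")
print(f"R free = {r_free:.3f} ({len(free)} reflections)")
```

In this toy example the two values are similar because no refinement has taken place. In a real refinement R free is usually somewhat higher than R work, because the test reflections do not guide the changes made to the model.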

Some side chains or loops in the model may occupy multiple conformations because they are in different orientations in the asymmetric units through which the X‐ray beam passes during the experiment. At low resolution these differences may be invisible, but at resolutions better than 1.8 Å each conformation and the relative occupancy of each can be modelled. Ligands may bind only partially and the occupancy can be adjusted to reflect the electron density.

The atom positional error is represented by the B factor (Å²) as a spherical distribution and is normally modelled isotropically for each atom. Residue groups or overall B factor models may be more appropriate at very low resolution. Anisotropic B factor models can be refined with ultrahigh resolution datasets. An additional thermal description of the model can be achieved with the TLS (translation libration screw) method (reviewed in [22]) with isotropic B factors.

16.3 Methodology

In the following section we cover current practice in most aspects of the entire structure solution pipeline.

16.3.1 Construct Design

Potentially disordered regions, especially at the N‐ and C‐termini, should be removed from the recombinant construct; XtalPred [23] and PONDR [24] can be used to predict disorder at the amino acid level. Secondary structure predictions from JPred [25] can check that terminal helices are not cut in half when designing constructs. When the mesophilic protein has poor solution behaviour and/or thermostability, the crystallisation of thermostable orthologues or engineered variants [26] should be attempted rather than focusing on just one family member. Alternative expression systems should be considered and the Oxford Protein Production Facility (OPPF at https://www.oppf.rc‐harwell.ac.uk/OPPF/public/services) offers high throughput cloning and expression services.

16.3.2 Crystallisation Screening

Successful chemical conditions to crystallise groups of proteins have been identified and made commercially available as crystallisation kits for soluble proteins (e.g. PACT, JCSG+), monoclonal antibodies (e.g. GRAS), nucleic acids (e.g. HELIX) and their binding proteins (e.g. Natrix), protein complexes (e.g. Wizard), kinases (e.g. Kinase) and membrane proteins (e.g. MembFac, MemGold, Lipidic‐Sponge Phase). The screens are biased towards what has worked previously, but new screens have emerged (MORPHEUS II and MIDAS) that explore different chemical space. Screens based on one fixed chemical family are also available (MORPHEUS, MPD and ammonium sulphate). Since optimisation may still be necessary to convert poor crystals into well‐diffracting ones, additive screens that contain low concentrations of ions, detergents and solvents may improve crystal quality. The websites of screen suppliers (www.moleculardimensions.com, www.jenabioscience.com, www.qiagen.com and www.hamptonresearch.com) are useful resources.

Most screens will work with protein concentrations of 5–50 mg/ml. The protein should be kept in a simple buffer, and metal ions and phosphates should be avoided to stop salt crystal formation during screening. Small proteins (<30 kDa) might need higher concentrations to crystallise: 190 mg/ml was required for the 15 kDa CBM62 [27]. Larger or less soluble proteins can be crystallised at concentrations as low as 1 mg/ml. Crystallisation should start with a sparse matrix screen and the experiment inspected immediately with a stereo microscope; many heavy precipitates suggest that the protein concentration is too high. If the drops are mostly clear the protein concentration is likely to be too low.

Crystallisation robots are commonplace and permit screening hundreds of conditions using small amounts of protein, 50–300 nl per drop, and 70–100 μl of crystallisation screen per condition. Most crystallisation robots use multiwell plate format labware almost exclusively. There are many manufacturers and distributors of multiwell plates that can be used, with variations in reservoir volumes, number of wells and shapes (flat bottom or round bottom wells) for the experiment. Initial screening is usually performed in 96‐well format plates with small drops. Subsequently, initial hits are optimised on a larger scale. Anaerobic chambers can be used to store crystallisation experiments if the protein is redox sensitive.

When crystallising protein–ligand complexes, the ligand will generally need to be at a concentration of ∼20‐fold higher than Kd for the protein in order to saturate the binding site. Fundamental biochemistry thus ought to be performed before commencing potentially lengthy crystallisation screening. Protein–protein complexes ought to be co‐expressed and only those fractions that contain both (or all) proteins purified. Alternatively, pre‐purified proteins should be mixed and the complex purified away from uncomplexed components by size exclusion chromatography.
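
The reasoning behind the ∼20‐fold guideline can be sketched with the standard single‐site binding expression, occupancy = [L] / ([L] + Kd). The Python example below is a minimal sketch under stated assumptions: one independent binding site, and a free ligand concentration approximately equal to the total ligand added (which may not hold at the high protein concentrations used for crystallisation). The Kd value and the function name are hypothetical.

```python
def fraction_bound(ligand_conc, kd):
    """Fractional occupancy of a single site at a given free ligand concentration."""
    return ligand_conc / (ligand_conc + kd)


kd_um = 5.0  # hypothetical dissociation constant (micromolar)
for fold in (1, 5, 10, 20, 50):
    occupancy = fraction_bound(fold * kd_um, kd_um)
    print(f"{fold:>2}-fold over Kd: {100 * occupancy:.1f}% occupied")
```

Under these assumptions a 20‐fold excess over Kd corresponds to 20/21 of the sites being occupied, roughly 95%, and further increases in ligand concentration bring diminishing returns.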

Sitting drop vapour diffusion is the most common crystallisation technique because it matches best the experimental setups of crystallisation robots. Protein and crystallisation conditions are dispensed on to a plastic shelf. This is a simple and rapid procedure with most robots. For instance, it takes five minutes to dispense screen solutions from a deep well block into a 96‐well plate using a hand‐held multichannel pipette and two minutes for a TTP Labtech Mosquito to dispense protein and crystallisation solution for the entire plate at two drop ratios. Therefore, 192 experiments take less than 10 minutes to set up using only 28.8 μl of protein. Once the tray is finished the user seals all the drops using a clear plastic adhesive sheet and the tray is left at the chosen temperature for the precipitant concentration in the drop to equilibrate with that in the well.
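
The protein consumption quoted above can be checked with a few lines of arithmetic. In this sketch the plate size, number of drops and total volume are taken from the text; the split of protein between the two drop ratios (100 nl and 200 nl) is purely an assumption chosen to reproduce the stated total.

```python
wells = 96
drops_per_well = 2        # two drop ratios per well
total_protein_ul = 28.8   # total protein volume quoted in the text

n_drops = wells * drops_per_well
mean_protein_nl = total_protein_ul * 1000 / n_drops
print(f"{n_drops} drops, {round(mean_protein_nl)} nl of protein per drop on average")

# Hypothetical protein volumes for the two drop ratios (assumption).
protein_nl = (100, 200)
total_ul = wells * sum(protein_nl) / 1000
print(f"{total_ul:.1f} ul of protein for the whole plate")
```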

Hanging drop experiments can also be established using robotics but this technique is normally favoured for manual crystallisation optimisation experiments. Here the optimisation reagents are pipetted directly into a 24‐well tissue culture tray and high vacuum grease around the rim of each well is used to provide an airtight seal. The protein and mother liquor are mixed on plastic or silanised glass coverslips, which are inverted over the greased well rim and an airtight seal made by gentle pressing.

In sandwich drop experiments, the protein is mixed with the crystallisation solution and ‘sandwiched’ between two flat glass or plastic surfaces. The sandwich is placed on to a support above the well solution in a tissue culture plate and sealed with vacuum grease and coverslips. Though less trivial to set up, the sandwich method can be advantageous because the sandwiched drop surface area is greatly reduced, which slows the equilibration rate substantially. Slowing the crystallisation process can lead to fewer, larger, better quality crystals, but the sandwich process does not lend itself easily to the economies of scale and throughput of sitting drop crystallisation.

In capillary dialysis crystallisation a narrow bore capillary is filled with protein by capillary action and sealed at one end with low melting point wax. The open end of the capillary is placed into a plug of agarose gel into which the crystallisation reagent has been impregnated. Protein and crystallisation reagent diffuse to set up a range of unique crystallisation conditions within one capillary.

The batch method has also been miniaturised; here small volumes of protein and crystallisation screen are mixed in a multiwell batch plate and maintained under a layer of paraffin oil to prevent the drop from drying out.

In the absence of crystals, or if only poorly diffracting ones are obtained, the construct could be modified, or limited proteolysis in situ could be used to remove disordered regions [28]. Lysines can be methylated [29] and lysines and other residues mutated to alanine [30] to reduce surface entropy. Cysteines can be modified [31] to reduce redox sensitivity. Differential scanning fluorimetry and other biophysical techniques [32] may be used to improve the protein stability as a function of buffer composition (i.e. buffer type, pH, salt).

16.3.2.1 Crystallisation Optimisation

Diffraction data of sufficient quality to solve structures can be obtained for most ‘normal’ projects using third generation synchrotron beamlines and crystals harvested direct from screening experiments that may be no bigger than 10 μm × 10 μm × 10 μm in size. However, if the screen yields crystals that are too small (larger proteins or protein complexes often yield small, poorly diffracting crystals) or that will be the subject of ligand soaks or experimental phasing, the crystallisation condition must be reproduced and optimised. Optimisation involves varying the concentration of the components of the crystallisation condition one at a time, and is performed normally in a 6 × 4 Linbro tissue culture tray. Typically, the precipitant concentration or type is varied in the longest dimension and the salt concentration or type in the shortest. Hanging drop vapour diffusion in Linbro plates is simple to achieve and is convenient for opening and closing the drop for harvesting. The cover slip is suitable for sample manipulation without changing the microscope focus: the less time spent trying to keep visual track of the crystal, the easier it is to harvest. The pH can be screened by repeating the optimisation tray at a different constant pH or by changing the buffer molecule itself, as on rare occasions it can contribute to the crystal lattice (e.g. [33]).
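
A 6 × 4 optimisation grid of this kind is easy to lay out programmatically. The Python sketch below is only an illustration: the precipitant and salt ranges are hypothetical values centred on an imagined hit, and the function names are assumptions. The precipitant varies along the six columns and the salt along the four rows, as described above.

```python
import string


def linspace(start, stop, n):
    """Return n evenly spaced values from start to stop inclusive."""
    step = (stop - start) / (n - 1)
    return [round(start + i * step, 6) for i in range(n)]


def grid_screen(precipitant_range, salt_range, n_cols=6, n_rows=4):
    """Map well names (A1..D6) to (precipitant, salt) concentrations."""
    precipitant = linspace(precipitant_range[0], precipitant_range[1], n_cols)
    salt = linspace(salt_range[0], salt_range[1], n_rows)
    rows = string.ascii_uppercase[:n_rows]
    return {
        f"{rows[r]}{c + 1}": (precipitant[c], salt[r])
        for r in range(n_rows)
        for c in range(n_cols)
    }


# Hypothetical ranges: 14-24% (w/v) precipitant, 0.1-0.4 M salt.
plate = grid_screen((14.0, 24.0), (0.1, 0.4))
for well, (precipitant_pct, salt_molar) in plate.items():
    print(f"{well}: {precipitant_pct:.0f}% precipitant, {salt_molar:.1f} M salt")
```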

16.3.2.2 Crystallisation Seeding

If the crystallisation experiment reaches the metastable zone but without forming nuclei spontaneously (Figure 16.2), the introduction of seeds can produce controlled and reliable crystallisation of the protein. Seeds can be obtained from low quality crystals, crystalline precipitate and even inert PTFE or ceramic shards. There are two common seeding approaches. In macroseeding, a crystal (or crystals) is broken into several pieces and each macroseed is transferred into a single crystallisation drop that has already equilibrated with its mother liquor, but in which no crystals have grown. The macroseed can be washed with mother liquor with a lower precipitant concentration to partially dissolve the outer layers of the macroseed. The seed encourages protein molecules to arrange themselves on to the introduced crystal lattice and thus a larger crystal can grow.

There are several ways to initiate crystal growth by microseeding, including in microfluidics platforms [34], and several approaches may be tried to find the best outcome. First, a microseed stock is generated by harvesting a small number of poor‐quality crystals into an Eppendorf tube containing a few μl of crystallisation solution. The crystals are ground with the blunt end of a pipette tip and a ∼10⁻¹ to ∼10⁻⁶ serial dilution of the microseeds is made. A dilution series is required so that just a few good‐quality crystals grow instead of a shower of thousands of small crystals. The protein is doped with aliquots of the microseed stocks and robotic crystallisation trays are set up normally. Alternatively, equal volumes of protein, microseed and crystallisation solution are mixed for setting up a manual crystallisation plate by optimising the original condition. Cross‐seeding can be used to re‐screen conditions where nucleation did not occur with the protein alone. Finally, microseeds can be introduced to crystallisation drops that have reached equilibrium but without producing crystals with the use of a thin, flexible fibre such as a cat's whisker. Here the whisker is dipped into a seed stock and drawn over the top of 1–3 drops in succession, so a few microseeds are transferred from the whisker to the drop to nucleate crystal growth.
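
The microseed dilution series can be planned as a simple ten‐fold serial dilution. The sketch below is illustrative only: the 10 μl mixing volume and the function name are assumptions, and any convenient volumes that preserve the 1:9 ratio would serve equally well.

```python
def serial_dilution(n_steps=6, factor=10, mix_volume_ul=10.0):
    """Plan a serial dilution: each tube receives material from the previous one."""
    transfer_ul = mix_volume_ul / factor
    diluent_ul = mix_volume_ul - transfer_ul
    steps = []
    for step in range(1, n_steps + 1):
        source = "seed stock" if step == 1 else f"tube {step - 1}"
        steps.append({
            "tube": step,
            "source": source,
            "transfer_ul": transfer_ul,
            "diluent_ul": diluent_ul,
            "dilution": factor ** -step,  # cumulative dilution of the seed stock
        })
    return steps


for s in serial_dilution():
    print(f"tube {s['tube']}: {s['transfer_ul']:.0f} ul from {s['source']} "
          f"+ {s['diluent_ul']:.0f} ul crystallisation solution "
          f"(dilution {s['dilution']:.0e})")
```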

16.3.3 Cryoprotection

The natural form of crystalline ice on Earth is sometimes referred to as hexagonal ice. It was predicted by Haas and Rossmann ∼50 years ago that crystalline ice formation in the solvent channels of protein crystals will disrupt or break the crystal lattice [35]. Cryoprotectants are small molecules such as organic solvents, salts or small organic compounds that inhibit the formation of hexagonal ice in protein crystals [36]. Alcohols have low melting points but are sometimes incompatible with proteins. Salts are widely used on the roads in winter to prevent ice formation and saturated salts can also be used to cryoprotect protein crystals [37]. Polyethylene glycols (PEGs) are widely used as precipitants in protein crystallography, and hence PEG 400 is a very popular cryoprotectant. Other than undiluted, saturated salts, most cryoprotectants are used in the 20–30% range. Ideally the cryoprotectant replaces water and maintains the reservoir composition, assuming that stock solution concentrations can be made high enough to supplement the crystallisation condition with an appropriate cryoprotectant. It is tedious and time consuming to make each mother liquor and cryoprotectant combination from scratch for a large number of crystallisation conditions and often it is possible to simply dilute the reservoir directly with the cryoprotectant (e.g. 8 μl of reservoir and 2 μl of PEG 400 to achieve a 20% PEG 400 cryoprotectant). Some cryoprotectants are immiscible with the reservoir (for instance ammonium sulphate and ethylene glycol) and inspection of the cryoprotectant under the microscope will identify unwanted phase separation.
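
The trade‐off in the direct dilution shortcut can be made explicit with a small calculation. In the sketch below the 8 μl + 2 μl recipe comes from the text, whereas the reservoir composition and the function name are hypothetical; volumes are assumed to be additive.

```python
def dilute_reservoir(reservoir_ul, cryo_ul, components):
    """Return the cryoprotectant % (v/v) and the diluted reservoir components."""
    total_ul = reservoir_ul + cryo_ul
    cryo_percent = 100 * cryo_ul / total_ul
    diluted = {
        name: conc * reservoir_ul / total_ul
        for name, conc in components.items()
    }
    return cryo_percent, diluted


# Hypothetical reservoir: 20% (w/v) PEG 3350 and 0.2 M NaCl.
reservoir = {"PEG 3350 (% w/v)": 20.0, "NaCl (M)": 0.2}
cryo_percent, diluted = dilute_reservoir(8.0, 2.0, reservoir)

print(f"PEG 400: {cryo_percent:.0f}% (v/v)")
for name, conc in diluted.items():
    print(f"{name}: {conc:.2f}")
```

Every reservoir component falls to 80% of its original concentration, which is why this shortcut only approximates the ideal of maintaining the reservoir composition.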

If the sample diffracts badly, its diffraction at room temperature should be tested to determine if the cryoregime is the root cause of the poor diffraction. The crystal should be mounted in a loop with a viscous oil, such as paratone N, which will protect it from dehydration at room temperature for long enough to establish the sample's base diffraction properties. Some crystallisation conditions are already cryoprotected because of their composition (e.g. the entire MORPHEUS screen) and no additional cryoprotection steps are required.

Glycerol has been used extensively as a cryoprotectant and the concentration required to cryoprotect common crystallisation conditions has been published [38]. Glycerol increases protein solubility and can dissolve protein crystals [39]. Glycerol can also be found in the binding pockets of carbohydrate binding and processing proteins instead of the native ligand when glycerol is used as a cryoprotectant because 20% glycerol, 2.1 M, outcompetes the binding of the desired ligand that is normally present at mM levels (e.g. [40]).

There are many methods and cryoprotectants available and we direct the reader to Ciccone et al. [39] and Parkin et al. [41].

16.3.3.1 Mounting

Increased automation, especially at synchrotrons, has necessitated mount type and length standardisation. Several crystal mounts, pins and loops are available from commercial suppliers (e.g. www.mitegen.com, www.hamptonresearch.com and www.moleculardimensions.com). The mounts come in a variety of shapes, sizes and materials, including nylon, polyimide and other polymers (Figure 16.12). To reduce background on the diffraction image and to aid sample visualisation, the size of the loop should be matched to that of the crystal.

Figure 16.12 Typical and specialised mounts. (a) Litholoop (Molecular Dimensions). (b) Nylon loop (Hampton Research). (c) Micromount (MiTeGen). (d) Inclined mount that can be used for rods or needles (MiTeGen). (e) Dual thickness microloop useful with viscous material such as LCP (MiTeGen). (f) Mesh loop that can be used to support plates or harvest microcrystals (Molecular Dimensions).

If no crystal manipulation is necessary, samples can be harvested and cryocooled directly. Otherwise 0.5–1 μl of the cryoprotectant should be placed next to the sample‐containing drop, ideally in the same focal plane (Figure 16.13). The crystal is picked up with the loop and transferred to the adjacent cryoprotectant drop. The loop is rinsed in the cryoprotectant, away from the crystal, before the crystal is removed from the cryoprotectant and cryocooled. This procedure should be completed within a few seconds to avoid dehydrating the sample, so all necessary tools should be to hand. If longer times (minutes to overnight) are required for ligand soaking and/or cryoprotection, the drop must be kept sealed within the experimental chamber.

Figure 16.13 Cryoprotecting on a 96 well plate format. The condition H8 was opened using a scalpel. A drop of cryoprotectant has been placed on the centre ledge between the reservoir and wells. A crystal and a nylon loop are in the solution and the metal pin of the loop is visible in the bottom right hand corner of the image, but slightly out of focus.

Acupuncture needles are useful for removing skin (made of unfolded protein) from the surface of the drop and can also be used to break crystals free from the well or to separate multiple, intergrown crystals. A cat's whisker glued to the end of a laboratory pipette tip, mounted on a pencil for handling, is useful as it is just stiff enough to gently push a crystal around without damaging it.

16.3.3.2 Cryocooling

The sample must be cryocooled rapidly once cryoprotected. Since speed is essential to vitrify water without crystalline ice formation, cryocooling by plunging the sample directly into a liquid nitrogen bath is best. Slow cryocooling can result in hexagonal ice formation, even if the sample is cryoprotected adequately. Hexagonal ice has a translucent, white appearance and should be avoided. The cryocooled samples are placed into a pre‐labelled cryovial while being retained under liquid nitrogen by careful juxtaposition of the mount and cryovial (Figure 16.14); such samples can be stored indefinitely in liquid nitrogen dewars.

Figure 16.14 Sample handling material. (a) Metal cane with one vial and its mount. The cane can hold five vials. A single vial is on the right side of the cane. On the left there is a plastic sleeve in which the cane is sheathed prior to storage. (b) Tall metal dewar used to carry canes. (c) Tall foam dewar. Both are high enough to take a cane under liquid nitrogen.

When liquid nitrogen is exposed to ambient air, moisture condenses on the cold nitrogen gas forming flocculent ice, which should not build up in the liquid nitrogen dewar. Flocculent ice can be removed from liquid nitrogen by filtering through ‘blue roll’ in a metal funnel. The cryomount components can build up static and attract flocculent ice, and this should be avoided. Filling the liquid nitrogen bath and dewars used in cryocooling (Figure 16.15) completely will dilute the potential effect of condensing moisture. Replacing liquid nitrogen frequently or recycling liquid nitrogen by filtering through blue roll into a clean dry dewar every five samples will reduce the risk of flocculent ice. A tall dewar should be used because flocculent ice tends to sink in liquid nitrogen.

Figure 16.15 Foam dewars. (a) A typical foam dewar used for cryocooling samples. (b) A specialised dewar used for carrying sample holders called pucks, which hold up to 16 samples at once.

16.3.3.3 Storage and Transportation

Cryocooled samples can be kept indefinitely in storage dewars that are topped‐up regularly with liquid nitrogen. ‘Dry shippers’, dewars containing an inner, sponge‐like compartment that can be saturated with liquid nitrogen, are used in transportation. Saturated dry shippers can maintain sub‐100 K temperatures for more than one week [42]. Dry shippers should be dried out completely and routinely to remove water. A dry shipper at room temperature can be chilled to operational temperature in less than 30 minutes if refilled with liquid nitrogen as soon as it evaporates. The shipper is ready for use once the liquid nitrogen within stops boiling, but prior to transportation the excess liquid nitrogen must be decanted.

16.3.4 Data Collection Strategies

Data collection strategies depend on the type of available X‐ray source and goniometry [43–45]. Many laboratories have stopped investing in in‐house X‐ray facilities, because of the current excellent access to synchrotrons and the high recurrent cost of maintaining in‐house facilities; therefore, much of the following concerns synchrotron data collection strategies. The primary goal is normally to obtain a complete diffraction dataset to the highest resolution possible while avoiding data corruption through radiation damage. The more brilliant the incident X‐ray beam, the stronger is the signal. A bigger crystal increases the signal but may have more internal defects, such as higher mosaic spreads, and a better sample may be required. Each crystal has an inherent diffraction limit and increasing the X‐ray dose will not result in higher resolution; instead poorer quality data will be collected because of radiation damage.

The following discussion focuses on ‘normal’ crystallographic problems. The challenges posed by very large unit cells and crystals exquisitely sensitive to radiation damage (e.g. membrane proteins, viruses and the ribosome) are not covered here, and the reader should consult an excellent recent review that addresses these problems and how they can be overcome [46].

16.3.4.1 Mounting and Centring the Sample

The crystal is placed on the goniometer either manually or through a robotic sample changer and maintained at a cryogenic temperature throughout. The sample is centred in the intersection of the X‐ray beam and the rotation axis of the goniometer and while historically the centring was done manually it is now commonly software‐driven.

Any surface ice detected ought to be removed by manual washing of the sample with a 25 ml plastic pipette that has been filled with dry liquid nitrogen. Some synchrotron beamlines provide a crystal wash device. If a wash device is not available and/or the diffractometer environment has insufficient space to permit manual washing of the sample, simple unmounting of the sample into liquid nitrogen before returning it to the goniometer is often sufficient to remove at least some of the surface ice.

16.3.4.2 Collecting the First Few Images and Indexing

The inherent diffraction properties of the crystal are assessed and the crystal symmetry, which impacts on the data collection strategy, determined. Then two, three or even four initial images are collected at relative phi/omega positions of 0°, 90°, and if necessary 45° or 30° and 60°. The oscillation angle depends on the unit cell parameters, the strength and quality of the diffraction and the detector characteristics, but ordinarily an oscillation angle of 0.5° is sufficient. The exposure time and transmission have to be determined and in the absence of previous data on the crystal type, or experience with the equipment, it is best to start with conservative values and increase the exposure time and/or the transmission until the crystal's intrinsic diffraction limit is reached.

These initial images are used to determine the basic symmetry of the crystal, its unit cell parameters and the orientation of the crystallographic symmetry axes in relation to the axes of the experimental setup. Indexing software calculates the 3 × 3 orientation matrix that describes the relationship between the reflections and the positions of the lattice planes. The nine values of the matrix can be considered as a combination of the six unit cell parameters and three orientation values. The likely Bravais lattice is chosen, as the actual space group might not be discernible until the data have been processed or even until the structure is solved. More than one data collection strategy might be suggested but additional considerations will apply in experimental phasing problems, which are described in sections 16.3.5.1 to 16.3.5.4. For more detailed information on data collection strategies the reader is referred to these excellent articles [44, 46].
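
The statement that the nine matrix elements combine six unit cell parameters with three orientation values can be illustrated with a short Python sketch. It is a simplified real‐space analogue and not the algorithm of any particular indexing package: a matrix whose columns are the cell axes is built from a, b, c, α, β and γ, rotated by three angles, and the cell is then recovered from the metric tensor, which is independent of the rotation. The cell, the angles and all function names are invented for this example.

```python
import math


def matmul(p, q):
    """Multiply two 3 x 3 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def cell_to_matrix(a, b, c, alpha, beta, gamma):
    """Columns are the a, b and c axes in a Cartesian frame (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    v = math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)
    return [[a, b * cg, c * cb],
            [0.0, b * sg, c * (ca - cb * cg) / sg],
            [0.0, 0.0, c * v / sg]]


def rotation(rx, ry, rz):
    """Rotation about x, then y, then z (angles in degrees)."""
    cx, sx = math.cos(math.radians(rx)), math.sin(math.radians(rx))
    cy, sy = math.cos(math.radians(ry)), math.sin(math.radians(ry))
    cz, sz = math.cos(math.radians(rz)), math.sin(math.radians(rz))
    r_x = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    r_y = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    r_z = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]
    return matmul(r_z, matmul(r_y, r_x))


def matrix_to_cell(m):
    """Recover the six cell parameters from the metric tensor G = M^T M."""
    g = matmul(transpose(m), m)
    a, b, c = (math.sqrt(g[i][i]) for i in range(3))
    alpha = math.degrees(math.acos(g[1][2] / (b * c)))
    beta = math.degrees(math.acos(g[0][2] / (a * c)))
    gamma = math.degrees(math.acos(g[0][1] / (a * b)))
    return a, b, c, alpha, beta, gamma


cell = (50.0, 60.0, 70.0, 80.0, 95.0, 100.0)  # invented triclinic cell
oriented = matmul(rotation(10.0, 20.0, 30.0), cell_to_matrix(*cell))
recovered = matrix_to_cell(oriented)
print([round(x, 3) for x in recovered])
```

The rotation changes all nine elements of the matrix, yet the recovered cell parameters are unchanged, which is the sense in which the matrix carries six cell values plus three orientation values.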

Some detector types, such as silicon pixel area detectors (PADs), yield optimal signal‐to‐noise ratios by ‘fine slicing’ with 0.1° oscillation angles per image because of the fast, noise‐less readout of PADs. It is almost impossible to see the weak high resolution reflections by eye under these conditions, but current data processing packages handle these data easily. Finally, the exposure time and transmission should be considered. For example, if the best resolution is attained using 0.5° oscillations for a 0.5 s exposure at 50% transmission, then in fine slicing mode 0.1° oscillations at 0.1 s exposure and 50% transmission should be selected. If the data acquisition rate of the detector is sufficiently fast the exposure time can be halved and the transmission doubled to keep the same overall dose; in this scenario the parameters will be 0.1° oscillations at 0.05 s and at 100% transmission. The crystal system and an exposure time per degree at a given transmission are input factors to programmes such as RADDOSE [47] that supplement the data collection strategy to avoid exceeding the Henderson limit [8]. The weak data (those with low signal‐to‐noise ratios) in the highest resolution zones of the image contain the crucial atomic detail of the structure; if the data processing reveals that they have been recorded poorly they can be discarded later.
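
The equivalence of the three parameter sets in this example can be verified with a short calculation. The sketch below assumes a constant incident flux, so that the dose delivered per degree of rotation is simply proportional to exposure time × transmission ÷ oscillation angle; it yields a relative figure only and is no substitute for a dose calculation with a programme such as RADDOSE. The function name is an assumption.

```python
def relative_dose_per_degree(oscillation_deg, exposure_s, transmission_pct):
    """Relative dose per degree of rotation (arbitrary units, constant flux)."""
    return exposure_s * (transmission_pct / 100.0) / oscillation_deg


coarse = relative_dose_per_degree(0.5, 0.5, 50)    # 0.5 degree images
fine = relative_dose_per_degree(0.1, 0.1, 50)      # fine slicing
fast = relative_dose_per_degree(0.1, 0.05, 100)    # halved exposure, doubled transmission

for label, value in (("coarse", coarse), ("fine", fine), ("fast", fast)):
    print(f"{label}: {value:.2f} (relative dose per degree)")
```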

The diffraction pattern should contain well‐resolved spots, without overlaps or elongated, ‘streaky’ profiles. Ideally, the diffraction should have the same resolution limits in all directions and similar resolution limits on any image. If this is not the case the sample is anisotropic, a phenomenon illustrated well in plate‐like crystals where the diffraction may be reasonable when the X‐ray beam traverses the long dimensions of the crystal but very weak when traversing the shortest. The incident beam can be made smaller on some synchrotron beamlines and the entire crystal scanned in a grid to find a region with improved diffraction. The incident beam should not exceed the sample size to reduce background scatter. Diffraction from crystalline ice forms rings at resolutions of 3.90, 3.70, 3.44, 2.67 and 2.25 Å if the cryoprotection regime has been unsuccessful and the diffraction from the macromolecule is likely to be badly affected. Poor inherent diffraction can be improved by annealing; the poor mosaic spread of a badly diffracting crystal suffering from internal disorder can be improved by a warming and cooling cycle. In the absence of additional crystals or crystal forms, annealing is worth trying but its success rate is somewhat limited.
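
Where the ice rings fall on a diffraction image follows from Bragg's law, λ = 2d sin θ. The following sketch converts the ring resolutions listed above into radii on the detector. It assumes a flat detector perpendicular to the beam; the wavelength is the Se K edge value quoted earlier in the chapter and the crystal‐to‐detector distance of 250 mm is a hypothetical setting.

```python
import math

ICE_RING_RESOLUTIONS = (3.90, 3.70, 3.44, 2.67, 2.25)  # in angstroms, from the text


def ring_radius_mm(d_spacing, wavelength, distance_mm):
    """Radius of a diffraction ring on a flat detector normal to the beam."""
    theta = math.asin(wavelength / (2.0 * d_spacing))  # Bragg angle
    return distance_mm * math.tan(2.0 * theta)


wavelength = 0.9795   # angstroms
distance_mm = 250.0   # hypothetical crystal-to-detector distance

radii = [ring_radius_mm(d, wavelength, distance_mm) for d in ICE_RING_RESOLUTIONS]
for d, r in zip(ICE_RING_RESOLUTIONS, radii):
    print(f"{d:.2f} A ring at {r:.1f} mm from the beam centre")
```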

Finally, multiple crystal lattices can be detected during the initial stages of data collection. Multiple lattices result in additional reflections that cannot be predicted by any indexing solution. Some reflections overlap and their deconvolution is not straightforward as the relative contribution from each lattice is not trivial to determine. Therefore, only one crystal per loop should be mounted. At synchrotrons it may be possible to position the loop or minimise the beam cross‐section so that only one crystal is illuminated by the X‐ray beam and hence only a single diffraction pattern is recorded.

16.3.4.3 Sacrificial Sample

If more than one similar sample is available, one crystal can be ‘sacrificed’ by assessing its radiation sensitivity on the synchrotron beamline so that the user can use exposure and transmission settings that are close to what the next sample is likely to tolerate. Allowing for a sacrificial crystal, and for the potential loss of a sample through mishandling, means that several crystals of the same sample should be sent on each synchrotron trip. Fortunately, the universal puck system can hold up to 112 samples in a dry shipper (Figure 16.16) and higher density systems are becoming available.

Figure 16.16 Dry shipper and pucks. (a) A universal V1‐puck. The puck cover is on the left and the puck plate on the right. 16 mounts are on the plate. (b) A puck shelf for transportation of up to 7 pucks. (c) A dry shipper where the puck shelf can be loaded and kept cold for >2 weeks under regular conditions.

16.3.4.4 Indexing and Integration

Data indexing and integrating programmes are now highly automated, but it is important to ensure the right Bravais lattice has been selected so the correct data collection strategy is decided. iMosflm [48], XDS [49] and DIALS [50] are excellent for indexing and integrating and complement the Xia2 [51] and AutoPROC [52] automated pipelines.

16.3.4.5 Scaling

The Phenix [53] module Xtriage and the CCP4 [54] packages Aimless [55] and Pointless [56] are under constant development. Scaling of diffraction data is straightforward as many tasks are automated and various diagnostic subroutines direct decision making. The diagnostics include space group determination and ice ring and anisotropy detection. The output reports should be considered and action taken to mitigate data defects. A key decision is the high resolution cut‐off [55]. While it is desirable to have as much atomic detail as possible, it is detrimental to subsequent processes to include data that are simply not observed. Many metrics are computed to determine the resolution limit and the following are merely a guide. Ideally, the high resolution shell should have a completeness above 95%, a half‐dataset correlation coefficient (CC½) above 50% and a signal‐to‐noise ratio (I/σ(I)) above 1.5. However, some compromises may be necessary and lower data completeness levels can still be useful as every reflection carries some information about the entire structure. In SAD experiments, the data collection strategy is the most critical step and the extent of the anomalous signal (which often reaches a resolution 1 Å lower than the native diffraction limit) is much more important than the high resolution cut‐off. Highly redundant data are critical for experimental phasing, especially for weak anomalous scattering signals, and the redundancy should be as high as possible without incurring radiation damage. If radiation damage is detected during scaling, the affected images should be removed.
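
The guide values for the high resolution shell can be applied to a table of per‐shell statistics, as in the minimal sketch below. The `Shell` class, the function names and the example numbers are all hypothetical; scaling programmes such as Aimless report these statistics in their own formats and apply their own, more sophisticated, criteria.

```python
from dataclasses import dataclass


@dataclass
class Shell:
    d_min: float          # high resolution limit of the shell (Angstrom)
    completeness: float   # per cent
    cc_half: float        # CC1/2 expressed as a fraction between 0 and 1
    i_over_sigma: float   # mean I/sigma(I)


def passes_guidelines(shell, min_completeness=95.0, min_cc_half=0.5,
                      min_i_over_sigma=1.5):
    """Apply the three guide values quoted in the text to one shell."""
    return (shell.completeness > min_completeness
            and shell.cc_half > min_cc_half
            and shell.i_over_sigma > min_i_over_sigma)


def suggest_cutoff(shells):
    """Walk from low to high resolution and stop at the first failing shell."""
    cutoff = None
    for shell in sorted(shells, key=lambda s: s.d_min, reverse=True):
        if not passes_guidelines(shell):
            break
        cutoff = shell.d_min
    return cutoff


# Invented statistics for illustration only
shells = [
    Shell(2.5, 99.8, 0.99, 20.0),
    Shell(2.0, 99.5, 0.95, 8.0),
    Shell(1.8, 98.0, 0.70, 2.5),
    Shell(1.6, 96.0, 0.45, 1.2),
]
cutoff = suggest_cutoff(shells)  # 1.8 for these invented numbers
```

As the text stresses, these thresholds are merely a guide, so a result like this is a starting point for inspection rather than a decision.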

16.3.5 Native Data Collection, Molecular Replacement and Ligand Soaks

The simplest structures to solve have the simplest data collection requirements. The data collection requirements are low when the structure is already known and the experimental question is ‘How does this small molecule bind to my protein?’ or ‘How does this mutation affect the protein structure?’ The aim is to collect complete data to as high a resolution as possible without compromising data quality. The signal‐to‐noise ratio is set by the exposure time and transmission per image and the overall dose should be pushed close to the Henderson limit. The resolution should be better than 2 Å to observe most of the ordered water molecules that play important roles in mediating protein–ligand complexes. The data completeness ought to exceed 90–95% and the data redundancy, the number of times single reflections have been recorded, can be as low as 2–3. In molecular replacement, the data should be complete and well‐recorded and it would be better to double the redundancy and ensure 100% completeness at the cost of 0.2 Å in resolution. These important parameters are all defined in the data collection strategy.

16.3.5.1 Experimental Phasing

If molecular replacement is not possible or fails to solve the structure, phase angles for each reflection must be determined de novo. The most commonly used method to this end is anomalous scattering from selenomethionine‐labelled proteins. Briefly, the recombinant protein is expressed in a medium in which methionine is replaced by selenomethionine, where selenium replaces the naturally occurring sulphur [17,57,58]. Methionine‐auxotrophic E. coli strains ensure that natural methionine is not biosynthesised and bacterial growth thus depends upon selenomethionine in the medium. Selenomethionine‐labelled proteins should purify and crystallise similarly to the wildtype, though additional care (and reducing agents) might be necessary to avoid selenium oxidation. Inherent transition metals, such as zinc, can also be used for phasing and before a selenomethionine‐substituted sample is made the potential metal content of native crystals should be tested by a fluorescence emission scan on a tunable synchrotron beamline. The two approaches to structure solution based on anomalous scattering are covered next.

16.3.5.2 Single Wavelength Anomalous Scattering – SAD

SAD experiments, like all others, depend upon a well‐cryoprotected sample that diffracts sufficiently well to be indexed. However, additional factors have to be considered. An incident radiation energy (or wavelength) must be selected for recording sufficient anomalous differences to solve the structure. On a tunable synchrotron X‐ray beamline, the user can collect data at the optimal wavelength, and webservers (e.g. www.bmsc.washington.edu/scatter) can be used to estimate the likely SAD signal at any given wavelength for any atom (Figure 16.11).

On tunable synchrotron beamlines the sample can be scanned for element‐specific fluorescence emission, allowing fast identification of potential scatterers and the precise absorption edge that is related to each scatterer's local environment. In the absence of a scan the theoretical energy should guide the SAD data collection strategy, for example, above the Se K atomic absorption edge (12 657.8 eV; 0.9795 Å). The anomalous scatterer does not require an accessible edge for successful SAD. For example, SAD phasing can be done on light atoms such as sulphur or iodine at low energies (long wavelengths) at a synchrotron (e.g. [59]) or on a home source (e.g. [60–62]). The Diamond synchrotron beamline I23 is dedicated to low energy data collection and the beamline operates under vacuum to minimise air absorption of scattered X‐rays. It is critical in SAD experiments to measure the intensities of Friedel pairs accurately to determine their anomalous differences. In comparison to the absolute intensity of each reflection, the intensity differences between Friedel pairs are very weak and their accurate measurement is challenging. Each Friedel pair is measured multiple times to achieve the necessary accuracy. Since solving the anomalous scatterer substructure is heavily dependent upon highly redundant data (perhaps greater than 50‐fold for weak anomalous scatterers such as sulphur [61,63]) very conservative data collection strategies are required for the sample to survive such large X‐ray doses. The anomalous scatterer is prone to suffering the greatest radiation damage when exposed to a wavelength close to its atomic absorption edge [64]; therefore, the transmission and/or exposure time per image should be reduced. Under such conditions the crystal's inherent diffraction limit might not be reached and higher resolution data for refinement should be collected from another sample.
The weak anomalous signal is the important measurement in SAD experiments and diffraction data in a SAD experiment are typically collected 0.5–1 Å lower in resolution than the sample's diffraction potential. NucB, for instance, was solved by sulphur SAD using data to 2.26 Å, but refined against data to 1.35 Å [59].
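
Beamline energies and wavelengths are quoted interchangeably in this section, and the conversion between them is a one‐line calculation. The sketch below uses the standard relation λ(Å) ≈ 12 398.42 / E(eV); the function names are hypothetical.

```python
# Planck constant x speed of light, expressed in eV x Angstrom
HC_EV_ANGSTROM = 12398.42


def energy_to_wavelength(energy_ev):
    """Convert a photon energy in eV to a wavelength in Angstrom."""
    return HC_EV_ANGSTROM / energy_ev


def wavelength_to_energy(wavelength_angstrom):
    """Convert a wavelength in Angstrom to a photon energy in eV."""
    return HC_EV_ANGSTROM / wavelength_angstrom


se_k_edge = energy_to_wavelength(12657.8)   # about 0.9795 Angstrom
low_energy = energy_to_wavelength(6500.0)   # about 1.907 Angstrom
```

The first value reproduces the Se K edge wavelength quoted above; the second corresponds to the kind of low energy used for sulphur SAD.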

It can be important to measure Friedel pairs in close temporal proximity to minimise the problem of reduced diffraction intensity as a function of absorbed X‐ray dose. The small intensity differences between Friedel pairs are lost if the crystal suffers radiation damage. This problem is addressed by the ‘inverse beam’ method, where a few degrees of data are measured followed by a 180° rotation to measure the equivalent wedge in order to collect the Friedel pairs as close together in time as possible. This process is repeated throughout the entire data collection strategy. The beamline goniometry may permit Bijvoet pairs (Friedel pair symmetry equivalents) to be collected on the same image if crystallographic symmetry axes can be aligned appropriately on the goniometer. The synchrotron beamline data collection strategy programmes should be used to determine how the data are best collected [65–67].
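
The order in which wedges are collected in an inverse beam experiment can be sketched as below. This is an illustrative schedule generator only; the function name and the 5° wedge size are hypothetical choices, and beamline strategy programmes should be used to plan a real experiment.

```python
def inverse_beam_schedule(start_deg, total_deg, wedge_deg):
    """List (start, end) rotation ranges, pairing each wedge with its 180 degree mate."""
    if wedge_deg <= 0:
        raise ValueError("wedge size must be positive")
    schedule = []
    n_wedges = int(total_deg // wedge_deg)
    for i in range(n_wedges):
        wedge_start = start_deg + i * wedge_deg
        wedge_end = wedge_start + wedge_deg
        schedule.append((wedge_start, wedge_end))              # direct wedge
        schedule.append((wedge_start + 180, wedge_end + 180))  # inverse wedge
    return schedule


# 15 degrees of data in 5 degree wedges, each followed by its inverse wedge
schedule = inverse_beam_schedule(0, 15, 5)
```

Smaller wedges bring each Friedel pair closer together in time at the cost of more goniometer movements.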

16.3.5.3 Multiple Wavelength Anomalous Scattering – MAD

MAD data collection is dependent upon the accessibility of an atomic absorption edge for the anomalous scatterer. Classically, MAD is performed by collecting diffraction data at wavelengths corresponding to the peak, high energy remote and inflection of the atomic absorption edge, and the intensity differences between reflections at the different wavelengths underpin the solution of the phase problem [17,68]. A fluorescence emission scan is recorded to determine the sample's inflection and peak, which may be separated by only 4 eV, a tiny difference in comparison to the selenium K atomic absorption edge of 12 657 eV. Peak and inflection energies can vary between samples [69] and it is important to conduct the fluorescence scan on the sample to be used for data collection.

The maximum anomalous differences measured at the atomic absorption peak are supported in MAD by the maximum dispersive differences between the high energy remote and the inflection point. Both the dispersive and the anomalous differences therefore contribute to the phasing potential in MAD. Redundant data, not as high as in SAD, are required for the peak data set in three‐wavelength MAD. The potential for radiation damage means the transmission and/or exposure time must be considered so that complete, redundant data can be collected at each wavelength before the crystal suffers from the absorbed radiation dose. SAD and MAD each require well‐measured data to yield a successful structure solution. The choice of method adopted may be influenced by the solvent content of the crystal, the type of anomalous scatterer, prior knowledge on the sensitivity of the sample to X‐rays and the experience of the beamline staff.

16.3.5.4 Other Experimental Phasing Techniques

In isomorphous replacement ‘native’ crystals are soaked, or co‐crystallised, with heavy atoms to form a ‘derivative’ [70]. Reflection intensity differences between the native and derivative(s) are used in the structure determination process. The change in reflection intensity is a function of the number of heavy atoms introduced, their atomic number and the overall mass of the target protein. For a 100 kDa protein, the introduction of a single copper, zinc or iron ion will change the average reflection amplitude (the square root of the intensity) by ∼5%, whereas a single uranium, platinum, gold or mercury ion will result in a much bigger change, 10–20%. The heavy atom(s) may have occupancies of less than unity, reducing the mean amplitude difference in comparison to the native and increasing the difficulty in solving the heavy atom substructure.

Under certain circumstances a single heavy atom derivative – SIR – can be sufficient for structure solution if the phase ambiguity can be solved by anomalous scattering or density modification, but usually multiple different derivatives – MIR – need to be identified. The native and derivative crystals must be isomorphous, i.e. the crystal symmetry, unit cell dimensions, and crystal packing must be unaffected by the addition of the heavy atom as unit cell dimension changes of 0.5% result in intensity changes of ∼20% [71] and the failure of the structure solution. The crystallisation conditions and the buffers and cryoprotectants used in heavy atom soaking should therefore not differ from those used to handle the native crystal. Isomorphous replacement and anomalous scattering can be combined in methods called SIRAS and MIRAS.
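
A quick isomorphism check on the unit cell lengths of native and derivative crystals can be sketched as follows. The 0.5% threshold is taken from the text; the function names and example cells are hypothetical, and the sketch compares cell lengths only, whereas true isomorphism also requires the same symmetry, cell angles and packing.

```python
def cell_changes_percent(native, derivative):
    """Percentage change in each unit cell length (a, b, c)."""
    return [100.0 * abs(d - n) / n for n, d in zip(native, derivative)]


def is_isomorphous(native, derivative, max_change_percent=0.5):
    """True if every cell length changes by less than the threshold."""
    changes = cell_changes_percent(native, derivative)
    return all(change < max_change_percent for change in changes)


native_cell = (50.0, 60.0, 70.0)        # invented cell lengths in Angstrom
good_derivative = (50.1, 60.1, 70.1)    # all changes below 0.5%
poor_derivative = (50.1, 60.1, 70.5)    # c changes by about 0.7%
```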

In the final part of this section we consider slightly more ‘exotic’ means by which structures can be phased experimentally. It is possible to exploit radiation damage in a process called radiation damage‐induced phasing (RIP). The process is predicated on deliberate, partial destruction of sample integrity, but it is usually difficult to judge how much radiation damage is sufficient to produce a substantial enough change to solve the structure [72]. UV radiation can also be used instead of X‐rays to induce damage to the protein crystal [72]. However, RIP is not universally popular because a well‐conducted SAD or MAD experiment will yield superior phases. Inert gases have been used in structural studies of globins for over 50 years [73]. Myoglobin crystals in capillaries filled with nitrogen at 140 atm bind non‐reactive gas molecules in hydrophobic cavities [74]. Pressure cells have been developed so that protein crystals mounted in loops can be ‘soaked’ in xenon gas at 0.5–5 MPa, which acts as the ‘derivative’, before flash‐freezing and collecting anomalous scattering data from the bound xenon(s). Cavities can also be targeted by halide salts; here native crystals can be derivatised by cryoprotecting with solutions containing halide ions (up to 1 M concentration) for less than one minute [75]. Water molecules in surface features are displaced by the halide ions and their binding, albeit at low occupancy, can provide sufficient anomalous scattering signal. Several different soaking regimes, where time of soak and/or concentration of halide are varied empirically, may be necessary for successful structure solution. Large polymetal clusters, including clusters of 6 tantalums and 18 tungstens, have phased large macromolecules like ribosomal subunits [76]. The ‘magic triangle’, 5‐amino‐2,4,6‐triiodoisophthalic acid (aka I3C), which contains three iodine atoms in a triangular arrangement, has also found utility in experimental phasing [77].
The polymetals and I3C are giant features in initial low resolution electron density maps [78], but because of their large size they tend to be used for solving the structures of very large macromolecules where space might be available for their binding. Finally, probabilistic relationships and phase set reliability determination can be used to generate a model derived solely from the measured reflection intensities without the need for phasing by molecular replacement or experimental procedures, a process called ab initio phasing. The number of independent atoms (normally >1000 atoms) and the relatively poor diffraction of most protein crystals (worse than ∼1.2 Å) have limited the impact of ab initio methods on macromolecular crystallography. However, two recent software developments, Arcimboldo and Ample, are lowering the requirements for ab initio phasing. Both programmes use fragments such as small α‐helices combined with density modification routines to obtain initial phase information and this approach has proven successful for data as low as 2 Å [79–81].

Many heavy atom salts are seriously deleterious to health on contact with skin or by inhalation. Some accumulate as a neurotoxin, others are carcinogenic and/or teratogenic and uranium is radioactive. Therefore, critical attention to safety considerations must be paid at all times when working with heavy atoms.

16.3.6 Data Processing

After the data have been collected the intensities of all reflections measured are averaged. Friedel pairs for anomalous scattering have to be separated. The occasional rogue reflection, which might have impinged upon the detector at the same place as an ice ring or at the junction between panels in a mosaic detector, has to be recognised and discarded. Synchrotron beamlines usually have semi‐automatic, background‐running routines for data processing. It is probably unnecessary to repeat the data processing manually for simple structure solutions unless data need to be trimmed because of radiation damage, for instance. However, for more difficult structure solutions, indexing, integration and scaling routines should be re‐run and optimised. At this juncture, the structure will be solved by difference Fourier methods, molecular replacement, experimental phasing or potentially by direct methods (which is deliberately not discussed further as it is a procedure beyond the scope of this chapter) and each of the other methods is described briefly in sections 16.3.6.1 to 16.3.6.3.

16.3.6.1 Difference Fouriers

The simplest scenario is when the target macromolecule has crystallised in the same crystal form from which the structure has already been solved and the question is, for instance, ‘Where does my ligand bind?’ The differences in amplitude between reflections from the unbound protein and the ligand‐bound form are used with phases calculated from the protein model to generate electron density maps (aka difference Fouriers) that identify the location of the bound ligand.
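
The map coefficients of a difference Fourier can be sketched as the amplitude difference for each reflection combined with the phase calculated from the known model. The sketch assumes that the two sets of amplitudes have already been placed on a common scale; the function name, dictionary layout and numbers are hypothetical.

```python
import cmath
import math


def difference_coefficients(f_bound, f_unbound, model_phases_deg):
    """Return complex coefficients (|F_bound| - |F_unbound|) * exp(i * phase).

    Each argument is a dict keyed by (h, k, l); only reflections present in
    all three dicts are used.
    """
    coefficients = {}
    for hkl, amplitude_bound in f_bound.items():
        if hkl in f_unbound and hkl in model_phases_deg:
            delta = amplitude_bound - f_unbound[hkl]
            phase = math.radians(model_phases_deg[hkl])
            coefficients[hkl] = delta * cmath.exp(1j * phase)
    return coefficients


# Invented amplitudes and phases for illustration only
f_bound = {(1, 0, 0): 120.0, (0, 1, 0): 80.0}
f_unbound = {(1, 0, 0): 100.0, (0, 1, 0): 85.0, (0, 0, 1): 10.0}
phases = {(1, 0, 0): 0.0, (0, 1, 0): 90.0}
coefficients = difference_coefficients(f_bound, f_unbound, phases)
```

A Fourier transform of such coefficients, which is not shown here, gives the difference map in which positive peaks mark atoms present only in the ligand‐bound crystal.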

16.3.6.2 Molecular Replacement

The first step is to identify a PDB search model from the PDBe (www.ebi.ac.uk/pdbe/entry/search/index/?advancedSearch:true=). While a high resolution structure of the same protein, or protein family, might theoretically be the best search model, the PDB validation metrics will confirm its overall quality. The PDB search model should be edited to remove ligands and solvent, additional copies if there is more than one molecule per asymmetric unit and all macromolecules that are not relevant for the search. Search models can be edited extensively to match best the target molecule. For instance, the target sequence can be modelled by homology on to the search model [82] or converted to a polyalanine chain if the sequences are too divergent. Proteins with multiple domains can be split into their components in case the domain arrangements in the target differ from the search model. If the target is multimeric a search model can be constructed as a single chain to search for multiple copies of the protein at once. Molecular replacement pipelines facilitate the process and automate search model preparation. For example, Balbes [83] uses an internal database of known domains to solve molecular replacement structures using only the target protein sequence and the diffraction data set provided. Mr Bump [84] also only requires the protein sequence and the observed data and locates and prepares search models automatically with its own algorithms. Molecular replacement programmes with more user control include Molrep [85] and Phaser [86], for which the edited search model has to be provided. When several structures with similar folds are available, a structural ensemble, built by careful superposition of the individual models, is an effective approach in Phaser. The ensemble should be edited to remove structurally distinct regions such as flexible surface loops. 
Arcimboldo [87] and Ample [88] pipelines solve the phase problem using in silico generated fragments based solely on sequence homology. Some rounds of refinement are typically run automatically to assess the solution quality. If the Rwork is below 45% the phase problem for that structure is probably solved. Solutions above 50% need to be inspected closely, as they are either wrong or require significant and iterative adjustments to the model. The reliance on previously determined structures is not normally a problem in molecular replacement, but if the structures of search and target molecules are too distinct or too incomplete, then molecular replacement will fail.
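
The R factor used to judge a molecular replacement solution is the summed absolute difference between observed and calculated amplitudes divided by the summed observed amplitudes. The sketch below assumes the calculated amplitudes are already scaled to the observed ones; the function names are hypothetical and the ‘borderline’ label for values between 45% and 50% is our own assumption, as the text does not classify that range.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), returned as a fraction."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def assess_solution(r_work):
    """Classify a solution using the guide values quoted in the text."""
    if r_work < 0.45:
        return "probably solved"
    if r_work > 0.50:
        return "inspect closely"
    return "borderline"  # assumption: the text gives no guidance for 45-50%


r = r_factor([100.0, 50.0, 25.0], [90.0, 55.0, 30.0])  # 20 / 175
```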

16.3.6.3 Experimental Phasing

Here the correct location of the heavy atom(s) (from 2 sulphurs in a disulphide to 18 tungstens in a polymetal cluster) is used to calculate experimental phases. If the heavy atom location(s) can be determined from the isomorphous differences between native and derivative datasets, or from the anomalous differences in anomalous scattering, an electron density map can be calculated. These map calculations use initial phase estimates that are sometimes far from the correct value and consequently initial electron density maps can be poor, limited to contrast between solvent and protein. Secondary structure features may be apparent, but it may not be possible to map the 1D protein sequence to the 3D electron density. Breaks in the peptide chain are common, rendering map interpretation difficult to impossible. Thankfully, electron density maps from well‐conducted experiments are usually much better than this gloomy description. Density modification routines generally enhance the quality of the initial electron density map greatly. There are two main density modification procedures, solvent flattening and averaging. Noise is distributed randomly throughout the electron density map, whereas the electron density in the solvent regions between molecules should be flat and close to zero. ‘Flattening’ the electron density in solvent regions will increase the signal‐to‐noise ratio of the rest of the map and improve the electron density map quality [89]. This process is more effective the greater the solvent content and can lead to substantial improvements in map quality and the ease with which it is interpreted.
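
The real‐space step of solvent flattening can be illustrated with a toy one‐dimensional ‘map’. This is a deliberately simplified sketch: it assumes the solvent mask is already known and replaces the solvent density by its mean value, whereas real programmes derive the mask from the map and iterate between real and reciprocal space to improve the phases. The function name and numbers are hypothetical.

```python
def flatten_solvent(density, is_solvent):
    """Replace density at solvent grid points by the mean solvent density."""
    solvent_values = [rho for rho, flag in zip(density, is_solvent) if flag]
    if not solvent_values:
        return list(density)
    mean_solvent = sum(solvent_values) / len(solvent_values)
    return [mean_solvent if flag else rho
            for rho, flag in zip(density, is_solvent)]


# Two protein grid points followed by three noisy solvent grid points
density = [0.9, 1.1, 0.2, -0.1, 0.05]
mask = [False, False, True, True, True]
flattened = flatten_solvent(density, mask)
```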

In averaging, a mask is defined around the protein (or even a domain) and the spatial relationship of this part of the structure to equivalent parts is defined; these are NCS operators and knowing them is essential for averaging of the electron density corresponding to equivalent parts of the structure. Averaging can be performed between different crystal forms of the same protein, called multicrystal averaging, if multiple datasets with some phase information for at least one of the datasets are available and if the symmetry relationships between equivalent parts of the structure are known. Averaging is a very effective density modification tool once the masks and symmetry operators are defined correctly, and its power is proportional to the square root of the number of copies being averaged.
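
The core of averaging is a point‐by‐point mean over equivalent copies of the density. The sketch below assumes the copies have already been brought on to a common grid with the NCS operators, which is the difficult part in practice; the function names are hypothetical.

```python
import math


def average_density(copies):
    """Average equivalent grid points across superposed copies of the density."""
    n_copies = len(copies)
    return [sum(values) / n_copies for values in zip(*copies)]


def expected_noise_reduction(n_copies):
    """Random noise falls by the square root of the number of copies averaged."""
    return math.sqrt(n_copies)


averaged = average_density([[1.0, 0.0], [3.0, 2.0]])
gain_for_four_copies = expected_noise_reduction(4)
```

Four‐fold averaging therefore halves the random noise, whereas going from four to sixteen copies is needed to halve it again.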

16.3.7 Model Building and Refinement

The model is subjected to alternating cycles of refinement (e.g. Refmac [90], Phenix.refine [53], Buster [91], SHELXL [92]) and rebuilding (e.g. Coot [20]) no matter how the structure was solved. Automated routines found in ArpWarp [93], Phenix autobuild [53] and Buccaneer [94] can expedite refinement by autobuilding the protein chain, thus removing the subjectivity of the human eye, and can add ordered solvent molecules. Though these routines are robust and reliable for building the protein chain, nothing beats an experienced eye when solving sporadic crystallographic molecular puzzles. Unexpected electron density ‘blobs’ that correlate to molecules from the crystallisation solution, cryoprotectant, or purification conditions that have serendipitously co‐crystallised with the protein often inform on the biochemistry of the target [e.g. 95]. There are substantial differences in the level of detail provided by the model dependent upon the resolution and the quality of the data from which it is built and many refinement protocols can be followed, but there is insufficient space here to go into details. Briefly, a model refined with low resolution data (>3 Å) will benefit from conservative protocols such as rigid body refinement and NCS restraints if possible. Less conservative approaches can be used at high resolution (<1.5 Å), including approaches more typical of chemical crystallography [92]. B factors can be modelled anisotropically, for which there should be at least six times as many unique reflections in the dataset as there are non‐H atoms in the asymmetric unit. For multidomain proteins the TLS protocol can be advantageous and there are on‐line resources (http://skuld.bmsc.washington.edu/∼tlsmd) to generate TLS groups. Ordered water molecules and bound ligands should be built as late on in the refinement process as possible in order to avoid biasing the model.

Broad consensus on how best to model disorder has yet to be reached. Alternative conformations can generally be modelled at high resolution and limited information on the position of long side chains such as lysines and arginines is common because of their disorder. The disordered atoms can be removed from the model, but users of the PDB who are not crystallographers may not notice and may even mistake affected residues for alanine. Alternatively, the occupancy of disordered atoms can be set to zero and so will be ignored during refinement, but the naïve PDB user will not easily notice this. Finally, the B factor of the disordered atoms can be allowed to rise to meaningless values (e.g. above 100 Å²); the PDB user must display B factors while viewing the protein to identify disordered regions that have been treated in this way, which is trivial for the popular graphics programmes PyMol (https://pymol.org) and CCP4MG [96].

Refinement is complete when every feature in the electron density map is accounted for, rather than when a target final Rwork/Rfree is reached. These values report on the global fit of the model to the data and the ambition is to lower them throughout refinement. If the data are being interpreted correctly, both Rwork and Rfree will diminish. If the Rfree rises, the model is probably being overfitted or fitted incorrectly to the data. However, it is impossible to account for everything in every structure and those parts of the electron density that cannot be explained rationally should not be modelled. Towards the end of the refinement, strong peaks in the difference map should have been accounted for and the electron density contour level displayed should be adjusted to avoid looking only at noise. Throughout refinement a variety of geometrical parameters are reported, including the root mean square deviations for bond angles and bond lengths, and it is generally accepted that these should be no more than 2° and 0.02 Å, respectively. Weighting schemes within refinement packages may be adjusted to ensure that these values remain close to the accepted target. Model building is completed when the electron density is interpreted to the best of the user's knowledge. Any deviation of the geometry and other reliability indicators of the structure from those of well‐refined models in the PDB at the same resolution should give rise to concern, a comparison that can be achieved with the polygon tool in Phenix [97].
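
The root mean square deviations reported by refinement programmes can be understood from the minimal sketch below, which compares observed values with their ideal targets. The function names and bond lengths are hypothetical; refinement packages use full restraint libraries rather than a short list of numbers.

```python
import math


def rmsd(observed, ideal):
    """Root mean square deviation between observed and ideal values."""
    squared = [(obs - target) ** 2 for obs, target in zip(observed, ideal)]
    return math.sqrt(sum(squared) / len(squared))


def geometry_acceptable(bond_rmsd, angle_rmsd, max_bond=0.02, max_angle=2.0):
    """Apply the generally accepted limits of 0.02 Angstrom and 2 degrees."""
    return bond_rmsd <= max_bond and angle_rmsd <= max_angle


# Two invented bond lengths (Angstrom) compared with invented ideal values
bond_rmsd = rmsd([1.54, 1.32], [1.52, 1.33])  # sqrt(0.00025), about 0.0158
```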

16.3.7.1 Validation

It is likely during model validation that a return to model building and refinement will be necessary to correct any anomalies. Validation of the model is vital to ensure that it makes chemical, physical and biological sense so users have confidence in it. Validation of the model starts during its building – hydrophobic amino acids dominate protein cores and charged and polar residues are found mostly on the protein surface – and refinement. Aberrant stereochemistry should be addressed during model building and refinement because outliers to accepted targets are flagged by refinement programmes and some molecular graphics packages. Validation is formed of two parts, one concerning the quality of the model and a second that assesses the fit of the model to the data. Cross‐validation methods (the Rfree) that have been adopted from statistics should be applied during refinement and validation [21].

As a prelude to deposition at the PDB, the structure should be checked with Molprobity, distributed with Phenix and also available from its webserver (http://molprobity.biochem.duke.edu). Molprobity performs a series of analyses including optimising the intramolecular hydrogen bond network by ‘flipping’ asparagine, glutamine and histidine side chains where necessary. The validation menu in Coot is extensive and should be used to identify and fix problems. The PDBe validation service (https://validate‐rcsb‐1.wwpdb.org) runs many deposition checks before the model is deposited and made freely available. Finally, the PDB and its mirrors (RCSB, PDBe, PDBj) display graphical assessments of model quality in comparison to all other structures.

16.3.7.2 Deposition and Publication

It is a condition of publication in journals that both the model and the data from which it is derived are deposited at the PDB with their release coincident with publication of the accompanying paper. PDB deposition ensures that model and data are archived together for others to use, and potentially reanalyse. Deposition also ensures that publicly funded researchers' work is publicly available. Depositing the structure entails validation and sanity checks on both structure and data, and provides an opportunity to record additional information about the structure solution that data mining routines cannot extract from the submitted files, including the crystallisation conditions. The PDB deposition generates a validation report required for the reviewing stage in many journals.

16.4 Applications

Next we summarise two applications of the methods and procedures that we have highlighted in this chapter; one concerns molecular replacement and throughput and the second describes exploiting anomalous scattering from sulphur.

16.4.1 Molecular Replacement, the Latest Automated Software Approaches and Drug Discovery

A drug discovery project, either at the fragment screening stage or during medicinal chemistry compound optimisation, requires a steady source of good quality crystals that diffract better than 2 Å to infer useful information about the binding mode of the ligand, the involvement of ordered water molecules and to track conformational changes in the protein. The crystal's solvent channels must allow molecules to access the ligand‐binding site if soaking experiments are used instead of co‐crystallisation. The crystal form should be robust to withstand organic solvents, such as DMSO, which may be required to solubilise the ligand. Whether co‐crystallisation or soaking approaches are used will depend upon the project and the equipment available. For example, ligands can be dispensed using ultrasound with the Labcyte Echo instruments, which dispense very small volumes of potentially rare or expensive ligands in a high‐throughput, highly automated fashion [98]. Hundreds of diffraction datasets might be collected over the course of a typical drug discovery project and to support such projects in academia the Structural Genomics Consortium and Diamond Light Source have developed the LabXChem pipeline, which allows hundreds of samples to be mounted in a day using robotics. The crystal mounts can be handled on any synchrotron beamline and the dedicated XChem beamline, I04‐1, has been designed specifically to handle high throughput projects. More details can be found on the beamline's home page (www.diamond.ac.uk/Beamlines/Mx/I04‐1.html). With automated sample centring, the whole experiment, from crystal mounting to structure solution, can be performed with little user intervention. Over the past decade many synchrotron sites have developed software pipelines to process data automatically. 
Space limitations do not permit a complete description of these pipelines but, in outline, the user uploads essential information (protein sequence, PDB search model, cell dimensions, space group) for each project to the ISPyB database [99]. The xia2 [51] and DIMPLE [100] routines can then be invoked for automatic structure solution. The presence of bound ligands can be detected automatically using PanDDA [101], obviating the need to inspect manually the hundreds of solutions that, in the case of fragment‐based screening, contain no bound molecule.
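
As a minimal sketch of the per‐project information listed above, the record below collects the four items (protein sequence, search model, cell dimensions, space group) and runs some basic sanity checks before anything is uploaded. The class name, field names and checks are our own illustrative assumptions; this is not the ISPyB schema or API, and the values in the usage lines are invented placeholders rather than data from any real project.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ProjectMetadata:
    """Hypothetical record of the per-project inputs named in the text."""
    protein_sequence: str
    search_model_pdb: str  # path or identifier of the molecular replacement model
    cell: Tuple[float, ...]  # a, b, c in Å followed by alpha, beta, gamma in degrees
    space_group: str

    def problems(self) -> List[str]:
        """Return a list of human-readable problems; empty if none were found."""
        issues = []
        if not self.protein_sequence.strip():
            issues.append("protein sequence is empty")
        if not self.search_model_pdb.strip():
            issues.append("search model is missing")
        if len(self.cell) != 6:
            issues.append("cell needs six parameters")
        else:
            if any(length <= 0 for length in self.cell[:3]):
                issues.append("cell lengths must be positive")
            if any(not 0 < angle < 180 for angle in self.cell[3:]):
                issues.append("cell angles must lie between 0 and 180 degrees")
        if not self.space_group.strip():
            issues.append("space group is missing")
        return issues


if __name__ == "__main__":
    # Placeholder values for illustration only.
    record = ProjectMetadata(
        protein_sequence="MKV",
        search_model_pdb="search_model.pdb",
        cell=(50.0, 50.0, 80.0, 90.0, 90.0, 90.0),
        space_group="P 41 21 2",
    )
    print(record.problems() or "no problems found")
```

Checking the record locally in this way is simply a cheap guard against launching hundreds of automated jobs with an incomplete project description.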

16.4.2 Use of Sulphur as an Anomalous Scatterer for Solving the Phase Problem

The first protein structure solved by SAD measurements was crambin [62], but the methodology has moved on considerably in the intervening 30 years, so to be current we summarise here the recent structure determination by sulphur‐SAD of a biofilm‐degrading endonuclease, NucB [59]. NucB is a 12 kDa protein with four sulphur atoms, from two cysteines in a disulphide and two methionines. The crystals used for phasing and for high resolution refinement were obtained from different conditions, but both were used directly from initial robotic screens without subsequent optimisation. Both conditions contained PEGs, so reservoir solutions were supplemented with 20% PEG 400 for cryoprotection. The anomalous scattering data for phasing were collected on the Diamond microfocus beamline, I24. The X‐ray energy was set to 6500 eV (1.907 Å), which represents a compromise between using a low energy to maximise the anomalous differences and limiting the absorption of the scattered X‐rays by air. With improvements in both hardware and software since these data were collected, 7500 eV should now be used. Though the data were collected on a microfocus beamline, the beam dimensions were 50 μm × 50 μm and the microfocus capabilities of the beamline were not used. The photon flux of I24 is one order of magnitude above that of the standard macromolecular beamlines at Diamond, so the data were collected with 9% beam transmission and 0.1° oscillations with 100 ms exposure times. The beamline did not have a multiaxis goniometer at the time of data collection, so 999° of data were collected to increase data redundancy to 32‐fold for accurate measurement of the weak anomalous differences. The crystals diffract much further than the 2.26 Å limit imposed in this phasing experiment, in which data quality was the primary driver, and a useful anomalous signal was maintained to the resolution cut‐off. The positions of the four sulphur atoms were found using SHELXD [102], from which initial phases were obtained. 
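
The arithmetic behind the numbers quoted above can be sketched as follows: the wavelength follows from the photon energy via λ = hc/E, and the number of images and total exposure follow from the sweep, the oscillation width and the exposure time. The function names are our own, and the constant is the usual rounded value of hc in eV·Å; treat this as a back‐of‐envelope check rather than beamline software.

```python
from typing import Tuple

HC_EV_ANGSTROM = 12398.42  # h*c in eV·Å, rounded


def energy_to_wavelength(energy_ev: float) -> float:
    """Return the X-ray wavelength in Å for a photon energy in eV."""
    return HC_EV_ANGSTROM / energy_ev


def sweep_summary(total_deg: float, osc_deg: float, exposure_s: float) -> Tuple[int, float]:
    """Return (number of images, total exposure time in s) for a rotation sweep."""
    n_images = round(total_deg / osc_deg)
    return n_images, n_images * exposure_s


if __name__ == "__main__":
    print(f"6500 eV -> {energy_to_wavelength(6500):.3f} Å")
    print(f"7500 eV -> {energy_to_wavelength(7500):.3f} Å")
    n_images, seconds = sweep_summary(total_deg=999, osc_deg=0.1, exposure_s=0.1)
    print(f"{n_images} images, {seconds:.0f} s of exposure")
```

For the NucB phasing dataset this gives 1.907 Å at 6500 eV, in agreement with the text, and 9990 images for the 999° sweep, corresponding to roughly 999 s of exposure at 100 ms per image.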
High resolution data, to 1.35 Å, were subsequently collected on the same P4₁2₁2 crystal form, but from a crystal grown under different crystallisation conditions. The experimental phases were extended to high resolution and the model was built automatically with ARP/wARP [93]. The data collection and refinement statistics for the NucB structure solution can be found in Baslé et al. [59] and we have plotted some of the validation results from this structure as an example of the tools that are available to assess PDB model quality (Figure 16.17).

Figure 16.17 Model validation with Polygon in Phenix. We have used the endonuclease NucB model (PDBid 5OMT) to generate the Polygon statistics plot. Polygon has compared the submitted model with 734 models at a similar resolution range. For each statistic the lower and higher limits and the mean are given. Each statistic is binned: a dark grey square equates to a bin with a low number of models, while the lighter grey squares represent bins with many models. The submitted model statistics are plotted and linked together with a solid line.
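
The binning idea described in the caption can be sketched in a few lines: for one statistic, the values from the comparison models are summarised by their limits and mean, divided into equal‐width bins, and the submitted model is located in one of those bins. This is an illustrative sketch only, not the Polygon implementation in Phenix; the function names, the number of bins and the integer values in the usage lines are all assumptions chosen to keep the example easy to follow.

```python
from statistics import mean
from typing import List, Sequence


def bin_counts(values: Sequence[float], n_bins: int = 5) -> List[int]:
    """Count how many comparison models fall into each equal-width bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width if all values are equal
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # the maximum goes in the last bin
        counts[idx] += 1
    return counts


def bin_of(value: float, values: Sequence[float], n_bins: int = 5) -> int:
    """Return the index of the bin holding the submitted model's statistic."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0
    idx = int((value - lo) / width)
    return max(0, min(idx, n_bins - 1))  # clamp values outside the observed range


if __name__ == "__main__":
    # Placeholder values standing in for one statistic across comparison models.
    comparison = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10]
    print("limits:", min(comparison), max(comparison), "mean:", mean(comparison))
    print("models per bin:", bin_counts(comparison))
    print("submitted model falls in bin", bin_of(5, comparison))
```

In a plot of this kind the counts would set the shade of each square, and a submitted model that lands in a sparsely populated bin stands out as unusual for its resolution range.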

16.5 Concluding Remarks

Here we have provided a basic theoretical background to X‐ray crystallography and a deeper overview of the practicalities as used today in our academic laboratory. During our careers there have been substantial and significant changes to crystallography. For instance, there is far more processing power in today's average smartphone than there was in the desktop computers on which our PhD work was completed, and one of us remembers his first synchrotron trip and the data that were collected on to film. No doubt further significant advances are ahead, even setting aside the impact of XFELs and cryo‐EM in the immediate future. Users should keep abreast of the technical advances that drive crystallography; subscribing to the CCP4 bulletin board is vital for this, and the board is also a source of open answers to subscribers' questions and a good place to search for jobs. The books given in Further Reading should be consulted for a more in‐depth explanation of the topics we have merely introduced.

Acknowledgements

The authors would like to thank Ehmke Pohl, Peter Moody, Simon Booth, Vincent Rao, Daniel Wood, Sema Ejder and Jon Marles‐Wright for suggestions on how to improve this chapter; Juan Sanchez‐Weatherby for the tip on removing flocculent ice from liquid nitrogen; Simon Booth for help with photography; Chloe, Rupert and Humphrey for supplying our lab with cats' whiskers; and the following funding agencies for their support of the Newcastle Structural Biology Laboratory at various points since its inception in 2003: the Wellcome Trust, the Royal Society, the BBSRC, the MRC and last, but by no means least, the European Union.

References

  1. Van Driessche, A.E.S., Van Gerven, N., Bomans, P.H.H. et al. (2018). Molecular nucleation mechanisms and control strategies for crystal polymorph selection. Nature 556: 89–94.
  2. McPherson, A. and Gavira, J.A. (2014). Introduction to protein crystallization. Acta Crystallogr. F Struct. Biol. Commun. 70 (Pt 1): 2–20.
  3. Parker, J.L. and Newstead, S. (2016). Membrane protein crystallisation: current trends and future perspectives. Adv. Exp. Med. Biol. 922: 61–72.
  4. Hahn, T. (2002). International Tables for Crystallography, Volume A: Space Group Symmetry. New York: Springer.
  5. Svensson, L.A., Surin, B.P., Dixon, N.E., and Spangfort, M.D. (1994). The symmetry of Escherichia coli cpn60 (GroEL) determined by X‐ray crystallography. J. Mol. Biol. 235 (1): 47–52.
  6. Antson, A.A., Brzozowski, A.M., Dodson, E.J. et al. (1994). 11‐fold symmetry of the trp RNA‐binding attenuation protein (TRAP) from Bacillus subtilis determined by X‐ray analysis. J. Mol. Biol. 244 (1): 1–5.
  7. Blake, C. and Phillips, D.C. (1962). Effects of X‐irradiation on single crystals of myoglobin. In Proceedings of the Symposium on the Biological Effects of Ionizing Radiation at the Molecular Level. Int. Atomic Energy Agency 183–191.
  8. Henderson, R. (1990). Cryo‐protection of protein crystals against radiation damage in electron and X‐ray diffraction. Proc. R. Soc. Lond. Ser. B Biol. Sci. 241 (1300): 6–8.
  9. Owen, R.L., Rudino‐Pinera, E., and Garman, E.F. (2006). Experimental determination of the radiation dose limit for cryocooled protein crystals. Proc. Natl Acad. Sci. USA 103 (13): 4912–4917.
  10. Garman, E. (2010). Radiation damage in macromolecular crystallography: what is it and why should we care? Acta Crystallogr. D 66 (4): 339–351.
  11. Matthews, B.W. (1968). Solvent content of protein crystals. J. Mol. Biol. 33 (2): 491–497.
  12. Acharya, R., Fry, E., Stuart, D. et al. (1989). The three‐dimensional structure of foot‐and‐mouth disease virus at 2.9 Å resolution. Nature 337 (6209): 709–716.
  13. Rossmann, M.G. and Blow, D.M. (1962). The detection of sub‐units within the crystallographic asymmetric unit. Acta Crystallogr. 15 (1): 24–31.
  14. Isaacs, N. (2016). A history of experimental phasing in macromolecular crystallography. Acta Crystallogr. Sect. D Biol. Crystallogr. 72 (3): 293–295.
  15. Taylor, G.L. (2010). Introduction to phasing. Acta Crystallogr. D Biol. Crystallogr. 66 (4): 325–338.
  16. Scapin, G. (2013). Molecular replacement then and now. Acta Crystallogr. D Biol. Crystallogr. 69 (11): 2266–2275.
  17. Hendrickson, W. (1991). Determination of macromolecular structures from anomalous diffraction of synchrotron radiation. Science 254 (5028): 51–58.
  18. Gonzalez, A. (2003). Faster data‐collection strategies for structure determination using anomalous dispersion. Acta Crystallogr. D Biol. Crystallogr. 59 (2): 315–322.
  19. Jones, T.A., Zou, J.‐Y., Cowan, S.W., and Kjeldgaard, M. (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr. A 47 (2): 110–119.
  20. Emsley, P., Lohkamp, B., Scott, W.G., and Cowtan, K. (2010). Features and development of Coot. Acta Crystallogr. D Biol. Crystallogr. 66 (4): 486–501.
  21. Brünger, A.T. (1992). Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature 355: 472–475.
  22. Urzhumtsev, A., Afonine, P.V., and Adams, P.D. (2013). TLS from fundamentals to practice. Crystallogr. Rev. 19 (4): 230–270.
  23. Slabinski, L., Jaroszewski, L., Rychlewski, L. et al. (2007). XtalPred: a web server for prediction of protein crystallizability. Bioinformatics 23 (24): 3403–3405.
  24. Romero, P., Obradovic, Z., and Dunker, A.K. (2004). Natively disordered proteins: functions and predictions. Appl. Bioinforma. 3 (2‐3): 105–113.
  25. Cuff, J.A., Clamp, M.E., Siddiqui, A.S. et al. (1998). JPred: a consensus secondary structure prediction server. Bioinformatics 14 (10): 892–893.
  26. Magnani, F., Serrano‐Vega, M.J., Shibata, Y. et al. (2016). A mutagenesis and screening strategy to generate optimally thermostabilized membrane proteins for structural studies. Nat. Protoc. 11 (8): 1554–1571.
  27. Montanier, C.Y., Correia, M.A., Flint, J.E. et al. (2011). A novel, noncatalytic carbohydrate‐binding module displays specificity for galactose‐containing polysaccharides through calcium‐mediated oligomerization. J. Biol. Chem. 286 (25): 22499–22509.
  28. Dong, A., Xu, X., Edwards, A.M., Midwest Center for Structural Genomics et al. (2007). In situ proteolysis for protein crystallization and structure determination. Nat. Methods 4 (12): 1019–1021.
  29. Walter, T.S., Meier, C., Assenberg, R. et al. (2006). Lysine methylation as a routine rescue strategy for protein crystallization. Structure 14 (11): 1617–1622.
  30. Derewenda, Z.S. and Vekilov, P.G. (2006). Entropy and surface engineering in protein crystallization. Acta Crystallogr. D Biol. Crystallogr. 62 (1): 116–124.
  31. Eiler, S., Gangloff, M., Duclaud, S. et al. (2001). Overexpression, purification, and crystal structure of native ERα LBD. Protein Expr. Purif. 22 (2): 165–173.
  32. Groftehauge, M.K., Hajizadeh, N.R., Swann, M.J., and Pohl, E. (2015). Protein‐ligand interactions investigated by thermal shift assays (TSA) and dual polarization interferometry (DPI). Acta Crystallogr. D Biol. Crystallogr. 71 (1): 36–44.
  33. Rismondo, J., Cleverley, R.M., Lane, H.V. et al. (2016). Structure of the bacterial cell division determinant GpsB and its interaction with penicillin‐binding proteins. Mol. Microbiol. 99 (5): 978–998.
  34. Schieferstein, J.M., Pawate, A.S., Varel, M.J. et al. (2018). X‐ray transparent microfluidic platforms for membrane protein crystallization with microseeds. Lab Chip 18 (6): 944–954.
  35. Haas, D.J. and Rossmann, M.G. (1970). Crystallographic studies on lactate dehydrogenase at −75 degrees C. Acta Crystallogr. B 26 (7): 998–1004.
  36. Petsko, G.A. (1975). Protein crystallography at sub‐zero temperatures: cryo‐protective mother liquors for protein crystals. J. Mol. Biol. 96 (3): 381–392.
  37. Rubinson, K.A., Ladner, J.E., Tordova, M., and Gilliland, G.L. (2000). Cryosalts: suppression of ice formation in macromolecular crystallography. Acta Crystallogr. D Biol. Crystallogr. 56 (8): 996–1001.
  38. Garman, E.F. and Mitchell, E.P. (1996). Glycerol concentrations required for cryoprotection of 50 typical protein crystallization solutions. J. Appl. Crystallogr. 29 (5): 584–587.
  39. Ciccone, L., Vera, L., Tepshi, L. et al. (2015). Multicomponent mixtures for cryoprotection and ligand solubilization. Biotechnol. Rep. 7: 120–127.
  40. Tailford, L.E., Ducros, V.M., Flint, J.E. et al. (2009). Understanding how diverse beta‐mannanases recognize heterogeneous substrates. Biochemistry 48 (29): 7009–7018.
  41. Parkin, S. and Hope, H. (1998). Macromolecular cryocrystallography: cooling, mounting, storage and transportation of crystals. J. Appl. Crystallogr. 31 (6): 945–953.
  42. Owen, R.L., Pritchard, M., and Garman, E. (2004). Temperature characteristics of crystal storage devices in a CP100 dry shipping dewar. J. Appl. Crystallogr. 37 (6): 1000–1003.
  43. Bourenkov, G.P. and Popov, A.N. (2006). A quantitative approach to data‐collection strategies. Acta Crystallogr. D Biol. Crystallogr. 62 (1): 58–64.
  44. Dauter, Z. (1999). Data‐collection strategies. Acta Crystallogr. D Biol. Crystallogr. 55 (10): 1703–1717.
  45. Krojer, T., Pike, A.C.W., and von Delft, F. (2013). Squeezing the most from every crystal: the fine details of data collection. Acta Crystallogr. D Biol. Crystallogr. 69 (7): 1303–1313.
  46. Aller, P., Geng, T., Evans, G., and Foadi, J. (2016). Applications of the BLEND software to crystallographic data from membrane proteins. Adv. Exp. Med. Biol. 922: 119–135.
  47. Bury, C.S., Brooks‐Bartlett, J.C., Walsh, S.P., and Garman, E.F. (2018). Estimate your dose: RADDOSE‐3D. Protein Sci. 27 (1): 217–228.
  48. Powell, H.R., Johnson, O., and Leslie, A.G. (2013). Autoindexing diffraction images with iMosflm. Acta Crystallogr. D Biol. Crystallogr. 69 (7): 1195–1203.
  49. Kabsch, W. (2010). XDS. Acta Crystallogr. D Biol. Crystallogr. 66 (2): 125–132.
  50. Winter, G., Waterman, D.G., Parkhurst, J.M. et al. (2018). DIALS: implementation and evaluation of a new integration package. Acta Crystallogr. D Struct. Biol. 74 (2): 85–97.
  51. Winter, G., Lobley, C.M., and Prince, S.M. (2013). Decision making in xia2. Acta Crystallogr. D Biol. Crystallogr. 69 (7): 1260–1273.
  52. Vonrhein, C., Flensburg, C., Keller, P. et al. (2011). Data processing and analysis with the autoPROC toolbox. Acta Crystallogr. D Biol. Crystallogr. 67 (4): 293–302.
  53. Adams, P.D., Afonine, P.V., Bunkoczi, G. et al. (2010). PHENIX: a comprehensive python‐based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66 (2): 213–221.
  54. Winn, M.D., Ballard, C.C., Cowtan, K.D. et al. (2011). Overview of the CCP4 suite and current developments. Acta Crystallogr. D Biol. Crystallogr. 67 (4): 235–242.
  55. Evans, P.R. and Murshudov, G.N. (2013). How good are my data and what is the resolution? Acta Crystallogr. D Biol. Crystallogr. 69 (7): 1204–1214.
  56. Evans, P.R. (2011). An introduction to data reduction: space‐group determination, scaling and intensity statistics. Acta Crystallogr. D Biol. Crystallogr. 67 (4): 282–292.
  57. Nettleship, J.E., Assenberg, R., Diprose, J.M. et al. (2010). Recent advances in the production of proteins in insect and mammalian cells for structural biology. J. Struct. Biol. 172 (1): 55–65.
  58. Walden, H. (2010). Selenium incorporation using recombinant techniques. Acta Crystallogr. D Biol. Crystallogr. 66 (4): 352–357.
  59. Baslé, A., Hewitt, L., Koh, A. et al. (2018). Crystal structure of NucB, a biofilm‐degrading endonuclease. Nucleic Acids Res. 46 (1): 473–484.
  60. Abendroth, J., Gardberg, A.S., Robinson, J.I. et al. (2011). SAD phasing using iodide ions in a high‐throughput structural genomics environment. J. Struct. Funct. Genom. 12 (2): 83–95.
  61. Dauter, Z., Dauter, M., de La Fortelle, E. et al. (1999). Can anomalous signal of sulfur become a tool for solving protein crystal structures? J. Mol. Biol. 289 (1): 83–92.
  62. Hendrickson, W.A. and Teeter, M.M. (1981). Structure of the hydrophobic protein crambin determined directly from the anomalous scattering of sulphur. Nature 290: 107–113.
  63. Liu, Q., Dahmane, T., Zhang, Z. et al. (2012). Structures from anomalous diffraction of native biological macromolecules. Science 336 (6084): 1033–1037.
  64. Murray, J.W., Rudino‐Pinera, E., Owen, R.L. et al. (2005). Parameters affecting the X‐ray dose absorbed by macromolecular crystals. J. Synchrotron Radiat. 12 (3): 268–275.
  65. de Sanctis, D., Oscarsson, M., Popov, A. et al. (2016). Facilitating best practices in collecting anomalous scattering data for de novo structure solution at the ESRF structural biology beamlines. Acta Crystallogr. D Struct. Biol. 72 (3): 413–420.
  66. El Omari, K., Iourin, O., Kadlec, J. et al. (2014). Pushing the limits of sulfur SAD phasing: de novo structure solution of the N‐terminal domain of the ectodomain of HCV E1. Acta Crystallogr. D Biol. Crystallogr. 70 (8): 2197–2203.
  67. Finke, A.D., Panepucci, E., Vonrhein, C. et al. (2016). Advanced crystallographic data collection protocols for experimental phasing. Methods Mol. Biol. 1320: 175–191.
  68. Dauter, Z. (2013). SAD/MAD Phasing. Adv. Methods Biomol. Crystallogr. 135–149.
  69. Ducros, V.M.A., Lewis, R.J., Verma, C.S. et al. (2001). Crystal structure of GerE, the ultimate transcriptional regulator of spore formation in Bacillus subtilis. J. Mol. Biol. 306 (4): 759–771.
  70. Pike, A.C., Garman, E.F., Krojer, T. et al. (2016). An overview of heavy‐atom derivatization of protein crystals. Acta Crystallogr. D Struct. Biol. 72 (3): 303–318.
  71. Crick, F.H.C. and Magdoff, B.S. (1956). The theory of the method of isomorphous replacement for protein crystals. I. Acta Crystallogr. 9 (11): 901–908.
  72. de Sanctis, D. and Nanao, M.H. (2012). Segmenting data sets for RIP. Acta Crystallogr. D Biol. Crystallogr. 68 (9): 1152–1162.
  73. Schoenborn, B.P., Watson, H.C., and Kendrew, J.C. (1965). Binding of xenon to sperm whale myoglobin. Nature 207: 28–30.
  74. Tilton, R.F., Kuntz, I.D., and Petsko, G.A. (1984). Cavities in proteins: structure of a metmyoglobin xenon complex solved to 1.9 Å. Biochemistry 23 (13): 2849–2857.
  75. Dauter, Z., Dauter, M., and Rajashankar, K.R. (2000). Novel approach to phasing proteins: derivatization by short cryo‐soaking with halides. Acta Crystallogr. D Biol. Crystallogr. 56 (2): 232–237.
  76. Schluenzen, F., Tocilj, A., Zarivach, R. et al. (2000). Structure of functionally activated small ribosomal subunit at 3.3 Å resolution. Cell 102 (5): 615–623.
  77. Beck, T., Krasauskas, A., Gruene, T., and Sheldrick, G.M. (2008). A magic triangle for experimental phasing of macromolecules. Acta Crystallogr. D Biol. Crystallogr. 64 (11): 1179–1182.
  78. Szczepanowski, R.H., Filipek, R., and Bochtler, M. (2005). Crystal structure of a fragment of mouse ubiquitin‐activating enzyme. J. Biol. Chem. 280 (23): 22006–22011.
  79. Bibby, J., Keegan, R.M., Mayans, O. et al. (2012). AMPLE: a cluster‐and‐truncate approach to solve the crystal structures of small proteins using rapidly computed ab initio models. Acta Crystallogr. D Biol. Crystallogr. 68 (12): 1622–1631.
  80. Caballero, I., Sammito, M., Millán, C. et al. (2018). ARCIMBOLDO on coiled coils. Acta Crystallogr. D Biol. Crystallogr. 74 (3): 194–204.
  81. Rodríguez, D.D., Grosse, C., Himmel, S. et al. (2009). Crystallographic ab initio protein structure solution below atomic resolution. Nat. Methods 6: 651–653.
  82. Kelley, L.A., Mezulis, S., Yates, C.M. et al. (2015). The Phyre2 web portal for protein modeling, prediction and analysis. Nat. Protoc. 10: 845–858.
  83. Long, F., Vagin, A.A., Young, P., and Murshudov, G.N. (2008). BALBES: a molecular‐replacement pipeline. Acta Crystallogr. D Biol. Crystallogr. 64 (1): 125–132.
  84. Keegan, R.M. and Winn, M.D. (2008). MrBUMP: an automated pipeline for molecular replacement. Acta Crystallogr. D Biol. Crystallogr. 64 (1): 119–124.
  85. Vagin, A. and Teplyakov, A. (1997). MOLREP: an automated program for molecular replacement. J. Appl. Crystallogr. 30 (6): 1022–1025.
  86. McCoy, A.J., Grosse‐Kunstleve, R.W., Adams, P.D. et al. (2007). Phaser crystallographic software. J. Appl. Crystallogr. 40 (4): 658–674.
  87. Sammito, M., Meindl, K., de Ilarduya, I.M. et al. (2014). Structure solution with ARCIMBOLDO using fragments derived from distant homology models. FEBS J. 281 (18): 4029–4045.
  88. Bibby, J., Keegan, R.M., Mayans, O. et al. (2013). Application of the AMPLE cluster‐and‐truncate approach to NMR structures for molecular replacement. Acta Crystallogr. D Biol. Crystallogr. 69 (11): 2194–2201.
  89. Wang, B.‐C. (1985). Resolution of phase ambiguity in macromolecular crystallography. Methods Enzymol. 115: 90–112.
  90. Murshudov, G.N., Skubak, P., Lebedev, A.A. et al. (2011). REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr. D Biol. Crystallogr. 67 (4): 355–367.
  91. Blanc, E., Roversi, P., Vonrhein, C. et al. (2004). Refinement of severely incomplete structures with maximum likelihood in BUSTER‐TNT. Acta Crystallogr. D Biol. Crystallogr. 60 (12): 2210–2221.
  92. Sheldrick, G. (2015). Crystal structure refinement with SHELXL. Acta Crystallogr. C 71 (1): 3–8.
  93. Langer, G., Cohen, S.X., Lamzin, V.S., and Perrakis, A. (2008). Automated macromolecular model building for X‐ray crystallography using ARP/wARP version 7. Nat. Protoc. 3 (7): 1171–1179.
  94. Cowtan, K. (2006). The buccaneer software for automated model building. 1. Tracing protein chains. Acta Crystallogr. D Biol. Crystallogr. 62 (9): 1002–1011.
  95. Kawai, Y., Marles‐Wright, J., Cleverley, R.M. et al. (2011). A widespread family of bacterial cell wall assembly proteins. EMBO J. 30 (24): 4931–4941.
  96. McNicholas, S., Potterton, E., Wilson, K.S., and Noble, M.E. (2011). Presenting your structures: the CCP4mg molecular‐graphics software. Acta Crystallogr. D Biol. Crystallogr. 67 (4): 386–394.
  97. Urzhumtseva, L., Afonine, P.V., Adams, P.D., and Urzhumtsev, A. (2009). Crystallographic model quality at a glance. Acta Crystallogr. D Biol. Crystallogr. 65 (3): 297–300.
  98. Yin, X., Scalia, A., Leroy, L. et al. (2014). Hitting the target: fragment screening with acoustic in situ co‐crystallization of proteins plus fragment libraries on pin‐mounted data‐collection micromeshes. Acta Crystallogr. D Biol. Crystallogr. 70 (5): 1177–1189.
  99. Delageniere, S., Brenchereau, P., Launer, L. et al. (2011). ISPyB: an information management system for synchrotron macromolecular crystallography. Bioinformatics 27 (22): 3186–3192.
  100. Wojdyr, M., Keegan, R., Winter, G. et al. (2014). DIMPLE ‐ a pipeline for the rapid generation of difference maps from protein crystals with putatively bound ligands. Acta Crystallogr. A 69: s299–s299.
  101. Pearce, N.M., Krojer, T., Bradley, A.R. et al. (2017). A multi‐crystal method for extracting obscured crystallographic states from conventionally uninterpretable electron density. Nat. Commun. 8: 15123.
  102. Sheldrick, G.M. (2008). A short history of SHELX. Acta Crystallogr. A 64 (1): 112–122.

Further Reading

  1. Blow, D. (2002). Outline of Crystallography for Biologists. Oxford University Press.
  2. Chayen, N., Helliwell, J., and Snell, E. (2010). Macromolecular Crystallization and Crystal Perfection. Oxford University Press.
  3. Carter, C. Jr. and Sweet, R. (eds.) (1997). Macromolecular Crystallography, Part A. Methods in Enzymology, Volume 276. Academic Press.
  4. Carter, C. Jr. and Sweet, R. (eds.) (1997). Macromolecular Crystallography, Part B. Methods in Enzymology, Volume 277. Academic Press.
  5. Carter, C. Jr. and Sweet, R. (eds.) (2003). Macromolecular Crystallography, Part C. Methods in Enzymology, Volume 368. Academic Press.
  6. Doublie, S. (2007). Macromolecular Crystallography Protocols, Volume 2: Structure Determination. Humana Press.
  7. Rupp, B. (2010). Biomolecular Crystallography: Principles, Practice and Application to Structural Biology. Garland Science.
  8. Sherwood, D. and Cooper, J. (2010). Crystals, X‐rays and Proteins. Oxford University Press.