Phasing in Crystallography
Total documents: 15

Published By Oxford University Press

ISBN: 9780199686995, 9780191918377

Author(s):  
Carmelo Giacovazzo

The title of this chapter may seem a little strange; it relates Fourier syntheses, an algebraic method for calculating electron densities, to the joint probability distribution functions of structure factors, which are devoted to the probabilistic estimation of s.i.s and s.s.s. We will see that the two topics are closely related, and that optimization of the Fourier syntheses requires prior knowledge and use of joint probability distributions. The distributions used in Chapters 4 to 6 are able to estimate an s.i. or s.s. by exploiting the information contained in the experimental diffraction moduli of the target structure (the structure one wants to phase). An important tool for such distributions is provided by the theories of neighbourhoods and of representations, which allow us to arrange, for each invariant or seminvariant Φ, the set of amplitudes in a sequence of shells, each contained within the subsequent one, with the property that any s.i. or s.s. may be estimated via the magnitudes constituting any shell. The resulting conditional distributions were of the type P(Φ | {R}), (7.1) where {R} represents the chosen phasing shell of observed magnitudes. The more information contained within the set of observed moduli {R}, the better the Φ estimate. By definition, conditional distributions (7.1) cannot change during the phasing process, because the prior information (i.e. the observed moduli) does not change; equation (7.1) maintains the same algebraic form throughout. However, during any phasing process, various model structures progressively become available, with different degrees of correlation with the target structure. Such models are a source of supplementary information (e.g. the current model phases) which, in principle, can be exploited during the phasing procedure. If this observation is accepted, the method of joint probability distributions, as described so far, should be suitably modified.
In symbolic terms, we should aim to derive conditional distributions P(Φ | {R}, {Rp}), (7.2) rather than (7.1), where {Rp} represents a suitable subset of the amplitudes of the model structure factors. Such an approach modifies the traditional phasing strategy described in the preceding chapters; indeed, the set {Rp} will change during the phasing process along with the model, and the probabilities (7.2) will be continuously modified accordingly.
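As a concrete, if simplified, illustration of a distribution of type (7.1), consider the classical Cochran distribution for a triplet invariant, P(Φ) proportional to exp(G cos Φ) with G = 2|E_h E_k E_(h-k)|/sqrt(N). The sketch below (all numerical values are illustrative, not taken from the text) normalizes the distribution by direct quadrature and shows that stronger magnitudes and fewer atoms sharpen the estimate of Φ around zero:

```python
import math

def cochran_density(phi, G):
    # Un-normalized Cochran-type density exp(G cos phi); the exact
    # normalization constant is 2*pi*I0(G) (modified Bessel function),
    # avoided here by normalizing numerically.
    return math.exp(G * math.cos(phi))

def expected_cos(G, steps=20000):
    # Numerical estimate of <cos(Phi)> under the Cochran distribution,
    # obtained by midpoint quadrature over (-pi, pi).
    num = den = 0.0
    for i in range(steps):
        phi = -math.pi + (2.0 * math.pi) * (i + 0.5) / steps
        w = cochran_density(phi, G)
        num += w * math.cos(phi)
        den += w
    return num / den

# Illustrative values: a strong triplet (large |E| product, small N)
# gives a sharp distribution centred on Phi = 0; a weak one is nearly flat.
G_strong = 2 * (2.0 * 2.0 * 2.0) / math.sqrt(50)    # G ~ 2.26
G_weak = 2 * (1.0 * 1.0 * 1.0) / math.sqrt(500)     # G ~ 0.09
cos_strong = expected_cos(G_strong)
cos_weak = expected_cos(G_weak)
```

The larger <cos Φ> is, the more reliably the triplet phase can be taken as zero; this is the sense in which "more information in {R}" means "better Φ estimate".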


Author(s):  
Carmelo Giacovazzo

In this chapter we summarize the basic concepts, formulas, and tables which constitute the essence of general crystallography. In Sections 1.2 to 1.5 we recall, without examples, definitions of unit cells, lattices, crystals, space groups, diffraction conditions, etc., together with their main properties: reading these may serve as a useful reminder and support for daily work. In Section 1.6 we establish and discuss the basic postulate of structural crystallography: it has never been explicitly formulated, but during any practical phasing process it is simply assumed to be true by default. We will also consider the consequences of this postulate and the caution necessary in its use. We recall the main concepts and definitions concerning crystals and crystallographic symmetry. Crystal. This is the periodic repetition of a motif (e.g. a collection of molecules; see Fig. 1.1). An equivalent mathematical definition is: the crystal is the convolution between a lattice and the unit cell content (for this definition see (1.4) below in this section). Unit cell. This is the parallelepiped containing the motif periodically repeated in the crystal. It is defined by the unit vectors a, b, c or, equivalently, by the six scalar parameters a, b, c, α, β, γ (see Fig. 1.1). A generic point in the unit cell is defined by the vector r = x a + y b + z c, where x, y, z are fractional coordinates (dimensionless and lying between 0 and 1). The volume of the unit cell is given by (see Fig. 1.2) V = a ∧ b · c = b ∧ c · a = c ∧ a · b. (1.1)
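The triple product (1.1) is equivalent to the well-known closed formula V = abc (1 − cos²α − cos²β − cos²γ + 2 cos α cos β cos γ)^(1/2), which uses only the six scalar cell parameters. A minimal sketch (the cell parameters are illustrative):

```python
import math

def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume from the six scalar cell parameters.

    Equivalent to the triple product V = a ^ b . c of eq. (1.1);
    the angles alpha, beta, gamma are given in degrees.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2
                                 + 2.0 * ca * cb * cg)

# A cubic cell (a = b = c, all angles 90 degrees) reduces to V = a**3.
V_cubic = cell_volume(5.0, 5.0, 5.0, 90, 90, 90)
```

For orthogonal cells the square root collapses to 1 and V = abc; for a hexagonal cell (γ = 120°) it reduces to the familiar abc·sin γ.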


Author(s):  
Carmelo Giacovazzo

The term anomalous scattering originates from the first research on light dispersion in transparent materials. It was found that, in general, the index of refraction increases when the wavelength decreases (this was considered to be normal). It was also found that, close to the absorption edges, the refractive index shows a negative slope; this effect was called anomalous. Today, it is clear that anomalous dispersion is a resonance effect. Indeed, atomic electrons may be considered to be oscillators with natural frequencies; they are bound to the nucleus by forces which depend on the atomic field strength and on the quantum state of the electron. If the frequency of the primary beam is close to one of these natural frequencies, resonance will take place (the concept of dispersion involves the change of a property with frequency).
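A toy numerical illustration of the consequence of this resonance: near an absorption edge the atomic scattering factor becomes complex, f = f0 + f′ + i f″, and a nonzero imaginary part breaks Friedel's law |F(h)| = |F(−h)| for non-centrosymmetric structures. The one-dimensional model and the f′, f″ values below are purely illustrative:

```python
import cmath

def structure_factor(h, atoms):
    # atoms: list of (f, x) pairs with (possibly complex) scattering
    # factor f and fractional coordinate x, in a 1D toy model.
    return sum(f * cmath.exp(2j * cmath.pi * h * x) for f, x in atoms)

# Two-atom non-centrosymmetric model: a light atom scattering normally
# and a heavier scatterer, first without and then with an anomalous
# (complex) correction f' + i f'' (values are illustrative).
atoms_normal = [(8.0, 0.10), (26.0, 0.37)]
atoms_anom = [(8.0, 0.10), (26.0 - 1.1 + 3.2j, 0.37)]

friedel_normal = abs(abs(structure_factor(2, atoms_normal)) -
                     abs(structure_factor(-2, atoms_normal)))
friedel_anom = abs(abs(structure_factor(2, atoms_anom)) -
                   abs(structure_factor(-2, atoms_anom)))
```

With real scattering factors F(−h) is the complex conjugate of F(h), so the two moduli coincide; the imaginary part f″ destroys this symmetry, and it is precisely this measurable Friedel difference that SAD-MAD techniques exploit.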


Author(s):  
Carmelo Giacovazzo

In a very traditional village game, popular during Lent (usually on pignatta day, the first Sunday of Lent), a young player, suitably blindfolded and armed with a long cudgel, tries to hit a pot (the pignatta) located some distance away, in order to win the sweetmeats contained inside. To break the pot, they take random steps, and at each step they try to hit the pot with the cudgel. Is it possible to guess the distance of the player from their starting position after n random steps? Is it possible to guess the direction of the vectorial resultant of the n steps? A very simple analysis of the problem suggests that the distance after n steps may be estimated, but the direction of the resultant step cannot, because a privileged orientation does not exist. The situation is very similar in structure factor statistics. Each of the N atoms in the unit cell provides the vectorial contribution fj = fj exp(2πi h · rj) = fj exp(iθj) to the structure factor; this is equivalent to a vectorial step of the pignatta player. The modulus of the atomic contribution, like the amplitude of a step in the pignatta game, is known (because the chemical composition of the molecules in the unit cell is supposed to be known), but the phase θj (corresponding to the direction of the step) remains unknown; indeed, we do not know the position rj of the jth atom. The analogy with the pignatta game suggests that some information on the moduli of the structure factors can be obtained via a suitable statistical approach, while no phase information can be obtained in this way. This chapter deals with just this statistical approach; owing to the seminal contributions of A. J. C. Wilson, we call it Wilson statistics.
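The pignatta analogy can be checked directly by simulation. The sketch below (a plain two-dimensional random walk with unit steps; the step count and number of trials are illustrative) verifies the classical result that for unit steps in uniformly random directions the mean squared displacement is <R²> = n, so the expected distance grows like sqrt(n), while the mean direction carries no information:

```python
import math
import random

def random_walk(n, rng):
    # n unit steps in uniformly random directions in the plane; returns
    # the (x, y) end point, i.e. the vectorial resultant of the n steps.
    x = y = 0.0
    for _ in range(n):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += math.cos(theta)
        y += math.sin(theta)
    return x, y

rng = random.Random(0)  # fixed seed for reproducibility
n, trials = 100, 2000
mean_r2 = sum(x * x + y * y
              for x, y in (random_walk(n, rng) for _ in range(trials))) / trials
# mean_r2 should be close to n = 100, while the average end-point
# direction is uniform: the amplitude is predictable, the phase is not.
```

This is exactly the content of Wilson statistics: the expected magnitude of the structure factor follows from the known chemical composition, while the phase remains statistically inaccessible.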


Author(s):  
Carmelo Giacovazzo

Modern phasing methods may be subdivided into: (a) ab initio approaches, which include direct methods, Patterson techniques, charge flipping, and VLD (vive la difference). These approaches do not use any prior information on the molecular geometry (although, suitably modified, some of them can). (b) Non-ab initio methods. In this category we include molecular replacement (MR), isomorphous derivative (SIR-MIR), and anomalous dispersion (SAD-MAD) approaches. MR exploits information on the molecular geometry (i.e. the target molecule is known to be similar to one present in another, previously solved structure), SIR-MIR uses the supplementary information contained in the experimental data from one or more isomorphous structures, and SAD-MAD exploits anomalous dispersion effects (we will see that such effects simulate isomorphism). It is immediately clear that the classification into ab initio and non-ab initio categories may be questionable, because it hides substantial diversities in the prior information. For example, SAD-MAD, unlike SIR-MIR, may use the native protein data only, and no prior information on the molecular geometry is necessary; apparently, it could therefore be considered to belong to the ab initio category. MR does not use supplementary experimental data, and therefore seems not to be similar to SAD-MAD and SIR-MIR. The latter two techniques are often referred to as experimental phasing approaches, but this appellation, too, is questionable; indeed, the experiment does not provide phases; these are derived by treating the experimental data, as in any other phasing approach. The above considerations suggest that a more precise, even if conventional, definition of ab initio methods is necessary; in this book, they are identified as those techniques which do not use the molecular geometry as prior information and exploit only native data, without anomalous dispersion effects.
We have seen in Section 12.8 that some approaches use low-level prior information, not specific to the current structure but valid for a large range of compounds (e.g. the coordination of some heavy atoms and the corresponding bond angles and distances). Such procedures may also be considered ab initio approaches; to this category we add ARCIMBOLDO, which combines with MR the ‘trivial’ information that a protein consists of smaller molecular fragments of known geometry (among which are α-helices). ARCIMBOLDO is summarized in Section 13.9.


Author(s):  
Carmelo Giacovazzo

Among the statistics freely available on the webpage of the Cambridge Structural Database, there is a detail of interest for this chapter: of the 596 910 crystal structures deposited up to 1 January 2012, only 1534 were solved from neutron data (see Table 1.11). No information is provided on the number of structures solved from electron data, because it is negligible (organic samples are soon damaged by electron beams). A statistical search of the Inorganic Crystal Structure Database (ICSD, Ver. 2012–1, about 150 000 entries; by courtesy of Thomas Weirich) for structures solved by means of electron diffraction, possibly in combination with other techniques, indicates a total of about 0.7%. In spite of their limited impact on the databases, electron and neutron diffraction play a fundamental role in materials science and in crystallography, mainly because they provide alternatives to X-ray techniques. Let us first consider electron diffraction (ED) techniques. The study of crystalline samples at the nanometre scale is mandatory for many industrial applications; indeed, physical properties depend on the crystal structure. Unfortunately, it is not unusual for compounds to exist only in the nanocrystalline state; traditional X-ray diffraction techniques for atomic structure determination then cannot be applied, because of the weak interaction between X-rays and matter. As a consequence, such structures remain unknown, in spite of their technological importance. This limits the contribution of X-ray crystallography to nanoscience, a growing scientific area, crucial to many fields, from semiconductors to pharmaceuticals and proteins. The result is a lack of knowledge of the underlying structure–property relationships, which often retards further research and development.
Structure analysis by electron diffraction began as early as the 1930s (in particular with Rigamonti, in 1936), but the interest of the crystallographic community in the technique soon faded, mostly because electron diffraction intensities are not routinely transferable into kinematical |F|² values. In spite of this limitation, the technique has been used to investigate the structure of many inorganic, organic, and metallo-organic crystals, biological structures, and various minerals, especially layer silicates.


Author(s):  
Carmelo Giacovazzo

According to the basic principles of structural crystallography, stated in Section 1.6: (i) it is logically possible to recover the structure from the experimental diffraction moduli; (ii) the necessary information lies in the diffraction amplitudes themselves, because they depend on the interatomic vectors. The first systematic approach to structure determination based on this principle was developed by Patterson (1934a,b). In the small-molecule field, the related techniques, even when computerized (Mighell and Jacobson, 1963; Nordman and Nakatsu, 1963), were relegated to a niche role by the advent of direct methods. Conversely, in macromolecular crystallography they survived and are still widely used today. Nowadays, Patterson techniques have been reborn as a general phasing approach, valid for small-, medium-, and large-sized molecules. The bases of Patterson methods are described in Section 10.2; in Section 10.3 some methods for Patterson deconvolution (i.e. for passing from the Patterson map to the correct electron density map) are described, and in Section 10.4 some applications to ab initio phasing are summarized. The use of Patterson methods in non-ab initio approaches such as MR, SAD-MAD, or SIR-MIR is deferred to Chapters 13 to 15. We do not want to leave this chapter without mentioning some fundamental relations between direct space properties and reciprocal space phase relationships. Patterson methods, unlike direct methods, seek phases in direct space; conversely, DM are the counterpart, in reciprocal space, of certain direct space properties (positivity, atomicity, etc.). One may wonder whether, by Fourier transform, it is possible to derive phase information directly from such properties, without the heavy probabilistic machinery. In Appendix 10.A we show some of the many relations between electron density properties and phase relationships, and in Appendix 10.B we summarize some relations between Patterson space and phase relationships.
Patterson (1949) defined a second synthesis, known as the Patterson synthesis of the second kind. Although theoretically interesting, it is of limited practical use. We provide information on it in Appendix 10.C.
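The central property of the Patterson synthesis, P(u) = Σ_h |F_h|² exp(−2πi h·u), is that it is computable from the observed moduli alone, no phases required, and that it peaks at the interatomic vectors. A one-dimensional toy sketch (the atomic positions, weights, and truncation order are illustrative):

```python
import cmath

# Toy 1D structure: point atoms at fractional coordinates, with weights
# standing in for scattering factors (illustrative values).
atoms = [(10.0, 0.15), (6.0, 0.40)]
H = 30  # truncation order of the Fourier series

def F(h):
    # Structure factor of the toy model.
    return sum(f * cmath.exp(2j * cmath.pi * h * x) for f, x in atoms)

# Squared moduli |F(h)|^2: the only "experimental" input a Patterson
# synthesis needs; no phase appears anywhere below.
I = {h: abs(F(h)) ** 2 for h in range(-H, H + 1)}

def patterson(u):
    # P(u) = sum_h |F(h)|^2 exp(-2 pi i h u), peaking at interatomic vectors.
    return sum(Ih * cmath.exp(-2j * cmath.pi * h * u)
               for h, Ih in I.items()).real

# The interatomic vector is +/-(0.40 - 0.15) = +/-0.25; away from the
# origin peak, the map maximum should sit there.
peak_u = max((i / 200 for i in range(10, 100)), key=patterson)
```

Deconvolution, the subject of Section 10.3, is the inverse step: recovering the atomic positions themselves from this map of interatomic vectors.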


Author(s):  
Carmelo Giacovazzo

The descriptions of the various types of Fourier synthesis (observed, difference, hybrid) and of their properties, given in Chapter 7, suggest that electron density maps are not only a tool for depicting the distribution of the electrons in the target structure, but also a source of information which may be continuously exploited during the phasing process, no matter whether ab initio or non-ab initio methods were used for deriving the initial model. Here, we will describe two important techniques based on the properties of electron density maps. (i) The recursive approach for phase extension and refinement called EDM (electron density modification). Such techniques have dramatically improved the efficiency of phasing procedures, which usually end with a limited percentage of phased reflections and non-negligible phase errors. EDM techniques allow us to extend phase assignment and to improve phase quality. The author is firmly convinced that the practical solution of the phase problem for structures with Nasym up to 200 atoms in the asymmetric unit may be jointly ascribed to direct methods and to EDM techniques. (ii) The AMB (automated model building) procedures; these may be considered to be partly EDM techniques, and they are used for the automatic building of molecular models from electron density maps. Essentially, we will refer to proteins; the procedures used for small- to medium-sized molecules have already been described in Section 6.3.5. Two new ab initio phasing approaches, charge flipping and VLD, essentially based on the properties of the Fourier transform, belong to the EDM category; since they require special treatment, they will be described in Chapter 9. Phase extension and refinement may be performed in reciprocal or in direct space. We described the former in Section 6.3.6; here, we are interested in direct space procedures, the so-called EDM techniques.
Such procedures are based on the following hypothesis: a poor electron density map, ρ, may be modified by a suitable function, f, to obtain a new map, ρmod, which better approximates the true map: ρmod(r) = f[ρ(r)]. (8.1) If the function f is chosen properly, more accurate phases can be obtained by Fourier inversion of ρmod; these may in turn be used to calculate a new electron density map.
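A minimal sketch of one cycle of type (8.1), using the simplest conceivable modification function f, truncation of negative density to zero (a positivity constraint), on a one-dimensional toy map; the grid size and resolution cut-off are illustrative:

```python
import cmath
import math

N = 64  # grid points in the 1D toy cell

def fourier_map(coeffs):
    # rho(x_i) from a dict {h: F_h} of structure factors (Fourier synthesis).
    return [sum(F * cmath.exp(2j * cmath.pi * h * i / N)
                for h, F in coeffs.items()).real / N
            for i in range(N)]

def invert_map(rho):
    # F_h by Fourier inversion of the (modified) map, up to |h| = 8.
    return {h: sum(r * cmath.exp(-2j * cmath.pi * h * i / N)
                   for i, r in enumerate(rho))
            for h in range(-8, 9)}

def edm_cycle(coeffs):
    # One cycle of eq. (8.1): synthesize rho, apply the modification
    # function f (here: clip negative density), re-derive the F_h by
    # Fourier inversion of rho_mod.
    rho = fourier_map(coeffs)
    rho_mod = [max(r, 0.0) for r in rho]
    return invert_map(rho_mod)

# Sanity check: a nonnegative, band-limited map is a fixed point of this
# cycle, since the positivity constraint is already satisfied.
rho0 = [1.0 + math.cos(2.0 * math.pi * i / N) for i in range(N)]
coeffs0 = invert_map(rho0)
coeffs1 = edm_cycle(coeffs0)
drift = max(abs(coeffs1[h] - coeffs0[h]) for h in coeffs0)
```

Real EDM procedures use far more sophisticated modification functions (solvent flattening, histogram matching, etc.), but the recursive scheme, modify in direct space, re-phase by Fourier inversion, is exactly the one shown.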


Author(s):  
Carmelo Giacovazzo

Which phasing methods can be included in the category of direct methods, and which require a different appellation? Originally, direct phasing was associated with those approaches able to derive phases directly from the diffraction moduli, without passing through a deconvolution of the Patterson function. Since a Patterson map provides interatomic distances, and therefore lies in ‘direct space’, direct methods were also referred to as reciprocal space methods, and Patterson techniques as real-space methods. Historically, direct methods use 3-, 4-, . . ., n-phase invariants and 1-, 2-, . . . phase seminvariants via the tangent formula or its modified algorithms. Since the 1950s, about half a century of scientific effort has fallen under the above definition. Such approaches are classified here as traditional direct methods. Today, the situation is more ambiguous, because: (i) modern direct methods programs involve steps operating both in reciprocal space and in direct space, the latter mainly devoted to phase extension and refinement (see Chapter 8); (ii) in the past decade, new phasing methods for crystal structure solution (see Chapter 9) have been developed, based on the properties of Fourier transforms, which again work both in direct and in reciprocal space. Should they therefore be considered to lie outside the direct methods category or not? Our choice is as follows. Direct methods are all of the approaches which allow us to derive phases from diffraction amplitudes without passing through a Patterson function deconvolution. Thus, we also include in this category charge flipping and VLD (vive la difference), here classified as non-traditional direct methods; their description is postponed to Chapter 9.
In accordance with the above assumptions, in this chapter we briefly illustrate traditional direct phasing procedures, with particular reference to those which are still in regular use today: mainly the tangent procedures (see Section 6.2) and the cosine least squares technique, which is the basic tool of the Shake-and-Bake method (see Section 6.4).
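The tangent formula mentioned above estimates the phase of reflection h from its triplet contributions as tan φ_h = Σ_k G_k sin(φ_k + φ_(h−k)) / Σ_k G_k cos(φ_k + φ_(h−k)), i.e. a weighted vector average of the phase indications. A minimal sketch (the weights and phases below are illustrative):

```python
import math

def tangent_phase(contributions):
    """Tangent-formula estimate of phi_h from triplet contributions.

    contributions: list of (G, phi_k, phi_hk), where G is the triplet
    weight (e.g. 2|E_h E_k E_(h-k)|/sqrt(N)) and phi_k, phi_hk are the
    current phases of reflections k and h-k (radians).
    """
    T = sum(G * math.sin(pk + pkh) for G, pk, pkh in contributions)
    B = sum(G * math.cos(pk + pkh) for G, pk, pkh in contributions)
    return math.atan2(T, B)

# If every triplet suggests the same value phi_k + phi_(h-k) = 0.7 rad,
# the formula returns exactly that value, whatever the weights.
phi = tangent_phase([(2.0, 0.3, 0.4), (1.5, 0.5, 0.2), (0.8, 0.1, 0.6)])
```

Using atan2 rather than a raw tangent keeps the estimate in the correct quadrant; in a real program the magnitude sqrt(T² + B²) also provides a reliability weight for the new phase.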


Author(s):  
Carmelo Giacovazzo

The isomorphous replacement method is a very old technique, used incidentally by Bragg to solve the NaCl and KCl structures; it was later formulated in a more general way by Robertson (1935, 1936) and by Robertson and Woodward (1937). Its modern formulation is essentially due to Green et al. (1954) and to Bragg and Perutz (1954), who applied the method to haemoglobin. The technique made possible the determination of the first three macromolecular structures: myoglobin, haemoglobin, and lysozyme. The approach may be summarized as follows. Suppose that the target structure is difficult to solve (e.g. it is a medium-sized structure resistant to any phasing attempt, or a protein with poor data resolution) and we want to adopt isomorphous replacement techniques. We should then perform the following steps: (a) Collect the diffraction data of the target structure; in the following, we will suppose that it is the native protein. (b) Crystallize a new compound in which one or more heavy atoms are incorporated into the target structure. This new compound is called the derivative. (c) Check whether the operations in (b) heavily disturb the target structure. If not, the derivative is called isomorphous; then, only local structural modifications (in the near vicinity of the binding site) are induced by the heavy-atom addition. Non-isomorphous derivative data are useless. (d) Use the two sets of diffraction data, say the set {|FP|} of the target structure and the set {|Fd|} of the isomorphous derivative, to solve the target structure. 
The above case is referred to as SIR (single isomorphous replacement). The reader should notice that redundant experimental information is available; indeed, two experimental sets of diffraction data relative to two isomorphous structures may be used simultaneously for solving the native protein. The redundancy of the experimental information allows crystal structure solution even if the data resolution is far from atomic (e.g. when RES is about 3 or 4 Å, and even lower in lucky cases). Imperfect isomorphism may hinder crystal structure solution. In that case, more derivatives may be prepared; their diffraction data may be used in a combined way and may more easily lead the phasing process to success.
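Step (d) rests on the basic SIR relation: if Fd = FP + FH, then |Fd|² = |FP|² + |FH|² + 2|FP||FH| cos(φP − φH), so once the heavy-atom substructure (and hence FH) is known, the two measured moduli restrict φP to two candidate values: the classic SIR phase ambiguity. A sketch with purely illustrative numbers:

```python
import cmath
import math

def sir_phase_candidates(mod_FP, mod_Fd, FH):
    # From Fd = FP + FH:
    #   |Fd|^2 = |FP|^2 + |FH|^2 + 2|FP||FH| cos(phiP - phiH),
    # giving the two-fold SIR ambiguity phiP = phiH +/- delta.
    phiH = cmath.phase(FH)
    c = (mod_Fd**2 - mod_FP**2 - abs(FH)**2) / (2.0 * mod_FP * abs(FH))
    delta = math.acos(max(-1.0, min(1.0, c)))  # clamp against noise
    return phiH + delta, phiH - delta

# Simulated "experiment" (illustrative numbers): a native contribution
# FP and a known heavy-atom contribution FH generate the two moduli.
FP_true = cmath.rect(20.0, 1.2)
FH = cmath.rect(6.0, 0.4)
cands = sir_phase_candidates(abs(FP_true), abs(FP_true + FH), FH)
# One of the two candidates matches the true phase 1.2 rad; the other
# is spurious, and a second derivative (MIR) or anomalous data is needed
# to resolve the ambiguity.
```

This two-fold ambiguity is precisely why a single derivative is often not enough and why the MIR and SAD-MAD combinations discussed later are so effective.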

