Shelx: Applications to Macromolecular Crystallography - GWDG


The Shelx Suite: Applications to Macromolecular Crystallography
Tim Grüne
Dept. of Structural Chemistry, University of Göttingen
June 2011
http://shelx.uni-ac.gwdg.de
[email protected]

SBGrid Symposium 2011: The Shelx Suite


Package Content

The “Shelx Suite” [1] consists of:

SHELXS: (small molecule) structure solution by Patterson & direct methods
SHELXD (mx): (small molecule) structure solution and (macromolecular) heavy atom location by direct methods
SHELXE (mx): (macromolecular) phasing and density modification
SHELXL (mx): small molecule and macromolecular structure refinement
CIFTAB: creation of tables from CIF-files for publication
SHELXC (mx): data preparation for macromolecular phasing with shelxd & shelxe
SHELXPRO (mx): collection of conversion utilities (e.g. PDB → .ins & .res → PDB)
SHELXWAT: automated solvent molecule search for macromolecules (rather obsolete with coot)

The download page also contains the programs mtz2sca and mtz2hkl [3] for conversion of mtz-format files to sca- and hkl-format, respectively.


SHELX (mx)

Programs labelled with mx are (also) used in macromolecular crystallography.

SHELXD: originally written for structure solution of small molecules by direct methods, now also used for substructure solution in experimental phasing (this was possible without modification of the program).
SHELXE: density modification and (β-version) auto-tracing of protein structures
SHELX C/D/E: the “triad” shelx c/d/e is best known for experimental phasing
SHELXL: high resolution refinement and refinement against neutron diffraction data
SHELXPRO: preparation of .ins files from PDB files, including standard restraints for peptides and nucleic acid structures; creation of maps for O and XtalView (useful in the “pre-coot era”)


Impact on Crystallography

The SHELX programs shelxd/s (structure solution with direct methods) and especially shelxl (refinement) have long been the de facto standard for small molecule crystallography. This is best illustrated by the rise of the impact factor of Acta Crystallographica A in 2009/2010, which was driven by citations of the SHELX paper [1] (http://www.iucr.org/index.html/leading-article/2010/2010-07-12).

In macromolecular crystallography, shelxd is one of the major phasing programs and is also used by many pipelines like autoSharp, crank, autorickshaw, ... shelxl is less popular when it comes to macromolecules (the major programs are refmac5 [7] & phenix.refine [5]). Garib Murshudov’s favourite quote on new features in refmac5:

“[. . . feature xyz] has been available in SHELXL since the beginning of time.”


Macromolecular Crystallography

In macromolecular crystallography, the shelx programs have some alternatives:

SHELXD
• Sharp www.globalphasing.com
• solve https://solve.lanl.gov
• bp3 http://www.bfsc.leidenuniv.nl/software/bp3
• SnB http://www.hwi.buffalo.edu/snb, ...

SHELXE
• DM
• parrot http://www.ysbl.york.ac.uk/~cowtan/parrot/parrot.html
• buccaneer http://www.ysbl.york.ac.uk/~cowtan/buccaneer/buccaneer.html
• resolve https://solve.lanl.gov, ...

SHELXL
• Refmac5 http://www.ysbl.york.ac.uk/~garib/refmac/
• Phenix http://www.phenix-online.org
• Buster www.globalphasing.com
• TNT http://www.uoxray.uoregon.edu/tnt/, ...

(These lists are far from complete.)


Additional Programs

The following programs are also authored by George Sheldrick but distributed by Bruker AXS:

xprep: space group determination, data analysis and preparation of phasing
sadabs: scaling of integrated data (normal and modulated crystals)
twinabs: scaling of integrated data (non-merohedrally twinned crystals)

sadabs and twinabs are fine-tuned to work with the data processing program saint (Bruker AXS) and produce excellent results even with twinned macromolecular data.

Scaling with sadabs

For the transition from HKL2000 data (.x-files) to scaling with sadabs (instead of scalepack), the Shelx homepage provides the program x2sad.
For the transition from XDS (XDS_ASCII.HKL) to scaling with sadabs (instead of scala or xscale), the Shelx homepage provides the program xds2sad.


Program Philosophy

All shelx programs are command line programs, i.e. they are started from a terminal window (Linux/UNIX) or the command prompt (Windows).

• The main programs shelxd, shelxs and shelxl require an .ins file with instructions.
• shelxe takes all its options from the command line.
• shelxpro, xprep and sadabs/twinabs are interactive programs offering (text-based) menus.


Structure Refinement with Shelxl


Advantages of Shelxl

• Enormous flexibility, even refinement of Laue data and neutron data
• Refinement of (multiple domain) twin data
• Refinement against intensities ⇒ inclusion of negative intensities
• Proper treatment of anisotropy and symmetry
• Calculation of standard uncertainties (esds) for small structures
• Availability of a parallelised version (shelxl_mp)


Reasons for Shelxl Refinement

• Anisotropic refinement (at d ≲ 1.5 Å)
• Twin refinement
• Occupancy refinement (ligand studies)
• Complicated disorder (free variables)
• Laue data
• Neutron diffraction data (time-of-flight Laue)


Input to Shelxl

Shelxl requires two input files with the same basename:

1. myname.hkl
shelxl reads data from plain text files, one reflection per line. Several different formats are allowed. Most commonly used:
• HKLF 4: H K L Iobs σI, i.e. Miller indices followed by intensity data
• HKLF 5: H K L Iobs σI m, where m is the twin domain
• HKLF 2: H K L I σI BN λ, (X-ray or neutron) Laue data

2. myname.ins
Instruction file containing a header with instructions, restraints, and constraints, followed by the list of atoms. For historical reasons (punch card era) line widths are restricted to 80 characters (columns). A line can be continued by placing an ’=’ sign as its last character (before column 80) and leaving the first four characters of the continuation line empty.
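To illustrate the HKLF 4 layout described above, here is a minimal Python sketch that reads one reflection record. It assumes the fixed-width layout of three 4-character integers followed by two 8-character reals; the names Reflection and parse_hklf4_line and the sample records are hypothetical and not part of the Shelx distribution.

```python
from typing import NamedTuple


class Reflection(NamedTuple):
    h: int
    k: int
    l: int
    intensity: float
    sigma: float


def parse_hklf4_line(line: str) -> Reflection:
    """Parse one HKLF 4 record, assuming fixed columns of width 4, 4, 4, 8, 8."""
    h = int(line[0:4])
    k = int(line[4:8])
    l = int(line[8:12])
    intensity = float(line[12:20])
    sigma = float(line[20:28])
    return Reflection(h, k, l, intensity, sigma)


# Hypothetical record with a negative intensity, which is kept as it is.
sample = "%4d%4d%4d%8.2f%8.2f" % (1, -2, 3, -4.25, 1.5)
print(parse_hklf4_line(sample))
```

Slicing by column rather than splitting on whitespace is a deliberate choice in this sketch: in a fixed-width file, neighbouring numbers may touch when they fill their columns completely.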


Input File Generation

With macromolecular data it is most convenient to start model building and refinement with phenix [5] or refmac5 [7] and to switch to shelxl once the model is relatively complete.

hkl-file: apply mtz2hkl [3] to the mtz-file used as input to refmac5/phenix.
ins-file: use shelxpro to create the ins-file from the PDB-file. It automatically includes restraints for standard amino acids and nucleic acids as well as code for riding hydrogen positions (HFIX, initially commented out).


Shelxl Interaction with Coot

• Model building with coot [6] now works very well with shelxl refinement (reading of name.res and name.fcf, writing of updated name new-round.ins).
• Automatic generation of (σA-weighted) map and difference map from the fcf-file.
• Save updated coordinates to the .ins-file, but check that the occupancies of newly placed atoms (solvent, ions) are 11.0 and not 1.0 (this is a bug in coot version 0.6.1).


Coot: Displaying Hydrogen

Riding hydrogens are not moved upon refinement in coot. It is sufficient not to display them in coot (Edit -> Bond Parameters)

Hydrogens are handled by shelxl with the AFIX command which ignores the coordinates of the calculated atom positions in the .ins-file.


Fractional Coordinates

Unlike PDB-files, shelxl stores atom coordinates in the ins-file as fractional coordinates.

Isotropic atom:

    O     4     0.4541  -0.0399   0.2690   11.00   0.18181
    atom  type  x       y         z        occ     Uiso

Anisotropic atom (the ’=’ marks the line continuation):

    N     3     0.2722  -0.1317  -0.1280   11.00 =
          0.24975  0.13001  0.18948  -0.03210  -0.05152  0.00098
          U11      U22      U33      U23       U13       U12

The anisotropic, symmetric ADP matrix (U_ij) of each atom enters the calculation of the structure factor:

$$
F(hkl) = \sum_{j}^{\text{atoms in unit cell}} f_j(\theta_{hkl})\,
\exp\!\left[-2\pi^2
\begin{pmatrix} h a^* & k b^* & l c^* \end{pmatrix}
\begin{pmatrix} U_{11} & U_{12} & U_{13} \\ U_{12} & U_{22} & U_{23} \\ U_{13} & U_{23} & U_{33} \end{pmatrix}
\begin{pmatrix} h a^* \\ k b^* \\ l c^* \end{pmatrix}
\right]
e^{2\pi i (h x_j + k y_j + l z_j)}
$$


• Atom names are arbitrary (up to 4 characters, digits and letters, except keywords)
• The scattering factor is derived from the atom type and the SFAC keyword:

    SFAC  C  H  N  O  S
    type  1  2  3  4  5

Elements of the periodic table, given by their element names, have their scattering properties predefined in shelxl.
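The link between the atom type number and the SFAC card can be sketched in a few lines of Python. This is only an illustration for the short form of SFAC that lists element names; the function name sfac_elements is hypothetical.

```python
def sfac_elements(sfac_line: str) -> dict[int, str]:
    """Map the 1-based atom type number to the element symbol on an SFAC card."""
    tokens = sfac_line.split()
    if not tokens or tokens[0].upper() != "SFAC":
        raise ValueError("not an SFAC line")
    # The first element after the keyword is type 1, the second type 2, ...
    return {number: symbol.capitalize()
            for number, symbol in enumerate(tokens[1:], start=1)}


elements = sfac_elements("SFAC C H N O S")
print(elements[4])  # atom type 4 -> O
print(elements[3])  # atom type 3 -> N
```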


FVAR: Free variables

The use of fractional coordinates and U- instead of B-values allows for one of the major strengths of shelxl: the use of free variables as restraints and constraints.


FVAR: The Concept

Free variables are enumerated by the FVAR card:

    FVAR  0.07531  0.54646  0.56437  0.60583
    no.   1        2        3        4

(The first free variable is used as the scale factor between calculated and observed data.)

Numbers in atom descriptions (and in SUMP, CHIV, and DFIX) are interpreted as 10m + p, where m is an integer and −5 < p < 5:

m = 0: p is freely refined, e.g. coordinates (x, y, z).
m = 1: p is fixed and not refined at all. Usually used for occupancies, occ = 11.00.
m > 1: the parameter is tied to the m-th number of the FVAR card and takes the value p × fvar(m). This way groups of atoms can be refined together, e.g. the occupancy of ligand molecules (21.00 means 1.0 × fvar(2)).
m < −1: the parameter is constrained to p × (fvar(|m|) − 1), so that an occupancy of −21.00 equals 1 − fvar(2). The occupancy of a two-fold disorder is thus handled by a single parameter.
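The 10m + p coding can be made concrete with a small Python sketch. It assumes that the value tied to a free variable is scaled by p, which for p = ±1 reduces to fvar(m) and 1 − fvar(m) as used for occupancies; the function name decode_parameter and the returned labels are hypothetical, and the case m = −1 is rejected as an assumption of this sketch.

```python
def decode_parameter(value: float, fvar: list[float]) -> tuple[str, float]:
    """Split a coded number into m and p and return (mode, effective value).

    fvar holds the numbers of the FVAR card; fvar[0] is free variable 1.
    """
    m = round(value / 10.0)   # nearest multiple of ten
    p = value - 10.0 * m      # remainder with -5 < p < 5
    if m == 0:
        return "free", p                      # refined independently
    if m == 1:
        return "fixed", p                     # held at p
    if m > 1:
        return "fvar", p * fvar[m - 1]        # tied to free variable m
    if m < -1:
        return "1-fvar", p * (fvar[-m - 1] - 1.0)  # complement of free variable |m|
    raise ValueError("m = -1 is not handled in this sketch")


fvar_card = [0.14412, 0.55387]               # values taken from an FVAR card
print(decode_parameter(11.0, fvar_card))     # fixed occupancy of 1.0
print(decode_parameter(21.0, fvar_card))     # occupancy tied to free variable 2
print(decode_parameter(-21.0, fvar_card))    # complementary occupancy
```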


FVAR-Example: occupancy refinement

Simplest application: occupancy refinement of a partially occupied ligand, here a partially occupied glycerol in a protein structure.

Setting the occupancies of all atoms to 0.5 and refining them independently results in chemical nonsense:

    O3 occ.: 0.68    C3 occ.: 0.31    C2 occ.: 0.42    ...

Solution: use FVAR #2.

Before refinement:

    FVAR 0.14497 0.5
    O3 occ.: 21.0    C3 occ.: 21.0    C2 occ.: 21.0    ...

After refinement:

    FVAR 0.14412 0.55387
    O3 occ.: 21.0    C3 occ.: 21.0    C2 occ.: 21.0    ...


Alternative Conformations: PART + FVAR

Partially occupied ligands often lead to alternative conformations of side-chains: the interaction with the ligand in those unit cells where it is present may result in a different orientation than in those unit cells where the ligand is missing.

shelxl provides PARTs for mutually exclusive interactions: all atoms in PART n can make bonds to each other and to the atoms in PART 0, but not to atoms in other PARTs. E.g. the occupancy of a two-fold disorder can be modelled by using FVAR N for one part and FVAR −N for the other one. The occupancy coded with −N is refined to 1 − (value of N).

Example (Thomas R. Schneider): 1 parameter describes the occupancies of 22 atoms!

    RESI 233 TYR
    ..
    PART 1
    CB .. 31.0 ..
    PART 2
    CB .. -31.0 ..
    PART 0
    ..

    RESI 123 THR
    ..
    PART 1
    CB .. 31.0 ..
    PART 2
    CB .. -31.0 ..
    PART 0
    ..


SUMP: Modelling more than two-fold Disorder

• PARTs with three-fold or higher disorder: assign one free variable to each.
• Restrain the sum of all fvar’s with SUMP:

    SUMP 1.0 0.01 1 19 1 20 1 21

  1 × fvar(19) + 1 × fvar(20) + 1 × fvar(21) = 1.0 ± 0.01
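As a sketch of what the SUMP card expresses, the following Python function evaluates the weighted sum of free variables and compares it with the target. The token order (target, sigma, then pairs of weight and free variable number) is read off the example above; the function name sump_deviation and the free variable values are hypothetical.

```python
def sump_deviation(sump_line: str, fvar: dict[int, float]) -> tuple[float, float]:
    """Return (weighted sum - target, sigma) for one SUMP card.

    fvar maps the free variable number to its current value.
    """
    tokens = sump_line.split()
    if tokens[0].upper() != "SUMP":
        raise ValueError("not a SUMP line")
    target, sigma = float(tokens[1]), float(tokens[2])
    weights = tokens[3::2]   # 1st, 3rd, 5th, ... token after sigma
    numbers = tokens[4::2]   # 2nd, 4th, 6th, ... token after sigma
    total = sum(float(w) * fvar[int(n)] for w, n in zip(weights, numbers))
    return total - target, sigma


# Hypothetical occupancies of a three-fold disorder.
occupancies = {19: 0.5, 20: 0.3, 21: 0.2}
deviation, sigma = sump_deviation("SUMP 1.0 0.01 1 19 1 20 1 21", occupancies)
print(abs(deviation) <= sigma)
```

The restraint only pulls the sum towards the target within sigma; it does not force the three occupancies to add up to exactly 1.0.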


Anisotropic Refinement

• Can be carried out if the data:parameter ratio is sufficiently high (roughly at about 1.5 Å or better)
• Increases the number of parameters per atom from 4 to 9 ⇒ should only be started at the end of refinement, when the model is fairly complete

Transition from isotropic to anisotropic refinement: enclose the corresponding region in the .ins-file with

    ANIS
    . . .
    ANIS 0

shelxl automatically and correctly sets up symmetry-related constraints of ADPs.
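The effect of going from 4 to 9 parameters per atom on the data:parameter ratio can be worked through with a short Python sketch. The atom and reflection counts are hypothetical, restraints are ignored, and the function names are made up for this illustration.

```python
def parameter_count(n_atoms: int, anisotropic: bool) -> int:
    """x, y, z plus Uiso (4) or x, y, z plus six Uij (9) per atom."""
    per_atom = 9 if anisotropic else 4
    return n_atoms * per_atom


def data_to_parameter_ratio(n_reflections: int, n_atoms: int, anisotropic: bool) -> float:
    return n_reflections / parameter_count(n_atoms, anisotropic)


# Hypothetical protein: 2000 non-hydrogen atoms, 60000 unique reflections.
iso = data_to_parameter_ratio(60000, 2000, anisotropic=False)
aniso = data_to_parameter_ratio(60000, 2000, anisotropic=True)
print(f"isotropic: {iso:.2f}, anisotropic: {aniso:.2f}")
```

With these made-up numbers the ratio drops from 7.5 to about 3.3, which is why the ADP restraints become important once ANIS is switched on.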


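Anisotropic Restraints

Chemical environment has an effect on the ADPs of bonded or neighbouring atoms.

Restraints on ADPs (figure: Thomas R. Schneider):

DELU: rigid-bond restraint, the displacement components of two bonded atoms along the bond are kept similar
SIMU: atoms close in space are restrained to have similar ADPs
ISOR: atoms are restrained to behave approximately isotropically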


shelxle: A GUI for shelxl

shelxl now has a GUI, shelxle, developed by Christian Hübschle ([email protected]) in direct collaboration with George Sheldrick. The program has reached a stable state and can be obtained as a β-version upon email request to Christian Hübschle.

A video of the program in action is available at http://ewald.ac.chemie.uni-goettingen.de/lehre/pm.html.


Macromolecular Phasing with shelx c/d/e


Possible Phasing Techniques with shelx c/d/e

The triad shelx c/d/e can be used in experimental phasing for

• S/MAD: single-/multi-wavelength anomalous dispersion
• SIR: single isomorphous replacement
• RIP: radiation damage induced phasing
• SIRAS: combination of SIR and SAD
• RIPAS: combination of RIP and SAD


Latest Improvements in shelxd

The latest β-version of shelxd, shelxd_mp, is available upon email request to [email protected] or [email protected]. shelxd_mp is a parallelised version using OpenMP.

• Parallel version shelxd_mp: approximately 29 times faster on a 32 CPU machine than the single CPU version.
• shelxd_mp runs faster even on 1 CPU compared to the previous version (due to improved calculation of the Patterson Minimum Function, PMSF).
• Criterion for the “best” solution: CFOM = CC + CCweak


Automated Model Building in shelxe

shelxd often finds a correct solution to the substructure, even at low (anomalous) resolution (5-8 Å). Still, the data are sometimes not good enough for density modification programs to produce an interpretable map.

The option -a lets the beta-version of shelxe try to build a poly-ALA model into the electron density, iterating between model building and density modification. This has pushed shelxe to produce interpretable electron density maps in cases which were hopeless before.


Molecular Replacement Boosts

The flexibility of the shelx programs has often allowed them to be “abused” for applications they were not initially intended for, e.g. macromolecular phasing with the direct methods program shelxd.

Density modification can work with very poor phase information. This recently led to using shelxe as arbiter for extremely low quality molecular replacement solutions, e.g. in cases where only low homology starting models are available. It is as easy as

    shelxe mymodel.pda -a30 -s0.6 -q -y2.0

This creates a poly-alanine trace (mymodel.pdb) for the data stored in mymodel.hkl (e.g. extracted from the phaser input mtz-file with mtz2hkl [3]). Because of the way shelxe works, this procedure also removes model bias from the resulting electron density map. It works best with data to 2.0 Å resolution but should also work at 2-3 (’ish) Å.
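When many molecular replacement solutions have to be tested, the command line above can be assembled in a script. The following Python sketch only builds the argument list and does not run shelxe; the function name and the keyword names (trace_cycles, solvent_fraction, start_resolution) are guesses at the meaning of the flags and should be checked against the shelxe help output.

```python
import shlex


def shelxe_mr_command(stem: str,
                      trace_cycles: int = 30,
                      solvent_fraction: float = 0.6,
                      start_resolution: float = 2.0) -> list[str]:
    """Build the shelxe call for <stem>.pda (model) and <stem>.hkl (data)."""
    return [
        "shelxe",
        f"{stem}.pda",
        f"-a{trace_cycles}",
        f"-s{solvent_fraction}",
        "-q",
        f"-y{start_resolution}",
    ]


command = shelxe_mr_command("mymodel")
print(shlex.join(command))  # shelxe mymodel.pda -a30 -s0.6 -q -y2.0
```

The list can be handed to subprocess.run(command, check=True) in the directory that holds mymodel.pda and mymodel.hkl, provided shelxe is on the search path.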


Phaser & Shelxe

Data courtesy A. Thorn.

• Small search fragments used with phaser for a 1.7 Å data set.
• Phaser TFZ-score and shelxe-CC do not correlate.
• TFZ > 8 is not a reliable criterion with this method.
• Correct solutions are marked by CC > 25 %.


Arcimboldo

The computer program Arcimboldo [8] pushes this to extremes:

1. Phaser: search with a small helical motif
2. Write out many solutions
3. Let shelxe expand the solutions
4. Keep those with CC > 25 %

Since this works without an input model and without experimental phase information, this method can be regarded as an ab initio method for macromolecules at 2 Å resolution!
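The last step of this scheme, keeping the expansions with CC above 25 %, can be sketched in Python. The data structure, the function name and all scores below are hypothetical; the sketch only illustrates the selection by the shelxe CC rather than by the phaser TFZ-score.

```python
from typing import NamedTuple


class Expansion(NamedTuple):
    name: str
    tfz: float         # phaser translation function Z-score
    cc_percent: float  # CC reported by shelxe after expansion, in per cent


def keep_promising(expansions: list[Expansion], cc_cutoff: float = 25.0) -> list[Expansion]:
    """Keep only the expansions whose CC exceeds the cutoff."""
    return [e for e in expansions if e.cc_percent > cc_cutoff]


# Made-up scores: a high TFZ does not guarantee a high CC.
trials = [
    Expansion("helix_001", tfz=9.1, cc_percent=12.4),
    Expansion("helix_002", tfz=6.3, cc_percent=31.8),
    Expansion("helix_003", tfz=8.4, cc_percent=25.0),
]
for solution in keep_promising(trials):
    print(solution.name, solution.cc_percent)
```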


shelxe: Getting the best out of your data

Figure: Electron density from the initial helix fragment (7 out of 145 residues) after 20 cycles of density modification with shelxe.

Figure: The same region after density modification combined with poly-ALA model building.


Availability

The Shelx programs are available free of charge for academic users and for 2499 USD for for-profit users. The application form is available from shelx.uni-ac.gwdg.de/SHELX.

The latest version is SHELX-97, which was released in 1997. The next major release is scheduled for 2012.

β-test versions of some of the programs (currently notably shelxe with auto-tracing of proteins and the multiprocessor version of shelxd) are available to Shelx users upon email request to

George Sheldrick ([email protected])
Tim Grüne ([email protected])


References

1. G. M. Sheldrick, A short history of SHELX, Acta Crystallogr. (2008), A64
2. G. M. Sheldrick, Experimental phasing with SHELXC/D/E: combining chain tracing with density modification, Acta Crystallogr. (2010), D66
3. T. Grüne, mtz2sca and mtz2hkl: facilitated transition from CCP4 to the SHELX program suite, J. Appl. Cryst. (2008), 41(1)
4. C. Hübschle, University of Göttingen
5. P. D. Adams et al., PHENIX: a comprehensive Python-based system for macromolecular structure solution, Acta Crystallogr. (2010), D66, 213–221
6. P. Emsley et al., Features and Development of Coot, Acta Crystallogr. (2010), D66
7. The CCP4 Suite: Programs for Protein Crystallography, Acta Crystallogr. (1994), D50, 760–763
8. D. D. Rodríguez et al., Crystallographic ab initio protein structure solution below atomic resolution, Nature Methods (2009), 6(9); http://chango.ibmb.csic.es/ARCIMBOLDO
