Structure Factor Conversion and Validation



  • Introduction
  • Input Format of the Coordinate and Structure Factor Files
  • Structure Factor File Format Conversion
  • Check Model against Structure Factors
  • Results Summary

    Introduction

    The SF-TOOL web server converts between structure factor file formats and validates a model against its structure factors. Format conversion is performed by the program sf_convert. Validation against the coordinates (X-ray or neutron diffraction) is handled by DCC, which wraps the external programs sfcheck, refmac, and phenix.model_vs_data. Details of any warning or error messages are reported during format conversion and validation.

    The web interface also generates sigmaA-weighted 2mFo-DFc and mFo-DFc electron density maps in both CCP4 and DSN6 formats. These maps cover either the whole asymmetric unit or the regions around all ligands and peptides. The electron density maps for the ligands and peptides can be displayed in Jmol.

    The mmCIF format of the structure factor file is used for wwPDB deposition. If your structure is very large (such as a ribosome) and your PDB file is split into several entries, you need to combine all the coordinates into one file before uploading it to the server for validation. For a virus refined with NCS constraints, you need to provide the NCS matrices in your PDB file.

    Input Format of the Coordinate and Structure Factor Files

    For the coordinate file, both PDB and mmCIF formats are supported.

    The input structure factor file can be in any one of the following formats: mmCIF, CIF, MTZ, CNS, Xplor, HKL2000, Scalepack, Dtrek, TNT, SHELX, SAINT, EPMR, XSCALE, XPREP, XTALVIEW, X-GEN, XENGEN, MULTAN, MAIN. Here, mmCIF is the CIF format for macromolecules and CIF is the CIF format for small molecules.

    Structure Factor File Format Conversion
    1. Upload your structure factor file. (For format conversion only, the coordinate file is optional.)
    2. Select the desired output SF file format from the drop-down menu beside it.
    3. Click the RUN button at the bottom of the page.
    4. If the input file is in CNS (.cv) or MTZ format and you know the column labels, the semi-automatic method is recommended.

    If you wish to convert TNT, SHELX, or OTHER SF files to another format, you must indicate whether the data contain amplitudes (F) or intensities (I).

    Automatic vs. Semi-Automatic conversion

    Most reflection/structure factor data are organized into (at minimum) 5 columns (see examples). The default column assignments used by most programs are:

    H  K  L  F(or I)  SigF(or SigI or status)  ...and a floating Free_R flag column
    with at least one space between columns. These column labels may not be displayed, depending on the program's particular format, but for MTZ files they are shown in the mtz_dump output.
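
    As a rough illustration of this layout, the Python sketch below splits one whitespace-separated reflection record into named fields. The helper name parse_reflection_line, the assumption that the fifth column is a numeric sigma (not a status code), and the assumption that a sixth column holds the Free_R flag are all made up for this example; this is not how sf_convert itself reads files.

```python
from typing import NamedTuple, Optional


class Reflection(NamedTuple):
    h: int
    k: int
    l: int
    value: float              # amplitude F or intensity I
    sigma: float              # SigF or SigI (assumed numeric here)
    free_flag: Optional[int]  # None when the record has no sixth column


def parse_reflection_line(line: str) -> Reflection:
    """Split one whitespace-separated reflection record (hypothetical helper)."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 columns, got {len(fields)}")
    h, k, l = (int(x) for x in fields[:3])
    value, sigma = float(fields[3]), float(fields[4])
    # Assumption: a sixth column, when present, is the Free_R flag.
    free_flag = int(float(fields[5])) if len(fields) > 5 else None
    return Reflection(h, k, l, value, sigma, free_flag)


if __name__ == "__main__":
    print(parse_reflection_line("  1   0   2   345.60   12.30   1"))
```

    A record with fewer than five columns raises an error, which mirrors the "at minimum 5 columns" rule above.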

    An automatic conversion converts one standardized SF format to another. If your file follows one of the accepted formats (column labels, etc.) and you have made no manual changes to it, this method will work for you. The tool recognizes which columns correspond to each parameter (depending on the program's format) and converts the data to the new format accordingly.

    However, if you used a novel program whose SF file format does not match the defined criteria, or you have manually changed the column labels, then your file format will not be recognized by sf_convert. In this case, a semi-automatic conversion is necessary, which requires some additional input from the user. A table will be displayed in which you must match the appropriate SF parameters with your column labels. The tool will then perform a complete DATA TYPE conversion from CNS or MTZ format to the mmCIF format, rather than converting only the essential data as in the automatic conversion.
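
    The matching step can be pictured as a lookup table from column labels to mmCIF items. The Python sketch below is only an illustration under stated assumptions: the labels FP, SIGFP, and FREE are example MTZ-style labels, the COLUMN_MAP table and the to_mmcif_loop helper are hypothetical, and the _refln item names should be checked against the PDBx/mmCIF dictionary before use.

```python
# Hypothetical label-to-item table; the labels on the left are examples only.
COLUMN_MAP = {
    "H": "_refln.index_h",
    "K": "_refln.index_k",
    "L": "_refln.index_l",
    "FP": "_refln.F_meas_au",
    "SIGFP": "_refln.F_meas_sigma_au",
    "FREE": "_refln.pdbx_r_free_flag",
}


def to_mmcif_loop(file_labels, rows, column_map):
    """Write rows as a minimal mmCIF loop_ using the user-supplied matching."""
    missing = [label for label in file_labels if label not in column_map]
    if missing:
        # In the web form, these are the labels the user still has to match.
        raise KeyError(f"no mmCIF item chosen for column(s): {missing}")
    lines = ["loop_"]
    lines += [column_map[label] for label in file_labels]
    lines += [" ".join(str(v) for v in row) for row in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    labels = ["H", "K", "L", "FP", "SIGFP"]
    print(to_mmcif_loop(labels, [(1, 0, 2, 345.6, 12.3)], COLUMN_MAP))
```

    An unrecognized label stops the conversion with an error, which corresponds to the case where the user must supply the match by hand.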

    Free_R flag assignment

    During conversion, you can also set aside 5, 8, or 10% of the reflection data for cross-validation (Free_R factor). This can be either a new Free_R selection or a repopulation of the existing reflection list. To do this, enter the percentage as a whole number in the box provided. Please note that changing or adding the free set is ONLY for refinement. If you are converting the format for deposition only, leave the current Free_R flags untouched and leave this box empty.
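
    To make the idea of setting aside a percentage of reflections concrete, here is a minimal Python sketch of a random selection. It is not the algorithm used by sf_convert: the function name assign_free_flags, the fixed seed, and the convention that 1 marks the free set are assumptions for this example (flag conventions differ between programs).

```python
import random


def assign_free_flags(n_reflections, percent, seed=0):
    """Return one flag per reflection: 1 = free (test) set, 0 = working set.

    Hypothetical helper; the 1/0 convention is an assumption of this sketch.
    """
    if not 0 < percent < 100:
        raise ValueError("percent must be between 0 and 100")
    n_free = round(n_reflections * percent / 100)
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    free_indices = set(rng.sample(range(n_reflections), n_free))
    return [1 if i in free_indices else 0 for i in range(n_reflections)]


if __name__ == "__main__":
    flags = assign_free_flags(1000, 5)
    print(sum(flags), "of", len(flags), "reflections flagged as free")
```

    Because a new selection replaces the test set used in refinement, the cross-validation statistics of an already refined model would no longer be meaningful, which is why the flags should be left alone for deposition.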

    Check Model against Structure Factors
    1. Input both the structure factor file and the coordinate file.
    2. Select the desired program. (For neutron or joint X-ray/neutron diffraction data, select only phenix.)
    3. Click the RUN button at the bottom of the page.

    Tips for program selection

    Using refmac5

    We use zero-cycle refinement with restrained/unrestrained parameters. If your model has partial B factors, the full B factors will be generated using the TLSANL program. If the calculated R factor is more than 2 percent higher than the reported R factor, other options will be applied in a stepwise way to get the best match between the reported and the calculated R factors. A twinning test is included in the default options. The real-space R factor and density correlations are calculated using the MAPMAN program, based on the calculated electron density maps. The maps around the ligands are displayed by the Jmol program.
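
    The comparison between calculated and reported R factors can be sketched as follows. This Python example uses the conventional definition R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), assumes the two amplitude lists are already on the same scale, and reads "2 percent" as 0.02 in absolute R-factor units. The function names and the tolerance parameter are hypothetical; the actual refmac and DCC procedures are more involved.

```python
def r_factor(f_obs, f_calc):
    """Conventional R factor for amplitudes that are already on a common scale."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def needs_further_options(r_calc, r_reported, tolerance=0.02):
    """True when the calculated R exceeds the reported R by more than tolerance.

    Assumption: '2 percent' is interpreted as 0.02 in absolute R units.
    """
    return (r_calc - r_reported) > tolerance


if __name__ == "__main__":
    r = r_factor([100.0, 200.0], [90.0, 210.0])
    print(f"R = {r:.4f}, retry needed: {needs_further_options(r, 0.05)}")
```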

    Using phenix (model_vs_data)

    model_vs_data is a fully automatic program. If your structure was refined with phenix.refine, it is recommended to select model_vs_data for validation. The computation may sometimes be slow. This program is not recommended for viruses with NCS matrices or for large ribosomes. No maps are generated with this option.

    Using sfcheck

    Sfcheck (v7.04.2) is a tool included in the CCP4 package. It assesses the refined atomic coordinates (protein, nucleic acid, and ligand) against your structure factors without the need for ligand chemical information libraries. If your structure factors are twinned, use the twinned structure validation tool found at the bottom of the SF-Tool page instead. Detailed information about this program can be found at the Official sfcheck Documentation site.

    Results Summary

    The SF-TOOL will return the following report:

    • a printout of the structure factors file header
    • a printout of the SF file to mmCIF format conversion
    • the converted SF file in mmCIF format to download (optional - another mmCIF file will be created during PDB_extract)
    • a Results Summary Chart which reports the resolution range, completeness, number of reflections, R factors (R-work and R-free), and Correlation Factor
    • the full sfcheck log file and full report for you to examine