About This Example:

This example is one of several that show how the PDBX library can be used together with Chimera: features of a molecule obtained by parsing a CIF file, such as particular connections between particular atoms, are located and made the subject of a Chimera rendering or animation. This example finds connections of a given type between entities of given types. It retrieves and iterates over the struct_conn category, which lists the connections in a molecule, and uses the struct_asym and entity categories to determine the entity type of each partner in a connection. Here, polymer-polymer covalent bonds are located for Chimera to emphasize and animate. The example is easy to extend, for instance to handle several connection types and entity pairings, each displayed in a different color in Chimera, or to restrict the search to particular atoms and entities.
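
One way the color-per-pairing extension mentioned above might look is a lookup table keyed by connection type and entity-type pair. This is only a sketch: the table COLOR_BY_PAIRING, the function color_for, and the chosen colors are hypothetical and are not part of the PDBX library or of Connections2.py.

```python
# Hypothetical table: (conn_type_id, sorted entity-type pair) -> Chimera color name.
# The pairings and colors are illustrative choices only.
COLOR_BY_PAIRING = {
    ("covale", ("polymer", "polymer")): "yellow",
    ("disulf", ("polymer", "polymer")): "orange",
    ("metalc", ("non-polymer", "polymer")): "cyan",
}

def color_for(conn_type, entity_type_1, entity_type_2):
    """Return the color for a connection, or None if it is not of interest."""
    # Sort the pair so that partner order in struct_conn does not matter
    pair = tuple(sorted((entity_type_1, entity_type_2)))
    return COLOR_BY_PAIRING.get((conn_type, pair))
```

A connection whose lookup returns None would simply be skipped; otherwise the returned name could replace the fixed coloring step when the Chimera commands are written.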

Build Instructions:

Files: Connections2.py, 5HVP.cif, Connections2.sh

Save Connections2.py and the CIF data file. Run python Connections2.py /path/to/5HVP.cif, which generates /path/to/5HVP.com; open that file with chimera /path/to/5HVP.com. Alternatively, save Connections2.sh alongside Connections2.py, set the Chimera path, and run ./Connections2.sh /path/to/5HVP.cif, which automates the process.

Methods To Note

from pdbx.reader.PdbxContainers import ContainerBase
  • getObj(self, name) Returns the DataCategory object specified by name.
from pdbx.reader.PdbxContainers import DataCategory
  • getRowCount(self) Returns the number of rows in the category table.
  • getValue(self, attributeName=None, rowIndex=None) Returns the value of the attribute attributeName at row index rowIndex.
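
The row-by-row access pattern these methods support can be sketched as follows. To keep the sketch runnable without the PDBX library or a CIF file, it uses a hypothetical stand-in class, FakeCategory, that mimics only getRowCount and getValue; in the real example the category would come from getObj on a data block. The helper column_values is also hypothetical.

```python
def column_values(category, attribute_name):
    """Collect one attribute's value from every row of a category table."""
    return [category.getValue(attribute_name, row)
            for row in range(category.getRowCount())]

class FakeCategory(object):
    """Stand-in for demonstration only; mimics two DataCategory methods."""
    def __init__(self, rows):
        self._rows = rows  # list of dicts, one per table row

    def getRowCount(self):
        return len(self._rows)

    def getValue(self, attributeName=None, rowIndex=None):
        return self._rows[rowIndex][attributeName]

# Illustrative rows resembling an entity category
entity = FakeCategory([{"id": "1", "type": "polymer"},
                       {"id": "2", "type": "non-polymer"},
                       {"id": "3", "type": "water"}])
types = column_values(entity, "type")
```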

Covalent Polymer-Polymer Linkages for 5HVP.cif

Example Source Code

"""
 Connections2.py

 For some CIF file, generate a Chimera command (COM) file
 to iterate through and emphasize connections of specific connection types
 and involving specific types of entities. As an example, we will
 look for polymer-polymer covalent linkages. 

 Method: For connections of interest, determine each partner atom's entity
 type by indexing into the struct_asym category table with the atom's asym_id to 
 determine its entity ID. Then, index into the entity category table to 
 determine its entity type.

 Trailing numbers on comment lines refer to the numbered entries under NOTES AND REFERENCES.
"""

from os.path import splitext
from pdbx.reader.PdbxReader import PdbxReader
from pdbx.reader.PdbxContainers import *
from sys import argv, exit

def prepareOutFile(file, name) :
    file.write("windowsize 500 500\n") # Set the window size to 500 x 500 px
    file.write("open %s\n" % name) # Open the CIF file
    file.write("preset apply pub 4\n") # Apply publication preset #4
    file.write("color white\n") # Color the entire molecule white
    file.write("set bg_color gray\n") # Color the background gray
    file.write("repr bs\n") # Represent the atoms in ball-and-stick format
    file.write("savepos fullview\n") # Remember this position (the full view of the molecule)

def writeConnection(selection, file) :
    sel = " | ".join(selection) 
    file.write("sel %s\n" % sel) # Select the two partner atoms 
    file.write("color byelement sel\n") # Color them by element
    file.write("sel sel za<3.0; wait 20\n") # Further select all atoms within 3.0 angstroms of the partner atoms
    file.write("focus sel; wait 25; ~disp ~sel\n") # Focus in on the selection and hide all non-selected atoms
    file.write("turn y 5.25 68; wait 68\n") # Perform a basic y-axis turning animation
    file.write("disp ~sel; reset fullview 20\n") # Return to the full molecule view
    file.write("color white sel; ~sel; wait 20\n") # Uncolor and drop the selection

# Check for improper usage
if len(argv) != 2 :
    exit("Usage: python Connections2.py /path/to/file.cif")

# Open the CIF file
cif = open(argv[1])

# Create a list to store data blocks
data = []

# Create a PdbxReader object
pRd = PdbxReader(cif)

# Read the CIF file, propagating the data list
pRd.read(data)

# Close the CIF file, as it is no longer needed
cif.close()

# Retrieve the first data block
block = data[0]

# Get the struct_conn category table, which delineates connections [1]
struct_conn = block.getObj("struct_conn")

# Get the struct_asym category table, which details structural elements in the asymmetric unit [2]
struct_asym = block.getObj("struct_asym")

# Get the entity category table, which details the molecular entities present in the crystallographic structure [3]
entity = block.getObj("entity")

# Use the CIF file pathname to generate the Chimera command file (.COM) pathname
(file, ext) = splitext(argv[1])
comFileName = file + ".com"

# Open the COM command file for writing
comFile = open(comFileName, 'w')

# Write out some basic Chimera initialization commands
prepareOutFile(comFile, argv[1])

# Iterate over the rows in struct_conn, where each row delineates an interatomic connection
for index in range(struct_conn.getRowCount()) :

    # A container to hold each partner atom's entity type
    entities = []

    # A container to hold each partner atom's Chimera selection string
    selection = []

    # Verify that the connection is covalent [4]
    if struct_conn.getValue("conn_type_id", index) == "covale" :

        # Analyze the current row twice, once per partner
        for partner in ["ptnr1_", "ptnr2_"] :
    
            # Retrieve the partner atom's label asym_id, which is the identifier that
            # struct_asym.id uses and with which we will index into struct_asym
            asym_id = struct_conn.getValue(partner + "label_asym_id", index)

            # Add this atom's Chimera selection string to the container [5,6]
            selection.append(":%s,%s.%s@%s.%s" % (struct_conn.getValue(partner + "auth_seq_id", index),
                                      struct_conn.getValue(partner + "auth_comp_id", index),
                                      struct_conn.getValue(partner + "auth_asym_id", index),
                                      struct_conn.getValue(partner + "label_atom_id", index),
                                      struct_conn.getValue("pdbx_" + partner + "label_alt_id", index)))
        
            # Holds the atom's entity ID, which we will find in the struct_asym category table
            entityID = 0
 
            # Find the atom's entity ID in the struct_asym category table
            for i in range(struct_asym.getRowCount()) :
                if struct_asym.getValue("id", i) == asym_id :
                    entityID = int(struct_asym.getValue("entity_id", i))
                    break

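            # Retrieve and store the atom's entity type (this assumes the entity table
            # lists IDs 1..N in order, so that row entityID - 1 describes entity entityID)
            entities.append(entity.getValue("type", entityID - 1))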
           
        # Write satisfactory entity pairings to the COM file (in this case polymer-polymer)
        if entities[0] == "polymer" and entities[1] == "polymer" :        
            writeConnection(selection, comFile)

# Write out the Chimera close command
comFile.write("stop\n")

# Close the COM file as all connections have been processed
comFile.close()
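
The linear scan of struct_asym and the row arithmetic on entity in the script above could instead be done with dictionary lookups, which do not depend on row order. A minimal sketch, assuming the (asym ID, entity ID) and (entity ID, type) pairs have already been read out of the two categories with getValue; the function name entity_type_by_asym and the sample rows are hypothetical.

```python
def entity_type_by_asym(asym_rows, entity_rows):
    """Map each asym ID to its entity type.

    asym_rows:   iterable of (asym_id, entity_id) pairs from struct_asym
    entity_rows: iterable of (entity_id, type) pairs from entity
    """
    type_by_entity = dict(entity_rows)
    # Asym IDs whose entity ID is unknown map to None
    return dict((asym_id, type_by_entity.get(entity_id))
                for asym_id, entity_id in asym_rows)

# Illustrative values only; with the PDBX library each pair would be built
# from getValue calls over range(getRowCount()) on the category.
asym_rows = [("A", "1"), ("B", "1"), ("C", "2"), ("D", "3")]
entity_rows = [("1", "polymer"), ("2", "non-polymer"), ("3", "water")]
lookup = entity_type_by_asym(asym_rows, entity_rows)
```

Building the table once before the struct_conn loop would replace the inner loop with a single lookup per partner atom.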

NOTES AND REFERENCES

  1. http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Categories/struct_conn.html
  2. http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Categories/struct_asym.html
  3. http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Categories/entity.html
  4. For an enumeration of the connection types and their descriptions, see: http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_conn_type.id.html
  5. Note that for brevity we are assuming that author-provided values, which are non-mandatory but commonly present, exist for three of these attributes (viz., asym_id, comp_id, seq_id), and that the alt_id, also non-mandatory, is both present and necessary to identify each partner atom. In a more extensive program, these are easily accounted for with hasAttribute(self, attributeName), which returns a bool indicating the presence or absence of some attribute specified by attributeName. Note also that while some columns may be present, their values may be "?", which indicates a missing data item value, or ".", which indicates that there is no appropriate value for that data item or that it has been intentionally omitted.
  6. The form used here is :seq_id,comp_id.asym_id@atom_id.alt_id, based on the Chimera Atom Specification reference found here: http://www.cgl.ucsf.edu/chimera/1.2309/docs/UsersGuide/midas/frameatom_spec.html
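
Notes 5 and 6 can be combined in a small helper that builds the selection string in the form given in note 6 but leaves off the alt_id when its value is missing ("?") or not applicable ("."). This is a sketch only: the names MISSING and atom_spec are hypothetical, and the values passed in would come from getValue, after hasAttribute has confirmed that the column exists.

```python
# CIF placeholders for missing ("?") and inapplicable (".") values,
# plus None and "" for a column that is absent altogether
MISSING = ("?", ".", None, "")

def atom_spec(seq_id, comp_id, asym_id, atom_id, alt_id=None):
    """Build a :seq_id,comp_id.asym_id@atom_id[.alt_id] selection string."""
    spec = ":%s,%s.%s@%s" % (seq_id, comp_id, asym_id, atom_id)
    # Append the alternate location only when it carries a real value
    if alt_id not in MISSING:
        spec += ".%s" % alt_id
    return spec

with_alt = atom_spec("25", "ASP", "A", "CG", "B")
without_alt = atom_spec("25", "ASP", "A", "CG", "?")
```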