Archive for Software

How to create PyQuante forcefield for MMTK. Part 1.

How to create PyQuante forcefield for MMTK. Part 1.

Introduction: Why and What

One of the constant activities of Computational Chemistry Group, Kiev is an optimization and a redesign of our computational toolchain.

Finally, we’ve came to the point where no one of existing, nor our own quantum chemistry computational packages is not sufficient. Say, I would like to set a random distribution of bunch of some molecules in space, run molecular dynamics simulation, cooling them, then perform an energy minimization and finally, compute their IR spectra. And then repeat this 50 times for different molecules and setups.

One package allows nice molecular dynamics, but, alas only for biomolecular systems. Another is fine regarding HF and DFT calculations, but what about dynamics? In our semi-empirical code we routinely calculate systems in an external electric field or under mechanical stress, but tool integration (e.g. visualization) stays problematic.

Being (a) responsible for the software redesign activity (b) chemist by education and (c) programmer by nature, I’ve decided – it’s time to automate all this stuff. Since I love high-level scripting and I love Python, the roadmap was clear to me: find existing python modules that encapsulate things like force field calculations, dynamics or visualization and glue them together.

Thanks to open source we have a bunch of helpful projects:

  • PyQuante that provides fine-grained python interface to the heart of hartree-fock and density functional calculations

  • MMTK for daily molecular modelling activities

  • PyMol and VMD for molecular visualization and construction

  • lots of open source (and, of course, our own) ab-initio and semi-empiric HF and DFT codes

So, given the goal and the starting point, my first step is to create new MMTK force field using PyQuante.

Technical note. I’m using ALT Linux distribution that’s rpm- and apt-based. It’s central package repository is Sisyphus (similar to Debian unstable)

Preparing

First, I’ve started by building stable versions of required packages:

  • python-module-PyQuante 1.5.0

  • python-module-Scientific 2.4.11 (MMTK helper module – )

  • python-module-MMTK 2.4.6

The process was pretty straighforward, despite I’ve found that the Numeric python module that’s present in my distribution is outdated and has bug in LinearAlgrebra module. Result: the bug report.

MMTK forcefield

Finally, I’ve get my hands onto fresh installed python modules. Investigating the internals of MMTK energy calculations…

MMTK vocabulary uses Universe to define the system under study, the ForceField to set the energy calculation rules and usual Atom and Molecule to set the molecular structure.

I’ve called my new ForceField RhfForceField and set up simple test script:

   1 from MMTK import *

   2 from MMTK.ForceFields.ForceField import ForceField
   3 

   4 class RhfForceField(ForceField):
   5     def __init__(self): pass

   6     pass
   7 
   8 universe = InfiniteUniverse(RhfForceField())

   9 universe.mol = Molecule('water')
  10 
  11 print universe.energy()

Sure it fails. I’ve not defined anything!

  File "t2.py", line 11, in ?
    print universe.energy()
  File "/usr/lib/python2.4/site-packages/MMTK/Universe.py", line 629, in energy
    eval = self.energyEvaluator(subset1, subset2)
  File "/usr/lib/python2.4/site-packages/MMTK/Universe.py", line 615, in energyEvaluator
    threads, mpi_communicator)
  File "/usr/lib/python2.4/site-packages/MMTK/ForceFields/ForceField.py", line 183, in __init__
    self.global_data)
  File "/usr/lib/python2.4/site-packages/MMTK/ForceFields/ForceField.py", line 40, in evaluatorTerms
    raise AttributeError
AttributeError

Short peek into MMTK.ForceFields.ForceFileds reveals the evaluatorTerms method. But when I define it to return nothing

   1 class RhfForceField(ForceField):
   2     def __init__(self): pass

   3     def evaluatorTerms(self, universe, subset1, subset2, global_data):

   4         return
   5     pass

my python coredumps.

At this point (any really clever developer would do this from the very beginning), I read the examples, supplied with MMTK. There is simple harmonic oscillator force field defined (Examples/Forcefield/). Alas, it uses C-coded energy evaluation routine. Not the way, I’d like to venture now.

Several hours of reading C and Python code resulted in various attempts, including the attempt to use undocumented MMTK_forcefield.PythonTerm class, but of no use – still coredump.

The problem was easily solved (the power of open source!) by single e-mail to the mmtk mailing list. The answer is simple: to use python-defined force fields, one must use the development (2.5.x) version of MMTK. Actually, as Konrad says, the only reason it wasn’t declared ’stable’ is the deficiency in his documentation toolkit: he can’t generate html, xml and pdf docs from old documentation base.

So, once again I’ve switched to rpm packaging, including packaging of python-module-cElementTree, required by 2.5.13 version of MMTK.

This version, among other, includes couple new ForceField examples, each implemented in C, Pyrex and Python.

Now, the implementation of the restricted Hartree-Fock force field seems straighforward:

   1 from MMTK import *

   2 from MMTK.ForceFields.ForceField import ForceField, EnergyTerm

   3 import PyQuante.Molecule
   4 from PyQuante import hartree_fock

   5 
   6 sym_to_nucl_map = {
   7     'H' : 1,

   8     'C' : 6,
   9     'O' : 8,

  10     }
  11 
  12 class HfFieldTerm(EnergyTerm):

  13     def __init__(self, universe):
  14         EnergyTerm.__init__(self, 'Hatree Fock', universe)

  15         return
  16 
  17     def evaluate(self, configuration, do_gradients, do_force_constants):

  18         atomlist = []
  19         for atom in self.universe.atomList(): atomlist.append([

  20             sym_to_nucl_map[atom.type.symbol],
  21             atom.position()])

  22 
  23         mol = PyQuante.Molecule.Molecule('molecule', atomlist,

  24                                          units = 'Angstrom',
  25                                          charge = 0,

  26                                          multiplicity = 1)
  27         en,orbe,self.orbs = hartree_fock.hf(mol, verbose=1)

  28         results = { 'energy' : en }

  29 
  30         if do_gradients: assert 'Not implemented'
  31         if do_force_constants: assert 'Not implemented'

  32 
  33         return results
  34     pass
  35 

  36 
  37 class HfForceField(ForceField):
  38     def __init__(self): ForceField('Hartree Fock')

  39 
  40     def evaluatorTerms(self, universe, subset1, subset2, global_data):

  41         # We call this for its side effect: it makes sure that all
  42         # atoms have a correct index assigned to them.
  43         universe.configuration()

  44         return [HfFieldTerm(universe)]
  45     pass

  46 
  47 universe = InfiniteUniverse(HfForceField())
  48 universe.mol = Molecule('water')

  49 
  50 print universe.energy()

This reveals the lack of one important property in MMTK Atom database – the charge of nuclea. That’s why I had to create the map between the IUPAC symbol and the charge manually.

The result of the code run is

Nbf =  25
Nclosed =  5
Optimization of HF orbitals
0 -7.72644873092
1 -9.46445551337
2 -17.2283843599
3 -19.050454015
4 -19.8624953604
5 -20.0540489183
6 -20.1023997138
7 -20.1133127164
8 -20.115849504
9 -20.1164274514
10 -20.116560223
11 -20.1165905954
-20.1165905954

Wow, it calculates something!

Spot the difference: nanometers vs angstroms

Yes, the code really does calculations. But is it correct?

Quick run of one of the PyQuante examples shows that it is not:

>>> from PyQuante.Molecule import Molecule
>>> from PyQuante.hartree_fock import *
>>> h2o = Molecule('H2O',
...                atomlist = [(8,(0,0,0)),
...                            (1,(0.959,0,0)),
...                            (1,(-.230,0.930,0))],
...                units = 'Angstrom')
>>> en, orbe, orbs = hf(h2o, verbose=True)
Nbf =  25
Nclosed =  5
Optimization of HF orbitals
0 -68.2710764842
1 -71.4107153837
2 -73.8915175997
3 -75.0373719156
4 -75.6459826198
5 -75.8850728221
6 -75.9751560983
7 -76.0064588121
8 -76.0173702077
9 -76.0210974045
10 -76.0223774136
11 -76.0228145355
12 -76.0229641837
13 -76.023015328
>>> en
-76.023015327968622

What’s wrong?

Oops… Looks like we’ve missed the measurement units completely: MMTK uses SI units, while PyQuante expects them in Angstroms.

The quick fix

   1     def evaluate(self, configuration, do_gradients, do_force_constants):

   2         atomlist = []
   3         for atom in self.universe.atomList(): atomlist.append([

   4             sym_to_nucl_map[atom.type.symbol],
   5             atom.position()/Units.Ang])

   6 
   7         mol = PyQuante.Molecule.Molecule('molecule', atomlist,

   8                                          units = 'Angstrom',
   9                                          charge = 0,

  10                                          multiplicity = 1)
  11         en,orbe,self.orbs = hartree_fock.hf(mol, verbose=1)

  12         results = { 'energy' : en }

  13 
  14         if do_gradients: assert 'Not implemented'
  15         if do_force_constants: assert 'Not implemented'

  16 
  17         return results

results in something consistent with previous calculation and more meaningfull:

Nbf =  25
Nclosed =  5
Optimization of HF orbitals
0 -68.2698558264
1 -71.4163874054
2 -73.8989849529
3 -75.0479917788
4 -75.6520823142
5 -75.8880296116
6 -75.9764512423
7 -76.0070491726
8 -76.0176718892
9 -76.0212871677
10 -76.0225241199
11 -76.0229450137
12 -76.0230885803
13 -76.0231374689
-76.0231374689

In the next article

  • System energy to the Heat of formation

  • Reuse orbitals from previous calculation

  • Polishing

  • Geometry optimization attempt

  • Further development

Comments (3)