
:mod:`gromacs.qsub` -- utilities for batch submission systems
=============================================================

This module helps with writing submission scripts for various batch
queuing systems. The known systems are stored as
:class:`~gromacs.qsub.QueuingSystem` instances in
:data:`~gromacs.qsub.queuing_systems`; append new ones to this list.

The working paradigm is that template scripts are provided (see
:data:`gromacs.config.templates`) and only a few place holders are substituted
(using :func:`gromacs.cbook.edit_txt`).

*User-supplied template scripts* can be stored in
:data:`gromacs.config.qscriptdir` (by default ``~/.gromacswrapper/qscripts``)
and they will be picked up before the package-supplied ones.

The :class:`~gromacs.qsub.Manager` handles setup and control of jobs
in a queuing system on a remote system via :program:`ssh`.

At the moment, some of the functions in :mod:`gromacs.setup` use this module
but it is fairly independent and could conceivably be used for a wider range of
projects.


Queuing system templates
------------------------

The queuing system scripts are highly specific and you will need to add
your own.  Templates should be shell scripts. Some parts of the
templates are modified by the
:func:`~gromacs.qsub.generate_submit_scripts` function. The "place
holders" that can be replaced are shown in the table below. Typically,
the place holders are either shell variable assignments or batch
submission system commands. The table shows SGE_ commands, but PBS_ and
LoadLeveler_ have similar constructs; for example, PBS commands start with
``#PBS`` and LoadLeveler uses ``#@`` with its own command keywords.

.. Table:: Substitutions in queuing system templates.

   ===============  ===========  ================ ================= =====================================
   place holder     default      replacement      description       regex
   ===============  ===========  ================ ================= =====================================
   #$ -N            GMX_MD       *sgename*        job name          /^#.*(-N|job_name)/
   #$ -l walltime=  00:20:00     *walltime*       max run time      /^#.*(-l walltime|wall_clock_limit)/
   #$ -A            BUDGET       *budget*         account           /^#.*(-A|account_no)/
   DEFFNM=          md           *deffnm*         default gmx name  /^DEFFNM=/
   WALL_HOURS=      0.33         *walltime* h     mdrun's -maxh     /^WALL_HOURS=/
   MDRUN_OPTS=      ""           *mdrun_opts*     more options      /^MDRUN_OPTS=/
   ===============  ===========  ================ ================= =====================================

Lines with place holders should not have any white space at the beginning. The
regular expression pattern ("regex") is used to find the lines for the
replacement and the literal default values ("default") are replaced. Not all
place holders have to occur in a template; for instance, if a queue has no run
time limitation then one would probably not include *walltime* and *WALL_HOURS*
place holders.
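To make the mechanics concrete, the following is a minimal sketch of the search-and-replace step in plain Python. The real work is done by :func:`gromacs.cbook.edit_txt`, whose API differs; the ``substitute`` helper here is purely illustrative.

```python
import re

# A minimal sketch of the search-and-replace paradigm described above.
# The real work is done by gromacs.cbook.edit_txt(), whose API differs;
# this substitute() helper is purely illustrative.
def substitute(template_text, regex, default, replacement):
    """On every line matching *regex*, replace the literal *default* string."""
    out = []
    for line in template_text.splitlines():
        if re.search(regex, line):
            line = line.replace(default, str(replacement))
        out.append(line)
    return "\n".join(out)

script = "#PBS -N GMX_MD\n#PBS -l walltime=00:20:00"
script = substitute(script, r"^#.*(-N|job_name)", "GMX_MD", "protein_run")
script = substitute(script, r"^#.*(-l walltime|wall_clock_limit)",
                    "00:20:00", "12:00:00")
print(script)
```

Note that the regex only *selects* the line; it is the literal default value on that line that gets replaced, which is why the defaults must be left untouched in a template.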

The line ``# JOB_ARRAY_PLACEHOLDER`` can be replaced by
:func:`~gromacs.qsub.generate_submit_array` to produce a "job array"
(also known as a "task array") script that runs a large number of
related simulations under the control of a single queuing system
job. The individual array tasks are run from separate subdirectories. At
the moment, only queuing system scripts that use the :program:`bash`
shell are supported for job arrays.
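For orientation, the text spliced in place of the placeholder might look roughly like the following bash fragment (shown PBS-style; the directory names are hypothetical and the exact string is produced by :func:`~gromacs.qsub.generate_submit_array`, so details will differ):

```shell
# Hypothetical sketch of an array section (PBS-style); the actual text
# spliced in by generate_submit_array may differ in detail.
TASK_ID=${PBS_ARRAY_INDEX:-1}          # task index from the queuing system (defaults to 1 here)
DIRECTORIES=(sim_A sim_B sim_C)        # one subdirectory per task (hypothetical names)
wd="${DIRECTORIES[$((TASK_ID - 1))]}"  # pick this task's working directory
mkdir -p "$wd" && cd "$wd" || exit 1   # each task runs inside its own subdirectory
echo "task $TASK_ID running in $wd"
```

Each array task receives a distinct index from the queuing system and uses it to select its own working directory before running the job commands from the template.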

A queuing system script *must* have the appropriate suffix to be properly
recognized, as shown in the table below.

.. Table:: Suffixes for queuing system templates. Pure shell scripts are only used to run locally.

   ==============================  ===========  ===========================
   Queuing system                  suffix       notes
   ==============================  ===========  ===========================
   Sun Gridengine                  .sge         Sun's `Sun Gridengine`_
   Portable Batch queuing system   .pbs         OpenPBS_ and `PBS Pro`_
   LoadLeveler                     .ll          IBM's `LoadLeveler`_
   bash script                     .bash, .sh   `Advanced bash scripting`_
   csh script                      .csh         avoid_ csh_
   ==============================  ===========  ===========================
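In essence, the suffix determines the queuing system. A rough sketch of what this detection amounts to is shown below; note that the real :func:`~gromacs.qsub.detect_queuing_system` returns a :class:`QueuingSystem` instance rather than a plain string, and the mapping here only mirrors the table above.

```python
import os

# Hypothetical sketch of suffix-based detection, mirroring the table above.
# The real detect_queuing_system() returns a QueuingSystem instance, not a
# plain string; this mapping is only for illustration.
SUFFIX_TO_SYSTEM = {
    ".sge": "Sun Gridengine",
    ".pbs": "PBS",
    ".ll": "LoadLeveler",
    ".bash": "bash",
    ".sh": "bash",
    ".csh": "csh",
}

def queuing_system_from_suffix(scriptfile):
    """Return the queuing system name for *scriptfile*, or None if unknown."""
    ext = os.path.splitext(scriptfile)[1].lower()
    return SUFFIX_TO_SYSTEM.get(ext)

print(queuing_system_from_suffix("supercomputer.somewhere.fr_64core.pbs"))
```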

.. _OpenPBS: http://www.mcs.anl.gov/research/projects/openpbs/
.. _PBS: OpenPBS_
.. _PBS Pro: http://www.pbsworks.com/Product.aspx?id=1
.. _Sun Gridengine: http://gridengine.sunsource.net/
.. _SGE: `Sun Gridengine`_
.. _LoadLeveler: http://www-03.ibm.com/systems/software/loadleveler/index.html
.. _Advanced bash scripting: http://tldp.org/LDP/abs/html/
.. _avoid: http://www.grymoire.com/Unix/CshTop10.txt
.. _csh: http://www.faqs.org/faqs/unix-faq/shell/csh-whynot/


Example queuing system script template for PBS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following script is a usable PBS_ script for a supercomputer. It
contains almost all of the replacement tokens listed in the table above
(indicated by ``++++++``; these values must be kept in the template as
they are, or they will not be found and replaced). ::

   #!/bin/bash
   # File name: ~/.gromacswrapper/qscripts/supercomputer.somewhere.fr_64core.pbs
   #PBS -N GMX_MD
   #       ++++++
   #PBS -j oe
   #PBS -l select=8:ncpus=8:mpiprocs=8
   #PBS -l walltime=00:20:00
   #                ++++++++

   # host: supercomputer.somewhere.fr
   # queuing system: PBS

   # set this to the same value as walltime; mdrun will stop cleanly
   # at 0.99 * WALL_HOURS 
   WALL_HOURS=0.33
   #          ++++

   # deffnm line is possibly modified by gromacs.setup
   # (leave it as it is in the template)
   DEFFNM=md
   #      ++

   TPR=${DEFFNM}.tpr
   OUTPUT=${DEFFNM}.out
   PDB=${DEFFNM}.pdb

   MDRUN_OPTS=""
   #          ++

   # If you always want to add additional MDRUN options in this script then
   # you can either do this directly in the mdrun commandline below or by
   # constructs such as the following:
   ## MDRUN_OPTS="-npme 24 $MDRUN_OPTS"

   # JOB_ARRAY_PLACEHOLDER
   #++++++++++++++++++++++   leave the full commented line intact!

   # avoids some failures
   export MPI_GROUP_MAX=1024
   # use hard coded path for time being
   GMXBIN="/opt/software/SGI/gromacs/4.0.3/bin"
   MPIRUN=/usr/pbs/bin/mpiexec
   APPLICATION=$GMXBIN/mdrun_mpi
   
   $MPIRUN $APPLICATION -stepout 1000 -deffnm ${DEFFNM} -s ${TPR} -c ${PDB} -cpi \
                        $MDRUN_OPTS \
                        -maxh ${WALL_HOURS} > $OUTPUT
   rc=$?

   # dependent jobs will only start if rc == 0
   exit $rc

Save the above script in ``~/.gromacswrapper/qscripts`` under the name
``supercomputer.somewhere.fr_64core.pbs``. This will make the script
immediately usable. For example, in order to set up a production MD run with
:func:`gromacs.setup.MD` for this super computer one would use ::

   gromacs.setup.MD(..., qscripts=['supercomputer.somewhere.fr_64core.pbs', 'local.sh'])

This will generate submission scripts based on
``supercomputer.somewhere.fr_64core.pbs`` and also the default ``local.sh``
that is provided with *GromacsWrapper*.

In order to modify ``MDRUN_OPTS`` one would use the additional *mdrun_opts*
argument, for instance::

   gromacs.setup.MD(..., qscripts=['supercomputer.somewhere.fr_64core.pbs', 'local.sh'],
                    mdrun_opts="-v -npme 20 -dlb yes -nosum")


Currently there is no good way to specify the number of processors when
creating run scripts. You will need to provide scripts with different numbers
of cores hard coded, or set the core count at submission time with command
line options to :program:`qsub`.



Classes and functions
---------------------

.. autoclass:: QueuingSystem
   :members:
.. autofunction:: generate_submit_scripts
.. autofunction:: generate_submit_array
.. autofunction:: detect_queuing_system

.. autodata:: queuing_systems



Queuing system Manager
----------------------

The :class:`Manager` class must be customized for each system such as
a cluster or a super computer. It then allows submission and control of
jobs remotely (using ssh_).

.. autoclass:: Manager
   :members:
   :exclude-members: job_done, qstat

   .. autoattribute:: _hostname
   .. autoattribute:: _scratchdir
   .. autoattribute:: _qscript
   .. autoattribute:: _walltime
   .. method:: job_done

               alias for :meth:`get_status`

   .. method:: qstat

               alias for :meth:`get_status`


.. _ssh: http://www.openssh.com/
.. _~/.ssh/config: http://linux.die.net/man/5/ssh_config

Function details
----------------

``generate_submit_scripts(templates, prefix=None, deffnm='md', jobname='MD', budget=None, mdrun_opts=None, walltime=1.0, jobarray_string=None, **kwargs)``

Write scripts for queuing systems.

This sets up queuing system run scripts with a simple search and replace in
templates. See :func:`gromacs.cbook.edit_txt` for details. Shell scripts
are made executable.

:Arguments:
  *templates*
      Template file or list of template files. The "files" can also be names
      or symbolic names for templates in the templates directory. See
      :mod:`gromacs.config` for details and rules for writing templates.
  *prefix*
      Prefix for the final run script filename; by default the filename will be
      the same as the template. [None]
  *dirname*
      Directory in which to place the submit scripts. [.]
  *deffnm*
      Default filename prefix for :program:`mdrun` ``-deffnm`` [md]
  *jobname*
      Name of the job in the queuing system. [MD]
  *budget*
      Which budget to book the runtime on [None]
  *mdrun_opts*
      String of additional options for :program:`mdrun`.
  *walltime*
      Maximum runtime of the job in hours. [1]
  *jobarray_string*
      Multi-line string that is spliced in for job array functionality
      (see :func:`gromacs.qsub.generate_submit_array`; do not use manually)
  *kwargs*
      all other kwargs are ignored

:Returns: list of generated run scripts

``generate_submit_array(templates, directories, **kwargs)``

Generate an array job.

For each ``work_dir`` in *directories*, the array job will

1. cd into ``work_dir``
2. run the job as detailed in the template

It will use all the queuing system directives found in the template. If more
complicated set-ups are required, then this function cannot be used.

:Arguments:
   *templates*
      Basic template for a single job; the job array logic is spliced into
      the position of the line ::

         # JOB_ARRAY_PLACEHOLDER

      The appropriate commands for common queuing systems (Sun Gridengine, PBS)
      are hard coded here. The queuing system is detected from the suffix of
      the template.
   *directories*
      List of directories under *dirname*. One task is set up for each 
      directory.
   *dirname*
      The array script will be placed in this directory. The *directories*
      **must** be located under *dirname*.
   *kwargs*
      See :func:`gromacs.setup.generate_submit_script` for details.


Variables details
-----------------

``queuing_systems``

Pre-defined queuing systems (SGE, PBS). Add your own here.

Value::

   [<Sun Gridengine QueuingSystem instance>,
    <PBS QueuingSystem instance>,
    <LoadLeveler QueuingSystem instance>]