Package gromacs :: Module qsub :: Class Manager

Class Manager

object --+
         |
        Manager

Base class to launch simulations remotely on computers with queuing systems.

Basically, ssh into machine and run job.

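Derive a class from :class:`Manager` and override the attributes

 - :attr:`Manager._hostname` (hostname of the machine)
 - :attr:`Manager._scratchdir` (scratch directory on the remote host)
 - :attr:`Manager._qscript` (template submission script for the queuing system)
 - :attr:`Manager._walltime` (maximum run time of a job in hours, if there is a limit)

and implement a specialized :meth:`Manager.qsub` method if needed.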

ssh_ must be set up (via `~/.ssh/config`_) to allow access via a command line such as:

   ssh <hostname> <command> ...

Typically you want something such as:

  host <hostname>
       hostname <hostname>.fqdn.org
       user     <remote_user>

in ``~/.ssh/config`` and also set up public-key authentication in order to avoid typing your password all the time.
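
A minimal sketch of such a derived class follows. The ``Manager`` base class defined here is only a stand-in so that the example runs without the package installed; in real use it would be imported from the ``gromacs.qsub`` module named at the top of this page. ``ExampleCluster`` and every value assigned in it are hypothetical placeholders.

```python
class Manager(object):
    """Stand-in for gromacs.qsub.Manager (assumption: real code imports it)."""
    _hostname = None
    _scratchdir = None
    _qscript = None
    _walltime = None


class ExampleCluster(Manager):
    """Hypothetical machine; all values below are placeholders."""
    # Should match a "host" entry in ~/.ssh/config.
    _hostname = "examplecluster"
    # Scratch directory on the remote host.
    _scratchdir = "/scratch/username/Projects"
    # Hypothetical name of a queuing system script template.
    _qscript = "examplecluster.sge"
    # Maximum run time of one job in hours.
    _walltime = 24.0
```

Only the class attributes are overridden; :meth:`Manager.qsub` would be overridden as well if the machine needs a different submission command.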

Instance Methods

__init__(self, dirname='.', **kwargs)
Set up the manager.

_assertnotempty(self, value, name)
Simple sanity check.

remotepath(self, *args)
Directory on the remote machine.

get_dir(self, *args)
Directory on the remote machine.

remoteuri(self, *args)
URI of the directory on the remote machine.

put(self, dirname)
scp dirname to host.

putfile(self, filename, dirname)
scp *filename* to host in *dirname*.

get(self, dirname, checkfile=None, targetdir='.')
``scp -r`` *dirname* from host into *targetdir*.

local_get(self, dirname, checkfile, cattrajectories=True, cleanup=False)
Find *checkfile* locally if possible.

cat(self, dirname, prefix='md', cleanup=True)
Concatenate parts of a run in *dirname*.

qsub(self, dirname, **kwargs)
Submit job remotely on host.

get_status(self, dirname, logfilename='md*.log', silent=False)
Check status of remote job by looking into the logfile.

job_done(self, dirname, logfilename='md*.log', silent=False)
Check status of remote job by looking into the logfile.

qstat(self, dirname, logfilename='md*.log', silent=False)
Check status of remote job by looking into the logfile.

ndependent(self, runtime, performance=None, walltime=None)
Calculate how many dependent (chained) jobs are required.

waitfor(self, dirname, **kwargs)
Wait until the job associated with *dirname* is done.

setup_posres(self, **kwargs)
Set up position restraints run and transfer to host.

setup_MD(self, jobnumber, struct='MD_POSRES/md.pdb', **kwargs)
Set up production and transfer to host.

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Class Variables
  _hostname = None
hostname of the super computer (**required**)
  _scratchdir = None
scratch dir on hostname (**required**)
  _qscript = None
name of the template submission script appropriate for the queuing system on :attr:`Manager._hostname`; can be a path to a local file or a template stored in :data:`gromacs.config.qscriptdir` or a key for :data:`gromacs.config.templates` (**required**)
  _walltime = None
maximum run time of script in hours; the queuing system script :attr:`Manager._qscript` is supposed to stop :program:`mdrun` after 99% of this time via the ``-maxh`` option.
  log_RE = re.compile(r'(?x)Run\stime\sexceeded\s+(?P<exceeded>....
Regular expression used by :meth:`Manager.get_status` to parse the logfile from :program:`mdrun`.
Properties

Inherited from object: __class__

Method Details

__init__(self, dirname='.', **kwargs)
(Constructor)

Set up the manager.

:Arguments:
  *statedir*
      directory component under the remote scratch dir (should
      be different for different jobs)  [basename(CWD)]
  *prefix*
      identifier for job names [MD]

Overrides: object.__init__

put(self, dirname)

scp dirname to host.

:Arguments: dirname to be transferred
:Returns: return code from scp

putfile(self, filename, dirname)

scp *filename* to host in *dirname*.

:Arguments: *filename* to be transferred and the *dirname* to transfer it to
:Returns: return code from scp

get(self, dirname, checkfile=None, targetdir='.')

``scp -r`` *dirname* from host into *targetdir*

:Arguments:

  • *dirname*: dir to download
  • *checkfile*: raise OSError/ENOENT if *targetdir/dirname/checkfile* was not found
  • *targetdir*: put *dirname* into this directory

:Returns: return code from scp
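
The *checkfile* test described above can be sketched as follows. This is an illustration of the documented behavior (raise ``OSError`` with ``ENOENT`` if *targetdir/dirname/checkfile* is missing), not the actual implementation; the function name ``require_checkfile`` is hypothetical.

```python
import errno
import os


def require_checkfile(targetdir, dirname, checkfile):
    """Return the local path of checkfile or raise OSError with ENOENT."""
    path = os.path.join(targetdir, dirname, checkfile)
    if not os.path.exists(path):
        # Same error class and errno that the documentation names.
        raise OSError(errno.ENOENT, os.strerror(errno.ENOENT), path)
    return path
```

A caller can catch ``OSError`` and inspect ``err.errno`` to tell a missing file apart from other failures.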

local_get(self, dirname, checkfile, cattrajectories=True, cleanup=False)

Find *checkfile* locally if possible.

If *checkfile* is not found in *dirname* then it is transferred from the remote host.

If needed, the trajectories are concatenated using :meth:`Manager.cat`.

:Returns: local path of *checkfile*
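
The "look locally first, download only if missing" logic can be sketched like this. The names ``local_get_sketch`` and ``fetch`` are hypothetical; ``fetch`` stands in for the transfer from the remote host, and the concatenation step is left out.

```python
import os


def local_get_sketch(dirname, checkfile, fetch):
    """Return the local path of checkfile; call fetch(dirname) only if it is missing."""
    path = os.path.join(dirname, checkfile)
    if not os.path.exists(path):
        # Stand-in for the download from the remote host.
        fetch(dirname)
    return path
```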

cat(self, dirname, prefix='md', cleanup=True)

Concatenate parts of a run in *dirname*.

Always uses :func:`gromacs.cbook.cat` with *resolve_multi* = 'guess'.

.. Note:: The default is to immediately delete the original files
          (*cleanup* = ``True``).

:Keywords:
   *dirname*
      directory to work in
   *prefix*
      prefix (deffnm) of the files [md]
   *cleanup* : boolean
      if ``True``, remove all used files [``True``]

qsub(self, dirname, **kwargs)

Submit job remotely on host.

This is the most primitive implementation: it just runs the commands:

  cd remotedir && qsub qscript

on :attr:`Manager._hostname`. *remotedir* is *dirname* under :attr:`Manager._scratchdir` and *qscript* defaults to the queuing system script that was produced from the template :attr:`Manager._qscript`.
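
As an illustration, the command line described above could be assembled as in the following sketch. It only builds the argument list and does not run anything. The function name ``build_qsub_command`` and the example values are hypothetical, and the quoting with ``shlex.quote`` is an addition of this sketch, not something the documentation states.

```python
import posixpath
import shlex


def build_qsub_command(hostname, scratchdir, dirname, qscript):
    """Build the argument list for: ssh <hostname> 'cd remotedir && qsub qscript'."""
    # remotedir is dirname under the scratch directory on the remote host.
    remotedir = posixpath.join(scratchdir, dirname)
    remote_cmd = "cd {0} && qsub {1}".format(
        shlex.quote(remotedir), shlex.quote(qscript))
    return ["ssh", hostname, remote_cmd]
```

The resulting list could be handed to ``subprocess.call``; a derived class for a different queuing system would swap ``qsub`` for that system's submission command.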

get_status(self, dirname, logfilename='md*.log', silent=False)

Check status of remote job by looking into the logfile.

Reports on the status of the job and extracts the performance in ns/d if
available (which is saved in :attr:`Manager.performance`).

:Arguments:
  - *dirname*
  - *logfilename* can be a shell glob pattern [md*.log]
  - *silent* = True/False; True suppresses log.info messages

:Returns: ``True`` if the job is done, ``False`` if it is still running,
          ``None`` if no log file was found to look at

.. Note:: Also returns ``False`` if the connection failed.

.. Warning:: This is an important but somewhat **fragile** method. It
             needs to be improved to be more robust.

job_done(self, dirname, logfilename='md*.log', silent=False)

Check status of remote job by looking into the logfile.

Reports on the status of the job and extracts the performance in ns/d if
available (which is saved in :attr:`Manager.performance`).

:Arguments:
  - *dirname*
  - *logfilename* can be a shell glob pattern [md*.log]
  - *silent* = True/False; True suppresses log.info messages

:Returns: ``True`` if the job is done, ``False`` if it is still running,
          ``None`` if no log file was found to look at

.. Note:: Also returns ``False`` if the connection failed.

.. Warning:: This is an important but somewhat **fragile** method. It
             needs to be improved to be more robust.

qstat(self, dirname, logfilename='md*.log', silent=False)

Check status of remote job by looking into the logfile.

Reports on the status of the job and extracts the performance in ns/d if
available (which is saved in :attr:`Manager.performance`).

:Arguments:
  - *dirname*
  - *logfilename* can be a shell glob pattern [md*.log]
  - *silent* = True/False; True suppresses log.info messages

:Returns: ``True`` if the job is done, ``False`` if it is still running,
          ``None`` if no log file was found to look at

.. Note:: Also returns ``False`` if the connection failed.

.. Warning:: This is an important but somewhat **fragile** method. It
             needs to be improved to be more robust.

ndependent(self, runtime, performance=None, walltime=None)

Calculate how many dependent (chained) jobs are required.

Uses *performance* in ns/d (gathered from :meth:`get_status`) and job max
*walltime* (in hours) from the class unless provided as keywords.

   n = ceil(runtime/(performance*0.99*walltime/24))

The division by 24 converts *walltime* from hours to days so that it matches the ns/d unit of *performance*.

:Keywords:
   *runtime*
       length of run in ns
   *performance*
       ns/d with the given setup
   *walltime*
       maximum run length of the script (using 99% of it), in h

:Returns: *n*  or 1 if walltime is unlimited
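
The calculation can be sketched in a few lines. This is a standalone illustration under the unit assumptions stated in the keywords (ns, ns/d, hours), with the hours-to-days conversion made explicit; ``ndependent_sketch`` is a hypothetical name and not the method itself.

```python
import math


def ndependent_sketch(runtime, performance, walltime=None):
    """Number of chained jobs needed for runtime (ns) at performance (ns/d)."""
    if walltime is None or walltime == float("inf"):
        # No wall time limit: a single job is enough.
        return 1
    # ns produced by one job that uses 99% of walltime (hours -> days).
    ns_per_job = performance * 0.99 * walltime / 24.0
    return int(math.ceil(runtime / ns_per_job))
```

For example, a 100 ns run at 20 ns/d with a 24 h limit yields 19.8 ns per job, so six chained jobs are needed.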

waitfor(self, dirname, **kwargs)

Wait until the job associated with *dirname* is done.

Super-primitive: polls in a simple ``while`` loop and sleeps for *seconds* between checks.

:Arguments:
  *dirname*
      look for log files under the remote dir corresponding to *dirname*
  *seconds*
      delay in *seconds* during re-polling
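
A polling loop of this kind might look like the sketch below. The names ``wait_until_done`` and ``check_status`` are hypothetical, and the default of 120 seconds is an assumption chosen for the example, not a documented value. ``check_status`` stands in for a call such as :meth:`Manager.get_status`, so both ``False`` and ``None`` count as "not done yet".

```python
import time


def wait_until_done(check_status, seconds=120, sleep=time.sleep):
    """Poll check_status() until it returns True; return the number of polls."""
    polls = 1
    while not check_status():
        # Not done (False) or no log file yet (None): wait and try again.
        sleep(seconds)
        polls += 1
    return polls
```

Passing ``sleep`` as a parameter makes the loop testable without actually waiting.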

setup_posres(self, **kwargs)

Set up position restraints run and transfer to host.

*kwargs* are passed to :func:`gromacs.setup.MD_restrained`

setup_MD(self, jobnumber, struct='MD_POSRES/md.pdb', **kwargs)

Set up production and transfer to host.

:Arguments:

  • *jobnumber*: 1,2 ...
  • *struct* is the starting structure (default from POSRES run but that is just a guess);
  • kwargs are passed to :func:`gromacs.setup.MD`

Class Variable Details

_walltime

maximum run time of script in hours; the queuing system script :attr:`Manager._qscript` is supposed to stop :program:`mdrun` after 99% of this time via the ``-maxh`` option. A value of ``None`` or ``inf`` indicates no limit.

Value:
None

log_RE

Regular expression used by :meth:`Manager.get_status` to parse the logfile from :program:`mdrun`.

Value:
re.compile(r'(?x)Run\stime\sexceeded\s+(?P<exceeded>.*)\s+hours,\swill\sterminate\sthe\srun|Performance:\s*(?P<performance>[\s\d\.]+)\n|(?P<completed>Finished\smdrun\son\snode)')
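
To show what the three named groups capture, the sketch below applies the same pattern to a few log lines. The sample text is made up for illustration and is not real :program:`mdrun` output; ``scan_log`` is a hypothetical helper, and which column of the performance line holds ns/d is not assumed here, so all numbers are returned.

```python
import re

# Same pattern as the documented value, split over several raw strings.
LOG_RE = re.compile(
    r'(?x)Run\stime\sexceeded\s+(?P<exceeded>.*)\s+hours,\swill\sterminate\sthe\srun'
    r'|Performance:\s*(?P<performance>[\s\d\.]+)\n'
    r'|(?P<completed>Finished\smdrun\son\snode)')


def scan_log(text):
    """Collect what LOG_RE finds in the text of a log file."""
    found = {"exceeded": None, "performance": None, "completed": False}
    for match in LOG_RE.finditer(text):
        if match.group("exceeded") is not None:
            found["exceeded"] = match.group("exceeded").strip()
        elif match.group("performance") is not None:
            # Whitespace-separated numbers of the performance line.
            found["performance"] = [
                float(x) for x in match.group("performance").split()]
        elif match.group("completed") is not None:
            found["completed"] = True
    return found


# Made-up sample lines, for illustration only.
SAMPLE = (
    "Run time exceeded 23.760 hours, will terminate the run\n"
    "Performance:    123.456     10.5    20.25    1.185\n"
    "Finished mdrun on node 0\n"
)
```

Calling ``scan_log(SAMPLE)`` fills all three entries; a log that contains none of the patterns leaves the defaults untouched.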