Package gromacs :: Module cbook
[hide private]
[frames] | no frames]

Source Code for Module gromacs.cbook

   1  # GromacsWrapper -- cbook.py 
   2  # Copyright (c) 2009 Oliver Beckstein <orbeckst@gmail.com> 
   3  # Released under the GNU Public License 3 (or higher, your choice) 
   4   
   5  """ 
   6  :mod:`gromacs.cbook` -- Gromacs Cook Book 
   7  ========================================= 
   8   
   9  The :mod:`~gromacs.cbook` (cook book) module contains short recipes for tasks 
  10  that are often repeated. In the simplest case this is just one of the 
  11  gromacs tools with a certain set of default command line options. 
  12   
  13  By abstracting and collecting these invocations here, errors can be 
  14  reduced and the code snippets can also serve as canonical examples for 
  15  how to do simple things. 
  16   
  17  Miscellaneous canned Gromacs commands 
  18  ------------------------------------- 
  19   
  20  Simple commands with new default options so that they solve a specific 
  21  problem (see also `Manipulating trajectories and structures`_): 
  22   
  23  .. function:: rmsd_backbone([s="md.tpr", f="md.xtc"[, ...]]) 
  24   
  25     Computes the RMSD of the "Backbone" atoms after fitting to the 
  26     "Backbone" (including both translation and rotation). 
  27   
  28   
  29  Manipulating trajectories and structures 
  30  ---------------------------------------- 
  31   
  32  Standard invocations for manipulating trajectories.  
  33   
  34  .. function:: trj_compact([s="md.tpr", f="md.xtc", o="compact.xtc"[, ...]]) 
  35   
  36     Writes an output trajectory or frame with a compact representation 
  37     of the system centered on the protein. It centers on the group 
  38     "Protein" and outputs the whole "System" group. 
  39   
  40   
  41  .. function:: trj_xyfitted([s="md.tpr", f="md.xtc"[, ...]]) 
  42   
  43     Writes a trajectory centered and fitted to the protein in the XY-plane only. 
  44   
  45     This is useful for membrane proteins. The system *must* be oriented so that 
  46     the membrane is in the XY plane. The protein backbone is used for the least 
  47     square fit, centering is done for the whole protein., but this can be 
  48     changed with the *input* = ``('backbone', 'protein','system')`` keyword. 
  49   
  50     .. Note:: Gromacs 4.x only 
  51       
  52  .. autofunction:: trj_fitandcenter 
  53  .. autofunction:: cat 
  54  .. autoclass:: Frames 
  55     :members: 
  56  .. autoclass:: Transformer 
  57     :members: 
  58  .. autofunction:: get_volume 
  59   
  60  Processing output 
  61  ----------------- 
  62   
  63  There are cases when a script has to to do different things depending 
  64  on the output from a Gromacs tool.  
  65   
  66  For instance, a common case is to check the total charge after 
  67  grompping a tpr file. The ``grompp_qtot`` function does just that. 
  68   
  69  .. autofunction:: grompp_qtot 
  70  .. autofunction:: get_volume 
  71  .. autofunction:: parse_ndxlist 
  72   
  73   
  74  Working with topologies and mdp files 
  75  ------------------------------------- 
  76   
  77  .. autofunction:: create_portable_topology 
  78  .. autofunction:: edit_mdp 
  79  .. autofunction:: add_mdp_includes 
  80  .. autofunction:: grompp_qtot 
  81   
  82   
  83  Working with index files 
  84  ------------------------ 
  85   
  86  Manipulation of index files (``ndx``) can be cumbersome because the 
  87  ``make_ndx`` program is not very sophisticated (yet) compared to 
  88  full-fledged atom selection expression as available in Charmm_, VMD_, or 
  89  MDAnalysis_. Some tools help in building and interpreting index files. 
  90   
  91  .. SeeAlso:: The :class:`gromacs.formats.NDX` class can solve a number 
  92               of index problems in a cleaner way than the classes and 
  93               functions here. 
  94   
  95  .. autoclass:: IndexBuilder 
  96     :members: combine, gmx_resid 
  97   
  98  .. autofunction:: parse_ndxlist 
  99  .. autofunction:: get_ndx_groups 
 100  .. autofunction:: make_ndx_captured 
 101   
 102   
 103  .. _MDAnalysis: http://mdanalysis.googlecode.com 
 104  .. _VMD: http://www.ks.uiuc.edu/Research/vmd/current/ug/node87.html 
 105  .. _Charmm: http://www.charmm.org/html/documentation/c35b1/select.html 
 106   
 107   
 108  File editing functions 
 109  ---------------------- 
 110   
 111  It is often rather useful to be able to change parts of a template 
 112  file. For specialized cases the two following functions are useful: 
 113   
 114  .. autofunction:: edit_mdp 
 115  .. autofunction:: edit_txt 
 116  """ 
 117   
 118  # Right now the simplest thing to do is to just create instances with pre-set 
 119  # values; this works fine and is succinct but has some disadvantages: 
 120  # * the underlying gromacs tool is executed to extract the help string; this 
 121  #   adds to the import time 
 122  # * adding documentation is awkward 
 123  #  
 124  # For more complicated cases one is probably better off by properly deriving a 
 125  # new class and set arguments explicitly in init (using kwargs['flag'] = 
 126  # default) ... or I can write some meta(??) class to do this nicely 
 127   
 128  from __future__ import with_statement 
 129   
 130  __docformat__ = "restructuredtext en" 
 131   
 132  import sys 
 133  import os 
 134  import re 
 135  import warnings 
 136  import tempfile 
 137  import shutil 
 138  import glob 
 139   
 140  import logging 
 141  logger = logging.getLogger('gromacs.cbook') 
 142   
 143  import gromacs 
 144  from gromacs import GromacsError, BadParameterWarning, MissingDataWarning, GromacsValueWarning 
 145  import tools 
 146  import utilities 
 147  from utilities import asiterable 
 148   
 149  trj_compact = tools.Trjconv(ur='compact', center=True, boxcenter='tric', pbc='mol', 
 150                              input=('protein','system'), 
 151                              doc=""" 
 152  Writes a compact representation of the system centered on the protein""") 
 153   
 154  rmsd_backbone = tools.G_rms(what='rmsd', fit='rot+trans', 
 155                              input=('Backbone','Backbone'), 
 156                              doc=""" 
 157  Computes RMSD of backbone after fitting to the backbone.""") 
 158   
 159  # Gromacs 4.x 
 160  trj_xyfitted = tools.Trjconv(fit='rotxy+transxy', 
 161                              center=True, boxcenter='rect', pbc='whole', 
 162                              input=('backbone', 'protein','system'), 
 163                              doc=""" 
 164  Writes a trajectory centered and fitted to the protein in the XY-plane only. 
 165   
 166  This is useful for membrane proteins. The system *must* be oriented so 
 167  that the membrane is in the XY plane. The protein backbone is used 
 168  for the least square fit, centering is done for the whole protein. 
 169   
 170  .. Note:: Gromacs 4.x only""") 
171 172 -def trj_fitandcenter(xy=False, **kwargs):
173 """Center everything and make a compact representation (pass 1) and fit the system to a reference (pass 2). 174 175 :Keywords: 176 *s* 177 input structure file (tpr file required to make molecule whole) 178 *f* 179 input trajectory 180 *o* 181 output trajectory 182 *input* 183 A list with three groups. The default is 184 ['backbone', 'protein','system'] 185 The fit command uses all three (1st for least square fit, 186 2nd for centering, 3rd for output), the centered/make-whole stage use 187 2nd for centering and 3rd for output. 188 *input1* 189 If *input1* is supplied then *input* is used exclusively 190 for the fitting stage (pass 2) and *input1* for the centering (pass 1). 191 *n* 192 Index file used for pass 1 and pass 2. 193 *n1* 194 If *n1* is supplied then index *n1* is only used for pass 1 195 (centering) and *n* for pass 2 (fitting). 196 *xy* : boolean 197 If ``True`` then only do a rot+trans fit in the xy plane 198 (good for membrane simulations); default is ``False``. 199 *kwargs* 200 All other arguments are passed to :class:`~gromacs.tools.Trjconv`. 201 202 Note that here we first center the protein and create a compact box, using 203 ``-pbc mol -ur compact -center -boxcenter tric`` and write an intermediate 204 xtc. Then in a second pass we perform a rotation+translation fit (or 205 restricted to the xy plane if *xy* = ``True`` is set) on the intermediate 206 xtc to produce the final trajectory. Doing it in this order has the 207 disadvantage that the solvent box is rotating around the protein but the 208 opposite order (with center/compact second) produces strange artifacts 209 where columns of solvent appear cut out from the box---it probably means 210 that after rotation the information for the periodic boundaries is not 211 correct any more. 212 213 Most kwargs are passed to both invocations of 214 :class:`gromacs.tools.Trjconv` so it does not really make sense to use eg 215 *skip*; in this case do things manually. 216 217 By default the *input* to the fit command is ('backbone', 218 'protein','system'); the compact command always uses the second and third 219 group for its purposes or if this fails, prompts the user. 220 221 Both steps cannot performed in one pass; this is a known limitation of 222 ``trjconv``. An intermediate temporary XTC files is generated which should 223 be automatically cleaned up unless bad things happened. 224 225 The function tries to honour the input/output formats. For instance, if you 226 want trr output you need to supply a trr file as input and explicitly give 227 the output file also a trr suffix. 228 229 .. Note:: For big trajectories it can **take a very long time** 230 and consume a **large amount of temporary diskspace**. 231 232 We follow the `g_spatial documentation`_ in preparing the trajectories:: 233 234 trjconv -s a.tpr -f a.xtc -o b.xtc -center tric -ur compact -pbc none 235 trjconv -s a.tpr -f b.xtc -o c.xtc -fit rot+trans 236 237 .. _`g_spatial documentation`: http://oldwiki.gromacs.org/index.php/Manual:g_spatial_4.0.3 238 """ 239 if xy: 240 fitmode = 'rotxy+transxy' 241 kwargs.pop('fit', None) 242 else: 243 fitmode = kwargs.pop('fit', 'rot+trans') # user can use progressive, too 244 245 intrj = kwargs.pop('f', None) 246 # get the correct suffix for the intermediate step: only trr will 247 # keep velocities/forces! 248 suffix = os.path.splitext(intrj)[1] 249 if not suffix in ('xtc', 'trr'): 250 suffix = '.xtc' 251 outtrj = kwargs.pop('o', None) 252 253 ndx = kwargs.pop('n', None) 254 ndxcompact = kwargs.pop('n1', ndx) 255 256 257 inpfit = kwargs.pop('input', ('backbone', 'protein','system')) 258 try: 259 _inpcompact = inpfit[1:] # use 2nd and 3rd group for compact 260 except TypeError: 261 _inpcompact = None 262 inpcompact = kwargs.pop('input1', _inpcompact) # ... or the user supplied ones 263 264 fd, tmptrj = tempfile.mkstemp(suffix=suffix, prefix='fitted_') 265 266 logger.info("Input trajectory: %(intrj)r\nOutput trajectory: %(outtrj)r"% vars()) 267 logger.info("... writing temporary trajectory %(tmptrj)r (will be auto-cleaned)." % vars()) 268 sys.stdout.flush() 269 try: 270 trj_compact(f=intrj, o=tmptrj, n=ndxcompact, input=inpcompact, **kwargs) 271 trj_xyfitted(f=tmptrj, o=outtrj, n=ndx, fit=fitmode, input=inpfit, **kwargs) 272 finally: 273 utilities.unlink_gmx(tmptrj)
274
275 -def cat(prefix="md", dirname=os.path.curdir, partsdir="parts", fulldir="full", 276 resolve_multi="pass"):
277 """Concatenate all parts of a simulation. 278 279 The xtc, trr, and edr files in *dirname* such as prefix.xtc, 280 prefix.part0002.xtc, prefix.part0003.xtc, ... are 281 282 1) moved to the *partsdir* (under *dirname*) 283 2) concatenated with the Gromacs tools to yield prefix.xtc, prefix.trr, 284 prefix.edr, prefix.gro (or prefix.md) in *dirname* 285 3) Store these trajectories in *fulldir* 286 287 .. Note:: Trajectory files are *never* deleted by this function to avoid 288 data loss in case of bugs. You will have to clean up yourself 289 by deleting *dirname*/*partsdir*. 290 291 Symlinks for the trajectories are *not* handled well and 292 break the function. Use hard links instead. 293 294 .. Warning:: If an exception occurs when running this function then make 295 doubly and triply sure where your files are before running 296 this function again; otherwise you might **overwrite data**. 297 Possibly you will need to manually move the files from *partsdir* 298 back into the working directory *dirname*; this should onlu overwrite 299 generated files so far but *check carefully*! 300 301 :Keywords: 302 *prefix* 303 deffnm of the trajectories [md] 304 *resolve_multi" 305 how to deal with multiple "final" gro or pdb files: normally there should 306 only be one but in case of restarting from the checkpoint of a finished 307 simulation one can end up with multiple identical ones. 308 - "pass" : do nothing and log a warning 309 - "guess" : take prefix.pdb or prefix.gro if it exists, otherwise the one of 310 prefix.partNNNN.gro|pdb with the highes NNNN 311 *dirname* 312 change to *dirname* and assume all tarjectories are located there [.] 313 *partsdir* 314 directory where to store the input files (they are moved out of the way); 315 *partsdir* must be manually deleted [parts] 316 *fulldir* 317 directory where to store the final results [full] 318 """ 319 320 gmxcat = {'xtc': gromacs.trjcat, 321 'trr': gromacs.trjcat, 322 'edr': gromacs.eneconv, 323 'log': utilities.cat, 324 } 325 326 def _cat(prefix, ext, partsdir=partsdir, fulldir=fulldir): 327 filenames = glob_parts(prefix, ext) 328 if ext.startswith('.'): 329 ext = ext[1:] 330 outfile = os.path.join(fulldir, prefix + '.' + ext) 331 if not filenames: 332 return None 333 nonempty_files = [] 334 for f in filenames: 335 if os.stat(f).st_size == 0: 336 logger.warn("File %(f)r is empty, skipping" % vars()) 337 continue 338 if os.path.islink(f): 339 # TODO: re-write the symlink to point to the original file 340 errmsg = "Symbolic links do not work (file %(f)r), sorry. " \ 341 "CHECK LOCATION OF FILES MANUALLY BEFORE RUNNING gromacs.cbook.cat() AGAIN!" % vars() 342 logger.exception(errmsg) 343 raise NotImplementedError(errmsg) 344 shutil.move(f, partsdir) 345 nonempty_files.append(f) 346 filepaths = [os.path.join(partsdir, f) for f in nonempty_files] 347 gmxcat[ext](f=filepaths, o=outfile) 348 return outfile
349 350 _resolve_options = ("pass", "guess") 351 if not resolve_multi in _resolve_options: 352 raise ValueError("resolve_multi must be one of %(_resolve_options)r, " 353 "not %(resolve_multi)r" % vars()) 354 355 if fulldir == os.path.curdir: 356 wmsg = "Using the current directory as fulldir can potentially lead to data loss if you run this function multiple times." 357 logger.warning(wmsg) 358 warnings.warn(wmsg, category=BadParameterWarning) 359 360 with utilities.in_dir(dirname, create=False): 361 utilities.mkdir_p(partsdir) 362 utilities.mkdir_p(fulldir) 363 for ext in ('log', 'edr', 'trr', 'xtc'): 364 logger.info("[%(dirname)s] concatenating %(ext)s files...", vars()) 365 outfile = _cat(prefix, ext, partsdir) 366 logger.info("[%(dirname)s] created %(outfile)r", vars()) 367 for ext in ('gro', 'pdb'): # XXX: ugly, make method out of parts? 368 filenames = glob_parts(prefix, ext) 369 if len(filenames) == 0: 370 continue # goto next ext 371 elif len(filenames) == 1: 372 pick = filenames[0] 373 else: 374 if resolve_multi == "pass": 375 logger.warning("[%(dirname)s] too many output structures %(filenames)r, " 376 "cannot decide which one --- resolve manually!", vars()) 377 for f in filenames: 378 shutil.move(f, partsdir) 379 continue # goto next ext 380 elif resolve_multi == "guess": 381 pick = prefix + '.' + ext 382 if not pick in filenames: 383 pick = filenames[-1] # filenames are ordered with highest parts at end 384 final = os.path.join(fulldir, prefix + '.' + ext) 385 shutil.copy(pick, final) # copy2 fails on nfs with Darwin at least 386 for f in filenames: 387 shutil.move(f, partsdir) 388 logger.info("[%(dirname)s] collected final structure %(final)r " 389 "(from %(pick)r)", vars()) 390 391 392 partsdirpath = utilities.realpath(dirname, partsdir) 393 logger.warn("[%(dirname)s] cat() complete in %(fulldir)r but original files " 394 "in %(partsdirpath)r must be manually removed", vars()) 395
396 -def glob_parts(prefix, ext):
397 """Find files from a continuation run""" 398 if ext.startswith('.'): 399 ext = ext[1:] 400 files = glob.glob(prefix+'.'+ext) + glob.glob(prefix+'.part[0-9][0-9][0-9][0-9].'+ext) 401 files.sort() # at least some rough sorting... 402 return files
403
404 405 -class Frames(object):
406 """A iterator that transparently provides frames from a trajectory. 407 408 The iterator chops a trajectory into individual frames for 409 analysis tools that only work on separate structures such as 410 ``gro`` or ``pdb`` files. Instead of turning the whole trajectory 411 immediately into pdb files (and potentially filling the disk), the 412 iterator can be instructed to only provide a fixed number of 413 frames and compute more frames when needed. 414 415 .. Note:: Setting a limit on the number of frames on disk can lead 416 to longish waiting times because ``trjconv`` must 417 re-seek to the middle of the trajectory and the only way 418 it can do this at the moment is by reading frames 419 sequentially. This might still be preferrable to filling 420 up a disk, though. 421 422 .. Warning:: The *maxframes* option is not implemented yet; use 423 the *dt* option or similar to keep the number of frames 424 manageable. 425 """ 426
427 - def __init__(self, structure, trj, maxframes=None, format='pdb', **kwargs):
428 """Set up the Frames iterator. 429 430 :Arguments: 431 structure 432 name of a structure file (tpr, pdb, ...) 433 trj 434 name of the trajectory (xtc, trr, ...) 435 format 436 output format for the frames, eg "pdb" or "gro" [pdb] 437 maxframes : int 438 maximum number of frames that are extracted to disk at 439 one time; set to ``None`` to extract the whole trajectory 440 at once. [``None``] 441 kwargs 442 All other arguments are passed to 443 `class:~gromacs.tools.Trjconv`; the only options that 444 cannot be changed are *sep* and the output file name *o*. 445 446 """ 447 self.structure = structure # tpr or equivalent 448 self.trj = trj # xtc, trr, ... 449 self.maxframes = maxframes 450 if not self.maxframes is None: 451 raise NotImplementedError('sorry, maxframes feature not implemented yet') 452 453 self.framedir = tempfile.mkdtemp(prefix="Frames_", suffix='_'+format) 454 self.frameprefix = os.path.join(self.framedir, 'frame') 455 self.frametemplate = self.frameprefix + '%d' + '.' + format # depends on trjconv 456 self.frameglob = self.frameprefix + '*' + '.' + format 457 kwargs['sep'] = True 458 kwargs['o'] = self.frameprefix + '.' + format 459 kwargs.setdefault('input', ('System',)) 460 self.extractor = tools.Trjconv(s=self.structure, f=self.trj, **kwargs) 461 462 #: Holds the current frame number of the currently extracted 463 #: batch of frames. Increases when iterating. 464 self.framenumber = 0 465 #: Total number of frames read so far; only important when *maxframes* > 0 is used. 466 self.totalframes = 0
467
468 - def extract(self):
469 """Extract frames from the trajectory to the temporary directory.""" 470 # XXX: extract everything at the moment, logic for maxframes not done yet 471 self.extractor.run()
472 473 @property
474 - def all_frames(self):
475 """Unordered list of all frames currently held on disk.""" 476 return glob.glob(self.frameglob)
477 478 @property
479 - def current_framename(self):
480 return self.frametemplate % self.framenumber
481
482 - def __iter__(self):
483 """Primitive iterator.""" 484 frames = self.all_frames 485 if len(frames) == 0: 486 self.extract() 487 frames = self.all_frames 488 489 # filenames are 'Frame0.pdb', 'Frame11.pdb', ... so I must 490 # order manually because glob does not give it in sequence. 491 for i in xrange(len(frames)): 492 self.framenumber = i 493 yield self.current_framename 494 self.totalframes += len(frames)
495
496 - def delete_frames(self):
497 """Delete all frames.""" 498 for frame in glob.glob(self.frameglob): 499 os.unlink(frame)
500
501 - def cleanup(self):
502 """Clean up all temporary frames (which can be HUGE).""" 503 shutil.rmtree(self.framedir) 504 self.framedir = None
505
506 - def __del__(self):
507 if not self.framedir is None: 508 self.cleanup()
509
510 # Working with topologies 511 # ----------------------- 512 513 -def grompp_qtot(*args, **kwargs):
514 """Run ``gromacs.grompp`` and return the total charge of the system. 515 516 :Arguments: 517 The arguments are the ones one would pass to :func:`gromacs.grompp`. 518 :Returns: 519 The total charge as reported 520 521 Some things to keep in mind: 522 523 * The stdout output of grompp is not shown. This can make debugging 524 pretty hard. Try running the normal :func:`gromacs.grompp` command and 525 analyze the output if the debugging messages are not sufficient. 526 * Check that ``qtot`` is correct; because the function is based on pattern 527 matching of the output it can break when the output format changes. 528 """ 529 530 # match ' System has non-zero total charge: -4.000001e+00', 531 qtot_pattern = re.compile('System has non-zero total charge: *(?P<qtot>[-+]?\d*\.\d+[eE][-+]\d+)') 532 kwargs['stdout'] = False 533 rc, output, junk = gromacs.grompp(*args, **kwargs) 534 qtot = 0 535 for line in output.split('\n'): 536 m = qtot_pattern.search(line) 537 if m: 538 qtot = float(m.group('qtot')) 539 break 540 logger.info("system total charge qtot = %(qtot)r" % vars()) 541 return qtot
542
543 -def _mdp_include_string(dirs):
544 """Generate a string that can be added to a mdp 'include = ' line.""" 545 include_paths = [os.path.expanduser(p) for p in dirs] 546 return ' -I'.join([''] + include_paths)
547
548 -def add_mdp_includes(topology=None, kwargs=None):
549 """Set the mdp *include* key in the *kwargs* dict. 550 551 1. Add the directory containing *topology*. 552 2. Add all directories appearing under the key *includes* 553 3. Generate a string of the form "-Idir1 -Idir2 ..." that 554 is stored under the key *include* (the corresponding 555 mdp parameter) 556 557 By default, the directories ``.`` and ``..`` are also added to the 558 *include* string for the mdp; when fed into 559 :func:`gromacs.cbook.edit_mdp` it will result in a line such as :: 560 561 include = -I. -I.. -I../topology_dir .... 562 563 Note that the user can always override the behaviour by setting 564 the *include* keyword herself; in this case this function does 565 nothing. 566 567 If no *kwargs* were supplied then a dict is generated with the 568 single *include* entry. 569 570 :Arguments: 571 *topology* : top filename 572 Topology file; the name of the enclosing directory is added 573 to the include path (if supplied) [``None``] 574 *kwargs* : dict 575 Optional dictionary of mdp keywords; will be modified in place. 576 If it contains the *includes* keyword with either a single string 577 or a list of strings then these paths will be added to the 578 include statement. 579 :Returns: *kwargs* with the *include* keyword added if it did not 580 exist previously; if the keyword already existed, nothing 581 happens. 582 583 .. Note:: The *kwargs* dict is **modified in place**. This 584 function is a bit of a hack. It might be removed once 585 all setup functions become methods in a nice class. 586 """ 587 if kwargs is None: 588 kwargs = {} 589 590 if not topology is None: 591 # half-hack: find additional itps in the same directory as the 592 # topology 593 topology_dir = os.path.dirname(topology) 594 include_dirs = ['.', '..', topology_dir] # should . & .. always be added? 595 596 include_dirs.extend(asiterable(kwargs.pop('includes', []))) # includes can be a list or a string 597 598 # 1. setdefault: we do nothing if user defined include 599 # 2. modify input in place! 600 kwargs.setdefault('include', _mdp_include_string(include_dirs)) 601 return kwargs
602
603 604 -def create_portable_topology(topol, struct, **kwargs):
605 """Create a processed topology. 606 607 The processed (or portable) topology file does not contain any 608 ``#include`` statements and hence can be easily copied around. It 609 also makes it possible to re-grompp without having any special itp 610 files available. 611 612 :Arguments: 613 *topol* 614 topology file 615 *struct* 616 coordinat (structure) file 617 618 :Keywords: 619 *processed* 620 name of the new topology file; if not set then it is named like 621 *topol* but with ``pp_`` prepended 622 *includes* 623 path or list of paths of directories in which itp files are 624 searched for 625 626 :Returns: full path to the processed trajectory 627 """ 628 _topoldir, _topol = os.path.split(topol) 629 processed = kwargs.pop('processed', os.path.join(_topoldir, 'pp_'+_topol)) 630 mdp_kwargs = add_mdp_includes(topol, kwargs) 631 with tempfile.NamedTemporaryFile(suffix='.mdp') as mdp: 632 mdp.write('; empty mdp file\ninclude = %(include)s\n' % mdp_kwargs) 633 mdp.flush() 634 try: 635 gromacs.grompp(v=False, f=mdp.name, p=topol, c=struct, pp=processed) 636 finally: 637 utilities.unlink_gmx('topol.tpr', 'mdout.mdp') 638 return utilities.realpath(processed)
639
640 -def get_volume(f):
641 """Return the volume in nm^3 of structure file *f*. 642 643 (Uses :func:`gromacs.editconf`; error handling is not good) 644 """ 645 fd, temp = tempfile.mkstemp('.gro') 646 try: 647 rc,out,err = gromacs.editconf(f=f, o=temp, stdout=False) 648 finally: 649 os.unlink(temp) 650 return [float(x.split()[1]) for x in out.splitlines() 651 if x.startswith('Volume:')][0]
652
653 654 # Editing textual input files 655 # --------------------------- 656 657 -def edit_mdp(mdp, new_mdp=None, extend_parameters=None, **substitutions):
658 """Change values in a Gromacs mdp file. 659 660 Parameters and values are supplied as substitutions, eg ``nsteps=1000``. 661 662 By default the template mdp file is **overwritten in place**. 663 664 If a parameter does not exist in the template then it cannot be substituted 665 and the parameter/value pair is returned. The user has to check the 666 returned list in order to make sure that everything worked as expected. At 667 the moment it is not possible to automatically append the new values to the 668 mdp file because of ambiguities when having to replace dashes in parameter 669 names with underscores (see the notes below on dashes/underscores). 670 671 If a parameter is set to the value ``None`` then it will be ignored. 672 673 :Arguments: 674 *mdp* : filename 675 filename of input (and output filename of ``new_mdp=None``) 676 *new_mdp* : filename 677 filename of alternative output mdp file [None] 678 *extend_parameters* : string or list of strings 679 single parameter or list of parameters for which the new values 680 should be appended to the existing value in the mdp file. This 681 makes mostly sense for a single parameter, namely 'include', which 682 is set as the default. Set to ``[]`` to disable. ['include'] 683 *substitutions* 684 parameter=value pairs, where parameter is defined by the Gromacs mdp file; 685 dashes in parameter names have to be replaced by underscores. 686 687 :Returns: 688 Dict of parameters that have *not* been substituted. 689 690 **Example** :: 691 692 edit_mdp('md.mdp', new_mdp='long_md.mdp', nsteps=100000, nstxtcout=1000, lincs_iter=2) 693 694 .. Note:: 695 696 * Dashes in Gromacs mdp parameters have to be replaced by an underscore 697 when supplied as python keyword arguments (a limitation of python). For example 698 the MDP syntax is ``lincs-iter = 4`` but the corresponding keyword would be 699 ``lincs_iter = 4``. 700 * If the keyword is set as a dict key, eg ``mdp_params['lincs-iter']=4`` then one 701 does not have to substitute. 702 * Parameters *aa_bb* and *aa-bb* are considered the same (although this should 703 not be a problem in practice because there are no mdp parameters that only 704 differ by a underscore). 705 * This code is more compact in ``Perl`` as one can use ``s///`` operators: 706 ``s/^(\s*${key}\s*=\s*).*/$1${val}/`` 707 708 .. SeeAlso:: One can also load the mdp file with 709 :class:`gromacs.formats.MDP`, edit the object (a dict), and save it again. 710 """ 711 if new_mdp is None: 712 new_mdp = mdp 713 if extend_parameters is None: 714 extend_parameters = ['include'] 715 else: 716 extend_parameters = list(asiterable(extend_parameters)) 717 718 # None parameters should be ignored (simple way to keep the template defaults) 719 substitutions = dict([(k,v) for k,v in substitutions.items() if not v is None]) 720 721 params = substitutions.keys()[:] # list will be reduced for each match 722 723 def demangled(p): 724 """Return a RE string that matches the parameter.""" 725 return p.replace('_', '[-_]') # must catch either - or _
726 727 patterns = dict([(parameter, 728 re.compile("""\ 729 (?P<assignment>\s*%s\s*=\s*) # parameter == everything before the value 730 (?P<value>[^;]*) # value (stop before comment=;) 731 (?P<comment>\s*;.*)? # optional comment 732 """ % demangled(parameter), re.VERBOSE)) 733 for parameter in substitutions]) 734 735 target = tempfile.TemporaryFile() 736 with open(mdp) as src: 737 logger.info("editing mdp = %r: %r" % (mdp, substitutions.keys())) 738 for line in src: 739 new_line = line.strip() # \n must be stripped to ensure that new line is built without break 740 for p in params[:]: 741 m = patterns[p].match(new_line) 742 if m: 743 # I am too stupid to replace a specific region in the string so I rebuild it 744 # (matching a line and then replacing value requires TWO re calls) 745 #print 'line:' + new_line 746 #print m.groupdict() 747 if m.group('comment') is None: 748 comment = '' 749 else: 750 comment = " "+m.group('comment') 751 assignment = m.group('assignment') 752 if not assignment.endswith(' '): 753 assignment += ' ' 754 # build new line piece-wise: 755 new_line = assignment 756 if p in extend_parameters: 757 # keep original value and add new stuff at end 758 new_line += str(m.group('value')) + ' ' 759 new_line += str(substitutions[p]) + comment 760 params.remove(p) 761 break 762 target.write(new_line+'\n') 763 target.seek(0) 764 # XXX: Is there a danger of corrupting the original mdp if something went wrong? 765 with open(new_mdp, 'w') as final: 766 shutil.copyfileobj(target, final) 767 target.close() 768 # return all parameters that have NOT been substituted 769 if len(params) > 0: 770 logger.warn("Not substituted in %(new_mdp)r: %(params)r" % vars()) 771 return dict([(p, substitutions[p]) for p in params]) 772
773 -def edit_txt(filename, substitutions, newname=None):
774 """Primitive text file stream editor. 775 776 This function can be used to edit free-form text files such as the 777 topology file. By default it does an **in-place edit** of 778 *filename*. If *newname* is supplied then the edited 779 file is written to *newname*. 780 781 :Arguments: 782 *filename* 783 input text file 784 *substitutions* 785 substitution commands (see below for format) 786 *newname* 787 output filename; if ``None`` then *filename* is changed in 788 place [``None``] 789 790 *substitutions* is a list of triplets; the first two elements are regular 791 expression strings, the last is the substitution value. It mimics 792 ``sed`` search and replace. The rules for *substitutions*: 793 794 .. productionlist:: 795 substitutions: "[" search_replace_tuple, ... "]" 796 search_replace_tuple: "(" line_match_RE "," search_RE "," replacement ")" 797 line_match_RE: regular expression that selects the line (uses match) 798 search_RE: regular expression that is searched in the line 799 replacement: replacement string for search_RE 800 801 Running :func:`edit_txt` does pretty much what a simple :: 802 803 sed /line_match_RE/s/search_RE/replacement/ 804 805 with repeated substitution commands does. 806 807 Special replacement values: 808 - ``None``: the rule is ignored 809 - ``False``: the line is deleted (even if other rules match) 810 811 .. note:: 812 813 * No sanity checks are performed and the substitutions must be supplied 814 exactly as shown. 815 * All substitutions are applied to a line; thus the order of the substitution 816 commands may matter when one substitution generates a match for a subsequent rule. 817 * If replacement is set to ``None`` then the whole expression is ignored and 818 whatever is in the template is used. To unset values you must provided an 819 empty string or similar. 820 * Delete a matching line if replacement=``False``. 821 """ 822 if newname is None: 823 newname = filename 824 825 # No sanity checks (figure out later how to give decent diagnostics). 826 # Filter out any rules that have None in replacement. 827 _substitutions = [{'lRE': re.compile(str(lRE)), 828 'sRE': re.compile(str(sRE)), 829 'repl': repl} 830 for lRE,sRE,repl in substitutions if not repl is None] 831 832 target = tempfile.TemporaryFile() 833 with open(filename) as src: 834 logger.info("editing txt = %r (%d substitutions)" % (filename, len(substitutions))) 835 for line in src: 836 keep_line = True 837 for subst in _substitutions: 838 m = subst['lRE'].match(line) 839 if m: # apply substition to this line? 840 logger.debug('match: '+line.rstrip()) 841 if subst['repl'] is False: # special rule: delete line 842 keep_line = False 843 else: # standard replacement 844 line = subst['sRE'].sub(str(subst['repl']), line) 845 logger.debug('replaced: '+line.rstrip()) 846 if keep_line: 847 target.write(line) 848 else: 849 logger.debug("Deleting line %r", line) 850 851 target.seek(0L) 852 with open(newname, 'w') as final: 853 shutil.copyfileobj(target, final) 854 target.close() 855 logger.info("edited txt = %(newname)r" % vars())
856 857 858 # Working with index files and index groups 859 # ----------------------------------------- 860 861 #: compiled regular expression to match a list of index groups 862 #: in the output of ``make_ndx``s <Enter> (empty) command. 863 NDXLIST = re.compile(r""">\s+\n # '> ' marker line from '' input (input not echoed) 864 \n # empty line 865 (?P<LIST> # list of groups 866 ( # consists of repeats of the same pattern: 867 \s*\d+ # group number 868 \s+[^\s]+\s*: # group name, separator ':' 869 \s*\d+\satoms # number of atoms in group 870 \n 871 )+ # multiple repeats 872 )""", re.VERBOSE) 873 #: compiled regular expression to match a single line of 874 #: ``make_ndx`` output (e.g. after a successful group creation) 875 NDXGROUP = re.compile(r""" 876 \s*(?P<GROUPNUMBER>\d+) # group number 877 \s+(?P<GROUPNAME>[^\s]+)\s*: # group name, separator ':' 878 \s*(?P<NATOMS>\d+)\satoms # number of atoms in group 879 """, re.VERBOSE)
880 881 -def make_ndx_captured(**kwargs):
882 """make_ndx that captures all output 883 884 Standard :func:`~gromacs.make_ndx` command with the input and 885 output pre-set in such a way that it can be conveniently used for 886 :func:`parse_ndxlist`. 887 888 Example:: 889 ndx_groups = parse_ndxlist(make_ndx_captured(n=ndx)[0]) 890 891 Note that the convenient :func:`get_ndx_groups` function does exactly 892 that and can probably used in most cases. 893 894 :Arguments: 895 keywords are passed on to :func:`~gromacs.make_ndx` 896 :Returns: 897 (*returncode*, *output*, ``None``) 898 """ 899 kwargs['stdout']=False # required for proper output as described in doc 900 kwargs['stderr']=True # ... 901 user_input = kwargs.pop('input',[]) 902 user_input = [cmd for cmd in user_input if cmd != 'q'] # filter any quit 903 kwargs['input'] = user_input + ['', 'q'] # necessary commands 904 return gromacs.make_ndx(**kwargs)
905
906 -def get_ndx_groups(ndx, **kwargs):
907 """Return a list of index groups in the index file *ndx*. 908 909 :Arguments: 910 - *ndx* is a Gromacs index file. 911 - kwargs are passed to :func:`make_ndx_captured`. 912 913 :Returns: 914 list of groups as supplied by :func:`parse_ndxlist` 915 916 Alternatively, load the index file with 917 :class:`gromacs.formats.NDX` for full control. 918 """ 919 fd, tmp_ndx = tempfile.mkstemp(suffix='.ndx') 920 kwargs['o'] = tmp_ndx 921 try: 922 g = parse_ndxlist(make_ndx_captured(n=ndx, **kwargs)[1]) 923 finally: 924 utilities.unlink_gmx(tmp_ndx) 925 return g
926
927 -def parse_ndxlist(output):
928 """Parse output from make_ndx to build list of index groups:: 929 930 groups = parse_ndxlist(output) 931 932 output should be the standard output from ``make_ndx``, e.g.:: 933 934 rc,output,junk = gromacs.make_ndx(..., input=('', 'q'), stdout=False, stderr=True) 935 936 (or simply use 937 938 rc,output,junk = cbook.make_ndx_captured(...) 939 940 which presets input, stdout and stderr; of course input can be overriden.) 941 942 :Returns: 943 The function returns a list of dicts (``groups``) with fields 944 945 name 946 name of the groups 947 nr 948 number of the group (starts at 0) 949 natoms 950 number of atoms in the group 951 952 """ 953 954 m = NDXLIST.search(output) # make sure we pick up a proper full list 955 grouplist = m.group('LIST') 956 return parse_groups(grouplist)
957
958 -def parse_groups(output):
959 """Parse ``make_ndx`` output and return groups as a list of dicts.""" 960 groups = [] 961 for line in output.split('\n'): 962 m = NDXGROUP.match(line) 963 if m: 964 d = m.groupdict() 965 groups.append({'name': d['GROUPNAME'], 966 'nr': int(d['GROUPNUMBER']), 967 'natoms': int(d['NATOMS'])}) 968 return groups
969
970 -class IndexBuilder(object):
971 """Build an index file with specified groups and the combined group. 972 973 This is *not* a full blown selection parser a la Charmm, VMD or 974 MDAnalysis but a very quick hack. 975 976 **Example** 977 978 How to use the :class:`IndexBuilder`:: 979 980 G = gromacs.cbook.IndexBuilder('md_posres.pdb', 981 ['S312:OG','T313:OG1','A38:O','A309:O','@a62549 & r NA'], 982 offset=-9, out_ndx='selection.ndx') 983 groupname, ndx = G.combine() 984 del G 985 986 The residue numbers are given with their canonical resids from the 987 sequence or pdb. *offset=-9* says that one calculates Gromacs topology 988 resids by subtracting 9 from the canonical resid. 989 990 The combined selection is ``OR`` ed by default and written to 991 *selection.ndx*. One can also add all the groups in the initial *ndx* 992 file (or the :program:`make_ndx` default groups) to the output (see the 993 *defaultgroups* keyword for :meth:`IndexBuilder.combine`). 994 995 Generating an index file always requires calling 996 :meth:`~IndexBuilder.combine` even if there is only a single group. 997 998 Deleting the class removes all temporary files associated with it (see 999 :attr:`IndexBuilder.indexfiles`). 1000 1001 :Raises: 1002 If an empty group is detected (which does not always work) then a 1003 :exc:`gromacs.BadParameterWarning` is issued. 1004 1005 :Bugs: 1006 If ``make_ndx`` crashes with an unexpected error then this is fairly hard to 1007 diagnose. For instance, in certain cases it segmentation faults when a tpr 1008 is provided as a *struct* file and the resulting error messages becomes :: 1009 1010 GromacsError: [Errno -11] Gromacs tool failed 1011 Command invocation: make_ndx -o /tmp/tmp_Na1__NK7cT3.ndx -f md_posres.tpr 1012 1013 In this case run the command invocation manually to see what the problem 1014 could be. 1015 1016 1017 .. SeeAlso:: In some cases it might be more straightforward to use 1018 :class:`gromacs.formats.NDX`. 1019 1020 """ 1021
1022 - def __init__(self, struct=None, selections=None, names=None, name_all=None, 1023 ndx=None, out_ndx="selection.ndx", offset=0):
1024 """Build a index group from the selection arguments. 1025 1026 If selections and a structure file are supplied then the individual 1027 selections are constructed with separate calls to 1028 :func:`gromacs.make_ndx`. Use :meth:`IndexBuilder.combine` to combine 1029 them into a joint selection. 1030 1031 :Arguments: 1032 1033 struct : filename 1034 Structure file (tpr, pdb, ...) 1035 1036 selections : list 1037 The list must contain strings or tuples, which must be be one of 1038 the following constructs: 1039 1040 "<1-letter aa code><resid>[:<atom name]" 1041 1042 Selects the CA of the residue or the specified atom 1043 name. 1044 1045 example: ``"S312:OA"`` or ``"A22"`` (equivalent to ``"A22:CA"``) 1046 1047 ("<1-letter aa code><resid>", "<1-letter aa code><resid>, ["<atom name>"]) 1048 1049 Selects a *range* of residues. If only two residue 1050 identifiers are provided then all atoms are 1051 selected. With an optional third atom identifier, 1052 only this atom anme is selected for each residue 1053 in the range. [EXPERIMENTAL] 1054 1055 "@<make_ndx selection>" 1056 1057 The ``@`` letter introduces a verbatim ``make_ndx`` 1058 command. It will apply the given selection without any 1059 further processing or checks. 1060 1061 example: ``"@a 6234 - 6238"`` or ``'@"SOL"'`` (note the quoting) 1062 or ``"@r SER & r 312 & t OA"``. 1063 1064 names : list 1065 Strings to name the selections; if not supplied or if individuals 1066 are ``None`` then a default name is created. 1067 1068 offset : int, dict 1069 This number is added to the resids in the first selection scheme; this 1070 allows names to be the same as in a crystal structure. If offset is a 1071 dict then it is used to directly look up the resids. 1072 1073 ndx : filename or list of filenames 1074 Optional input index file(s). 1075 1076 out_ndx : filename 1077 Output index file. 1078 1079 """ 1080 self.structure = struct 1081 self.ndx = ndx 1082 self.output = out_ndx 1083 self.name_all = name_all 1084 #: This number is added to the resids in the first selection scheme; this 1085 #: allows names to be the same as in a crystal structure. If offset is a 1086 #: dict then it is used to directly look up the resids. Use :meth:`gmx_resid` 1087 #: to transform a crystal resid to a gromacs resid. 1088 #: 1089 #: The attribute may be changed directly after init. 1090 self.offset = offset 1091 1092 #: Auto-labelled groups use this counter. 1093 self.command_counter = 0 1094 1095 if selections is None: 1096 selections = [] 1097 if not utilities.iterable(selections): 1098 selections = [selections] 1099 self.selections = selections 1100 if names is None: 1101 names = [None] * len(selections) 1102 1103 #: Specialized ``make_ndx`` that always uses same structure 1104 #: and redirection (can be overridden) 1105 self.make_ndx = tools.Make_ndx(f=self.structure, n=self.ndx, 1106 stdout=False, stderr=False) 1107 1108 #: dict, keyed by group name and pointing to index file for group 1109 #: (Groups are built in separate files because that is more robust 1110 #: as I can clear groups easily.) 1111 self.indexfiles = dict([self.parse_selection(selection, name) 1112 for selection, name in zip(selections, names)]) 1113 self.names = self.indexfiles.keys()
1114
1115 - def gmx_resid(self, resid):
1116 """Returns resid in the Gromacs index by transforming with offset.""" 1117 try: 1118 gmx_resid = int(self.offset[resid]) 1119 except (TypeError, IndexError): 1120 gmx_resid = resid + self.offset 1121 except KeyError: 1122 raise KeyError("offset must be a dict that contains the gmx resid for %d" % resid) 1123 return gmx_resid
1124
1125 - def combine(self, name_all=None, out_ndx=None, operation='|', defaultgroups=False):
1126 """Combine individual groups into a single one and write output. 1127 1128 :Keywords: 1129 name_all : string 1130 Name of the combined group, ``None`` generates a name. [``None``] 1131 out_ndx : filename 1132 Name of the output file that will contain the individual groups 1133 and the combined group. If ``None`` then default from the class 1134 constructor is used. [``None``] 1135 operation : character 1136 Logical operation that is used to generate the combined group from 1137 the individual groups: "|" (OR) or "&" (AND) ["|"] 1138 defaultgroups : bool 1139 ``True``: append everything to the default groups produced by 1140 :program:`make_ndx` (or rather, the groups provided in the ndx file on 1141 initialization --- if this was ``None`` then these are truly default groups); 1142 ``False``: only use the generated groups 1143 1144 :Returns: 1145 ``(combinedgroup_name, output_ndx)``, a tuple showing the 1146 actual group name and the name of the file; useful when all names are autogenerated. 1147 1148 .. Warning:: The order of the atom numbers in the combined group is 1149 *not* guaranteed to be the same as the selections on input because 1150 ``make_ndx`` sorts them ascending. Thus you should be careful when 1151 using these index files for calculations of angles and dihedrals. 1152 Use :class:`gromacs.formats.NDX` in these cases. 1153 """ 1154 if name_all is None: 1155 name_all = self.name_all 1156 if name_all is None: 1157 name_all = operation.join(self.indexfiles) 1158 if not operation in ('|', '&'): 1159 raise ValueError("Illegal operation %r, only '|' (OR) and '&' (AND) allowed." % 1160 operation) 1161 if out_ndx is None: 1162 out_ndx = self.output 1163 1164 if defaultgroups: 1165 # make a default file (using the original ndx where provided!!) 1166 fd, default_ndx = tempfile.mkstemp(suffix='.ndx', prefix='default__') 1167 try: 1168 self.make_ndx(o=default_ndx, input=['q']) 1169 except: 1170 utilities.unlink_gmx(default_ndx) 1171 raise 1172 ndxfiles = [default_ndx] 1173 else: 1174 ndxfiles = [] 1175 1176 ndxfiles.extend(self.indexfiles.values()) 1177 1178 if len(self.selections) == 1: 1179 # no need to combine selections 1180 try: 1181 cmd = ['', 'q'] 1182 rc,out,err = self.make_ndx(n=ndxfiles, o=out_ndx, input=cmd) 1183 if self._is_empty_group(out): 1184 warnings.warn("No atoms found for %(cmd)r" % vars(), 1185 category=BadParameterWarning) 1186 finally: 1187 if defaultgroups: 1188 utilities.unlink_gmx(default_ndx) 1189 else: 1190 # multiple selections: combine them and name them 1191 try: 1192 fd, tmp_ndx = tempfile.mkstemp(suffix='.ndx', prefix='combined__') 1193 # combine all selections by loading ALL temporary index files 1194 operation = ' '+operation.strip()+' ' 1195 cmd = [operation.join(['"%s"' % gname for gname in self.indexfiles]), 1196 '', 'q'] 1197 rc,out,err = self.make_ndx(n=ndxfiles, o=tmp_ndx, input=cmd) 1198 if self._is_empty_group(out): 1199 warnings.warn("No atoms found for %(cmd)r" % vars(), 1200 category=BadParameterWarning) 1201 1202 # second pass for naming, sigh (or: use NDX ?) 1203 groups = parse_ndxlist(out) 1204 last = groups[-1] 1205 # name this group 1206 name_cmd = ["name %d %s" % (last['nr'], name_all), 1207 'q'] 1208 rc,out,err = self.make_ndx(n=tmp_ndx, o=out_ndx, input=name_cmd) 1209 # For debugging, look at out and err or set stdout=True, stderr=True 1210 # TODO: check out if at least 1 atom selected 1211 ##print "DEBUG: combine()" 1212 ##print out 1213 finally: 1214 utilities.unlink_gmx(tmp_ndx) 1215 if defaultgroups: 1216 utilities.unlink_gmx(default_ndx) 1217 1218 return name_all, out_ndx
1219
1220 - def cat(self, out_ndx=None):
1221 """Concatenate input index files. 1222 1223 Generate a new index file that contains the default Gromacs index 1224 groups (if a structure file was defined) and all index groups from the 1225 input index files. 1226 1227 :Arguments: 1228 out_ndx : filename 1229 Name of the output index file; if ``None`` then use the default 1230 provided to the constructore. [``None``]. 1231 """ 1232 if out_ndx is None: 1233 out_ndx = self.output 1234 self.make_ndx(o=out_ndx, input=['q']) 1235 return out_ndx
1236
1237 - def parse_selection(self, selection, name=None):
1238 """Retuns (groupname, filename) with index group.""" 1239 1240 if type(selection) is tuple: 1241 # range 1242 process = self._process_range 1243 elif selection.startswith('@'): 1244 # verbatim make_ndx command 1245 process = self._process_command 1246 selection = selection[1:] 1247 else: 1248 process = self._process_residue 1249 return process(selection, name)
1250
1251 - def _process_command(self, command, name=None):
1252 """Process ``make_ndx`` command and return name and temp index file.""" 1253 1254 self.command_counter += 1 1255 if name is None: 1256 name = "CMD%03d" % self.command_counter 1257 1258 # Need to build it with two make_ndx calls because I cannot reliably 1259 # name the new group without knowing its number. 1260 try: 1261 fd, tmp_ndx = tempfile.mkstemp(suffix='.ndx', prefix='tmp_'+name+'__') 1262 cmd = [command, '', 'q'] # empty command '' necessary to get list 1263 # This sometimes fails with 'OSError: Broken Pipe' --- hard to debug 1264 rc,out,err = self.make_ndx(o=tmp_ndx, input=cmd) 1265 self.check_output(out, "No atoms found for selection %(command)r." % vars(), err=err) 1266 # For debugging, look at out and err or set stdout=True, stderr=True 1267 # TODO: check ' 0 r_300_&_ALA_&_O : 1 atoms' has at least 1 atom 1268 ##print "DEBUG: _process_command()" 1269 ##print out 1270 groups = parse_ndxlist(out) 1271 last = groups[-1] 1272 # reduce and name this group 1273 fd, ndx = tempfile.mkstemp(suffix='.ndx', prefix=name+'__') 1274 name_cmd = ["keep %d" % last['nr'], 1275 "name 0 %s" % name, 'q'] 1276 rc,out,err = self.make_ndx(n=tmp_ndx, o=ndx, input=name_cmd) 1277 finally: 1278 utilities.unlink_gmx(tmp_ndx) 1279 1280 return name, ndx
1281 1282 #: regular expression to match and parse a residue-atom selection 1283 RESIDUE = re.compile(""" 1284 (?P<aa>([ACDEFGHIKLMNPQRSTVWY]) # 1-letter amino acid 1285 | # or 1286 (\S\S\S) # 3-letter residue name 1287 ) 1288 (?P<resid>\d+) # resid 1289 (: # separator ':' 1290 (?P<atom>\w+) # atom name 1291 )? # possibly one 1292 """, re.VERBOSE) 1293
1294 - def _process_residue(self, selection, name=None):
1295 """Process residue/atom selection and return name and temp index file.""" 1296 1297 if name is None: 1298 name = selection.replace(':', '_') 1299 1300 # XXX: use _translate_residue() .... 1301 m = self.RESIDUE.match(selection) 1302 if not m: 1303 raise ValueError("Selection %(selection)r is not valid." % vars()) 1304 1305 gmx_resid = self.gmx_resid(int(m.group('resid'))) 1306 residue = m.group('aa') 1307 if len(residue) == 1: 1308 gmx_resname = utilities.convert_aa_code(residue) # only works for AA 1309 else: 1310 gmx_resname = residue # use 3-letter for any resname 1311 gmx_atomname = m.group('atom') 1312 if gmx_atomname is None: 1313 gmx_atomname = 'CA' 1314 1315 #: select residue <gmx_resname><gmx_resid> atom <gmx_atomname> 1316 _selection = 'r %(gmx_resid)d & r %(gmx_resname)s & a %(gmx_atomname)s' % vars() 1317 cmd = ['keep 0', 'del 0', 1318 _selection, 1319 'name 0 %(name)s' % vars(), 1320 'q'] 1321 fd, ndx = tempfile.mkstemp(suffix='.ndx', prefix=name+'__') 1322 rc,out,err = self.make_ndx(n=self.ndx, o=ndx, input=cmd) 1323 self.check_output(out, "No atoms found for " 1324 "%(selection)r --> %(_selection)r" % vars()) 1325 # For debugging, look at out and err or set stdout=True, stderr=True 1326 ##print "DEBUG: _process_residue()" 1327 ##print out 1328 1329 return name, ndx
1330
1331 - def _process_range(self, selection, name=None):
1332 """Process a range selection. 1333 1334 ("S234", "A300", "CA") --> selected all CA in this range 1335 ("S234", "A300") --> selected all atoms in this range 1336 1337 .. Note:: Ignores residue type, only cares about the resid (but still required) 1338 """ 1339 1340 try: 1341 first, last, gmx_atomname = selection 1342 except ValueError: 1343 try: 1344 first, last = selection 1345 gmx_atomname = '*' 1346 except: 1347 logger.error("%r is not a valid range selection", selection) 1348 raise 1349 if name is None: 1350 name = "%(first)s-%(last)s_%(gmx_atomname)s" % vars() 1351 1352 _first = self._translate_residue(first, default_atomname=gmx_atomname) 1353 _last = self._translate_residue(last, default_atomname=gmx_atomname) 1354 1355 _selection = 'r %d - %d & & a %s' % (_first['resid'], _last['resid'], gmx_atomname) 1356 cmd = ['keep 0', 'del 0', 1357 _selection, 1358 'name 0 %(name)s' % vars(), 1359 'q'] 1360 fd, ndx = tempfile.mkstemp(suffix='.ndx', prefix=name+'__') 1361 rc,out,err = self.make_ndx(n=self.ndx, o=ndx, input=cmd) 1362 self.check_output(out, "No atoms found for " 1363 "%(selection)r --> %(_selection)r" % vars()) 1364 # For debugging, look at out and err or set stdout=True, stderr=True 1365 ##print "DEBUG: _process_residue()" 1366 ##print out 1367 1368 return name, ndx
1369 1370
1371 - def _translate_residue(self, selection, default_atomname='CA'):
1372 """Translate selection for a single res to make_ndx syntax.""" 1373 m = self.RESIDUE.match(selection) 1374 if not m: 1375 errmsg = "Selection %(selection)r is not valid." % vars() 1376 logger.error(errmsg) 1377 raise ValueError(errmsg) 1378 1379 gmx_resid = self.gmx_resid(int(m.group('resid'))) # magic offset correction 1380 residue = m.group('aa') 1381 if len(residue) == 1: 1382 gmx_resname = utilities.convert_aa_code(residue) # only works for AA 1383 else: 1384 gmx_resname = residue # use 3-letter for any resname 1385 1386 gmx_atomname = m.group('atom') 1387 if gmx_atomname is None: 1388 gmx_atomname = default_atomname 1389 1390 return {'resname':gmx_resname, 'resid':gmx_resid, 'atomname':gmx_atomname}
1391 1392 1393
1394 - def check_output(self, make_ndx_output, message=None, err=None):
1395 """Simple tests to flag problems with a ``make_ndx`` run.""" 1396 if message is None: 1397 message = "" 1398 else: 1399 message = '\n' + message 1400 def format(output, w=60): 1401 hrule = "====[ GromacsError (diagnostic output) ]".ljust(w,"=") 1402 return hrule + '\n' + str(output) + hrule
1403 1404 rc = True 1405 if self._is_empty_group(make_ndx_output): 1406 warnings.warn("Selection produced empty group.%(message)s" 1407 % vars(), category=GromacsValueWarning) 1408 rc = False 1409 if self._has_syntax_error(make_ndx_output): 1410 rc = False 1411 out_formatted = format(make_ndx_output) 1412 raise GromacsError("make_ndx encountered a Syntax Error, " 1413 "%(message)s\noutput:\n%(out_formatted)s" % vars()) 1414 if make_ndx_output.strip() == "": 1415 rc = False 1416 out_formatted = format(err) 1417 raise GromacsError("make_ndx produced no output, " 1418 "%(message)s\nerror output:\n%(out_formatted)s" % vars()) 1419 return rc
1420
1421 - def _is_empty_group(self, make_ndx_output):
1422 m = re.search('Group is empty', make_ndx_output) 1423 return not (m is None)
1424
1425 - def _has_syntax_error(self, make_ndx_output):
1426 m = re.search('Syntax error:', make_ndx_output) 1427 return not (m is None)
1428 1429
1430 - def __del__(self):
1431 try: 1432 for path in self.indexfiles.values(): 1433 utilities.unlink_gmx(path) 1434 # Removes auto-backup files, too (which we have because mkstemp creates 1435 # an empty file and make_ndx backs that up). 1436 except (AttributeError, OSError): 1437 # all exceptions are ignored inside __del__ anyway but these 1438 # two we do not even want to be noticed off: 1439 # AttributeError: when reloading the module, OSError: when file disappeared 1440 pass
1441
1442 1443 -class Transformer(utilities.FileUtils):
1444 """Class to handle transformations of trajectories. 1445 1446 1. Center, compact, and fit to reference structure in tpr 1447 (optionally, only center in the xy plane): :meth:`~Transformer.center_fit` 1448 2. Write compact xtc and tpr with water removed: :meth:`~Transformer.strip_water` 1449 1450 """ 1451
1452 - def __init__(self, s="topol.tpr", f="traj.xtc", n=None, force=None, dirname=os.path.curdir):
1453 """Set up Transformer with structure and trajectory. 1454 1455 Supply *n* = tpr, *f* = xtc (and *n* = ndx) relative to dirname. 1456 1457 :Keywords: 1458 *s* 1459 tpr file (or similar); note that this should not contain 1460 position restraints if it is to be used with a reduced 1461 system (see :meth:`~Transformer.strip_water`) 1462 *f* 1463 trajectory (xtc, trr, ...) 1464 *n* 1465 index file (it is typically safe to leave this as ``None``; in 1466 cases where a trajectory needs to be centered on non-standard 1467 groups this should contain those groups) 1468 *force* 1469 Set the default behaviour for handling existing files: 1470 - ``True``: overwrite existing trajectories 1471 - ``False``: throw a IOError exception 1472 - ``None``: skip existing and log a warning [default] 1473 """ 1474 1475 self.tpr = self.filename(s, ext="tpr", use_my_ext=True) 1476 self.xtc = self.filename(f, ext="xtc", use_my_ext=True) 1477 self.ndx = n 1478 self.dirname = dirname 1479 self.force = force 1480 self.nowater = {} # data for trajectory stripped from water 1481 1482 with utilities.in_dir(self.dirname, create=False): 1483 for f in (self.tpr, self.xtc, self.ndx): 1484 if f is None: 1485 continue 1486 if not os.path.exists(f): 1487 msg = "Possible problem: File %(f)r not found in %(dirname)r." % vars() 1488 warnings.warn(msg, category=MissingDataWarning) 1489 logger.warn(msg)
1490
1491 - def rp(self, *args):
1492 """Return canonical path to file under *dirname* with components *args*""" 1493 return utilities.realpath(self.dirname, *args)
1494
1495 - def center_fit(self, **kwargs):
1496 """Write compact xtc that is fitted to the tpr reference structure. 1497 1498 See :func:gromacs.cbook.trj_fitandcenter` for details and 1499 description of *kwargs*. The most important ones are listed 1500 here but in most cases the defaults should work. 1501 1502 :Keywords: 1503 *s* 1504 Input structure (typically the default tpr file but can be set to 1505 some other file with a different conformation for fitting) 1506 *n* 1507 Alternative index file. 1508 *o* 1509 Name of the output trajectory. 1510 *xy* : Boolean 1511 If ``True`` then only fit in xy-plane (useful for a membrane normal 1512 to z). The default is ``False``. 1513 *force* 1514 - ``True``: overwrite existing trajectories 1515 - ``False``: throw a IOError exception 1516 - ``None``: skip existing and log a warning [default] 1517 1518 :Returns: 1519 dictionary with keys *tpr*, *xtc*, which are the names of the 1520 the new files 1521 """ 1522 kwargs.setdefault('s', self.tpr) 1523 kwargs.setdefault('n', self.ndx) 1524 kwargs['f'] = self.xtc 1525 kwargs.setdefault('o', self.infix_filename(None, self.xtc, '_centfit', 'xtc')) 1526 force = kwargs.pop('force', self.force) 1527 1528 logger.info("Centering and fitting trajectory %(f)r..." % kwargs) 1529 with utilities.in_dir(self.dirname): 1530 if not self.check_file_exists(kwargs['o'], resolve="indicate", force=force): 1531 trj_fitandcenter(**kwargs) 1532 logger.info("Centered and fit trajectory: %(o)r." % kwargs) 1533 return {'tpr': self.rp(kwargs['s']), 'xtc': self.rp(kwargs['o'])}
1534
1535 - def fit(self, xy=False, **kwargs):
1536 """Write xtc that is fitted to the tpr reference structure. 1537 1538 See :func:`gromacs.cbook.trj_xyfitted` for details and 1539 description of *kwargs*. The most important ones are listed 1540 here but in most cases the defaults should work. 1541 1542 :Keywords: 1543 *s* 1544 Input structure (typically the default tpr file but can be set to 1545 some other file with a different conformation for fitting) 1546 *n* 1547 Alternative index file. 1548 *o* 1549 Name of the output trajectory. A default name is created. 1550 If e.g. *dt* = 100 is one of the *kwargs* then the default name includes 1551 "_dt100ps". 1552 *xy* : boolean 1553 If ``True`` then only do a rot+trans fit in the xy plane 1554 (good for membrane simulations); default is ``False``. 1555 *force* 1556 ``True``: overwrite existing trajectories 1557 ``False``: throw a IOError exception 1558 ``None``: skip existing and log a warning [default] 1559 *kwargs* 1560 kwargs are passed to :func:`~gromacs.cbook.trj_xyfitted` 1561 1562 :Returns: 1563 dictionary with keys *tpr*, *xtc*, which are the names of the 1564 the new files 1565 """ 1566 kwargs.setdefault('s', self.tpr) 1567 kwargs.setdefault('n', self.ndx) 1568 kwargs['f'] = self.xtc 1569 force = kwargs.pop('force', self.force) 1570 if xy: 1571 fitmode = 'rotxy+transxy' 1572 kwargs.pop('fit', None) 1573 infix_default = '_fitxy' 1574 else: 1575 fitmode = kwargs.pop('fit', 'rot+trans') # user can use 'progressive', too 1576 infix_default = '_fit' 1577 1578 dt = kwargs.get('dt', None) 1579 if dt: 1580 infix_default += '_dt%dps' % int(dt) # dt in ps 1581 1582 kwargs.setdefault('o', self.infix_filename(None, self.xtc, infix_default, 'xtc')) 1583 1584 logger.info("Fitting trajectory %r to with xy=%r...", kwargs['f'], xy) 1585 with utilities.in_dir(self.dirname): 1586 if self.check_file_exists(kwargs['o'], resolve="indicate", force=force): 1587 logger.warn("File %r exists; force regenerating it with force=True.", kwargs['o']) 1588 else: 1589 trj_xyfitted(fit=fitmode, **kwargs) 1590 logger.info("Fitted trajectory (fitmode=%s): %r.", fitmode, kwargs['o']) 1591 return {'tpr': self.rp(kwargs['s']), 'xtc': self.rp(kwargs['o'])}
1592
1593 - def strip_water(self, os=None, o=None, on=None, compact=False, 1594 resn="SOL", groupname="notwater", **kwargs):
1595 """Write xtc and tpr with water (by resname) removed. 1596 1597 :Keywords: 1598 *os* 1599 Name of the output tpr file; by default use the original but 1600 insert "nowater" before suffix. 1601 *o* 1602 Name of the output trajectory; by default use the original name but 1603 insert "nowater" before suffix. 1604 *on* 1605 Name of a new index file (without water). 1606 *compact* 1607 ``True``: write a compact and centered trajectory 1608 ``False``: use trajectory as it is [``False``] 1609 *resn* 1610 Residue name of the water molecules; all these residues are excluded. 1611 *groupname* 1612 Name of the group that is generated by subtracting all waters 1613 from the system. 1614 *force* : Boolean 1615 - ``True``: overwrite existing trajectories 1616 - ``False``: throw a IOError exception 1617 - ``None``: skip existing and log a warning [default] 1618 *kwargs* 1619 are passed on to :func:`gromacs.cbook.trj_compact` (unless the 1620 values have to be set to certain values such as s, f, n, o 1621 keywords). The *input* keyword is always mangled: Only the first 1622 entry (the group to centre the trajectory on) is kept, and as a 1623 second group (the output group) *groupname* is used. 1624 1625 :Returns: 1626 dictionary with keys *tpr*, *xtc*, *ndx* which are the names of the 1627 the new files 1628 1629 .. warning:: The input tpr file should *not* have *any position restraints*; 1630 otherwise Gromacs will throw a hissy-fit and say 1631 1632 *Software inconsistency error: Position restraint coordinates are 1633 missing* 1634 1635 (This appears to be a bug in Gromacs 4.x.) 1636 """ 1637 force = kwargs.pop('force', self.force) 1638 1639 newtpr = self.infix_filename(os, self.tpr, '_nowater') 1640 newxtc = self.infix_filename(o, self.xtc, '_nowater') 1641 newndx = self.infix_filename(on, self.tpr, '_nowater', 'ndx') 1642 1643 nowater_ndx = "nowater.ndx" # refers to original tpr 1644 1645 if compact: 1646 TRJCONV = trj_compact 1647 _input = kwargs.get('input', ['Protein']) 1648 kwargs['input'] = [_input[0], groupname] # [center group, write-out selection] 1649 del _input 1650 else: 1651 TRJCONV = gromacs.trjconv 1652 kwargs['input'] = [groupname] 1653 1654 NOTwater = "! r %(resn)s" % vars() # make_ndx selection ("not water residues") 1655 with utilities.in_dir(self.dirname): 1656 # ugly because I cannot break from the block 1657 if not self.check_file_exists(newxtc, resolve="indicate", force=force): 1658 # make no-water index 1659 B = IndexBuilder(struct=self.tpr, selections=['@'+NOTwater], names=[groupname], 1660 ndx=self.ndx, out_ndx=nowater_ndx) 1661 B.combine(defaultgroups=True) 1662 1663 logger.info("TPR file without water %(newtpr)r" % vars()) 1664 gromacs.tpbconv(s=self.tpr, o=newtpr, n=nowater_ndx, input=[groupname]) 1665 1666 logger.info("NDX of the new system %(newndx)r" % vars()) 1667 gromacs.make_ndx(f=newtpr, o=newndx, input=['q'], stderr=False, stdout=False) 1668 1669 logger.info("Trajectory without water %(newxtc)r" % vars()) 1670 kwargs['s'] = self.tpr 1671 kwargs['f'] = self.xtc 1672 kwargs['n'] = nowater_ndx 1673 kwargs['o'] = newxtc 1674 TRJCONV(**kwargs) 1675 1676 logger.info("pdb and gro for visualization") 1677 for ext in 'pdb', 'gro': 1678 try: 1679 # see warning in doc ... so we don't use the new xtc but the old one 1680 kwargs['o'] = self.filename(newtpr, ext=ext) 1681 TRJCONV(dump=0, stdout=False, stderr=False, **kwargs) # silent 1682 except: 1683 logger.exception("Failed building the water-less %(ext)s. " 1684 "Position restraints in tpr file (see docs)?" % vars()) 1685 logger.info("strip_water() complete") 1686 1687 self.nowater[newxtc] = Transformer(dirname=self.dirname, s=newtpr, 1688 f=newxtc, n=newndx) 1689 return {'tpr':self.rp(newtpr), 'xtc':self.rp(newxtc), 'ndx':self.rp(newndx)}
1690