Package gromacs :: Module setup
[hide private]
[frames] | no frames]

Source Code for Module gromacs.setup

  1  # GromacsWrapper: cbook.py 
  2  """ 
  3  :mod:`gromacs.setup` -- Setting up a Gromacs MD run 
  4  =================================================== 
  5   
  6  Individual steps such as solvating a structure or energy minimization 
  7  are set up in individual directories. For energy minimization one 
  8  should supply appropriate mdp run input files; otherwise example 
  9  templates are used. 
 10   
 11  .. warning:: 
 12   
 13     You **must** check all simulation parameters for yourself. Do not rely on 
 14     any defaults provided here. The scripts provided here are provided under the 
 15     assumption that you know what you are doing and you just want to automate 
 16     the boring parts of the process. 
 17   
 18   
 19  User functions 
 20  -------------- 
 21   
 22  The individual steps of setting up a simple MD simulation are broken down in a 
 23  sequence of functions that depend on the previous step(s): 
 24   
 25    :func:`topology` 
 26          generate initial topology file (limited functionality, might require 
 27          manual setup) 
 28    :func:`solvate` 
 29          solvate globular protein and add ions to neutralize 
 30    :func:`energy_minimize` 
 31          set up energy minimization and run it (using ``mdrun_d``) 
 32    :func:`MD_restrained` 
 33          set up restrained MD 
 34    :func:`MD` 
 35          set up equilibrium MD 
 36   
 37  Each function uses its own working directory (set with the ``dirname`` keyword 
 38  argument, but it should be safe and convenient to use the defaults). Other 
 39  arguments assume the default locations so typically not much should have to be 
 40  set manually. 
 41   
 42  One can supply non-standard itp files in the topology directory. In 
 43  some cases one does not use the :func:`topology` function at all but 
 44  sets up the topology manually. In this case it is safest to call the 
 45  topology directory ``top`` and make sure that it contains all relevant 
 46  top, itp, and pdb files. 
 47   
 48   
 49  Example 
 50  ------- 
 51   
 52  Run a single protein in a dodecahedral box of SPC water molecules and 
 53  use the GROMOS96 G43a1 force field. We start with the structure in 
 54  ``protein.pdb``:: 
 55   
 56    from gromacs.setup import * 
 57    f1 = topology(protein='MyProtein', struct='protein.pdb', ff='G43a1', water='spc', force=True, ignh=True) 
 58   
 59  Each function returns "interesting" new files in a dictionary in such 
 60  a away that it can often be used as input for the next function in the 
 61  chain (although in most cases one can get away with the defaults of 
 62  the keyword arguments):: 
 63   
 64    f2 = solvate(**f1) 
 65    f3 = energy_minimize(**f2) 
 66   
 67  Now prepare input for a MD run with restraints on the protein:: 
 68   
 69    MD_restrained(**f3) 
 70   
 71  Use the files in the directory to run the simulation locally or on a 
 72  cluster. You can provide your own template for a queuing system 
 73  submission script; see the source code for details. 
 74   
 75  Once the restraint run has completed, use the last frame as input for 
 76  the equilibrium MD:: 
 77   
 78    MD(struct='MD_POSRES/md.gro', runtime=1e5) 
 79   
 80  Run the resulting tpr file on a cluster. 
 81   
 82   
 83  Functions 
 84  --------- 
 85   
 86  The following functions are provided for the user: 
 87   
 88  .. autofunction:: topology 
 89  .. autofunction:: solvate 
 90  .. autofunction:: energy_minimize 
 91  .. autofunction:: MD_restrained 
 92  .. autofunction:: MD 
 93   
 94  Helper functions (mainly of use to developers): 
 95   
 96  .. autofunction:: make_main_index 
 97  .. autofunction:: _setup_MD 
 98   
 99  """ 
100   
101  from __future__ import with_statement 
102   
103  __docformat__ = "restructuredtext en" 
104   
105  import os 
106  import errno 
107  import re 
108  import shutil 
109  import warnings 
110   
111  import logging 
112  logger = logging.getLogger('gromacs.setup') 
113   
114   
115  import gromacs 
116  import gromacs.config as config 
117  from gromacs import GromacsError, GromacsFailureWarning, GromacsValueWarning, \ 
118       AutoCorrectionWarning, BadParameterWarning, UsageWarning, MissingDataError 
119  import gromacs.cbook 
120  from gromacs.cbook import add_mdp_includes 
121  import gromacs.qsub 
122  import gromacs.utilities 
123  from gromacs.utilities import in_dir, realpath, Timedelta, asiterable 
124   
125   
126  #: Concentration of water at standard conditions in mol/L. 
127  #: density at 25 degrees C and 1 atmosphere pressure: rho = 997.0480 g/L. 
128  #: molecular weight: M = 18.015 g/mol 
129  #: c = n/V = m/(V*M) = rho/M  = 55.345 mol/L 
130  CONC_WATER = 55.345 
131   
132  # XXX: This is not used anywhere at the moment: 
133  # parse this table for a usable data structure (and then put it directly in the docs) 
134  recommended_mdp_table = """\ 
135  Table: recommended mdp parameters for different FF 
136  ==========   =========  ================ 
137  mdp          GROMOS     OPLS-AA 
138  ==========   =========  ================ 
139  rvdw         1.4        1.0 
140  rlist        1.4 ?      1.0 
141  ==========   =========  ================ 
142  """ 
143   
144   
145       
146   
147   
148  # TODO: 
149  # - should be part of a class so that we can store the topology etc !!! 
150  #   and also store mainselection 
151   
152 -def topology(struct=None, protein='protein', 153 top='system.top', dirname='top', **pdb2gmx_args):
154 """Build Gromacs topology files from pdb. 155 156 :Keywords: 157 *struct* 158 input structure (**required**) 159 *protein* 160 name of the output files 161 *top* 162 name of the topology file 163 *dirname* 164 directory in which the new topology will be stored 165 *pdb2gmxargs* 166 arguments for ``pdb2gmx`` such as ``ff``, ``water``, ... 167 168 .. note:: 169 At the moment this function simply runs ``pdb2gmx`` and uses 170 the resulting topology file directly. If you want to create 171 more complicated topologies and maybe also use additional itp 172 files or make a protein itp file then you will have to do this 173 manually. 174 """ 175 176 structure = realpath(struct) 177 178 new_struct = protein + '.pdb' 179 posres = protein + '_posres.itp' 180 tmp_top = "tmp.top" 181 182 pdb2gmx_args.update({'f': structure, 'o': new_struct, 'p': tmp_top, 'i': posres}) 183 184 with in_dir(dirname): 185 logger.info("[%(dirname)s] Building topology %(top)r from struct = %(struct)r" % vars()) 186 gromacs.pdb2gmx(**pdb2gmx_args) 187 # need some editing protein_tmp --> system.top and protein.itp 188 # here... for the time being we just copy 189 shutil.copy(tmp_top, top) 190 return {'top': realpath(dirname, top), 'struct': realpath(dirname, new_struct)}
191 192 trj_compact_main = gromacs.tools.Trjconv(ur='compact', center=True, boxcenter='tric', pbc='mol', 193 input=('__main__','system'), 194 doc="Returns a compact representation of the system centered on the __main__ group") 195
196 -def make_main_index(struct, selection='"Protein"', ndx='main.ndx', oldndx=None):
197 """Make index file with the special groups. 198 199 This routine adds the group __main__ and the group __environment__ 200 to the end of the index file. __main__ contains what the user 201 defines as the *central* and *most important* parts of the 202 system. __environment__ is everything else. 203 204 The template mdp file, for instance, uses these two groups for T-coupling. 205 206 These groups are mainly useful if the default groups "Protein" and "Non-Protein" 207 are not appropriate. By using symbolic names such as __main__ one 208 can keep scripts more general. 209 210 :Returns: 211 *groups* is a list of dictionaries that describe the index groups. See 212 :func:`gromacs.cbook.parse_ndxlist` for details. 213 214 :Arguments: 215 *struct* : filename 216 structure (tpr, pdb, gro) 217 *selection* : string 218 is a ``make_ndx`` command such as ``"Protein"`` or ``r DRG`` which 219 determines what is considered the main group for centering etc. It is 220 passed directly to ``make_ndx``. 221 *ndx* : string 222 name of the final index file 223 *oldndx* : string 224 name of index file that should be used as a basis; if None 225 then the ``make_ndx`` default groups are used. 226 227 This routine is very dumb at the moment; maybe some heuristics will be 228 added later as could be other symbolic groups such as __membrane__. 229 """ 230 231 logger.info("Building the main index file %(ndx)r..." % vars()) 232 233 # pass 1: select 234 # empty command '' important to get final list of groups 235 rc,out,nothing = gromacs.make_ndx(f=struct, n=oldndx, o=ndx, stdout=False, stderr=True, 236 input=(selection, '', 'q')) 237 groups = gromacs.cbook.parse_ndxlist(out) 238 last = len(groups) - 1 239 assert last == groups[-1]['nr'] 240 241 # pass 2: 242 # 1) last group is __main__ 243 # 2) __environment__ is everything else (eg SOL, ions, ...) 244 rc,out,nothing = gromacs.make_ndx(f=struct, n=ndx, o=ndx, 245 stdout=False, stderr=True, 246 input=('name %d __main__' % last, 247 '! "__main__"', # is now group last+1 248 'name %d __environment__' % (last+1), 249 '', 'q')) 250 return gromacs.cbook.parse_ndxlist(out)
251 252 253 #: Hard-coded lipid residue names for a ``vdwradii.dat`` file. Use together with 254 #: :data:`~gromacs.setup.vdw_lipid_atoms` in :func:`~gromacs.setup.get_lipid_vdwradii`. 255 vdw_lipid_resnames = ["POPC", "POPE", "POPG", "DOPC", "DPPC", "DLPC", "DMPC", "DPPG"] 256 #: Increased atom radii for lipid atoms; these are simply the standard values from 257 #: ``GMXLIB/vdwradii.dat`` increased by 0.1 nm (C) or 0.05 (N, O, H). 258 vdw_lipid_atoms = {'C': 0.25, 'N': 0.16, 'O': 0.155, 'H': 0.09} 259
260 -def get_lipid_vdwradii(outdir=os.path.curdir, libdir=None):
261 """Find vdwradii.dat and add special entries for lipids. 262 263 See :data:`gromacs.setup.vdw_lipid_resnames` for lipid 264 resnames. Add more if necessary. 265 """ 266 vdwradii_dat = os.path.join(outdir, "vdwradii.dat") 267 268 if not libdir is None: 269 filename = os.path.join(libdir, 'vdwradii.dat') # canonical name 270 if not os.path.exists(filename): 271 msg = 'No VDW database file found in %(filename)r.' % vars() 272 logger.exception(msg) 273 raise OSError(msg, errno.ENOENT) 274 else: 275 try: 276 filename = os.path.join(os.environ['GMXLIB'], 'vdwradii.dat') 277 except KeyError: 278 try: 279 filename = os.path.join(os.environ['GMXDATA'], 'gromacs', 'top', 'vdwradii.dat') 280 except KeyError: 281 msg = "Cannot find vdwradii.dat. Set GMXLIB or GMXDATA." 282 logger.exception(msg) 283 raise OSError(msg, errno.ENOENT) 284 if not os.path.exists(filename): 285 msg = "Cannot find %r; something is wrong with the Gromacs installation." % vars() 286 logger.exception(msg, errno.ENOENT) 287 raise OSError(msg) 288 289 # make sure to catch 3 and 4 letter resnames 290 patterns = vdw_lipid_resnames + list(set([x[:3] for x in vdw_lipid_resnames])) 291 # TODO: should do a tempfile... 292 with open(vdwradii_dat, 'w') as outfile: 293 # write lipid stuff before general 294 outfile.write('; Special larger vdw radii for solvating lipid membranes\n') 295 for resname in patterns: 296 for atom,radius in vdw_lipid_atoms.items(): 297 outfile.write('%(resname)4s %(atom)-5s %(radius)5.3f\n' % vars()) 298 with open(filename, 'r') as infile: 299 for line in infile: 300 outfile.write(line) 301 logger.debug('Created lipid vdW radii file %(vdwradii_dat)r.' % vars()) 302 return realpath(vdwradii_dat)
303
304 -def solvate(struct='top/protein.pdb', top='top/system.top', 305 distance=0.9, boxtype='dodecahedron', 306 concentration=0, cation='NA+', anion='CL-', 307 water='spc', with_membrane=False, 308 ndx = 'main.ndx', mainselection = '"Protein"', 309 dirname='solvate', 310 **kwargs):
311 """Put protein into box, add water, add counter-ions. 312 313 Currently this really only supports solutes in water. If you need 314 to embedd a protein in a membrane then you will require more 315 sophisticated approaches. 316 317 However, you *can* supply a protein already inserted in a 318 bilayer. In this case you will probably want to set *distance* = 319 ``None`` and also enable *with_membrane* = ``True`` (using extra 320 big vdw radii for typical lipids). 321 322 323 :Arguments: 324 *struct* : filename 325 pdb or gro input structure 326 *top* : filename 327 Gromacs topology 328 *distance* : float 329 When solvating with water, make the box big enough so that 330 at least *distance* nm water are between the solute *struct* 331 and the box boundary. 332 Set this to ``None`` in order to use a box size in the input 333 file (gro or pdb). 334 *boxtype* : string 335 Any of the box types supported by :class:`~gromacs.tools.Genbox`. 336 If set to ``None`` it will also ignore *distance* and use the box 337 inside the *struct* file. 338 *concentration* : float 339 Concentration of the free ions in mol/l. Note that counter 340 ions are added in excess of this concentration. 341 *cation* and *anion* : string 342 Molecule names of the ions. This depends on the chosen force field. 343 *water* : string 344 Name of the water model; one of "spc", "spce", "tip3p", 345 "tip4p". This should be appropriate for the chosen force 346 field. If no water is requird, simply supply the path to a box with solvent 347 molecules (used by :func:`gromacs.genbox`'s *cs* argument). 348 *with_membrane* : bool 349 ``True``: use special ``vdwradii.dat`` with 0.1 nm-increased radii on 350 lipids. Default is ``False``. 351 *ndx* : filename 352 The name of the custom index file that is produced here. 353 *mainselection* : string 354 A string that is fed to :class:`~gromacs.tools.Make_ndx` and 355 which should select the solute. 356 *dirname* : directory name 357 Name of the directory in which all files for the solvation stage are stored. 358 *includes* : list of additional directories to add to the mdp include path 359 360 .. Note:: non-water solvents only work if the molecules are named SOL. 361 """ 362 363 structure = realpath(struct) 364 topology = realpath(top) 365 366 # handle additional include directories (kwargs are also modified!) 367 mdp_kwargs = add_mdp_includes(topology, kwargs) 368 369 if water.lower() in ('spc', 'spce'): 370 water = 'spc216' 371 372 # By default, grompp should not choke on a few warnings because at 373 # this stage the user cannot do much about it (can be set to any 374 # value but keep undocumented...) 375 grompp_maxwarn = kwargs.pop('maxwarn',10) 376 377 # should scrub topology but hard to know what to scrub; for 378 # instance, some ions or waters could be part of the crystal structure 379 380 with in_dir(dirname): 381 logger.info("[%(dirname)s] Solvating with water %(water)r..." % vars()) 382 if distance is None or boxtype is None: 383 hasBox = False 384 ext = os.path.splitext(structure)[1] 385 if ext == '.gro': 386 hasBox = True 387 elif ext == '.pdb': 388 with open(structure) as struct: 389 for line in struct: 390 if line.startswith('CRYST'): 391 hasBox = True 392 break 393 if not hasBox: 394 msg = "No box data in the input structure %(structure)r and distance or boxtype is set to None" % vars() 395 logger.exception(msg) 396 raise MissingDataError(msg) 397 distance = boxtype = None # ensures that editconf just converts 398 gromacs.editconf(f=structure, o='boxed.gro', bt=boxtype, d=distance) 399 400 if with_membrane: 401 vdwradii_dat = get_lipid_vdwradii() # need to clean up afterwards 402 logger.info("Using special vdW radii for lipids %r" % vdw_lipid_resnames) 403 404 try: 405 gromacs.genbox(p=topology, cp='boxed.gro', cs=water, o='solvated.gro') 406 except: 407 if with_membrane: 408 # remove so that it's not picked up accidentally 409 gromacs.utilities.unlink_f(vdwradii_dat) 410 raise 411 logger.info("Solvated system with %s", water) 412 413 with open('none.mdp','w') as mdp: 414 mdp.write('; empty mdp file\ninclude = %(include)s\n' % mdp_kwargs) 415 qtotgmx = gromacs.cbook.grompp_qtot(f='none.mdp', o='topol.tpr', c='solvated.gro', 416 p=topology, stdout=False, maxwarn=grompp_maxwarn) 417 qtot = round(qtotgmx) 418 logger.info("[%(dirname)s] After solvation: total charge qtot = %(qtotgmx)r = %(qtot)r" % vars()) 419 420 if concentration != 0: 421 logger.info("[%(dirname)s] Adding ions for c = %(concentration)f M..." % vars()) 422 # target concentration of free ions c ==> 423 # N = N_water * c/c_water 424 # add ions for concentration to the counter ions (counter ions are less free) 425 # 426 # get number of waters (count OW ... works for SPC*, TIP*P water models) 427 rc,output,junk = gromacs.make_ndx(f='topol.tpr', o='ow.ndx', 428 input=('keep 0', 'del 0', 'a OW*', 'name 0 OW', '', 'q'), 429 stdout=False, stderr=True) 430 groups = gromacs.cbook.parse_ndxlist(output) 431 gdict = dict([(g['name'], g) for g in groups]) # overkill... 432 N_water = gdict['OW']['natoms'] # ... but dict lookup is nice 433 N_ions = int(N_water * concentration/CONC_WATER) # number of monovalents 434 else: 435 N_ions = 0 436 437 # neutralize (or try -neutral switch of genion???) 438 n_cation = n_anion = 0 439 if qtot > 0: 440 n_anion = int(abs(qtot)) 441 elif qtot < 0: 442 n_cation = int(abs(qtot)) 443 444 n_cation += N_ions 445 n_anion += N_ions 446 447 if n_cation != 0 or n_anion != 0: 448 # sanity check: 449 assert qtot + n_cation - n_anion < 1e-6 450 logger.info("[%(dirname)s] Adding n_cation = %(n_cation)d and n_anion = %(n_anion)d ions..." % vars()) 451 gromacs.genion(s='topol.tpr', o='ionized.gro', p=topology, 452 pname=cation, nname=anion, np=n_cation, nn=n_anion, 453 input='SOL') 454 else: 455 # fake ionized file ... makes it easier to continue without too much fuzz 456 try: 457 os.unlink('ionized.gro') 458 except OSError, err: 459 if err.errno != errno.ENOENT: 460 raise 461 os.symlink('solvated.gro', 'ionized.gro') 462 463 qtot = gromacs.cbook.grompp_qtot(f='none.mdp', o='ionized.tpr', c='ionized.gro', 464 p=topology, stdout=False, maxwarn=grompp_maxwarn) 465 466 if abs(qtot) > 1e-4: 467 wmsg = "System has non-zero total charge qtot = %(qtot)g e." % vars() 468 warnings.warn(wmsg, category=BadParameterWarning) 469 logger.warn(wmsg) 470 471 # make main index 472 try: 473 make_main_index('ionized.tpr', selection=mainselection, ndx=ndx) 474 except GromacsError, err: 475 # or should I rather fail here? 476 wmsg = "Failed to make main index file %r ... maybe set mainselection='...'.\n"\ 477 "The error message was:\n%s\n" % (ndx, str(err)) 478 logger.warn(wmsg) 479 warnings.warn(wmsg, category=GromacsFailureWarning) 480 try: 481 trj_compact_main(f='ionized.gro', s='ionized.tpr', o='compact.pdb', n=ndx) 482 except GromacsError, err: 483 wmsg = "Failed to make compact pdb for visualization... pressing on regardless. "\ 484 "The error message was:\n%s\n" % str(err) 485 logger.warn(wmsg) 486 warnings.warn(wmsg, category=GromacsFailureWarning) 487 488 return {'qtot': qtot, 489 'struct': realpath(dirname, 'ionized.gro'), 490 'ndx': realpath(dirname, ndx), # not sure why this is propagated-is it used? 491 'mainselection': mainselection, 492 }
493 494
495 -def check_mdpargs(d):
496 """Check if any arguments remain in dict *d*.""" 497 if len(d) > 0: 498 wmsg = "Unprocessed mdp option are interpreted as options for grompp:\n"+str(d) 499 logger.warn(wmsg) 500 warnings.warn(wmsg, category=UsageWarning) 501 return len(d) == 0
502
503 -def energy_minimize(dirname='em', mdp=config.templates['em.mdp'], 504 struct='solvate/ionized.gro', top='top/system.top', 505 output='em.pdb', deffnm="em", 506 mdrunner=None, **kwargs):
507 """Energy minimize the system. 508 509 This sets up the system (creates run input files) and also runs 510 ``mdrun_d``. Thus it can take a while. 511 512 Additional itp files should be in the same directory as the top file. 513 514 Many of the keyword arguments below already have sensible values. 515 516 :Keywords: 517 *dirname* 518 set up under directory dirname [em] 519 *struct* 520 input structure (gro, pdb, ...) [solvate/ionized.gro] 521 *output* 522 output structure (will be put under dirname) [em.pdb] 523 *deffnm* 524 default name for mdrun-related files [em] 525 *top* 526 topology file [top/system.top] 527 *mdp* 528 mdp file (or use the template) [templates/em.mdp] 529 *includes* 530 additional directories to search for itp files 531 *mdrunner* 532 :class:`gromacs.run.MDrunner` class; by defauly we 533 just try :func:`gromacs.mdrun_d` and :func:`gromacs.mdrun` but a 534 MDrunner class gives the user the ability to run mpi jobs 535 etc. [None] 536 *kwargs* 537 remaining key/value pairs that should be changed in the 538 template mdp file, eg ``nstxtcout=250, nstfout=250``. 539 540 .. note:: If :func:`~gromacs.mdrun_d` is not found, the function 541 falls back to :func:`~gromacs.mdrun` instead. 542 """ 543 544 structure = realpath(struct) 545 topology = realpath(top) 546 mdp_template = config.get_template(mdp) 547 deffnm = deffnm.strip() 548 549 # write the processed topology to the default output 550 kwargs.setdefault('pp', 'processed.top') 551 552 # filter some kwargs that might come through when feeding output 553 # from previous stages such as solvate(); necessary because *all* 554 # **kwargs must be *either* substitutions in the mdp file *or* valid 555 # command line parameters for ``grompp``. 556 kwargs.pop('ndx', None) 557 # mainselection is not used but only passed through; right now we 558 # set it to the default that is being used in all argument lists 559 # but that is not pretty. TODO. 560 mainselection = kwargs.pop('mainselection', '"Protein"') 561 # only interesting when passed from solvate() 562 qtot = kwargs.pop('qtot', 0) 563 564 mdp = deffnm+'.mdp' 565 tpr = deffnm+'.tpr' 566 567 logger.info("[%(dirname)s] Energy minimization of struct=%(struct)r, top=%(top)r, mdp=%(mdp)r ..." % vars()) 568 569 add_mdp_includes(topology, kwargs) 570 571 if qtot != 0: 572 # At the moment this is purely user-reported and really only here because 573 # it might get fed into the function when using the keyword-expansion pipeline 574 # usage paradigm. 575 wmsg = "Total charge was reported as qtot = %(qtot)g <> 0; probably a problem." % vars() 576 logger.warn(wmsg) 577 warnings.warn(wmsg, category=BadParameterWarning) 578 579 with in_dir(dirname): 580 unprocessed = gromacs.cbook.edit_mdp(mdp_template, new_mdp=mdp, **kwargs) 581 check_mdpargs(unprocessed) 582 gromacs.grompp(f=mdp, o=tpr, c=structure, p=topology, **unprocessed) 583 mdrun_args = dict(v=True, stepout=10, deffnm=deffnm, c=output) 584 if mdrunner is None: 585 try: 586 gromacs.mdrun_d(**mdrun_args) 587 except (AttributeError, OSError): 588 # fall back to mdrun if no double precision binary 589 wmsg = "No 'mdrun_d' binary found so trying 'mdrun' instead.\n"\ 590 "(Note that energy minimization runs better with mdrun_d.)" 591 logger.warn(wmsg) 592 warnings.warn(wmsg, category=AutoCorrectionWarning) 593 gromacs.mdrun(**mdrun_args) 594 else: 595 # user wants full control and provides simulation.MDrunner **class** 596 # NO CHECKING --- in principle user can supply any callback they like 597 mdrun = mdrunner(**mdrun_args) 598 mdrun.run() 599 600 # em.gro --> gives 'Bad box in file em.gro' warning --- why?? 601 # --> use em.pdb instead. 602 if not os.path.exists(output): 603 errmsg = "Energy minimized system NOT produced." 604 logger.error(errmsg) 605 raise GromacsError(errmsg) 606 final_struct = realpath(output) 607 608 logger.info("[%(dirname)s] energy minimized structure %(final_struct)r" % vars()) 609 return {'struct': final_struct, 610 'top': topology, 611 'mainselection': mainselection, 612 }
613
614 -def em_schedule(**kwargs):
615 """Run multiple ebergy minimizations one after each other. 616 617 :Keywords: 618 *integrators* 619 list of integrators (from 'l-bfgs', 'cg', 'steep') 620 [['bfgs', 'steep']] 621 *nsteps* 622 list of maximum number of steps; one for each integrator in 623 in the *integrators* list [[100,1000]] 624 *kwargs* 625 mostly passed to :func:`gromacs.setup.energy_minimize` 626 627 :Returns: dictionary with paths to final structure ('struct') and 628 other files 629 630 :Example: 631 Conduct three minimizations: 632 1. low memory Broyden-Goldfar-Fletcher-Shannon for 30 steps 633 2. steepest descent for 200 steps 634 3. finish with BFGS for another 30 steps 635 We also do a multi-processor minimization when possible (i.e. for steep (and conjugate 636 gradient) by using a :class:`gromacs.run.MDrunner` class for a 637 :program:`mdrun` executable compiled for OpenMP in 64 bit (see 638 :mod:`gromacs.run` for details). 639 640 import gromacs.run 641 gromacs.setup.em_schedule(struct='solvate/ionized.gro', 642 mdrunner=gromacs.run.MDrunnerOpenMP64, 643 integrators=['l-bfgs', 'steep', 'l-bfgs'], 644 nsteps=[50,200, 50]) 645 646 .. Note:: You might have to prepare the mdp file carefully because at the 647 moment one can only modify the *nsteps* parameter on a 648 per-minimizer basis. 649 """ 650 651 mdrunner = kwargs.pop('mdrunner', None) 652 integrators = kwargs.pop('integrators', ['l-bfgs', 'steep']) 653 kwargs.pop('integrator', None) 654 nsteps = kwargs.pop('nsteps', [100, 1000]) 655 656 outputs = ['em%03d_%s.pdb' % (i,integrator) for i,integrator in enumerate(integrators)] 657 outputs[-1] = kwargs.pop('output', 'em.pdb') 658 659 files = {'struct': kwargs.pop('struct', None)} # fake output from energy_minimize() 660 661 for i, integrator in enumerate(integrators): 662 struct = files['struct'] 663 logger.info("[em %d] energy minimize with %s for maximum %d steps", i, integrator, nsteps[i]) 664 kwargs.update({'struct':struct, 'output':outputs[i], 665 'integrator':integrator, 'nsteps': nsteps[i]}) 666 if not integrator == 'l-bfgs': 667 kwargs['mdrunner'] = mdrunner 668 else: 669 kwargs['mdrunner'] = None 670 logger.warning("[em %d] Not using mdrunner for L-BFGS because it cannot " 671 "do parallel runs.", i) 672 673 files = energy_minimize(**kwargs) 674 675 return files
676 677
678 -def _setup_MD(dirname, 679 deffnm='md', mdp=config.templates['md_OPLSAA.mdp'], 680 struct=None, 681 top='top/system.top', ndx=None, 682 mainselection='"Protein"', 683 qscript=config.qscript_template, qname=None, mdrun_opts="", budget=None, walltime=1/3., 684 dt=0.002, runtime=1e3, **mdp_kwargs):
685 """Generic function to set up a ``mdrun`` MD simulation. 686 687 See the user functions for usage. 688 """ 689 690 if struct is None: 691 raise ValueError('struct must be set to a input structure') 692 structure = realpath(struct) 693 topology = realpath(top) 694 try: 695 index = realpath(ndx) 696 except AttributeError: # (that's what realpath(None) throws...) 697 index = None # None is handled fine below 698 699 qname = mdp_kwargs.pop('sgename', qname) # compatibility for old scripts 700 qscript = mdp_kwargs.pop('sge', qscript) # compatibility for old scripts 701 qscript_template = config.get_template(qscript) 702 mdp_template = config.get_template(mdp) 703 704 nsteps = int(float(runtime)/float(dt)) 705 706 mdp = deffnm + '.mdp' 707 tpr = deffnm + '.tpr' 708 mainindex = deffnm + '.ndx' 709 final_structure = deffnm + '.gro' # guess... really depends on templates,could also be DEFFNM.pdb 710 711 # write the processed topology to the default output 712 mdp_parameters = {'nsteps':nsteps, 'dt':dt, 'pp': 'processed.top'} 713 mdp_parameters.update(mdp_kwargs) 714 715 add_mdp_includes(topology, mdp_parameters) 716 717 with in_dir(dirname): 718 if not (mdp_parameters.get('Tcoupl','').lower() == 'no' or mainselection is None): 719 logger.info("[%(dirname)s] Automatic adjustment of T-coupling groups" % vars()) 720 721 # make index file in almost all cases; with mainselection == None the user 722 # takes FULL control and also has to provide the template or index 723 groups = make_main_index(structure, selection=mainselection, 724 oldndx=index, ndx=mainindex) 725 natoms = dict([(g['name'], float(g['natoms'])) for g in groups]) 726 try: 727 x = natoms['__main__']/natoms['__environment__'] 728 except KeyError: 729 x = 0 # force using SYSTEM in code below 730 wmsg = "Missing __main__ and/or __environment__ index group.\n" \ 731 "This probably means that you have an atypical system. You can " \ 732 "set mainselection=None and provide your own mdp and index files " \ 733 "in order to set up temperature coupling.\n" \ 734 "If no T-coupling is required then set Tcoupl='no'.\n" \ 735 "For now we will just couple everything to 'System'." 736 logger.warn(wmsg) 737 warnings.warn(wmsg, category=AutoCorrectionWarning) 738 if x < 0.1: 739 # TODO: does not handle properly multiple groups in arglist yet 740 tau_t = mdp_parameters.pop('tau_t', 0.1) 741 ref_t = mdp_parameters.pop('ref_t', 300) 742 # combine all in one T-coupling group 743 mdp_parameters['tc-grps'] = 'System' 744 mdp_parameters['tau_t'] = tau_t # this overrides the commandline! 745 mdp_parameters['ref_t'] = ref_t # this overrides the commandline! 746 mdp_parameters['gen-temp'] = ref_t 747 wmsg = "Size of __main__ is only %.1f%% of __environment__ so " \ 748 "we use 'System' for T-coupling and ref_t = %g and " \ 749 "tau_t = %g (can be changed in mdp_parameters).\n" \ 750 % (x * 100, ref_t, tau_t) 751 logger.warn(wmsg) 752 warnings.warn(wmsg, category=AutoCorrectionWarning) 753 index = realpath(mainindex) 754 if mdp_parameters.get('Tcoupl','').lower() == 'no': 755 logger.info("Tcoupl == no: disabling all temperature coupling mdp options") 756 mdp_parameters['tc-grps'] = "" 757 mdp_parameters['tau_t'] = "" 758 mdp_parameters['ref_t'] = "" 759 mdp_parameters['gen-temp'] = "" 760 if mdp_parameters.get('Pcoupl','').lower() == 'no': 761 logger.info("Pcoupl == no: disabling all pressure coupling mdp options") 762 mdp_parameters['tau_p'] = "" 763 mdp_parameters['ref_p'] = "" 764 mdp_parameters['compressibility'] = "" 765 766 unprocessed = gromacs.cbook.edit_mdp(mdp_template, new_mdp=mdp, **mdp_parameters) 767 check_mdpargs(unprocessed) 768 gromacs.grompp(f=mdp, p=topology, c=structure, n=index, o=tpr, **unprocessed) 769 770 runscripts = gromacs.qsub.generate_submit_scripts( 771 qscript_template, deffnm=deffnm, jobname=qname, budget=budget, 772 mdrun_opts=mdrun_opts, walltime=walltime) 773 774 logger.info("[%(dirname)s] All files set up for a run time of %(runtime)g ps " 775 "(dt=%(dt)g, nsteps=%(nsteps)g)" % vars()) 776 777 kwargs = {'struct': realpath(os.path.join(dirname, final_structure)), # guess 778 'top': topology, 779 'ndx': index, # possibly mainindex 780 'qscript': runscripts, 781 'mainselection': mainselection} 782 kwargs.update(mdp_kwargs) # return extra mdp args so that one can use them for prod run 783 kwargs.pop('define', None) # but make sure that -DPOSRES does not stay... 784 return kwargs
785 786
787 -def MD_restrained(dirname='MD_POSRES', **kwargs):
788 """Set up MD with position restraints. 789 790 Additional itp files should be in the same directory as the top file. 791 792 Many of the keyword arguments below already have sensible 793 values. Note that setting *mainselection* = ``None`` will disable 794 many of the automated choices and is often recommended when using 795 your own mdp file. 796 797 :Keywords: 798 *dirname* 799 set up under directory dirname [MD_POSRES] 800 *struct* 801 input structure (gro, pdb, ...) [em/em.pdb] 802 *top* 803 topology file [top/system.top] 804 *mdp* 805 mdp file (or use the template) [templates/md.mdp] 806 *ndx* 807 index file (supply when using a custom mdp) 808 *includes* 809 additional directories to search for itp files 810 *mainselection* 811 `` make_ndx`` selection to select main group ["Protein"] 812 (If ``None`` then no canonical index file is generated and 813 it is the user's responsibility to set *tc_grps*, 814 *tau_t*, and *ref_t* as keyword arguments, or provide the mdp template 815 with all parameter pre-set in *mdp* and probably also your own *ndx* 816 index file.) 817 *deffnm* 818 default filename for Gromacs run [md] 819 *runtime* 820 total length of the simulation in ps [1000] 821 *dt* 822 integration time step in ps [0.002] 823 *qscript* 824 script to submit to the queuing system; by default 825 uses the template :data:`gromacs.config.qscript_template`, which can 826 be manually set to another template from :data:`gromacs.config.templates`; 827 can also be a list of template names. 828 *qname* 829 name to be used for the job in the queuing system [PR_GMX] 830 *mdrun_opts* 831 option flags for the :program:`mdrun` command in the queuing system 832 scripts such as "-stepout 100". [""] 833 *kwargs* 834 remaining key/value pairs that should be changed in the template mdp 835 file, eg ``nstxtcout=250, nstfout=250`` or command line options for 836 ``grompp` such as ``maxwarn=1``. 837 838 In particular one can also set **define** and activate 839 whichever position restraints have been coded into the itp 840 and top file. For instance one could have 841 842 *define* = "-DPOSRES_MainChain -DPOSRES_LIGAND" 843 844 if these preprocessor constructs exist. Note that there 845 **must not be any space between "-D" and the value.** 846 847 By default *define* is set to "-DPOSRES". 848 849 :Returns: a dict that can be fed into :func:`gromacs.setup.MD` 850 (but check, just in case, especially if you want to 851 change the ``define`` parameter in the mdp file) 852 853 .. Note:: The output frequency is drastically reduce for position 854 restraints runs by default. Set the corresponding ``nst*`` 855 variables if you require more output. 856 """ 857 858 logger.info("[%(dirname)s] Setting up MD with position restraints..." % vars()) 859 kwargs.setdefault('struct', 'em/em.pdb') 860 kwargs.setdefault('qname', 'PR_GMX') 861 kwargs.setdefault('define', '-DPOSRES') 862 # reduce size of output files 863 kwargs.setdefault('nstxout', '50000') # trr pos 864 kwargs.setdefault('nstvout', '50000') # trr veloc 865 kwargs.setdefault('nstfout', '0') # trr forces 866 kwargs.setdefault('nstlog', '500') # log file 867 kwargs.setdefault('nstenergy', '2500') # edr energy 868 kwargs.setdefault('nstxtcout', '5000') # xtc pos 869 return _setup_MD(dirname, **kwargs)
870
871 -def MD(dirname='MD', **kwargs):
872 """Set up equilibrium MD. 873 874 Additional itp files should be in the same directory as the top file. 875 876 Many of the keyword arguments below already have sensible 877 values. Note that setting *mainselection*=``None`` will disable 878 many of the automated choices and is often recommended when using 879 your own mdp file. 880 881 :Keywords: 882 *dirname* 883 set up under directory dirname [MD] 884 *struct* 885 input structure (gro, pdb, ...) [MD_POSRES/md_posres.pdb] 886 *top* 887 topology file [top/system.top] 888 *mdp* 889 mdp file (or use the template) [templates/md.mdp] 890 *ndx* 891 index file (supply when using a custom mdp) 892 *includes* 893 additional directories to search for itp files 894 *mainselection* 895 ``make_ndx`` selection to select main group ["Protein"] 896 (If ``None`` then no canonical index file is generated and 897 it is the user's responsibility to set *tc_grps*, 898 *tau_t*, and *ref_t* as keyword arguments, or provide the mdp template 899 with all parameter pre-set in *mdp* and probably also your own *ndx* 900 index file.) 901 *deffnm* 902 default filename for Gromacs run [md] 903 *runtime* 904 total length of the simulation in ps [1000] 905 *dt* 906 integration time step in ps [0.002] 907 *qscript* 908 script to submit to the queuing system; by default 909 uses the template :data:`gromacs.config.qscript_template`, which can 910 be manually set to another template from :data:`gromacs.config.templates`; 911 can also be a list of template names. 912 *qname* 913 name to be used for the job in the queuing system [MD_GMX] 914 *mdrun_opts* 915 option flags for the :program:`mdrun` command in the queuing system 916 scripts such as "-stepout 100 -dgdl". [""] 917 *kwargs* 918 remaining key/value pairs that should be changed in the template mdp 919 file, eg ``nstxtcout=250, nstfout=250`` or command line options for 920 ``grompp` such as ``maxwarn=1``. 921 922 :Returns: a dict that can be fed into :func:`gromacs.setup.MD` 923 (but check, just in case, especially if you want to 924 change the ``define`` parameter in the mdp file) 925 """ 926 927 logger.info("[%(dirname)s] Setting up MD..." % vars()) 928 kwargs.setdefault('struct', 'MD_POSRES/md.gro') 929 kwargs.setdefault('qname', 'MD_GMX') 930 return _setup_MD(dirname, **kwargs)
931 932 933 934 # TODO: autorun (qv MI lipids/setup.pl) 935