Package gromacs :: Module setup
[hide private]
[frames] | no frames]

Source Code for Module gromacs.setup

  1  # GromacsWrapper: cbook.py 
  2  """ 
  3  :mod:`gromacs.setup` -- Setting up a Gromacs MD run 
  4  =================================================== 
  5   
  6  Individual steps such as solvating a structure or energy minimization 
  7  are set up in individual directories. For energy minimization one 
  8  should supply appropriate mdp run input files; otherwise example 
  9  templates are used. 
 10   
 11  .. warning:: 
 12   
 13     You **must** check all simulation parameters for yourself. Do not rely on 
 14     any defaults provided here. The scripts provided here are provided under the 
 15     assumption that you know what you are doing and you just want to automate 
 16     the boring parts of the process. 
 17   
 18   
 19  User functions 
 20  -------------- 
 21   
 22  The individual steps of setting up a simple MD simulation are broken down in a 
 23  sequence of functions that depend on the previous step(s): 
 24   
 25    :func:`topology` 
 26          generate initial topology file (limited functionality, might require 
 27          manual setup) 
 28    :func:`solvate` 
 29          solvate globular protein and add ions to neutralize 
 30    :func:`energy_minimize` 
 31          set up energy minimization and run it (using ``mdrun_d``) 
 32    :func:`em_schedule` 
 33          set up and run multiple energy minimizations one after another (as an 
 34          alternative to the simple single energy minimization provided by 
 35          :func:`energy_minimize`) 
 36    :func:`MD_restrained` 
 37          set up restrained MD 
 38    :func:`MD` 
 39          set up equilibrium MD 
 40   
 41  Each function uses its own working directory (set with the ``dirname`` keyword 
 42  argument, but it should be safe and convenient to use the defaults). Other 
 43  arguments assume the default locations so typically not much should have to be 
 44  set manually. 
 45   
 46  One can supply non-standard itp files in the topology directory. In 
 47  some cases one does not use the :func:`topology` function at all but 
 48  sets up the topology manually. In this case it is safest to call the 
 49  topology directory ``top`` and make sure that it contains all relevant 
 50  top, itp, and pdb files. 
 51   
 52   
 53  Example 
 54  ------- 
 55   
 56  Run a single protein in a dodecahedral box of SPC water molecules and 
 57  use the GROMOS96 G43a1 force field. We start with the structure in 
 58  ``protein.pdb``:: 
 59   
 60    from gromacs.setup import * 
 61    f1 = topology(protein='MyProtein', struct='protein.pdb', ff='G43a1', water='spc', force=True, ignh=True) 
 62   
 63  Each function returns "interesting" new files in a dictionary in such 
 64  a away that it can often be used as input for the next function in the 
 65  chain (although in most cases one can get away with the defaults of 
 66  the keyword arguments):: 
 67   
 68    f2 = solvate(**f1) 
 69    f3 = energy_minimize(**f2) 
 70   
 71  Now prepare input for a MD run with restraints on the protein:: 
 72   
 73    MD_restrained(**f3) 
 74   
 75  Use the files in the directory to run the simulation locally or on a 
 76  cluster. You can provide your own template for a queuing system 
 77  submission script; see the source code for details. 
 78   
 79  Once the restraint run has completed, use the last frame as input for 
 80  the equilibrium MD:: 
 81   
 82    MD(struct='MD_POSRES/md.gro', runtime=1e5) 
 83   
 84  Run the resulting tpr file on a cluster. 
 85   
 86   
 87  User functions 
 88  -------------- 
 89   
 90  The following functions are provided for the user: 
 91   
 92  .. autofunction:: topology 
 93  .. autofunction:: solvate 
 94  .. autofunction:: energy_minimize 
 95  .. autofunction:: em_schedule 
 96  .. autofunction:: MD_restrained 
 97  .. autofunction:: MD 
 98   
 99  Helper functions 
100  ---------------- 
101   
102  The following functions are used under the hood and are mainly useful when 
103  writing extensions to the module. 
104   
105  .. autofunction:: make_main_index 
106  .. autofunction:: check_mdpargs 
107  .. autofunction:: get_lipid_vdwradii 
108  .. autofunction:: _setup_MD 
109   
110  Defined constants: 
111   
112  .. autodata:: CONC_WATER 
113  .. autodata:: vdw_lipid_resnames 
114  .. autodata:: vdw_lipid_atom_radii 
115   
116  """ 
117   
118  from __future__ import with_statement 
119   
120  __docformat__ = "restructuredtext en" 
121   
122  import os 
123  import errno 
124  import re 
125  import shutil 
126  import warnings 
127   
128  import logging 
129  logger = logging.getLogger('gromacs.setup') 
130   
131   
132  import gromacs 
133  import gromacs.config as config 
134  from gromacs import GromacsError, GromacsFailureWarning, GromacsValueWarning, \ 
135       AutoCorrectionWarning, BadParameterWarning, UsageWarning, MissingDataError 
136  import gromacs.cbook 
137  from gromacs.cbook import add_mdp_includes 
138  import gromacs.qsub 
139  import gromacs.utilities 
140  from gromacs.utilities import in_dir, realpath, Timedelta, asiterable 
141   
142   
143  #: Concentration of water at standard conditions in mol/L. 
144  #: Density at 25 degrees C and 1 atmosphere pressure: rho = 997.0480 g/L. 
145  #: Molecular weight: M = 18.015 g/mol. 
146  #: c = n/V = m/(V*M) = rho/M  = 55.345 mol/L. 
147  CONC_WATER = 55.345 
148   
149  # XXX: This is not used anywhere at the moment: 
150  # parse this table for a usable data structure (and then put it directly in the docs) 
151  recommended_mdp_table = """\ 
152  Table: recommended mdp parameters for different FF 
153  ==========   =========  ================ 
154  mdp          GROMOS     OPLS-AA 
155  ==========   =========  ================ 
156  rvdw         1.4        1.0 
157  rlist        1.4 ?      1.0 
158  ==========   =========  ================ 
159  """ 
160   
161   
162       
163   
164   
165  # TODO: 
166  # - should be part of a class so that we can store the topology etc !!! 
167  #   and also store mainselection 
168   
169 -def topology(struct=None, protein='protein', 170 top='system.top', dirname='top', **pdb2gmx_args):
171 """Build Gromacs topology files from pdb. 172 173 :Keywords: 174 *struct* 175 input structure (**required**) 176 *protein* 177 name of the output files 178 *top* 179 name of the topology file 180 *dirname* 181 directory in which the new topology will be stored 182 *pdb2gmxargs* 183 arguments for ``pdb2gmx`` such as ``ff``, ``water``, ... 184 185 .. note:: 186 At the moment this function simply runs ``pdb2gmx`` and uses 187 the resulting topology file directly. If you want to create 188 more complicated topologies and maybe also use additional itp 189 files or make a protein itp file then you will have to do this 190 manually. 191 """ 192 193 structure = realpath(struct) 194 195 new_struct = protein + '.pdb' 196 posres = protein + '_posres.itp' 197 tmp_top = "tmp.top" 198 199 pdb2gmx_args.update({'f': structure, 'o': new_struct, 'p': tmp_top, 'i': posres}) 200 201 with in_dir(dirname): 202 logger.info("[%(dirname)s] Building topology %(top)r from struct = %(struct)r" % vars()) 203 gromacs.pdb2gmx(**pdb2gmx_args) 204 # need some editing protein_tmp --> system.top and protein.itp 205 # here... for the time being we just copy 206 shutil.copy(tmp_top, top) 207 return {'top': realpath(dirname, top), 'struct': realpath(dirname, new_struct)}
208 209 trj_compact_main = gromacs.tools.Trjconv(ur='compact', center=True, boxcenter='tric', pbc='mol', 210 input=('__main__','system'), 211 doc="Returns a compact representation of the system centered on the __main__ group") 212
213 -def make_main_index(struct, selection='"Protein"', ndx='main.ndx', oldndx=None):
214 """Make index file with the special groups. 215 216 This routine adds the group __main__ and the group __environment__ 217 to the end of the index file. __main__ contains what the user 218 defines as the *central* and *most important* parts of the 219 system. __environment__ is everything else. 220 221 The template mdp file, for instance, uses these two groups for T-coupling. 222 223 These groups are mainly useful if the default groups "Protein" and "Non-Protein" 224 are not appropriate. By using symbolic names such as __main__ one 225 can keep scripts more general. 226 227 :Returns: 228 *groups* is a list of dictionaries that describe the index groups. See 229 :func:`gromacs.cbook.parse_ndxlist` for details. 230 231 :Arguments: 232 *struct* : filename 233 structure (tpr, pdb, gro) 234 *selection* : string 235 is a ``make_ndx`` command such as ``"Protein"`` or ``r DRG`` which 236 determines what is considered the main group for centering etc. It is 237 passed directly to ``make_ndx``. 238 *ndx* : string 239 name of the final index file 240 *oldndx* : string 241 name of index file that should be used as a basis; if None 242 then the ``make_ndx`` default groups are used. 243 244 This routine is very dumb at the moment; maybe some heuristics will be 245 added later as could be other symbolic groups such as __membrane__. 246 """ 247 248 logger.info("Building the main index file %(ndx)r..." % vars()) 249 250 # pass 1: select 251 # empty command '' important to get final list of groups 252 rc,out,nothing = gromacs.make_ndx(f=struct, n=oldndx, o=ndx, stdout=False, stderr=True, 253 input=(selection, '', 'q')) 254 groups = gromacs.cbook.parse_ndxlist(out) 255 last = len(groups) - 1 256 assert last == groups[-1]['nr'] 257 258 # pass 2: 259 # 1) last group is __main__ 260 # 2) __environment__ is everything else (eg SOL, ions, ...) 261 rc,out,nothing = gromacs.make_ndx(f=struct, n=ndx, o=ndx, 262 stdout=False, stderr=True, 263 input=('name %d __main__' % last, 264 '! "__main__"', # is now group last+1 265 'name %d __environment__' % (last+1), 266 '', 'q')) 267 return gromacs.cbook.parse_ndxlist(out)
268 269 270 #: Hard-coded lipid residue names for a ``vdwradii.dat`` file. Use together with 271 #: :data:`~gromacs.setup.vdw_lipid_atom_radii` in :func:`~gromacs.setup.get_lipid_vdwradii`. 272 vdw_lipid_resnames = ["POPC", "POPE", "POPG", "DOPC", "DPPC", "DLPC", "DMPC", "DPPG"] 273 #: Increased atom radii for lipid atoms; these are simply the standard values from 274 #: ``GMXLIB/vdwradii.dat`` increased by 0.1 nm (C) or 0.05 nm (N, O, H). 275 vdw_lipid_atom_radii = {'C': 0.25, 'N': 0.16, 'O': 0.155, 'H': 0.09} 276
277 -def get_lipid_vdwradii(outdir=os.path.curdir, libdir=None):
278 """Find vdwradii.dat and add special entries for lipids. 279 280 See :data:`gromacs.setup.vdw_lipid_resnames` for lipid 281 resnames. Add more if necessary. 282 """ 283 vdwradii_dat = os.path.join(outdir, "vdwradii.dat") 284 285 if not libdir is None: 286 filename = os.path.join(libdir, 'vdwradii.dat') # canonical name 287 if not os.path.exists(filename): 288 msg = 'No VDW database file found in %(filename)r.' % vars() 289 logger.exception(msg) 290 raise OSError(msg, errno.ENOENT) 291 else: 292 try: 293 filename = os.path.join(os.environ['GMXLIB'], 'vdwradii.dat') 294 except KeyError: 295 try: 296 filename = os.path.join(os.environ['GMXDATA'], 'gromacs', 'top', 'vdwradii.dat') 297 except KeyError: 298 msg = "Cannot find vdwradii.dat. Set GMXLIB or GMXDATA." 299 logger.exception(msg) 300 raise OSError(msg, errno.ENOENT) 301 if not os.path.exists(filename): 302 msg = "Cannot find %r; something is wrong with the Gromacs installation." % vars() 303 logger.exception(msg, errno.ENOENT) 304 raise OSError(msg) 305 306 # make sure to catch 3 and 4 letter resnames 307 patterns = vdw_lipid_resnames + list(set([x[:3] for x in vdw_lipid_resnames])) 308 # TODO: should do a tempfile... 309 with open(vdwradii_dat, 'w') as outfile: 310 # write lipid stuff before general 311 outfile.write('; Special larger vdw radii for solvating lipid membranes\n') 312 for resname in patterns: 313 for atom,radius in vdw_lipid_atom_radii.items(): 314 outfile.write('%(resname)4s %(atom)-5s %(radius)5.3f\n' % vars()) 315 with open(filename, 'r') as infile: 316 for line in infile: 317 outfile.write(line) 318 logger.debug('Created lipid vdW radii file %(vdwradii_dat)r.' % vars()) 319 return realpath(vdwradii_dat)
320
321 -def solvate(struct='top/protein.pdb', top='top/system.top', 322 distance=0.9, boxtype='dodecahedron', 323 concentration=0, cation='NA+', anion='CL-', 324 water='spc', with_membrane=False, 325 ndx = 'main.ndx', mainselection = '"Protein"', 326 dirname='solvate', 327 **kwargs):
328 """Put protein into box, add water, add counter-ions. 329 330 Currently this really only supports solutes in water. If you need 331 to embedd a protein in a membrane then you will require more 332 sophisticated approaches. 333 334 However, you *can* supply a protein already inserted in a 335 bilayer. In this case you will probably want to set *distance* = 336 ``None`` and also enable *with_membrane* = ``True`` (using extra 337 big vdw radii for typical lipids). 338 339 340 :Arguments: 341 *struct* : filename 342 pdb or gro input structure 343 *top* : filename 344 Gromacs topology 345 *distance* : float 346 When solvating with water, make the box big enough so that 347 at least *distance* nm water are between the solute *struct* 348 and the box boundary. 349 Set this to ``None`` in order to use a box size in the input 350 file (gro or pdb). 351 *boxtype* : string 352 Any of the box types supported by :class:`~gromacs.tools.Genbox`. 353 If set to ``None`` it will also ignore *distance* and use the box 354 inside the *struct* file. 355 *concentration* : float 356 Concentration of the free ions in mol/l. Note that counter 357 ions are added in excess of this concentration. 358 *cation* and *anion* : string 359 Molecule names of the ions. This depends on the chosen force field. 360 *water* : string 361 Name of the water model; one of "spc", "spce", "tip3p", 362 "tip4p". This should be appropriate for the chosen force 363 field. If no water is requird, simply supply the path to a box with solvent 364 molecules (used by :func:`gromacs.genbox`'s *cs* argument). 365 *with_membrane* : bool 366 ``True``: use special ``vdwradii.dat`` with 0.1 nm-increased radii on 367 lipids. Default is ``False``. 368 *ndx* : filename 369 The name of the custom index file that is produced here. 370 *mainselection* : string 371 A string that is fed to :class:`~gromacs.tools.Make_ndx` and 372 which should select the solute. 373 *dirname* : directory name 374 Name of the directory in which all files for the solvation stage are stored. 375 *includes* : list of additional directories to add to the mdp include path 376 377 .. Note:: non-water solvents only work if the molecules are named SOL. 378 """ 379 380 structure = realpath(struct) 381 topology = realpath(top) 382 383 # handle additional include directories (kwargs are also modified!) 384 mdp_kwargs = add_mdp_includes(topology, kwargs) 385 386 if water.lower() in ('spc', 'spce'): 387 water = 'spc216' 388 389 # By default, grompp should not choke on a few warnings because at 390 # this stage the user cannot do much about it (can be set to any 391 # value but keep undocumented...) 392 grompp_maxwarn = kwargs.pop('maxwarn',10) 393 394 # should scrub topology but hard to know what to scrub; for 395 # instance, some ions or waters could be part of the crystal structure 396 397 with in_dir(dirname): 398 logger.info("[%(dirname)s] Solvating with water %(water)r..." % vars()) 399 if distance is None or boxtype is None: 400 hasBox = False 401 ext = os.path.splitext(structure)[1] 402 if ext == '.gro': 403 hasBox = True 404 elif ext == '.pdb': 405 with open(structure) as struct: 406 for line in struct: 407 if line.startswith('CRYST'): 408 hasBox = True 409 break 410 if not hasBox: 411 msg = "No box data in the input structure %(structure)r and distance or boxtype is set to None" % vars() 412 logger.exception(msg) 413 raise MissingDataError(msg) 414 distance = boxtype = None # ensures that editconf just converts 415 gromacs.editconf(f=structure, o='boxed.gro', bt=boxtype, d=distance) 416 417 if with_membrane: 418 vdwradii_dat = get_lipid_vdwradii() # need to clean up afterwards 419 logger.info("Using special vdW radii for lipids %r" % vdw_lipid_resnames) 420 421 try: 422 gromacs.genbox(p=topology, cp='boxed.gro', cs=water, o='solvated.gro') 423 except: 424 if with_membrane: 425 # remove so that it's not picked up accidentally 426 gromacs.utilities.unlink_f(vdwradii_dat) 427 raise 428 logger.info("Solvated system with %s", water) 429 430 with open('none.mdp','w') as mdp: 431 mdp.write('; empty mdp file\ninclude = %(include)s\n' % mdp_kwargs) 432 qtotgmx = gromacs.cbook.grompp_qtot(f='none.mdp', o='topol.tpr', c='solvated.gro', 433 p=topology, stdout=False, maxwarn=grompp_maxwarn) 434 qtot = round(qtotgmx) 435 logger.info("[%(dirname)s] After solvation: total charge qtot = %(qtotgmx)r = %(qtot)r" % vars()) 436 437 if concentration != 0: 438 logger.info("[%(dirname)s] Adding ions for c = %(concentration)f M..." % vars()) 439 # target concentration of free ions c ==> 440 # N = N_water * c/c_water 441 # add ions for concentration to the counter ions (counter ions are less free) 442 # 443 # get number of waters (count OW ... works for SPC*, TIP*P water models) 444 rc,output,junk = gromacs.make_ndx(f='topol.tpr', o='ow.ndx', 445 input=('keep 0', 'del 0', 'a OW*', 'name 0 OW', '', 'q'), 446 stdout=False, stderr=True) 447 groups = gromacs.cbook.parse_ndxlist(output) 448 gdict = dict([(g['name'], g) for g in groups]) # overkill... 449 N_water = gdict['OW']['natoms'] # ... but dict lookup is nice 450 N_ions = int(N_water * concentration/CONC_WATER) # number of monovalents 451 else: 452 N_ions = 0 453 454 # neutralize (or try -neutral switch of genion???) 455 n_cation = n_anion = 0 456 if qtot > 0: 457 n_anion = int(abs(qtot)) 458 elif qtot < 0: 459 n_cation = int(abs(qtot)) 460 461 n_cation += N_ions 462 n_anion += N_ions 463 464 if n_cation != 0 or n_anion != 0: 465 # sanity check: 466 assert qtot + n_cation - n_anion < 1e-6 467 logger.info("[%(dirname)s] Adding n_cation = %(n_cation)d and n_anion = %(n_anion)d ions..." % vars()) 468 gromacs.genion(s='topol.tpr', o='ionized.gro', p=topology, 469 pname=cation, nname=anion, np=n_cation, nn=n_anion, 470 input='SOL') 471 else: 472 # fake ionized file ... makes it easier to continue without too much fuzz 473 try: 474 os.unlink('ionized.gro') 475 except OSError, err: 476 if err.errno != errno.ENOENT: 477 raise 478 os.symlink('solvated.gro', 'ionized.gro') 479 480 qtot = gromacs.cbook.grompp_qtot(f='none.mdp', o='ionized.tpr', c='ionized.gro', 481 p=topology, stdout=False, maxwarn=grompp_maxwarn) 482 483 if abs(qtot) > 1e-4: 484 wmsg = "System has non-zero total charge qtot = %(qtot)g e." % vars() 485 warnings.warn(wmsg, category=BadParameterWarning) 486 logger.warn(wmsg) 487 488 # make main index 489 try: 490 make_main_index('ionized.tpr', selection=mainselection, ndx=ndx) 491 except GromacsError, err: 492 # or should I rather fail here? 493 wmsg = "Failed to make main index file %r ... maybe set mainselection='...'.\n"\ 494 "The error message was:\n%s\n" % (ndx, str(err)) 495 logger.warn(wmsg) 496 warnings.warn(wmsg, category=GromacsFailureWarning) 497 try: 498 trj_compact_main(f='ionized.gro', s='ionized.tpr', o='compact.pdb', n=ndx) 499 except GromacsError, err: 500 wmsg = "Failed to make compact pdb for visualization... pressing on regardless. "\ 501 "The error message was:\n%s\n" % str(err) 502 logger.warn(wmsg) 503 warnings.warn(wmsg, category=GromacsFailureWarning) 504 505 return {'qtot': qtot, 506 'struct': realpath(dirname, 'ionized.gro'), 507 'ndx': realpath(dirname, ndx), # not sure why this is propagated-is it used? 508 'mainselection': mainselection, 509 }
510 511
512 -def check_mdpargs(d):
513 """Check if any arguments remain in dict *d*.""" 514 if len(d) > 0: 515 wmsg = "Unprocessed mdp option are interpreted as options for grompp:\n"+str(d) 516 logger.warn(wmsg) 517 warnings.warn(wmsg, category=UsageWarning) 518 return len(d) == 0
519
520 -def energy_minimize(dirname='em', mdp=config.templates['em.mdp'], 521 struct='solvate/ionized.gro', top='top/system.top', 522 output='em.pdb', deffnm="em", 523 mdrunner=None, **kwargs):
524 """Energy minimize the system. 525 526 This sets up the system (creates run input files) and also runs 527 ``mdrun_d``. Thus it can take a while. 528 529 Additional itp files should be in the same directory as the top file. 530 531 Many of the keyword arguments below already have sensible values. 532 533 :Keywords: 534 *dirname* 535 set up under directory dirname [em] 536 *struct* 537 input structure (gro, pdb, ...) [solvate/ionized.gro] 538 *output* 539 output structure (will be put under dirname) [em.pdb] 540 *deffnm* 541 default name for mdrun-related files [em] 542 *top* 543 topology file [top/system.top] 544 *mdp* 545 mdp file (or use the template) [templates/em.mdp] 546 *includes* 547 additional directories to search for itp files 548 *mdrunner* 549 :class:`gromacs.run.MDrunner` class; by defauly we 550 just try :func:`gromacs.mdrun_d` and :func:`gromacs.mdrun` but a 551 MDrunner class gives the user the ability to run mpi jobs 552 etc. [None] 553 *kwargs* 554 remaining key/value pairs that should be changed in the 555 template mdp file, eg ``nstxtcout=250, nstfout=250``. 556 557 .. note:: If :func:`~gromacs.mdrun_d` is not found, the function 558 falls back to :func:`~gromacs.mdrun` instead. 559 """ 560 561 structure = realpath(struct) 562 topology = realpath(top) 563 mdp_template = config.get_template(mdp) 564 deffnm = deffnm.strip() 565 566 # write the processed topology to the default output 567 kwargs.setdefault('pp', 'processed.top') 568 569 # filter some kwargs that might come through when feeding output 570 # from previous stages such as solvate(); necessary because *all* 571 # **kwargs must be *either* substitutions in the mdp file *or* valid 572 # command line parameters for ``grompp``. 573 kwargs.pop('ndx', None) 574 # mainselection is not used but only passed through; right now we 575 # set it to the default that is being used in all argument lists 576 # but that is not pretty. TODO. 577 mainselection = kwargs.pop('mainselection', '"Protein"') 578 # only interesting when passed from solvate() 579 qtot = kwargs.pop('qtot', 0) 580 581 mdp = deffnm+'.mdp' 582 tpr = deffnm+'.tpr' 583 584 logger.info("[%(dirname)s] Energy minimization of struct=%(struct)r, top=%(top)r, mdp=%(mdp)r ..." % vars()) 585 586 add_mdp_includes(topology, kwargs) 587 588 if qtot != 0: 589 # At the moment this is purely user-reported and really only here because 590 # it might get fed into the function when using the keyword-expansion pipeline 591 # usage paradigm. 592 wmsg = "Total charge was reported as qtot = %(qtot)g <> 0; probably a problem." % vars() 593 logger.warn(wmsg) 594 warnings.warn(wmsg, category=BadParameterWarning) 595 596 with in_dir(dirname): 597 unprocessed = gromacs.cbook.edit_mdp(mdp_template, new_mdp=mdp, **kwargs) 598 check_mdpargs(unprocessed) 599 gromacs.grompp(f=mdp, o=tpr, c=structure, p=topology, **unprocessed) 600 mdrun_args = dict(v=True, stepout=10, deffnm=deffnm, c=output) 601 if mdrunner is None: 602 try: 603 gromacs.mdrun_d(**mdrun_args) 604 except (AttributeError, OSError): 605 # fall back to mdrun if no double precision binary 606 wmsg = "No 'mdrun_d' binary found so trying 'mdrun' instead.\n"\ 607 "(Note that energy minimization runs better with mdrun_d.)" 608 logger.warn(wmsg) 609 warnings.warn(wmsg, category=AutoCorrectionWarning) 610 gromacs.mdrun(**mdrun_args) 611 else: 612 # user wants full control and provides simulation.MDrunner **class** 613 # NO CHECKING --- in principle user can supply any callback they like 614 mdrun = mdrunner(**mdrun_args) 615 mdrun.run() 616 617 # em.gro --> gives 'Bad box in file em.gro' warning --- why?? 618 # --> use em.pdb instead. 619 if not os.path.exists(output): 620 errmsg = "Energy minimized system NOT produced." 621 logger.error(errmsg) 622 raise GromacsError(errmsg) 623 final_struct = realpath(output) 624 625 logger.info("[%(dirname)s] energy minimized structure %(final_struct)r" % vars()) 626 return {'struct': final_struct, 627 'top': topology, 628 'mainselection': mainselection, 629 }
630
631 -def em_schedule(**kwargs):
632 """Run multiple energy minimizations one after each other. 633 634 :Keywords: 635 *integrators* 636 list of integrators (from 'l-bfgs', 'cg', 'steep') 637 [['bfgs', 'steep']] 638 *nsteps* 639 list of maximum number of steps; one for each integrator in 640 in the *integrators* list [[100,1000]] 641 *kwargs* 642 mostly passed to :func:`gromacs.setup.energy_minimize` 643 644 :Returns: dictionary with paths to final structure ('struct') and 645 other files 646 647 :Example: 648 Conduct three minimizations: 649 1. low memory Broyden-Goldfarb-Fletcher-Shannon (BFGS) for 30 steps 650 2. steepest descent for 200 steps 651 3. finish with BFGS for another 30 steps 652 We also do a multi-processor minimization when possible (i.e. for steep 653 (and conjugate gradient) by using a :class:`gromacs.run.MDrunner` class 654 for a :program:`mdrun` executable compiled for OpenMP in 64 bit (see 655 :mod:`gromacs.run` for details):: 656 657 import gromacs.run 658 gromacs.setup.em_schedule(struct='solvate/ionized.gro', 659 mdrunner=gromacs.run.MDrunnerOpenMP64, 660 integrators=['l-bfgs', 'steep', 'l-bfgs'], 661 nsteps=[50,200, 50]) 662 663 .. Note:: You might have to prepare the mdp file carefully because at the 664 moment one can only modify the *nsteps* parameter on a 665 per-minimizer basis. 666 """ 667 668 mdrunner = kwargs.pop('mdrunner', None) 669 integrators = kwargs.pop('integrators', ['l-bfgs', 'steep']) 670 kwargs.pop('integrator', None) # clean input; we set intgerator from integrators 671 nsteps = kwargs.pop('nsteps', [100, 1000]) 672 673 outputs = ['em%03d_%s.pdb' % (i,integrator) for i,integrator in enumerate(integrators)] 674 outputs[-1] = kwargs.pop('output', 'em.pdb') 675 676 files = {'struct': kwargs.pop('struct', None)} # fake output from energy_minimize() 677 678 for i, integrator in enumerate(integrators): 679 struct = files['struct'] 680 logger.info("[em %d] energy minimize with %s for maximum %d steps", i, integrator, nsteps[i]) 681 kwargs.update({'struct':struct, 'output':outputs[i], 682 'integrator':integrator, 'nsteps': nsteps[i]}) 683 if not integrator == 'l-bfgs': 684 kwargs['mdrunner'] = mdrunner 685 else: 686 kwargs['mdrunner'] = None 687 logger.warning("[em %d] Not using mdrunner for L-BFGS because it cannot " 688 "do parallel runs.", i) 689 690 files = energy_minimize(**kwargs) 691 692 return files
693 694
695 -def _setup_MD(dirname, 696 deffnm='md', mdp=config.templates['md_OPLSAA.mdp'], 697 struct=None, 698 top='top/system.top', ndx=None, 699 mainselection='"Protein"', 700 qscript=config.qscript_template, qname=None, mdrun_opts="", budget=None, walltime=1/3., 701 dt=0.002, runtime=1e3, **mdp_kwargs):
702 """Generic function to set up a ``mdrun`` MD simulation. 703 704 See the user functions for usage. 705 """ 706 707 if struct is None: 708 raise ValueError('struct must be set to a input structure') 709 structure = realpath(struct) 710 topology = realpath(top) 711 try: 712 index = realpath(ndx) 713 except AttributeError: # (that's what realpath(None) throws...) 714 index = None # None is handled fine below 715 716 qname = mdp_kwargs.pop('sgename', qname) # compatibility for old scripts 717 qscript = mdp_kwargs.pop('sge', qscript) # compatibility for old scripts 718 qscript_template = config.get_template(qscript) 719 mdp_template = config.get_template(mdp) 720 721 nsteps = int(float(runtime)/float(dt)) 722 723 mdp = deffnm + '.mdp' 724 tpr = deffnm + '.tpr' 725 mainindex = deffnm + '.ndx' 726 final_structure = deffnm + '.gro' # guess... really depends on templates,could also be DEFFNM.pdb 727 728 # write the processed topology to the default output 729 mdp_parameters = {'nsteps':nsteps, 'dt':dt, 'pp': 'processed.top'} 730 mdp_parameters.update(mdp_kwargs) 731 732 add_mdp_includes(topology, mdp_parameters) 733 734 with in_dir(dirname): 735 if not (mdp_parameters.get('Tcoupl','').lower() == 'no' or mainselection is None): 736 logger.info("[%(dirname)s] Automatic adjustment of T-coupling groups" % vars()) 737 738 # make index file in almost all cases; with mainselection == None the user 739 # takes FULL control and also has to provide the template or index 740 groups = make_main_index(structure, selection=mainselection, 741 oldndx=index, ndx=mainindex) 742 natoms = dict([(g['name'], float(g['natoms'])) for g in groups]) 743 try: 744 x = natoms['__main__']/natoms['__environment__'] 745 except KeyError: 746 x = 0 # force using SYSTEM in code below 747 wmsg = "Missing __main__ and/or __environment__ index group.\n" \ 748 "This probably means that you have an atypical system. You can " \ 749 "set mainselection=None and provide your own mdp and index files " \ 750 "in order to set up temperature coupling.\n" \ 751 "If no T-coupling is required then set Tcoupl='no'.\n" \ 752 "For now we will just couple everything to 'System'." 753 logger.warn(wmsg) 754 warnings.warn(wmsg, category=AutoCorrectionWarning) 755 if x < 0.1: 756 # TODO: does not handle properly multiple groups in arglist yet 757 tau_t = mdp_parameters.pop('tau_t', 0.1) 758 ref_t = mdp_parameters.pop('ref_t', 300) 759 # combine all in one T-coupling group 760 mdp_parameters['tc-grps'] = 'System' 761 mdp_parameters['tau_t'] = tau_t # this overrides the commandline! 762 mdp_parameters['ref_t'] = ref_t # this overrides the commandline! 763 mdp_parameters['gen-temp'] = ref_t 764 wmsg = "Size of __main__ is only %.1f%% of __environment__ so " \ 765 "we use 'System' for T-coupling and ref_t = %g and " \ 766 "tau_t = %g (can be changed in mdp_parameters).\n" \ 767 % (x * 100, ref_t, tau_t) 768 logger.warn(wmsg) 769 warnings.warn(wmsg, category=AutoCorrectionWarning) 770 index = realpath(mainindex) 771 if mdp_parameters.get('Tcoupl','').lower() == 'no': 772 logger.info("Tcoupl == no: disabling all temperature coupling mdp options") 773 mdp_parameters['tc-grps'] = "" 774 mdp_parameters['tau_t'] = "" 775 mdp_parameters['ref_t'] = "" 776 mdp_parameters['gen-temp'] = "" 777 if mdp_parameters.get('Pcoupl','').lower() == 'no': 778 logger.info("Pcoupl == no: disabling all pressure coupling mdp options") 779 mdp_parameters['tau_p'] = "" 780 mdp_parameters['ref_p'] = "" 781 mdp_parameters['compressibility'] = "" 782 783 unprocessed = gromacs.cbook.edit_mdp(mdp_template, new_mdp=mdp, **mdp_parameters) 784 check_mdpargs(unprocessed) 785 gromacs.grompp(f=mdp, p=topology, c=structure, n=index, o=tpr, **unprocessed) 786 787 runscripts = gromacs.qsub.generate_submit_scripts( 788 qscript_template, deffnm=deffnm, jobname=qname, budget=budget, 789 mdrun_opts=mdrun_opts, walltime=walltime) 790 791 logger.info("[%(dirname)s] All files set up for a run time of %(runtime)g ps " 792 "(dt=%(dt)g, nsteps=%(nsteps)g)" % vars()) 793 794 kwargs = {'struct': realpath(os.path.join(dirname, final_structure)), # guess 795 'top': topology, 796 'ndx': index, # possibly mainindex 797 'qscript': runscripts, 798 'mainselection': mainselection} 799 kwargs.update(mdp_kwargs) # return extra mdp args so that one can use them for prod run 800 kwargs.pop('define', None) # but make sure that -DPOSRES does not stay... 801 return kwargs
802 803
804 -def MD_restrained(dirname='MD_POSRES', **kwargs):
805 """Set up MD with position restraints. 806 807 Additional itp files should be in the same directory as the top file. 808 809 Many of the keyword arguments below already have sensible values. Note that 810 setting *mainselection* = ``None`` will disable many of the automated 811 choices and is often recommended when using your own mdp file. 812 813 :Keywords: 814 *dirname* 815 set up under directory dirname [MD_POSRES] 816 *struct* 817 input structure (gro, pdb, ...) [em/em.pdb] 818 *top* 819 topology file [top/system.top] 820 *mdp* 821 mdp file (or use the template) [templates/md.mdp] 822 *ndx* 823 index file (supply when using a custom mdp) 824 *includes* 825 additional directories to search for itp files 826 *mainselection* 827 :program:`make_ndx` selection to select main group ["Protein"] 828 (If ``None`` then no canonical index file is generated and 829 it is the user's responsibility to set *tc_grps*, 830 *tau_t*, and *ref_t* as keyword arguments, or provide the mdp template 831 with all parameter pre-set in *mdp* and probably also your own *ndx* 832 index file.) 833 *deffnm* 834 default filename for Gromacs run [md] 835 *runtime* 836 total length of the simulation in ps [1000] 837 *dt* 838 integration time step in ps [0.002] 839 *qscript* 840 script to submit to the queuing system; by default 841 uses the template :data:`gromacs.config.qscript_template`, which can 842 be manually set to another template from :data:`gromacs.config.templates`; 843 can also be a list of template names. 844 *qname* 845 name to be used for the job in the queuing system [PR_GMX] 846 *mdrun_opts* 847 option flags for the :program:`mdrun` command in the queuing system 848 scripts such as "-stepout 100". [""] 849 *kwargs* 850 remaining key/value pairs that should be changed in the template mdp 851 file, eg ``nstxtcout=250, nstfout=250`` or command line options for 852 ``grompp` such as ``maxwarn=1``. 853 854 In particular one can also set **define** and activate 855 whichever position restraints have been coded into the itp 856 and top file. For instance one could have 857 858 *define* = "-DPOSRES_MainChain -DPOSRES_LIGAND" 859 860 if these preprocessor constructs exist. Note that there 861 **must not be any space between "-D" and the value.** 862 863 By default *define* is set to "-DPOSRES". 864 865 :Returns: a dict that can be fed into :func:`gromacs.setup.MD` 866 (but check, just in case, especially if you want to 867 change the ``define`` parameter in the mdp file) 868 869 .. Note:: The output frequency is drastically reduced for position 870 restraint runs by default. Set the corresponding ``nst*`` 871 variables if you require more output. 872 """ 873 874 logger.info("[%(dirname)s] Setting up MD with position restraints..." % vars()) 875 kwargs.setdefault('struct', 'em/em.pdb') 876 kwargs.setdefault('qname', 'PR_GMX') 877 kwargs.setdefault('define', '-DPOSRES') 878 # reduce size of output files 879 kwargs.setdefault('nstxout', '50000') # trr pos 880 kwargs.setdefault('nstvout', '50000') # trr veloc 881 kwargs.setdefault('nstfout', '0') # trr forces 882 kwargs.setdefault('nstlog', '500') # log file 883 kwargs.setdefault('nstenergy', '2500') # edr energy 884 kwargs.setdefault('nstxtcout', '5000') # xtc pos 885 return _setup_MD(dirname, **kwargs)
886
887 -def MD(dirname='MD', **kwargs):
888 """Set up equilibrium MD. 889 890 Additional itp files should be in the same directory as the top file. 891 892 Many of the keyword arguments below already have sensible values. Note that 893 setting *mainselection* = ``None`` will disable many of the automated 894 choices and is often recommended when using your own mdp file. 895 896 :Keywords: 897 *dirname* 898 set up under directory dirname [MD] 899 *struct* 900 input structure (gro, pdb, ...) [MD_POSRES/md_posres.pdb] 901 *top* 902 topology file [top/system.top] 903 *mdp* 904 mdp file (or use the template) [templates/md.mdp] 905 *ndx* 906 index file (supply when using a custom mdp) 907 *includes* 908 additional directories to search for itp files 909 *mainselection* 910 ``make_ndx`` selection to select main group ["Protein"] 911 (If ``None`` then no canonical index file is generated and 912 it is the user's responsibility to set *tc_grps*, 913 *tau_t*, and *ref_t* as keyword arguments, or provide the mdp template 914 with all parameter pre-set in *mdp* and probably also your own *ndx* 915 index file.) 916 *deffnm* 917 default filename for Gromacs run [md] 918 *runtime* 919 total length of the simulation in ps [1000] 920 *dt* 921 integration time step in ps [0.002] 922 *qscript* 923 script to submit to the queuing system; by default 924 uses the template :data:`gromacs.config.qscript_template`, which can 925 be manually set to another template from :data:`gromacs.config.templates`; 926 can also be a list of template names. 927 *qname* 928 name to be used for the job in the queuing system [MD_GMX] 929 *mdrun_opts* 930 option flags for the :program:`mdrun` command in the queuing system 931 scripts such as "-stepout 100 -dgdl". [""] 932 *kwargs* 933 remaining key/value pairs that should be changed in the template mdp 934 file, e.g. ``nstxtcout=250, nstfout=250`` or command line options for 935 :program`grompp` such as ``maxwarn=1``. 936 937 :Returns: a dict that can be fed into :func:`gromacs.setup.MD` 938 (but check, just in case, especially if you want to 939 change the *define* parameter in the mdp file) 940 """ 941 942 logger.info("[%(dirname)s] Setting up MD..." % vars()) 943 kwargs.setdefault('struct', 'MD_POSRES/md.gro') 944 kwargs.setdefault('qname', 'MD_GMX') 945 return _setup_MD(dirname, **kwargs)
946 947 948 949 # TODO: autorun (qv MI lipids/setup.pl) 950