md.log.v5

Log file opened on Fri Jul 25 15:02:23 2014
Host: c4437  pid: 29821  rank ID: 0  number of ranks:  1
GROMACS:    gmx mdrun, VERSION 5.0

GROMACS is written by:
Emile Apol         Rossen Apostolov   Herman J.C. Berendsen Par Bjelkmar
Aldert van Buuren  Rudi van Drunen    Anton Feenstra     Sebastian Fritsch
Gerrit Groenhof    Christoph Junghans Peter Kasson       Carsten Kutzner
Per Larsson        Justin A. Lemkul   Magnus Lundborg    Pieter Meulenhoff
Erik Marklund      Teemu Murtola      Szilard Pall       Sander Pronk
Roland Schulz      Alexey Shvetsov    Michael Shirts     Alfons Sijbers
Peter Tieleman     Christian Wennberg Maarten Wolf
and the project leaders:
Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel

Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2014, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.

GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.

GROMACS:      gmx mdrun, VERSION 5.0
Executable:   /work/gromacs5gpu/bin/gmx_5gpu
Library dir:  /work/gromacs5gpu/share/gromacs/top
Command line:
  mdrun_5gpu -v

Gromacs version:    VERSION 5.0
Precision:          single
Memory model:       64 bit
MPI library:        thread_mpi
OpenMP support:     enabled
GPU support:        enabled
invsqrt routine:    gmx_software_invsqrt(x)
SIMD instructions:  AVX_256
FFT library:        fftw-3.3.4-sse2
RDTSCP usage:       enabled
C++11 compilation:  disabled
TNG support:        enabled
Tracing support:    disabled
Built on:           Fri Jul 25 14:35:10 CDT 2014
Built by:           mohtadin@c4437 [CMAKE]
Build OS/arch:      Linux 2.6.32-431.11.2.el6.x86_64 x86_64
Build CPU vendor:   GenuineIntel
Build CPU brand:    Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Build CPU family:   6   Model: 45   Stepping: 7
Build CPU features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
C compiler:         /util/comp/gcc/4.7/bin/gcc GNU 4.7.1
C compiler flags:    -mavx   -Wno-maybe-uninitialized -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wall -Wno-unused -Wunused-value -Wunused-parameter  -I/util/opt/cuda/6.0/include  -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast  -Wno-array-bounds  -O3 -DNDEBUG
C++ compiler:       /util/comp/gcc/4.7/bin/g++ GNU 4.7.1
C++ compiler flags:  -mavx   -Wextra -Wno-missing-field-initializers -Wpointer-arith -Wall -Wno-unused-function  -m64 -march=bdver1  -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast  -Wno-array-bounds  -O3 -DNDEBUG
Boost version:      1.55.0 (internal)
CUDA compiler:      /util/opt/cuda/6.0/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2013 NVIDIA Corporation;Built on Thu_Mar_13_11:58:58_PDT_2014;Cuda compilation tools, release 6.0, V6.0.1
CUDA compiler flags:-gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_20,code=sm_21;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_35,code=compute_35;-use_fast_math;; ;-mavx;-Wextra;-Wno-missing-field-initializers;-Wpointer-arith;-Wall;-Wno-unused-function;-m64;-march=bdver1;-fomit-frame-pointer;-funroll-all-loops;-fexcess-precision=fast;-Wno-array-bounds;-O3;-DNDEBUG
CUDA driver:        6.0
CUDA runtime:       6.0


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 435-447
-------- -------- --- Thank You --- -------- --------


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C.
Berendsen
GROMACS: Fast, Flexible and Free
J. Comp. Chem. 26 (2005) pp. 1701-1719
-------- -------- --- Thank You --- -------- --------


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
E. Lindahl and B. Hess and D. van der Spoel
GROMACS 3.0: A package for molecular simulation and trajectory analysis
J. Mol. Mod. 7 (2001) pp. 306-317
-------- -------- --- Thank You --- -------- --------


++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, D. van der Spoel and R. van Drunen
GROMACS: A message-passing parallel molecular dynamics implementation
Comp. Phys. Comm. 91 (1995) pp. 43-56
-------- -------- --- Thank You --- -------- --------


Number of hardware threads detected (24) does not match the number reported by OpenMP (12).
Consider setting the launch configuration manually!

For optimal performance with a GPU nstlist (now 10) should be larger.
The optimum depends on your CPU and GPU resources.
You might want to try several nstlist values.
Changing nstlist from 10 to 20, rlist from 1.094 to 1.208


Non-default thread affinity set, disabling internal thread affinity
Input Parameters:
   integrator                     = md
   tinit                          = 0
   dt                             = 0.002
   nsteps                         = 5000
   init-step                      = 0
   simulation-part                = 1
   comm-mode                      = Linear
   nstcomm                        = 100
   bd-fric                        = 0
   ld-seed                        = 1993
   emtol                          = 10
   emstep                         = 0.01
   niter                          = 20
   fcstep                         = 0
   nstcgsteep                     = 1000
   nbfgscorr                      = 10
   rtpi                           = 0.05
   nstxout                        = 0
   nstvout                        = 0
   nstfout                        = 0
   nstlog                         = 0
   nstcalcenergy                  = 100
   nstenergy                      = 0
   nstxout-compressed             = 0
   compressed-x-precision         = 1000
   cutoff-scheme                  = Verlet
   nstlist                        = 20
   ns-type                        = Grid
   pbc                            = xyz
   periodic-molecules             = FALSE
   verlet-buffer-tolerance        = 0.005
   rlist                          = 1.208
   rlistlong                      = 1.208
   nstcalclr                      = 10
   coulombtype                    = Cut-off
   coulomb-modifier               = Potential-shift
   rcoulomb-switch                = 0
   rcoulomb                       = 1
   epsilon-r                      = 1
   epsilon-rf                     = inf
   vdw-type                       = Cut-off
   vdw-modifier                   = Potential-shift
   rvdw-switch                    = 0
   rvdw                           = 1
   DispCorr                       = No
   table-extension                = 1
   fourierspacing                 = 0.12
   fourier-nx                     = 0
   fourier-ny                     = 0
   fourier-nz                     = 0
   pme-order                      = 4
   ewald-rtol                     = 1e-05
   ewald-rtol-lj                  = 1e-05
   lj-pme-comb-rule               = Geometric
   ewald-geometry                 = 0
   epsilon-surface                = 0
   implicit-solvent               = No
   gb-algorithm                   = Still
   nstgbradii                     = 1
   rgbradii                       = 1
   gb-epsilon-solvent             = 80
   gb-saltconc                    = 0
   gb-obc-alpha                   = 1
   gb-obc-beta                    = 0.8
   gb-obc-gamma                   = 4.85
   gb-dielectric-offset           = 0.009
   sa-algorithm                   = Ace-approximation
   sa-surface-tension             = 2.05016
   tcoupl                         = Berendsen
   nsttcouple                     = 10
   nh-chain-length                = 0
   print-nose-hoover-chain-variables = FALSE
   pcoupl                         = No
   pcoupltype                     = Isotropic
   nstpcouple                     = -1
   tau-p                          = 1
   compressibility (3x3):
      compressibility[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      compressibility[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      compressibility[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
   ref-p (3x3):
      ref-p[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      ref-p[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      ref-p[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
   refcoord-scaling               = No
   posres-com (3):
      posres-com[0]= 0.00000e+00
      posres-com[1]= 0.00000e+00
      posres-com[2]= 0.00000e+00
   posres-comB (3):
      posres-comB[0]= 0.00000e+00
      posres-comB[1]= 0.00000e+00
      posres-comB[2]= 0.00000e+00
   QMMM                           = FALSE
   QMconstraints                  = 0
   QMMMscheme                     = 0
   MMChargeScaleFactor            = 1
qm-opts:
   ngQM                           = 0
   constraint-algorithm           = Lincs
   continuation                   = FALSE
   Shake-SOR                      = FALSE
   shake-tol                      = 0.0001
   lincs-order                    = 4
   lincs-iter                     = 1
   lincs-warnangle                = 30
   nwall                          = 0
   wall-type                      = 9-3
   wall-r-linpot                  = -1
   wall-atomtype[0]               = -1
   wall-atomtype[1]               = -1
   wall-density[0]                = 0
   wall-density[1]                = 0
   wall-ewald-zfac                = 3
   pull                           = no
   rotation                       = FALSE
   interactiveMD                  = FALSE
   disre                          = No
   disre-weighting                = Conservative
   disre-mixed                    = FALSE
   dr-fc                          = 1000
   dr-tau                         = 0
   nstdisreout                    = 100
   orire-fc                       = 0
   orire-tau                      = 0
   nstorireout                    = 100
   free-energy                    = no
   cos-acceleration               = 0
   deform (3x3):
      deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
   simulated-tempering            = FALSE
   E-x:
      n = 0
   E-xt:
      n = 0
   E-y:
      n = 0
   E-yt:
      n = 0
   E-z:
      n = 0
   E-zt:
      n = 0
   swapcoords                     = no
   adress                         = FALSE
   userint1                       = 0
   userint2                       = 0
   userint3                       = 0
   userint4                       = 0
   userreal1                      = 0
   userreal2                      = 0
   userreal3                      = 0
   userreal4                      = 0
grpopts:
   nrdf:      103423      141310
   ref-t:         323         323
   tau-t:         0.1         0.1
annealing:          No          No
annealing-npoints:           0           0
   acc:	           0           0           0
   nfreeze:           N           N           N
   energygrp-flags[  0]: 0

Initializing Domain Decomposition on 3 ranks
Dynamic load balancing: auto
Will sort the charge groups at every domain (re)decomposition
Initial maximum inter charge-group distances:
    two-body bonded interactions: 0.410 nm, LJ-14, atoms 19458 19463
  multi-body bonded interactions: 0.410 nm, Proper Dih., atoms 19458 19463
Minimum cell size due to bonded interactions: 0.452 nm
Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.771 nm
Estimated maximum distance required for P-LINCS: 0.771 nm
This distance will limit the DD cell size, you can override this with -rcon
Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
Optimizing the DD grid for 3 cells with a minimum initial size of 0.964 nm
The maximum allowed number of cells is: X 18 Y 19 Z 6
Domain decomposition grid 1 x 3 x 1, separate PME ranks 0
Domain decomposition rank 0, coordinates 0 0 0

Using 3 MPI threads
Using 8 OpenMP threads per tMPI thread

Detecting CPU SIMD instructions.
Present hardware specification:
Vendor: GenuineIntel
Brand:  Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Family:  6  Model: 45  Stepping:  7
Features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
SIMD instructions most likely to fit this hardware: AVX_256
SIMD instructions selected at GROMACS compile time: AVX_256


3 GPUs detected:
  #0: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
  #1: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
  #2: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible

3 GPUs auto-selected for this run.
Mapping of GPUs to the 3 PP ranks in this node: #0, #1, #2

Cut-off's:   NS: 1.208   Coulomb: 1   LJ: 1
System total charge: 0.000
Generated table with 1104 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1104 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1104 data points for 1-4 LJ12.
Tabscale = 500 points/nm

Using CUDA 8x8 non-bonded kernels

Potential shift: LJ r^-12: -1.000e+00 r^-6: -1.000e+00, Coulomb -1e+00
Removing pbc first time

Initializing Parallel LINear Constraint Solver

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess
P-LINCS: A Parallel Linear Constraint Solver for molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 116-122
-------- -------- --- Thank You --- -------- --------

The number of constraints is 50176
There are inter charge-group constraints,
will communicate selected coordinates each lincs iteration

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
-------- -------- --- Thank You --- -------- --------


Linking all bonded interactions to atoms

The initial number of communication pulses is: Y 1
The initial domain decomposition cell size is: Y 6.11 nm

The maximum allowed distance for charge groups involved in interactions is:
                 non-bonded interactions           1.208 nm
            two-body bonded interactions  (-rdd)   1.208 nm
          multi-body bonded interactions  (-rdd)   1.208 nm
  atoms separated by up to 5 constraints  (-rcon)  6.113 nm

When dynamic load balancing gets turned on, these settings will change to:
The maximum number of communication pulses is: Y 1
The minimum size for domain decomposition cells is 1.208 nm
The requested allowed shrink of DD cells (option -dds) is: 0.80
The allowed shrink of domain decomposition cells is: Y 0.20
The maximum allowed distance for charge groups involved in interactions is:
                 non-bonded interactions           1.208 nm
            two-body bonded interactions  (-rdd)   1.208 nm
          multi-body bonded interactions  (-rdd)   1.208 nm
  atoms separated by up to 5 constraints  (-rcon)  1.208 nm


Making 1D domain decomposition grid 1 x 3 x 1, home cell index 0 0 0

Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
  0:  rest

++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, J. P. M. Postma, A. DiNola and J. R. Haak
Molecular dynamics with coupling to an external bath
J. Chem. Phys. 81 (1984) pp. 3684-3690
-------- -------- --- Thank You --- -------- --------