- Log file opened on Thu Apr 30 12:24:09 2015
- Host: nid01946 pid: 24118 rank ID: 0 number of ranks: 256
- GROMACS: gmx mdrun, VERSION 5.0.2
- GROMACS is written by:
- Emile Apol Rossen Apostolov Herman J.C. Berendsen Par Bjelkmar
- Aldert van Buuren Rudi van Drunen Anton Feenstra Sebastian Fritsch
- Gerrit Groenhof Christoph Junghans Peter Kasson Carsten Kutzner
- Per Larsson Justin A. Lemkul Magnus Lundborg Pieter Meulenhoff
- Erik Marklund Teemu Murtola Szilard Pall Sander Pronk
- Roland Schulz Alexey Shvetsov Michael Shirts Alfons Sijbers
- Peter Tieleman Christian Wennberg Maarten Wolf
- and the project leaders:
- Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
- Copyright (c) 1991-2000, University of Groningen, The Netherlands.
- Copyright (c) 2001-2014, The GROMACS development team at
- Uppsala University, Stockholm University and
- the Royal Institute of Technology, Sweden.
- check out http://www.gromacs.org for more information.
- GROMACS is free software; you can redistribute it and/or modify it
- under the terms of the GNU Lesser General Public License
- as published by the Free Software Foundation; either version 2.1
- of the License, or (at your option) any later version.
- GROMACS: gmx mdrun, VERSION 5.0.2
- Executable: mdrun_mpi
- Library dir: /sw/xk6/gromacs/5.0.2/cle5.2_gnu4.8.2/share/gromacs/top
- Command line:
- mdrun_mpi -npme 128 -dlb yes -pin on -resethway -noconfout -v -s opt.tpr -deffnm test
- Gromacs version: VERSION 5.0.2
- Precision: single
- Memory model: 64 bit
- MPI library: MPI
- OpenMP support: enabled
- GPU support: enabled
- invsqrt routine: gmx_software_invsqrt(x)
- SIMD instructions: AVX_128_FMA
- FFT library: commercial-fftw-3.3.4-fma-sse2-avx
- RDTSCP usage: disabled
- C++11 compilation: disabled
- TNG support: enabled
- Tracing support: disabled
- Built on: Thu Mar 12 18:27:12 EDT 2015
- Built by: ff1@titan-ext8 [CMAKE]
- Build OS/arch: Linux 3.0.101-0.46-default x86_64
- Build CPU vendor: AuthenticAMD
- Build CPU brand: AMD Opteron(tm) Processor 6140
- Build CPU family: 16 Model: 9 Stepping: 1
- Build CPU features: apic clfsh cmov cx8 cx16 htt lahf_lm misalignsse mmx msr nonstop_tsc pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a
- C compiler: /opt/cray/craype/2.2.1/bin/cc GNU 4.8.2
- C compiler flags: -mavx -mfma4 -mxop -Wno-maybe-uninitialized -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wall -Wno-unused -Wunused-value -Wunused-parameter -O3 -DNDEBUG -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds
- C++ compiler: /opt/cray/craype/2.2.1/bin/CC GNU 4.8.2
- C++ compiler flags: -mavx -mfma4 -mxop -Wextra -Wno-missing-field-initializers -Wpointer-arith -Wall -Wno-unused-function -O3 -DNDEBUG -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds
- Boost version: 1.55.0 (internal)
- CUDA compiler: /opt/nvidia/cudatoolkit/5.5.51-1.0502.9594.3.1/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2013 NVIDIA Corporation;Built on Thu_Mar__6_02:21:19_PST_2014;Cuda compilation tools, release 5.5, V5.5.0
- CUDA compiler flags:-gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_20,code=sm_21;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_35,code=compute_35;-use_fast_math;; ;-mavx;-mfma4;-mxop;-Wextra;-Wno-missing-field-initializers;-Wpointer-arith;-Wall;-Wno-unused-function;-O3;-DNDEBUG;-fomit-frame-pointer;-funroll-all-loops;-fexcess-precision=fast;-Wno-array-bounds;
- CUDA driver: 5.50
- CUDA runtime: 5.50
- ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
- B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
- GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
- molecular simulation
- J. Chem. Theory Comput. 4 (2008) pp. 435-447
- -------- -------- --- Thank You --- -------- --------
- ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
- D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C.
- Berendsen
- GROMACS: Fast, Flexible and Free
- J. Comp. Chem. 26 (2005) pp. 1701-1719
- -------- -------- --- Thank You --- -------- --------
- ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
- E. Lindahl and B. Hess and D. van der Spoel
- GROMACS 3.0: A package for molecular simulation and trajectory analysis
- J. Mol. Mod. 7 (2001) pp. 306-317
- -------- -------- --- Thank You --- -------- --------
- ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
- H. J. C. Berendsen, D. van der Spoel and R. van Drunen
- GROMACS: A message-passing parallel molecular dynamics implementation
- Comp. Phys. Comm. 91 (1995) pp. 43-56
- -------- -------- --- Thank You --- -------- --------
- Number of hardware threads detected (16) does not match the number reported by OpenMP (1).
- Consider setting the launch configuration manually!
- Changing nstlist from 20 to 40, rlist from 1.2 to 1.239
- Input Parameters:
- integrator = md
- tinit = 0
- dt = 0.002
- nsteps = 10000
- init-step = 0
- simulation-part = 1
- comm-mode = Linear
- nstcomm = 100
- bd-fric = 0
- ld-seed = 60975668
- emtol = 10
- emstep = 0.01
- niter = 20
- fcstep = 0
- nstcgsteep = 1000
- nbfgscorr = 10
- rtpi = 0.05
- nstxout = 5000
- nstvout = 5000
- nstfout = 5000
- nstlog = 1000
- nstcalcenergy = 100
- nstenergy = 1000
- nstxout-compressed = 0
- compressed-x-precision = 1000
- cutoff-scheme = Verlet
- nstlist = 40
- ns-type = Grid
- pbc = xyz
- periodic-molecules = FALSE
- verlet-buffer-tolerance = 0.005
- rlist = 1.239
- rlistlong = 1.239
- nstcalclr = 20
- coulombtype = PME
- coulomb-modifier = Potential-shift
- rcoulomb-switch = 0
- rcoulomb = 1.2
- epsilon-r = 1
- epsilon-rf = inf
- vdw-type = Cut-off
- vdw-modifier = Force-switch
- rvdw-switch = 1
- rvdw = 1.2
- DispCorr = No
- table-extension = 1
- fourierspacing = 0.12
- fourier-nx = 144
- fourier-ny = 144
- fourier-nz = 64
- pme-order = 4
- ewald-rtol = 1e-05
- ewald-rtol-lj = 0.001
- lj-pme-comb-rule = Geometric
- ewald-geometry = 0
- epsilon-surface = 0
- implicit-solvent = No
- gb-algorithm = Still
- nstgbradii = 1
- rgbradii = 1
- gb-epsilon-solvent = 80
- gb-saltconc = 0
- gb-obc-alpha = 1
- gb-obc-beta = 0.8
- gb-obc-gamma = 4.85
- gb-dielectric-offset = 0.009
- sa-algorithm = Ace-approximation
- sa-surface-tension = 2.05016
- tcoupl = Nose-Hoover
- nsttcouple = 20
- nh-chain-length = 1
- print-nose-hoover-chain-variables = FALSE
- pcoupl = Parrinello-Rahman
- pcoupltype = Semiisotropic
- nstpcouple = 20
- tau-p = 5
- compressibility (3x3):
- compressibility[ 0]={ 4.50000e-05, 0.00000e+00, 0.00000e+00}
- compressibility[ 1]={ 0.00000e+00, 4.50000e-05, 0.00000e+00}
- compressibility[ 2]={ 0.00000e+00, 0.00000e+00, 4.50000e-05}
- ref-p (3x3):
- ref-p[ 0]={ 1.00000e+00, 0.00000e+00, 0.00000e+00}
- ref-p[ 1]={ 0.00000e+00, 1.00000e+00, 0.00000e+00}
- ref-p[ 2]={ 0.00000e+00, 0.00000e+00, 1.00000e+00}
- refcoord-scaling = COM
- posres-com (3):
- posres-com[0]= 0.00000e+00
- posres-com[1]= 0.00000e+00
- posres-com[2]= 0.00000e+00
- posres-comB (3):
- posres-comB[0]= 0.00000e+00
- posres-comB[1]= 0.00000e+00
- posres-comB[2]= 0.00000e+00
- QMMM = FALSE
- QMconstraints = 0
- QMMMscheme = 0
- MMChargeScaleFactor = 1
- qm-opts:
- ngQM = 0
- constraint-algorithm = Lincs
- continuation = TRUE
- Shake-SOR = FALSE
- shake-tol = 0.0001
- lincs-order = 4
- lincs-iter = 1
- lincs-warnangle = 30
- nwall = 0
- wall-type = 9-3
- wall-r-linpot = -1
- wall-atomtype[0] = -1
- wall-atomtype[1] = -1
- wall-density[0] = 0
- wall-density[1] = 0
- wall-ewald-zfac = 3
- pull = no
- rotation = FALSE
- interactiveMD = FALSE
- disre = No
- disre-weighting = Conservative
- disre-mixed = FALSE
- dr-fc = 1000
- dr-tau = 0
- nstdisreout = 100
- orire-fc = 0
- orire-tau = 0
- nstorireout = 100
- free-energy = no
- cos-acceleration = 0
- deform (3x3):
- deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
- deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
- deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
- simulated-tempering = FALSE
- E-x:
- n = 0
- E-xt:
- n = 0
- E-y:
- n = 0
- E-yt:
- n = 0
- E-z:
- n = 0
- E-zt:
- n = 0
- swapcoords = no
- adress = FALSE
- userint1 = 0
- userint2 = 0
- userint3 = 0
- userint4 = 0
- userreal1 = 0
- userreal2 = 0
- userreal3 = 0
- userreal4 = 0
- grpopts:
- nrdf: 261777 192987
- ref-t: 303.15 303.15
- tau-t: 1 1
- annealing: No No
- annealing-npoints: 0 0
- acc: 0 0 0
- nfreeze: N N N
- energygrp-flags[ 0]: 0
- Initializing Domain Decomposition on 256 ranks
- Dynamic load balancing: yes
- Will sort the charge groups at every domain (re)decomposition
- Initial maximum inter charge-group distances:
- two-body bonded interactions: 0.420 nm, LJ-14, atoms 42821 42830
- multi-body bonded interactions: 0.420 nm, Proper Dih., atoms 42821 42830
- Minimum cell size due to bonded interactions: 0.462 nm
- Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.222 nm
- Estimated maximum distance required for P-LINCS: 0.222 nm
- Using 128 separate PME ranks, per user request
- Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
- Optimizing the DD grid for 128 cells with a minimum initial size of 0.578 nm
- The maximum allowed number of cells is: X 27 Y 27 Z 13
- Domain decomposition grid 8 x 8 x 2, separate PME ranks 128
- PME domain decomposition: 8 x 16 x 1
- Interleaving PP and PME ranks
- This rank does only particle-particle work.
- Domain decomposition rank 0, coordinates 0 0 0
- Using 256 MPI processes
- Using 4 OpenMP threads per MPI process
- Detecting CPU SIMD instructions.
- Present hardware specification:
- Vendor: AuthenticAMD
- Brand: AMD Opteron(TM) Processor 6274
- Family: 21 Model: 1 Stepping: 2
- Features: aes apic avx clfsh cmov cx8 cx16 fma4 htt lahf_lm misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a sse4.1 sse4.2 ssse3 xop
- SIMD instructions most likely to fit this hardware: AVX_128_FMA
- SIMD instructions selected at GROMACS compile time: AVX_128_FMA
- The current CPU can measure timings more accurately than the code in
- mdrun_mpi was configured to use. This might affect your simulation
- speed as accurate timings are needed for load-balancing.
- Please consider rebuilding mdrun_mpi with the GMX_USE_RDTSCP=OFF CMake option.
- 1 GPU detected on host nid01946:
- #0: NVIDIA Tesla K20X, compute cap.: 3.5, ECC: yes, stat: compatible
- 1 GPU auto-selected for this run.
- Mapping of GPU to the 1 PP rank in this node: #0
- Will do PME sum in reciprocal space for electrostatic interactions.
- ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
- U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
- A smooth particle mesh Ewald method
- J. Chem. Phys. 103 (1995) pp. 8577-8592
- -------- -------- --- Thank You --- -------- --------
- Will do ordinary reciprocal space Ewald sum.
- Using a Gaussian width (1/beta) of 0.384195 nm for Ewald
- Cut-off's: NS: 1.239 Coulomb: 1.2 LJ: 1.2
- System total charge: 0.000
- Generated table with 1119 data points for Ewald.
- Tabscale = 500 points/nm
- Generated table with 1119 data points for LJ6Shift.
- Tabscale = 500 points/nm
- Generated table with 1119 data points for LJ12Shift.
- Tabscale = 500 points/nm
- Generated table with 1119 data points for 1-4 COUL.
- Tabscale = 500 points/nm
- Generated table with 1119 data points for 1-4 LJ6.
- Tabscale = 500 points/nm
- Generated table with 1119 data points for 1-4 LJ12.
- Tabscale = 500 points/nm
- Using CUDA 8x8 non-bonded kernels
- Potential shift: LJ r^-12: -2.648e-01 r^-6: -5.349e-01, Ewald -1.000e-05
- Initialized non-bonded Ewald correction tables, spacing: 7.82e-04 size: 1536
- Overriding thread affinity set outside mdrun_mpi
- Pinning threads with an auto-selected logical core stride of 1
- Initializing Parallel LINear Constraint Solver
- ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
- B. Hess
- P-LINCS: A Parallel Linear Constraint Solver for molecular simulation
- J. Chem. Theory Comput. 4 (2008) pp. 116-122
- -------- -------- --- Thank You --- -------- --------
- The number of constraints is 67980
- There are inter charge-group constraints,
- will communicate selected coordinates each lincs iteration
- ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
- S. Miyamoto and P. A. Kollman
- SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
- Water Models
- J. Comp. Chem. 13 (1992) pp. 952-962
- -------- -------- --- Thank You --- -------- --------
- Linking all bonded interactions to atoms
- There are 739685 inter charge-group exclusions,
- will use an extra communication step for exclusion forces for PME
- The maximum number of communication pulses is: X 1 Y 1 Z 1
- The minimum size for domain decomposition cells is 1.239 nm
- The requested allowed shrink of DD cells (option -dds) is: 0.80
- The allowed shrink of domain decomposition cells is: X 0.62 Y 0.62 Z 0.33
- The maximum allowed distance for charge groups involved in interactions is:
- non-bonded interactions 1.239 nm
- two-body bonded interactions (-rdd) 1.239 nm
- multi-body bonded interactions (-rdd) 1.212 nm
- atoms separated by up to 5 constraints (-rcon) 1.239 nm
- Making 3D domain decomposition grid 8 x 8 x 2, home cell index 0 0 0
- Center of mass motion removal mode is Linear
- We have the following groups for center of mass motion removal:
- 0: NPROT
- 1: SOL_ION
- There are: 206415 Atoms
- Charge group distribution at step 0: 1641 1598 1624 1561 1574 1617 1609 1647 1618 1559 1620 1669 1602 1608 1633 1588 1641 1626 1585 1594 1584 1651 1608 1583 1655 1652 1650 1593 1573 1638 1635 1652 1600 1620 1645 1598 1616 1595 1621 1618 1610 1561 1601 1588 1589 1621 1644 1640 1674 1591 1655 1594 1644 1593 1608 1641 1596 1616 1588 1599 1595 1639 1642 1627 1606 1596 1569 1649 1601 1607 1606 1600 1583 1645 1587 1604 1622 1584 1585 1668 1625 1655 1622 1617 1559 1648 1616 1610 1646 1636 1514 1657 1575 1626 1598 1591 1594 1600 1585 1624 1603 1655 1565 1621 1622 1618 1597 1660 1625 1543 1585 1627 1640 1598 1603 1633 1626 1598 1618 1648 1582 1628 1584 1620 1590 1604 1645 1610
- Initial temperature: 303.008 K
- Started mdrun on rank 0 Thu Apr 30 12:24:10 2015
- Step Time Lambda
- 0 0.00000 0.00000
- Energies (kJ/mol)
- Bond U-B Proper Dih. Improper Dih. LJ-14
- 5.27858e+04 2.89239e+05 1.51310e+05 1.50125e+03 3.17014e+04
- Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
- -4.55645e+05 -3.43455e+04 -1.43186e+06 8.83147e+03 -1.38648e+06
- Kinetic En. Total Energy Temperature Pressure (bar) Constr. rmsd
- 5.74416e+05 -8.12065e+05 3.03832e+02 -5.33660e+01 4.19742e-06
- DD step 39 vol min/aver 1.000 load imb.: force 17.0% pme mesh/force 1.128
- step 120: timed with pme grid 144 144 64, coulomb cutoff 1.200: 106.2 M-cycles
- step 200: timed with pme grid 128 128 60, coulomb cutoff 1.269: 109.0 M-cycles
- step 280: timed with pme grid 112 112 56, coulomb cutoff 1.431: 112.3 M-cycles
- step 360: timed with pme grid 104 104 48, coulomb cutoff 1.586: 114.9 M-cycles
- step 360: the domain decompostion limits the PME load balancing to a coulomb cut-off of 1.603
- step 440: timed with pme grid 144 144 64, coulomb cutoff 1.200: 92.3 M-cycles
- step 520: timed with pme grid 128 128 64, coulomb cutoff 1.252: 83.3 M-cycles
- step 600: timed with pme grid 128 128 60, coulomb cutoff 1.269: 94.0 M-cycles
- step 680: timed with pme grid 120 120 60, coulomb cutoff 1.336: 84.0 M-cycles
- step 760: timed with pme grid 120 120 56, coulomb cutoff 1.359: 91.1 M-cycles
- step 840: timed with pme grid 112 112 56, coulomb cutoff 1.431: 82.4 M-cycles
- step 920: timed with pme grid 112 112 52, coulomb cutoff 1.464: 87.2 M-cycles
- step 1000: timed with pme grid 108 108 52, coulomb cutoff 1.484: 86.3 M-cycles
- step 1000: the domain decompostion limits the PME load balancing to a coulomb cut-off of 1.484
- DD step 999 vol min/aver 0.813 load imb.: force 1.6% pme mesh/force 2.427
- Step Time Lambda
- 1000 2.00000 0.00000
- Energies (kJ/mol)
- Bond U-B Proper Dih. Improper Dih. LJ-14
- 5.22339e+04 2.88294e+05 1.51124e+05 1.46352e+03 3.16583e+04
- Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
- -4.55027e+05 -3.49318e+04 -1.43354e+06 8.96135e+03 -1.38977e+06
- Kinetic En. Total Energy Temperature Pressure (bar) Constr. rmsd
- 5.73413e+05 -8.16355e+05 3.03301e+02 -2.05265e+02 4.19518e-06
- step 1080: timed with pme grid 144 144 64, coulomb cutoff 1.200: 95.2 M-cycles
- step 1160: timed with pme grid 128 128 64, coulomb cutoff 1.252: 86.5 M-cycles
- step 1240: timed with pme grid 128 128 60, coulomb cutoff 1.269: 86.2 M-cycles
- step 1320: timed with pme grid 120 120 60, coulomb cutoff 1.336: 89.6 M-cycles
- step 1400: timed with pme grid 120 120 56, coulomb cutoff 1.359: 81.6 M-cycles
- step 1480: timed with pme grid 112 112 56, coulomb cutoff 1.431: 84.1 M-cycles
- step 1560: timed with pme grid 112 112 52, coulomb cutoff 1.464: 83.7 M-cycles
- step 1640: timed with pme grid 108 108 52, coulomb cutoff 1.484: 86.9 M-cycles
- step 1720: timed with pme grid 128 128 64, coulomb cutoff 1.252: 83.9 M-cycles
- step 1800: timed with pme grid 128 128 60, coulomb cutoff 1.269: 91.7 M-cycles
- step 1880: timed with pme grid 120 120 60, coulomb cutoff 1.336: 83.9 M-cycles
- step 1960: timed with pme grid 120 120 56, coulomb cutoff 1.359: 84.9 M-cycles
- DD step 1999 vol min/aver 0.835 load imb.: force 2.6% pme mesh/force 2.234
- Step Time Lambda
- 2000 4.00000 0.00000
- Energies (kJ/mol)
- Bond U-B Proper Dih. Improper Dih. LJ-14
- 5.16113e+04 2.87607e+05 1.51005e+05 1.36877e+03 3.17262e+04
- Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
- -4.54477e+05 -3.46807e+04 -1.42667e+06 4.69441e+03 -1.38781e+06
- Kinetic En. Total Energy Temperature Pressure (bar) Constr. rmsd
- 5.74593e+05 -8.13222e+05 3.03926e+02 -3.75786e+01 4.22066e-06
- step 2040: timed with pme grid 112 112 56, coulomb cutoff 1.431: 81.7 M-cycles
- step 2040: the domain decompostion limits the PME load balancing to a coulomb cut-off of 1.431
- step 2120: timed with pme grid 144 144 64, coulomb cutoff 1.200: 90.4 M-cycles
- step 2200: timed with pme grid 128 128 64, coulomb cutoff 1.252: 85.8 M-cycles
- step 2280: timed with pme grid 128 128 60, coulomb cutoff 1.269: 88.2 M-cycles
- step 2360: timed with pme grid 120 120 60, coulomb cutoff 1.336: 84.2 M-cycles
- step 2440: timed with pme grid 120 120 56, coulomb cutoff 1.359: 84.6 M-cycles
- step 2520: timed with pme grid 112 112 56, coulomb cutoff 1.431: 80.1 M-cycles
- step 2600: timed with pme grid 128 128 64, coulomb cutoff 1.252: 98.5 M-cycles
- step 2680: timed with pme grid 128 128 60, coulomb cutoff 1.269: 105.0 M-cycles
- step 2760: timed with pme grid 120 120 60, coulomb cutoff 1.336: 87.3 M-cycles
- step 2840: timed with pme grid 120 120 56, coulomb cutoff 1.359: 87.6 M-cycles
- step 2920: timed with pme grid 112 112 56, coulomb cutoff 1.431: 79.0 M-cycles
- optimal pme grid 112 112 56, coulomb cutoff 1.431
- DD step 2999 vol min/aver 0.837 load imb.: force 2.3% pme mesh/force 2.224
- Step Time Lambda
- 3000 6.00000 0.00000
- Energies (kJ/mol)
- Bond U-B Proper Dih. Improper Dih. LJ-14
- 5.16298e+04 2.87335e+05 1.50838e+05 1.40316e+03 3.18678e+04
- Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
- -4.55062e+05 -3.76964e+04 -1.42403e+06 4.58932e+03 -1.38913e+06
- Kinetic En. Total Energy Temperature Pressure (bar) Constr. rmsd
- 5.73322e+05 -8.15805e+05 3.03253e+02 -1.92705e+02 4.18539e-06
- DD step 3999 vol min/aver 0.850 load imb.: force 3.5% pme mesh/force 2.348
- Step Time Lambda
- 4000 8.00000 0.00000
- Energies (kJ/mol)
- Bond U-B Proper Dih. Improper Dih. LJ-14
- 5.14032e+04 2.87969e+05 1.51062e+05 1.43241e+03 3.18387e+04
- Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
- -4.55869e+05 -3.52871e+04 -1.42459e+06 4.64457e+03 -1.38740e+06
- Kinetic En. Total Energy Temperature Pressure (bar) Constr. rmsd
- 5.71147e+05 -8.16250e+05 3.02103e+02 9.11019e+01 4.24142e-06
- step 5000: resetting all time and cycle counters
- Restarted time on rank 0 Thu Apr 30 12:24:30 2015
- DD step 4999 vol min/aver 0.846 load imb.: force 3.1% pme mesh/force 2.283
- Step Time Lambda
- 5000 10.00000 0.00000
- Energies (kJ/mol)
- Bond U-B Proper Dih. Improper Dih. LJ-14
- 5.27171e+04 2.87086e+05 1.50683e+05 1.42820e+03 3.17165e+04
- Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
- -4.55266e+05 -3.59058e+04 -1.42348e+06 4.59201e+03 -1.38643e+06
- Kinetic En. Total Energy Temperature Pressure (bar) Constr. rmsd
- 5.70754e+05 -8.15674e+05 3.01895e+02 -6.53456e+01 4.25289e-06
- DD step 5999 vol min/aver 0.835 load imb.: force 1.9% pme mesh/force 2.376
- Step Time Lambda
- 6000 12.00000 0.00000
- Energies (kJ/mol)
- Bond U-B Proper Dih. Improper Dih. LJ-14
- 5.21181e+04 2.88148e+05 1.51161e+05 1.45073e+03 3.17385e+04
- Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
- -4.54968e+05 -3.55334e+04 -1.42569e+06 4.54206e+03 -1.38703e+06
- Kinetic En. Total Energy Temperature Pressure (bar) Constr. rmsd
- 5.73379e+05 -8.13655e+05 3.03284e+02 -1.19525e+02 4.20904e-06
- DD step 6999 vol min/aver 0.816 load imb.: force 2.3% pme mesh/force 2.392
- Step Time Lambda
- 7000 14.00000 0.00000
- Energies (kJ/mol)
- Bond U-B Proper Dih. Improper Dih. LJ-14
- 5.19565e+04 2.87655e+05 1.50761e+05 1.33519e+03 3.18436e+04
- Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
- -4.56036e+05 -3.51603e+04 -1.42398e+06 4.49858e+03 -1.38713e+06
- Kinetic En. Total Energy Temperature Pressure (bar) Constr. rmsd
- 5.74061e+05 -8.13071e+05 3.03644e+02 6.96304e+01 4.22401e-06
- DD step 7999 vol min/aver 0.832 load imb.: force 2.1% pme mesh/force 2.455
- Step Time Lambda
- 8000 16.00000 0.00000
- Energies (kJ/mol)
- Bond U-B Proper Dih. Improper Dih. LJ-14
- 5.18719e+04 2.87286e+05 1.51555e+05 1.37778e+03 3.17051e+04
- Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
- -4.56243e+05 -3.61634e+04 -1.42415e+06 4.69818e+03 -1.38806e+06
- Kinetic En. Total Energy Temperature Pressure (bar) Constr. rmsd
- 5.73987e+05 -8.14077e+05 3.03605e+02 2.64174e+01 4.20138e-06
- DD step 8999 vol min/aver 0.823 load imb.: force 2.5% pme mesh/force 2.470
- Step Time Lambda
- 9000 18.00000 0.00000
- Energies (kJ/mol)
- Bond U-B Proper Dih. Improper Dih. LJ-14
- 5.16139e+04 2.87526e+05 1.51142e+05 1.39203e+03 3.18259e+04
- Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
- -4.55993e+05 -3.64471e+04 -1.42463e+06 4.51628e+03 -1.38906e+06
- Kinetic En. Total Energy Temperature Pressure (bar) Constr. rmsd
- 5.71761e+05 -8.17295e+05 3.02428e+02 9.60366e+01 4.16279e-06
- DD step 9999 vol min/aver 0.806 load imb.: force 1.6% pme mesh/force 2.395
- Step Time Lambda
- 10000 20.00000 0.00000
- Energies (kJ/mol)
- Bond U-B Proper Dih. Improper Dih. LJ-14
- 5.15769e+04 2.85892e+05 1.50943e+05 1.39197e+03 3.17989e+04
- Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
- -4.56130e+05 -3.39380e+04 -1.43001e+06 4.60777e+03 -1.39386e+06
- Kinetic En. Total Energy Temperature Pressure (bar) Constr. rmsd
- 5.73135e+05 -8.20729e+05 3.03155e+02 1.75678e+02 4.22713e-06
- <====== ############### ==>
- <==== A V E R A G E S ====>
- <== ############### ======>
- Statistics over 10001 steps using 101 frames
- Energies (kJ/mol)
- Bond U-B Proper Dih. Improper Dih. LJ-14
- 5.19270e+04 2.87532e+05 1.50784e+05 1.41552e+03 3.17099e+04
- Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
- -4.55452e+05 -3.57340e+04 -1.42601e+06 5.05284e+03 -1.38878e+06
- Kinetic En. Total Energy Temperature Pressure (bar) Constr. rmsd
- 5.73150e+05 -8.15629e+05 3.03162e+02 -1.10950e+01 0.00000e+00
- Box-X Box-Y Box-Z
- 1.60015e+01 1.60015e+01 7.63570e+00
- Total Virial (kJ/mol)
- 1.86825e+05 4.87536e+02 -3.95055e+02
- 4.90755e+02 1.87159e+05 -1.05574e+02
- -4.00751e+02 -9.74001e+01 2.01130e+05
- Pressure (bar)
- -4.71925e+00 -9.89686e+00 8.51045e+00
- -9.95177e+00 -1.29647e+01 3.97039e+00
- 8.60705e+00 3.83140e+00 -1.56010e+01
- T-NPROT T-SOL_ION
- 3.03164e+02 3.03161e+02
- P P - P M E L O A D B A L A N C I N G
- NOTE: The PP/PME load balancing was limited by the domain decompostion,
- you might not have reached a good load balance.
- Try different mdrun -dd settings or lower the -dds value.
- PP/PME load balancing changed the cut-off and PME settings:
-              particle-particle                    PME
-             rcoulomb  rlist            grid      spacing   1/beta
-    initial  1.200 nm  1.239 nm     144 144  64   0.119 nm  0.384 nm
-    final    1.431 nm  1.470 nm     112 112  56   0.143 nm  0.458 nm
-    cost-ratio           1.67             0.53
- (note that these numbers concern only part of the total PP and PME load)
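The cost-ratio row can be sanity-checked with a few lines of Python. This is a rough sketch, not GROMACS's own accounting. It assumes that the particle-particle cost scales with the volume of the pair-list sphere (rlist cubed), that the PME mesh cost scales with the number of grid points, and that 1/beta scales linearly with rcoulomb at a fixed ewald-rtol. All input numbers are copied from the load-balancing table; the variable and function names are illustrative only.

```python
# Values copied from the PP/PME load-balancing table in this log.
rcoulomb_initial, rcoulomb_final = 1.200, 1.431   # nm
rlist_initial, rlist_final = 1.239, 1.470         # nm
grid_initial = (144, 144, 64)
grid_final = (112, 112, 56)
inv_beta_initial = 0.384                          # nm, "1/beta" column


def grid_points(grid):
    """Total number of PME grid points for an (nx, ny, nz) grid."""
    nx, ny, nz = grid
    return nx * ny * nz


# Assumption: pair-interaction work grows with the volume of the list sphere.
pp_cost_ratio = (rlist_final / rlist_initial) ** 3

# Assumption: mesh work grows with the number of grid points.
pme_cost_ratio = grid_points(grid_final) / grid_points(grid_initial)

# Assumption: beta * rcoulomb stays constant when ewald-rtol is unchanged,
# so the Gaussian width 1/beta is rescaled by the same factor as rcoulomb.
inv_beta_final = inv_beta_initial * rcoulomb_final / rcoulomb_initial

print(f"PP cost-ratio  ~ {pp_cost_ratio:.2f}")
print(f"PME cost-ratio ~ {pme_cost_ratio:.2f}")
print(f"final 1/beta   ~ {inv_beta_final:.3f} nm")
```

Under these assumptions the three values round to the 1.67, 0.53 and 0.458 nm shown in the table, which suggests the balancer traded roughly two thirds more short-range work for about half the mesh work.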
- M E G A - F L O P S A C C O U N T I N G
- NB=Group-cutoff nonbonded kernels NxN=N-by-N cluster Verlet kernels
- RF=Reaction-Field VdW=Van der Waals QSTab=quadratic-spline table
- W3=SPC/TIP3p W4=TIP4p (single or pairs)
- V&F=Potential and force V=Potential only F=Force only
- Computing: M-Number M-Flops % Flops
- -----------------------------------------------------------------------------
- NB VdW [V&F] 1088.367630 1088.368 0.0
- Pair Search distance check 5466.680512 49200.125 0.0
- NxN Ewald Elec. + LJ [F] 1826845.484608 142493947.799 96.6
- NxN Ewald Elec. + LJ [V&F] 18822.293824 2428075.903 1.6
- 1,4 nonbonded interactions 1575.315000 141778.350 0.1
- Calc Weights 3096.844245 111486.393 0.1
- Spread Q Bspline 66066.010560 132132.021 0.1
- Gather F Bspline 66066.010560 396396.063 0.3
- 3D-FFT 136460.296602 1091682.373 0.7
- Solve PME 1003.720704 64238.125 0.0
- Reset In Box 26.008290 78.025 0.0
- CG-CoM 26.008290 78.025 0.0
- Bonds 212.942580 12563.612 0.0
- Propers 1815.312990 415706.675 0.3
- Impropers 5.901180 1227.445 0.0
- Virial 53.255925 958.607 0.0
- Stop-CM 10.527165 105.272 0.0
- Calc-Ekin 103.413915 2792.176 0.0
- Lincs 387.745759 23264.746 0.0
- Lincs-Mat 2798.666532 11194.666 0.0
- Constraint-V 1309.868960 10478.952 0.0
- Constraint-Vir 46.281189 1110.749 0.0
- Settle 178.125814 57534.638 0.0
- -----------------------------------------------------------------------------
- Total 147447119.106 100.0
- -----------------------------------------------------------------------------
- D O M A I N D E C O M P O S I T I O N S T A T I S T I C S
- av. #atoms communicated per step for force: 2 x 597133.5
- av. #atoms communicated per step for LINCS: 2 x 25401.9
- Average load imbalance: 2.6 %
- Part of the total run time spent waiting due to load imbalance: 0.7 %
- Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0 % Y 0 % Z 0 %
- Average PME mesh/force load: 2.371
- Part of the total run time spent waiting due to PP/PME imbalance: 22.3 %
- NOTE: 22.3 % performance was lost because the PME ranks
- had more work to do than the PP ranks.
- You might want to increase the number of PME ranks
- or increase the cut-off and the grid spacing.
- R E A L C Y C L E A N D T I M E A C C O U N T I N G
- On 128 MPI ranks doing PP, each using 4 OpenMP threads, and
- on 128 MPI ranks doing PME, each using 4 OpenMP threads
- Computing: Num Num Call Wall time Giga-Cycles
- Ranks Threads Count (s) total sum %
- -----------------------------------------------------------------------------
- Domain decomp. 128 4 126 0.469 528.272 2.4
- DD comm. load 128 4 126 0.004 4.945 0.0
- DD comm. bounds 128 4 126 0.021 23.422 0.1
- Send X to PME 128 4 5001 0.051 57.293 0.3
- Neighbor search 128 4 126 0.227 256.118 1.2
- Launch GPU ops. 128 4 10002 0.396 445.524 2.1
- Comm. coord. 128 4 4875 1.011 1138.874 5.2
- Force 128 4 5001 1.564 1761.513 8.1
- Wait + Comm. F 128 4 5001 0.934 1052.349 4.9
- PME mesh * 128 4 5001 7.112 8011.553 36.9
- PME wait for PP * 2.519 2836.985 13.1
- Wait + Recv. PME F 128 4 5001 2.731 3076.330 14.2
- Wait GPU nonlocal 128 4 5001 0.027 30.401 0.1
- Wait GPU local 128 4 5001 0.021 23.784 0.1
- NB X/F buffer ops. 128 4 19752 0.448 504.248 2.3
- Write traj. 128 4 2 0.037 41.230 0.2
- Update 128 4 5001 0.273 307.857 1.4
- Constraints 128 4 5001 0.961 1082.961 5.0
- Comm. energies 128 4 251 0.290 326.762 1.5
- Rest 0.166 186.669 0.9
- -----------------------------------------------------------------------------
- Total 9.631 21697.106 100.0
- -----------------------------------------------------------------------------
- (*) Note that with separate PME ranks, the walltime column actually sums to
- twice the total reported, but the cycle count total and % are correct.
- -----------------------------------------------------------------------------
- Breakdown of PME mesh computation
- -----------------------------------------------------------------------------
- PME redist. X/F 128 4 10002 1.890 2128.461 9.8
- PME spread/gather 128 4 10002 1.706 1921.464 8.9
- PME 3D-FFT 128 4 10002 0.686 772.746 3.6
- PME 3D-FFT Comm. 128 4 20004 2.752 3099.967 14.3
- PME solve Elec 128 4 5001 0.038 42.752 0.2
- -----------------------------------------------------------------------------
- Core t (s) Wall t (s) (%)
- Time: 9593.355 9.631 99608.0
- (ns/day) (hour/ns)
- Performance: 89.727 0.267
- Finished mdrun on rank 0 Thu Apr 30 12:24:40 2015
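As a rough cross-check of the Performance line, the throughput can be recomputed from values already in this log. The sketch assumes the timed window covers the 5001 steps counted after the counter reset at step 5000 (the call count reported for per-step rows such as Force), and takes dt = 0.002 ps and Wall t = 9.631 s from the log. Variable names are illustrative, and a small mismatch is expected because the logged wall time is rounded.

```python
# Inputs copied from the log; names are illustrative.
dt_ps = 0.002        # "dt" in the input parameters, in picoseconds
steps_timed = 5001   # assumed number of steps in the timed window (see lead-in)
wall_s = 9.631       # "Wall t (s)" on the Time line

SECONDS_PER_DAY = 86400.0

# Simulated time covered by the timed window, converted from ps to ns.
simulated_ns = steps_timed * dt_ps / 1000.0

# Throughput in both units used by the Performance line.
ns_per_day = simulated_ns * SECONDS_PER_DAY / wall_s
hours_per_ns = 24.0 / ns_per_day

print(f"{ns_per_day:.2f} ns/day, {hours_per_ns:.3f} hour/ns")
```

The result agrees with the logged 89.727 ns/day and 0.267 hour/ns to within the rounding of the wall time.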