- lbraun@ceg01:/local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody$ cuda-gdb --args nbody_ocelot --benchmark
- NVIDIA (R) CUDA Debugger
- 5.0 release
- Portions Copyright (C) 2007-2012 NVIDIA Corporation
- GNU gdb (GDB) 7.2
- Copyright (C) 2010 Free Software Foundation, Inc.
- License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
- This is free software: you are free to change and redistribute it.
- There is NO WARRANTY, to the extent permitted by law. Type "show copying"
- and "show warranty" for details.
- This GDB was configured as "x86_64-unknown-linux-gnu".
- For bug reporting instructions, please see:
- <http://www.gnu.org/software/gdb/bugs/>...
- Reading symbols from /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/nbody_ocelot...(no debugging symbols found)...done.
- (cuda-gdb) run
- Starting program: /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/nbody_ocelot --benchmark
- [Thread debugging using libthread_db enabled]
- [New Thread 0x7fffec28e700 (LWP 28122)]
- Run "nbody -benchmark [-numbodies=<numBodies>]" to measure perfomance.
- -fullscreen (run n-body simulation in fullscreen mode)
- -fp64 (use double precision floating point values for simulation)
- -hostmem (stores simulation data in host memory)
- -benchmark (run benchmark to measure performance)
- -numbodies=<N> (number of bodies (>= 1) to run in simulation)
- -device=<d> (where d=0,1,2.... for the CUDA device to use)
- -numdevices=<i> (where i=(number of CUDA devices > 0) to use for simulation)
- -compare (compares simulation results running once on the default GPU and once on the CPU)
- -cpu (run n-body simulation on the CPU)
- -tipsy=<file.bin> (load a tipsy model file for simulation)
- > Windowed mode
- > Simulation data stored in video memory
- > Single precision floating point simulation
- > 1 Devices used for simulation
- [New Thread 0x7fffeba8d700 (LWP 28123)]
- [Thread 0x7fffeba8d700 (LWP 28123) exited]
- [New Thread 0x7fffeba8d700 (LWP 28124)]
- [Thread 0x7fffeba8d700 (LWP 28124) exited]
- [New Thread 0x7fffeba8d700 (LWP 28125)]
- [Thread 0x7fffeba8d700 (LWP 28125) exited]
- GPU Device 0: "Ocelot PTX Emulator" with compute capability 2.1
- [New Thread 0x7fffeba8d700 (LWP 28126)]
- [Thread 0x7fffeba8d700 (LWP 28126) exited]
- [New Thread 0x7fffeba8d700 (LWP 28127)]
- [Thread 0x7fffeba8d700 (LWP 28127) exited]
- > Compute 2.1 CUDA device: [Ocelot PTX Emulator]
- [New Thread 0x7fffeba8d700 (LWP 28128)]
- [Thread 0x7fffeba8d700 (LWP 28128) exited]
- [New Thread 0x7fffeba8d700 (LWP 28129)]
- all enabled
- (0.699931) X86TraceGenerator.cpp:771: New kernel launched
- (0.699972) X86TraceGenerator.cpp:772: compute version:2.0
- (0.699993) X86TraceGenerator.cpp:773: grid 4 x 1 x 1
- (0.700021) X86TraceGenerator.cpp:774: block 256 x 1 x 1
- (0.700041) X86TraceGenerator.cpp:775: number of warps per block:8
- (0.700067) X86TraceGenerator.cpp:776: number of total warps:32
- (0.700079) X86TraceGenerator.cpp:777: # threads per block : 256
- (0.700090) X86TraceGenerator.cpp:778: number of register per thread:0
- (0.700106) X86TraceGenerator.cpp:779: number of shared memory per thread:0
- (0.700119) X86TraceGenerator.cpp:817: max blocks per core : 6
- (0.704504) X86TraceGenerator.cpp:1078: mkdir -p /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/macsim_Trace/_Z15integrateBodiesIfLb0EEvPN4vec4IT_E4TypeES4_S4_jjffi_0/ (status 0)
- (0.704586) X86TraceGenerator.cpp:1079: errno is 2 message is No such file or directory
- (16.163317) X86TraceGenerator.cpp:771: New kernel launched
- (16.163355) X86TraceGenerator.cpp:772: compute version:2.0
- (16.163368) X86TraceGenerator.cpp:773: grid 4 x 1 x 1
- (16.163381) X86TraceGenerator.cpp:774: block 256 x 1 x 1
- (16.163393) X86TraceGenerator.cpp:775: number of warps per block:8
- (16.163404) X86TraceGenerator.cpp:776: number of total warps:32
- (16.163416) X86TraceGenerator.cpp:777: # threads per block : 256
- (16.163427) X86TraceGenerator.cpp:778: number of register per thread:0
- (16.163438) X86TraceGenerator.cpp:779: number of shared memory per thread:0
- (16.163458) X86TraceGenerator.cpp:817: max blocks per core : 6
- (16.167410) X86TraceGenerator.cpp:1078: mkdir -p /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/macsim_Trace/_Z15integrateBodiesIfLb0EEvPN4vec4IT_E4TypeES4_S4_jjffi_1/ (status 0)
- (16.167461) X86TraceGenerator.cpp:1079: errno is 0 message is Success
- (31.551816) X86TraceGenerator.cpp:771: New kernel launched
- (31.551859) X86TraceGenerator.cpp:772: compute version:2.0
- (31.551879) X86TraceGenerator.cpp:773: grid 4 x 1 x 1
- (31.551900) X86TraceGenerator.cpp:774: block 256 x 1 x 1
- (31.551928) X86TraceGenerator.cpp:775: number of warps per block:8
- (31.551953) X86TraceGenerator.cpp:776: number of total warps:32
- (31.551976) X86TraceGenerator.cpp:777: # threads per block : 256
- (31.551999) X86TraceGenerator.cpp:778: number of register per thread:0
- (31.552022) X86TraceGenerator.cpp:779: number of shared memory per thread:0
- (31.552047) X86TraceGenerator.cpp:817: max blocks per core : 6
- (31.556119) X86TraceGenerator.cpp:1078: mkdir -p /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/macsim_Trace/_Z15integrateBodiesIfLb0EEvPN4vec4IT_E4TypeES4_S4_jjffi_2/ (status 0)
- (31.556209) X86TraceGenerator.cpp:1079: errno is 0 message is Success
- (46.961537) X86TraceGenerator.cpp:771: New kernel launched
- (46.961581) X86TraceGenerator.cpp:772: compute version:2.0
- (46.961601) X86TraceGenerator.cpp:773: grid 4 x 1 x 1
- (46.961623) X86TraceGenerator.cpp:774: block 256 x 1 x 1
- (46.961649) X86TraceGenerator.cpp:775: number of warps per block:8
- (46.961675) X86TraceGenerator.cpp:776: number of total warps:32
- (46.961698) X86TraceGenerator.cpp:777: # threads per block : 256
- (46.961722) X86TraceGenerator.cpp:778: number of register per thread:0
- (46.961745) X86TraceGenerator.cpp:779: number of shared memory per thread:0
- (46.961770) X86TraceGenerator.cpp:817: max blocks per core : 6
- (46.965670) X86TraceGenerator.cpp:1078: mkdir -p /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/macsim_Trace/_Z15integrateBodiesIfLb0EEvPN4vec4IT_E4TypeES4_S4_jjffi_3/ (status 0)
- (46.965721) X86TraceGenerator.cpp:1079: errno is 0 message is Success
- (62.341825) X86TraceGenerator.cpp:771: New kernel launched
- (62.341866) X86TraceGenerator.cpp:772: compute version:2.0
- (62.341887) X86TraceGenerator.cpp:773: grid 4 x 1 x 1
- (62.341911) X86TraceGenerator.cpp:774: block 256 x 1 x 1
- (62.341935) X86TraceGenerator.cpp:775: number of warps per block:8
- (62.341960) X86TraceGenerator.cpp:776: number of total warps:32
- (62.341984) X86TraceGenerator.cpp:777: # threads per block : 256
- (62.342006) X86TraceGenerator.cpp:778: number of register per thread:0
- (62.342030) X86TraceGenerator.cpp:779: number of shared memory per thread:0
- (62.342054) X86TraceGenerator.cpp:817: max blocks per core : 6
- (62.345955) X86TraceGenerator.cpp:1078: mkdir -p /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/macsim_Trace/_Z15integrateBodiesIfLb0EEvPN4vec4IT_E4TypeES4_S4_jjffi_4/ (status 0)
- (62.346010) X86TraceGenerator.cpp:1079: errno is 0 message is Success
- (77.762645) X86TraceGenerator.cpp:771: New kernel launched
- (77.762684) X86TraceGenerator.cpp:772: compute version:2.0
- (77.762704) X86TraceGenerator.cpp:773: grid 4 x 1 x 1
- (77.762727) X86TraceGenerator.cpp:774: block 256 x 1 x 1
- (77.762752) X86TraceGenerator.cpp:775: number of warps per block:8
- (77.762774) X86TraceGenerator.cpp:776: number of total warps:32
- (77.762798) X86TraceGenerator.cpp:777: # threads per block : 256
- (77.762824) X86TraceGenerator.cpp:778: number of register per thread:0
- (77.762847) X86TraceGenerator.cpp:779: number of shared memory per thread:0
- (77.762874) X86TraceGenerator.cpp:817: max blocks per core : 6
- (77.765837) X86TraceGenerator.cpp:1078: mkdir -p /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/macsim_Trace/_Z15integrateBodiesIfLb0EEvPN4vec4IT_E4TypeES4_S4_jjffi_5/ (status 0)
- (77.765885) X86TraceGenerator.cpp:1079: errno is 0 message is Success
- (93.150533) X86TraceGenerator.cpp:771: New kernel launched
- (93.150571) X86TraceGenerator.cpp:772: compute version:2.0
- (93.150591) X86TraceGenerator.cpp:773: grid 4 x 1 x 1
- (93.150612) X86TraceGenerator.cpp:774: block 256 x 1 x 1
- (93.150638) X86TraceGenerator.cpp:775: number of warps per block:8
- (93.150662) X86TraceGenerator.cpp:776: number of total warps:32
- (93.150686) X86TraceGenerator.cpp:777: # threads per block : 256
- (93.150709) X86TraceGenerator.cpp:778: number of register per thread:0
- (93.150732) X86TraceGenerator.cpp:779: number of shared memory per thread:0
- (93.150757) X86TraceGenerator.cpp:817: max blocks per core : 6
- (93.153666) X86TraceGenerator.cpp:1078: mkdir -p /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/macsim_Trace/_Z15integrateBodiesIfLb0EEvPN4vec4IT_E4TypeES4_S4_jjffi_6/ (status 0)
- (93.153715) X86TraceGenerator.cpp:1079: errno is 0 message is Success
- (108.538777) X86TraceGenerator.cpp:771: New kernel launched
- (108.538831) X86TraceGenerator.cpp:772: compute version:2.0
- (108.538846) X86TraceGenerator.cpp:773: grid 4 x 1 x 1
- (108.538859) X86TraceGenerator.cpp:774: block 256 x 1 x 1
- (108.538871) X86TraceGenerator.cpp:775: number of warps per block:8
- (108.538883) X86TraceGenerator.cpp:776: number of total warps:32
- (108.538895) X86TraceGenerator.cpp:777: # threads per block : 256
- (108.538906) X86TraceGenerator.cpp:778: number of register per thread:0
- (108.538918) X86TraceGenerator.cpp:779: number of shared memory per thread:0
- (108.538931) X86TraceGenerator.cpp:817: max blocks per core : 6
- (108.541826) X86TraceGenerator.cpp:1078: mkdir -p /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/macsim_Trace/_Z15integrateBodiesIfLb0EEvPN4vec4IT_E4TypeES4_S4_jjffi_7/ (status 0)
- (108.541878) X86TraceGenerator.cpp:1079: errno is 0 message is Success
- (123.948039) X86TraceGenerator.cpp:771: New kernel launched
- (123.948076) X86TraceGenerator.cpp:772: compute version:2.0
- (123.948089) X86TraceGenerator.cpp:773: grid 4 x 1 x 1
- (123.948102) X86TraceGenerator.cpp:774: block 256 x 1 x 1
- (123.948113) X86TraceGenerator.cpp:775: number of warps per block:8
- (123.948125) X86TraceGenerator.cpp:776: number of total warps:32
- (123.948137) X86TraceGenerator.cpp:777: # threads per block : 256
- (123.948149) X86TraceGenerator.cpp:778: number of register per thread:0
- (123.948160) X86TraceGenerator.cpp:779: number of shared memory per thread:0
- (123.948179) X86TraceGenerator.cpp:817: max blocks per core : 6
- (123.951005) X86TraceGenerator.cpp:1078: mkdir -p /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/macsim_Trace/_Z15integrateBodiesIfLb0EEvPN4vec4IT_E4TypeES4_S4_jjffi_8/ (status 0)
- (123.951056) X86TraceGenerator.cpp:1079: errno is 0 message is Success
- (139.448141) X86TraceGenerator.cpp:771: New kernel launched
- (139.448179) X86TraceGenerator.cpp:772: compute version:2.0
- (139.448193) X86TraceGenerator.cpp:773: grid 4 x 1 x 1
- (139.448206) X86TraceGenerator.cpp:774: block 256 x 1 x 1
- (139.448217) X86TraceGenerator.cpp:775: number of warps per block:8
- (139.448229) X86TraceGenerator.cpp:776: number of total warps:32
- (139.448241) X86TraceGenerator.cpp:777: # threads per block : 256
- (139.448252) X86TraceGenerator.cpp:778: number of register per thread:0
- (139.448264) X86TraceGenerator.cpp:779: number of shared memory per thread:0
- (139.448277) X86TraceGenerator.cpp:817: max blocks per core : 6
- (139.452406) X86TraceGenerator.cpp:1078: mkdir -p /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/macsim_Trace/_Z15integrateBodiesIfLb0EEvPN4vec4IT_E4TypeES4_S4_jjffi_9/ (status 0)
- (139.452491) X86TraceGenerator.cpp:1079: errno is 0 message is Success
- (154.893865) X86TraceGenerator.cpp:771: New kernel launched
- (154.893908) X86TraceGenerator.cpp:772: compute version:2.0
- (154.893922) X86TraceGenerator.cpp:773: grid 4 x 1 x 1
- (154.893935) X86TraceGenerator.cpp:774: block 256 x 1 x 1
- (154.893947) X86TraceGenerator.cpp:775: number of warps per block:8
- (154.893959) X86TraceGenerator.cpp:776: number of total warps:32
- (154.893971) X86TraceGenerator.cpp:777: # threads per block : 256
- (154.893983) X86TraceGenerator.cpp:778: number of register per thread:0
- (154.893994) X86TraceGenerator.cpp:779: number of shared memory per thread:0
- (154.894014) X86TraceGenerator.cpp:817: max blocks per core : 6
- (154.898086) X86TraceGenerator.cpp:1078: mkdir -p /local/lbraun/CUDA_5.0_SKD/5_Simulations/nbody/macsim_Trace/_Z15integrateBodiesIfLb0EEvPN4vec4IT_E4TypeES4_S4_jjffi_10/ (status 0)
- (154.898160) X86TraceGenerator.cpp:1079: errno is 0 message is Success
- 1024 bodies, total time for 10 iterations: 154364.375 ms
- = 0.000 billion interactions per second
- = 0.001 single-precision GFLOP/s at 20 flops per interaction
- [Thread 0x7fffeba8d700 (LWP 28129) exited]
- [Thread 0x7fffec28e700 (LWP 28122) exited]
- Program received signal SIGSEGV, Segmentation fault.
- 0x00007ffff3a37213 in llvm::StringRef::operator[](unsigned long) const () at StringRef.h:192
- 192 return Data[Index];