==7440== NVPROF is profiling process 7440, command: ../../../quick.cuda taxol.in
==7440== Profiling application: ../../../quick.cuda taxol.in
==7440== Profiling result:
            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   83.30%  353.084s         1  353.084s  353.084s  353.084s  getGrad_kernel(void)
                   16.58%  70.2794s        21  3.34664s  1.39236s  5.67828s  get2e_kernel(void)
                    0.09%  361.64ms       728  496.76us  1.2800us  4.1991ms  [CUDA memcpy HtoD]
                    0.02%  84.772ms       211  401.76us  2.9120us  1.0472ms  [CUDA memcpy DtoH]
                    0.01%  39.862ms       189  210.91us  209.92us  220.48us  volta_dgemm_64x64_nn
                    0.00%  263.84us       189  1.3950us  1.2800us  2.6240us  [CUDA memset]
      API calls:   99.47%  423.364s        22  19.2438s  1.39236s  353.084s  cudaEventSynchronize
                    0.16%  669.77ms       728  920.01us  4.0230us  421.07ms  cudaMalloc
                    0.16%  660.87ms       915  722.26us  5.0660us  4.3660ms  cudaMemcpy
                    0.09%  402.42ms         1  402.42ms  402.42ms  402.42ms  cudaThreadSynchronize
                    0.06%  255.40ms         1  255.40ms  255.40ms  255.40ms  cudaThreadExit
                    0.06%  234.53ms      1067  219.80us     342ns  2.4427ms  cudaFree
                    0.01%  24.617ms       211  116.67us  15.637us  13.966ms  cudaLaunch
                    0.00%  2.7636ms         2  1.3818ms  1.3393ms  1.4244ms  cudaGetDeviceProperties
                    0.00%  2.7396ms         1  2.7396ms  2.7396ms  2.7396ms  cudaDeviceSetLimit
                    0.00%  2.7212ms       185  14.709us     113ns  564.12us  cuDeviceGetAttribute
                    0.00%  1.9483ms       189  10.308us  7.4570us  18.646us  cudaMemsetAsync
                    0.00%  1.4785ms         2  739.26us  737.73us  740.78us  cuDeviceTotalMem
                    0.00%  1.3237ms        24  55.152us  12.818us  108.12us  cudaMemcpyToSymbol
                    0.00%  708.14us       189  3.7460us  2.5480us  8.3350us  cudaEventQuery
                    0.00%  687.07us      4536     151ns      93ns  9.8640us  cudaSetupArgument
                    0.00%  519.70us       233  2.2300us  1.2740us  17.039us  cudaEventRecord
                    0.00%  244.84us         2  122.42us  112.59us  132.24us  cuDeviceGetName
                    0.00%  156.37us       211     741ns     334ns  5.1790us  cudaConfigureCall
                    0.00%  128.36us        22  5.8340us  3.3470us  11.274us  cudaEventElapsedTime
                    0.00%  122.01us        44  2.7720us     610ns  11.444us  cudaEventCreate
                    0.00%  72.492us        44  1.6470us     511ns  7.6340us  cudaEventDestroy
                    0.00%  49.681us       211     235ns     140ns  3.6230us  cudaGetLastError
                    0.00%  38.931us        32  1.2160us     633ns  10.160us  cudaFuncSetAttribute
                    0.00%  11.442us        16     715ns     408ns  3.0570us  cudaEventCreateWithFlags
                    0.00%  10.751us         1  10.751us  10.751us  10.751us  cudaSetDevice
                    0.00%  7.2770us        11     661ns     236ns  3.8930us  cudaDeviceGetAttribute
                    0.00%  6.2170us         1  6.2170us  6.2170us  6.2170us  cudaDeviceSetCacheConfig
                    0.00%  5.4800us         4  1.3700us     397ns  3.1600us  cudaDeviceGetLimit
                    0.00%  2.8610us         1  2.8610us  2.8610us  2.8610us  cudaGetDevice
                    0.00%  2.4780us         2  1.2390us     551ns  1.9270us  cudaGetDeviceCount
                    0.00%  2.1280us         3     709ns     222ns  1.6090us  cuDeviceGet
                    0.00%  1.8730us         4     468ns     140ns  1.2090us  cuDeviceGetCount
                    0.00%  1.5600us         1  1.5600us  1.5600us  1.5600us  cuInit
                    0.00%     928ns         1     928ns     928ns     928ns  cuDriverGetVersion