$ cat cublasinit.cu
#include <cublas_v2.h>

int main()
{
    cublasHandle_t handle;
    // Create the cuBLAS handle; the profiles below measure this initialization.
    cublasCreate(&handle);
    // Reset the device before exit so the profiler flushes its data.
    cudaDeviceReset();
    return 0;
}
$ nvcc -arch=sm_52 cublasinit.cu -g -o cublasinit -lcublas
$ nvprof --profile-api-trace runtime ./cublasinit
==20967== NVPROF is profiling process 20967, command: ./cublasinit
==20967== Profiling application: ./cublasinit
==20967== Profiling result:
            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:  100.00%  1.1520us         1  1.1520us  1.1520us  1.1520us  [CUDA memcpy HtoD]
      API calls:   73.39%  333.76ms         1  333.76ms  333.76ms  333.76ms  cudaFree
                   26.53%  120.65ms         1  120.65ms  120.65ms  120.65ms  cudaDeviceReset
                    0.07%  308.56us         3  102.85us  10.797us  196.58us  cudaMalloc
                    0.00%  15.254us         1  15.254us  15.254us  15.254us  cudaMemcpy
                    0.00%  9.9180us        16     619ns     427ns  2.0100us  cudaEventCreateWithFlags
                    0.00%  4.4720us        11     406ns     247ns  1.4270us  cudaDeviceGetAttribute
                    0.00%  1.2830us         1  1.2830us  1.2830us  1.2830us  cudaGetDevice
$ nvprof --profile-api-trace driver ./cublasinit
==22436== NVPROF is profiling process 22436, command: ./cublasinit
==22436== Profiling application: ./cublasinit
==22436== Profiling result:
            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:  100.00%  1.2480us         1  1.2480us  1.2480us  1.2480us  [CUDA memcpy HtoD]
      API calls:   62.06%  620.86us       185  3.3550us     124ns  161.93us  cuDeviceGetAttribute
                   29.29%  293.05us         2  146.53us  69.859us  223.19us  cuDeviceTotalMem
                    8.00%  80.030us         2  40.015us  31.885us  48.145us  cuDeviceGetName
                    0.34%  3.4310us         4     857ns     259ns  2.2730us  cuDeviceGetCount
                    0.19%  1.8850us         3     628ns     295ns  1.2760us  cuDeviceGet
                    0.06%     625ns         1     625ns     625ns     625ns  cuInit
                    0.05%     480ns         1     480ns     480ns     480ns  cuDriverGetVersion