mahmoodn

nsys_out_rnnt

Nov 3rd, 2022
&&&& RUNNING RNN-T_Harness # /work/./build/bin/harness_rnnt
I1103 19:58:43.096913 866 main_rnnt.cc:2903] Found 1 GPUs
[I] Starting creating QSL.
[I] Finished creating QSL.
[I] Starting creating SUT.
[I] Set to device 0
Dali pipeline creating..
Dali pipeline created
[I] Creating stream 0/1
[I] [TRT] [MemUsageChange] Init CUDA: CPU +531, GPU +0, now: CPU 966, GPU 2541 (MiB)
[I] [TRT] Loaded engine size: 81 MiB
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1239, GPU +348, now: CPU 2388, GPU 2891 (MiB)
[I] [TRT] [MemUsageChange] Init cuDNN: CPU +178, GPU +56, now: CPU 2566, GPU 2947 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +0, now: CPU 0, GPU 0 (MiB)
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1, GPU +8, now: CPU 2593, GPU 3007 (MiB)
[I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2593, GPU 3015 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +232, now: CPU 0, GPU 232 (MiB)
[I] Created RnntEncoder runner: encoder
[I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 2593, GPU 3249 (MiB)
[I] [TRT] Loaded engine size: 3 MiB
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2599, GPU 3257 (MiB)
[I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 2599, GPU 3267 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +0, now: CPU 0, GPU 232 (MiB)
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2599, GPU 3271 (MiB)
[I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2599, GPU 3279 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +1, now: CPU 0, GPU 233 (MiB)
[I] Created RnntDecoder runner: decoder
[I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 2600, GPU 3279 (MiB)
[I] [TRT] Loaded engine size: 1 MiB
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +1, now: CPU 0, GPU 234 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +0, now: CPU 0, GPU 234 (MiB)
[I] Created RnntJointFc1 runner: fc1_a
[I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 2600, GPU 3279 (MiB)
[I] [TRT] Loaded engine size: 0 MiB
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1, GPU +8, now: CPU 2601, GPU 3287 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +0, now: CPU 0, GPU 234 (MiB)
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1, GPU +8, now: CPU 2601, GPU 3287 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +0, now: CPU 0, GPU 234 (MiB)
[I] Created RnntJointFc1 runner: fc1_b
[I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 2600, GPU 3287 (MiB)
[I] [TRT] Loaded engine size: 0 MiB
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1, GPU +8, now: CPU 2601, GPU 3295 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +0, now: CPU 0, GPU 234 (MiB)
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1, GPU +8, now: CPU 2601, GPU 3295 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +0, now: CPU 0, GPU 234 (MiB)
[I] Created RnntJointBackend runner: joint_backend
[I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 2600, GPU 3295 (MiB)
[I] [TRT] Loaded engine size: 0 MiB
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2601, GPU 3303 (MiB)
[I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 2601, GPU 3313 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +0, now: CPU 0, GPU 234 (MiB)
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 2601, GPU 3305 (MiB)
[I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2601, GPU 3313 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +0, now: CPU 0, GPU 234 (MiB)
[I] Created RnntIsel runner: isel
[I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 2601, GPU 3313 (MiB)
[I] [TRT] Loaded engine size: 0 MiB
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +0, now: CPU 0, GPU 234 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +2, now: CPU 0, GPU 236 (MiB)
[I] Created RnntIgather runner: igather
[I] Instantiated RnntEngineContainer runner
cudaMemcpy blocking
cudaMemcpy blocking
[I] Instantiated RnntTensorContainer host memory
Stream::Stream sampleSize: 61440
Stream::Stream singleSampleSize: 480
Stream::Stream fullseqSampleSize: 61440
Stream::Stream mBatchSize: 16
[I] Finished creating SUT.
[I] Starting warming up SUT.
[I] Finished warming up SUT.
[I] Starting running actual test.
Generating '/tmp/nsys-report-ecc6.qdstrm'
[1/1] [========================100%] nsys_rnnt.nsys-rep
Generated:
/work/nsys_rnnt.nsys-rep