Deep_love

GPU_RAM_Error

Jan 27th, 2018
2018-01-27 14:53:01.837268: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\bfc_allocator.cc:679] 3 Chunks of size 981504 totalling 2.81MiB
2018-01-27 14:53:01.837285: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\bfc_allocator.cc:679] 6 Chunks of size 1048576 totalling 6.00MiB
2018-01-27 14:53:01.837302: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\bfc_allocator.cc:679] 1 Chunks of size 1179648 totalling 1.13MiB
2018-01-27 14:53:01.837318: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\bfc_allocator.cc:679] 1 Chunks of size 2097152 totalling 2.00MiB
2018-01-27 14:53:01.837335: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\bfc_allocator.cc:679] 1 Chunks of size 4194304 totalling 4.00MiB
2018-01-27 14:53:01.837351: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\bfc_allocator.cc:679] 1 Chunks of size 6539264 totalling 6.24MiB
2018-01-27 14:53:01.837368: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\bfc_allocator.cc:679] 1 Chunks of size 6755527168 totalling 6.29GiB
2018-01-27 14:53:01.837384: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\bfc_allocator.cc:683] Sum Total of in-use chunks: 6.32GiB
2018-01-27 14:53:01.837403: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\bfc_allocator.cc:685] Stats:
Limit:                  6787871540
InUse:                  6787868928
MaxInUse:               6787871488
NumAllocs:                   10818
MaxAllocSize:           6755527168

2018-01-27 14:53:01.837496: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\bfc_allocator.cc:277] *******************************************************************xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
2018-01-27 14:53:01.837520: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\framework\op_kernel.cc:1192] Resource exhausted: OOM when allocating tensor with shape[3548,3275,3]
Traceback (most recent call last):
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1323, in _do_call
    return fn(*args)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1302, in _run_fn
    status, run_metadata)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
         [[Node: FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_2_Conv2d_2_3x3_s2_512/BatchNorm/gamma/read/_6427 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_15470_FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_2_Conv2d_2_3x3_s2_512/BatchNorm/gamma/read", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
         [[Node: Loss/strided_slice_23/_5711 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_9593_Loss/strided_slice_23", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\training\supervisor.py", line 954, in managed_session
    yield sess
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\contrib\slim\python\slim\learning.py", line 763, in train
    sess, train_op, global_step, train_step_kwargs)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\contrib\slim\python\slim\learning.py", line 487, in train_step
    run_metadata=run_metadata)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\client\session.py", line 889, in run
    run_metadata_ptr)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1317, in _do_run
    options, run_metadata)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
         [[Node: FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_2_Conv2d_2_3x3_s2_512/BatchNorm/gamma/read/_6427 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_15470_FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_2_Conv2d_2_3x3_s2_512/BatchNorm/gamma/read", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
         [[Node: Loss/strided_slice_23/_5711 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_9593_Loss/strided_slice_23", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 198, in <module>
    tf.app.run()
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train.py", line 194, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "C:\Users\ITML.LAB\models\object_detection\trainer.py", line 296, in train
    saver=saver)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\contrib\slim\python\slim\learning.py", line 775, in train
    sv.stop(threads, close_summary_writer=True)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\training\supervisor.py", line 964, in managed_session
    self.stop(close_summary_writer=close_summary_writer)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\training\supervisor.py", line 792, in stop
    stop_grace_period_secs=self._stop_grace_secs)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\training\coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\six.py", line 693, in reraise
    raise value
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\training\queue_runner_impl.py", line 238, in _run
    enqueue_callable()
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1231, in _single_operation_run
    target_list_as_strings, status, None)
  File "C:\Users\ITML.LAB\AppData\Local\conda\conda\envs\Python35\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[32,1,4802,4802,3]
         [[Node: batch = QueueDequeueManyV2[component_types=[DT_STRING, DT_INT32, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32, DT_INT64, DT_INT32, DT_INT64, DT_INT32, DT_INT64, DT_INT32, DT_BOOL, DT_INT32, DT_BOOL, DT_INT32, DT_FLOAT, DT_INT32, DT_STRING, DT_INT32, DT_STRING, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/padding_fifo_queue, batch/n)]]