Untitled

a guest
Nov 20th, 2017
os: Ubuntu 16.04
gpu: nvidia geforce 1080Ti & 1060
tensorflow version: 1.3.0
training model: faster_rcnn_resnet101_coco (although I have tried others)
Classes: 1

INFO:tensorflow:global step 363: loss = 1.4006 (0.294 sec/step)
INFO:tensorflow:Finished training! Saving model to disk.
Traceback (most recent call last):
  File "object_detection/train.py", line 163, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "object_detection/train.py", line 159, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/home/ucfadng/tensorflow/models/research/object_detection/trainer.py", line 332, in train
    saver=saver)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 767, in train
    sv.stop(threads, close_summary_writer=True)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 792, in stop
    stop_grace_period_secs=self._stop_grace_secs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 296, in stop_on_exception
    yield
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 494, in run
    self.run_loop()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 994, in run_loop
    self._sv.global_step])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Nan in summary histogram for: SecondStageFeatureExtractor/resnet_v1_101/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma_1
  [[Node: SecondStageFeatureExtractor/resnet_v1_101/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma_1 = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](SecondStageFeatureExtractor/resnet_v1_101/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma_1/tag, SecondStageFeatureExtractor/resnet_v1_101/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma/read)]]
  [[Node: Loss/RPNLoss/map/TensorArray_2/_1353 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_5239_Loss/RPNLoss/map/TensorArray_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Caused by op u'SecondStageFeatureExtractor/resnet_v1_101/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma_1', defined at:
  File "object_detection/train.py", line 163, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "object_detection/train.py", line 159, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/home/ucfadng/tensorflow/models/research/object_detection/trainer.py", line 295, in train
    global_summaries.add(tf.summary.histogram(model_var.op.name, model_var))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/summary/summary.py", line 192, in histogram
    tag=tag, values=values, name=scope)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_logging_ops.py", line 129, in _histogram_summary
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Nan in summary histogram for: SecondStageFeatureExtractor/resnet_v1_101/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma_1
  [[Node: SecondStageFeatureExtractor/resnet_v1_101/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma_1 = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](SecondStageFeatureExtractor/resnet_v1_101/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma_1/tag, SecondStageFeatureExtractor/resnet_v1_101/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma/read)]]
  [[Node: Loss/RPNLoss/map/TensorArray_2/_1353 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_5239_Loss/RPNLoss/map/TensorArray_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

item {
  id: 1
  name: 'rail'
}
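
The InvalidArgumentError in the log is raised by the HistogramSummary op because the variable it summarizes (the BatchNorm gamma) contains NaN. As a rough illustration only, and not part of the Object Detection API, the same kind of check could be run on values dumped to plain Python lists. The helper name `non_finite_entries` and the sample data are hypothetical; how the values get exported from a checkpoint is assumed and not shown.

```python
import math


def non_finite_entries(named_values):
    """Return the sorted names whose value lists contain NaN or +/-Inf.

    named_values: dict mapping a (hypothetical) variable name to a flat
    list of Python floats, e.g. weights exported from a checkpoint.
    """
    bad = []
    for name, values in named_values.items():
        # A single NaN or Inf is enough to flag the whole variable.
        if any(math.isnan(v) or math.isinf(v) for v in values):
            bad.append(name)
    return sorted(bad)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = {
        "BatchNorm/gamma": [1.0, float("nan"), 0.9],
        "BatchNorm/beta": [0.0, 0.1, -0.2],
    }
    print(non_finite_entries(sample))
```

A check like this only reports which variables have already diverged; it does not say why the NaN appeared.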