- (pytorch-nightly) jysohn@n1-standard-64-check-no-sharding-regression:/usr/share/torch-xla-nightly/pytorch/xla$ XLA_USE_BF16=1 python test/test_train_imagenet.py --datadir=/var/datasets/imagenet --num_workers=64 --batch_size=256 |& sudo tee sharding_logs.txt
- 2019-09-05 00:32:04.321948: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) CPU:0 -> /job:tpu_worker/replica:0/task:0/device:XLA_CPU:0
- 2019-09-05 00:32:04.321991: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:0 -> /job:tpu_worker/replica:0/task:0/device:TPU:0
- 2019-09-05 00:32:04.321997: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:1 -> /job:tpu_worker/replica:0/task:0/device:TPU:1
- 2019-09-05 00:32:04.322002: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:2 -> /job:tpu_worker/replica:0/task:0/device:TPU:2
- 2019-09-05 00:32:04.322006: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:3 -> /job:tpu_worker/replica:0/task:0/device:TPU:3
- 2019-09-05 00:32:04.322011: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:4 -> /job:tpu_worker/replica:0/task:0/device:TPU:4
- 2019-09-05 00:32:04.322016: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:5 -> /job:tpu_worker/replica:0/task:0/device:TPU:5
- 2019-09-05 00:32:04.322020: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:6 -> /job:tpu_worker/replica:0/task:0/device:TPU:6
- 2019-09-05 00:32:04.322024: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:7 -> /job:tpu_worker/replica:0/task:0/device:TPU:7
- 2019-09-05 00:32:04.322043: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:200] Worker grpc://10.7.7.2:8470 for /job:tpu_worker/replica:0/task:0
- 2019-09-05 00:32:04.322048: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:204] XRT default device: TPU:0
- 2019-09-05 00:32:04.322065: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:1082] Configuring TPU for worker tpu_worker:0 at grpc://10.7.7.2:8470
- 2019-09-05 00:32:07.748478: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:1093] TPU topology: mesh_shape: 2
- mesh_shape: 2
- mesh_shape: 2
- num_tasks: 1
- num_tpu_devices_per_task: 8
- device_coordinates: 0
- device_coordinates: 0
- device_coordinates: 0
- device_coordinates: 0
- device_coordinates: 0
- device_coordinates: 1
- device_coordinates: 0
- device_coordinates: 1
- device_coordinates: 0
- device_coordinates: 0
- device_coordinates: 1
- device_coordinates: 1
- device_coordinates: 1
- device_coordinates: 0
- device_coordinates: 0
- device_coordinates: 1
- device_coordinates: 0
- device_coordinates: 1
- device_coordinates: 1
- device_coordinates: 1
- device_coordinates: 0
- device_coordinates: 1
- device_coordinates: 1
- device_coordinates: 1
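The topology proto above reports a 2x2x2 device mesh for a single task. As a quick consistency check (plain Python, not part of the log output), the mesh dimensions should multiply out to the reported per-task device count:

```python
from math import prod

# Values transcribed from the TPU topology proto in the log above.
mesh_shape = [2, 2, 2]
num_tpu_devices_per_task = 8

# A 2x2x2 mesh addresses 8 device coordinates, one per TPU core.
assert prod(mesh_shape) == num_tpu_devices_per_task
print(prod(mesh_shape))  # 8
```

This matches the eight `device_coordinates` triples listed above, one (x, y, z) position per core.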
- 2019-09-05 00:32:13.139790: I torch_xla/csrc/tensor_util.cpp:27] Using BF16 data type for floating point values
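Because `XLA_USE_BF16=1` stores floats as bfloat16 (8 exponent bits, 7 mantissa bits), values near 7.0 are quantized to multiples of 1/32, which is why every printed loss below lands on a multiple of 0.03125. A minimal sketch of that rounding in plain Python (simulating bfloat16 by round-to-nearest-even on the top 16 bits of the float32 encoding; this is an illustration, not the XLA implementation):

```python
import struct

def to_bf16(x: float) -> float:
    """Round a Python float to the nearest bfloat16 value."""
    # Reinterpret the float32 bit pattern as an unsigned integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    # Round-to-nearest-even on the low 16 bits, then truncate them.
    lsb = (bits >> 16) & 1
    rounded = (bits + 0x7FFF + lsb) & 0xFFFF0000
    return struct.unpack(">f", struct.pack(">I", rounded))[0]

print(to_bf16(7.123))  # 7.125 -- losses near 7.0 snap to multiples of 1/32
```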
- ==> Preparing data..
- [xla:1](0) Loss=7.12500 Rate=11.66
- [xla:7](0) Loss=7.09375 Rate=11.62
- [xla:8](0) Loss=7.03125 Rate=11.62
- [xla:2](0) Loss=7.09375 Rate=11.49
- [xla:5](0) Loss=7.15625 Rate=11.42
- [xla:6](0) Loss=7.15625 Rate=11.33
- [xla:3](0) Loss=7.12500 Rate=11.30
- [xla:4](0) Loss=7.09375 Rate=11.29
- [xla:3](20) Loss=7.34375 Rate=102.42
- [xla:4](20) Loss=7.37500 Rate=102.60
- [xla:8](20) Loss=7.18750 Rate=100.71
- [xla:2](20) Loss=7.09375 Rate=101.11
- [xla:1](20) Loss=7.40625 Rate=99.81
- [xla:6](20) Loss=7.50000 Rate=102.03
- [xla:7](20) Loss=7.34375 Rate=100.46
- [xla:5](20) Loss=7.21875 Rate=101.50
- [xla:3](40) Loss=7.15625 Rate=237.17
- [xla:2](40) Loss=7.50000 Rate=237.71
- [xla:5](40) Loss=7.00000 Rate=237.44
- [xla:6](40) Loss=7.09375 Rate=237.64
- [xla:8](40) Loss=6.93750 Rate=237.05
- [xla:7](40) Loss=7.00000 Rate=237.01
- [xla:4](40) Loss=7.00000 Rate=236.74
- [xla:1](40) Loss=6.96875 Rate=236.29
- [xla:6](60) Loss=6.93750 Rate=284.90
- [xla:3](60) Loss=7.03125 Rate=283.82
- [xla:2](60) Loss=6.87500 Rate=283.94
- [xla:1](60) Loss=6.87500 Rate=284.16
- [xla:8](60) Loss=6.93750 Rate=283.52
- [xla:4](60) Loss=7.00000 Rate=283.78
- [xla:5](60) Loss=6.90625 Rate=283.67
- [xla:7](60) Loss=6.90625 Rate=283.50
- [xla:1](80) Loss=6.90625 Rate=309.62
- [xla:2](80) Loss=6.90625 Rate=309.09
- [xla:5](80) Loss=6.87500 Rate=309.49
- [xla:6](80) Loss=6.90625 Rate=308.72
- [xla:3](80) Loss=6.90625 Rate=308.31
- [xla:4](80) Loss=6.90625 Rate=308.94
- [xla:7](80) Loss=6.90625 Rate=308.76
- [xla:8](80) Loss=6.90625 Rate=308.72
- [xla:7](100) Loss=6.84375 Rate=310.63
- [xla:5](100) Loss=6.87500 Rate=310.10
- [xla:2](100) Loss=6.90625 Rate=309.82
- [xla:6](100) Loss=6.87500 Rate=309.32
- [xla:1](100) Loss=6.87500 Rate=309.13
- [xla:3](100) Loss=6.90625 Rate=309.48
- [xla:4](100) Loss=6.87500 Rate=309.83
- [xla:8](100) Loss=6.87500 Rate=309.77
- [xla:1](120) Loss=6.81250 Rate=309.53
- [xla:8](120) Loss=6.87500 Rate=309.70
- [xla:4](120) Loss=6.81250 Rate=309.59
- [xla:2](120) Loss=6.87500 Rate=308.64
- [xla:5](120) Loss=6.78125 Rate=308.65
- [xla:3](120) Loss=6.81250 Rate=308.97
- [xla:7](120) Loss=6.78125 Rate=308.63
- [xla:6](120) Loss=6.87500 Rate=308.65
- [xla:5](140) Loss=6.75000 Rate=311.59
- [xla:1](140) Loss=6.78125 Rate=311.10
- [xla:8](140) Loss=6.75000 Rate=311.21
- [xla:4](140) Loss=6.78125 Rate=311.22
- [xla:3](140) Loss=6.75000 Rate=311.43
- [xla:6](140) Loss=6.81250 Rate=311.47
- [xla:7](140) Loss=6.71875 Rate=310.95
- [xla:2](140) Loss=6.71875 Rate=310.81
- [xla:2](160) Loss=6.71875 Rate=309.23
- [xla:5](160) Loss=6.71875 Rate=308.86
- [xla:1](160) Loss=6.68750 Rate=308.14
- [xla:4](160) Loss=6.78125 Rate=308.26
- [xla:3](160) Loss=6.71875 Rate=308.36
- [xla:8](160) Loss=6.75000 Rate=308.24
- [xla:7](160) Loss=6.75000 Rate=308.45
- [xla:6](160) Loss=6.68750 Rate=308.32
- [xla:5](180) Loss=6.65625 Rate=307.43
- [xla:8](180) Loss=6.65625 Rate=307.89
- [xla:4](180) Loss=6.78125 Rate=307.66
- [xla:3](180) Loss=6.71875 Rate=307.64
- [xla:2](180) Loss=6.65625 Rate=307.27
- [xla:1](180) Loss=6.75000 Rate=307.22
- [xla:6](180) Loss=6.65625 Rate=307.41
- [xla:7](180) Loss=6.65625 Rate=307.45
- [xla:7](200) Loss=6.65625 Rate=308.01
- [xla:5](200) Loss=6.56250 Rate=306.18
- [xla:1](200) Loss=6.59375 Rate=306.58
- [xla:2](200) Loss=6.65625 Rate=306.16
- [xla:6](200) Loss=6.62500 Rate=306.47
- [xla:3](200) Loss=6.62500 Rate=306.22
- [xla:4](200) Loss=6.53125 Rate=305.86
- [xla:8](200) Loss=6.65625 Rate=305.71
- [xla:7](220) Loss=6.56250 Rate=312.68
- [xla:3](220) Loss=6.56250 Rate=313.61
- [xla:6](220) Loss=6.53125 Rate=313.53
- [xla:4](220) Loss=6.56250 Rate=313.63
- [xla:1](220) Loss=6.53125 Rate=313.27
- [xla:2](220) Loss=6.43750 Rate=313.10
- [xla:8](220) Loss=6.56250 Rate=313.02
- [xla:5](220) Loss=6.59375 Rate=312.48
- [xla:4](240) Loss=6.50000 Rate=306.02
- [xla:1](240) Loss=6.46875 Rate=305.81
- [xla:3](240) Loss=6.50000 Rate=305.75
- [xla:8](240) Loss=6.53125 Rate=305.89
- [xla:7](240) Loss=6.46875 Rate=305.04
- [xla:2](240) Loss=6.50000 Rate=305.39
- [xla:5](240) Loss=6.50000 Rate=305.34
- [xla:6](240) Loss=6.53125 Rate=305.21
- [xla:2](260) Loss=6.46875 Rate=309.30
- [xla:3](260) Loss=6.53125 Rate=308.64
- [xla:1](260) Loss=6.46875 Rate=308.65
- [xla:8](260) Loss=6.46875 Rate=308.96
- [xla:5](260) Loss=6.46875 Rate=309.10
- [xla:7](260) Loss=6.50000 Rate=308.58
- [xla:4](260) Loss=6.53125 Rate=308.10
- [xla:6](260) Loss=6.37500 Rate=308.55
- [xla:2](280) Loss=6.37500 Rate=314.96
- [xla:6](280) Loss=6.28125 Rate=315.49
- [xla:7](280) Loss=6.40625 Rate=303.91
- [xla:5](280) Loss=6.34375 Rate=303.57
- [xla:1](280) Loss=6.37500 Rate=302.94
- [xla:8](280) Loss=6.37500 Rate=303.07
- [xla:4](280) Loss=6.34375 Rate=303.17
- [xla:3](280) Loss=6.43750 Rate=302.72
- [xla:6](300) Loss=6.37500 Rate=301.97
- [xla:8](300) Loss=6.28125 Rate=308.08
- [xla:5](300) Loss=6.31250 Rate=307.81
- [xla:1](300) Loss=6.28125 Rate=307.77
- [xla:7](300) Loss=6.28125 Rate=307.03
- [xla:2](300) Loss=6.28125 Rate=300.87
- [xla:4](300) Loss=6.28125 Rate=307.54
- [xla:3](300) Loss=6.37500 Rate=307.25
- [xla:1](320) Loss=6.34375 Rate=302.78
- [xla:3](320) Loss=6.28125 Rate=303.11
- [xla:6](320) Loss=6.28125 Rate=299.73
- [xla:8](320) Loss=6.12500 Rate=302.46
- [xla:4](320) Loss=6.28125 Rate=302.79
- [xla:7](320) Loss=6.09375 Rate=301.87
- [xla:2](320) Loss=6.25000 Rate=299.52
- [xla:5](320) Loss=6.21875 Rate=301.76
- [xla:3](340) Loss=6.25000 Rate=306.19
- [xla:1](340) Loss=6.25000 Rate=305.95
- [xla:4](340) Loss=6.21875 Rate=306.20
- [xla:7](340) Loss=6.15625 Rate=306.40
- [xla:8](340) Loss=6.21875 Rate=305.60
- [xla:5](340) Loss=6.21875 Rate=305.82
- [xla:6](340) Loss=6.03125 Rate=304.21
- [xla:2](340) Loss=6.25000 Rate=304.85
- [xla:5](360) Loss=6.06250 Rate=313.22
- [xla:1](360) Loss=6.12500 Rate=312.44
- [xla:4](360) Loss=6.21875 Rate=312.25
- [xla:2](360) Loss=6.00000 Rate=312.01
- [xla:8](360) Loss=6.25000 Rate=312.14
- [xla:7](360) Loss=6.03125 Rate=311.98
- [xla:3](360) Loss=6.12500 Rate=311.83
- [xla:6](360) Loss=6.12500 Rate=311.68
- [xla:4](380) Loss=6.12500 Rate=309.61
- [xla:3](380) Loss=6.09375 Rate=309.78
- [xla:7](380) Loss=6.31250 Rate=309.78
- [xla:2](380) Loss=6.03125 Rate=309.71
- [xla:1](380) Loss=6.09375 Rate=309.27
- [xla:5](380) Loss=6.03125 Rate=309.02
- [xla:6](380) Loss=6.00000 Rate=309.25
- [xla:8](380) Loss=6.18750 Rate=309.20
- [xla:5](400) Loss=6.06250 Rate=309.15
- [xla:4](400) Loss=6.06250 Rate=308.55
- [xla:3](400) Loss=6.09375 Rate=308.64
- [xla:8](400) Loss=5.93750 Rate=309.08
- [xla:7](400) Loss=5.96875 Rate=308.66
- [xla:1](400) Loss=6.00000 Rate=308.51
- [xla:2](400) Loss=6.00000 Rate=308.68
- [xla:6](400) Loss=6.03125 Rate=308.89
- [xla:8](420) Loss=6.00000 Rate=301.12
- [xla:5](420) Loss=5.93750 Rate=300.56
- [xla:2](420) Loss=6.09375 Rate=300.70
- [xla:6](420) Loss=5.96875 Rate=300.61
- [xla:3](420) Loss=6.00000 Rate=300.48
- [xla:1](420) Loss=5.93750 Rate=299.95
- [xla:7](420) Loss=5.96875 Rate=299.99
- [xla:4](420) Loss=5.90625 Rate=299.88
- [xla:4](440) Loss=5.93750 Rate=308.50
- [xla:2](440) Loss=5.90625 Rate=307.96
- [xla:7](440) Loss=5.90625 Rate=308.38
- [xla:8](440) Loss=5.93750 Rate=307.73
- [xla:5](440) Loss=5.93750 Rate=307.79
- [xla:6](440) Loss=5.78125 Rate=308.01
- [xla:1](440) Loss=5.84375 Rate=308.30
- [xla:3](440) Loss=5.78125 Rate=307.94
- [xla:1](460) Loss=5.81250 Rate=309.51
- [xla:2](460) Loss=6.03125 Rate=309.09
- [xla:7](460) Loss=5.84375 Rate=309.21
- [xla:8](460) Loss=5.87500 Rate=308.95
- [xla:5](460) Loss=5.81250 Rate=308.71
- [xla:6](460) Loss=6.03125 Rate=308.52
- [xla:4](460) Loss=5.81250 Rate=308.55
- [xla:3](460) Loss=5.78125 Rate=308.43
- [xla:2](480) Loss=5.87500 Rate=312.67
- [xla:8](480) Loss=5.96875 Rate=312.72
- [xla:6](480) Loss=5.84375 Rate=312.71
- [xla:7](480) Loss=5.78125 Rate=312.23
- [xla:1](480) Loss=5.71875 Rate=312.05
- [xla:4](480) Loss=5.93750 Rate=312.55
- [xla:3](480) Loss=5.81250 Rate=312.02
- [xla:5](480) Loss=5.87500 Rate=311.76
- [xla:5](500) Loss=5.71875 Rate=314.64
- [xla:4](500) Loss=5.78125 Rate=314.22
- [xla:6](500) Loss=5.75000 Rate=314.06
- [xla:8](500) Loss=5.84375 Rate=313.42
- [xla:1](500) Loss=5.71875 Rate=313.64
- [xla:7](500) Loss=5.68750 Rate=313.69
- [xla:2](500) Loss=5.71875 Rate=313.26
- [xla:3](500) Loss=5.62500 Rate=314.19
- [xla:1](520) Loss=5.71875 Rate=312.99
- [xla:8](520) Loss=5.87500 Rate=312.55
- [xla:3](520) Loss=5.62500 Rate=312.94
- [xla:5](520) Loss=5.62500 Rate=312.39
- [xla:7](520) Loss=5.71875 Rate=312.52
- [xla:4](520) Loss=5.59375 Rate=312.38
- [xla:2](520) Loss=5.65625 Rate=312.17
- [xla:6](520) Loss=5.75000 Rate=310.40
- [xla:6](540) Loss=5.59375 Rate=308.39
- [xla:3](540) Loss=5.75000 Rate=307.31
- /anaconda3/envs/pytorch-nightly/lib/python3.6/site-packages/PIL/TiffImagePlugin.py:804: UserWarning: Corrupt EXIF data. Expecting to read 4 bytes but only got 0.
-   warnings.warn(str(msg))
- [xla:4](540) Loss=5.56250 Rate=307.27
- [xla:2](540) Loss=5.68750 Rate=307.36
- [xla:5](540) Loss=5.56250 Rate=307.15
- [xla:7](540) Loss=5.65625 Rate=307.11
- [xla:8](540) Loss=5.75000 Rate=306.51
- [xla:1](540) Loss=5.50000 Rate=306.38
- [xla:2](560) Loss=5.65625 Rate=311.17
- [xla:4](560) Loss=5.65625 Rate=310.55
- [xla:8](560) Loss=5.53125 Rate=310.86
- [xla:1](560) Loss=5.71875 Rate=310.83
- [xla:6](560) Loss=5.59375 Rate=310.87
- [xla:7](560) Loss=5.65625 Rate=310.60
- [xla:5](560) Loss=5.50000 Rate=310.02
- [xla:3](560) Loss=5.46875 Rate=309.95
- [xla:2](580) Loss=5.53125 Rate=317.80
- [xla:1](580) Loss=5.43750 Rate=317.97
- [xla:8](580) Loss=5.59375 Rate=317.67
- [xla:7](580) Loss=5.40625 Rate=317.64
- [xla:3](580) Loss=5.37500 Rate=317.72
- [xla:5](580) Loss=5.53125 Rate=317.74
- [xla:6](580) Loss=5.65625 Rate=317.53
- [xla:4](580) Loss=5.43750 Rate=317.27
- [xla:2](600) Loss=5.37500 Rate=320.07
- [xla:5](600) Loss=5.31250 Rate=319.24
- [xla:4](600) Loss=5.59375 Rate=317.05
- [xla:8](600) Loss=5.31250 Rate=316.79
- [xla:6](600) Loss=5.50000 Rate=316.96
- [xla:7](600) Loss=5.56250 Rate=316.76
- [xla:3](600) Loss=5.59375 Rate=316.94
- [xla:1](600) Loss=5.59375 Rate=316.61
- [xla:7](620) Loss=5.50000 Rate=356.96
- [xla:6](620) Loss=5.25000 Rate=356.98
- [xla:3](620) Loss=5.34375 Rate=356.99
- [xla:1](620) Loss=5.37500 Rate=356.74
- [xla:5](620) Loss=5.40625 Rate=354.51
- [xla:4](620) Loss=5.31250 Rate=356.54
- [xla:2](620) Loss=5.34375 Rate=351.73
- [xla:8](620) Loss=5.50000 Rate=355.80
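By step 620 the per-core rate has ramped from ~11 images/s (first step, including compilation) to ~350+. Assuming the printed `Rate` is per-core images/s, a rough aggregate-throughput estimate from the final step's values (a back-of-the-envelope check, not something the script prints):

```python
# Per-core rates at step 620, transcribed from the log above
# (cores xla:7, xla:6, xla:3, xla:1, xla:5, xla:4, xla:2, xla:8).
rates = [356.96, 356.98, 356.99, 356.74, 354.51, 356.54, 351.73, 355.80]

aggregate = sum(rates)
print(f"approx. aggregate throughput: {aggregate:.0f} images/s")  # ~2846
```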
- [xla:7] Accuracy=6.48%
- [xla:8] Accuracy=5.45%
- [xla:1] Accuracy=5.39%
- [xla:5] Accuracy=6.33%
- [xla:6] Accuracy=4.48%
- [xla:2] Accuracy=4.28%
- [xla:4] Accuracy=5.40%
- [xla:3] Accuracy=5.03%
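Each core evaluates on its own data shard, so accuracy is reported per core. Averaging them gives a single epoch-1 figure (a quick sanity check done here, not printed by the script; ~5% top-1 is unremarkable after one epoch of ImageNet from scratch):

```python
# Per-core top-1 accuracies transcribed from the end of the log above.
acc = {"xla:7": 6.48, "xla:8": 5.45, "xla:1": 5.39, "xla:5": 6.33,
       "xla:6": 4.48, "xla:2": 4.28, "xla:4": 5.40, "xla:3": 5.03}

mean_acc = sum(acc.values()) / len(acc)
print(f"mean accuracy across {len(acc)} cores: {mean_acc:.3f}%")  # 5.355%
```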