Untitled

a guest
Sep 16th, 2019
(pytorch-nightly) jysohn@n1-standard-64-check-no-sharding-regression:/usr/share/torch-xla-nightly/pytorch/xla$ XLA_USE_BF16=1 python test/test_train_imagenet.py --datadir=/var/datasets/imagenet --num_workers=64 --batch_size=256 |& sudo tee sharding_logs.txt
2019-09-05 00:32:04.321948: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) CPU:0 -> /job:tpu_worker/replica:0/task:0/device:XLA_CPU:0
2019-09-05 00:32:04.321991: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:0 -> /job:tpu_worker/replica:0/task:0/device:TPU:0
2019-09-05 00:32:04.321997: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:1 -> /job:tpu_worker/replica:0/task:0/device:TPU:1
2019-09-05 00:32:04.322002: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:2 -> /job:tpu_worker/replica:0/task:0/device:TPU:2
2019-09-05 00:32:04.322006: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:3 -> /job:tpu_worker/replica:0/task:0/device:TPU:3
2019-09-05 00:32:04.322011: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:4 -> /job:tpu_worker/replica:0/task:0/device:TPU:4
2019-09-05 00:32:04.322016: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:5 -> /job:tpu_worker/replica:0/task:0/device:TPU:5
2019-09-05 00:32:04.322020: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:6 -> /job:tpu_worker/replica:0/task:0/device:TPU:6
2019-09-05 00:32:04.322024: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:7 -> /job:tpu_worker/replica:0/task:0/device:TPU:7
2019-09-05 00:32:04.322043: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:200] Worker grpc://10.7.7.2:8470 for /job:tpu_worker/replica:0/task:0
2019-09-05 00:32:04.322048: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:204] XRT default device: TPU:0
2019-09-05 00:32:04.322065: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:1082] Configuring TPU for worker tpu_worker:0 at grpc://10.7.7.2:8470
2019-09-05 00:32:07.748478: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:1093] TPU topology: mesh_shape: 2
mesh_shape: 2
mesh_shape: 2
num_tasks: 1
num_tpu_devices_per_task: 8
device_coordinates: 0
device_coordinates: 0
device_coordinates: 0
device_coordinates: 0
device_coordinates: 0
device_coordinates: 1
device_coordinates: 0
device_coordinates: 1
device_coordinates: 0
device_coordinates: 0
device_coordinates: 1
device_coordinates: 1
device_coordinates: 1
device_coordinates: 0
device_coordinates: 0
device_coordinates: 1
device_coordinates: 0
device_coordinates: 1
device_coordinates: 1
device_coordinates: 1
device_coordinates: 0
device_coordinates: 1
device_coordinates: 1
device_coordinates: 1
2019-09-05 00:32:13.139790: I torch_xla/csrc/tensor_util.cpp:27] Using BF16 data type for floating point values
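The topology proto above prints 24 flat device_coordinates values, which group into 8 (x, y, z) triples: one mesh position per TPU core of the 2x2x2 mesh (mesh_shape: 2, repeated three times). A minimal plain-Python sketch, with the values copied from the log, that regroups them:

```python
# device_coordinates values exactly as printed in the topology proto above
coords = [0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1,
          1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1]

# Each core occupies one (x, y, z) position in the 2x2x2 mesh,
# so the flat list regroups into triples of three coordinates.
mesh_positions = [tuple(coords[i:i + 3]) for i in range(0, len(coords), 3)]
print(len(mesh_positions))  # 8, matching num_tpu_devices_per_task: 8
print(mesh_positions[0], mesh_positions[-1])  # (0, 0, 0) (1, 1, 1)
```

The eight triples enumerate every corner of the 2x2x2 mesh, consistent with num_tpu_devices_per_task: 8.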
==> Preparing data..
[xla:1](0) Loss=7.12500 Rate=11.66
[xla:7](0) Loss=7.09375 Rate=11.62
[xla:8](0) Loss=7.03125 Rate=11.62
[xla:2](0) Loss=7.09375 Rate=11.49
[xla:5](0) Loss=7.15625 Rate=11.42
[xla:6](0) Loss=7.15625 Rate=11.33
[xla:3](0) Loss=7.12500 Rate=11.30
[xla:4](0) Loss=7.09375 Rate=11.29
[xla:3](20) Loss=7.34375 Rate=102.42
[xla:4](20) Loss=7.37500 Rate=102.60
[xla:8](20) Loss=7.18750 Rate=100.71
[xla:2](20) Loss=7.09375 Rate=101.11
[xla:1](20) Loss=7.40625 Rate=99.81
[xla:6](20) Loss=7.50000 Rate=102.03
[xla:7](20) Loss=7.34375 Rate=100.46
[xla:5](20) Loss=7.21875 Rate=101.50
[xla:3](40) Loss=7.15625 Rate=237.17
[xla:2](40) Loss=7.50000 Rate=237.71
[xla:5](40) Loss=7.00000 Rate=237.44
[xla:6](40) Loss=7.09375 Rate=237.64
[xla:8](40) Loss=6.93750 Rate=237.05
[xla:7](40) Loss=7.00000 Rate=237.01
[xla:4](40) Loss=7.00000 Rate=236.74
[xla:1](40) Loss=6.96875 Rate=236.29
[xla:6](60) Loss=6.93750 Rate=284.90
[xla:3](60) Loss=7.03125 Rate=283.82
[xla:2](60) Loss=6.87500 Rate=283.94
[xla:1](60) Loss=6.87500 Rate=284.16
[xla:8](60) Loss=6.93750 Rate=283.52
[xla:4](60) Loss=7.00000 Rate=283.78
[xla:5](60) Loss=6.90625 Rate=283.67
[xla:7](60) Loss=6.90625 Rate=283.50
[xla:1](80) Loss=6.90625 Rate=309.62
[xla:2](80) Loss=6.90625 Rate=309.09
[xla:5](80) Loss=6.87500 Rate=309.49
[xla:6](80) Loss=6.90625 Rate=308.72
[xla:3](80) Loss=6.90625 Rate=308.31
[xla:4](80) Loss=6.90625 Rate=308.94
[xla:7](80) Loss=6.90625 Rate=308.76
[xla:8](80) Loss=6.90625 Rate=308.72
[xla:7](100) Loss=6.84375 Rate=310.63
[xla:5](100) Loss=6.87500 Rate=310.10
[xla:2](100) Loss=6.90625 Rate=309.82
[xla:6](100) Loss=6.87500 Rate=309.32
[xla:1](100) Loss=6.87500 Rate=309.13
[xla:3](100) Loss=6.90625 Rate=309.48
[xla:4](100) Loss=6.87500 Rate=309.83
[xla:8](100) Loss=6.87500 Rate=309.77
[xla:1](120) Loss=6.81250 Rate=309.53
[xla:8](120) Loss=6.87500 Rate=309.70
[xla:4](120) Loss=6.81250 Rate=309.59
[xla:2](120) Loss=6.87500 Rate=308.64
[xla:5](120) Loss=6.78125 Rate=308.65
[xla:3](120) Loss=6.81250 Rate=308.97
[xla:7](120) Loss=6.78125 Rate=308.63
[xla:6](120) Loss=6.87500 Rate=308.65
[xla:5](140) Loss=6.75000 Rate=311.59
[xla:1](140) Loss=6.78125 Rate=311.10
[xla:8](140) Loss=6.75000 Rate=311.21
[xla:4](140) Loss=6.78125 Rate=311.22
[xla:3](140) Loss=6.75000 Rate=311.43
[xla:6](140) Loss=6.81250 Rate=311.47
[xla:7](140) Loss=6.71875 Rate=310.95
[xla:2](140) Loss=6.71875 Rate=310.81
[xla:2](160) Loss=6.71875 Rate=309.23
[xla:5](160) Loss=6.71875 Rate=308.86
[xla:1](160) Loss=6.68750 Rate=308.14
[xla:4](160) Loss=6.78125 Rate=308.26
[xla:3](160) Loss=6.71875 Rate=308.36
[xla:8](160) Loss=6.75000 Rate=308.24
[xla:7](160) Loss=6.75000 Rate=308.45
[xla:6](160) Loss=6.68750 Rate=308.32
[xla:5](180) Loss=6.65625 Rate=307.43
[xla:8](180) Loss=6.65625 Rate=307.89
[xla:4](180) Loss=6.78125 Rate=307.66
[xla:3](180) Loss=6.71875 Rate=307.64
[xla:2](180) Loss=6.65625 Rate=307.27
[xla:1](180) Loss=6.75000 Rate=307.22
[xla:6](180) Loss=6.65625 Rate=307.41
[xla:7](180) Loss=6.65625 Rate=307.45
[xla:7](200) Loss=6.65625 Rate=308.01
[xla:5](200) Loss=6.56250 Rate=306.18
[xla:1](200) Loss=6.59375 Rate=306.58
[xla:2](200) Loss=6.65625 Rate=306.16
[xla:6](200) Loss=6.62500 Rate=306.47
[xla:3](200) Loss=6.62500 Rate=306.22
[xla:4](200) Loss=6.53125 Rate=305.86
[xla:8](200) Loss=6.65625 Rate=305.71
[xla:7](220) Loss=6.56250 Rate=312.68
[xla:3](220) Loss=6.56250 Rate=313.61
[xla:6](220) Loss=6.53125 Rate=313.53
[xla:4](220) Loss=6.56250 Rate=313.63
[xla:1](220) Loss=6.53125 Rate=313.27
[xla:2](220) Loss=6.43750 Rate=313.10
[xla:8](220) Loss=6.56250 Rate=313.02
[xla:5](220) Loss=6.59375 Rate=312.48
[xla:4](240) Loss=6.50000 Rate=306.02
[xla:1](240) Loss=6.46875 Rate=305.81
[xla:3](240) Loss=6.50000 Rate=305.75
[xla:8](240) Loss=6.53125 Rate=305.89
[xla:7](240) Loss=6.46875 Rate=305.04
[xla:2](240) Loss=6.50000 Rate=305.39
[xla:5](240) Loss=6.50000 Rate=305.34
[xla:6](240) Loss=6.53125 Rate=305.21
[xla:2](260) Loss=6.46875 Rate=309.30
[xla:3](260) Loss=6.53125 Rate=308.64
[xla:1](260) Loss=6.46875 Rate=308.65
[xla:8](260) Loss=6.46875 Rate=308.96
[xla:5](260) Loss=6.46875 Rate=309.10
[xla:7](260) Loss=6.50000 Rate=308.58
[xla:4](260) Loss=6.53125 Rate=308.10
[xla:6](260) Loss=6.37500 Rate=308.55
[xla:2](280) Loss=6.37500 Rate=314.96
[xla:6](280) Loss=6.28125 Rate=315.49
[xla:7](280) Loss=6.40625 Rate=303.91
[xla:5](280) Loss=6.34375 Rate=303.57
[xla:1](280) Loss=6.37500 Rate=302.94
[xla:8](280) Loss=6.37500 Rate=303.07
[xla:4](280) Loss=6.34375 Rate=303.17
[xla:3](280) Loss=6.43750 Rate=302.72
[xla:6](300) Loss=6.37500 Rate=301.97
[xla:8](300) Loss=6.28125 Rate=308.08
[xla:5](300) Loss=6.31250 Rate=307.81
[xla:1](300) Loss=6.28125 Rate=307.77
[xla:7](300) Loss=6.28125 Rate=307.03
[xla:2](300) Loss=6.28125 Rate=300.87
[xla:4](300) Loss=6.28125 Rate=307.54
[xla:3](300) Loss=6.37500 Rate=307.25
[xla:1](320) Loss=6.34375 Rate=302.78
[xla:3](320) Loss=6.28125 Rate=303.11
[xla:6](320) Loss=6.28125 Rate=299.73
[xla:8](320) Loss=6.12500 Rate=302.46
[xla:4](320) Loss=6.28125 Rate=302.79
[xla:7](320) Loss=6.09375 Rate=301.87
[xla:2](320) Loss=6.25000 Rate=299.52
[xla:5](320) Loss=6.21875 Rate=301.76
[xla:3](340) Loss=6.25000 Rate=306.19
[xla:1](340) Loss=6.25000 Rate=305.95
[xla:4](340) Loss=6.21875 Rate=306.20
[xla:7](340) Loss=6.15625 Rate=306.40
[xla:8](340) Loss=6.21875 Rate=305.60
[xla:5](340) Loss=6.21875 Rate=305.82
[xla:6](340) Loss=6.03125 Rate=304.21
[xla:2](340) Loss=6.25000 Rate=304.85
[xla:5](360) Loss=6.06250 Rate=313.22
[xla:1](360) Loss=6.12500 Rate=312.44
[xla:4](360) Loss=6.21875 Rate=312.25
[xla:2](360) Loss=6.00000 Rate=312.01
[xla:8](360) Loss=6.25000 Rate=312.14
[xla:7](360) Loss=6.03125 Rate=311.98
[xla:3](360) Loss=6.12500 Rate=311.83
[xla:6](360) Loss=6.12500 Rate=311.68
[xla:4](380) Loss=6.12500 Rate=309.61
[xla:3](380) Loss=6.09375 Rate=309.78
[xla:7](380) Loss=6.31250 Rate=309.78
[xla:2](380) Loss=6.03125 Rate=309.71
[xla:1](380) Loss=6.09375 Rate=309.27
[xla:5](380) Loss=6.03125 Rate=309.02
[xla:6](380) Loss=6.00000 Rate=309.25
[xla:8](380) Loss=6.18750 Rate=309.20
[xla:5](400) Loss=6.06250 Rate=309.15
[xla:4](400) Loss=6.06250 Rate=308.55
[xla:3](400) Loss=6.09375 Rate=308.64
[xla:8](400) Loss=5.93750 Rate=309.08
[xla:7](400) Loss=5.96875 Rate=308.66
[xla:1](400) Loss=6.00000 Rate=308.51
[xla:2](400) Loss=6.00000 Rate=308.68
[xla:6](400) Loss=6.03125 Rate=308.89
[xla:8](420) Loss=6.00000 Rate=301.12
[xla:5](420) Loss=5.93750 Rate=300.56
[xla:2](420) Loss=6.09375 Rate=300.70
[xla:6](420) Loss=5.96875 Rate=300.61
[xla:3](420) Loss=6.00000 Rate=300.48
[xla:1](420) Loss=5.93750 Rate=299.95
[xla:7](420) Loss=5.96875 Rate=299.99
[xla:4](420) Loss=5.90625 Rate=299.88
[xla:4](440) Loss=5.93750 Rate=308.50
[xla:2](440) Loss=5.90625 Rate=307.96
[xla:7](440) Loss=5.90625 Rate=308.38
[xla:8](440) Loss=5.93750 Rate=307.73
[xla:5](440) Loss=5.93750 Rate=307.79
[xla:6](440) Loss=5.78125 Rate=308.01
[xla:1](440) Loss=5.84375 Rate=308.30
[xla:3](440) Loss=5.78125 Rate=307.94
[xla:1](460) Loss=5.81250 Rate=309.51
[xla:2](460) Loss=6.03125 Rate=309.09
[xla:7](460) Loss=5.84375 Rate=309.21
[xla:8](460) Loss=5.87500 Rate=308.95
[xla:5](460) Loss=5.81250 Rate=308.71
[xla:6](460) Loss=6.03125 Rate=308.52
[xla:4](460) Loss=5.81250 Rate=308.55
[xla:3](460) Loss=5.78125 Rate=308.43
[xla:2](480) Loss=5.87500 Rate=312.67
[xla:8](480) Loss=5.96875 Rate=312.72
[xla:6](480) Loss=5.84375 Rate=312.71
[xla:7](480) Loss=5.78125 Rate=312.23
[xla:1](480) Loss=5.71875 Rate=312.05
[xla:4](480) Loss=5.93750 Rate=312.55
[xla:3](480) Loss=5.81250 Rate=312.02
[xla:5](480) Loss=5.87500 Rate=311.76
[xla:5](500) Loss=5.71875 Rate=314.64
[xla:4](500) Loss=5.78125 Rate=314.22
[xla:6](500) Loss=5.75000 Rate=314.06
[xla:8](500) Loss=5.84375 Rate=313.42
[xla:1](500) Loss=5.71875 Rate=313.64
[xla:7](500) Loss=5.68750 Rate=313.69
[xla:2](500) Loss=5.71875 Rate=313.26
[xla:3](500) Loss=5.62500 Rate=314.19
[xla:1](520) Loss=5.71875 Rate=312.99
[xla:8](520) Loss=5.87500 Rate=312.55
[xla:3](520) Loss=5.62500 Rate=312.94
[xla:5](520) Loss=5.62500 Rate=312.39
[xla:7](520) Loss=5.71875 Rate=312.52
[xla:4](520) Loss=5.59375 Rate=312.38
[xla:2](520) Loss=5.65625 Rate=312.17
[xla:6](520) Loss=5.75000 Rate=310.40
[xla:6](540) Loss=5.59375 Rate=308.39
[xla:3](540) Loss=5.75000 Rate=307.31
/anaconda3/envs/pytorch-nightly/lib/python3.6/site-packages/PIL/TiffImagePlugin.py:804: UserWarning: Corrupt EXIF data. Expecting to read 4 bytes but only got 0.
  warnings.warn(str(msg))
[xla:4](540) Loss=5.56250 Rate=307.27
[xla:2](540) Loss=5.68750 Rate=307.36
[xla:5](540) Loss=5.56250 Rate=307.15
[xla:7](540) Loss=5.65625 Rate=307.11
[xla:8](540) Loss=5.75000 Rate=306.51
[xla:1](540) Loss=5.50000 Rate=306.38
[xla:2](560) Loss=5.65625 Rate=311.17
[xla:4](560) Loss=5.65625 Rate=310.55
[xla:8](560) Loss=5.53125 Rate=310.86
[xla:1](560) Loss=5.71875 Rate=310.83
[xla:6](560) Loss=5.59375 Rate=310.87
[xla:7](560) Loss=5.65625 Rate=310.60
[xla:5](560) Loss=5.50000 Rate=310.02
[xla:3](560) Loss=5.46875 Rate=309.95
[xla:2](580) Loss=5.53125 Rate=317.80
[xla:1](580) Loss=5.43750 Rate=317.97
[xla:8](580) Loss=5.59375 Rate=317.67
[xla:7](580) Loss=5.40625 Rate=317.64
[xla:3](580) Loss=5.37500 Rate=317.72
[xla:5](580) Loss=5.53125 Rate=317.74
[xla:6](580) Loss=5.65625 Rate=317.53
[xla:4](580) Loss=5.43750 Rate=317.27
[xla:2](600) Loss=5.37500 Rate=320.07
[xla:5](600) Loss=5.31250 Rate=319.24
[xla:4](600) Loss=5.59375 Rate=317.05
[xla:8](600) Loss=5.31250 Rate=316.79
[xla:6](600) Loss=5.50000 Rate=316.96
[xla:7](600) Loss=5.56250 Rate=316.76
[xla:3](600) Loss=5.59375 Rate=316.94
[xla:1](600) Loss=5.59375 Rate=316.61
[xla:7](620) Loss=5.50000 Rate=356.96
[xla:6](620) Loss=5.25000 Rate=356.98
[xla:3](620) Loss=5.34375 Rate=356.99
[xla:1](620) Loss=5.37500 Rate=356.74
[xla:5](620) Loss=5.40625 Rate=354.51
[xla:4](620) Loss=5.31250 Rate=356.54
[xla:2](620) Loss=5.34375 Rate=351.73
[xla:8](620) Loss=5.50000 Rate=355.80
[xla:7] Accuracy=6.48%
[xla:8] Accuracy=5.45%
[xla:1] Accuracy=5.39%
[xla:5] Accuracy=6.33%
[xla:6] Accuracy=4.48%
[xla:2] Accuracy=4.28%
[xla:4] Accuracy=5.40%
[xla:3] Accuracy=5.03%
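The per-step lines above (e.g. `[xla:3](540) Loss=5.75000 Rate=307.31`, where Rate is per-core images/sec) and the final per-core Accuracy lines can be pulled out of a log like this with a small regex. A sketch, assuming the log was saved to a file such as the `sharding_logs.txt` named in the tee command at the top; `parse_log` and the regex names are illustrative, not part of the test script:

```python
import re

# Matches per-step lines such as "[xla:3](540) Loss=5.75000 Rate=307.31"
STEP_RE = re.compile(r"\[xla:(\d+)\]\((\d+)\) Loss=([\d.]+) Rate=([\d.]+)")
# Matches final lines such as "[xla:7] Accuracy=6.48%"
ACC_RE = re.compile(r"\[xla:(\d+)\] Accuracy=([\d.]+)%")

def parse_log(lines):
    """Collect (core, step, loss, rate) tuples and per-core accuracies."""
    steps, accs = [], {}
    for line in lines:
        m = STEP_RE.search(line)
        if m:
            core, step, loss, rate = m.groups()
            steps.append((int(core), int(step), float(loss), float(rate)))
            continue
        m = ACC_RE.search(line)
        if m:
            accs[int(m.group(1))] = float(m.group(2))
    return steps, accs

# Example with two lines copied from the log above:
sample = ["[xla:7](620) Loss=5.50000 Rate=356.96", "[xla:7] Accuracy=6.48%"]
steps, accs = parse_log(sample)
print(steps)  # [(7, 620, 5.5, 356.96)]
print(accs)   # {7: 6.48}
```

Summing the Rate values across all 8 cores at the same step gives aggregate throughput; at step 620 above that sum is roughly 2,846 images/sec.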