server1: slurmctld.log

a guest
Jul 26th, 2024
  1. [2024-07-26T13:13:49.562] debug: slurmctld log levels: stderr=quiet logfile=debug2 syslog=fatal
  2. [2024-07-26T13:13:49.562] debug: Log file re-opened
  3. [2024-07-26T13:13:49.564] debug: sched: slurmctld starting
  4. [2024-07-26T13:13:49.565] slurmscriptd: debug: slurmscriptd: Got ack from slurmctld, initialization successful
  5. [2024-07-26T13:13:49.565] slurmscriptd: debug: _slurmscriptd_mainloop: started
  6. [2024-07-26T13:13:49.565] debug: slurmctld: slurmscriptd fork()'d and initialized.
  7. [2024-07-26T13:13:49.565] debug: _slurmctld_listener_thread: started listening to slurmscriptd
  8. [2024-07-26T13:13:49.565] slurmctld version 22.05.8 started on cluster dlabcluster
  9. [2024-07-26T13:13:49.566] cred/munge: init: Munge credential signature plugin loaded
  10. [2024-07-26T13:13:49.567] debug: auth/munge: init: Munge authentication plugin loaded
  11. [2024-07-26T13:13:49.567] select/cons_res: common_init: select/cons_res loaded
  12. [2024-07-26T13:13:49.568] select/cons_tres: common_init: select/cons_tres loaded
  13. [2024-07-26T13:13:49.568] select/cray_aries: init: Cray/Aries node selection plugin loaded
  14. [2024-07-26T13:13:49.568] preempt/none: init: preempt/none loaded
  15. [2024-07-26T13:13:49.568] debug: acct_gather_energy/none: init: AcctGatherEnergy NONE plugin loaded
  16. [2024-07-26T13:13:49.568] debug: acct_gather_profile/none: init: AcctGatherProfile NONE plugin loaded
  17. [2024-07-26T13:13:49.568] debug: acct_gather_interconnect/none: init: AcctGatherInterconnect NONE plugin loaded
  18. [2024-07-26T13:13:49.568] debug: acct_gather_filesystem/none: init: AcctGatherFilesystem NONE plugin loaded
  19. [2024-07-26T13:13:49.568] debug2: No acct_gather.conf file (/etc/slurm/acct_gather.conf)
  20. [2024-07-26T13:13:49.568] debug: jobacct_gather/none: init: Job accounting gather NOT_INVOKED plugin loaded
  21. [2024-07-26T13:13:49.569] ext_sensors/none: init: ExtSensors NONE plugin loaded
  22. [2024-07-26T13:13:49.569] debug: MPI: Loading all types
  23. [2024-07-26T13:13:49.570] debug: mpi/pmix_v4: init: PMIx plugin loaded
  24. [2024-07-26T13:13:49.570] debug2: No mpi.conf file (/etc/slurm/mpi.conf)
  25. [2024-07-26T13:13:49.573] accounting_storage/none: init: Accounting storage NOT INVOKED plugin loaded
  26. [2024-07-26T13:13:49.573] debug: switch Cray/Aries plugin loaded.
  27. [2024-07-26T13:13:49.573] debug: switch/none: init: switch NONE plugin loaded
  28. [2024-07-26T13:13:49.573] debug: Reading slurm.conf file: /etc/slurm/slurm.conf
  29. [2024-07-26T13:13:49.574] No memory enforcing mechanism configured.
  30. [2024-07-26T13:13:49.574] topology/none: init: topology NONE plugin loaded
  31. [2024-07-26T13:13:49.574] debug: No DownNodes
  32. [2024-07-26T13:13:49.577] debug: slurmctld log levels: stderr=quiet logfile=debug2 syslog=fatal
  33. [2024-07-26T13:13:49.577] debug: Log file re-opened
  34. [2024-07-26T13:13:49.578] sched: Backfill scheduler plugin loaded
  35. [2024-07-26T13:13:49.578] route/default: init: route default plugin loaded
  36. [2024-07-26T13:13:49.578] Recovered state of 3 nodes
  37. [2024-07-26T13:13:49.578] Down nodes: server[2-3]
  38. [2024-07-26T13:13:49.579] Recovered JobId=314 Assoc=0
  39. [2024-07-26T13:13:49.579] debug: starting JobId=314 in accounting
  40. [2024-07-26T13:13:49.579] Recovered information about 1 jobs
  41. [2024-07-26T13:13:49.579] select/cons_tres: select_p_node_init: select/cons_tres SelectTypeParameters not specified, using default value: CR_Core_Memory
  42. [2024-07-26T13:13:49.579] select/cons_tres: part_data_create_array: select/cons_tres: preparing for 1 partitions
  43. [2024-07-26T13:13:49.579] debug: Updating partition uid access list
  44. [2024-07-26T13:13:49.579] Recovered state of 0 reservations
  45. [2024-07-26T13:13:49.579] State of 0 triggers recovered
  46. [2024-07-26T13:13:49.579] read_slurm_conf: backup_controller not specified
  47. [2024-07-26T13:13:49.579] select/cons_tres: select_p_reconfigure: select/cons_tres: reconfigure
  48. [2024-07-26T13:13:49.579] select/cons_tres: part_data_create_array: select/cons_tres: preparing for 1 partitions
  49. [2024-07-26T13:13:49.580] debug: power_save module disabled, SuspendTime < 0
  50. [2024-07-26T13:13:49.580] Running as primary controller
  51. [2024-07-26T13:13:49.580] debug: No backup controllers, not launching heartbeat.
  52. [2024-07-26T13:13:49.580] debug: priority/basic: init: Priority BASIC plugin loaded
  53. [2024-07-26T13:13:49.580] No parameter for mcs plugin, default values set
  54. [2024-07-26T13:13:49.580] mcs: MCSParameters = (null). ondemand set.
  55. [2024-07-26T13:13:49.580] debug: mcs/none: init: mcs none plugin loaded
  56. [2024-07-26T13:13:49.580] debug2: slurmctld listening on 0.0.0.0:6817
  57. [2024-07-26T13:13:52.662] debug: hash/k12: init: init: KangarooTwelve hash plugin loaded
  58. [2024-07-26T13:13:52.662] debug2: Processing RPC: MESSAGE_NODE_REGISTRATION_STATUS from UID=0
  59. [2024-07-26T13:13:52.662] debug: gres/gpu: init: loaded
  60. [2024-07-26T13:13:52.662] debug: validate_node_specs: node server1 registered with 0 jobs
  61. [2024-07-26T13:13:52.662] debug2: _slurm_rpc_node_registration complete for server1 usec=229
  62. [2024-07-26T13:13:53.586] debug: Spawning registration agent for server[2-3] 2 hosts
  63. [2024-07-26T13:13:53.586] SchedulerParameters=default_queue_depth=100,max_rpc_cnt=0,max_sched_time=2,partition_job_depth=0,sched_max_job_start=0,sched_min_interval=2
  64. [2024-07-26T13:13:53.586] debug: sched: Running job scheduler for default depth.
  65. [2024-07-26T13:13:53.586] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  66. [2024-07-26T13:13:53.587] debug2: Tree head got back 0 looking for 2
  67. [2024-07-26T13:13:53.588] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  68. [2024-07-26T13:13:53.588] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  69. [2024-07-26T13:13:53.588] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  70. [2024-07-26T13:13:53.588] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  71. [2024-07-26T13:13:54.588] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  72. [2024-07-26T13:13:54.588] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  73. [2024-07-26T13:13:54.589] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  74. [2024-07-26T13:13:54.589] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  75. [2024-07-26T13:13:55.589] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  76. [2024-07-26T13:13:55.589] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  77. [2024-07-26T13:13:55.590] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  78. [2024-07-26T13:13:55.590] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  79. [2024-07-26T13:13:56.590] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  80. [2024-07-26T13:13:56.590] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  81. [2024-07-26T13:13:56.591] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  82. [2024-07-26T13:13:56.591] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  83. [2024-07-26T13:13:57.591] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  84. [2024-07-26T13:13:57.591] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  85. [2024-07-26T13:13:57.592] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  86. [2024-07-26T13:13:57.592] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  87. [2024-07-26T13:13:58.592] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  88. [2024-07-26T13:13:58.592] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  89. [2024-07-26T13:13:58.593] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  90. [2024-07-26T13:13:58.593] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  91. [2024-07-26T13:13:59.593] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  92. [2024-07-26T13:13:59.593] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  93. [2024-07-26T13:13:59.594] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  94. [2024-07-26T13:13:59.594] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  95. [2024-07-26T13:14:00.594] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  96. [2024-07-26T13:14:00.594] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  97. [2024-07-26T13:14:00.595] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  98. [2024-07-26T13:14:00.595] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  99. [2024-07-26T13:14:01.595] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  100. [2024-07-26T13:14:01.595] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  101. [2024-07-26T13:14:01.596] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  102. [2024-07-26T13:14:01.596] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  103. [2024-07-26T13:14:02.596] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  104. [2024-07-26T13:14:02.596] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  105. [2024-07-26T13:14:02.597] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  106. [2024-07-26T13:14:02.597] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  107. [2024-07-26T13:14:03.597] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  108. [2024-07-26T13:14:03.597] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  109. [2024-07-26T13:14:03.598] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  110. [2024-07-26T13:14:03.598] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  111. [2024-07-26T13:14:04.597] debug2: Tree head got back 1
  112. [2024-07-26T13:14:04.598] debug2: Tree head got back 2
  113. [2024-07-26T13:14:04.598] agent/is_node_resp: node:server2 RPC:REQUEST_NODE_REGISTRATION_STATUS : Communication connection failure
  114. [2024-07-26T13:14:04.598] agent/is_node_resp: node:server3 RPC:REQUEST_NODE_REGISTRATION_STATUS : Communication connection failure
  115. [2024-07-26T13:14:19.578] debug: sched/backfill: _attempt_backfill: beginning
  116. [2024-07-26T13:14:19.578] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  117. [2024-07-26T13:14:19.626] debug2: Testing job time limits and checkpoints
  118. [2024-07-26T13:14:49.669] debug2: Testing job time limits and checkpoints
  119. [2024-07-26T13:14:49.669] debug2: Performing purge of old job records
  120. [2024-07-26T13:14:49.670] debug: sched: Running job scheduler for full queue.
  121. [2024-07-26T13:14:50.579] debug: sched/backfill: _attempt_backfill: beginning
  122. [2024-07-26T13:14:50.579] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  123. [2024-07-26T13:15:19.715] debug2: Testing job time limits and checkpoints
  124. [2024-07-26T13:15:33.737] debug: Spawning ping agent for server1
  125. [2024-07-26T13:15:33.737] debug: Spawning registration agent for server[2-3] 2 hosts
  126. [2024-07-26T13:15:33.737] debug2: Spawning RPC agent for msg_type REQUEST_PING
  127. [2024-07-26T13:15:33.737] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  128. [2024-07-26T13:15:33.737] debug2: Tree head got back 0 looking for 1
  129. [2024-07-26T13:15:33.738] debug2: Tree head got back 0 looking for 2
  130. [2024-07-26T13:15:33.738] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  131. [2024-07-26T13:15:33.739] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  132. [2024-07-26T13:15:33.739] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  133. [2024-07-26T13:15:33.739] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  134. [2024-07-26T13:15:33.740] debug2: Tree head got back 1
  135. [2024-07-26T13:15:33.742] debug2: node_did_resp server1
  136. [2024-07-26T13:15:34.581] debug: sched/backfill: _attempt_backfill: beginning
  137. [2024-07-26T13:15:34.582] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  138. [2024-07-26T13:15:34.740] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  139. [2024-07-26T13:15:34.740] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  140. [2024-07-26T13:15:34.740] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  141. [2024-07-26T13:15:34.740] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  142. [2024-07-26T13:15:35.741] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  143. [2024-07-26T13:15:35.741] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  144. [2024-07-26T13:15:35.741] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  145. [2024-07-26T13:15:35.741] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  146. [2024-07-26T13:15:36.741] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  147. [2024-07-26T13:15:36.741] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  148. [2024-07-26T13:15:36.742] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  149. [2024-07-26T13:15:36.742] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  150. [2024-07-26T13:15:37.742] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  151. [2024-07-26T13:15:37.742] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  152. [2024-07-26T13:15:37.743] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  153. [2024-07-26T13:15:37.743] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  154. [2024-07-26T13:15:38.743] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  155. [2024-07-26T13:15:38.743] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  156. [2024-07-26T13:15:38.744] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  157. [2024-07-26T13:15:38.744] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  158. [2024-07-26T13:15:39.744] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  159. [2024-07-26T13:15:39.744] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  160. [2024-07-26T13:15:39.745] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  161. [2024-07-26T13:15:39.745] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  162. [2024-07-26T13:15:40.745] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  163. [2024-07-26T13:15:40.745] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  164. [2024-07-26T13:15:40.746] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  165. [2024-07-26T13:15:40.746] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  166. [2024-07-26T13:15:41.746] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  167. [2024-07-26T13:15:41.746] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  168. [2024-07-26T13:15:41.746] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  169. [2024-07-26T13:15:41.747] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  170. [2024-07-26T13:15:42.747] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  171. [2024-07-26T13:15:42.747] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  172. [2024-07-26T13:15:42.748] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  173. [2024-07-26T13:15:42.748] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  174. [2024-07-26T13:15:43.748] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  175. [2024-07-26T13:15:43.748] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  176. [2024-07-26T13:15:43.749] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  177. [2024-07-26T13:15:43.749] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  178. [2024-07-26T13:15:44.748] debug2: Tree head got back 1
  179. [2024-07-26T13:15:44.749] debug2: Tree head got back 2
  180. [2024-07-26T13:15:49.759] debug2: Testing job time limits and checkpoints
  181. [2024-07-26T13:15:49.759] debug2: Performing purge of old job records
  182. [2024-07-26T13:15:49.759] debug: sched: Running job scheduler for full queue.
  183. [2024-07-26T13:16:04.582] debug: sched/backfill: _attempt_backfill: beginning
  184. [2024-07-26T13:16:04.582] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  185. [2024-07-26T13:16:19.802] debug2: Testing job time limits and checkpoints
  186. [2024-07-26T13:16:49.845] debug2: Testing job time limits and checkpoints
  187. [2024-07-26T13:16:49.845] debug2: Performing purge of old job records
  188. [2024-07-26T13:16:49.845] debug: sched: Running job scheduler for full queue.
  189. [2024-07-26T13:16:50.585] debug: sched/backfill: _attempt_backfill: beginning
  190. [2024-07-26T13:16:50.585] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  191. [2024-07-26T13:17:13.881] debug: Spawning registration agent for server[2-3] 2 hosts
  192. [2024-07-26T13:17:13.881] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  193. [2024-07-26T13:17:13.882] debug2: Tree head got back 0 looking for 2
  194. [2024-07-26T13:17:13.883] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  195. [2024-07-26T13:17:13.883] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  196. [2024-07-26T13:17:13.883] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  197. [2024-07-26T13:17:13.883] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  198. [2024-07-26T13:17:14.884] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  199. [2024-07-26T13:17:14.884] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  200. [2024-07-26T13:17:14.884] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  201. [2024-07-26T13:17:14.884] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  202. [2024-07-26T13:17:15.885] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  203. [2024-07-26T13:17:15.885] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  204. [2024-07-26T13:17:15.885] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  205. [2024-07-26T13:17:15.885] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  206. [2024-07-26T13:17:16.886] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  207. [2024-07-26T13:17:16.886] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  208. [2024-07-26T13:17:16.886] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  209. [2024-07-26T13:17:16.886] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  210. [2024-07-26T13:17:17.887] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  211. [2024-07-26T13:17:17.887] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  212. [2024-07-26T13:17:17.887] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  213. [2024-07-26T13:17:17.887] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  214. [2024-07-26T13:17:18.888] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  215. [2024-07-26T13:17:18.888] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  216. [2024-07-26T13:17:18.888] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  217. [2024-07-26T13:17:18.888] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  218. [2024-07-26T13:17:19.889] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  219. [2024-07-26T13:17:19.889] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  220. [2024-07-26T13:17:19.889] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  221. [2024-07-26T13:17:19.889] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  222. [2024-07-26T13:17:19.890] debug2: Testing job time limits and checkpoints
  223. [2024-07-26T13:17:20.890] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  224. [2024-07-26T13:17:20.890] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  225. [2024-07-26T13:17:20.890] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  226. [2024-07-26T13:17:20.890] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  227. [2024-07-26T13:17:21.891] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  228. [2024-07-26T13:17:21.891] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  229. [2024-07-26T13:17:21.891] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  230. [2024-07-26T13:17:21.891] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  231. [2024-07-26T13:17:22.891] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  232. [2024-07-26T13:17:22.892] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  233. [2024-07-26T13:17:22.892] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  234. [2024-07-26T13:17:22.892] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  235. [2024-07-26T13:17:23.892] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  236. [2024-07-26T13:17:23.892] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  237. [2024-07-26T13:17:23.893] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  238. [2024-07-26T13:17:23.893] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  239. [2024-07-26T13:17:24.893] debug2: Tree head got back 1
  240. [2024-07-26T13:17:24.893] debug2: Tree head got back 2
  241. [2024-07-26T13:17:49.937] debug2: Testing job time limits and checkpoints
  242. [2024-07-26T13:17:49.937] debug2: Performing purge of old job records
  243. [2024-07-26T13:17:49.937] debug: sched: Running job scheduler for full queue.
  244. [2024-07-26T13:17:50.590] debug: sched/backfill: _attempt_backfill: beginning
  245. [2024-07-26T13:17:50.590] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  246. [2024-07-26T13:18:19.982] debug2: Testing job time limits and checkpoints
  247. [2024-07-26T13:18:49.026] debug2: Testing job time limits and checkpoints
  248. [2024-07-26T13:18:49.026] debug2: Performing purge of old job records
  249. [2024-07-26T13:18:49.026] debug2: Performing full system state save
  250. [2024-07-26T13:18:49.026] debug: sched: Running job scheduler for full queue.
  251. [2024-07-26T13:18:49.595] debug: sched/backfill: _attempt_backfill: beginning
  252. [2024-07-26T13:18:49.595] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  253. [2024-07-26T13:18:53.049] debug: Spawning ping agent for server1
  254. [2024-07-26T13:18:53.049] debug: Spawning registration agent for server[2-3] 2 hosts
  255. [2024-07-26T13:18:53.049] debug2: Spawning RPC agent for msg_type REQUEST_PING
  256. [2024-07-26T13:18:53.049] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  257. [2024-07-26T13:18:53.050] debug2: Tree head got back 0 looking for 1
  258. [2024-07-26T13:18:53.050] debug2: Tree head got back 0 looking for 2
  259. [2024-07-26T13:18:53.050] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  260. [2024-07-26T13:18:53.050] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  261. [2024-07-26T13:18:53.050] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  262. [2024-07-26T13:18:53.051] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  263. [2024-07-26T13:18:53.051] debug2: Tree head got back 1
  264. [2024-07-26T13:18:53.055] debug2: node_did_resp server1
  265. [2024-07-26T13:18:54.051] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  266. [2024-07-26T13:18:54.051] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  267. [2024-07-26T13:18:54.051] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  268. [2024-07-26T13:18:54.051] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  269. [2024-07-26T13:18:55.052] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  270. [2024-07-26T13:18:55.052] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  271. [2024-07-26T13:18:55.052] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  272. [2024-07-26T13:18:55.052] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  273. [2024-07-26T13:18:56.053] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  274. [2024-07-26T13:18:56.053] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  275. [2024-07-26T13:18:56.053] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  276. [2024-07-26T13:18:56.053] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  277. [2024-07-26T13:18:57.054] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  278. [2024-07-26T13:18:57.054] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  279. [2024-07-26T13:18:57.054] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  280. [2024-07-26T13:18:57.054] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  281. [2024-07-26T13:18:58.055] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  282. [2024-07-26T13:18:58.055] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  283. [2024-07-26T13:18:58.055] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  284. [2024-07-26T13:18:58.055] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  285. [2024-07-26T13:18:59.056] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  286. [2024-07-26T13:18:59.056] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  287. [2024-07-26T13:18:59.056] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  288. [2024-07-26T13:18:59.056] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  289. [2024-07-26T13:19:00.057] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  290. [2024-07-26T13:19:00.057] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  291. [2024-07-26T13:19:00.057] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  292. [2024-07-26T13:19:00.057] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  293. [2024-07-26T13:19:01.057] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  294. [2024-07-26T13:19:01.057] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  295. [2024-07-26T13:19:01.057] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  296. [2024-07-26T13:19:01.057] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  297. [2024-07-26T13:19:02.059] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  298. [2024-07-26T13:19:02.059] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  299. [2024-07-26T13:19:02.059] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  300. [2024-07-26T13:19:02.059] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  301. [2024-07-26T13:19:03.059] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  302. [2024-07-26T13:19:03.060] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  303. [2024-07-26T13:19:03.060] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  304. [2024-07-26T13:19:03.060] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  305. [2024-07-26T13:19:04.060] debug2: Tree head got back 1
  306. [2024-07-26T13:19:04.060] debug2: Tree head got back 2
  307. [2024-07-26T13:19:19.089] debug2: Testing job time limits and checkpoints
  308. [2024-07-26T13:19:19.595] debug: sched/backfill: _attempt_backfill: beginning
  309. [2024-07-26T13:19:19.595] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  310. [2024-07-26T13:19:49.135] debug2: Testing job time limits and checkpoints
  311. [2024-07-26T13:19:49.136] debug2: Performing purge of old job records
  312. [2024-07-26T13:19:49.136] debug: sched: Running job scheduler for full queue.
  313. [2024-07-26T13:19:49.596] debug: sched/backfill: _attempt_backfill: beginning
  314. [2024-07-26T13:19:49.596] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  315. [2024-07-26T13:20:19.181] debug2: Testing job time limits and checkpoints
  316. [2024-07-26T13:20:19.596] debug: sched/backfill: _attempt_backfill: beginning
  317. [2024-07-26T13:20:19.596] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  318. [2024-07-26T13:20:33.204] debug: Spawning registration agent for server[2-3] 2 hosts
  319. [2024-07-26T13:20:33.204] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  320. [2024-07-26T13:20:33.204] debug2: Tree head got back 0 looking for 2
  321. [2024-07-26T13:20:33.205] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  322. [2024-07-26T13:20:33.205] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  323. [2024-07-26T13:20:33.205] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  324. [2024-07-26T13:20:33.205] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  325. [2024-07-26T13:20:34.206] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  326. [2024-07-26T13:20:34.206] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  327. [2024-07-26T13:20:34.206] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  328. [2024-07-26T13:20:34.206] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  329. [2024-07-26T13:20:35.207] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  330. [2024-07-26T13:20:35.207] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  331. [2024-07-26T13:20:35.207] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  332. [2024-07-26T13:20:35.207] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  333. [2024-07-26T13:20:36.208] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  334. [2024-07-26T13:20:36.208] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  335. [2024-07-26T13:20:36.208] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  336. [2024-07-26T13:20:36.208] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  337. [2024-07-26T13:20:37.209] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  338. [2024-07-26T13:20:37.209] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  339. [2024-07-26T13:20:37.209] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  340. [2024-07-26T13:20:37.209] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  341. [2024-07-26T13:20:38.210] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  342. [2024-07-26T13:20:38.210] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  343. [2024-07-26T13:20:38.210] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  344. [2024-07-26T13:20:38.210] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  345. [2024-07-26T13:20:39.211] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  346. [2024-07-26T13:20:39.211] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  347. [2024-07-26T13:20:39.211] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  348. [2024-07-26T13:20:39.211] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  349. [2024-07-26T13:20:40.212] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  350. [2024-07-26T13:20:40.212] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  351. [2024-07-26T13:20:40.212] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  352. [2024-07-26T13:20:40.212] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  353. [2024-07-26T13:20:41.213] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  354. [2024-07-26T13:20:41.213] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  355. [2024-07-26T13:20:41.213] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  356. [2024-07-26T13:20:41.213] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  357. [2024-07-26T13:20:42.213] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  358. [2024-07-26T13:20:42.213] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  359. [2024-07-26T13:20:42.214] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  360. [2024-07-26T13:20:42.214] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  361. [2024-07-26T13:20:43.215] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  362. [2024-07-26T13:20:43.215] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  363. [2024-07-26T13:20:43.215] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  364. [2024-07-26T13:20:43.215] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  365. [2024-07-26T13:20:44.215] debug2: Tree head got back 1
  366. [2024-07-26T13:20:44.215] debug2: Tree head got back 2
  367. [2024-07-26T13:20:49.229] debug2: Testing job time limits and checkpoints
  368. [2024-07-26T13:20:49.229] debug2: Performing purge of old job records
  369. [2024-07-26T13:20:49.230] debug: sched: Running job scheduler for full queue.
  370. [2024-07-26T13:20:49.596] debug: sched/backfill: _attempt_backfill: beginning
  371. [2024-07-26T13:20:49.597] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  372. [2024-07-26T13:21:19.274] debug2: Testing job time limits and checkpoints
  373. [2024-07-26T13:21:19.597] debug: sched/backfill: _attempt_backfill: beginning
  374. [2024-07-26T13:21:19.597] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  375. [2024-07-26T13:21:49.321] debug2: Testing job time limits and checkpoints
  376. [2024-07-26T13:21:49.322] debug2: Performing purge of old job records
  377. [2024-07-26T13:21:49.322] debug: sched: Running job scheduler for full queue.
  378. [2024-07-26T13:21:49.597] debug: sched/backfill: _attempt_backfill: beginning
  379. [2024-07-26T13:21:49.597] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  380. [2024-07-26T13:22:13.358] debug: Spawning ping agent for server1
  381. [2024-07-26T13:22:13.359] debug: Spawning registration agent for server[2-3] 2 hosts
  382. [2024-07-26T13:22:13.359] debug2: Spawning RPC agent for msg_type REQUEST_PING
  383. [2024-07-26T13:22:13.359] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  384. [2024-07-26T13:22:13.359] debug2: Tree head got back 0 looking for 1
  385. [2024-07-26T13:22:13.359] debug2: Tree head got back 0 looking for 2
  386. [2024-07-26T13:22:13.360] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  387. [2024-07-26T13:22:13.360] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  388. [2024-07-26T13:22:13.360] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  389. [2024-07-26T13:22:13.360] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  390. [2024-07-26T13:22:13.361] debug2: Tree head got back 1
  391. [2024-07-26T13:22:13.364] debug2: node_did_resp server1
  392. [2024-07-26T13:22:14.361] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  393. [2024-07-26T13:22:14.361] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  394. [2024-07-26T13:22:14.361] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  395. [2024-07-26T13:22:14.361] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  396. [2024-07-26T13:22:15.362] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  397. [2024-07-26T13:22:15.362] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  398. [2024-07-26T13:22:15.362] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  399. [2024-07-26T13:22:15.362] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  400. [2024-07-26T13:22:16.363] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  401. [2024-07-26T13:22:16.363] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  402. [2024-07-26T13:22:16.363] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  403. [2024-07-26T13:22:16.363] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  404. [2024-07-26T13:22:17.364] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  405. [2024-07-26T13:22:17.364] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  406. [2024-07-26T13:22:17.364] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  407. [2024-07-26T13:22:17.364] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  408. [2024-07-26T13:22:18.365] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  409. [2024-07-26T13:22:18.365] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  410. [2024-07-26T13:22:18.365] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  411. [2024-07-26T13:22:18.365] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  412. [2024-07-26T13:22:19.366] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  413. [2024-07-26T13:22:19.366] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  414. [2024-07-26T13:22:19.366] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  415. [2024-07-26T13:22:19.366] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  416. [2024-07-26T13:22:19.368] debug2: Testing job time limits and checkpoints
  417. [2024-07-26T13:22:19.597] debug: sched/backfill: _attempt_backfill: beginning
  418. [2024-07-26T13:22:19.598] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  419. [2024-07-26T13:22:20.367] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  420. [2024-07-26T13:22:20.367] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  421. [2024-07-26T13:22:20.367] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  422. [2024-07-26T13:22:20.367] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  423. [2024-07-26T13:22:21.367] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  424. [2024-07-26T13:22:21.367] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  425. [2024-07-26T13:22:21.368] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  426. [2024-07-26T13:22:21.368] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  427. [2024-07-26T13:22:22.368] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  428. [2024-07-26T13:22:22.368] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  429. [2024-07-26T13:22:22.369] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  430. [2024-07-26T13:22:22.369] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  431. [2024-07-26T13:22:23.369] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  432. [2024-07-26T13:22:23.369] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  433. [2024-07-26T13:22:23.369] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  434. [2024-07-26T13:22:23.369] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  435. [2024-07-26T13:22:24.370] debug2: Tree head got back 2
  436. [2024-07-26T13:22:49.417] debug2: Testing job time limits and checkpoints
  437. [2024-07-26T13:22:49.417] debug2: Performing purge of old job records
  438. [2024-07-26T13:22:49.417] debug: sched: Running job scheduler for full queue.
  439. [2024-07-26T13:22:49.598] debug: sched/backfill: _attempt_backfill: beginning
  440. [2024-07-26T13:22:49.598] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  441. [2024-07-26T13:23:19.462] debug2: Testing job time limits and checkpoints
  442. [2024-07-26T13:23:19.598] debug: sched/backfill: _attempt_backfill: beginning
  443. [2024-07-26T13:23:19.598] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  444. [2024-07-26T13:23:49.510] debug2: Testing job time limits and checkpoints
  445. [2024-07-26T13:23:49.510] debug: Updating partition uid access list
  446. [2024-07-26T13:23:49.510] debug2: Updating reservations group's uid access lists
  447. [2024-07-26T13:23:49.510] debug2: Performing purge of old job records
  448. [2024-07-26T13:23:49.510] debug2: Performing full system state save
  449. [2024-07-26T13:23:49.510] debug: sched: Running job scheduler for full queue.
  450. [2024-07-26T13:23:49.598] debug: sched/backfill: _attempt_backfill: beginning
  451. [2024-07-26T13:23:49.599] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  452. [2024-07-26T13:23:53.533] debug: Spawning registration agent for server[2-3] 2 hosts
  453. [2024-07-26T13:23:53.533] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  454. [2024-07-26T13:23:53.534] debug2: Tree head got back 0 looking for 2
  455. [2024-07-26T13:23:53.534] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  456. [2024-07-26T13:23:53.534] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  457. [2024-07-26T13:23:53.534] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  458. [2024-07-26T13:23:53.534] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  459. [2024-07-26T13:23:54.535] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  460. [2024-07-26T13:23:54.535] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  461. [2024-07-26T13:23:54.535] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  462. [2024-07-26T13:23:54.535] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  463. [2024-07-26T13:23:55.536] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  464. [2024-07-26T13:23:55.536] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  465. [2024-07-26T13:23:55.536] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  466. [2024-07-26T13:23:55.536] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  467. [2024-07-26T13:23:56.537] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  468. [2024-07-26T13:23:56.537] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  469. [2024-07-26T13:23:56.537] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  470. [2024-07-26T13:23:56.537] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  471. [2024-07-26T13:23:57.538] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  472. [2024-07-26T13:23:57.538] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  473. [2024-07-26T13:23:57.538] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  474. [2024-07-26T13:23:57.538] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  475. [2024-07-26T13:23:58.538] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  476. [2024-07-26T13:23:58.538] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  477. [2024-07-26T13:23:58.539] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  478. [2024-07-26T13:23:58.539] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  479. [2024-07-26T13:23:59.539] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  480. [2024-07-26T13:23:59.540] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  481. [2024-07-26T13:23:59.540] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  482. [2024-07-26T13:23:59.540] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  483. [2024-07-26T13:24:00.540] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  484. [2024-07-26T13:24:00.540] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  485. [2024-07-26T13:24:00.541] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  486. [2024-07-26T13:24:00.541] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  487. [2024-07-26T13:24:01.541] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  488. [2024-07-26T13:24:01.541] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  489. [2024-07-26T13:24:01.542] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  490. [2024-07-26T13:24:01.542] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  491. [2024-07-26T13:24:02.542] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  492. [2024-07-26T13:24:02.542] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  493. [2024-07-26T13:24:02.543] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  494. [2024-07-26T13:24:02.543] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  495. [2024-07-26T13:24:03.543] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  496. [2024-07-26T13:24:03.543] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  497. [2024-07-26T13:24:03.544] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  498. [2024-07-26T13:24:03.544] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  499. [2024-07-26T13:24:04.544] debug2: Tree head got back 1
  500. [2024-07-26T13:24:04.545] debug2: Tree head got back 2
  501. [2024-07-26T13:24:19.576] debug2: Testing job time limits and checkpoints
  502. [2024-07-26T13:24:19.599] debug: sched/backfill: _attempt_backfill: beginning
  503. [2024-07-26T13:24:19.599] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  504. [2024-07-26T13:24:49.625] debug2: Testing job time limits and checkpoints
  505. [2024-07-26T13:24:49.625] debug2: Performing purge of old job records
  506. [2024-07-26T13:24:49.625] debug: sched: Running job scheduler for full queue.
  507. [2024-07-26T13:24:50.599] debug: sched/backfill: _attempt_backfill: beginning
  508. [2024-07-26T13:24:50.600] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  509. [2024-07-26T13:25:19.674] debug2: Testing job time limits and checkpoints
  510. [2024-07-26T13:25:33.697] debug: Spawning ping agent for server1
  511. [2024-07-26T13:25:33.697] debug: Spawning registration agent for server[2-3] 2 hosts
  512. [2024-07-26T13:25:33.697] debug2: Spawning RPC agent for msg_type REQUEST_PING
  513. [2024-07-26T13:25:33.698] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  514. [2024-07-26T13:25:33.698] debug2: Tree head got back 0 looking for 1
  515. [2024-07-26T13:25:33.698] debug2: Tree head got back 0 looking for 2
  516. [2024-07-26T13:25:33.699] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  517. [2024-07-26T13:25:33.699] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  518. [2024-07-26T13:25:33.699] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  519. [2024-07-26T13:25:33.699] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  520. [2024-07-26T13:25:33.700] debug2: Tree head got back 1
  521. [2024-07-26T13:25:33.703] debug2: node_did_resp server1
  522. [2024-07-26T13:25:34.602] debug: sched/backfill: _attempt_backfill: beginning
  523. [2024-07-26T13:25:34.602] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  524. [2024-07-26T13:25:34.700] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  525. [2024-07-26T13:25:34.700] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  526. [2024-07-26T13:25:34.700] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  527. [2024-07-26T13:25:34.700] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  528. [2024-07-26T13:25:35.701] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  529. [2024-07-26T13:25:35.701] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  530. [2024-07-26T13:25:35.701] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  531. [2024-07-26T13:25:35.701] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  532. [2024-07-26T13:25:36.701] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  533. [2024-07-26T13:25:36.702] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  534. [2024-07-26T13:25:36.702] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  535. [2024-07-26T13:25:36.702] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  536. [2024-07-26T13:25:37.702] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  537. [2024-07-26T13:25:37.702] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  538. [2024-07-26T13:25:37.703] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  539. [2024-07-26T13:25:37.703] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  540. [2024-07-26T13:25:38.703] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  541. [2024-07-26T13:25:38.704] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  542. [2024-07-26T13:25:38.704] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  543. [2024-07-26T13:25:38.704] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  544. [2024-07-26T13:25:39.704] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  545. [2024-07-26T13:25:39.704] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  546. [2024-07-26T13:25:39.705] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  547. [2024-07-26T13:25:39.705] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  548. [2024-07-26T13:25:40.705] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  549. [2024-07-26T13:25:40.705] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  550. [2024-07-26T13:25:40.705] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  551. [2024-07-26T13:25:40.706] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  552. [2024-07-26T13:25:41.706] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  553. [2024-07-26T13:25:41.706] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  554. [2024-07-26T13:25:41.706] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  555. [2024-07-26T13:25:41.706] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  556. [2024-07-26T13:25:42.707] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  557. [2024-07-26T13:25:42.707] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  558. [2024-07-26T13:25:42.707] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  559. [2024-07-26T13:25:42.707] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  560. [2024-07-26T13:25:43.708] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  561. [2024-07-26T13:25:43.708] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  562. [2024-07-26T13:25:43.708] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  563. [2024-07-26T13:25:43.708] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  564. [2024-07-26T13:25:44.708] debug2: Tree head got back 1
  565. [2024-07-26T13:25:44.708] debug2: Tree head got back 2
  566. [2024-07-26T13:25:49.722] debug2: Testing job time limits and checkpoints
  567. [2024-07-26T13:25:49.722] debug2: Performing purge of old job records
  568. [2024-07-26T13:25:49.722] debug: sched: Running job scheduler for full queue.
  569. [2024-07-26T13:26:04.603] debug: sched/backfill: _attempt_backfill: beginning
  570. [2024-07-26T13:26:04.603] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  571. [2024-07-26T13:26:19.768] debug2: Testing job time limits and checkpoints
  572. [2024-07-26T13:26:49.814] debug2: Testing job time limits and checkpoints
  573. [2024-07-26T13:26:49.814] debug2: Performing purge of old job records
  574. [2024-07-26T13:26:49.815] debug: sched: Running job scheduler for full queue.
  575. [2024-07-26T13:26:50.606] debug: sched/backfill: _attempt_backfill: beginning
  576. [2024-07-26T13:26:50.606] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  577. [2024-07-26T13:27:13.850] debug: Spawning registration agent for server[2-3] 2 hosts
  578. [2024-07-26T13:27:13.850] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  579. [2024-07-26T13:27:13.851] debug2: Tree head got back 0 looking for 2
  580. [2024-07-26T13:27:13.852] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  581. [2024-07-26T13:27:13.852] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  582. [2024-07-26T13:27:13.852] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  583. [2024-07-26T13:27:13.852] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  584. [2024-07-26T13:27:14.853] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  585. [2024-07-26T13:27:14.853] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  586. [2024-07-26T13:27:14.853] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  587. [2024-07-26T13:27:14.853] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  588. [2024-07-26T13:27:15.854] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  589. [2024-07-26T13:27:15.854] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  590. [2024-07-26T13:27:15.854] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  591. [2024-07-26T13:27:15.854] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  592. [2024-07-26T13:27:16.854] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  593. [2024-07-26T13:27:16.854] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  594. [2024-07-26T13:27:16.854] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  595. [2024-07-26T13:27:16.855] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  596. [2024-07-26T13:27:17.855] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  597. [2024-07-26T13:27:17.855] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  598. [2024-07-26T13:27:17.855] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  599. [2024-07-26T13:27:17.855] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  600. [2024-07-26T13:27:18.856] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  601. [2024-07-26T13:27:18.856] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  602. [2024-07-26T13:27:18.856] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  603. [2024-07-26T13:27:18.856] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  604. [2024-07-26T13:27:19.857] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  605. [2024-07-26T13:27:19.857] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  606. [2024-07-26T13:27:19.857] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  607. [2024-07-26T13:27:19.857] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  608. [2024-07-26T13:27:19.860] debug2: Testing job time limits and checkpoints
  609. [2024-07-26T13:27:20.858] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  610. [2024-07-26T13:27:20.858] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  611. [2024-07-26T13:27:20.858] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  612. [2024-07-26T13:27:20.858] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  613. [2024-07-26T13:27:21.859] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  614. [2024-07-26T13:27:21.859] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  615. [2024-07-26T13:27:21.859] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  616. [2024-07-26T13:27:21.859] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  617. [2024-07-26T13:27:22.860] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  618. [2024-07-26T13:27:22.860] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  619. [2024-07-26T13:27:22.860] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  620. [2024-07-26T13:27:22.860] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  621. [2024-07-26T13:27:23.861] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  622. [2024-07-26T13:27:23.861] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  623. [2024-07-26T13:27:23.861] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  624. [2024-07-26T13:27:23.861] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  625. [2024-07-26T13:27:24.861] debug2: Tree head got back 1
  626. [2024-07-26T13:27:24.861] debug2: Tree head got back 2
  627. [2024-07-26T13:27:49.907] debug2: Testing job time limits and checkpoints
  628. [2024-07-26T13:27:49.907] debug2: Performing purge of old job records
  629. [2024-07-26T13:27:49.907] debug: sched: Running job scheduler for full queue.
  630. [2024-07-26T13:27:50.611] debug: sched/backfill: _attempt_backfill: beginning
  631. [2024-07-26T13:27:50.611] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  632. [2024-07-26T13:28:19.954] debug2: Testing job time limits and checkpoints
  633. [2024-07-26T13:28:50.001] debug2: Testing job time limits and checkpoints
  634. [2024-07-26T13:28:50.001] debug2: Performing purge of old job records
  635. [2024-07-26T13:28:50.001] debug2: Performing full system state save
  636. [2024-07-26T13:28:50.001] debug: sched: Running job scheduler for full queue.
  637. [2024-07-26T13:28:50.616] debug: sched/backfill: _attempt_backfill: beginning
  638. [2024-07-26T13:28:50.616] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  639. [2024-07-26T13:28:53.023] debug: Spawning ping agent for server1
  640. [2024-07-26T13:28:53.023] debug: Spawning registration agent for server[2-3] 2 hosts
  641. [2024-07-26T13:28:53.023] debug2: Spawning RPC agent for msg_type REQUEST_PING
  642. [2024-07-26T13:28:53.023] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  643. [2024-07-26T13:28:53.024] debug2: Tree head got back 0 looking for 1
  644. [2024-07-26T13:28:53.024] debug2: Tree head got back 0 looking for 2
  645. [2024-07-26T13:28:53.024] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  646. [2024-07-26T13:28:53.025] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  647. [2024-07-26T13:28:53.025] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  648. [2024-07-26T13:28:53.025] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  649. [2024-07-26T13:28:53.025] debug2: Tree head got back 1
  650. [2024-07-26T13:28:53.029] debug2: node_did_resp server1
  651. [2024-07-26T13:28:54.025] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  652. [2024-07-26T13:28:54.025] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  653. [2024-07-26T13:28:54.026] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  654. [2024-07-26T13:28:54.026] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  655. [2024-07-26T13:28:55.026] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  656. [2024-07-26T13:28:55.026] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  657. [2024-07-26T13:28:55.027] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  658. [2024-07-26T13:28:55.027] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  659. [2024-07-26T13:28:56.027] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  660. [2024-07-26T13:28:56.027] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  661. [2024-07-26T13:28:56.027] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  662. [2024-07-26T13:28:56.027] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  663. [2024-07-26T13:28:57.028] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  664. [2024-07-26T13:28:57.028] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  665. [2024-07-26T13:28:57.028] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  666. [2024-07-26T13:28:57.028] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  667. [2024-07-26T13:28:58.029] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  668. [2024-07-26T13:28:58.029] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  669. [2024-07-26T13:28:58.029] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  670. [2024-07-26T13:28:58.029] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  671. [2024-07-26T13:28:59.030] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  672. [2024-07-26T13:28:59.030] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  673. [2024-07-26T13:28:59.030] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  674. [2024-07-26T13:28:59.030] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  675. [2024-07-26T13:29:00.031] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  676. [2024-07-26T13:29:00.031] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  677. [2024-07-26T13:29:00.031] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  678. [2024-07-26T13:29:00.031] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  679. [2024-07-26T13:29:01.032] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  680. [2024-07-26T13:29:01.032] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  681. [2024-07-26T13:29:01.032] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  682. [2024-07-26T13:29:01.032] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  683. [2024-07-26T13:29:02.033] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  684. [2024-07-26T13:29:02.033] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  685. [2024-07-26T13:29:02.033] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  686. [2024-07-26T13:29:02.033] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  687. [2024-07-26T13:29:03.034] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  688. [2024-07-26T13:29:03.034] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  689. [2024-07-26T13:29:03.034] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  690. [2024-07-26T13:29:03.034] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  691. [2024-07-26T13:29:04.034] debug2: Tree head got back 1
  692. [2024-07-26T13:29:04.034] debug2: Tree head got back 2
  693. [2024-07-26T13:29:19.064] debug2: Testing job time limits and checkpoints
  694. [2024-07-26T13:29:20.616] debug: sched/backfill: _attempt_backfill: beginning
  695. [2024-07-26T13:29:20.616] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  696. [2024-07-26T13:29:49.109] debug2: Testing job time limits and checkpoints
  697. [2024-07-26T13:29:49.109] debug2: Performing purge of old job records
  698. [2024-07-26T13:29:49.109] debug: sched: Running job scheduler for full queue.
  699. [2024-07-26T13:29:50.617] debug: sched/backfill: _attempt_backfill: beginning
  700. [2024-07-26T13:29:50.617] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  701. [2024-07-26T13:30:19.151] debug2: Testing job time limits and checkpoints
  702. [2024-07-26T13:30:33.170] debug: Spawning registration agent for server[2-3] 2 hosts
  703. [2024-07-26T13:30:33.170] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  704. [2024-07-26T13:30:33.171] debug2: Tree head got back 0 looking for 2
  705. [2024-07-26T13:30:33.171] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  706. [2024-07-26T13:30:33.172] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  707. [2024-07-26T13:30:33.172] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  708. [2024-07-26T13:30:33.172] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  709. [2024-07-26T13:30:34.173] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  710. [2024-07-26T13:30:34.173] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  711. [2024-07-26T13:30:34.173] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  712. [2024-07-26T13:30:34.173] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  713. [2024-07-26T13:30:35.173] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  714. [2024-07-26T13:30:35.173] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  715. [2024-07-26T13:30:35.174] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  716. [2024-07-26T13:30:35.174] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  717. [2024-07-26T13:30:36.175] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  718. [2024-07-26T13:30:36.175] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  719. [2024-07-26T13:30:36.175] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  720. [2024-07-26T13:30:36.175] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  721. [2024-07-26T13:30:37.176] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  722. [2024-07-26T13:30:37.176] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  723. [2024-07-26T13:30:37.176] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  724. [2024-07-26T13:30:37.176] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  725. [2024-07-26T13:30:38.177] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  726. [2024-07-26T13:30:38.177] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  727. [2024-07-26T13:30:38.177] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  728. [2024-07-26T13:30:38.177] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  729. [2024-07-26T13:30:39.177] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  730. [2024-07-26T13:30:39.178] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  731. [2024-07-26T13:30:39.178] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  732. [2024-07-26T13:30:39.178] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  733. [2024-07-26T13:30:40.179] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  734. [2024-07-26T13:30:40.179] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  735. [2024-07-26T13:30:40.179] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  736. [2024-07-26T13:30:40.179] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  737. [2024-07-26T13:30:41.180] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  738. [2024-07-26T13:30:41.180] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  739. [2024-07-26T13:30:41.180] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  740. [2024-07-26T13:30:41.180] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  741. [2024-07-26T13:30:42.181] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  742. [2024-07-26T13:30:42.181] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  743. [2024-07-26T13:30:42.181] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  744. [2024-07-26T13:30:42.181] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  745. [2024-07-26T13:30:43.181] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  746. [2024-07-26T13:30:43.181] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  747. [2024-07-26T13:30:43.182] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  748. [2024-07-26T13:30:43.182] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  749. [2024-07-26T13:30:44.182] debug2: Tree head got back 1
  750. [2024-07-26T13:30:44.182] debug2: Tree head got back 2
  751. [2024-07-26T13:30:49.195] debug2: Testing job time limits and checkpoints
  752. [2024-07-26T13:30:49.195] debug2: Performing purge of old job records
  753. [2024-07-26T13:30:49.195] debug: sched: Running job scheduler for full queue.
  754. [2024-07-26T13:30:49.622] debug: sched/backfill: _attempt_backfill: beginning
  755. [2024-07-26T13:30:49.622] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  756. [2024-07-26T13:31:19.241] debug2: Testing job time limits and checkpoints
  757. [2024-07-26T13:31:19.622] debug: sched/backfill: _attempt_backfill: beginning
  758. [2024-07-26T13:31:19.622] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  759. [2024-07-26T13:31:49.286] debug2: Testing job time limits and checkpoints
  760. [2024-07-26T13:31:49.286] debug2: Performing purge of old job records
  761. [2024-07-26T13:31:49.286] debug: sched: Running job scheduler for full queue.
  762. [2024-07-26T13:31:49.622] debug: sched/backfill: _attempt_backfill: beginning
  763. [2024-07-26T13:31:49.622] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  764. [2024-07-26T13:32:13.323] debug: Spawning ping agent for server1
  765. [2024-07-26T13:32:13.323] debug: Spawning registration agent for server[2-3] 2 hosts
  766. [2024-07-26T13:32:13.323] debug2: Spawning RPC agent for msg_type REQUEST_PING
  767. [2024-07-26T13:32:13.323] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  768. [2024-07-26T13:32:13.323] debug2: Tree head got back 0 looking for 1
  769. [2024-07-26T13:32:13.324] debug2: Tree head got back 0 looking for 2
  770. [2024-07-26T13:32:13.324] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  771. [2024-07-26T13:32:13.324] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  772. [2024-07-26T13:32:13.324] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  773. [2024-07-26T13:32:13.325] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  774. [2024-07-26T13:32:13.325] debug2: Tree head got back 1
  775. [2024-07-26T13:32:13.328] debug2: node_did_resp server1
  776. [2024-07-26T13:32:14.325] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  777. [2024-07-26T13:32:14.325] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  778. [2024-07-26T13:32:14.325] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  779. [2024-07-26T13:32:14.325] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  780. [2024-07-26T13:32:15.326] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  781. [2024-07-26T13:32:15.326] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  782. [2024-07-26T13:32:15.326] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  783. [2024-07-26T13:32:15.326] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  784. [2024-07-26T13:32:16.327] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  785. [2024-07-26T13:32:16.327] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  786. [2024-07-26T13:32:16.327] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  787. [2024-07-26T13:32:16.327] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  788. [2024-07-26T13:32:17.328] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  789. [2024-07-26T13:32:17.328] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  790. [2024-07-26T13:32:17.328] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  791. [2024-07-26T13:32:17.328] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  792. [2024-07-26T13:32:18.329] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  793. [2024-07-26T13:32:18.329] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  794. [2024-07-26T13:32:18.329] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  795. [2024-07-26T13:32:18.329] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  796. [2024-07-26T13:32:19.330] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  797. [2024-07-26T13:32:19.330] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  798. [2024-07-26T13:32:19.330] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  799. [2024-07-26T13:32:19.330] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  800. [2024-07-26T13:32:19.332] debug2: Testing job time limits and checkpoints
  801. [2024-07-26T13:32:19.623] debug: sched/backfill: _attempt_backfill: beginning
  802. [2024-07-26T13:32:19.623] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  803. [2024-07-26T13:32:20.331] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  804. [2024-07-26T13:32:20.331] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  805. [2024-07-26T13:32:20.331] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  806. [2024-07-26T13:32:20.331] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  807. [2024-07-26T13:32:21.332] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  808. [2024-07-26T13:32:21.332] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  809. [2024-07-26T13:32:21.332] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  810. [2024-07-26T13:32:21.332] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  811. [2024-07-26T13:32:22.332] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  812. [2024-07-26T13:32:22.333] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  813. [2024-07-26T13:32:22.333] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  814. [2024-07-26T13:32:22.333] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  815. [2024-07-26T13:32:23.334] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  816. [2024-07-26T13:32:23.334] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  817. [2024-07-26T13:32:23.334] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  818. [2024-07-26T13:32:23.334] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  819. [2024-07-26T13:32:24.334] debug2: Tree head got back 1
  820. [2024-07-26T13:32:24.334] debug2: Tree head got back 2
  821. [2024-07-26T13:32:49.378] debug2: Testing job time limits and checkpoints
  822. [2024-07-26T13:32:49.378] debug2: Performing purge of old job records
  823. [2024-07-26T13:32:49.378] debug: sched: Running job scheduler for full queue.
  824. [2024-07-26T13:32:49.623] debug: sched/backfill: _attempt_backfill: beginning
  825. [2024-07-26T13:32:49.623] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  826. [2024-07-26T13:33:19.425] debug2: Testing job time limits and checkpoints
  827. [2024-07-26T13:33:19.623] debug: sched/backfill: _attempt_backfill: beginning
  828. [2024-07-26T13:33:19.624] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  829. [2024-07-26T13:33:49.473] debug2: Testing job time limits and checkpoints
  830. [2024-07-26T13:33:49.474] debug: Updating partition uid access list
  831. [2024-07-26T13:33:49.474] debug2: Updating reservations group's uid access lists
  832. [2024-07-26T13:33:49.474] debug2: Performing purge of old job records
  833. [2024-07-26T13:33:49.474] debug2: Performing full system state save
  834. [2024-07-26T13:33:49.474] debug: sched: Running job scheduler for full queue.
  835. [2024-07-26T13:33:49.624] debug: sched/backfill: _attempt_backfill: beginning
  836. [2024-07-26T13:33:49.624] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  837. [2024-07-26T13:33:53.496] debug: Spawning registration agent for server[2-3] 2 hosts
  838. [2024-07-26T13:33:53.496] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  839. [2024-07-26T13:33:53.497] debug2: Tree head got back 0 looking for 2
  840. [2024-07-26T13:33:53.498] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  841. [2024-07-26T13:33:53.498] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  842. [2024-07-26T13:33:53.498] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  843. [2024-07-26T13:33:53.498] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  844. [2024-07-26T13:33:54.498] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  845. [2024-07-26T13:33:54.498] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  846. [2024-07-26T13:33:54.499] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  847. [2024-07-26T13:33:54.499] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  848. [2024-07-26T13:33:55.499] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  849. [2024-07-26T13:33:55.499] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  850. [2024-07-26T13:33:55.500] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  851. [2024-07-26T13:33:55.500] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  852. [2024-07-26T13:33:56.500] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  853. [2024-07-26T13:33:56.500] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  854. [2024-07-26T13:33:56.501] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  855. [2024-07-26T13:33:56.501] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  856. [2024-07-26T13:33:57.501] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  857. [2024-07-26T13:33:57.501] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  858. [2024-07-26T13:33:57.502] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  859. [2024-07-26T13:33:57.502] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  860. [2024-07-26T13:33:58.502] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  861. [2024-07-26T13:33:58.502] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  862. [2024-07-26T13:33:58.502] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  863. [2024-07-26T13:33:58.502] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  864. [2024-07-26T13:33:59.503] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  865. [2024-07-26T13:33:59.503] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  866. [2024-07-26T13:33:59.504] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  867. [2024-07-26T13:33:59.504] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  868. [2024-07-26T13:34:00.504] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  869. [2024-07-26T13:34:00.504] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  870. [2024-07-26T13:34:00.504] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  871. [2024-07-26T13:34:00.504] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  872. [2024-07-26T13:34:01.505] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  873. [2024-07-26T13:34:01.505] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  874. [2024-07-26T13:34:01.505] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  875. [2024-07-26T13:34:01.505] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  876. [2024-07-26T13:34:02.506] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  877. [2024-07-26T13:34:02.506] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  878. [2024-07-26T13:34:02.506] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  879. [2024-07-26T13:34:02.506] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  880. [2024-07-26T13:34:03.507] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  881. [2024-07-26T13:34:03.507] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  882. [2024-07-26T13:34:03.507] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  883. [2024-07-26T13:34:03.507] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  884. [2024-07-26T13:34:04.507] debug2: Tree head got back 2
  885. [2024-07-26T13:34:04.507] debug2: Tree head got back 2
  886. [2024-07-26T13:34:19.537] debug2: Testing job time limits and checkpoints
  887. [2024-07-26T13:34:19.624] debug: sched/backfill: _attempt_backfill: beginning
  888. [2024-07-26T13:34:19.624] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  889. [2024-07-26T13:34:49.582] debug2: Testing job time limits and checkpoints
  890. [2024-07-26T13:34:49.582] debug2: Performing purge of old job records
  891. [2024-07-26T13:34:49.582] debug: sched: Running job scheduler for full queue.
  892. [2024-07-26T13:34:49.625] debug: sched/backfill: _attempt_backfill: beginning
  893. [2024-07-26T13:34:49.625] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  894. [2024-07-26T13:35:19.624] debug2: Testing job time limits and checkpoints
  895. [2024-07-26T13:35:19.625] debug: sched/backfill: _attempt_backfill: beginning
  896. [2024-07-26T13:35:19.625] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  897. [2024-07-26T13:35:33.645] debug: Spawning ping agent for server1
  898. [2024-07-26T13:35:33.645] debug: Spawning registration agent for server[2-3] 2 hosts
  899. [2024-07-26T13:35:33.645] debug2: Spawning RPC agent for msg_type REQUEST_PING
  900. [2024-07-26T13:35:33.645] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  901. [2024-07-26T13:35:33.645] debug2: Tree head got back 0 looking for 1
  902. [2024-07-26T13:35:33.645] debug2: Tree head got back 0 looking for 2
  903. [2024-07-26T13:35:33.646] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  904. [2024-07-26T13:35:33.646] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  905. [2024-07-26T13:35:33.646] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  906. [2024-07-26T13:35:33.646] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  907. [2024-07-26T13:35:33.647] debug2: Tree head got back 1
  908. [2024-07-26T13:35:33.650] debug2: node_did_resp server1
  909. [2024-07-26T13:35:34.647] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  910. [2024-07-26T13:35:34.647] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  911. [2024-07-26T13:35:34.647] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  912. [2024-07-26T13:35:34.647] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  913. [2024-07-26T13:35:35.648] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  914. [2024-07-26T13:35:35.648] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  915. [2024-07-26T13:35:35.648] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  916. [2024-07-26T13:35:35.648] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  917. [2024-07-26T13:35:36.649] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  918. [2024-07-26T13:35:36.649] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  919. [2024-07-26T13:35:36.649] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  920. [2024-07-26T13:35:36.649] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  921. [2024-07-26T13:35:37.650] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  922. [2024-07-26T13:35:37.650] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  923. [2024-07-26T13:35:37.650] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  924. [2024-07-26T13:35:37.650] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  925. [2024-07-26T13:35:38.651] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  926. [2024-07-26T13:35:38.651] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  927. [2024-07-26T13:35:38.651] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  928. [2024-07-26T13:35:38.651] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  929. [2024-07-26T13:35:39.652] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  930. [2024-07-26T13:35:39.652] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  931. [2024-07-26T13:35:39.652] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  932. [2024-07-26T13:35:39.652] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  933. [2024-07-26T13:35:40.653] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  934. [2024-07-26T13:35:40.653] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  935. [2024-07-26T13:35:40.653] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  936. [2024-07-26T13:35:40.653] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  937. [2024-07-26T13:35:41.654] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  938. [2024-07-26T13:35:41.654] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  939. [2024-07-26T13:35:41.654] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  940. [2024-07-26T13:35:41.654] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  941. [2024-07-26T13:35:42.654] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  942. [2024-07-26T13:35:42.654] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  943. [2024-07-26T13:35:42.655] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  944. [2024-07-26T13:35:42.655] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  945. [2024-07-26T13:35:43.655] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  946. [2024-07-26T13:35:43.655] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  947. [2024-07-26T13:35:43.655] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  948. [2024-07-26T13:35:43.655] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  949. [2024-07-26T13:35:44.655] debug2: Tree head got back 1
  950. [2024-07-26T13:35:44.656] debug2: Tree head got back 2
  951. [2024-07-26T13:35:49.625] debug: sched/backfill: _attempt_backfill: beginning
  952. [2024-07-26T13:35:49.625] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  953. [2024-07-26T13:35:49.671] debug2: Testing job time limits and checkpoints
  954. [2024-07-26T13:35:49.671] debug2: Performing purge of old job records
  955. [2024-07-26T13:35:49.671] debug: sched: Running job scheduler for full queue.
  956. [2024-07-26T13:36:19.626] debug: sched/backfill: _attempt_backfill: beginning
  957. [2024-07-26T13:36:19.626] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  958. [2024-07-26T13:36:19.719] debug2: Testing job time limits and checkpoints
  959. [2024-07-26T13:36:49.766] debug2: Testing job time limits and checkpoints
  960. [2024-07-26T13:36:49.766] debug2: Performing purge of old job records
  961. [2024-07-26T13:36:49.766] debug: sched: Running job scheduler for full queue.
  962. [2024-07-26T13:36:50.626] debug: sched/backfill: _attempt_backfill: beginning
  963. [2024-07-26T13:36:50.626] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  964. [2024-07-26T13:37:13.803] debug: Spawning registration agent for server[2-3] 2 hosts
  965. [2024-07-26T13:37:13.803] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  966. [2024-07-26T13:37:13.804] debug2: Tree head got back 0 looking for 2
  967. [2024-07-26T13:37:13.804] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  968. [2024-07-26T13:37:13.804] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  969. [2024-07-26T13:37:13.804] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  970. [2024-07-26T13:37:13.804] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  971. [2024-07-26T13:37:14.805] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  972. [2024-07-26T13:37:14.805] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  973. [2024-07-26T13:37:14.805] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  974. [2024-07-26T13:37:14.805] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  975. [2024-07-26T13:37:15.806] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  976. [2024-07-26T13:37:15.806] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  977. [2024-07-26T13:37:15.806] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  978. [2024-07-26T13:37:15.806] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  979. [2024-07-26T13:37:16.807] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  980. [2024-07-26T13:37:16.807] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  981. [2024-07-26T13:37:16.807] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  982. [2024-07-26T13:37:16.807] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  983. [2024-07-26T13:37:17.808] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  984. [2024-07-26T13:37:17.808] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  985. [2024-07-26T13:37:17.808] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  986. [2024-07-26T13:37:17.808] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  987. [2024-07-26T13:37:18.809] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  988. [2024-07-26T13:37:18.809] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  989. [2024-07-26T13:37:18.809] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  990. [2024-07-26T13:37:18.809] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  991. [2024-07-26T13:37:19.810] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  992. [2024-07-26T13:37:19.810] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  993. [2024-07-26T13:37:19.810] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  994. [2024-07-26T13:37:19.811] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  995. [2024-07-26T13:37:19.813] debug2: Testing job time limits and checkpoints
  996. [2024-07-26T13:37:20.811] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  997. [2024-07-26T13:37:20.812] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  998. [2024-07-26T13:37:20.812] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  999. [2024-07-26T13:37:20.812] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1000. [2024-07-26T13:37:21.812] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1001. [2024-07-26T13:37:21.813] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1002. [2024-07-26T13:37:21.813] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1003. [2024-07-26T13:37:21.813] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1004. [2024-07-26T13:37:22.814] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1005. [2024-07-26T13:37:22.814] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1006. [2024-07-26T13:37:22.814] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1007. [2024-07-26T13:37:22.814] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1008. [2024-07-26T13:37:23.814] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1009. [2024-07-26T13:37:23.814] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1010. [2024-07-26T13:37:23.815] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1011. [2024-07-26T13:37:23.815] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1012. [2024-07-26T13:37:24.815] debug2: Tree head got back 1
  1013. [2024-07-26T13:37:24.815] debug2: Tree head got back 2
  1014. [2024-07-26T13:37:49.861] debug2: Testing job time limits and checkpoints
  1015. [2024-07-26T13:37:49.861] debug2: Performing purge of old job records
  1016. [2024-07-26T13:37:49.861] debug: sched: Running job scheduler for full queue.
  1017. [2024-07-26T13:37:50.631] debug: sched/backfill: _attempt_backfill: beginning
  1018. [2024-07-26T13:37:50.631] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1019. [2024-07-26T13:38:19.907] debug2: Testing job time limits and checkpoints
  1020. [2024-07-26T13:38:49.955] debug2: Testing job time limits and checkpoints
  1021. [2024-07-26T13:38:49.955] debug2: Performing purge of old job records
  1022. [2024-07-26T13:38:49.955] debug2: Performing full system state save
  1023. [2024-07-26T13:38:49.955] debug: sched: Running job scheduler for full queue.
  1024. [2024-07-26T13:38:50.636] debug: sched/backfill: _attempt_backfill: beginning
  1025. [2024-07-26T13:38:50.636] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1026. [2024-07-26T13:38:53.978] debug: Spawning ping agent for server1
  1027. [2024-07-26T13:38:53.978] debug: Spawning registration agent for server[2-3] 2 hosts
  1028. [2024-07-26T13:38:53.978] debug2: Spawning RPC agent for msg_type REQUEST_PING
  1029. [2024-07-26T13:38:53.978] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  1030. [2024-07-26T13:38:53.979] debug2: Tree head got back 0 looking for 1
  1031. [2024-07-26T13:38:53.979] debug2: Tree head got back 0 looking for 2
  1032. [2024-07-26T13:38:53.979] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1033. [2024-07-26T13:38:53.980] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1034. [2024-07-26T13:38:53.980] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1035. [2024-07-26T13:38:53.980] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1036. [2024-07-26T13:38:53.980] debug2: Tree head got back 1
  1037. [2024-07-26T13:38:53.984] debug2: node_did_resp server1
  1038. [2024-07-26T13:38:54.981] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1039. [2024-07-26T13:38:54.981] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1040. [2024-07-26T13:38:54.981] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1041. [2024-07-26T13:38:54.981] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1042. [2024-07-26T13:38:55.982] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1043. [2024-07-26T13:38:55.982] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1044. [2024-07-26T13:38:55.982] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1045. [2024-07-26T13:38:55.982] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1046. [2024-07-26T13:38:56.983] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1047. [2024-07-26T13:38:56.983] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1048. [2024-07-26T13:38:56.983] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1049. [2024-07-26T13:38:56.983] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1050. [2024-07-26T13:38:57.984] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1051. [2024-07-26T13:38:57.984] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1052. [2024-07-26T13:38:57.984] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1053. [2024-07-26T13:38:57.984] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1054. [2024-07-26T13:38:58.985] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1055. [2024-07-26T13:38:58.985] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1056. [2024-07-26T13:38:58.985] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1057. [2024-07-26T13:38:58.985] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1058. [2024-07-26T13:38:59.986] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1059. [2024-07-26T13:38:59.986] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1060. [2024-07-26T13:38:59.986] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1061. [2024-07-26T13:38:59.986] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1062. [2024-07-26T13:39:00.986] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1063. [2024-07-26T13:39:00.986] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1064. [2024-07-26T13:39:00.987] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1065. [2024-07-26T13:39:00.987] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1066. [2024-07-26T13:39:01.987] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1067. [2024-07-26T13:39:01.987] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1068. [2024-07-26T13:39:01.987] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1069. [2024-07-26T13:39:01.988] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1070. [2024-07-26T13:39:02.988] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1071. [2024-07-26T13:39:02.988] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1072. [2024-07-26T13:39:02.989] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1073. [2024-07-26T13:39:02.989] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1074. [2024-07-26T13:39:03.989] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1075. [2024-07-26T13:39:03.989] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1076. [2024-07-26T13:39:03.989] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1077. [2024-07-26T13:39:03.989] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1078. [2024-07-26T13:39:04.990] debug2: Tree head got back 1
  1079. [2024-07-26T13:39:04.990] debug2: Tree head got back 2
  1080. [2024-07-26T13:39:19.017] debug2: Testing job time limits and checkpoints
  1081. [2024-07-26T13:39:20.636] debug: sched/backfill: _attempt_backfill: beginning
  1082. [2024-07-26T13:39:20.636] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1083. [2024-07-26T13:39:49.064] debug2: Testing job time limits and checkpoints
  1084. [2024-07-26T13:39:49.064] debug2: Performing purge of old job records
  1085. [2024-07-26T13:39:49.064] debug: sched: Running job scheduler for full queue.
  1086. [2024-07-26T13:39:50.637] debug: sched/backfill: _attempt_backfill: beginning
  1087. [2024-07-26T13:39:50.637] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1088. [2024-07-26T13:40:19.109] debug2: Testing job time limits and checkpoints
  1089. [2024-07-26T13:40:33.130] debug: Spawning registration agent for server[2-3] 2 hosts
  1090. [2024-07-26T13:40:33.130] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  1091. [2024-07-26T13:40:33.131] debug2: Tree head got back 0 looking for 2
  1092. [2024-07-26T13:40:33.132] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1093. [2024-07-26T13:40:33.132] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1094. [2024-07-26T13:40:33.132] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1095. [2024-07-26T13:40:33.132] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1096. [2024-07-26T13:40:34.133] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1097. [2024-07-26T13:40:34.133] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1098. [2024-07-26T13:40:34.133] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1099. [2024-07-26T13:40:34.133] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1100. [2024-07-26T13:40:35.133] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1101. [2024-07-26T13:40:35.134] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1102. [2024-07-26T13:40:35.134] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1103. [2024-07-26T13:40:35.134] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1104. [2024-07-26T13:40:36.134] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1105. [2024-07-26T13:40:36.134] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1106. [2024-07-26T13:40:36.135] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1107. [2024-07-26T13:40:36.135] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1108. [2024-07-26T13:40:37.135] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1109. [2024-07-26T13:40:37.135] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1110. [2024-07-26T13:40:37.135] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1111. [2024-07-26T13:40:37.135] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1112. [2024-07-26T13:40:38.136] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1113. [2024-07-26T13:40:38.136] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1114. [2024-07-26T13:40:38.136] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1115. [2024-07-26T13:40:38.136] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1116. [2024-07-26T13:40:39.137] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1117. [2024-07-26T13:40:39.137] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1118. [2024-07-26T13:40:39.137] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1119. [2024-07-26T13:40:39.137] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1120. [2024-07-26T13:40:40.138] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1121. [2024-07-26T13:40:40.138] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1122. [2024-07-26T13:40:40.138] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1123. [2024-07-26T13:40:40.138] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1124. [2024-07-26T13:40:41.139] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1125. [2024-07-26T13:40:41.139] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1126. [2024-07-26T13:40:41.139] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1127. [2024-07-26T13:40:41.139] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1128. [2024-07-26T13:40:42.140] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1129. [2024-07-26T13:40:42.140] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1130. [2024-07-26T13:40:42.140] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1131. [2024-07-26T13:40:42.140] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1132. [2024-07-26T13:40:43.141] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1133. [2024-07-26T13:40:43.141] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1134. [2024-07-26T13:40:43.141] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1135. [2024-07-26T13:40:43.141] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1136. [2024-07-26T13:40:44.141] debug2: Tree head got back 1
  1137. [2024-07-26T13:40:44.141] debug2: Tree head got back 2
  1138. [2024-07-26T13:40:49.153] debug2: Testing job time limits and checkpoints
  1139. [2024-07-26T13:40:49.153] debug2: Performing purge of old job records
  1140. [2024-07-26T13:40:49.153] debug: sched: Running job scheduler for full queue.
  1141. [2024-07-26T13:40:49.642] debug: sched/backfill: _attempt_backfill: beginning
  1142. [2024-07-26T13:40:49.642] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1143. [2024-07-26T13:41:19.196] debug2: Testing job time limits and checkpoints
  1144. [2024-07-26T13:41:19.642] debug: sched/backfill: _attempt_backfill: beginning
  1145. [2024-07-26T13:41:19.642] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1146. [2024-07-26T13:41:49.244] debug2: Testing job time limits and checkpoints
  1147. [2024-07-26T13:41:49.244] debug2: Performing purge of old job records
  1148. [2024-07-26T13:41:49.244] debug: sched: Running job scheduler for full queue.
  1149. [2024-07-26T13:41:49.642] debug: sched/backfill: _attempt_backfill: beginning
  1150. [2024-07-26T13:41:49.642] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1151. [2024-07-26T13:42:13.279] debug: Spawning ping agent for server1
  1152. [2024-07-26T13:42:13.279] debug: Spawning registration agent for server[2-3] 2 hosts
  1153. [2024-07-26T13:42:13.279] debug2: Spawning RPC agent for msg_type REQUEST_PING
  1154. [2024-07-26T13:42:13.279] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  1155. [2024-07-26T13:42:13.279] debug2: Tree head got back 0 looking for 1
  1156. [2024-07-26T13:42:13.280] debug2: Tree head got back 0 looking for 2
  1157. [2024-07-26T13:42:13.280] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1158. [2024-07-26T13:42:13.280] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1159. [2024-07-26T13:42:13.280] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1160. [2024-07-26T13:42:13.280] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1161. [2024-07-26T13:42:13.281] debug2: Tree head got back 1
  1162. [2024-07-26T13:42:13.284] debug2: node_did_resp server1
  1163. [2024-07-26T13:42:14.281] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1164. [2024-07-26T13:42:14.281] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1165. [2024-07-26T13:42:14.281] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1166. [2024-07-26T13:42:14.281] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1167. [2024-07-26T13:42:15.282] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1168. [2024-07-26T13:42:15.282] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1169. [2024-07-26T13:42:15.282] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1170. [2024-07-26T13:42:15.282] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1171. [2024-07-26T13:42:16.283] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1172. [2024-07-26T13:42:16.283] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1173. [2024-07-26T13:42:16.283] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1174. [2024-07-26T13:42:16.283] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1175. [2024-07-26T13:42:17.284] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1176. [2024-07-26T13:42:17.284] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1177. [2024-07-26T13:42:17.284] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1178. [2024-07-26T13:42:17.284] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1179. [2024-07-26T13:42:18.285] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1180. [2024-07-26T13:42:18.285] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1181. [2024-07-26T13:42:18.285] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1182. [2024-07-26T13:42:18.285] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1183. [2024-07-26T13:42:19.286] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1184. [2024-07-26T13:42:19.286] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1185. [2024-07-26T13:42:19.286] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1186. [2024-07-26T13:42:19.286] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1187. [2024-07-26T13:42:19.289] debug2: Testing job time limits and checkpoints
  1188. [2024-07-26T13:42:19.643] debug: sched/backfill: _attempt_backfill: beginning
  1189. [2024-07-26T13:42:19.643] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1190. [2024-07-26T13:42:20.287] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1191. [2024-07-26T13:42:20.287] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1192. [2024-07-26T13:42:20.287] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1193. [2024-07-26T13:42:20.287] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1194. [2024-07-26T13:42:21.288] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1195. [2024-07-26T13:42:21.288] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1196. [2024-07-26T13:42:21.288] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1197. [2024-07-26T13:42:21.288] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1198. [2024-07-26T13:42:22.288] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1199. [2024-07-26T13:42:22.288] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1200. [2024-07-26T13:42:22.289] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1201. [2024-07-26T13:42:22.289] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1202. [2024-07-26T13:42:23.289] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1203. [2024-07-26T13:42:23.290] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1204. [2024-07-26T13:42:23.290] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1205. [2024-07-26T13:42:23.290] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1206. [2024-07-26T13:42:24.290] debug2: Tree head got back 1
  1207. [2024-07-26T13:42:24.290] debug2: Tree head got back 2
  1208. [2024-07-26T13:42:49.336] debug2: Testing job time limits and checkpoints
  1209. [2024-07-26T13:42:49.336] debug2: Performing purge of old job records
  1210. [2024-07-26T13:42:49.336] debug: sched: Running job scheduler for full queue.
  1211. [2024-07-26T13:42:49.643] debug: sched/backfill: _attempt_backfill: beginning
  1212. [2024-07-26T13:42:49.643] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1213. [2024-07-26T13:43:19.382] debug2: Testing job time limits and checkpoints
  1214. [2024-07-26T13:43:19.643] debug: sched/backfill: _attempt_backfill: beginning
  1215. [2024-07-26T13:43:19.644] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1216. [2024-07-26T13:43:49.430] debug2: Testing job time limits and checkpoints
  1217. [2024-07-26T13:43:49.430] debug: Updating partition uid access list
  1218. [2024-07-26T13:43:49.430] debug2: Updating reservations group's uid access lists
  1219. [2024-07-26T13:43:49.430] debug2: Performing purge of old job records
  1220. [2024-07-26T13:43:49.430] debug2: Performing full system state save
  1221. [2024-07-26T13:43:49.430] debug: sched: Running job scheduler for full queue.
  1222. [2024-07-26T13:43:49.644] debug: sched/backfill: _attempt_backfill: beginning
  1223. [2024-07-26T13:43:49.644] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1224. [2024-07-26T13:43:53.452] debug: Spawning registration agent for server[2-3] 2 hosts
  1225. [2024-07-26T13:43:53.452] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  1226. [2024-07-26T13:43:53.453] debug2: Tree head got back 0 looking for 2
  1227. [2024-07-26T13:43:53.453] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1228. [2024-07-26T13:43:53.453] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1229. [2024-07-26T13:43:53.453] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1230. [2024-07-26T13:43:53.453] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1231. [2024-07-26T13:43:54.454] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1232. [2024-07-26T13:43:54.454] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1233. [2024-07-26T13:43:54.454] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1234. [2024-07-26T13:43:54.454] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1235. [2024-07-26T13:43:55.455] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1236. [2024-07-26T13:43:55.455] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1237. [2024-07-26T13:43:55.455] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1238. [2024-07-26T13:43:55.455] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1239. [2024-07-26T13:43:56.456] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1240. [2024-07-26T13:43:56.456] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1241. [2024-07-26T13:43:56.456] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1242. [2024-07-26T13:43:56.456] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1243. [2024-07-26T13:43:57.457] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1244. [2024-07-26T13:43:57.457] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1245. [2024-07-26T13:43:57.457] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1246. [2024-07-26T13:43:57.457] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1247. [2024-07-26T13:43:58.458] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1248. [2024-07-26T13:43:58.458] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1249. [2024-07-26T13:43:58.458] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1250. [2024-07-26T13:43:58.458] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1251. [2024-07-26T13:43:59.459] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1252. [2024-07-26T13:43:59.459] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1253. [2024-07-26T13:43:59.459] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1254. [2024-07-26T13:43:59.459] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1255. [2024-07-26T13:44:00.460] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1256. [2024-07-26T13:44:00.460] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1257. [2024-07-26T13:44:00.460] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1258. [2024-07-26T13:44:00.460] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1259. [2024-07-26T13:44:01.461] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1260. [2024-07-26T13:44:01.461] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1261. [2024-07-26T13:44:01.461] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1262. [2024-07-26T13:44:01.461] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1263. [2024-07-26T13:44:02.461] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1264. [2024-07-26T13:44:02.461] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1265. [2024-07-26T13:44:02.462] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1266. [2024-07-26T13:44:02.462] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1267. [2024-07-26T13:44:03.462] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1268. [2024-07-26T13:44:03.463] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1269. [2024-07-26T13:44:03.463] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1270. [2024-07-26T13:44:03.463] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1271. [2024-07-26T13:44:04.463] debug2: Tree head got back 1
  1272. [2024-07-26T13:44:04.464] debug2: Tree head got back 2
  1273. [2024-07-26T13:44:19.493] debug2: Testing job time limits and checkpoints
  1274. [2024-07-26T13:44:19.644] debug: sched/backfill: _attempt_backfill: beginning
  1275. [2024-07-26T13:44:19.644] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1276. [2024-07-26T13:44:49.535] debug2: Testing job time limits and checkpoints
  1277. [2024-07-26T13:44:49.536] debug2: Performing purge of old job records
  1278. [2024-07-26T13:44:49.536] debug: sched: Running job scheduler for full queue.
  1279. [2024-07-26T13:44:49.645] debug: sched/backfill: _attempt_backfill: beginning
  1280. [2024-07-26T13:44:49.645] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1281. [2024-07-26T13:45:19.579] debug2: Testing job time limits and checkpoints
  1282. [2024-07-26T13:45:19.645] debug: sched/backfill: _attempt_backfill: beginning
  1283. [2024-07-26T13:45:19.645] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1284. [2024-07-26T13:45:33.601] debug: Spawning registration agent for server[1-3] 3 hosts
  1285. [2024-07-26T13:45:33.601] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  1286. [2024-07-26T13:45:33.602] debug2: Tree head got back 0 looking for 3
  1287. [2024-07-26T13:45:33.602] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1288. [2024-07-26T13:45:33.603] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1289. [2024-07-26T13:45:33.603] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1290. [2024-07-26T13:45:33.603] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1291. [2024-07-26T13:45:33.604] debug2: Tree head got back 1
  1292. [2024-07-26T13:45:33.604] debug2: Processing RPC: MESSAGE_NODE_REGISTRATION_STATUS from UID=0
  1293. [2024-07-26T13:45:33.604] debug2: _slurm_rpc_node_registration complete for server1 usec=14
  1294. [2024-07-26T13:45:34.604] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1295. [2024-07-26T13:45:34.604] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1296. [2024-07-26T13:45:34.604] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1297. [2024-07-26T13:45:34.604] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1298. [2024-07-26T13:45:35.605] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1299. [2024-07-26T13:45:35.605] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1300. [2024-07-26T13:45:35.605] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1301. [2024-07-26T13:45:35.605] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1302. [2024-07-26T13:45:36.606] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1303. [2024-07-26T13:45:36.606] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1304. [2024-07-26T13:45:36.606] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1305. [2024-07-26T13:45:36.606] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1306. [2024-07-26T13:45:37.607] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1307. [2024-07-26T13:45:37.607] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1308. [2024-07-26T13:45:37.607] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1309. [2024-07-26T13:45:37.607] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1310. [2024-07-26T13:45:38.608] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1311. [2024-07-26T13:45:38.608] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1312. [2024-07-26T13:45:38.608] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1313. [2024-07-26T13:45:38.608] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1314. [2024-07-26T13:45:39.609] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1315. [2024-07-26T13:45:39.609] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1316. [2024-07-26T13:45:39.609] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1317. [2024-07-26T13:45:39.609] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1318. [2024-07-26T13:45:40.610] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1319. [2024-07-26T13:45:40.610] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1320. [2024-07-26T13:45:40.610] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1321. [2024-07-26T13:45:40.610] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1322. [2024-07-26T13:45:41.610] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1323. [2024-07-26T13:45:41.611] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1324. [2024-07-26T13:45:41.611] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1325. [2024-07-26T13:45:41.611] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1326. [2024-07-26T13:45:42.612] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1327. [2024-07-26T13:45:42.612] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1328. [2024-07-26T13:45:42.612] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1329. [2024-07-26T13:45:42.612] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1330. [2024-07-26T13:45:43.613] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1331. [2024-07-26T13:45:43.613] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1332. [2024-07-26T13:45:43.613] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1333. [2024-07-26T13:45:43.613] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1334. [2024-07-26T13:45:44.613] debug2: Tree head got back 2
  1335. [2024-07-26T13:45:44.613] debug2: Tree head got back 3
  1336. [2024-07-26T13:45:44.879] debug2: node_did_resp server1
  1337. [2024-07-26T13:45:49.625] debug2: Testing job time limits and checkpoints
  1338. [2024-07-26T13:45:49.625] debug2: Performing purge of old job records
  1339. [2024-07-26T13:45:49.626] debug: sched: Running job scheduler for full queue.
  1340. [2024-07-26T13:45:49.645] debug: sched/backfill: _attempt_backfill: beginning
  1341. [2024-07-26T13:45:49.645] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1342. [2024-07-26T13:46:19.646] debug: sched/backfill: _attempt_backfill: beginning
  1343. [2024-07-26T13:46:19.646] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1344. [2024-07-26T13:46:19.673] debug2: Testing job time limits and checkpoints
  1345. [2024-07-26T13:46:49.716] debug2: Testing job time limits and checkpoints
  1346. [2024-07-26T13:46:49.716] debug2: Performing purge of old job records
  1347. [2024-07-26T13:46:49.716] debug: sched: Running job scheduler for full queue.
  1348. [2024-07-26T13:46:50.646] debug: sched/backfill: _attempt_backfill: beginning
  1349. [2024-07-26T13:46:50.646] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1350. [2024-07-26T13:47:13.752] debug: Spawning registration agent for server[2-3] 2 hosts
  1351. [2024-07-26T13:47:13.752] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  1352. [2024-07-26T13:47:13.753] debug2: Tree head got back 0 looking for 2
  1353. [2024-07-26T13:47:13.754] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1354. [2024-07-26T13:47:13.754] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1355. [2024-07-26T13:47:13.754] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1356. [2024-07-26T13:47:13.754] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1357. [2024-07-26T13:47:14.755] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1358. [2024-07-26T13:47:14.755] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1359. [2024-07-26T13:47:14.755] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1360. [2024-07-26T13:47:14.755] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1361. [2024-07-26T13:47:15.756] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1362. [2024-07-26T13:47:15.756] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1363. [2024-07-26T13:47:15.756] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1364. [2024-07-26T13:47:15.756] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1365. [2024-07-26T13:47:16.756] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1366. [2024-07-26T13:47:16.757] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1367. [2024-07-26T13:47:16.757] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1368. [2024-07-26T13:47:16.757] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1369. [2024-07-26T13:47:17.757] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1370. [2024-07-26T13:47:17.757] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1371. [2024-07-26T13:47:17.758] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1372. [2024-07-26T13:47:17.758] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1373. [2024-07-26T13:47:18.758] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1374. [2024-07-26T13:47:18.758] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1375. [2024-07-26T13:47:18.759] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1376. [2024-07-26T13:47:18.759] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1377. [2024-07-26T13:47:19.759] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1378. [2024-07-26T13:47:19.760] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1379. [2024-07-26T13:47:19.760] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1380. [2024-07-26T13:47:19.760] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1381. [2024-07-26T13:47:19.762] debug2: Testing job time limits and checkpoints
  1382. [2024-07-26T13:47:20.760] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1383. [2024-07-26T13:47:20.760] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1384. [2024-07-26T13:47:20.761] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1385. [2024-07-26T13:47:20.761] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1386. [2024-07-26T13:47:21.761] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1387. [2024-07-26T13:47:21.761] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1388. [2024-07-26T13:47:21.762] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1389. [2024-07-26T13:47:21.762] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1390. [2024-07-26T13:47:22.762] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1391. [2024-07-26T13:47:22.762] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1392. [2024-07-26T13:47:22.763] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1393. [2024-07-26T13:47:22.763] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1394. [2024-07-26T13:47:23.763] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1395. [2024-07-26T13:47:23.763] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1396. [2024-07-26T13:47:23.764] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1397. [2024-07-26T13:47:23.764] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1398. [2024-07-26T13:47:24.763] debug2: Tree head got back 1
  1399. [2024-07-26T13:47:24.764] debug2: Tree head got back 2
  1400. [2024-07-26T13:47:49.810] debug2: Testing job time limits and checkpoints
  1401. [2024-07-26T13:47:49.810] debug2: Performing purge of old job records
  1402. [2024-07-26T13:47:49.810] debug: sched: Running job scheduler for full queue.
  1403. [2024-07-26T13:47:50.651] debug: sched/backfill: _attempt_backfill: beginning
  1404. [2024-07-26T13:47:50.652] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1405. [2024-07-26T13:48:19.853] debug2: Testing job time limits and checkpoints
  1406. [2024-07-26T13:48:49.899] debug2: Testing job time limits and checkpoints
  1407. [2024-07-26T13:48:49.899] debug2: Performing purge of old job records
  1408. [2024-07-26T13:48:49.899] debug2: Performing full system state save
  1409. [2024-07-26T13:48:49.899] debug: sched: Running job scheduler for full queue.
  1410. [2024-07-26T13:48:50.656] debug: sched/backfill: _attempt_backfill: beginning
  1411. [2024-07-26T13:48:50.657] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1412. [2024-07-26T13:48:53.922] debug: Spawning ping agent for server1
  1413. [2024-07-26T13:48:53.922] debug: Spawning registration agent for server[2-3] 2 hosts
  1414. [2024-07-26T13:48:53.922] debug2: Spawning RPC agent for msg_type REQUEST_PING
  1415. [2024-07-26T13:48:53.922] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  1416. [2024-07-26T13:48:53.922] debug2: Tree head got back 0 looking for 1
  1417. [2024-07-26T13:48:53.922] debug2: Tree head got back 0 looking for 2
  1418. [2024-07-26T13:48:53.923] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1419. [2024-07-26T13:48:53.923] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1420. [2024-07-26T13:48:53.923] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1421. [2024-07-26T13:48:53.923] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1422. [2024-07-26T13:48:53.924] debug2: Tree head got back 1
  1423. [2024-07-26T13:48:53.927] debug2: node_did_resp server1
  1424. [2024-07-26T13:48:54.924] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1425. [2024-07-26T13:48:54.924] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1426. [2024-07-26T13:48:54.924] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1427. [2024-07-26T13:48:54.924] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1428. [2024-07-26T13:48:55.925] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1429. [2024-07-26T13:48:55.925] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1430. [2024-07-26T13:48:55.925] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1431. [2024-07-26T13:48:55.925] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1432. [2024-07-26T13:48:56.926] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1433. [2024-07-26T13:48:56.926] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1434. [2024-07-26T13:48:56.926] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1435. [2024-07-26T13:48:56.926] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1436. [2024-07-26T13:48:57.927] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1437. [2024-07-26T13:48:57.927] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1438. [2024-07-26T13:48:57.927] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1439. [2024-07-26T13:48:57.927] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1440. [2024-07-26T13:48:58.928] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1441. [2024-07-26T13:48:58.928] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1442. [2024-07-26T13:48:58.928] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1443. [2024-07-26T13:48:58.928] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1444. [2024-07-26T13:48:59.928] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1445. [2024-07-26T13:48:59.929] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1446. [2024-07-26T13:48:59.929] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1447. [2024-07-26T13:48:59.929] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1448. [2024-07-26T13:49:00.929] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1449. [2024-07-26T13:49:00.929] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1450. [2024-07-26T13:49:00.930] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1451. [2024-07-26T13:49:00.930] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1452. [2024-07-26T13:49:01.930] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1453. [2024-07-26T13:49:01.930] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1454. [2024-07-26T13:49:01.931] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1455. [2024-07-26T13:49:01.931] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1456. [2024-07-26T13:49:02.931] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1457. [2024-07-26T13:49:02.931] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1458. [2024-07-26T13:49:02.932] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1459. [2024-07-26T13:49:02.932] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1460. [2024-07-26T13:49:03.932] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1461. [2024-07-26T13:49:03.932] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1462. [2024-07-26T13:49:03.933] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1463. [2024-07-26T13:49:03.933] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1464. [2024-07-26T13:49:04.933] debug2: Tree head got back 1
  1465. [2024-07-26T13:49:04.933] debug2: Tree head got back 2
  1466. [2024-07-26T13:49:19.960] debug2: Testing job time limits and checkpoints
  1467. [2024-07-26T13:49:20.657] debug: sched/backfill: _attempt_backfill: beginning
  1468. [2024-07-26T13:49:20.657] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1469. [2024-07-26T13:49:49.005] debug2: Testing job time limits and checkpoints
  1470. [2024-07-26T13:49:49.005] debug2: Performing purge of old job records
  1471. [2024-07-26T13:49:49.005] debug: sched: Running job scheduler for full queue.
  1472. [2024-07-26T13:49:50.657] debug: sched/backfill: _attempt_backfill: beginning
  1473. [2024-07-26T13:49:50.657] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1474. [2024-07-26T13:50:19.050] debug2: Testing job time limits and checkpoints
  1475. [2024-07-26T13:50:33.073] debug: Spawning registration agent for server[2-3] 2 hosts
  1476. [2024-07-26T13:50:33.073] debug2: Spawning RPC agent for msg_type REQUEST_NODE_REGISTRATION_STATUS
  1477. [2024-07-26T13:50:33.073] debug2: Tree head got back 0 looking for 2
  1478. [2024-07-26T13:50:33.074] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1479. [2024-07-26T13:50:33.074] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1480. [2024-07-26T13:50:33.074] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1481. [2024-07-26T13:50:33.074] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1482. [2024-07-26T13:50:34.075] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1483. [2024-07-26T13:50:34.075] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1484. [2024-07-26T13:50:34.075] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1485. [2024-07-26T13:50:34.075] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1486. [2024-07-26T13:50:35.076] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1487. [2024-07-26T13:50:35.076] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1488. [2024-07-26T13:50:35.076] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1489. [2024-07-26T13:50:35.076] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1490. [2024-07-26T13:50:36.077] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1491. [2024-07-26T13:50:36.077] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1492. [2024-07-26T13:50:36.077] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1493. [2024-07-26T13:50:36.077] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1494. [2024-07-26T13:50:37.078] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1495. [2024-07-26T13:50:37.078] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1496. [2024-07-26T13:50:37.078] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1497. [2024-07-26T13:50:37.078] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1498. [2024-07-26T13:50:38.079] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1499. [2024-07-26T13:50:38.079] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1500. [2024-07-26T13:50:38.079] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1501. [2024-07-26T13:50:38.079] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1502. [2024-07-26T13:50:39.080] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1503. [2024-07-26T13:50:39.080] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1504. [2024-07-26T13:50:39.080] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1505. [2024-07-26T13:50:39.080] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1506. [2024-07-26T13:50:40.081] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1507. [2024-07-26T13:50:40.081] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1508. [2024-07-26T13:50:40.081] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1509. [2024-07-26T13:50:40.081] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1510. [2024-07-26T13:50:41.082] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1511. [2024-07-26T13:50:41.082] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1512. [2024-07-26T13:50:41.082] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1513. [2024-07-26T13:50:41.082] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1514. [2024-07-26T13:50:42.082] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1515. [2024-07-26T13:50:42.083] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1516. [2024-07-26T13:50:42.083] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1517. [2024-07-26T13:50:42.083] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1518. [2024-07-26T13:50:43.083] debug2: _slurm_connect: failed to connect to 10.36.17.132:6818: Connection refused
  1519. [2024-07-26T13:50:43.084] debug2: Error connecting slurm stream socket at 10.36.17.132:6818: Connection refused
  1520. [2024-07-26T13:50:43.084] debug2: _slurm_connect: failed to connect to 10.36.17.166:6818: Connection refused
  1521. [2024-07-26T13:50:43.084] debug2: Error connecting slurm stream socket at 10.36.17.166:6818: Connection refused
  1522. [2024-07-26T13:50:44.084] debug2: Tree head got back 1
  1523. [2024-07-26T13:50:44.084] debug2: Tree head got back 2
  1524. [2024-07-26T13:50:49.095] debug2: Testing job time limits and checkpoints
  1525. [2024-07-26T13:50:49.095] debug2: Performing purge of old job records
  1526. [2024-07-26T13:50:49.096] debug: sched: Running job scheduler for full queue.
  1527. [2024-07-26T13:50:49.662] debug: sched/backfill: _attempt_backfill: beginning
  1528. [2024-07-26T13:50:49.662] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1529. [2024-07-26T13:51:19.140] debug2: Testing job time limits and checkpoints
  1530. [2024-07-26T13:51:19.662] debug: sched/backfill: _attempt_backfill: beginning
  1531. [2024-07-26T13:51:19.662] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill
  1532. [2024-07-26T13:51:49.187] debug2: Testing job time limits and checkpoints
  1533. [2024-07-26T13:51:49.188] debug2: Performing purge of old job records
  1534. [2024-07-26T13:51:49.188] debug: sched: Running job scheduler for full queue.
  1535. [2024-07-26T13:51:49.663] debug: sched/backfill: _attempt_backfill: beginning
  1536. [2024-07-26T13:51:49.663] debug: sched/backfill: _attempt_backfill: 1 jobs to backfill