
mpirun

Apr 28th, 2023
[node01:2220381] mca: base: components_register: registering framework oob components
[node01:2220381] mca: base: components_register: found loaded component tcp
[node01:2220381] mca: base: components_register: component tcp register function successful
[node01:2220381] mca: base: components_open: opening oob components
[node01:2220381] mca: base: components_open: found loaded component tcp
[node01:2220381] mca: base: components_open: component tcp open function successful
[node01:2220381] mca:oob:select: checking available component tcp
[node01:2220381] mca:oob:select: Querying component [tcp]
[node01:2220381] oob:tcp: component_available called
[node01:2220381] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
[node01:2220381] WORKING INTERFACE 2 KERNEL INDEX 2 FAMILY: V4
[node01:2220381] [[63667,0],0] oob:tcp:init adding <ip_node_01> to our list of V4 connections
[node01:2220381] [[63667,0],0] TCP STARTUP
[node01:2220381] [[63667,0],0] attempting to bind to IPv4 port 0
[node01:2220381] [[63667,0],0] assigned IPv4 port 51187
[node01:2220381] mca:oob:select: Adding component to end
[node01:2220381] mca:oob:select: Found 1 active transports
[node01:2220381] mca: base: components_register: registering framework rml components
[node01:2220381] mca: base: components_register: found loaded component oob
[node01:2220381] mca: base: components_register: component oob has no register or open function
[node01:2220381] mca: base: components_open: opening rml components
[node01:2220381] mca: base: components_open: found loaded component oob
[node01:2220381] [[63667,0],0]: get transports
[node01:2220381] [[63667,0],0]:get transports for component tcp
[node01:2220381] mca: base: components_open: component oob open function successful
[node01:2220381] orte_rml_base_select: Initializing rml component oob
[node01:2220381] [[63667,0],0]: Final rml priorities
[node01:2220381] Component: oob Priority: 5
[node01:2220381] [[63667,0],0] rml:base:open_conduit
[node01:2220381] [[63667,0],0] rml:base:open_conduit Component oob provided a conduit
[node01:2220381] [[63667,0],0] rml:base:open_conduit
[node01:2220381] [[63667,0],0] rml:base:open_conduit Component oob provided a conduit
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 27
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 50
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 51
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 6
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 28
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 59
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 15
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 33
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 31
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 5
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 10
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 12
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 62
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 36
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 2
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 21
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 22
[node01:2220381] [[63667,0],0] rml_recv_buffer_nb for peer [[WILDCARD],WILDCARD] tag 1
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 27 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 50 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 51 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 6 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 28 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 59 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 15 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 33 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 31 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 5 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 10 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 12 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 62 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 36 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 2 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 21 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 22 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] posting recv
[node01:2220381] [[63667,0],0] posting persistent recv on tag 1 for peer [[WILDCARD],WILDCARD]
[node01:2220381] [[63667,0],0] plm:slurm: final top-level argv:
srun --ntasks-per-node=1 --kill-on-bad-exit --mpi=none --nodes=1 --nodelist=node02 --ntasks=1 orted -mca orte_debug_daemons "1" -mca ess "slurm" -mca ess_base_jobid "4172480512" -mca ess_base_vpid "1" -mca ess_base_num_procs "2" -mca orte_node_regex "node[2:01-02]@0(2)" -mca orte_hnp_uri "4172480512.0;tcp://<ip_node_01>:51187" --mca plm_base_verbose "5" -mca oob_base_verbose "10" -mca rml_base_verbose "10"
srun: error: Unable to create step for job 18: Requested node configuration is not available
--------------------------------------------------------------------------
An ORTE daemon has unexpectedly failed after launch and before
communicating back to mpirun. This could be caused by a number
of factors, including an inability to create a connection back
to mpirun due to a lack of common network interfaces and/or no
route found between them. Please check network connectivity
(including firewalls and network routing requirements).
--------------------------------------------------------------------------
[node01:2220381] [[63667,0],0] rml:base:send_buffer_nb() to peer [[63667,0],0] through conduit 1
[node01:2220381] [[63667,0],0] Message posted at grpcomm_direct.c:617 for tag 1
[node01:2220381] [[63667,0],0] orted_cmd: received halt_vm cmd
[node01:2220381] [[63667,0],0] rml_recv_cancel for peer [[WILDCARD],WILDCARD] tag 36
[node01:2220381] [[63667,0],0] rml_recv_cancel for peer [[WILDCARD],WILDCARD] tag 50
[node01:2220381] [[63667,0],0] rml_recv_cancel for peer [[WILDCARD],WILDCARD] tag 51
[node01:2220381] [[63667,0],0] rml_recv_cancel for peer [[WILDCARD],WILDCARD] tag 6
[node01:2220381] [[63667,0],0] rml_recv_cancel for peer [[WILDCARD],WILDCARD] tag 28
[node01:2220381] [[63667,0],0] rml_recv_cancel for peer [[WILDCARD],WILDCARD] tag 59
[node01:2220381] [[63667,0],0] rml:base:close_conduit(0)
[node01:2220381] [[63667,0],0] rml:base:close_conduit(1)
[node01:2220381] [[63667,0],0] rml_recv_cancel for peer [[WILDCARD],WILDCARD] tag 5
[node01:2220381] [[63667,0],0] rml_recv_cancel for peer [[WILDCARD],WILDCARD] tag 10
[node01:2220381] [[63667,0],0] rml_recv_cancel for peer [[WILDCARD],WILDCARD] tag 12
[node01:2220381] [[63667,0],0] rml_recv_cancel for peer [[WILDCARD],WILDCARD] tag 62
[node01:2220381] mca: base: close: component oob closed
[node01:2220381] mca: base: close: unloading component oob
[node01:2220381] [[63667,0],0] TCP SHUTDOWN
[node01:2220381] [[63667,0],0] TCP SHUTDOWN done
[node01:2220381] mca: base: close: component tcp closed
[node01:2220381] mca: base: close: unloading component tcp