DRBD after fencing with rhcs_fence

By: a guest on Jan 25th, 2012
Log report

In node 1:

Jan 25 11:25:33 wsguardian1 kernel: igb: eth4 NIC Link is Down
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: PingAck did not arrive in time.
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: asender terminated
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: Terminating asender thread
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: short read expecting header on sock: r=-512
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: Creating new current UUID
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: Connection closed
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: helper command: /sbin/drbdadm fence-peer minor-0
Jan 25 11:25:39 wsguardian1 rhcs_fence: Attempting to fence peer using RHCS from DRBD...
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 255 (0xff00)
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: fence-peer helper broken, returned 255
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: Considering state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Jan 25 11:25:39 wsguardian1 kernel: block drbd0:  old = { cs:NetworkFailure ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:39 wsguardian1 kernel: block drbd0:  new = { cs:Unconnected ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: conn( NetworkFailure -> Unconnected )
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: receiver terminated
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: Restarting receiver thread
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: receiver (re)started
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: Considering state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Jan 25 11:25:39 wsguardian1 kernel: block drbd0:  old = { cs:Unconnected ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:39 wsguardian1 kernel: block drbd0:  new = { cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:39 wsguardian1 kernel: block drbd0: conn( Unconnected -> WFConnection )
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: PingAck did not arrive in time.
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: asender terminated
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: Terminating asender thread
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: short read expecting header on sock: r=-512
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: Creating new current UUID
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: Connection closed
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: helper command: /sbin/drbdadm fence-peer minor-1
Jan 25 11:25:39 wsguardian1 rhcs_fence: Attempting to fence peer using RHCS from DRBD...
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: helper command: /sbin/drbdadm fence-peer minor-1 exit code 255 (0xff00)
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: fence-peer helper broken, returned 255
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: Considering state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Jan 25 11:25:39 wsguardian1 kernel: block drbd1:  old = { cs:NetworkFailure ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:39 wsguardian1 kernel: block drbd1:  new = { cs:Unconnected ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: conn( NetworkFailure -> Unconnected )
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: receiver terminated
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: Restarting receiver thread
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: receiver (re)started
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: Considering state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Jan 25 11:25:39 wsguardian1 kernel: block drbd1:  old = { cs:Unconnected ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:39 wsguardian1 kernel: block drbd1:  new = { cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:39 wsguardian1 kernel: block drbd1: conn( Unconnected -> WFConnection )
Jan 25 11:25:42 wsguardian1 openais[2880]: [TOTEM] The token was lost in the OPERATIONAL state.
Jan 25 11:25:42 wsguardian1 openais[2880]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).
Jan 25 11:25:42 wsguardian1 openais[2880]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Jan 25 11:25:42 wsguardian1 openais[2880]: [TOTEM] entering GATHER state from 2.

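The "exit code 255 (0xff00)" lines above appear to show both the helper's exit code and the raw wait status it was taken from. A minimal Python sketch that pulls these apart is given here; the function name decode_helper_exit and the regular expression are illustrative assumptions, not part of DRBD or rhcs_fence, and the decoding assumes the usual Linux wait-status layout (exit code in bits 8-15, terminating signal in the low 7 bits).

```python
import re

# Matches the kernel's "helper command: <cmd> exit code <n> (<hex>)" message.
HELPER_RE = re.compile(
    r"helper command: (?P<cmd>.+?) exit code (?P<code>\d+) \((?P<raw>0x[0-9a-fA-F]+)\)"
)


def decode_helper_exit(line):
    """Return the helper command and decoded exit status, or None if the
    line is not a helper-completion message."""
    m = HELPER_RE.search(line)
    if m is None:
        return None
    raw = int(m.group("raw"), 16)
    return {
        "command": m.group("cmd"),
        "exit_code": int(m.group("code")),
        "raw_status": raw,
        # Assumed Linux wait-status layout: exit code in the high byte.
        "exit_from_raw": (raw >> 8) & 0xFF,
        # Low 7 bits hold the terminating signal, 0 if the helper exited normally.
        "signal": raw & 0x7F,
    }


if __name__ == "__main__":
    sample = (
        "Jan 25 11:25:39 wsguardian1 kernel: block drbd0: helper command: "
        "/sbin/drbdadm fence-peer minor-0 exit code 255 (0xff00)"
    )
    print(decode_helper_exit(sample))
```

For the lines in this log, the raw status 0xff00 decodes to exit code 255 with no signal, which suggests the fence-peer helper ran to completion and returned a value DRBD reports as "broken", rather than being killed.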
Log report

In node 2:

Jan 25 11:25:34 wsguardian2 kernel: igb: eth4 NIC Link is Down
Jan 25 11:25:39 wsguardian2 kernel: block drbd0: PingAck did not arrive in time.
Jan 25 11:25:39 wsguardian2 kernel: block drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )
Jan 25 11:25:39 wsguardian2 kernel: block drbd0: asender terminated
Jan 25 11:25:39 wsguardian2 kernel: block drbd0: Terminating asender thread
Jan 25 11:25:39 wsguardian2 kernel: block drbd0: short read expecting header on sock: r=-512
Jan 25 11:25:39 wsguardian2 kernel: block drbd0: Creating new current UUID
Jan 25 11:25:39 wsguardian2 kernel: block drbd0: Connection closed
Jan 25 11:25:39 wsguardian2 kernel: block drbd0: helper command: /sbin/drbdadm fence-peer minor-0
Jan 25 11:25:39 wsguardian2 rhcs_fence: Attempting to fence peer using RHCS from DRBD...
Jan 25 11:25:40 wsguardian2 kernel: block drbd1: PingAck did not arrive in time.
Jan 25 11:25:40 wsguardian2 kernel: block drbd1: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )
Jan 25 11:25:40 wsguardian2 kernel: block drbd1: asender terminated
Jan 25 11:25:40 wsguardian2 kernel: block drbd1: Terminating asender thread
Jan 25 11:25:40 wsguardian2 kernel: block drbd1: short read expecting header on sock: r=-512
Jan 25 11:25:40 wsguardian2 kernel: block drbd1: Creating new current UUID
Jan 25 11:25:40 wsguardian2 kernel: block drbd1: Connection closed
Jan 25 11:25:40 wsguardian2 kernel: block drbd1: helper command: /sbin/drbdadm fence-peer minor-1
Jan 25 11:25:40 wsguardian2 rhcs_fence: Attempting to fence peer using RHCS from DRBD...
Jan 25 11:25:43 wsguardian2 openais[2854]: [TOTEM] The token was lost in the OPERATIONAL state.
Jan 25 11:25:43 wsguardian2 openais[2854]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).
Jan 25 11:25:43 wsguardian2 openais[2854]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Jan 25 11:25:43 wsguardian2 openais[2854]: [TOTEM] entering GATHER state from 2.
Jan 25 11:25:45 wsguardian2 openais[2854]: [TOTEM] entering GATHER state from 0.
Jan 25 11:25:45 wsguardian2 openais[2854]: [TOTEM] Creating commit token because I am the rep.
Jan 25 11:25:45 wsguardian2 openais[2854]: [TOTEM] Storing new sequence id for ring 128
Jan 25 11:25:45 wsguardian2 openais[2854]: [TOTEM] entering COMMIT state.
Jan 25 11:25:45 wsguardian2 openais[2854]: [TOTEM] entering RECOVERY state.
Jan 25 11:25:45 wsguardian2 openais[2854]: [TOTEM] position [0] member 192.168.253.2:
Jan 25 11:25:45 wsguardian2 openais[2854]: [TOTEM] previous ring seq 292 rep 192.168.253.1
Jan 25 11:25:45 wsguardian2 openais[2854]: [TOTEM] aru bb high delivered bb received flag 1
Jan 25 11:25:45 wsguardian2 openais[2854]: [TOTEM] Did not need to originate any messages in recovery.
Jan 25 11:25:45 wsguardian2 openais[2854]: [TOTEM] Sending initial ORF token
Jan 25 11:25:45 wsguardian2 openais[2854]: [CLM  ] CLM CONFIGURATION CHANGE
Jan 25 11:25:45 wsguardian2 openais[2854]: [CLM  ] New Configuration:
Jan 25 11:25:45 wsguardian2 openais[2854]: [CLM  ]      r(0) ip(192.168.253.2)
Jan 25 11:25:45 wsguardian2 kernel: dlm: closing connection to node 1
Jan 25 11:25:45 wsguardian2 openais[2854]: [CLM  ] Members Left:
Jan 25 11:25:45 wsguardian2 fenced[2875]: wsguardian1 not a cluster member after 0 sec post_fail_delay
Jan 25 11:25:45 wsguardian2 openais[2854]: [CLM  ]      r(0) ip(192.168.253.1)
Jan 25 11:25:45 wsguardian2 fenced[2875]: fencing node "wsguardian1"
Jan 25 11:25:45 wsguardian2 openais[2854]: [CLM  ] Members Joined:
Jan 25 11:25:45 wsguardian2 openais[2854]: [CLM  ] CLM CONFIGURATION CHANGE
Jan 25 11:25:45 wsguardian2 openais[2854]: [CLM  ] New Configuration:
Jan 25 11:25:45 wsguardian2 openais[2854]: [CLM  ]      r(0) ip(192.168.253.2)
Jan 25 11:25:45 wsguardian2 openais[2854]: [CLM  ] Members Left:
Jan 25 11:25:45 wsguardian2 openais[2854]: [CLM  ] Members Joined:
Jan 25 11:25:45 wsguardian2 openais[2854]: [SYNC ] This node is within the primary component and will provide service.
Jan 25 11:25:45 wsguardian2 openais[2854]: [TOTEM] entering OPERATIONAL state.
Jan 25 11:25:45 wsguardian2 openais[2854]: [CLM  ] got nodejoin message 192.168.253.2
Jan 25 11:25:45 wsguardian2 openais[2854]: [CPG  ] got joinlist message from node 2
Jan 25 11:25:48 wsguardian2 kernel: block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 255 (0xff00)
Jan 25 11:25:48 wsguardian2 kernel: block drbd0: fence-peer helper broken, returned 255
Jan 25 11:25:48 wsguardian2 kernel: block drbd0: Considering state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Jan 25 11:25:48 wsguardian2 kernel: block drbd0:  old = { cs:NetworkFailure ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:48 wsguardian2 kernel: block drbd0:  new = { cs:Unconnected ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:48 wsguardian2 kernel: block drbd0: conn( NetworkFailure -> Unconnected )
Jan 25 11:25:48 wsguardian2 kernel: block drbd0: receiver terminated
Jan 25 11:25:48 wsguardian2 kernel: block drbd0: Restarting receiver thread
Jan 25 11:25:48 wsguardian2 kernel: block drbd0: receiver (re)started
Jan 25 11:25:48 wsguardian2 kernel: block drbd0: Considering state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Jan 25 11:25:48 wsguardian2 kernel: block drbd0:  old = { cs:Unconnected ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:48 wsguardian2 kernel: block drbd0:  new = { cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:48 wsguardian2 kernel: block drbd0: conn( Unconnected -> WFConnection )
Jan 25 11:25:49 wsguardian2 kernel: block drbd1: helper command: /sbin/drbdadm fence-peer minor-1 exit code 255 (0xff00)
Jan 25 11:25:49 wsguardian2 kernel: block drbd1: fence-peer helper broken, returned 255
Jan 25 11:25:49 wsguardian2 kernel: block drbd1: Considering state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Jan 25 11:25:49 wsguardian2 kernel: block drbd1:  old = { cs:NetworkFailure ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:49 wsguardian2 kernel: block drbd1:  new = { cs:Unconnected ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:49 wsguardian2 kernel: block drbd1: conn( NetworkFailure -> Unconnected )
Jan 25 11:25:49 wsguardian2 kernel: block drbd1: receiver terminated
Jan 25 11:25:49 wsguardian2 kernel: block drbd1: Restarting receiver thread
Jan 25 11:25:49 wsguardian2 kernel: block drbd1: receiver (re)started
Jan 25 11:25:49 wsguardian2 kernel: block drbd1: Considering state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Jan 25 11:25:49 wsguardian2 kernel: block drbd1:  old = { cs:Unconnected ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:49 wsguardian2 kernel: block drbd1:  new = { cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
Jan 25 11:25:49 wsguardian2 kernel: block drbd1: conn( Unconnected -> WFConnection )

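Both logs report the same DRBD transitions in the form "field( old -> new )". When comparing the two nodes it can help to turn these into structured data. The following is a minimal Python sketch, assuming the message format stays exactly as logged above; parse_transitions is a hypothetical helper name, not a DRBD tool.

```python
import re

# One "name( old -> new )" group, e.g. "conn( Connected -> NetworkFailure )".
TRANSITION_RE = re.compile(r"(\w+)\( (\S+) -> (\S+) \)")


def parse_transitions(line):
    """Map each state field on a kernel log line to its (old, new) pair.

    Returns an empty dict for lines that carry no state transition.
    """
    return {field: (old, new) for field, old, new in TRANSITION_RE.findall(line)}


if __name__ == "__main__":
    sample = (
        "Jan 25 11:25:39 wsguardian2 kernel: block drbd0: peer( Primary -> Unknown ) "
        "conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )"
    )
    for field, (old, new) in parse_transitions(sample).items():
        print(f"{field}: {old} -> {new}")
```

Applied to the logs above, this makes it easy to see that both nodes were Primary when the link dropped (each logs peer( Primary -> Unknown )) and that both suspended I/O (susp( 0 -> 1 )) while the fence-peer helper ran.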
After reboot (fencing), DRBD report on node 1:
cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:09
 0: cs:StandAlone ro:Secondary/Unknown ds:Consistent/DUnknown   s----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:16384
 1: cs:StandAlone ro:Secondary/Unknown ds:Consistent/DUnknown   s----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:3 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:167936

After manual startup, DRBD report on node 2:
cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:09
 0: cs:StandAlone ro:Secondary/Unknown ds:Consistent/DUnknown   s----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:3 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:20480
 1: cs:StandAlone ro:Secondary/Unknown ds:Consistent/DUnknown   s----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:3 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:12288
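
To compare the two /proc/drbd reports side by side, the per-device lines can be parsed into a small table. This is a rough Python sketch that assumes the DRBD 8.3 layout shown above (a "N: cs:... ro:.../... ds:.../..." line followed by a counters line containing oos:); parse_proc_drbd is a hypothetical name, and the oos value is kept in whatever unit /proc/drbd reports (KiB, as far as I know, worth checking against the DRBD documentation).

```python
import re

# Device status line, e.g. " 0: cs:StandAlone ro:Secondary/Unknown ds:Consistent/DUnknown   s----"
DEVICE_RE = re.compile(
    r"^\s*(\d+):\s+cs:(\S+)\s+ro:([^/\s]+)/(\S+)\s+ds:([^/\s]+)/(\S+)"
)
OOS_RE = re.compile(r"\boos:(\d+)")


def parse_proc_drbd(text):
    """Return {minor: {...}} with connection state, roles, disk states and oos."""
    devices = {}
    current = None
    for line in text.splitlines():
        m = DEVICE_RE.match(line)
        if m:
            current = {
                "cs": m.group(2),
                "role_local": m.group(3),
                "role_peer": m.group(4),
                "disk_local": m.group(5),
                "disk_peer": m.group(6),
                "oos": None,
            }
            devices[int(m.group(1))] = current
            continue
        oos = OOS_RE.search(line)
        if oos and current is not None:
            # Out-of-sync amount from the counters line of the current device.
            current["oos"] = int(oos.group(1))
    return devices


if __name__ == "__main__":
    node1 = """version: 8.3.8 (api:88/proto:86-94)
 0: cs:StandAlone ro:Secondary/Unknown ds:Consistent/DUnknown   s----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:16384
 1: cs:StandAlone ro:Secondary/Unknown ds:Consistent/DUnknown   s----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:3 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:167936
"""
    for minor, dev in parse_proc_drbd(node1).items():
        print(minor, dev["cs"], dev["disk_local"], dev["oos"])
```

In the two reports above, every device on both nodes is StandAlone and Secondary with a nonzero oos value, so each node holds blocks the other has not seen and neither is connected to its peer.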