Phylum

DRBD replication issue

Jun 29th, 2011
  1. ##################### SECONDARY NODE #####################
  2. # tail -n 100 /var/log/messages
  3. Jun 30 01:52:58 secondarynode kernel: block drbd0: Starting worker thread (from kworker/u:3 [4103])
  4. Jun 30 01:52:58 secondarynode kernel: block drbd0: disk( Diskless -> Attaching )
  5. Jun 30 01:52:58 secondarynode kernel: block drbd0: No usable activity log found.
  6. Jun 30 01:52:58 secondarynode kernel: block drbd0: Method to ensure write ordering: flush
  7. Jun 30 01:52:58 secondarynode kernel: block drbd0: max_segment_size ( = BIO size ) = 65536
  8. Jun 30 01:52:58 secondarynode kernel: block drbd0: drbd_bm_resize called with capacity == 1023896
  9. Jun 30 01:52:58 secondarynode kernel: block drbd0: resync bitmap: bits=127987 words=2000
  10. Jun 30 01:52:58 secondarynode kernel: block drbd0: size = 500 MB (511948 KB)
  11. Jun 30 01:52:58 secondarynode kernel: block drbd0: recounting of set bits took additional 0 jiffies
  12. Jun 30 01:52:58 secondarynode kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
  13. Jun 30 01:52:58 secondarynode kernel: block drbd0: disk( Attaching -> UpToDate )
  14. Jun 30 01:52:58 secondarynode kernel: block drbd1: Starting worker thread (from kworker/u:3 [4103])
  15. Jun 30 01:52:58 secondarynode kernel: block drbd1: disk( Diskless -> Attaching )
  16. Jun 30 01:52:58 secondarynode kernel: block drbd1: No usable activity log found.
  17. Jun 30 01:52:58 secondarynode kernel: block drbd1: Method to ensure write ordering: flush
  18. Jun 30 01:52:58 secondarynode kernel: block drbd1: max_segment_size ( = BIO size ) = 65536
  19. Jun 30 01:52:58 secondarynode kernel: block drbd1: drbd_bm_resize called with capacity == 611309264
  20. Jun 30 01:52:58 secondarynode kernel: block drbd1: resync bitmap: bits=76413658 words=1193964
  21. Jun 30 01:52:58 secondarynode kernel: block drbd1: size = 291 GB (305654632 KB)
  22. Jun 30 01:52:58 secondarynode kernel: block drbd1: recounting of set bits took additional 0 jiffies
  23. Jun 30 01:52:58 secondarynode kernel: block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
  24. Jun 30 01:52:58 secondarynode kernel: block drbd1: disk( Attaching -> UpToDate )
  25. Jun 30 01:52:58 secondarynode kernel: block drbd0: conn( StandAlone -> Unconnected )
  26. Jun 30 01:52:58 secondarynode kernel: block drbd0: Starting receiver thread (from drbd0_worker [3288])
  27. Jun 30 01:52:58 secondarynode kernel: block drbd0: receiver (re)started
  28. Jun 30 01:52:58 secondarynode kernel: block drbd0: conn( Unconnected -> WFConnection )
  29. Jun 30 01:52:58 secondarynode kernel: block drbd1: conn( StandAlone -> Unconnected )
  30. Jun 30 01:52:58 secondarynode kernel: block drbd1: Starting receiver thread (from drbd1_worker [3297])
  31. Jun 30 01:52:58 secondarynode kernel: block drbd1: receiver (re)started
  32. Jun 30 01:52:58 secondarynode kernel: block drbd1: conn( Unconnected -> WFConnection )
  33. Jun 30 01:52:58 secondarynode kernel: block drbd0: Handshake successful: Agreed network protocol version 95
  34. Jun 30 01:52:58 secondarynode kernel: block drbd0: conn( WFConnection -> WFReportParams )
  35. Jun 30 01:52:58 secondarynode kernel: block drbd0: Starting asender thread (from drbd0_receiver [3313])
  36. Jun 30 01:52:58 secondarynode kernel: block drbd0: data-integrity-alg: <not-used>
  37. Jun 30 01:52:58 secondarynode kernel: block drbd0: max_segment_size ( = BIO size ) = 65536
  38. Jun 30 01:52:58 secondarynode kernel: block drbd0: drbd_sync_handshake:
  39. Jun 30 01:52:58 secondarynode kernel: block drbd0: self C3C0A265E9277308:0000000000000000:173E3BD3C3CA5CD4:BBF4F1769486E305 bits:0 flags:0
  40. Jun 30 01:52:58 secondarynode kernel: block drbd0: peer 378FBD5A1F9BAB2D:C3C0A265E9277309:173E3BD3C3CA5CD4:BBF4F1769486E305 bits:0 flags:0
  41. Jun 30 01:52:58 secondarynode kernel: block drbd0: uuid_compare()=-1 by rule 50
  42. Jun 30 01:52:58 secondarynode kernel: block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
  43. Jun 30 01:52:58 secondarynode kernel: block drbd1: Handshake successful: Agreed network protocol version 95
  44. Jun 30 01:52:58 secondarynode kernel: block drbd1: conn( WFConnection -> WFReportParams )
  45. Jun 30 01:52:58 secondarynode kernel: block drbd1: Starting asender thread (from drbd1_receiver [3324])
  46. Jun 30 01:52:58 secondarynode kernel: block drbd1: data-integrity-alg: <not-used>
  47. Jun 30 01:52:58 secondarynode kernel: block drbd1: max_segment_size ( = BIO size ) = 65536
  48. Jun 30 01:52:58 secondarynode kernel: block drbd1: drbd_sync_handshake:
  49. Jun 30 01:52:58 secondarynode kernel: block drbd1: self FC433F9C35D4E19C:0000000000000000:EFD171D1BE6D85C4:305E2BE3E64F5FA9 bits:0 flags:0
  50. Jun 30 01:52:58 secondarynode kernel: block drbd1: peer BDB69235F03FB11B:FC433F9C35D4E19D:EFD171D1BE6D85C5:305E2BE3E64F5FA9 bits:28134237 flags:0
  51. Jun 30 01:52:58 secondarynode kernel: block drbd1: uuid_compare()=-1 by rule 50
  52. Jun 30 01:52:58 secondarynode kernel: block drbd1: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
  53. ## I noticed the system clock was wrong, so I corrected it at this point ##
  54. Jun 30 22:08:22 secondarynode kernel: block drbd1: meta connection shut down by peer.
  55. Jun 30 22:08:22 secondarynode kernel: block drbd1: peer( Primary -> Unknown ) conn( WFBitMapT -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
  56. Jun 30 22:08:22 secondarynode kernel: block drbd1: short read expecting header on sock: r=-512
  57. Jun 30 22:08:22 secondarynode kernel: block drbd0: meta connection shut down by peer.
  58. Jun 30 22:08:22 secondarynode kernel: block drbd0: peer( Primary -> Unknown ) conn( WFBitMapT -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
  59. Jun 30 22:08:22 secondarynode kernel: block drbd0: short read expecting header on sock: r=-512
  60. Jun 30 22:08:22 secondarynode kernel: block drbd1: asender terminated
  61. Jun 30 22:08:22 secondarynode kernel: block drbd1: Terminating drbd1_asender
  62. Jun 30 22:08:22 secondarynode kernel: block drbd1: Connection closed
  63. Jun 30 22:08:22 secondarynode kernel: block drbd1: conn( NetworkFailure -> Unconnected )
  64. Jun 30 22:08:22 secondarynode kernel: block drbd1: receiver terminated
  65. Jun 30 22:08:22 secondarynode kernel: block drbd1: Restarting drbd1_receiver
  66. Jun 30 22:08:22 secondarynode kernel: block drbd1: receiver (re)started
  67. Jun 30 22:08:22 secondarynode kernel: block drbd1: conn( Unconnected -> WFConnection )
  68. Jun 30 22:08:22 secondarynode kernel: block drbd0: asender terminated
  69. Jun 30 22:08:22 secondarynode kernel: block drbd0: Terminating drbd0_asender
  70. Jun 30 22:08:22 secondarynode kernel: block drbd0: Connection closed
  71. Jun 30 22:08:22 secondarynode kernel: block drbd0: conn( NetworkFailure -> Unconnected )
  72. Jun 30 22:08:22 secondarynode kernel: block drbd0: receiver terminated
  73. Jun 30 22:08:22 secondarynode kernel: block drbd0: Restarting drbd0_receiver
  74. Jun 30 22:08:22 secondarynode kernel: block drbd0: receiver (re)started
  75. Jun 30 22:08:22 secondarynode kernel: block drbd0: conn( Unconnected -> WFConnection )
  76. Jun 30 22:08:22 secondarynode kernel: block drbd0: Handshake successful: Agreed network protocol version 95
  77. Jun 30 22:08:22 secondarynode kernel: block drbd0: conn( WFConnection -> WFReportParams )
  78. Jun 30 22:08:22 secondarynode kernel: block drbd1: Handshake successful: Agreed network protocol version 95
  79. Jun 30 22:08:22 secondarynode kernel: block drbd1: conn( WFConnection -> WFReportParams )
  80. Jun 30 22:08:22 secondarynode kernel: block drbd1: Starting asender thread (from drbd1_receiver [3324])
  81. Jun 30 22:08:22 secondarynode kernel: block drbd0: Starting asender thread (from drbd0_receiver [3313])
  82. Jun 30 22:08:22 secondarynode kernel: block drbd0: data-integrity-alg: <not-used>
  83. Jun 30 22:08:22 secondarynode kernel: block drbd0: max_segment_size ( = BIO size ) = 65536
  84. Jun 30 22:08:22 secondarynode kernel: block drbd1: data-integrity-alg: <not-used>
  85. Jun 30 22:08:22 secondarynode kernel: block drbd1: max_segment_size ( = BIO size ) = 65536
  86. Jun 30 22:08:22 secondarynode kernel: block drbd0: drbd_sync_handshake:
  87. Jun 30 22:08:22 secondarynode kernel: block drbd0: self C3C0A265E9277308:0000000000000000:173E3BD3C3CA5CD4:BBF4F1769486E305 bits:0 flags:0
  88. Jun 30 22:08:22 secondarynode kernel: block drbd0: peer 378FBD5A1F9BAB2D:C3C0A265E9277309:173E3BD3C3CA5CD4:BBF4F1769486E305 bits:0 flags:0
  89. Jun 30 22:08:22 secondarynode kernel: block drbd0: uuid_compare()=-1 by rule 50
  90. Jun 30 22:08:22 secondarynode kernel: block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
  91.  
  92. # /etc/init.d/drbd status
  93. DRBD module version: 8.3.9
  94. userland version: 8.3.8
  95. you should upgrade your drbd tools!
  96. * drbd driver loaded OK; device status: ... [ ok ]
  97. version: 8.3.9 (api:88/proto:86-95)
  98. built-in
  99. 0: cs:WFBitMapT ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
  100. ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  101. 1: cs:WFReportParams ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
  102. ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  103.  
  104. # drbd-overview
  105. 0:meta WFBitMapT Secondary/Primary UpToDate/UpToDate C r-----
  106. 1:data WFReportParams Secondary/Unknown UpToDate/DUnknown C r-----
  107.  
  108. # ps aux | grep drbd | grep -v grep
  109. root 3288 0.0 0.0 0 0 ? S 21:52 0:00 [drbd0_worker]
  110. root 3297 0.0 0.0 0 0 ? S 21:52 0:00 [drbd1_worker]
  111. root 3313 0.0 0.0 0 0 ? S 21:52 0:00 [drbd0_receiver]
  112. root 3324 0.0 0.0 0 0 ? S 21:52 0:00 [drbd1_receiver]
  113. root 7098 0.0 0.0 0 0 ? S 22:08 0:00 [drbd1_asender]
  114. root 7099 0.0 0.0 0 0 ? S 22:08 0:00 [drbd0_asender]
  115. root 14498 0.0 0.0 8356 992 tty1 S+ Jun28 2:12 watch cat /proc/drbd
  116.  
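The /proc/drbd status rows above pack the connection state, roles, and disk states into one line per device. Below is a minimal sketch of pulling those fields apart, assuming the 8.3-style row layout seen in this paste. The function name `parse_status` and the rule "anything other than Connected is not settled" are illustrative assumptions, not part of the DRBD tools.

```python
import re

# Matches rows such as:
#   0: cs:WFBitMapT ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
STATUS_ROW = re.compile(
    r"(?P<minor>\d+): cs:(?P<cs>\S+) "
    r"ro:(?P<local_role>[^/\s]+)/(?P<peer_role>\S+) "
    r"ds:(?P<local_disk>[^/\s]+)/(?P<peer_disk>\S+)"
)


def parse_status(text):
    """Return one dict per device row found in /proc/drbd-style text."""
    rows = []
    for match in STATUS_ROW.finditer(text):
        row = match.groupdict()
        row["minor"] = int(row["minor"])
        # Assumption for this sketch: only cs:Connected counts as settled.
        row["settled"] = row["cs"] == "Connected"
        rows.append(row)
    return rows


# Rows copied from the secondary node's status output in this paste.
SAMPLE = """\
0: cs:WFBitMapT ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
1: cs:WFReportParams ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
"""

for row in parse_status(SAMPLE):
    print(row["minor"], row["cs"], row["peer_role"], row["peer_disk"], row["settled"])
```

Run against the secondary's output, both devices come back as not settled: drbd0 is waiting in WFBitMapT and drbd1 is still in WFReportParams with an unknown peer.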
  117. ##################### PRIMARY NODE #####################
  118. # tail -n 300 /var/log/messages
  119. Jun 30 17:52:33 primarynode kernel: block drbd0: Handshake successful: Agreed network protocol version 95
  120. Jun 30 17:52:33 primarynode kernel: block drbd0: conn( WFConnection -> WFReportParams )
  121. Jun 30 17:52:33 primarynode kernel: block drbd0: Starting asender thread (from drbd0_receiver [14311])
  122. Jun 30 17:52:33 primarynode kernel: block drbd0: data-integrity-alg: <not-used>
  123. Jun 30 17:52:33 primarynode kernel: block drbd0: max_segment_size ( = BIO size ) = 65536
  124. Jun 30 17:52:33 primarynode kernel: block drbd0: drbd_sync_handshake:
  125. Jun 30 17:52:33 primarynode kernel: block drbd0: self 378FBD5A1F9BAB2D:C3C0A265E9277309:173E3BD3C3CA5CD4:BBF4F1769486E305 bits:0 flags:0
  126. Jun 30 17:52:33 primarynode kernel: block drbd0: peer C3C0A265E9277308:0000000000000000:173E3BD3C3CA5CD4:BBF4F1769486E305 bits:0 flags:0
  127. Jun 30 17:52:33 primarynode kernel: block drbd0: uuid_compare()=1 by rule 70
  128. Jun 30 17:52:33 primarynode kernel: block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
  129. Jun 30 17:52:33 primarynode kernel: block drbd1: Handshake successful: Agreed network protocol version 95
  130. Jun 30 17:52:33 primarynode kernel: block drbd1: conn( WFConnection -> WFReportParams )
  131. Jun 30 17:52:33 primarynode kernel: block drbd1: Starting asender thread (from drbd1_receiver [14322])
  132. Jun 30 17:52:33 primarynode kernel: block drbd1: data-integrity-alg: <not-used>
  133. Jun 30 17:52:33 primarynode kernel: block drbd1: max_segment_size ( = BIO size ) = 65536
  134. Jun 30 17:52:33 primarynode kernel: block drbd1: drbd_sync_handshake:
  135. Jun 30 17:52:33 primarynode kernel: block drbd1: self BDB69235F03FB11B:FC433F9C35D4E19D:EFD171D1BE6D85C5:305E2BE3E64F5FA9 bits:28134237 flags:0
  136. Jun 30 17:52:33 primarynode kernel: block drbd1: peer FC433F9C35D4E19C:0000000000000000:EFD171D1BE6D85C4:305E2BE3E64F5FA9 bits:0 flags:0
  137. Jun 30 17:52:33 primarynode kernel: block drbd1: uuid_compare()=1 by rule 70
  138. Jun 30 17:52:33 primarynode kernel: block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
  139. Jun 30 17:52:45 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967295
  140. Jun 30 17:52:51 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967294
  141. Jun 30 17:52:57 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967293
  142. Jun 30 17:53:03 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967292
  143. Jun 30 17:53:09 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967291
  144. Jun 30 17:53:15 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967290
  145. Jun 30 17:53:21 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967289
  146. Jun 30 17:53:27 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967288
  147. Jun 30 17:53:33 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967287
  148. Jun 30 17:53:39 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967286
  149. Jun 30 17:53:45 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967285
  150. Jun 30 17:53:51 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967284
  151. Jun 30 17:53:57 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967283
  152. Jun 30 17:54:03 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967282
  153. Jun 30 17:54:09 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967281
  154. Jun 30 17:54:15 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967280
  155. Jun 30 17:54:21 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967279
  156. Jun 30 17:54:27 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967278
  157. Jun 30 17:54:33 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967277
  158. Jun 30 17:54:39 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967276
  159. Jun 30 17:54:45 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967275
  160. Jun 30 17:54:51 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967274
  161. Jun 30 17:54:57 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967273
  162. Jun 30 17:55:03 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967272
  163. Jun 30 17:55:09 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967271
  164. Jun 30 17:55:15 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967270
  165. Jun 30 17:55:21 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967269
  166. Jun 30 17:55:27 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967268
  167. Jun 30 17:55:33 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967267
  168. Jun 30 17:55:39 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967266
  169. Jun 30 17:55:45 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967265
  170. Jun 30 17:55:51 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967264
  171. Jun 30 17:55:57 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967263
  172. Jun 30 17:56:03 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967262
  173. Jun 30 17:56:09 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967261
  174. Jun 30 17:56:15 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967260
  175. Jun 30 17:56:21 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967259
  176. Jun 30 17:56:27 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967258
  177. Jun 30 17:56:33 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967257
  178. Jun 30 17:56:39 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967256
  179. Jun 30 17:56:45 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967255
  180. Jun 30 17:56:51 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967254
  181. Jun 30 17:56:57 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967253
  182. Jun 30 17:57:03 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967252
  183. Jun 30 17:57:09 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967251
  184. Jun 30 17:57:15 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967250
  185. Jun 30 17:57:21 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967249
  186. Jun 30 17:57:27 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967248
  187. Jun 30 17:57:33 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967247
  188. Jun 30 17:57:39 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967246
  189. Jun 30 21:58:02 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967245
  190. Jun 30 21:58:08 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967244
  191. Jun 30 21:58:14 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967243
  192. Jun 30 21:58:20 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967242
  193. Jun 30 21:58:26 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967241
  194. Jun 30 21:58:32 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967240
  195. Jun 30 21:58:38 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967239
  196. Jun 30 21:58:44 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967238
  197. Jun 30 21:58:50 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967237
  198. Jun 30 21:58:56 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967236
  199. Jun 30 21:59:02 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967235
  200. Jun 30 21:59:08 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967234
  201. Jun 30 21:59:14 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967233
  202. Jun 30 21:59:20 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967232
  203. Jun 30 21:59:26 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967231
  204. Jun 30 21:59:32 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967230
  205. Jun 30 21:59:38 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967229
  206. Jun 30 21:59:44 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967228
  207. Jun 30 21:59:50 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967227
  208. Jun 30 21:59:56 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967226
  209. Jun 30 22:00:02 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967225
  210. Jun 30 22:00:08 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967224
  211. Jun 30 22:00:14 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967223
  212. Jun 30 22:00:20 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967222
  213. Jun 30 22:00:26 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967221
  214. Jun 30 22:00:32 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967220
  215. Jun 30 22:00:38 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967219
  216. Jun 30 22:00:44 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967218
  217. Jun 30 22:00:50 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967217
  218. Jun 30 22:00:56 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967216
  219. Jun 30 22:01:02 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967215
  220. Jun 30 22:01:08 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967214
  221. Jun 30 22:01:14 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967213
  222. Jun 30 22:01:20 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967212
  223. Jun 30 22:01:26 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967211
  224. Jun 30 22:01:32 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967210
  225. Jun 30 22:01:38 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967209
  226. Jun 30 22:01:44 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967208
  227. Jun 30 22:01:50 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967207
  228. Jun 30 22:01:56 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967206
  229. Jun 30 22:02:02 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967205
  230. Jun 30 22:02:08 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967204
  231. Jun 30 22:02:14 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967203
  232. Jun 30 22:02:20 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967202
  233. Jun 30 22:02:26 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967201
  234. Jun 30 22:02:32 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967200
  235. Jun 30 22:02:38 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967199
  236. Jun 30 22:02:44 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967198
  237. Jun 30 22:02:50 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967197
  238. Jun 30 22:02:56 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967196
  239. Jun 30 22:03:02 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967195
  240. Jun 30 22:03:08 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967194
  241. Jun 30 22:03:14 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967193
  242. Jun 30 22:03:20 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967192
  243. Jun 30 22:03:26 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967191
  244. Jun 30 22:03:32 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967190
  245. Jun 30 22:03:38 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967189
  246. Jun 30 22:03:44 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967188
  247. Jun 30 22:03:50 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967187
  248. Jun 30 22:03:56 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967186
  249. Jun 30 22:04:02 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967185
  250. Jun 30 22:04:08 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967184
  251. Jun 30 22:04:14 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967183
  252. Jun 30 22:04:20 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967182
  253. Jun 30 22:04:26 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967181
  254. Jun 30 22:04:32 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967180
  255. Jun 30 22:04:38 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967179
  256. Jun 30 22:04:44 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967178
  257. Jun 30 22:04:50 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967177
  258. Jun 30 22:04:56 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967176
  259. Jun 30 22:05:02 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967175
  260. Jun 30 22:05:08 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967174
  261. Jun 30 22:05:14 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967173
  262. Jun 30 22:05:20 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967172
  263. Jun 30 22:05:26 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967171
  264. Jun 30 22:05:32 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967170
  265. Jun 30 22:05:38 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967169
  266. Jun 30 22:05:44 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967168
  267. Jun 30 22:05:50 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967167
  268. Jun 30 22:05:56 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967166
  269. Jun 30 22:06:02 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967165
  270. Jun 30 22:06:08 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967164
  271. Jun 30 22:06:14 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967163
  272. Jun 30 22:06:20 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967162
  273. Jun 30 22:06:26 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967161
  274. Jun 30 22:06:32 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967160
  275. Jun 30 22:06:38 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967159
  276. Jun 30 22:06:44 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967158
  277. Jun 30 22:06:50 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967157
  278. Jun 30 22:06:56 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967156
  279. Jun 30 22:07:02 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967155
  280. Jun 30 22:07:08 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967154
  281. Jun 30 22:07:14 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967153
  282. Jun 30 22:07:20 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967152
  283. Jun 30 22:07:26 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967151
  284. Jun 30 22:07:32 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967150
  285. Jun 30 22:07:38 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967149
  286. Jun 30 22:07:44 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967148
  287. Jun 30 22:07:50 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967147
  288. Jun 30 22:07:56 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967146
  289. Jun 30 22:08:02 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967145
  290. Jun 30 22:08:08 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967144
  291. Jun 30 22:08:14 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967143
  292. Jun 30 22:08:20 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967142
  293. Jun 30 22:08:26 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967141
  294. Jun 30 22:08:27 primarynode kernel: block drbd1: sock_sendmsg returned -110
  295. Jun 30 22:08:27 primarynode kernel: block drbd0: sock_recvmsg returned -110
  296. Jun 30 22:08:27 primarynode kernel: block drbd0: peer( Secondary -> Unknown ) conn( WFBitMapS -> BrokenPipe ) pdsk( UpToDate -> DUnknown )
  297. Jun 30 22:08:27 primarynode kernel: block drbd1: peer( Secondary -> Unknown ) conn( WFBitMapS -> BrokenPipe ) pdsk( UpToDate -> DUnknown )
  298. Jun 30 22:08:27 primarynode kernel: block drbd0: short read expecting header on sock: r=-110
  299. Jun 30 22:08:27 primarynode kernel: block drbd1: short sent ReportBitMap size=4096 sent=2256
  300. Jun 30 22:08:27 primarynode kernel: block drbd1: sock was shut down by peer
  301. Jun 30 22:08:27 primarynode kernel: block drbd1: short read expecting header on sock: r=0
  302. Jun 30 22:08:27 primarynode kernel: block drbd1: asender terminated
  303. Jun 30 22:08:27 primarynode kernel: block drbd1: Terminating drbd1_asender
  304. Jun 30 22:08:27 primarynode kernel: block drbd1: Connection closed
  305. Jun 30 22:08:27 primarynode kernel: block drbd1: conn( BrokenPipe -> Unconnected )
  306. Jun 30 22:08:27 primarynode kernel: block drbd1: receiver terminated
  307. Jun 30 22:08:27 primarynode kernel: block drbd1: Restarting drbd1_receiver
  308. Jun 30 22:08:27 primarynode kernel: block drbd1: receiver (re)started
  309. Jun 30 22:08:27 primarynode kernel: block drbd1: conn( Unconnected -> WFConnection )
  310. Jun 30 22:08:27 primarynode kernel: block drbd0: asender terminated
  311. Jun 30 22:08:27 primarynode kernel: block drbd0: Terminating drbd0_asender
  312. Jun 30 22:08:27 primarynode kernel: block drbd0: Connection closed
  313. Jun 30 22:08:27 primarynode kernel: block drbd0: conn( BrokenPipe -> Unconnected )
  314. Jun 30 22:08:27 primarynode kernel: block drbd0: receiver terminated
  315. Jun 30 22:08:27 primarynode kernel: block drbd0: Restarting drbd0_receiver
  316. Jun 30 22:08:27 primarynode kernel: block drbd0: receiver (re)started
  317. Jun 30 22:08:27 primarynode kernel: block drbd0: conn( Unconnected -> WFConnection )
  318. Jun 30 22:08:27 primarynode kernel: block drbd0: Handshake successful: Agreed network protocol version 95
  319. Jun 30 22:08:27 primarynode kernel: block drbd0: conn( WFConnection -> WFReportParams )
  320. Jun 30 22:08:27 primarynode kernel: block drbd0: Starting asender thread (from drbd0_receiver [14311])
  321. Jun 30 22:08:27 primarynode kernel: block drbd1: Handshake successful: Agreed network protocol version 95
  322. Jun 30 22:08:27 primarynode kernel: block drbd1: conn( WFConnection -> WFReportParams )
  323. Jun 30 22:08:27 primarynode kernel: block drbd1: Starting asender thread (from drbd1_receiver [14322])
  324. Jun 30 22:08:27 primarynode kernel: block drbd1: data-integrity-alg: <not-used>
  325. Jun 30 22:08:27 primarynode kernel: block drbd0: data-integrity-alg: <not-used>
  326. Jun 30 22:08:27 primarynode kernel: block drbd1: max_segment_size ( = BIO size ) = 65536
  327. Jun 30 22:08:27 primarynode kernel: block drbd1: drbd_sync_handshake:
  328. Jun 30 22:08:27 primarynode kernel: block drbd1: self BDB69235F03FB11B:FC433F9C35D4E19D:EFD171D1BE6D85C5:305E2BE3E64F5FA9 bits:28134237 flags:0
  329. Jun 30 22:08:27 primarynode kernel: block drbd1: peer FC433F9C35D4E19C:0000000000000000:EFD171D1BE6D85C4:305E2BE3E64F5FA9 bits:0 flags:0
  330. Jun 30 22:08:27 primarynode kernel: block drbd1: uuid_compare()=1 by rule 70
  331. Jun 30 22:08:27 primarynode kernel: block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
  332. Jun 30 22:08:27 primarynode kernel: block drbd0: max_segment_size ( = BIO size ) = 65536
  333. Jun 30 22:08:27 primarynode kernel: block drbd0: drbd_sync_handshake:
  334. Jun 30 22:08:27 primarynode kernel: block drbd0: self 378FBD5A1F9BAB2D:C3C0A265E9277309:173E3BD3C3CA5CD4:BBF4F1769486E305 bits:0 flags:0
  335. Jun 30 22:08:27 primarynode kernel: block drbd0: peer C3C0A265E9277308:0000000000000000:173E3BD3C3CA5CD4:BBF4F1769486E305 bits:0 flags:0
  336. Jun 30 22:08:27 primarynode kernel: block drbd0: uuid_compare()=1 by rule 70
  337. Jun 30 22:08:27 primarynode kernel: block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
  338. Jun 30 22:08:39 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967295
  339. Jun 30 22:08:45 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967294
  340. Jun 30 22:08:51 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967293
  341. Jun 30 22:08:57 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967292
  342. Jun 30 22:09:03 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967291
  343. Jun 30 22:09:09 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967290
  344. Jun 30 22:09:15 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967289
  345. Jun 30 22:09:21 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967288
  346. Jun 30 22:09:27 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967287
  347. Jun 30 22:09:33 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967286
  348. Jun 30 22:09:39 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967285
  349. Jun 30 22:09:45 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967284
  350. Jun 30 22:09:51 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967283
  351. Jun 30 22:09:57 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967282
  352. Jun 30 22:10:03 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967281
  353. Jun 30 22:10:09 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967280
  354. Jun 30 22:10:15 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967279
  355. Jun 30 22:10:21 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967278
  356. Jun 30 22:10:27 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967277
  357. Jun 30 22:10:33 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967276
  358. Jun 30 22:10:39 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967275
  359. Jun 30 22:10:45 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967274
  360. Jun 30 22:10:51 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967273
  361. Jun 30 22:10:57 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967272
  362. Jun 30 22:11:03 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967271
  363. Jun 30 22:11:09 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967270
  364. Jun 30 22:11:15 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967269
  365. Jun 30 22:11:21 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967268
  366. Jun 30 22:11:27 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967267
  367. Jun 30 22:11:33 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967266
  368. Jun 30 22:11:39 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967265
  369. Jun 30 22:11:45 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967264
  370. Jun 30 22:11:51 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967263
  371. Jun 30 22:11:57 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967262
  372. Jun 30 22:12:03 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967261
  373. Jun 30 22:12:09 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967260
  374. Jun 30 22:12:15 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967259
  375. Jun 30 22:12:21 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967258
  376. Jun 30 22:12:27 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967257
  377. Jun 30 22:12:33 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967256
  378. Jun 30 22:12:39 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967255
  379. Jun 30 22:12:45 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967254
  380. Jun 30 22:12:51 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967253
  381. Jun 30 22:12:57 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967252
  382. Jun 30 22:13:03 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967251
  383. Jun 30 22:13:09 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967250
  384. Jun 30 22:13:15 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967249
  385. Jun 30 22:13:21 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967248
  386. Jun 30 22:13:27 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967247
  387. Jun 30 22:13:33 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967246
  388. Jun 30 22:13:39 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967245
  389. Jun 30 22:13:45 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967244
  390. Jun 30 22:13:51 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967243
  391. Jun 30 22:13:57 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967242
  392. Jun 30 22:14:03 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967241
  393. Jun 30 22:14:09 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967240
  394. Jun 30 22:14:15 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967239
  395. Jun 30 22:14:21 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967238
  396. Jun 30 22:14:27 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967237
  397. Jun 30 22:14:33 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967236
  398. Jun 30 22:14:39 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967235
  399. Jun 30 22:14:45 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967234
  400. Jun 30 22:14:51 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967233
  401. Jun 30 22:14:57 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967232
  402. Jun 30 22:15:03 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967231
  403. Jun 30 22:15:09 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967230
  404. Jun 30 22:15:15 primarynode kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967229
  405.  
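A note on the `ko = 4294967295` values in the log above: they look like an unsigned 32-bit counter that was decremented starting from 0. That would be consistent with `ko-count` being left at its default of 0 (disabled), although that reading is an assumption and has not been confirmed against this kernel's source. The minimal sketch below reinterprets a few of the logged values as signed 32-bit integers; the helper name `as_signed32` is made up for illustration.

```python
def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    # Values with the top bit set represent negative numbers in two's complement.
    return value - 0x100000000 if value & 0x80000000 else value


# First and last "ko" values of the second run, plus the last value of the first run.
for ko in (4294967295, 4294967229, 4294967141):
    print(ko, "->", as_signed32(ko))
```

Read this way, the second run counts from -1 down to -67 in one step per message. The messages are about 6 seconds apart, which matches the 6-second `timeout` mentioned in the commented-out `net` settings of the config.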
  406. # /etc/init.d/drbd status
  407. DRBD module version: 8.3.9
  408. userland version: 8.3.8
  409. you should upgrade your drbd tools!
  410. * drbd driver loaded OK; device status: ... [ ok ]
  411. version: 8.3.9 (api:88/proto:86-95)
  412. built-in
  413. 0: cs:WFBitMapS ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
  414. ns:0 nr:0 dw:0 dr:700 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  415. 1: cs:WFBitMapS ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
  416. ns:279805224 nr:0 dw:508694668 dr:1621564 al:61640 bm:28031 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:112536948
  417.  
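As a sanity check on the numbers above, the `bits:28134237` reported in `drbd_sync_handshake` for drbd1 and the `oos:112536948` in the status output agree if each bitmap bit covers 4 KiB. The drbd0 attach log supports that ratio (127987 bits for 511948 KB). The sketch below redoes the arithmetic and estimates how long a resync would take at the configured `rate 13M`, assuming that means 13 MiB/s and that the resync actually starts and runs at the full rate. The function names are illustrative only.

```python
KIB_PER_BIT = 4  # assumption, consistent with bits=127987 for 511948 KB on drbd0


def bits_to_kib(bits: int) -> int:
    """Convert a count of out-of-sync bitmap bits to KiB."""
    return bits * KIB_PER_BIT


def resync_seconds(oos_kib: int, rate_kib_per_s: int) -> float:
    """Estimate resync duration at a constant rate."""
    return oos_kib / rate_kib_per_s


oos_kib = bits_to_kib(28134237)   # "bits:" from drbd_sync_handshake on drbd1
rate = 13 * 1024                  # "rate 13M", taken here as 13 MiB/s in KiB/s
hours = resync_seconds(oos_kib, rate) / 3600

print(f"out of sync: {oos_kib} KiB")
print(f"estimated resync time: {hours:.2f} h")
```

So roughly 107 GiB is waiting to be resynchronised, which at the configured rate would be a bit over two hours of transfer once the bitmap exchange gets past `WFBitMapS`.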
  418. # drbd-overview
  419. ## It never completed - just hung so I CTRL-C'd it ##
  420.  
  421. # ps aux | grep drbd | grep -v grep
  422. root 4975 0.0 0.0 0 0 ? S 22:08 0:00 [drbd0_asender]
  423. root 4976 0.0 0.0 0 0 ? S 22:08 0:00 [drbd1_asender]
  424. root 14266 0.0 0.0 0 0 ? S Jun29 0:00 [drbd0_worker]
  425. root 14277 0.2 0.0 0 0 ? S Jun29 4:10 [drbd1_worker]
  426. root 14311 0.0 0.0 0 0 ? S Jun29 0:00 [drbd0_receiver]
  427. root 14322 0.9 0.0 0 0 ? S Jun29 13:54 [drbd1_receiver]
  428.  
  429. # /etc/init.d/drbd stop
  430. * Caching service dependencies ...
  431. DRBD module version: 8.3.9
  432. userland version: 8.3.8
  433. you should upgrade your drbd tools! [ ok ]
  434. DRBD module version: 8.3.9
  435. userland version: 8.3.8
  436. you should upgrade your drbd tools!
  437. * Stopping all DRBD resources ...
  438. DRBD module version: 8.3.9
  439. userland version: 8.3.8
  440. you should upgrade your drbd tools!
  441. 0: State change failed: (-12) Device is held open by someone
  442. Command '/sbin/drbdsetup 0 down' terminated with exit code 11
  443. 1: State change failed: (-12) Device is held open by someone
  444. Command '/sbin/drbdsetup 1 down' terminated with exit code 11
  445.  
  446. ## The service had stopped, so I stopped it on the secondary as well, then started the primary before the secondary ##
  447. # /var/log/messages on primary after restart
  448. Jun 30 22:24:34 itfof01 kernel: block drbd1: Handshake successful: Agreed network protocol version 95
  449. Jun 30 22:24:34 itfof01 kernel: block drbd1: conn( WFConnection -> WFReportParams )
  450. Jun 30 22:24:34 itfof01 kernel: block drbd1: Starting asender thread (from drbd1_receiver [14322])
  451. Jun 30 22:24:34 itfof01 kernel: block drbd1: data-integrity-alg: <not-used>
  452. Jun 30 22:24:34 itfof01 kernel: block drbd1: max_segment_size ( = BIO size ) = 65536
  453. Jun 30 22:24:34 itfof01 kernel: block drbd1: drbd_sync_handshake:
  454. Jun 30 22:24:34 itfof01 kernel: block drbd1: self BDB69235F03FB11B:FC433F9C35D4E19D:EFD171D1BE6D85C5:305E2BE3E64F5FA9 bits:28134237 flags:0
  455. Jun 30 22:24:34 itfof01 kernel: block drbd1: peer FC433F9C35D4E19C:0000000000000000:EFD171D1BE6D85C4:305E2BE3E64F5FA9 bits:0 flags:0
  456. Jun 30 22:24:34 itfof01 kernel: block drbd1: uuid_compare()=1 by rule 70
  457. Jun 30 22:24:34 itfof01 kernel: block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
  458. Jun 30 22:24:34 itfof01 kernel: block drbd0: Handshake successful: Agreed network protocol version 95
  459. Jun 30 22:24:34 itfof01 kernel: block drbd0: conn( WFConnection -> WFReportParams )
  460. Jun 30 22:24:34 itfof01 kernel: block drbd0: Starting asender thread (from drbd0_receiver [14311])
  461. Jun 30 22:24:34 itfof01 kernel: block drbd0: data-integrity-alg: <not-used>
  462. Jun 30 22:24:34 itfof01 kernel: block drbd0: max_segment_size ( = BIO size ) = 65536
  463. Jun 30 22:24:34 itfof01 kernel: block drbd0: drbd_sync_handshake:
  464. Jun 30 22:24:34 itfof01 kernel: block drbd0: self 378FBD5A1F9BAB2D:C3C0A265E9277309:173E3BD3C3CA5CD4:BBF4F1769486E305 bits:0 flags:0
  465. Jun 30 22:24:34 itfof01 kernel: block drbd0: peer C3C0A265E9277308:0000000000000000:173E3BD3C3CA5CD4:BBF4F1769486E305 bits:0 flags:0
  466. Jun 30 22:24:34 itfof01 kernel: block drbd0: uuid_compare()=1 by rule 70
  467. Jun 30 22:24:34 itfof01 kernel: block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
  468. Jun 30 22:24:46 itfof01 kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967295
  469. Jun 30 22:24:52 itfof01 kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967294
  470. Jun 30 22:24:58 itfof01 kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967293
  471. Jun 30 22:25:04 itfof01 kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967292
  472. Jun 30 22:25:10 itfof01 kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967291
  473. Jun 30 22:25:16 itfof01 kernel: block drbd1: [drbd1_worker/14277] sock_sendmsg time expired, ko = 4294967290
  474. # ETC
  475.  
  476. # dmesg output (from primary) after stopping & starting the services on both nodes
  477. block drbd0: peer( Secondary -> Unknown ) conn( WFBitMapS -> TearDown ) pdsk( UpToDate -> DUnknown )
  478. block drbd0: meta connection shut down by peer.
  479. block drbd0: asender terminated
  480. block drbd0: Terminating drbd0_asender
  481. block drbd0: Connection closed
  482. block drbd0: conn( TearDown -> Unconnected )
  483. block drbd0: receiver terminated
  484. block drbd0: Restarting drbd0_receiver
  485. block drbd0: receiver (re)started
  486. block drbd0: conn( Unconnected -> WFConnection )
  487. block drbd1: peer( Secondary -> Unknown ) conn( WFBitMapS -> TearDown ) pdsk( UpToDate -> DUnknown )
  488. block drbd1: meta connection shut down by peer.
  489. block drbd1: asender terminated
  490. block drbd1: Terminating drbd1_asender
  491. block drbd1: short sent ReportBitMap size=4096 sent=2380
  492. block drbd1: Connection closed
  493. block drbd1: conn( TearDown -> Unconnected )
  494. block drbd1: receiver terminated
  495. block drbd1: Restarting drbd1_receiver
  496. block drbd1: receiver (re)started
  497. block drbd1: conn( Unconnected -> WFConnection )
  498. block drbd0: role( Primary -> Secondary )
  499. block drbd0: conn( WFConnection -> Disconnecting )
  500. block drbd0: Discarding network configuration.
  501. block drbd0: Connection closed
  502. block drbd0: conn( Disconnecting -> StandAlone )
  503. block drbd0: receiver terminated
  504. block drbd0: Terminating drbd0_receiver
  505. block drbd0: disk( UpToDate -> Diskless )
  506. block drbd0: Sending state for being diskless failed
  507. block drbd0: drbd_bm_resize called with capacity == 0
  508. block drbd0: worker terminated
  509. block drbd0: Terminating drbd0_worker
  510. block drbd1: State change failed: Device is held open by someone
  511. block drbd1: state = { cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown r--- }
  512. block drbd1: wanted = { cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown r--- }
  513. block drbd0: Starting worker thread (from kworker/u:0 [5])
  514. block drbd0: disk( Diskless -> Attaching )
  515. block drbd0: No usable activity log found.
  516. block drbd0: Method to ensure write ordering: flush
  517. block drbd0: max_segment_size ( = BIO size ) = 65536
  518. block drbd0: drbd_bm_resize called with capacity == 1023896
  519. block drbd0: resync bitmap: bits=127987 words=2000
  520. block drbd0: size = 500 MB (511948 KB)
  521. block drbd0: recounting of set bits took additional 0 jiffies
  522. block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
  523. block drbd0: disk( Attaching -> UpToDate )
  524. block drbd0: conn( StandAlone -> Unconnected )
  525. block drbd0: Starting receiver thread (from drbd0_worker [9168])
  526. block drbd0: receiver (re)started
  527. block drbd0: conn( Unconnected -> WFConnection )
  528. block drbd0: Handshake successful: Agreed network protocol version 95
  529. block drbd0: conn( WFConnection -> WFReportParams )
  530. block drbd0: Starting asender thread (from drbd0_receiver [9180])
  531. block drbd0: data-integrity-alg: <not-used>
  532. block drbd0: max_segment_size ( = BIO size ) = 65536
  533. block drbd0: drbd_sync_handshake:
  534. block drbd0: self 378FBD5A1F9BAB2C:C3C0A265E9277309:173E3BD3C3CA5CD4:BBF4F1769486E305 bits:0 flags:0
  535. block drbd0: peer C3C0A265E9277308:0000000000000000:173E3BD3C3CA5CD4:BBF4F1769486E305 bits:0 flags:0
  536. block drbd0: uuid_compare()=1 by rule 70
  537. block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
  538. block drbd1: Handshake successful: Agreed network protocol version 95
  539. block drbd1: conn( WFConnection -> WFReportParams )
  540. block drbd1: Starting asender thread (from drbd1_receiver [8012])
  541. block drbd1: data-integrity-alg: <not-used>
  542. block drbd1: max_segment_size ( = BIO size ) = 65536
  543. block drbd1: drbd_sync_handshake:
  544. block drbd1: self BDB69235F03FB11B:FC433F9C35D4E19D:EFD171D1BE6D85C5:305E2BE3E64F5FA9 bits:28134237 flags:0
  545. block drbd1: peer FC433F9C35D4E19C:0000000000000000:EFD171D1BE6D85C4:305E2BE3E64F5FA9 bits:0 flags:0
  546. block drbd1: uuid_compare()=1 by rule 70
  547. block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
  548.  
  549.  
  550. #cat /etc/drbd.conf
  551. global {
  552. usage-count no;
  553. }
  554.  
  555. common {
  556. # transfer protocol to use.
  557. # C: write IO is reported as completed, if we know it has
  558. # reached _both_ local and remote DISK.
  559. # * for critical transactional data.
  560. # B: write IO is reported as completed, if it has reached
  561. # local DISK and remote buffer cache.
  562. # * for most cases.
  563. # A: write IO is reported as completed, if it has reached
  564. # local DISK and local tcp send buffer. (see also sndbuf-size)
  565. # * for high latency networks
  566. #
  567. protocol C;
  568.  
  569. handlers {
  570. # what should be done in case the cluster starts up in
  571. # degraded mode, but knows it has inconsistent data.
  572. #pri-on-incon-degr "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
  573.  
  574. pri-on-incon-degr "echo 'DRBD: primary requested but inconsistent!' | wall; /etc/init.d/heartbeat stop"; #"halt -f";
  575. pri-lost-after-sb "echo 'DRBD: primary requested but lost!' | wall; /etc/init.d/heartbeat stop"; #"halt -f";
  576.  
  577. #pri-on-incon-degr "echo o > /proc/sysrq-trigger";
  578. #pri-lost-after-sb "echo o > /proc/sysrq-trigger";
  579. #local-io-error "echo o > /proc/sysrq-trigger";
  580. }
  581.  
  582. startup {
  583. #The init script drbd(8) blocks the boot process until the DRBD resources are connected. When the cluster manager
  584. #starts later, it does not see a resource with internal split-brain. In case you want to limit the wait time, do it
  585. #here. Default is 0, which means unlimited. The unit is seconds.
  586. wfc-timeout 0; # 0 = unlimited (wait forever)
  587.  
  588. # Wait for connection timeout if this node was a degraded cluster.
  589. # In case a degraded cluster (= cluster with only one node left)
  590. # is rebooted, this timeout value is used.
  591. #
  592. degr-wfc-timeout 120; # 2 minutes.
  593. }
  594.  
  595. syncer {
  596. rate 13M;
  597. # This is now expressed with "after res-name"
  598. #group 1;
  599. al-extents 257;
  600. }
  601.  
  602. net {
  603. # TODO: Should these timeouts be relative to some heartbeat settings?
  604. # timeout 60; # 6 seconds (unit = 0.1 seconds)
  605. # connect-int 10; # 10 seconds (unit = 1 second)
  606. # ping-int 10; # 10 seconds (unit = 1 second)
  607.  
  608. # if the connection to the peer is lost you have the choice of
  609. # "reconnect" -> Try to reconnect (AKA WFConnection state)
  610. # "stand_alone" -> Do not reconnect (AKA StandAlone state)
  611. # "freeze_io" -> Try to reconnect but freeze all IO until
  612. # the connection is established again.
  613. # FIXME This appears to be obsolete
  614. #on-disconnect reconnect;
  615.  
  616. # FIXME Experimental settings
  617. #cram-hmac-alg "sha256";
  618. #shared-secret "secretPassword555";
  619. #after-sb-0pri discard-younger-primary;
  620. #after-sb-1pri consensus;
  621. #after-sb-2pri disconnect;
  622. #rr-conflict disconnect;
  623. }
  624.  
  625. disk {
  626. # if the lower level device reports io-error you have the choice of
  627. # "pass_on" -> Report the io-error to the upper layers.
  628. # Primary -> report it to the mounted file system.
  629. # Secondary -> ignore it.
  630. # "panic" -> The node leaves the cluster by doing a kernel panic.
  631. # "detach" -> The node drops its backing storage device, and
  632. # continues in disk less mode.
  633. #
  634. on-io-error pass_on;
  635.  
  636. # Under fencing we understand preventive measures to avoid situations where both nodes are
  637. # primary and disconnected (AKA split brain).
  638. fencing dont-care;
  639.  
  640. # In case you only want to use a fraction of the available space
  641. # you might use the "size" option here.
  642. #
  643. # size 10G;
  644. }
  645. }
  646.  
  647.  
  648. #
  649. # this need not be drbd#, you may use phony resource names,
  650. # like "resource web" or "resource mail", too
  651. #
  652.  
  653. resource "meta" {
  654. device /dev/drbd0;
  655. meta-disk internal;
  656.  
  657. on primary {
  658. address 192.168.50.51:7788;
  659. disk /dev/sdb1;
  660. }
  661.  
  662. on secondary {
  663. address 192.168.50.52:7788;
  664. disk /dev/md0p1;
  665. }
  666. }
  667.  
  668. resource "data" {
  669. device /dev/drbd1;
  670. meta-disk internal;
  671.  
  672. on primary {
  673. address 192.168.50.51:7789;
  674. disk /dev/sdb2;
  675. }
  676.  
  677. on secondary {
  678. address 192.168.50.52:7789;
  679. disk /dev/md0p2;
  680. }
  681. }
  682.  
  683. ########### Notes
  684. DRBD wasn't built as a module; it's built into the kernel, so lsmod doesn't show a drbd module loaded. (Both `find /lib/modules -name "drbd*"` and `find /lib/modules/2.6.*gentoo-*/ -type f -iname '*.o' -or -iname '*.ko' | grep drbd` return no results.)
  685.  
  686. Linux primary 2.6.38-gentoo-r6 #1 SMP Mon Jun 27 07:35:00 EDT 2011 x86_64 Intel(R) Xeon(R) CPU E5504 @ 2.00GHz GenuineIntel GNU/Linux
  687.  
  688. Linux secondary 2.6.38-gentoo-r6 #1 SMP Sun Jun 26 15:57:21 EDT 2011 x86_64 Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz GenuineIntel GNU/Linux
  689.  
  690. Using 'sys-cluster/drbd' emerge build (v 8.3.8.1) on both nodes.
  691.  
  692. Did not emerge 'sys-cluster/drbd-kernel' emerge build (v 8.0.16)
  693.  
  694. Other steps performed:
  695. `drbdadm {disconnect,connect} resource` on both nodes
  696. `drbdadm role resource` on both nodes; the primary reports Primary and the secondary reports Secondary
  697. `drbdadm -- --discard-my-data connect resource` on the secondary
  698.  
  699. While most of the above commands work correctly on the secondary (except the --discard-my-data of course), on the primary I occasionally see errors like these:
  700. No response from the DRBD driver! Is the module loaded?
  701. 1: State change failed: (-12) Device is held open by someone
  702. Command '/sbin/drbdsetup 1 down' terminated with exit code 11
  703. 0: Failure: (125) Device has a net-config (use disconnect first)
  704. Command 'drbdsetup 0 net 192.168.50.51:7788 192.168.50.52:7788 C --set-defaults --create-device' terminated with exit code 10
  705. Command 'drbdsetup 1 net 192.168.50.51:7789 192.168.50.52:7789 C --set-defaults --create-device' did not terminate within 5 seconds
  706. Some commands never seem to complete.
  707.  
  708. The problem is closely related, if not identical, to the issue outlined here: http://copilotco.com/mail-archives/drbd.2009/msg00449.html. Instead of a bad or high-latency link, I simply stopped the DRBD service on the secondary to simulate an outage or disconnect. I left it off for hours to get an idea of how well it would replicate and how long the resync would take, but to my dismay it has not reconnected since.