config and log file follow -

config:

node st15-mds1
node st15-mds2
node st15-oss1
node st15-oss2
primitive lustre-MDT0000 ocf:heartbeat:Filesystem \
    meta target-role="Started" \
    operations $id="lustre-MDT0000-operations" \
    op monitor interval="120" timeout="60" \
    op start interval="0" timeout="300" \
    op stop interval="0" timeout="300" \
    params device="-Llustre-MDT0000" directory="/mnt/mdt" fstype="lustre"
primitive lustre-OST0000 ocf:heartbeat:Filesystem \
    meta target-role="Started" \
    operations $id="lustre-OST0000-operations" \
    op monitor interval="120" timeout="60" \
    op start interval="0" timeout="300" \
    op stop interval="0" timeout="300" \
    params device="-Llustre-OST0000" directory="/mnt/ost0" fstype="lustre"
primitive lustre-OST0001 ocf:heartbeat:Filesystem \
    meta target-role="Started" \
    operations $id="lustre-OST0001-operations" \
    op monitor interval="120" timeout="60" \
    op start interval="0" timeout="300" \
    op stop interval="0" timeout="300" \
    params device="-Llustre-OST0001" directory="/mnt/ost1" fstype="lustre"
primitive lustre-OST0002 ocf:heartbeat:Filesystem \
    meta target-role="Started" \
    operations $id="lustre-OST0002-operations" \
    op monitor interval="120" timeout="60" \
    op start interval="0" timeout="300" \
    op stop interval="0" timeout="300" \
    params device="-Llustre-OST0002" directory="/mnt/ost2" fstype="lustre"
primitive lustre-OST0003 ocf:heartbeat:Filesystem \
    meta target-role="Started" \
    operations $id="lustre-OST0003-operations" \
    op monitor interval="120" timeout="60" \
    op start interval="0" timeout="300" \
    op stop interval="0" timeout="300" \
    params device="-Llustre-OST0003" directory="/mnt/ost3" fstype="lustre"
primitive st-nodes stonith:external/libvirt \
    params hostlist="st15-mds1,st15-mds2,st15-oss1,st15-oss2" hypervisor_uri="qemu+ssh://wc0008/system" stonith-timeout="30" \
    op start interval="0" timeout="60" \
    op stop interval="0" timeout="60" \
    op monitor interval="60"
location lustre-MDT0000-primary lustre-MDT0000 20: st15-mds1
location lustre-MDT0000-secondary lustre-MDT0000 10: st15-mds2
location lustre-OST0000-primary lustre-OST0000 20: st15-oss1
location lustre-OST0000-secondary lustre-OST0000 10: st15-oss2
location lustre-OST0001-primary lustre-OST0001 20: st15-oss1
location lustre-OST0001-secondary lustre-OST0001 10: st15-oss2
location lustre-OST0002-primary lustre-OST0002 20: st15-oss2
location lustre-OST0002-secondary lustre-OST0002 10: st15-oss1
location lustre-OST0003-primary lustre-OST0003 20: st15-oss2
location lustre-OST0003-secondary lustre-OST0003 10: st15-oss1
property $id="cib-bootstrap-options" \
    dc-version="1.0.12-unknown" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="4" \
    symmetric-cluster="false" \
    stonith-enabled="true" \
    last-lrm-refresh="1340974907"
rsc_defaults $id="rsc-options" \
    resource-stickiness="1000"
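
As a sanity check, a configuration like this can be reviewed with the standard Pacemaker tools (crm_verify is the same command the pengine warning in the log below points to); the commands here are only a generic sketch:

# validate the live CIB and report configuration problems
crm_verify -LV
# print the active configuration in crm shell syntax (should match the listing above)
crm configure show
# one-shot overview of node and resource state
crm_mon -1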

log file:

Jun 29 09:58:59 st15-mds2 corosync[4479]: [TOTEM ] A processor failed, forming new configuration.
Jun 29 09:59:00 st15-mds2 corosync[4479]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 396: memb=3, new=0, lost=1
Jun 29 09:59:00 st15-mds2 corosync[4479]: [pcmk ] info: pcmk_peer_update: memb: st15-mds2 41134272
Jun 29 09:59:00 st15-mds2 corosync[4479]: [pcmk ] info: pcmk_peer_update: memb: st15-oss1 57911488
Jun 29 09:59:00 st15-mds2 corosync[4479]: [pcmk ] info: pcmk_peer_update: memb: st15-oss2 74688704
Jun 29 09:59:00 st15-mds2 corosync[4479]: [pcmk ] info: pcmk_peer_update: lost: st15-mds1 24357056
Jun 29 09:59:00 st15-mds2 corosync[4479]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 396: memb=3, new=0, lost=0
Jun 29 09:59:00 st15-mds2 corosync[4479]: [pcmk ] info: pcmk_peer_update: MEMB: st15-mds2 41134272
Jun 29 09:59:00 st15-mds2 corosync[4479]: [pcmk ] info: pcmk_peer_update: MEMB: st15-oss1 57911488
Jun 29 09:59:00 st15-mds2 corosync[4479]: [pcmk ] info: pcmk_peer_update: MEMB: st15-oss2 74688704
Jun 29 09:59:00 st15-mds2 corosync[4479]: [pcmk ] info: ais_mark_unseen_peer_dead: Node st15-mds1 was not seen in the previous transition
Jun 29 09:59:00 st15-mds2 corosync[4479]: [pcmk ] info: update_member: Node 24357056/st15-mds1 is now: lost
Jun 29 09:59:00 st15-mds2 corosync[4479]: [pcmk ] info: send_member_notification: Sending membership update 396 to 2 children
Jun 29 09:59:00 st15-mds2 crmd: [4490]: info: ais_dispatch: Membership 396: quorum retained
Jun 29 09:59:00 st15-mds2 cib: [4486]: info: ais_dispatch: Membership 396: quorum retained
Jun 29 09:59:00 st15-mds2 corosync[4479]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Jun 29 09:59:00 st15-mds2 crmd: [4490]: info: ais_status_callback: status: st15-mds1 is now lost (was member)
Jun 29 09:59:00 st15-mds2 cib: [4486]: info: crm_update_peer: Node st15-mds1: id=24357056 state=lost (new) addr=r(0) ip(192.168.115.1) votes=1 born=384 seen=392 proc=00000000000000000000000000013312
Jun 29 09:59:00 st15-mds2 corosync[4479]: [MAIN ] Completed service synchronization, ready to provide service.
Jun 29 09:59:00 st15-mds2 crmd: [4490]: info: crm_update_peer: Node st15-mds1: id=24357056 state=lost (new) addr=r(0) ip(192.168.115.1) votes=1 born=384 seen=392 proc=00000000000000000000000000013312
Jun 29 09:59:00 st15-mds2 crmd: [4490]: info: erase_node_from_join: Removed node st15-mds1 from join calculations: welcomed=0 itegrated=0 finalized=0 confirmed=1
Jun 29 09:59:00 st15-mds2 cib: [4486]: info: cib_process_request: Operation complete: op cib_modify for section nodes (origin=local/crmd/232, version=0.313.9): ok (rc=0)
Jun 29 09:59:00 st15-mds2 crmd: [4490]: info: crm_ais_dispatch: Setting expected votes to 4
Jun 29 09:59:00 st15-mds2 cib: [4486]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/235, version=0.313.10): ok (rc=0)
Jun 29 09:59:00 st15-mds2 crmd: [4490]: WARN: match_down_event: No match for shutdown action on st15-mds1
Jun 29 09:59:00 st15-mds2 crmd: [4490]: info: te_update_diff: Stonith/shutdown of st15-mds1 not matched
Jun 29 09:59:00 st15-mds2 crmd: [4490]: info: abort_transition_graph: te_update_diff:198 - Triggered transition abort (complete=1, tag=node_state, id=st15-mds1, magic=NA, cib=0.313.10) : Node failure
Jun 29 09:59:00 st15-mds2 crmd: [4490]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Jun 29 09:59:00 st15-mds2 crmd: [4490]: info: do_state_transition: All 3 cluster nodes are eligible to run resources.
Jun 29 09:59:00 st15-mds2 crmd: [4490]: info: do_pe_invoke: Query 236: Requesting the current CIB: S_POLICY_ENGINE
Jun 29 09:59:00 st15-mds2 crmd: [4490]: info: do_pe_invoke_callback: Invoking the PE: query=236, ref=pe_calc-dc-1340978340-160, seq=396, quorate=1
Jun 29 09:59:00 st15-mds2 pengine: [4489]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Jun 29 09:59:00 st15-mds2 pengine: [4489]: info: determine_online_status: Node st15-mds2 is online
Jun 29 09:59:00 st15-mds2 pengine: [4489]: WARN: pe_fence_node: Node st15-mds1 will be fenced because it is un-expectedly down
Jun 29 09:59:00 st15-mds2 pengine: [4489]: info: determine_online_status_fencing: ha_state=active, ccm_state=false, crm_state=online, join_state=member, expected=member
Jun 29 09:59:00 st15-mds2 pengine: [4489]: WARN: determine_online_status: Node st15-mds1 is unclean
Jun 29 09:59:00 st15-mds2 pengine: [4489]: info: determine_online_status: Node st15-oss1 is online
Jun 29 09:59:00 st15-mds2 pengine: [4489]: info: determine_online_status: Node st15-oss2 is online
Jun 29 09:59:00 st15-mds2 pengine: [4489]: notice: native_print: lustre-OST0000 (ocf::heartbeat:Filesystem): Started st15-oss1
Jun 29 09:59:00 st15-mds2 pengine: [4489]: notice: native_print: lustre-OST0001 (ocf::heartbeat:Filesystem): Started st15-oss1
Jun 29 09:59:00 st15-mds2 pengine: [4489]: notice: native_print: lustre-OST0002 (ocf::heartbeat:Filesystem): Started st15-oss2
Jun 29 09:59:00 st15-mds2 pengine: [4489]: notice: native_print: lustre-OST0003 (ocf::heartbeat:Filesystem): Started st15-oss2
Jun 29 09:59:00 st15-mds2 pengine: [4489]: notice: native_print: lustre-MDT0000 (ocf::heartbeat:Filesystem): Started st15-mds1
Jun 29 09:59:01 st15-mds2 pengine: [4489]: notice: native_print: st-nodes (stonith:external/libvirt): Stopped
Jun 29 09:59:01 st15-mds2 pengine: [4489]: info: native_color: Resource st-nodes cannot run anywhere
Jun 29 09:59:01 st15-mds2 pengine: [4489]: WARN: custom_action: Action lustre-MDT0000_stop_0 on st15-mds1 is unrunnable (offline)
Jun 29 09:59:01 st15-mds2 pengine: [4489]: WARN: custom_action: Marking node st15-mds1 unclean
Jun 29 09:59:01 st15-mds2 pengine: [4489]: notice: RecurringOp: Start recurring monitor (120s) for lustre-MDT0000 on st15-mds2
Jun 29 09:59:01 st15-mds2 pengine: [4489]: WARN: stage6: Scheduling Node st15-mds1 for STONITH
Jun 29 09:59:01 st15-mds2 pengine: [4489]: info: native_stop_constraints: lustre-MDT0000_stop_0 is implicit after st15-mds1 is fenced
Jun 29 09:59:01 st15-mds2 pengine: [4489]: notice: LogActions: Leave resource lustre-OST0000 (Started st15-oss1)
Jun 29 09:59:01 st15-mds2 pengine: [4489]: notice: LogActions: Leave resource lustre-OST0001 (Started st15-oss1)
Jun 29 09:59:01 st15-mds2 pengine: [4489]: notice: LogActions: Leave resource lustre-OST0002 (Started st15-oss2)
Jun 29 09:59:01 st15-mds2 pengine: [4489]: notice: LogActions: Leave resource lustre-OST0003 (Started st15-oss2)
Jun 29 09:59:01 st15-mds2 pengine: [4489]: notice: LogActions: Move resource lustre-MDT0000 (Started st15-mds1 -> st15-mds2)
Jun 29 09:59:01 st15-mds2 pengine: [4489]: notice: LogActions: Leave resource st-nodes (Stopped)
Jun 29 09:59:01 st15-mds2 crmd: [4490]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Jun 29 09:59:01 st15-mds2 pengine: [4489]: WARN: process_pe_message: Transition 28: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/pengine/pe-warn-111.bz2
Jun 29 09:59:01 st15-mds2 crmd: [4490]: info: unpack_graph: Unpacked transition 28: 7 actions in 7 synapses
Jun 29 09:59:01 st15-mds2 pengine: [4489]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
Jun 29 09:59:01 st15-mds2 crmd: [4490]: info: do_te_invoke: Processing graph 28 (ref=pe_calc-dc-1340978340-160) derived from /var/lib/pengine/pe-warn-111.bz2
Jun 29 09:59:01 st15-mds2 crmd: [4490]: info: te_pseudo_action: Pseudo action 21 fired and confirmed
Jun 29 09:59:01 st15-mds2 crmd: [4490]: info: te_fence_node: Executing reboot fencing operation (23) on st15-mds1 (timeout=60000)
Jun 29 09:59:01 st15-mds2 stonithd: [4485]: info: client tengine [pid: 4490] requests a STONITH operation RESET on node st15-mds1
Jun 29 09:59:01 st15-mds2 stonithd: [4485]: info: we can't manage st15-mds1, broadcast request to other nodes
Jun 29 09:59:01 st15-mds2 stonithd: [4485]: info: Broadcasting the message succeeded: require others to stonith node st15-mds1.
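
Note on the log: the pengine reports "Resource st-nodes cannot run anywhere" and leaves st-nodes Stopped, which is likely why stonithd on st15-mds2 can only broadcast the fencing request rather than execute it. With symmetric-cluster="false" the cluster is opt-in, and the configuration above defines location constraints for the Lustre filesystems but none for st-nodes, so the fencing resource is never allowed to start. One possible fix (a sketch only; the constraint IDs and scores are illustrative) is to allow st-nodes on the cluster nodes as well:

location st-nodes-mds1 st-nodes 10: st15-mds1
location st-nodes-mds2 st-nodes 10: st15-mds2
location st-nodes-oss1 st-nodes 10: st15-oss1
location st-nodes-oss2 st-nodes 10: st15-oss2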