- node-5 was reset. All nodes later rejoined the cluster except node-8
- node-5 rejoined corosync 2015-12-30T16:45:00.128
- lrmd.log:
- last time seen healthy:
- 2015-12-30T16:39:30.616927+00:00 info: INFO: p_rabbitmq-server: get_monitor(): rabbit app is running and is member of healthy cluster
- then entered a failure loop trying to join node-7:
- 2015-12-30T16:39:40.953840+00:00 info: INFO: p_rabbitmq-server: get_monitor(): we are the oldest node
- 2015-12-30T16:39:40.966122+00:00 info: INFO: p_rabbitmq-server: notify: pre-promote begin.
- 2015-12-30T16:39:40.971940+00:00 info: INFO: p_rabbitmq-server: my_host(): hostlist is: node-7.mirantis.com
- 2015-12-30T16:40:30.734986+00:00 info: INFO: p_rabbitmq-server: su_rabbit_cmd(): the invoked command exited 137: /usr/sbin/rabbitmqctl list_channels 2>&1 > /dev/null
- 2015-12-30T16:40:30.767526+00:00 err: ERROR: p_rabbitmq-server: get_monitor(): 'rabbitmqctl list_channels' timed out 1 of max. 1 time(s) in a row and is not responding. The resource is failed.
- 2015-12-30T16:41:19.884869+00:00 info: INFO: p_rabbitmq-server: join_to_cluster(): Execute join_cluster with timeout: 60
- 2015-12-30T16:41:20.312866+00:00 info: INFO: p_rabbitmq-server: su_rabbit_cmd(): the invoked command exited 2: /usr/sbin/rabbitmqctl join_cluster rabbit@node-7
- 2015-12-30T16:41:20.318842+00:00 err: ERROR: p_rabbitmq-server: join_to_cluster(): Can't join to cluster by node 'rabbit@node-7'. Stopping.
- 2015-12-30T16:41:34.164293+00:00 warning: WARNING: p_rabbitmq-server: reset_mnesia(): Beam have been killed. Mnesia files appear corrupted and have been removed.
- 2015-12-30T16:41:34.167877+00:00 info: INFO: p_rabbitmq-server: notify: post-promote end.
- 2015-12-30T16:41:34.171393+00:00 err: ERROR: p_rabbitmq-server: notify: Failed to join the cluster on post-promote. The resource will be restarted.
- ... last join attempt failed:
- 2015-12-30T16:46:04.707802+00:00 err: ERROR: p_rabbitmq-server: notify: Failed to join the cluster on post-promote. The resource will be restarted.
- and was kept down after that. Pacemaker status reported it as a running Slave, so Pacemaker never tried to start it again!
- crmd.log:
- 2015-12-30T16:40:30.771540+00:00 notice: notice: process_lrm_event: Operation p_rabbitmq-server_monitor_103000: unknown error (node=node-8.mirantis.com, call=174, rc=1, cib-update=385, confirmed=false)
- 2015-12-30T16:40:41.742090+00:00 notice: notice: process_lrm_event: Operation p_rabbitmq-server_monitor_30000: unknown error (node=node-8.mirantis.com, call=175, rc=1, cib-update=386, confirmed=false)
- 2015-12-30T16:41:17.427104+00:00 notice: notice: process_lrm_event: Operation p_rabbitmq-server_monitor_30000: not running (node=node-8.mirantis.com, call=175, rc=7, cib-update=387, confirmed=false)
- 2015-12-30T16:42:14.387788+00:00 notice: notice: process_lrm_event: Operation p_rabbitmq-server_monitor_103000: not running (node=node-8.mirantis.com, call=174, rc=7, cib-update=388, confirmed=false)
- 2015-12-30T16:42:55.145390+00:00 notice: notice: process_lrm_event: Operation p_rabbitmq-server_notify_0: ok (node=node-8.mirantis.com, call=348, rc=0, cib-update=0, confirmed=true)
- 2015-12-30T16:42:58.420655+00:00 notice: notice: process_lrm_event: Operation p_rabbitmq-server_notify_0: ok (node=node-8.mirantis.com, call=349, rc=0, cib-update=0, confirmed=true)
- (repeats)
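
A rough sketch for pulling the operation, result and rc out of the crmd `process_lrm_event` lines quoted above, to make it easier to spot the monitor failures (rc=1, rc=7) followed by notify ops that report ok. The regex is an assumption based only on the line format seen in this paste; `parse_events` and `summarize` are hypothetical helper names, not part of Pacemaker.

```python
import re
from collections import Counter

# Assumed format, taken from the crmd.log lines in this paste only.
EVENT_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\S+) .*?process_lrm_event: "
    r"Operation (?P<op>\S+): (?P<result>.+?) "
    r"\(node=(?P<node>[^,]+), call=(?P<call>\d+), rc=(?P<rc>\d+), "
    r"cib-update=(?P<cib>\d+), confirmed=(?P<confirmed>true|false)\)"
)

def parse_events(lines):
    """Return one dict per process_lrm_event line; other lines are skipped."""
    events = []
    for line in lines:
        m = EVENT_RE.search(line)
        if m is None:
            continue
        event = m.groupdict()
        event["rc"] = int(event["rc"])      # numeric return code of the operation
        event["call"] = int(event["call"])
        events.append(event)
    return events

def summarize(events):
    """Count how often each (operation, rc) pair was reported."""
    return Counter((e["op"], e["rc"]) for e in events)

# Three lines copied from the crmd.log excerpt above.
SAMPLE = [
    "2015-12-30T16:40:30.771540+00:00 notice: notice: process_lrm_event: Operation p_rabbitmq-server_monitor_103000: unknown error (node=node-8.mirantis.com, call=174, rc=1, cib-update=385, confirmed=false)",
    "2015-12-30T16:41:17.427104+00:00 notice: notice: process_lrm_event: Operation p_rabbitmq-server_monitor_30000: not running (node=node-8.mirantis.com, call=175, rc=7, cib-update=387, confirmed=false)",
    "2015-12-30T16:42:55.145390+00:00 notice: notice: process_lrm_event: Operation p_rabbitmq-server_notify_0: ok (node=node-8.mirantis.com, call=348, rc=0, cib-update=0, confirmed=true)",
]

events = parse_events(SAMPLE)
for e in events:
    print(e["ts"], e["op"], e["result"], "rc=%d" % e["rc"])
```

Feeding the full crmd.log through `summarize(parse_events(...))` should show whether any start or monitor operation was reported for node-8 after the last failed join, or only the repeating notify ops.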