LOGS HA

Apr 19th, 2017
194
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 18.53 KB | None | 0 0
  1. **nn2 : zkfc log when stopping nn1**
  2.  
  3.  
  4. 2017-04-19 01:16:09,275 INFO org.apache.hadoop.ha.ActiveStandbyElector: Trying to re-establish ZK session
  5. 2017-04-19 01:16:09,278 INFO org.apache.zookeeper.ZooKeeper: Session: 0x15b854004c90003 closed
  6. 2017-04-19 01:16:10,280 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=192.168.10.153:2181,192.168.10.155:2181,192.168.10.154:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@3045b95a
  7. 2017-04-19 01:16:10,283 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 192.168.10.155/192.168.10.155:2181. Will not attempt to authenticate using SASL (unknown error)
  8. 2017-04-19 01:16:10,284 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to 192.168.10.155/192.168.10.155:2181, initiating session
  9. 2017-04-19 01:16:10,285 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server 192.168.10.155/192.168.10.155:2181, sessionid = 0x15b854004c90004, negotiated timeout = 5000
  10. 2017-04-19 01:16:10,286 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
  11. 2017-04-19 01:16:10,289 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
  12. 2017-04-19 01:16:10,295 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced...
  13. 2017-04-19 01:16:10,296 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: 0a0a68612d636c757374657212036e6e311a036e6e3120a84628d33e
  14. 2017-04-19 01:16:10,300 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
  15. java.lang.RuntimeException: Mismatched address stored in ZK for NameNode at /192.168.10.153:9000: Stored protobuf was nameserviceId: "ha-cluster"
  16. namenodeId: "nn1"
  17. hostname: "nn1"
  18. port: 9000
  19. zkfcPort: 8019
  20. , address from our own configuration for this NameNode was /192.168.10.153:9000
  21. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.dataToTarget(DFSZKFailoverController.java:75)
  22. at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:502)
  23. at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:61)
  24. at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:892)
  25. at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:910)
  26. at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:809)
  27. at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:418)
  28. at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
  29. at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
  30. 2017-04-19 01:16:10,300 INFO org.apache.hadoop.ha.ActiveStandbyElector: Trying to re-establish ZK session
  31. 2017-04-19 01:16:10,303 INFO org.apache.zookeeper.ZooKeeper: Session: 0x15b854004c90004 closed
  32. 2017-04-19 01:16:11,304 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=192.168.10.153:2181,192.168.10.155:2181,192.168.10.154:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@12af3fe6
  33. 2017-04-19 01:16:11,307 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 192.168.10.154/192.168.10.154:2181. Will not attempt to authenticate using SASL (unknown error)
  34. 2017-04-19 01:16:11,310 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to 192.168.10.154/192.168.10.154:2181, initiating session
  35. 2017-04-19 01:16:11,312 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server 192.168.10.154/192.168.10.154:2181, sessionid = 0x15b85401dd80003, negotiated timeout = 5000
  36. 2017-04-19 01:16:11,312 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
  37. 2017-04-19 01:16:11,319 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
  38. 2017-04-19 01:16:11,320 INFO org.apache.hadoop.ha.ZKFailoverController: ZK Election indicated that NameNode at 192.168.10.155/192.168.10.155:9000 should become standby
  39. 2017-04-19 01:16:11,335 INFO org.apache.hadoop.ha.ZKFailoverController: Successfully transitioned NameNode at 192.168.10.155/192.168.10.155:9000 to standby state
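The "Old node exists" line above prints the ZooKeeper lock node contents as hex, and the exception that follows prints the same record as a decoded protobuf. As a cross-check, here is a minimal sketch that decodes the hex by hand with the generic protobuf wire format. The field names come from the exception text; mapping them to field numbers 1 through 5 is an assumption inferred from the byte layout, not taken from Hadoop's `.proto` definition.

```python
# Hand-rolled protobuf wire-format reader; no external libraries needed.
# HEX is copied verbatim from the "Old node exists" log line.
HEX = "0a0a68612d636c757374657212036e6e311a036e6e3120a84628d33e"

# Assumed mapping of field numbers to the names shown in the exception text.
FIELD_NAMES = {
    1: "nameserviceId",
    2: "namenodeId",
    3: "hostname",
    4: "port",
    5: "zkfcPort",
}


def read_varint(buf, pos):
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode(hex_blob):
    """Decode a flat message that uses only varint and length-delimited fields."""
    buf = bytes.fromhex(hex_blob)
    pos = 0
    fields = {}
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        name = FIELD_NAMES.get(number, "field_%d" % number)
        if wire_type == 0:  # varint (used here for the two port numbers)
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited (used here for the strings)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError("unsupported wire type %d" % wire_type)
        fields[name] = value
    return fields


if __name__ == "__main__":
    for name, value in decode(HEX).items():
        print(name, "=", value)
```

The decoded record identifies the old active by the hostname `nn1`, while the address from nn2's own configuration is printed as the bare IP `/192.168.10.153:9000`. How `nn1` resolves on the nn2 host may therefore be worth checking, although these logs alone do not confirm that as the cause.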
  40.  
  41.  
  42.  
  43.  
  44. **nn2 : zkfc log when stopping active nn1 namenode**
  45.  
  46.  
  47.  
  48.  
  49. 2017-04-19 01:18:07,547 INFO org.apache.hadoop.ha.ActiveStandbyElector: Trying to re-establish ZK session
  50. 2017-04-19 01:18:07,552 INFO org.apache.zookeeper.ZooKeeper: Session: 0x15b85401dd80003 closed
  51. 2017-04-19 01:18:08,553 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=192.168.10.153:2181,192.168.10.155:2181,192.168.10.154:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@249ba7f0
  52. 2017-04-19 01:18:08,558 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 192.168.10.153/192.168.10.153:2181. Will not attempt to authenticate using SASL (unknown error)
  53. 2017-04-19 01:18:08,559 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to 192.168.10.153/192.168.10.153:2181, initiating session
  54. 2017-04-19 01:18:08,561 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server 192.168.10.153/192.168.10.153:2181, sessionid = 0x15b853fd5430003, negotiated timeout = 5000
  55. 2017-04-19 01:18:08,562 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
  56. 2017-04-19 01:18:08,567 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
  57. 2017-04-19 01:18:08,567 FATAL org.apache.hadoop.ha.ActiveStandbyElector: Received create error from Zookeeper. code:NONODE for path /hadoop-ha/ha-cluster/ActiveStandbyElectorLock
  58. 2017-04-19 01:18:08,570 INFO org.apache.zookeeper.ZooKeeper: Session: 0x15b853fd5430003 closed
  59. 2017-04-19 01:18:08,570 FATAL org.apache.hadoop.ha.ZKFailoverController: Fatal error occurred:Received create error from Zookeeper. code:NONODE for path /hadoop-ha/ha-cluster/ActiveStandbyElectorLock
  60. 2017-04-19 01:18:08,571 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
  61. 2017-04-19 01:18:08,571 INFO org.apache.hadoop.ipc.Server: Stopping server on 8019
  62. 2017-04-19 01:18:08,571 INFO org.apache.hadoop.ha.ActiveStandbyElector: Yielding from election
  63. 2017-04-19 01:18:08,572 INFO org.apache.hadoop.ha.HealthMonitor: Stopping HealthMonitor thread
  64. 2017-04-19 01:18:08,572 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8019
  65. 2017-04-19 01:18:08,572 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
  66. 2017-04-19 01:18:08,573 FATAL org.apache.hadoop.hdfs.tools.DFSZKFailoverController: Got a fatal error, exiting now
  67. java.lang.RuntimeException: ZK Failover Controller failed: Received create error from Zookeeper. code:NONODE for path /hadoop-ha/ha-cluster/ActiveStandbyElectorLock
  68. at org.apache.hadoop.ha.ZKFailoverController.mainLoop(ZKFailoverController.java:369)
  69. at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:238)
  70. at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:61)
  71. at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:172)
  72. at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:168)
  73. at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
  74. at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:168)
  75. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:181)
  76.  
  77.  
  78.  
  79.  
  80. **nn2 : namenode log when stopping nn1**
  81.  
  82.  
  83. 2017-04-19 01:20:07,907 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN
  84. java.net.ConnectException: Call From localhost/127.0.0.1 to 192.168.10.153:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
  85. at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  86. at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  87. at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  88. at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  89. at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
  90. at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
  91. at org.apache.hadoop.ipc.Client.call(Client.java:1479)
  92. at org.apache.hadoop.ipc.Client.call(Client.java:1412)
  93. at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
  94. at com.sun.proxy.$Proxy16.rollEditLog(Unknown Source)
  95. at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:148)
  96. at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:273)
  97. at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$600(EditLogTailer.java:61)
  98. at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:315)
  99. at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284)
  100. at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301)
  101. at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
  102. at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:297)
  103. Caused by: java.net.ConnectException: Connection refused
  104. at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  105. at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
  106. at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
  107. at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
  108. at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
  109. at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
  110. at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
  111. at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
  112. at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
  113. at org.apache.hadoop.ipc.Client.call(Client.java:1451)
  114. ... 11 more
  115. 2017-04-19 01:20:09,734 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=192.168.10.154, hostname=192.168.10.154)
  116. 2017-04-19 01:20:09,735 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 192.168.10.154:58536 Call#499 Retry#0
  117. org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.10.154, hostname=192.168.10.154): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=c6e060e4-9645-4b3e-a519-1a9fd83b0449, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-ca1fac53-fb12-49e3-a01f-11880e517eb4;nsid=1312587850;c=0)
  118. at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:873)
  119. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4529)
  120. at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1286)
  121. at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:96)
  122. at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28752)
  123. at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
  124. at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
  125. at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
  126. at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
  127. at java.security.AccessController.doPrivileged(Native Method)
  128. at javax.security.auth.Subject.doAs(Subject.java:422)
  129. at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
  130. at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
  131. 2017-04-19 01:20:14,744 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=192.168.10.154, hostname=192.168.10.154)
  132. 2017-04-19 01:20:14,746 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9000, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 192.168.10.154:58536 Call#502 Retry#0
  133. org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.10.154, hostname=192.168.10.154): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=c6e060e4-9645-4b3e-a519-1a9fd83b0449, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-ca1fac53-fb12-49e3-a01f-11880e517eb4;nsid=1312587850;c=0)
  134. at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:873)
  135. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4529)
  136. at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1286)
  137. at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:96)
  138. at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28752)
  139. at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
  140. at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
  141. at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
  142. at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
  143. at java.security.AccessController.doPrivileged(Native Method)
  144. at javax.security.auth.Subject.doAs(Subject.java:422)
  145. at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
  146. at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
  147. 2017-04-19 01:20:19,753 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=192.168.10.154, hostname=192.168.10.154)
  148. 2017-04-19 01:20:19,754 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 192.168.10.154:58536 Call#504 Retry#0
  149. org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.10.154, hostname=192.168.10.154): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=c6e060e4-9645-4b3e-a519-1a9fd83b0449, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-ca1fac53-fb12-49e3-a01f-11880e517eb4;nsid=1312587850;c=0)
  150. at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:873)
  151. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4529)
  152. at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1286)
  153. at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:96)
  154. at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28752)
  155. at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
  156.  
  157.  
  158.  
  159. **nn1 : namenode log when stopping nn1**
  160.  
  161.  
  162.  
  163.  
  164. 2017-04-19 01:21:52,061 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1/192.168.10.153:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
  165. 2017-04-19 01:21:52,063 WARN org.apache.hadoop.ha.HealthMonitor: Transport-level exception trying to monitor health of NameNode at nn1/192.168.10.153:9000: java.net.ConnectException: Connection refused Call From nn1/192.168.10.153 to nn1:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
  166. 2017-04-19 01:21:54,066 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1/192.168.10.153:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
  167. 2017-04-19 01:21:54,067 WARN org.apache.hadoop.ha.HealthMonitor: Transport-level exception trying to monitor health of NameNode at nn1/192.168.10.153:9000: java.net.ConnectException: Connection refused Call From nn1/192.168.10.153 to nn1:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
  168. 2017-04-19 01:21:56,074 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1/192.168.10.153:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
  169. 2017-04-19 01:21:56,076 WARN org.apache.hadoop.ha.HealthMonitor: Transport-level exception trying to monitor health of NameNode at nn1/192.168.10.153:9000: java.net.ConnectException: Connection refused Call From nn1/192.168.10.153 to nn1:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
  170. 2017-04-19 01:21:58,078 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: nn1/192.168.10.153:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
  171. 2017-04-19 01:21:58,081 WARN org.apache.hadoop.ha.HealthMonitor: Transport-level exception trying to monitor health of NameNode at nn1/192.168.10.153:9000: java.net.ConnectException: Connection refused Call From nn1/192.168.10.153 to nn1:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused