A kernel panic makes a small HBase cluster crash?


---------- Forwarded message ----------
From: Tatsuya Kawano <[email protected]>
Date: 2011/3/5
Subject: A kernel panic makes small HBase cluster to crush?
To: [email protected]


Hi,

I got this question on the Hadoop User Group Japan mailing list, and I
need some help from the experts here. It looks like an HDFS issue,
maybe "append" related, but I'm not totally sure yet.

The person who posted the original question is testing the HA features
in HBase 0.90.0 and ASF Hadoop 0.20.2 (with
hadoop-core-0.20-append-r1056497.jar).

His test cluster has only 3 nodes.

Node 1: RS, DN, ZK   plus   HM, NN
Node 2: RS, DN, ZK
Node 3: RS, DN, ZK

dfs.replication = 3


He brought down Node 3 (which was handling Put requests from his test
client) by triggering a kernel panic ("echo c > /proc/sysrq-trigger").
However, the Region Servers on Node 1 and Node 2 also went down, with
the following message.

---------------------------------------------------------------------
2011-03-01 23:13:13,056 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region
server serverName=ap12.secur2,60020,1298987576087, load=(requests=0,
regions=4, usedHeap=218, maxHeap=1998): Replay of HLog required.
Forcing server shutdown
org.apache.hadoop.hbase.DroppedSnapshotException: region:
Object_Speed_Test,
5003017357526424133520110201051038918,1298988549775.1dbc1bf84b48e1145638b3a3bc3ad1cd
---------------------------------------------------------------------

He can easily reproduce this issue on his cluster.

Looking at the above message, I thought there was something wrong with
HDFS, and that the RS was reading a corrupted HFile or something
similar from HDFS.

Then we checked the HDFS NN and DN logs, and it seems the NN was
confused and wasn't able to allocate a block for the write.

---------------------------------------------------------------------
2011-03-01 23:13:13,006 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
ugi=hbase,hadoop        ip=/XX.XX.XX.XX   cmd=create      src=/hbase/
Object_Speed_Test/1dbc1bf84b48e1145638b3a3bc3ad1cd/.tmp/
1275904589980700621    dst=null        perm=hbase:supergroup:rw-r--r--
2011-03-01 23:13:13,048 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 2 on 9000, call addBlock(/hbase/Object_Speed_Test/
1dbc1bf84b48e1145638b3a3bc3ad1cd/.tmp/1275904589980700621,
DFSClient_hb_rs_ap12.secur2,60020,1298987576087_1298987617433, null)
from XX.XX.XX.XX:55462: error: java.io.IOException: File /hbase/
Object_Speed_Test/1dbc1bf84b48e1145638b3a3bc3ad1cd/.tmp/
1275904589980700621 could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /hbase/Object_Speed_Test/
1dbc1bf84b48e1145638b3a3bc3ad1cd/.tmp/1275904589980700621 could only
be replicated to 0 nodes, instead of 1
---------------------------------------------------------------------

It seems the kernel panic on Node 3 put HDFS into a bad state, so the
Region Servers couldn't write to or read from HDFS and had to shut
themselves down.
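
In case anyone wants to check their own NameNode log for the same
error, here is a minimal Python sketch that pulls out the file path and
the two replica counts. The pattern is an assumption based only on the
one message quoted above, so other Hadoop versions may word it
differently, and find_replication_errors is a name I made up for
illustration.

```python
import re

# Pattern taken from the single NameNode error quoted above (an assumption;
# other Hadoop versions may phrase the message differently).
REPLICATION_ERROR = re.compile(
    r"File (?P<path>\S+) could only be replicated to "
    r"(?P<got>\d+) nodes, instead of (?P<wanted>\d+)"
)


def find_replication_errors(log_text):
    """Return (path, got, wanted) for each matching error in log_text."""
    # Mail clients wrap long log lines, sometimes right after a "/" in a
    # path, so rejoin those first and then flatten the remaining whitespace.
    text = re.sub(r"/\s*\n\s*", "/", log_text)
    text = re.sub(r"\s+", " ", text)
    return [
        (m.group("path"), int(m.group("got")), int(m.group("wanted")))
        for m in REPLICATION_ERROR.finditer(text)
    ]
```

Reading the log file and passing its contents to
find_replication_errors gives one tuple per occurrence, which makes it
easier to see which files were affected and when "got" dropped to 0.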

We couldn't find any more clues in the logs, but I pasted them here:

http://pastebin.com/NYkNS1c1


Since dfs.replication = 3, all Data Nodes were participating in the
HLog write at the time Node 3 got the kernel panic. I think this
somehow made the Name Node think those Data Nodes were all gone. But I
couldn't find the root cause of this issue.
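
To make that reasoning concrete, here is a toy Python model of the
check behind the "could only be replicated to 0 nodes, instead of 1"
error. It is only a sketch under my own assumptions, not the real Name
Node code: add_block and usable_datanodes are hypothetical names, and I
am assuming the "1" in the message is the minimum replication
(dfs.replication.min, default 1) rather than dfs.replication.

```python
def add_block(path, usable_datanodes, replication=3, min_replication=1):
    """Toy stand-in for block allocation (not the real Name Node logic).

    usable_datanodes is the set of Data Nodes the Name Node currently
    considers eligible to receive a new block.
    """
    # Pick at most `replication` targets from the eligible nodes.
    targets = sorted(usable_datanodes)[:replication]
    # The write is refused only when fewer than min_replication remain.
    if len(targets) < min_replication:
        raise IOError(
            "File %s could only be replicated to %d nodes, instead of %d"
            % (path, len(targets), min_replication)
        )
    return targets


if __name__ == "__main__":
    nodes = {"node1", "node2", "node3"}
    # Only Node 3 lost: two targets remain, so the write should proceed.
    print(add_block("/hbase/example", nodes - {"node3"}))
    # The logged error appears only if no node at all is considered usable.
    try:
        add_block("/hbase/example", set())
    except IOError as err:
        print(err)
```

Under this model, losing Node 3 alone should still leave two targets,
so seeing "0 nodes" suggests the Name Node regarded Node 1 and Node 2
as unusable too.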

Also, he checked the network and disk space, and he believes there was
no issue with either while he was testing.

Thanks,

--
Tatsuya Kawano
Tokyo, Japan