HBase: hbck can’t fix region inconsistencies

hbase

We are using stock HBase 0.94.4 on Hadoop 1.0.4.
One of our HBase regions got stuck in the transition state, and running /opt/hbase/bin/hbase hbck now reports the following:

ERROR: Region { meta => dev1_sliceagg_location_file,,1369128923119.21accc8b27bbd501ed4d3575d6ee725e., hdfs => hdfs://192.168.3.100:8020/hbase/dev1_sliceagg_location_file/21accc8b27bbd501ed4d3575d6ee725e, deployed =>  } not deployed on any region server.
ERROR: Region { meta => crash_experiment_sliceagg_client_file,,1369316587953.46e475f415d83f0d5caebccf67acc696., hdfs => hdfs://192.168.3.100:8020/hbase/crash_experiment_sliceagg_client_file/46e475f415d83f0d5caebccf67acc696, deployed =>  } not deployed on any region server.
ERROR: Region { meta => dev1_sliceagg_client_file,\x94\xDC\x97\x85\x94\x15\xAFO\xFEv\xE5}2\xBA\xE6\xC5\x8E\x87'\x0CG\x04\xCF)Q\xE1\xE7\x82\x0Dl\x8A+\x90\x18\xF8{2?\xD2]~6oO\x0F\\x97\x96\xBF\xE5Fc6|\xE8x\xF6+\x09s\xAF\xC9\xC3\xC8\x00<\x11\x00\x00\x00\x00\x00,1369315360949.92fc7ad4623318547cf7f4cb13e3afdc., hdfs => hdfs://192.168.3.100:8020/hbase/dev1_sliceagg_client_file/92fc7ad4623318547cf7f4cb13e3afdc, deployed =>  } not deployed on any region server.
13/05/23 18:54:16 DEBUG util.HBaseFsck: There are 64 region info entries
ERROR: There is a hole in the region chain between \x94\xDC\x97\x85\x94\x15\xAFO\xFEv\xE5}2\xBA\xE6\xC5\x8E\x87'\x0CG\x04\xCF)Q\xE1\xE7\x82\x0Dl\x8A+\x90\x18\xF8{2?\xD2]~6oO\x0F\\x97\x96\xBF\xE5Fc6|\xE8x\xF6+\x09s\xAF\xC9\xC3\xC8\x00<\x11\x00\x00\x00\x00\x00 and \xC80\xCD\x96\xBF-\xB0\xB6hm\x80\xE5\xD7\xDE\xAF\xB0\x0ANWW\xAE\x09\xFA\x96"\xE3\x15\x8C\xC1\xAE\xF1\x14\xEDWNB\x0EW7N2\x8C|Re\x04\xEC\xA5i\xC1d(yf\xF0`\x19\xEC |\xB1\x7F,T@6\x00\x00\x00\x00\x00\x00.  You need to create a new .regioninfo and region dir in hdfs to plug the hole.
ERROR: Found inconsistency in table dev1_sliceagg_client_file
ERROR: (region dev1_sliceagg_location_file,\x80+\x02)\xD9\x04\xE2\x8C\x1E\xA9\xA5'J\xB4W\xFC\xD4\x8C\x86Kgx\x87"\x0C\x14\x8F\xCD\x00p\x11\xEB\xB7;\x98\x9B02J[\x07\xF0\xE8\xAE\xC1m\xFF\xA4\x00$\x01\x00\x00\x00\x00\x00\x00\x00\x03\xEE\x00\x00\x00\x00\x00\x00?\xB2\x00\x00\x00\x00\x00\x00\x0A\xB5,1369128923119.f7b1c0288f9fcc36ebceca091103ac18.) First region should start with an empty key.  You need to  create a new region and regioninfo in HDFS to plug the hole.
ERROR: Found inconsistency in table dev1_sliceagg_location_file
13/05/23 18:54:17 WARN regionserver.StoreFile: Failed match of store file name hdfs://192.168.3.100:8020/hbase/crash_experiment_sliceagg_file_stat/06f163c5f5e79b02e260f3b2752c9cb8/.oldlogs/hlog.1369315359473
13/05/23 18:54:17 WARN regionserver.StoreFile: Failed match of store file name hdfs://192.168.3.100:8020/hbase/-ROOT-/70236052/.oldlogs/hlog.1358951260249
13/05/23 18:54:17 WARN regionserver.StoreFile: Failed match of store file name hdfs://192.168.3.100:8020/hbase/dev1_sliceagg_client_file/92fc7ad4623318547cf7f4cb13e3afdc/.oldlogs/hlog.1369315360956
13/05/23 18:54:17 WARN regionserver.StoreFile: Failed match of store file name hdfs://192.168.3.100:8020/hbase/crash_experiment_sliceagg_client_file/46e475f415d83f0d5caebccf67acc696/.oldlogs/hlog.1369316587995
13/05/23 18:54:17 WARN regionserver.StoreFile: Failed match of store file name hdfs://192.168.3.100:8020/hbase/.META./1028785192/.oldlogs/hlog.1358951260483

/opt/hbase/bin/hbase hbck -fix does not fix anything because it gets stuck printing the message "Region still in transition, waiting for it to become assigned".
/opt/hbase/bin/hbase hbck -repairHoles does not help either.
What should we do to resolve this situation?

Best Answer

We had to stop HBase and delete the recovered.edits folders of the failing regions. After that, hbck succeeded.
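To find which folders are involved, the region directories can be pulled out of the hbck output shown in the question. The following is only a rough sketch: the function name `recovered_edits_dirs` is made up for illustration, and it assumes that recovered.edits sits directly under each region directory named in the `hdfs =>` field of the "not deployed" errors. It prints inspection commands and does not touch HDFS itself.

```python
import re
from typing import List

# Captures the "hdfs => <region dir>" field of an hbck ERROR line.
HDFS_FIELD = re.compile(r"hdfs => (hdfs://\S+?), deployed =>")
NOT_DEPLOYED = "not deployed on any region server"


def recovered_edits_dirs(hbck_output: str) -> List[str]:
    """Return candidate recovered.edits paths for undeployed regions.

    Assumption: recovered.edits lives directly under the region directory.
    """
    paths: List[str] = []
    for line in hbck_output.splitlines():
        if not line.startswith("ERROR: Region") or NOT_DEPLOYED not in line:
            continue
        match = HDFS_FIELD.search(line)
        if match:
            path = match.group(1) + "/recovered.edits"
            if path not in paths:  # keep order, skip duplicates
                paths.append(path)
    return paths


# One of the ERROR lines from the question, used as sample input.
SAMPLE = (
    "ERROR: Region { meta => dev1_sliceagg_location_file,,1369128923119."
    "21accc8b27bbd501ed4d3575d6ee725e., hdfs => hdfs://192.168.3.100:8020/hbase/"
    "dev1_sliceagg_location_file/21accc8b27bbd501ed4d3575d6ee725e, "
    "deployed =>  } not deployed on any region server."
)

for path in recovered_edits_dirs(SAMPLE):
    # Only print an inspection command; nothing is executed or deleted here.
    print("hadoop fs -ls " + path)
```

Once the listed folders have been checked and HBase is stopped, they can be moved aside with `hadoop fs -mv` or removed with `hadoop fs -rmr` (the Hadoop 1.x syntax). Keep in mind that recovered.edits holds write-ahead-log edits that have not been replayed into the region yet, so deleting it discards those edits; moving the folder aside is the more cautious option.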
