Hadoop DataNode is giving me an incompatible namespace ID

hadoop, hdfs

When I run the start-all.sh script from my master node, some of my DataNodes fail to start; the DataNode log reports a java.io.IOException: Incompatible namespaceIDs in /tmp/$MY_USER_NAME.

Best Answer

When the NameNode is formatted, a namespace ID is generated; it essentially identifies that specific instance of the distributed filesystem. When a DataNode first connects to the NameNode, it stores that namespace ID alongside its data blocks, because the blocks belong to that specific filesystem.
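Both sides persist the ID in a small `current/VERSION` properties file under their storage directories (`dfs.name.dir` on the NameNode, `dfs.data.dir` on the DataNode). The sketch below simulates those files with made-up IDs and hypothetical `/tmp` paths, just to show how an admin would compare the two by hand:

```shell
# Simulated VERSION files; in a real cluster these live at
# ${dfs.name.dir}/current/VERSION and ${dfs.data.dir}/current/VERSION.
# The paths and IDs here are illustrative, not from any real install.
mkdir -p /tmp/nn-demo/current /tmp/dn-demo/current
printf 'namespaceID=123456789\nlayoutVersion=-18\n' > /tmp/nn-demo/current/VERSION
printf 'namespaceID=987654321\nlayoutVersion=-18\n' > /tmp/dn-demo/current/VERSION

# Extract the namespaceID property from each file and compare.
nn_id=$(grep '^namespaceID=' /tmp/nn-demo/current/VERSION | cut -d= -f2)
dn_id=$(grep '^namespaceID=' /tmp/dn-demo/current/VERSION | cut -d= -f2)

if [ "$nn_id" != "$dn_id" ]; then
  echo "MISMATCH: NameNode=$nn_id DataNode=$dn_id"
fi
```

A mismatch between these two values is exactly the condition that triggers the error in your log.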

If a DataNode later connects to a NameNode whose declared namespace ID does not match the one the DataNode has stored, the DataNode refuses to start and logs the "Incompatible namespaceIDs" error. In other words, the DataNode is talking to a different NameNode, and the blocks it is storing do not belong to that distributed filesystem.

This usually means you've misplaced your NameNode metadata somehow. If you run multiple HDFS installations, the DataNode may simply be connecting to the wrong NameNode. If you only have a single installation, then either the NameNode is running with a different metadata directory, or the metadata was lost and the filesystem was freshly formatted (which should only happen by explicitly running hadoop namenode -format). Try to locate the correct NameNode metadata or restore it from a backup.
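If the correct metadata really is gone and the DataNode's blocks are disposable (e.g. a test cluster), the usual workaround is either to wipe the contents of dfs.data.dir so the DataNode re-registers with the new filesystem, or to overwrite the namespaceID in the DataNode's VERSION file with the NameNode's value. A minimal sketch of the latter, using a hypothetical /tmp path and a made-up ID (GNU sed syntax; on macOS use `sed -i ''`):

```shell
# Hypothetical DataNode storage dir standing in for ${dfs.data.dir}.
mkdir -p /tmp/dn-fix/current
printf 'namespaceID=987654321\nlayoutVersion=-18\n' > /tmp/dn-fix/current/VERSION

# In real life, read this value from the NameNode's current/VERSION.
NN_ID=123456789

# Rewrite the stale namespaceID in place so it matches the NameNode.
sed -i "s/^namespaceID=.*/namespaceID=$NN_ID/" /tmp/dn-fix/current/VERSION

grep '^namespaceID=' /tmp/dn-fix/current/VERSION
```

Only use the VERSION-edit trick when you understand why the IDs diverged; if the DataNode's blocks genuinely belong to another filesystem, forcing the ID to match will not make them valid data.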