I'm trying to understand better how Hadoop works, and I'm reading
The NameNode is a Single Point of Failure for the HDFS Cluster. HDFS is not currently a High Availability system. When the NameNode goes down, the file system goes offline. There is an optional SecondaryNameNode that can be hosted on a separate machine. It only creates checkpoints of the namespace by merging the edits file into the fsimage file and does not provide any real redundancy. Hadoop 0.21+ has a BackupNameNode that is part of a plan to have an HA name service, but it needs active contributions from the people who want it (i.e. you) to make it Highly Available.
from http://wiki.apache.org/hadoop/NameNode
So why is the NameNode a single point of failure? What is bad or difficult about having a complete duplicate of the NameNode running as well?
Best Answer
Why does the design of HDFS have a single name node? Simplicity. According to http://hadoop.apache.org/common/docs/r0.20.2/hdfs_design.html#NameNode+and+DataNodes:
You can have a secondary name node that can take over when the primary fails (see http://hadoop.apache.org/common/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html) and there are design proposals for distributed name nodes but, as far as I know, none are implemented reliably at this time.