Slaves get a connection timed out with HDFS

amazon-ec2, cluster, hadoop

I have 3 node instances:

master, slave1 and slave2

SSHing between these nodes works fine.

Here are the processes that start on each node when I run ./sbin/start-dfs.sh:

master:

SecondaryNameNode
Jps
NameNode
DataNode

slave1:

Jps
DataNode

slave2:

Jps
DataNode

But when I try to access HDFS from the slave nodes, I get a connection timed out.

Also, when I check hdfs dfsadmin -report, I only see one datanode (on the master's localhost) as part of HDFS.

Datanodes available: 1 (1 total, 0 dead)

Live datanodes:
Name: 127.0.0.1:50010 (localhost)
Hostname: master
Decommission Status : Normal
Configured Capacity: 8309932032 (7.74 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 2309738496 (2.15 GB)
DFS Remaining: 6000168960 (5.59 GB)
DFS Used%: 0.00%
DFS Remaining%: 72.20%

Here is my /etc/hosts file mapping on all three nodes:

127.0.0.1 localhost
<ip of master> master
<ip of slave1> slave1
<ip of slave2> slave2
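
(A quick way to sanity-check that these hostnames resolve the same way on every node is to run the following on master, slave1 and slave2; the hostnames are the ones from the mapping above.)

getent hosts master
getent hosts slave1
getent hosts slave2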

Here is my $HADOOP_HOME/etc/hadoop/slaves file on the master:

localhost
slave1
slave2
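
(Not shown above, but relevant: the datanodes find the namenode through fs.defaultFS in core-site.xml. A minimal sketch of that file for this layout, assuming the commonly used namenode RPC port 9000, would be the following; the actual value in a given cluster may differ.)

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>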

In short, the datanodes on the slaves are unable to connect to HDFS.

What am I doing wrong?

Best Answer

If you can't telnet to port 9000 on the master from the slaves (and the error is connection timed out), then it's most likely a firewall issue.
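
A quick sketch of that check, run from each slave (port 9000 here assumes the namenode RPC port configured in fs.defaultFS; adjust if yours differs):

telnet master 9000
# or, if telnet isn't installed:
nc -zv master 9000

If the connection hangs and then times out rather than being refused, something between the hosts is dropping the packets.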

Check that the EC2 instances are in the same EC2 security group, and that there are no active iptables firewall rules blocking the connections (you can check this by running iptables -vnxL).
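
If the instances are in the same security group but that group doesn't allow traffic from itself, a rule like the following opens TCP between the cluster members (a sketch; sg-xxxxxxxx is a placeholder for your actual group id):

aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 0-65535 --source-group sg-xxxxxxxx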