Hadoop – How to access Hadoop via the HDFS protocol from Java

hadoop, hdfs, ssh

I found a way to connect to Hadoop via HFTP, and it works fine (read-only):

uri = "hftp://172.16.xxx.xxx:50070/";

System.out.println( "uri: " + uri );           
Configuration conf = new Configuration();

FileSystem fs = FileSystem.get( URI.create( uri ), conf );
fs.printStatistics();
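
Reading over that connection works as well; a minimal sketch (the file path below is just a placeholder):

import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HftpReadExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hftp://172.16.xxx.xxx:50070/"), conf);

        // HFTP is read-only: open() works, but any write would fail
        InputStream in = fs.open(new Path("/user/someuser/sample.txt")); // placeholder path
        try {
            IOUtils.copyBytes(in, System.out, 4096, false); // stream the file to stdout
        } finally {
            IOUtils.closeStream(in);
            fs.close();
        }
    }
}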

However, I want to read and write, as well as copy files; that is, I want to connect over hdfs. How can I enable hdfs connections so that I can edit the actual remote filesystem?

I tried changing the protocol above from hftp to hdfs, but I got the following exception…

(Forgive my poor knowledge of URL protocols and Hadoop; I assume this is a somewhat strange question I'm asking, but any help would really be appreciated!)

Exception in thread "main" java.io.IOException: Call to /172.16.112.131:50070 failed on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1139)
    at org.apache.hadoop.ipc.Client.call(Client.java:1107)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
    at $Proxy0.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:111)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:213)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:180)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1514)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1548)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1530)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228)
    at sb.HadoopRemote.main(HadoopRemote.java:24)

Best Answer

Just add the core-site.xml and hdfs-site.xml of the Hadoop cluster you want to hit to the Configuration object, something like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.testng.annotations.Test;

/**
 * @author karan
 */
public class HadoopPushTester {

    @Test
    public void run() throws Exception {

        Configuration conf = new Configuration();

        // Point the client at the target cluster by loading its own config files
        conf.addResource(new Path("src/test/resources/HadoopConfs/core-site.xml"));
        conf.addResource(new Path("src/test/resources/HadoopConfs/hdfs-site.xml"));

        // The values of hosthdfs:port can be found in core-site.xml, in fs.default.name
        String dirName = "hdfs://hosthdfs:port/user/testJava";

        FileSystem fileSystem = FileSystem.get(conf);
        try {
            Path path = new Path(dirName);
            if (fileSystem.exists(path)) {
                System.out.println("Dir " + dirName + " already exists");
                return;
            }

            // Create the directory (and any missing parents)
            fileSystem.mkdirs(path);
        } finally {
            fileSystem.close();
        }
    }
}
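
A note on the original error: 50070 is the NameNode's HTTP port, which is what HFTP talks to. The hdfs:// scheme instead goes through the NameNode's RPC port, the address stored in fs.default.name (commonly 8020 or 9000), so pointing an hdfs:// URI at 50070 will typically fail with an EOFException like the one above. Loading core-site.xml, as in the code above, supplies the correct RPC address automatically.

Once the FileSystem is wired up this way, writing and copying go through the same API. A minimal sketch of the read/write/copy operations the question asks about (all file and directory names here are hypothetical):

import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HadoopReadWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("src/test/resources/HadoopConfs/core-site.xml"));
        conf.addResource(new Path("src/test/resources/HadoopConfs/hdfs-site.xml"));

        FileSystem fs = FileSystem.get(conf);
        try {
            // Write a brand-new file on the remote cluster
            OutputStream out = fs.create(new Path("/user/testJava/hello.txt"));
            try {
                out.write("hello hdfs".getBytes("UTF-8"));
            } finally {
                out.close();
            }

            // Copy a local file up to the cluster ...
            fs.copyFromLocalFile(new Path("/tmp/local.txt"),
                                 new Path("/user/testJava/local.txt"));

            // ... and copy a remote file back down
            fs.copyToLocalFile(new Path("/user/testJava/hello.txt"),
                               new Path("/tmp/hello.txt"));
        } finally {
            fs.close();
        }
    }
}

copyFromLocalFile and copyToLocalFile cover copying in both directions, while create returns an output stream for writing new files directly.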