Java – Apache Commons FTPClient Hanging

Tags: apache-commons-net, ftp, ftp-client, java

We are using the following Apache Commons Net FTP code to connect to an FTP server, poll some directories for files, and if files are found, to retrieve them to the local machine:

try {
logger.trace("Attempting to connect to server...");

// Connect to server
FTPClient ftpClient = new FTPClient();
ftpClient.setConnectTimeout(20000);
ftpClient.connect("my-server-host-name");
ftpClient.login("myUser", "myPswd");
ftpClient.changeWorkingDirectory("/loadables/");

// Check for failed connection
if(!FTPReply.isPositiveCompletion(ftpClient.getReplyCode()))
{
    ftpClient.disconnect();
    throw new FTPConnectionClosedException("Unable to connect to FTP server.");
}

// Log success msg
logger.trace("...connection was successful.");

// Change to the loadables/ directory where we poll for files
ftpClient.changeWorkingDirectory("/loadables/");    

// Indicate we're about to poll
logger.trace("About to check loadables/ for files...");

// Poll for files.
FTPFile[] filesList = ftpClient.listFiles();
for(FTPFile tmpFile : filesList)
{
    if(tmpFile.isDirectory())
        continue;

    FileOutputStream fileOut = new FileOutputStream(new File("tmp"));
    ftpClient.retrieveFile(tmpFile.getName(), fileOut);
    // ... Doing a bunch of things with output stream
    // to copy the contents of the file down to the local
    // machine. Omitted for brevity but I assure you this
    // works (except when the WAR decides to hang).
    //
    // This was used because FTPClient doesn't appear to GET
    // whole copies of the files, only FTPFiles which seem like
    // file metadata...
}

// Indicate file fetch completed.
logger.trace("File fetch completed.");

// Disconnect and finish.
if(ftpClient.isConnected())
    ftpClient.disconnect();

logger.trace("Poll completed.");
} catch(Throwable t) {
    logger.trace("Error: " + t.getMessage());
}

We have this scheduled to run every minute, on the minute. When deployed to Tomcat (7.0.19) this code loads up perfectly fine and begins working without a hitch. Every time though, at some point or another, it seems to just hang. By that I mean:

  • No heap dumps exist
  • Tomcat is still running (I can see its pid and can log into the web manager app)
  • Inside the manager app, I can see my WAR is still running/started
  • catalina.out and my application-specific log show no signs of any exceptions being thrown

So the JVM is still running, Tomcat is still running, and my deployed WAR is still running, but it's just hanging. Sometimes it runs for 2 hours and then hangs; other times it runs for days and then hangs. But when it does hang, it does so between the line that logs "About to check loadables/ for files..." (which I do see in the logs) and the line that logs "File fetch completed." (which I don't see).

This tells me the hang occurs during the actual polling/fetching of the files, which points me in the same direction as this question that I was able to find, which concerns FTPClient deadlocking. This has me wondering if these are the same issue (if they are, I'll happily delete this question!). However, I don't believe they're the same (I don't see the same exceptions in my logs).

A co-worker mentioned it might be a "Passive" vs. "Active" FTP thing. Not really knowing the difference, I am a little confused by the FTPClient fields ACTIVE_REMOTE_DATA_CONNECTION_MODE, PASSIVE_REMOTE_DATA_CONNECTION_MODE, etc., and I don't know whether SO thinks that could be the issue.

Since I'm catching Throwables as a last resort here, I would have expected to see something in the logs if something is going wrong. Ergo, I feel like this is a definite hang issue.
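One way a call can stall without anything reaching the catch block: a blocking socket read with no read timeout simply waits, and no exception is ever thrown. The sketch below uses only the JDK to show that behavior in isolation. It is not the code from this question, and whether it is the cause here is not established; the class name ReadTimeoutDemo, the silent local server, and the 200 ms value are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /**
     * Connects to a local server socket that never sends a byte, then
     * attempts one read with the given SO_TIMEOUT (milliseconds).
     * Returns true if the read was cut short by a SocketTimeoutException.
     */
    public static boolean readTimesOut(int soTimeoutMillis) throws IOException {
        try (ServerSocket silentServer =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(),
                                        silentServer.getLocalPort())) {
            // A timeout of 0 (the JDK default) would make the read below
            // wait indefinitely: no exception, nothing to log.
            client.setSoTimeout(soTimeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read();
                return false; // data or end-of-stream arrived
            } catch (SocketTimeoutException e) {
                return true;  // the read gave up after soTimeoutMillis
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

With a positive timeout the stalled read turns into an exception that a catch block can log; with the default of 0 the thread just sits in read().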

Any ideas? Unfortunately I don't know enough about FTP internals here to make a firm diagnosis. Could this be something server-side? Related to the FTP server?

Best Answer

This could be a number of things, but your co-worker's suggestion is worth trying.

Try ftpClient.enterLocalPassiveMode(); to see if it helps.
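For background: FTP opens a separate data connection for every directory listing and file transfer. In active mode the server connects back to the client, which firewalls and NAT devices often block; in passive mode the client sends PASV and the server answers with an address and port for the client to connect to. A minimal JDK-only sketch of how that reply is decoded per RFC 959 follows. The class name PasvReply and the sample reply string are illustrative and not part of Commons Net, which handles this internally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PasvReply {

    // Six comma-separated numbers: h1,h2,h3,h4,p1,p2
    private static final Pattern ADDRESS = Pattern.compile(
            "(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3})");

    public final String host;
    public final int port;

    private PasvReply(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Parses a "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)" reply. */
    public static PasvReply parse(String reply) {
        if (!reply.startsWith("227")) {
            throw new IllegalArgumentException("Not a PASV reply: " + reply);
        }
        Matcher m = ADDRESS.matcher(reply);
        if (!m.find()) {
            throw new IllegalArgumentException("No address in reply: " + reply);
        }
        String host = m.group(1) + "." + m.group(2) + "."
                + m.group(3) + "." + m.group(4);
        // The port is sent as two bytes, high byte first.
        int port = Integer.parseInt(m.group(5)) * 256
                + Integer.parseInt(m.group(6));
        return new PasvReply(host, port);
    }

    public static void main(String[] args) {
        PasvReply r = parse("227 Entering Passive Mode (192,168,1,10,19,137)");
        System.out.println(r.host + ":" + r.port); // prints 192.168.1.10:5001
    }
}
```

Because the client initiates the data connection in passive mode, a firewall on the client side is much less likely to drop it silently.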

I would also suggest putting the disconnect in a finally block so that a connection is never left open.
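Put together, the two suggestions might look roughly like the sketch below. It assumes Apache Commons Net 3.x on the classpath. The class name LoadablesPoller, the method signatures, the local target directory, and the 30-second timeout values are assumptions for illustration, and the timeouts go beyond what was suggested above: they are one possible way to make a stalled transfer fail with an exception instead of waiting forever.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPFile;
import org.apache.commons.net.ftp.FTPReply;

public class LoadablesPoller {

    /** Builds a client with bounded timeouts (30 s values are assumed). */
    public static FTPClient newClient() {
        FTPClient ftpClient = new FTPClient();
        ftpClient.setConnectTimeout(20000); // as in the question
        ftpClient.setDefaultTimeout(30000); // read timeout on the control socket
        ftpClient.setDataTimeout(30000);    // read timeout on data connections
        return ftpClient;
    }

    public static void poll(String host, String user, String password,
                            File localDir) throws IOException {
        FTPClient ftpClient = newClient();
        try {
            ftpClient.connect(host);
            // Check the reply to connect() before issuing other commands.
            if (!FTPReply.isPositiveCompletion(ftpClient.getReplyCode())) {
                throw new IOException("FTP server refused connection: "
                        + ftpClient.getReplyString());
            }
            if (!ftpClient.login(user, password)) {
                throw new IOException("FTP login failed");
            }
            // From here on, the client opens the data connections.
            ftpClient.enterLocalPassiveMode();
            if (!ftpClient.changeWorkingDirectory("/loadables/")) {
                throw new IOException("Cannot change to /loadables/");
            }
            for (FTPFile remote : ftpClient.listFiles()) {
                if (remote.isDirectory()) {
                    continue;
                }
                File target = new File(localDir, remote.getName());
                try (OutputStream out = new FileOutputStream(target)) {
                    if (!ftpClient.retrieveFile(remote.getName(), out)) {
                        throw new IOException("Failed to fetch " + remote.getName());
                    }
                }
            }
            ftpClient.logout();
        } finally {
            // Runs on success and on failure, so no connection is left open.
            if (ftpClient.isConnected()) {
                try {
                    ftpClient.disconnect();
                } catch (IOException ignored) {
                    // nothing useful to do if the disconnect itself fails
                }
            }
        }
    }
}
```

A scheduled job could call LoadablesPoller.poll("my-server-host-name", "myUser", "myPswd", new File("tmp-dir")) once per minute, where tmp-dir is a placeholder for an existing local directory.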
