Linux – fopen() failing to open a file on /tmp share


I have a C application that occasionally fails to open a file which is stored on a /tmp share.

Here is the relevant chunk of code:

  // open file and start parsing

  notStdin = strcmp(inFile, "-");
  if (notStdin) {
     coordsIn = fopen(inFile, "r");   // inFile = file that I want to open
     if (coordsIn == NULL) {
        fprintf(stderr, "ERROR: Could not open coordinates file: %s\n\t%s\n", inFile, strerror(errno));
        exit(EXIT_FAILURE);
     }
  }
  else
     coordsIn = stdin;

Roughly once in every eight to ten trials, fopen() returns a NULL FILE pointer. Here is an example error message:

ERROR: Could not open coordinates file: /tmp/coordinates.txt
       File or directory does not exist

However, the file /tmp/coordinates.txt does exist: I can open it with standard utilities such as head, cat, or more.

The permissions are the same on every /tmp/coordinates.txt trial file, whether the open succeeds or fails.

Here is the result from uname -a:

$ uname -a
Linux hostname 2.6.18-128.2.1.el5 #1 SMP Wed Jul 8 11:54:47 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

If I use a different inFile that is stored in a different, non-/tmp share, then I do not observe this symptom.

Is there anything that would cause fopen() to fail on a file stored in the /tmp share? Are there other troubleshooting steps I can pursue?

Best Answer

Too Many Open Files?
Is your program opening lots of files? Perhaps you are running out of file descriptors. If so, the limit can be raised in your program, in the shell, or in the OS. To see how many descriptors your program is using:

sudo lsof -p <PID> | wc -l

On my Ubuntu system, the shell limit is 1024, including stdout, stderr, and stdin. This is set in /etc/security/limits.conf. The following little program shows this:

#include <stdio.h>

int count=0;

int main( void ) {
    while(1) {
        FILE *fd = fopen("foo", "r");
        if ( fd == NULL) {
            printf("%i\n", count);
            return(1);
        }
        count++;
    }
    return(0);
}

When I run it, it prints "1021" (1024 minus the three standard streams) with an exit status of 1.
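
If the limit does turn out to be the problem, a program can read its own limit and raise it with getrlimit() and setrlimit(). A minimal sketch follows; the two function names are invented for illustration, and raising the soft limit only works up to whatever hard limit the system allows.

```c
#include <sys/resource.h>

/* Hypothetical helper: returns the soft limit on open descriptors
   for this process, or -1 on error. */
long soft_fd_limit(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;
    return (long)lim.rlim_cur;
}

/* Hypothetical helper: raises the soft limit to the hard limit.
   An unprivileged process may raise its soft limit up to the hard limit.
   Returns 0 on success, -1 on error. */
int raise_fd_limit_to_hard(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &lim);
}
```

Raising the limit only hides a descriptor leak, so it is worth checking first that every fopen() has a matching fclose().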

Check For System Errors:
More generally, you can always check the output of dmesg or /var/log/messages for any errors.

Watch the file, see if something else is messing with it:
Perhaps the file really doesn't exist at the moment of the fopen() call because something is deleting it out from under you. You might want to use inotify to watch all events on the file, or tools that use inotify, such as incron or inotify-tools.
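
For reference, here is a rough sketch of watching with the inotify API directly. The function names watch_directory() and print_events() are made up for this example. It watches the containing directory rather than the file itself, on the assumption that the file may be deleted and recreated, in which case a watch on the file would not follow the new copy. It uses inotify_init() rather than the newer inotify_init1() so that it should also build on older kernels.

```c
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: starts watching a directory for files being
   created, deleted, or moved. Returns an inotify descriptor, or -1. */
int watch_directory(const char *dir)
{
    int fd = inotify_init();
    if (fd < 0)
        return -1;
    if (inotify_add_watch(fd, dir,
            IN_CREATE | IN_DELETE | IN_MOVED_FROM | IN_MOVED_TO) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: blocks until at least one event is available,
   then prints each event in the batch. Returns the number of events
   printed, or -1 on a read error. */
int print_events(int fd)
{
    /* alignment attribute is a GCC extension, as in the inotify man page */
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    ssize_t len = read(fd, buf, sizeof buf);
    if (len <= 0)
        return -1;

    int n = 0;
    for (char *p = buf; p < buf + len; ) {
        const struct inotify_event *ev = (const struct inotify_event *)p;
        const char *what = (ev->mask & IN_CREATE)     ? "created"
                         : (ev->mask & IN_DELETE)     ? "deleted"
                         : (ev->mask & IN_MOVED_FROM) ? "moved away"
                         : (ev->mask & IN_MOVED_TO)   ? "moved in"
                         : "other";
        printf("%s: %s\n", ev->len ? ev->name : "(watched dir)", what);
        p += sizeof(struct inotify_event) + ev->len;  /* next record */
        n++;
    }
    return n;
}
```

Calling watch_directory("/tmp") once and then print_events() in a loop from a separate process would show whether coordinates.txt is being removed or replaced around the time the fopen() fails.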
