Linux – Reading from FIFO in C: select() doesn’t return

clinuxselect

I've got a C program I'm writing. Here's what it does:

  • Create n fifos using mkfifo
  • Open them for read (with the O_NONBLOCK flag set)
  • Open them for write
  • Spawn a thread

In the thread, run in a loop:

  • Create an fd_set of the file descriptors for all n fifos
  • Call select(n, &my_set, NULL, NULL, NULL)
  • For each fd ready for I/O (FD_ISSET(fd, &my_set)):
    • Read a string from the fd (read(fd, buf, buf_len))
    • Print the string
    • If string == "kill", mark the fd as dead and remove it from the list (n–)
    • If n == 0, terminate the thread

In the main program:

  • For i = 0 to n
    • Write to fds[i] with a string (write(fds[i], buf, buf_len))
  • For i = 0 to n
    • Write to fds[i] with the string "kill"
  • Join on the thread I created
  • Exit

The behavior I'm seeing is that select() will return once with a length of 1, being the first fd in the list. The second time through the loop, select will just sit there forever.

Here's my output:

thread created
Waiting on 4 file descriptors
> Wrote 'Hello to target 0 from writer 0' to 0
> Wrote 'Hello to target 0 from writer 1' to 1
> Wrote 'Hello to target 1 from writer 0' to 2
> Wrote 'Hello to target 1 from writer 1' to 3
> Sending kill to 0:0 (#0)
> Sending kill to 0:1 (#1)
> Sending kill to 1:0 (#2)
> Sending kill to 1:1 (#3)
< Got string: 'Hello to target 0 from writer 0'
Waiting on 4 file descriptors
^C

The OS is Linux, in case it matters.

Link to the code: https://dl.getdropbox.com/u/188590/fifotest.c
(Sorry it's a bit heinous)

Thanks,
Nathan

Best Answer

As Lance Richardson said, the first problem is that you need to pass the number of the maximum file descriptor plus one, not the number of file descriptors.

You then have to clean up the housekeeping in the listener thread - I got most of the data, but ended up listening to 6 file descriptors, not 4. (The reported number was now the largest fd, not the number of file descriptors.)

You also have a problem that you write a string plus a null byte to each pipe, then a second string plus a null byte. Since the scheduling is non-deterministic, the main program actually gets to write both its strings to each fifo, so when the listener thread gets to read it, it reads both strings. When I printed out lengths, I got a total length of 41 read (read_len), but the length of the string per strlen() was 31. In other words, the first read included the 'kill' part, but you didn't notice in the printout because of the trailing null at the end of the first message. Hence you were waiting for that which will never happen.