Unix – Seeking a simple description regarding ‘file descriptor’ after fork()


This is from "Advanced Programming in the UNIX Environment", 2nd edition, by W. Richard Stevens, Section 8.3 (the fork function).

Here's the description:

It is important that the parent and the child share the same file offset.

Consider a process that forks a child, then waits for the child to complete. Assume that both processes write to standard output as part of their normal processing. If the parent has its standard output redirected (by a shell, perhaps) it is essential that the parent's file offset be updated by the child when the child writes to standard output.

My responses:

{1} What does this mean? If the parent's standard output is redirected to 'file1', for example, what should the child update after it writes: the parent's original standard output offset, or the offset of the redirected output (i.e., file1)? It can't be the latter, right?

{2} How is the update done? By the child explicitly, by the OS implicitly, or by the file descriptor itself? After fork, I thought the parent and child went their own ways and each had its own COPY of the file descriptor. So how does the child update the offset on the parent's side?

In this case, the child can write to standard output while the parent is waiting for it; on completion of the child, the parent can continue writing to standard output, knowing that its output will be appended to whatever the child wrote. If the parent and the child did not share the same file offset, this type of interaction would be more difficult to accomplish and would require explicit actions by the parent.

If both parent and child write to the same descriptor, without any form of synchronization, such as having the parent wait for the child, their output will be intermixed (assuming it's a descriptor that was open before the fork). Although this is possible, it's not the normal mode of operation.

There are two normal cases for handling the descriptors after a fork.

The parent waits for the child to complete. In this case, the parent does not need to do anything with its descriptors. When the child terminates, any of the shared descriptors that the child read from or wrote to will have their file offsets updated accordingly.

Both the parent and the child go their own ways. Here, after the fork, the parent closes the descriptors that it doesn't need, and the child does the same thing. This way, neither interferes with the other's open descriptors. This scenario is often the case with network servers.

My response:

{3} When fork() is invoked, all I understand is that the child gets a COPY of what the parent has (the file descriptor, in this case) and does its thing. If the offset changes for a file descriptor that the parent and child share, it can only be because the descriptor itself remembers the offset. Am I right?

I am kind of new to the concepts.

Best Answer

It's important to distinguish between the file descriptor, which is a small integer that the process uses in its read and write calls to identify the file, and the file description, which is a structure in the kernel. The file offset is part of the file description. It lives in the kernel.
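The distinction can be seen without fork() at all. The following is a minimal sketch, not part of the original answer: the function name compare_offsets and the path passed to it are hypothetical. It writes through one descriptor and then asks for the current offset through a dup() of it (same file description) and through a second open() of the same path (a separate file description).

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes 6 bytes through fd, then reports the current
 * offset seen through
 *   (a) a dup() of fd, which shares fd's file description, and
 *   (b) a second open() of the same path, which gets its own description.
 * Returns 0 on success, -1 on any error.
 */
int compare_offsets(const char *path, off_t *dup_offset, off_t *reopen_offset)
{
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0666);
    if (fd < 0)
        return -1;

    int dup_fd = dup(fd);                  /* new descriptor, same description */
    int reopen_fd = open(path, O_WRONLY);  /* new descriptor, new description */
    int rc = -1;

    if (dup_fd >= 0 && reopen_fd >= 0 && write(fd, "hello ", 6) == 6) {
        *dup_offset = lseek(dup_fd, 0, SEEK_CUR);
        *reopen_offset = lseek(reopen_fd, 0, SEEK_CUR);
        rc = 0;
    }

    if (dup_fd >= 0)
        close(dup_fd);
    if (reopen_fd >= 0)
        close(reopen_fd);
    close(fd);
    return rc;
}
```

After a successful call, the offset reported through the dup()'d descriptor should be 6, because it reads the same kernel structure the write advanced, while the separately opened descriptor should still report 0.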

As an example, let's use this program:

#include <unistd.h>
#include <fcntl.h>
#include <sys/wait.h>

int main(void)
{
    int fd;

    fd = open("output", O_CREAT|O_TRUNC|O_WRONLY, 0666);

    if(!fork()) {
        /* child */
        write(fd, "hello ", 6);
        _exit(0);
    } else {
        /* parent */
        int status;

        wait(&status);
        write(fd, "world\n", 6);
    }
}

(All error checking has been omitted)

If we compile this program, call it hello, and run it like this:

./hello

here's what happens:

The program opens the output file, creating it if it didn't exist already or truncating it to zero size if it did exist. The kernel creates a file description (in the Linux kernel this is a struct file) and associates it with a file descriptor for the calling process (the lowest non-negative integer not already in use in that process's file descriptor table). The file descriptor is returned and assigned to fd in the program. For the sake of argument suppose that fd is 3.

The program does a fork(). The new child process gets a copy of its parent's file descriptor table, but the file description is not copied. Entry number 3 in both processes' file tables points to the same struct file.

The parent process waits while the child process writes. The child's write causes the first half of "hello world\n" to be stored in the file, and advances the file offset by 6. The file offset is in the struct file!

The child exits, the parent's wait() finishes, and the parent writes, using fd 3 which is still associated with the same file description that had its file offset updated by the child's write(). So the second half of the message is stored after the first part, not overwriting it as it would have done if the parent had a file offset of zero, which would be the case if the file description was not shared.

Finally the parent exits, and the kernel sees that the struct file is no longer in use and frees it.
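To check the steps above directly, the parent can ask the kernel for its current offset after the child has exited. This is a small sketch under the same assumptions as the program above; the function name parent_offset_after_child_write is made up for illustration. The parent never writes, yet lseek() on its own descriptor reports the position left behind by the child.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: opens path, forks, lets the child write 6 bytes, and
 * returns the file offset the parent observes on its own descriptor
 * afterwards. Returns (off_t)-1 on error.
 */
off_t parent_offset_after_child_write(const char *path)
{
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0666);
    if (fd < 0)
        return (off_t)-1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return (off_t)-1;
    }
    if (pid == 0) {
        /* child: this write advances the offset in the shared description */
        ssize_t n = write(fd, "hello ", 6);
        _exit(n == 6 ? 0 : 1);
    }

    /* parent: wait for the child, then query the offset without writing */
    int status;
    off_t offset = (off_t)-1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        offset = lseek(fd, 0, SEEK_CUR);

    close(fd);
    return offset;
}
```

If the file description were copied rather than shared, the parent would see 0 here; with a shared description it sees 6, which is why its later write lands after "hello ".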

Related Solutions

Sqlite – Improve INSERT-per-second performance of SQLite

Several tips:

Put inserts/updates in a transaction.
For older versions of SQLite - Consider a less paranoid journal mode (pragma journal_mode). There is NORMAL, and then there is OFF, which can significantly increase insert speed if you're not too worried about the database possibly getting corrupted if the OS crashes. If your application crashes the data should be fine. Note that in newer versions, the OFF/MEMORY settings are not safe for application level crashes.
Playing with page sizes makes a difference as well (PRAGMA page_size). Having larger page sizes can make reads and writes go a bit faster as larger pages are held in memory. Note that more memory will be used for your database.
If you have indices, consider calling CREATE INDEX after doing all your inserts. This is significantly faster than creating the index and then doing your inserts.
You have to be quite careful if you have concurrent access to SQLite, as the whole database is locked when writes are done, and although multiple readers are possible, writes will be locked out. This has been improved somewhat with the addition of a WAL in newer SQLite versions.
Take advantage of saving space...smaller databases go faster. For instance, if you have key value pairs, try making the key an INTEGER PRIMARY KEY if possible, which will replace the implied unique row number column in the table.
If you are using multiple threads, you can try using the shared page cache, which will allow loaded pages to be shared between threads, which can avoid expensive I/O calls.
Don't use !feof(file)!

I've also asked similar questions here and here.

Linux – printf anomaly after “fork()”

I note that <system.h> is a non-standard header; I replaced it with <unistd.h> and the code compiled cleanly.

When the output of your program is going to a terminal (screen), it is line buffered. When the output of your program goes to a pipe, it is fully buffered. You can control the buffering mode by the Standard C function setvbuf() and the _IOFBF (full buffering), _IOLBF (line buffering) and _IONBF (no buffering) modes.
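The effect of the buffering mode can be observed on an ordinary file as well. The sketch below is an illustration, not code from the original answer: bytes_visible_before_flush and the temporary file name are hypothetical. It sets a mode with setvbuf(), writes one short line, and checks how many bytes have reached the file before any flush.

```c
#define _POSIX_C_SOURCE 200809L  /* for mkstemp() and fdopen() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes "hello\n" to a fresh temporary file through a
 * stdio stream using the given buffering mode (_IOFBF, _IOLBF or _IONBF) and
 * returns how many bytes have reached the file BEFORE the stream is flushed
 * or closed. Returns -1 on error.
 */
long bytes_visible_before_flush(int mode)
{
    char path[] = "/tmp/setvbuf_demo_XXXXXX";  /* template filled by mkstemp */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    FILE *fp = fdopen(fd, "w");
    if (fp == NULL) {
        close(fd);
        unlink(path);
        return -1;
    }

    long visible = -1;
    struct stat st;
    /* setvbuf() has to be called before any other operation on the stream */
    if (setvbuf(fp, NULL, mode, BUFSIZ) == 0 &&
        fputs("hello\n", fp) != EOF &&
        fstat(fd, &st) == 0)
        visible = (long)st.st_size;

    fclose(fp);   /* flushes whatever is still buffered and closes fd */
    unlink(path);
    return visible;
}
```

With _IOFBF the six bytes should still be sitting in the stdio buffer, so the function returns 0; with _IONBF they are written immediately and it returns 6. That buffered, not-yet-written data is exactly what fork() duplicates.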

You could demonstrate this in your revised program by piping the output of your program to, say, cat. Even with the newlines at the end of the printf() strings, you would see the double information. If you send it direct to the terminal, then you will see just the one lot of information.

The moral of the story is to be careful to call fflush(0); to empty all I/O buffers before forking.
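A small sketch of that moral, using a fully buffered stream on a temporary file instead of a pipe so the result can be counted. The function name count_hello_lines and the file name are assumptions made for this example. fflush(NULL) is the same call as the fflush(0) mentioned above.

```c
#define _POSIX_C_SOURCE 200809L  /* for mkstemp() and fdopen() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: buffers one "Hello" line, optionally flushes, forks,
 * and returns how many lines end up in the file once both processes have
 * finished. Returns -1 on error.
 */
int count_hello_lines(int flush_before_fork)
{
    char path[] = "/tmp/fork_flush_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    FILE *out = fdopen(fd, "w");
    if (out == NULL) {
        close(fd);
        unlink(path);
        return -1;
    }
    setvbuf(out, NULL, _IOFBF, BUFSIZ);  /* force full buffering, as for a pipe */

    fputs("Hello\n", out);               /* stays in the stdio buffer for now */
    if (flush_before_fork)
        fflush(NULL);                    /* empty all output buffers first */

    pid_t pid = fork();
    if (pid < 0) {
        fclose(out);
        unlink(path);
        return -1;
    }
    if (pid == 0)
        exit(0);                         /* normal exit flushes the child's copy */

    waitpid(pid, NULL, 0);
    fclose(out);                         /* flushes the parent's copy, if any */

    /* count the lines that actually reached the file */
    int lines = -1;
    FILE *in = fopen(path, "r");
    if (in != NULL) {
        int c;
        lines = 0;
        while ((c = fgetc(in)) != EOF)
            if (c == '\n')
                lines++;
        fclose(in);
    }
    unlink(path);
    return lines;
}
```

Without the flush, both processes hold a copy of the buffered line and each writes it out, giving two lines; with the flush before fork(), the buffer is already empty when it is duplicated and only one line appears.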

Line-by-line analysis, as requested (braces etc removed - and leading spaces removed by markup editor):

printf( "Hello, my pid is %d", getpid() );
pid = fork();
if( pid == 0 )
printf( "\nI was forked! :D" );
sleep( 3 );
else
waitpid( pid, NULL, 0 );
printf( "\n%d was forked!", pid );

The analysis:

1. Copies "Hello, my pid is 1234" into the buffer for standard output. Because there is no newline at the end and the output is running in line-buffered mode (or full-buffered mode), nothing appears on the terminal.
2. Gives us two separate processes, with exactly the same material in the stdout buffer.
3. The child has pid == 0 and executes lines 4 and 5; the parent has a non-zero value for pid (one of the few differences between the two processes - return values from getpid() and getppid() are two more).
4. Adds a newline and "I was forked! :D" to the output buffer of the child. The first line of output appears on the terminal; the rest is held in the buffer since the output is line buffered.
5. Everything halts for 3 seconds. After this, the child exits normally through the return at the end of main. At that point, the residual data in the stdout buffer is flushed. This leaves the output position at the end of a line since there is no newline.
6. The parent comes here.
7. The parent waits for the child to finish dying.
8. The parent adds a newline and "1345 was forked!" to the output buffer. The newline flushes the 'Hello' message to the output, after the incomplete line generated by the child.

The parent now exits normally through the return at the end of main, and the residual data is flushed; since there still isn't a newline at the end, the cursor position is after the exclamation mark, and the shell prompt appears on the same line.

What I see is:

Osiris-2 JL: ./xx
Hello, my pid is 37290
I was forked! :DHello, my pid is 37290
37291 was forked!Osiris-2 JL: 
Osiris-2 JL:

The PID numbers are different - but the overall appearance is clear. Adding newlines to the end of the printf() statements (which becomes standard practice very quickly) alters the output a lot:

#include <stdio.h>
#include <sys/wait.h>   /* waitpid() */
#include <unistd.h>

int main()
{
    int pid;
    printf( "Hello, my pid is %d\n", getpid() );

    pid = fork();
    if( pid == 0 )
        printf( "I was forked! :D %d\n", getpid() );
    else
    {
        waitpid( pid, NULL, 0 );
        printf( "%d was forked!\n", pid );
    }
    return 0;
}

I now get:

Osiris-2 JL: ./xx
Hello, my pid is 37589
I was forked! :D 37590
37590 was forked!
Osiris-2 JL: ./xx | cat
Hello, my pid is 37594
I was forked! :D 37596
Hello, my pid is 37594
37596 was forked!
Osiris-2 JL:

Notice that when the output goes to the terminal, it is line-buffered, so the 'Hello' line appears before the fork() and there was just the one copy. When the output is piped to cat, it is fully-buffered, so nothing appears before the fork() and both processes have the 'Hello' line in the buffer to be flushed.
