C – Proper Way of Handling EINTR in Libraries

c, libraries, posix, signals

What is the recommended etiquette when it comes to EINTR in libraries?

I'm currently writing a function that does some file system tasks with the POSIX API, but a lot of the calls I use can potentially return EINTR. Additionally, the function can block under some circumstances. (For those interested, it implements a locking mechanism.)

In the interests of making this as general as possible, I would like to know what is the proper way to deal with an interrupted system call.

  • From most sources I've read, people typically retry the call and continue on with their business. However, I'm not sure that's the right thing to do here, since there may be legitimate reasons to interrupt my function given that it can take a significant amount of time. Furthermore, it means the EINTR would simply get swallowed by the function and the caller would lose any indication that it occurred.

  • My current strategy is to immediately abort the operation if I receive EINTR and notify the caller about it. This way, the caller can decide if they want to retry my function, right? (Or perhaps my understanding of signals is flawed?)

Best Answer

Always leave the decision of how to handle EINTR to the user, and make it easy to resume the operation as appropriate.

Usually the best way to do that is to return from your library function to the caller upon EINTR, but in some cases a callback or some other implementation might be better. Which approach is best depends on other factors, but always let the user control retry and resume.

This means that if your library code can partially succeed before getting an EINTR, then you should think carefully about what the user might need to know about that partial success, and whether the user might need to resume the operation from where it failed. You might need to return additional information, or provide an interface for resuming from each point where resuming makes sense.

This is why system calls like read and write nowadays return partial success - because it is very frustrating as a user to be told:

You tried to write "foo" and we successfully wrote either nothing, or "f", or "fo", and you don't get to know which. Have fun! Hope your system can handle restarting the whole write after any of those!

Of course, in some cases, we should write systems to handle situations exactly like that - for example, perhaps after a partial write you always recreate the file, or reopen the network connection, or you use some byte to mean "starting over" - so it depends on what use cases your library targets.
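As a rough sketch of what reporting partial success could look like in a library, the function below always tells the caller how many bytes were written, even when it fails. The name `write_some` and its `written` out-parameter are hypothetical, chosen only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: tries to write len bytes from buf to fd.
 * *written always holds the number of bytes actually written.
 * Returns 0 on full success, or -1 with errno set (possibly EINTR). */
int write_some(int fd, const void *buf, size_t len, size_t *written)
{
    const char *p = buf;

    *written = 0;
    while (*written < len) {
        ssize_t n = write(fd, p + *written, len - *written);
        if (n < 0)
            return -1;  /* errno, including EINTR, is passed through untouched */
        *written += (size_t)n;
    }
    return 0;
}
```

On EINTR the caller can inspect `*written` and, if it chooses to continue, call the function again with `buf + written` and `len - written`.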

If a library function does several operations, and there is no way to know at which of them it failed, and those operations are not all safely and efficiently idempotent, that basically makes a library unusable for code that needs to be robust.

If all steps in a library function are safely and efficiently idempotent, or the whole thing is atomic - like acquiring a lock - then just letting the user know that an EINTR happened is enough.
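For the atomic case, a minimal sketch might look like the following. It assumes the lock is a POSIX advisory record lock taken with `fcntl` and `F_SETLKW`, which is only a stand-in since the question does not show its actual locking mechanism. The name `mylib_lock` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical library function: blocks until an exclusive lock on the
 * whole file is held. Returns 0 on success, or -1 with errno set.
 * EINTR is deliberately not retried here; the caller decides what to do. */
int mylib_lock(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;     /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "through end of file" */

    return fcntl(fd, F_SETLKW, &fl);
}
```

Because the lock is either acquired or not, a return of -1 with `errno == EINTR` tells the caller everything it needs: nothing was acquired, and calling `mylib_lock` again is safe.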

Also, if we retry on EINTR, then we might break signal handling. At the low level, signal handlers can only safely use a limited set of features, and so in many cases a signal handler will just set a boolean indicating that the signal was received, and then return, expecting that when the code resumes, it will exit out of whatever it was doing. If we get an EINTR and then we retry instead of returning control to the user, we might be keeping the code from doing that.
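The flag pattern described above can be sketched from the application's side as follows. All names here (`stop_requested`, `install_handler`, `read_unless_stopped`) are hypothetical. The handler is installed without `SA_RESTART` so that blocking calls fail with EINTR, and the retry policy lives in application code rather than in the library.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Set by the handler; sig_atomic_t is safe to write from a signal handler. */
static volatile sig_atomic_t stop_requested = 0;

static void on_signal(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Installs the handler without SA_RESTART. Returns 0 or -1 like sigaction. */
int install_handler(int signo)
{
    struct sigaction sa;

    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}

/* Application-side policy: retry on EINTR unless a stop was requested. */
ssize_t read_unless_stopped(int fd, void *buf, size_t len)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0 || errno != EINTR)
            return n;
        if (stop_requested)
            return -1;  /* errno is still EINTR */
    }
}
```

If a library retried `read` internally, the check of `stop_requested` would never run and the program could stay blocked after the signal.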

What to do after an EINTR is a whole program decision - the right answer cannot be known without knowing what the program is doing and how the program is meant to respond to a signal, and it has effects on the rest of the program.

Knowing how or if the user might need to resume, and helping the user do so if it is needed, is a library responsibility - the right answer cannot be known without knowing what the library is doing.
