Linux – Do system calls in x86_64 Linux still generate interrupts?

interrupts linux

On older x86 Linux systems, system calls always generated an interrupt during their execution. A program placed the system call number in %eax and the parameters in %ebx, %ecx and so on, then issued the software interrupt int 0x80. Thus, system calls could be said to be a common cause of software interrupts on a system.

However, x86_64 has a dedicated system call instruction, "syscall", which bypasses interrupt 0x80 and therefore the interrupt descriptor table altogether. While I believe the older interrupt-based method is still supported, the syscall instruction seems to be the way it's done in practice.

Thus, my question is: Is it no longer correct to say that system calls generate interrupts? Would a system call still increment the number seen in the "interrupts" column output of vmstat, for example?

Best Answer

Yes, modern C code for Linux x86_64 uses the syscall instruction; see for example glibc sysdeps/unix/sysv/linux/x86_64/syscall.S. No, this does not mean system call interrupts go away: the int 0x80 entry point is kept for compatibility.
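To see the glibc path mentioned above from user space, here is a minimal sketch that calls getpid through libc's generic syscall() wrapper using ctypes. It assumes x86_64 Linux, where getpid is system call number 39; the helper name raw_getpid is made up for illustration, and other architectures use different numbers.

```python
import ctypes
import os
import platform

# Assumption: x86_64 Linux, where getpid is system call number 39.
# Other architectures use different numbers.
SYS_GETPID_X86_64 = 39


def raw_getpid() -> int:
    """Call getpid through libc's generic syscall() wrapper."""
    if platform.system() != "Linux" or platform.machine() != "x86_64":
        raise RuntimeError("this sketch assumes x86_64 Linux")
    # CDLL(None) gives access to symbols already loaded into the
    # process, which includes libc's syscall().
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    result = libc.syscall(ctypes.c_long(SYS_GETPID_X86_64))
    if result < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return result


if __name__ == "__main__":
    # Both numbers should be the same PID.
    print(raw_getpid(), os.getpid())
```

Running this under strace should show an ordinary getpid call; nothing in the program asks for int 0x80.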

https://www.kernel.org/doc/Documentation/x86/entry_64.txt

The x86 architecture has quite a few different ways to jump into kernel code. Most of these entry points are registered in arch/x86/kernel/traps.c and implemented in arch/x86/entry/entry_64.S for 64-bit, arch/x86/entry/entry_32.S for 32-bit and finally arch/x86/entry/entry_64_compat.S which implements the 32-bit compatibility syscall entry points and thus provides for 32-bit processes the ability to execute syscalls when running on 64-bit kernels.

The IDT vector assignments are listed in arch/x86/include/asm/irq_vectors.h.

Some of these entries are:

system_call: syscall instruction from 64-bit code.

entry_INT80_compat: int 0x80 from 32-bit or 64-bit code; compat syscall either way.

entry_INT80_compat, ia32_sysenter: syscall and sysenter from 32-bit code

And for some read-only system calls (such as gettimeofday) there is the vDSO, which can usually answer without entering kernel mode at all.
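The vDSO is an ordinary mapping in every process, so you can check that it is there. A small sketch that looks for the [vdso] entry in /proc/self/maps; the function name find_vdso is hypothetical, and the code assumes the usual Linux maps format with the mapping name in the last column.

```python
from typing import Optional, Tuple


def find_vdso(maps_text: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the [vdso] mapping, or None if absent."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[vdso]":
            # The first column looks like "7ffd3a5f2000-7ffd3a5f4000".
            start, end = fields[0].split("-")
            return int(start, 16), int(end, 16)
    return None


if __name__ == "__main__":
    with open("/proc/self/maps") as maps:
        print(find_vdso(maps.read()))
```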

System calls can be profiled in a few ways, such as with ftrace or eBPF. Besides int 0x80 being obsolete for 64-bit code, interrupts happen for many reasons other than system calls, so an interrupt count says little about system call activity.
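If you want to test the vmstat part of the question yourself, one rough approach is to read the system-wide interrupt total before and after a burst of system calls. The sketch below reads the first number on the "intr" line of /proc/stat, which proc(5) describes as the total of all interrupts serviced since boot; as far as I know vmstat derives its interrupt figure from the same counter, but treat that as an assumption. The function names are made up for illustration.

```python
import os


def parse_intr_total(stat_text: str) -> int:
    """Return the first number on the 'intr' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "intr":
            return int(fields[1])
    raise ValueError("no 'intr' line found")


def read_intr_total(path: str = "/proc/stat") -> int:
    """Read the current system-wide interrupt total."""
    with open(path) as stat:
        return parse_intr_total(stat.read())


if __name__ == "__main__":
    before = read_intr_total()
    for _ in range(100_000):
        os.getppid()  # one cheap system call per iteration
    after = read_intr_total()
    print("interrupts serviced meanwhile:", after - before)
```

The counter is system-wide, so the difference also includes timer and device interrupts that fired during the loop. Compare it against an idle interval of the same length before drawing conclusions.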

Related Solutions

Linux – Monitor system CPU / system calls in Linux

I think strace with the -c flag is probably the closest I know of. If you haven't used the -c flag, try this:

$  sudo strace -c -p 12345

Where 12345 is the Process ID (PID) of the process in question. Note that stracing a process does add additional overhead, so while you're tracing it, the process will run slower.

After running that for however long you want to collect data, press Ctrl-C to stop your data collection and output the results. It'll produce something like this:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 31.88    0.001738         145        12           futex
 16.79    0.000915          11        80           tgkill
 12.36    0.000674          34        20           read
  9.76    0.000532         266         2           statfs
  8.42    0.000459          13        35           time
  4.38    0.000239           6        40           gettimeofday
  3.65    0.000199           4        48           sigprocmask
  2.94    0.000160          18         9           open
  2.88    0.000157          12        13           stat64
  1.32    0.000072           9         8           munmap
  0.90    0.000049           6         8           mmap2
  0.88    0.000048           3        14         7 sigreturn
  0.79    0.000043           5         9           close
  0.77    0.000042           4        10           rt_sigprocmask
  0.64    0.000035           3        12           setitimer
  0.55    0.000030           5         6         6 rt_sigsuspend
  0.53    0.000029           4         8           fstat64
  0.29    0.000016           8         2           setresuid32
  0.13    0.000007           4         2           _llseek
  0.09    0.000005           3         2           prctl
  0.04    0.000002           2         1           geteuid32
------ ----------- ----------- --------- --------- ----------------
100.00    0.005451                   341        13 total

As you can see, this is a breakdown of all system calls made by the application, sorted by total time, and including the average time per call and the number of calls for each syscall. If you want to sort them differently, see the man page for strace, as there are a couple of options.
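If you would rather post-process the summary yourself, the table is easy to parse. A minimal sketch, assuming the column layout shown above and that the summary was saved to a file (for example with strace's -o option; the file name summary.txt is hypothetical). The function name parse_strace_summary is also made up for illustration.

```python
from typing import Dict, List


def parse_strace_summary(text: str) -> List[Dict[str, object]]:
    """Parse the table printed by 'strace -c' into one dict per syscall."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 5 fields (no errors) or 6 fields (with errors).
        if len(fields) not in (5, 6) or fields[-1] == "total":
            continue
        try:
            seconds = float(fields[1])
            calls = int(fields[3])
        except ValueError:
            continue  # separator line made of dashes
        errors = int(fields[4]) if len(fields) == 6 else 0
        rows.append({
            "syscall": fields[-1],
            "seconds": seconds,
            "calls": calls,
            "errors": errors,
        })
    return rows


if __name__ == "__main__":
    with open("summary.txt") as summary:  # hypothetical file name
        rows = parse_strace_summary(summary.read())
    # Re-sort by number of calls instead of total time.
    for row in sorted(rows, key=lambda r: r["calls"], reverse=True):
        print(row["calls"], row["errors"], row["syscall"])
```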

Linux – Measuring system calls per second in Linux

You could look into writing a SystemTap script. There is even an example script that could be modified to meet your needs.
