How to set a breakpoint in GDB for an open(2) syscall returning -1

Tags: c, gdb, systemtap

OS: GNU/Linux
Distro: OpenSuSe 13.1
Arch: x86-64
GDB version: 7.6.50.20130731-cvs
Program language: mostly C with minor bits of assembly

Imagine that I have a rather big program that sometimes fails to open a file. Is it possible to set a breakpoint in GDB in such a way that it stops after the open(2) syscall returns -1?

Of course, I can grep through the source code, find all open(2) invocations and narrow down the faulting open() call, but maybe there's a better way.

I tried to use "catch syscall open" and then "condition N if $rax==-1", but the condition was never met.
BTW, is it possible to distinguish between a call to a syscall (e.g. open(2)) and a return from that syscall in GDB?

As a current workaround I do the following:

  1. Run the program in question under GDB
  2. From another terminal, launch a systemtap script:

    stap -g -v -e 'probe process("PATH to the program run under GDB").syscall.return { if( $syscall == 2 && $return <0) raise(%{ SIGSTOP %}) }'
    
  3. After open(2) returns -1, I receive SIGSTOP in the GDB session and can debug the issue.

TIA.

Best regards,
alexz.

UPD: Even though I had tried the approach suggested by n.m before and couldn't make it work, I decided to give it another try. After 2 hours it now works as intended, but only with a somewhat weird workaround:

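  1. I still can't distinguish between the call to a syscall and the return from it
  2. If I use finish in comm, a continue placed after it is not executed, which is OK according to the GDB docs (commands following one that resumes execution are ignored),
    i.e. the following drops to the gdb prompt on each break: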

    gdb> comm
    gdb> finish
    gdb> printf "rax is %d\n",$rax
    gdb> cont
    gdb> end
    
  3. Actually, I can avoid finish and check %rax directly in commands, but then I have to check for -errno rather than -1: for "Permission denied" I have to check for -13, and for "No such file or directory" for -2. That just doesn't feel right.

  4. So the only way to make it work for me was to define a custom command and use it in the following way:

    (gdb) catch syscall open
    Catchpoint 1 (syscall 'open' [2])
    (gdb) define mycheck
    Type commands for definition of "mycheck".
    End with a line saying just "end".
    >finish
    >finish
    >if ($rax != -1)
     >cont
     >end
    >printf "rax is %d\n",$rax
    >end
    (gdb) comm
    Type commands for breakpoint(s) 1, one per line.
    End with a line saying just "end".
    >mycheck
    >end
    (gdb) r
    The program being debugged has been started already.
    Start it from the beginning? (y or n) y
    Starting program: /home/alexz/gdb_syscall_test/main
    .....
    Catchpoint 1 (returned from syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
    0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
    24                      fd = open(filenames[i], O_RDONLY);
    Opening test1
    fd = 3 (0x3)
    Successfully opened test1
    
    Catchpoint 1 (call to syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
    rax is -38
    
    Catchpoint 1 (returned from syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
    0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
    ---Type <return> to continue, or q <return> to quit---
    24                      fd = open(filenames[i], O_RDONLY);
    rax is -1
    (gdb) bt
    #0  0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
    (gdb) step
    26                      printf("Opening %s\n", filenames[i]);
    (gdb) info locals
    i = 1
    fd = -1
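
For anyone who wants to replay the session above, here is a minimal sketch of a test program in the same spirit. It is not the original main.c: the function name count_open_failures and the "Failed to open" message are made up for illustration, and only the open() call and the printf lines follow what the session shows.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open each name read-only, log the result,
 * and return how many open() calls failed. */
int count_open_failures(const char *const *filenames, int n)
{
    int failures = 0;

    for (int i = 0; i < n; i++) {
        int fd = open(filenames[i], O_RDONLY);
        int saved_errno = errno;   /* printf() below may overwrite errno */

        printf("Opening %s\n", filenames[i]);
        printf("fd = %d (0x%x)\n", fd, (unsigned)fd);

        if (fd == -1) {
            /* The libc wrapper reports failure as -1 plus errno. */
            printf("Failed to open %s: %s\n",
                   filenames[i], strerror(saved_errno));
            failures++;
        } else {
            printf("Successfully opened %s\n", filenames[i]);
            close(fd);
        }
    }
    return failures;
}
```

Calling count_open_failures() from main() with two names, of which only the first exists, should produce one successful and one failing open() for the catchpoint to stop on.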
    

Best Answer

This gdb script does what's requested:

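    # $outside is 1 while the program is not inside the syscall
    set $outside = 1
    catch syscall open
    commands
      silent
      # every stop (entry or exit) flips the flag
      set $outside = ! $outside
      # syscall exit with a non-negative $rax: open succeeded, keep going
      if ( $outside && $rax >= 0)
        continue
      end
      # syscall entry: nothing to check yet, keep going
      if ( !$outside )
        continue
      end
      echo `open' returned a negative value\n
    end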

The $outside variable is needed because gdb stops both at syscall enter and syscall exit. We need to ignore enter events and check $rax only at exit.
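
To see why the script tests for a negative $rax rather than for -1, it may help to look at the raw kernel convention from C. The sketch below is purely illustrative and mainly targets x86-64: raw_open and is_syscall_error are hypothetical helper names, not part of any library. It issues open(2) with a bare syscall instruction, so the result is what gdb sees in $rax at syscall exit: a negative errno value on failure (in the range -4095..-1). It is the libc wrapper that turns this into -1 plus errno. This would also explain the -38 printed at the "call to syscall" stop in the question: it matches -ENOSYS, which the kernel keeps in rax while a syscall is being entered.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: call open(2) without going through the libc
 * wrapper and return the kernel's raw result, i.e. a file descriptor
 * or a negative errno value. */
long raw_open(const char *path, int flags)
{
#if defined(__x86_64__) && defined(SYS_open)
    long ret;
    /* x86-64 Linux ABI: syscall number in rax, arguments in rdi, rsi,
     * rdx; the instruction clobbers rcx and r11; result is in rax. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"((long)SYS_open), "D"(path),
                        "S"((long)flags), "d"(0L)
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Assumed fallback for other architectures: syscall(2) has already
     * folded the error into errno, so rebuild the raw value from it. */
    long ret = syscall(SYS_openat, AT_FDCWD, path, flags, 0);
    return ret == -1 ? -(long)errno : ret;
#endif
}

/* Linux reserves -4095..-1 for errors in raw syscall results. */
int is_syscall_error(long ret)
{
    return ret < 0 && ret >= -4095;
}
```

For a path that does not exist, raw_open(path, O_RDONLY) is expected to return -ENOENT (that is, -2), whereas open(path, O_RDONLY) returns -1 and sets errno to ENOENT.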