Some Notes on Debugging libsocket
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

$Id: debug.txt,v 1.5 2000/08/20 10:01:27 rich Exp $

[ These notes are taken from an e-mail that I wrote to help someone debug ]
[ libsocket. While the basic content is the same, I've modified the notes ]
[ to fix some mistakes and make them more general.                        ]

I use the demo programs 'diag' and 'httpget' for most of my debugging,
since they're pretty simple. When analysing crashes within the library, it's
best to use simple test programs, if possible.

'diag' tests libsocket's detection & start-up routines and basic socket
creation. This means that it loads the necessary virtual device drivers into
memory, parses data from the registry, reads its own config files, etc. It
also tests the socket() call, but no other socket-related calls.
If 'diag' crashes, then it's likely that there is a problem in the start-up
sequence.

'httpget' clearly goes through the same start-up sequence as 'diag'. If you
run the bash script 'gdbx.sh' on httpget (or diag), you should be able to
debug the library, e.g.:

    bash
    cd c:/some/dir/lsck/demo
    make httpget.exe
    ./gdbx.sh httpget.exe

gdbx.sh loads the executable into gdb with all the correct paths
specified, so that gdb can find the sources.
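
As a rough illustration of the idea (this is not the actual contents of
gdbx.sh), a script can use find to list every directory below a source tree
and pass each one to gdb with its '-d' option, which adds a directory to
gdb's source search path. The function name src_dir_options and the '../src'
location are made up for this sketch, and directory names containing spaces
would break it.

```shell
# Hypothetical sketch, NOT the real gdbx.sh: print one "-d DIR" option
# for each directory found below the directory given as $1.
src_dir_options() {
    find "$1" -type d | sed 's/^/-d /'
}

# Possible use (the ../src location is an assumption):
#   gdb $(src_dir_options ../src) httpget.exe
```

This also shows why a Unix-style find is needed: the sketch relies on the
'-type d' test, which the Windows find command does not understand.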

To use gdbx.sh you need to install GNU findutils. Version 4.1 is available
as v2gnu/find41b.zip in the DJGPP archives. You also need to ensure that the
DJGPP bin directory appears before the Windows command directory in your
path. Otherwise, gdbx.sh will run the Windows find command and fail, because
the Windows find command works in a completely different way.
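
One way to check the path ordering before running gdbx.sh is to ask the find
command to identify itself: GNU find prints a version banner containing "GNU"
when given '--version', while the Windows find command does not. The helper
name is_gnu_find is made up for this sketch.

```shell
# Hypothetical helper (not part of libsocket): succeeds if the command
# named in $1 (default: find) reports itself as GNU find.
is_gnu_find() {
    "${1:-find}" --version 2>/dev/null | grep -q "GNU"
}

if is_gnu_find; then
    echo "find on the path looks like GNU find"
else
    echo "find on the path is not GNU find - check the order of your path"
fi
```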

For example, here is how to debug socket creation - the socket() call.
Creation of a socket leads to the following calls, when libsocket is
running with Winsock 2:

- socket()
  - __socketx()
    - __lsck_init() (only if libsocket not already init'd)
    - __csock_socket()
      - __csock_get_usage()

You should probably set breakpoints in __socketx() and __csock_socket().
You should use 'hbreak' to set hardware breakpoints. I've found that
software breakpoints are sometimes ignored. Also, I've found that it tends
to work better if you set the breakpoint on line numbers rather than
function names. So, if you do 'l __socketx' and then note which line
number the function starts on, you can do:

    hbreak socket.c:xxx

where xxx is the line number for __socketx(). The functions
__csock_socket() and __csock_get_usage() are in the file c_socket.c.
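
If you would rather find the line number from the shell than with gdb's 'l'
command, something like the following sketch may help. The helper name
hbreak_for is made up, and it simply reports the first line that mentions the
function name followed by an opening parenthesis. That line could be a
prototype or a call rather than the definition, so check the result against
the source before using it.

```shell
# Hypothetical helper (not part of libsocket): print an 'hbreak FILE:LINE'
# command for the first line in file $2 that mentions function $1
# followed by an opening parenthesis.
hbreak_for() {
    func=$1
    file=$2
    line=$(grep -n "$func *(" "$file" | head -n 1 | cut -d: -f1)
    [ -n "$line" ] && echo "hbreak $(basename "$file"):$line"
}

# Possible use, from the directory that holds the source file:
#   hbreak_for __socketx socket.c
```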

Breakpoints also seem to be set more reliably if you use the actual address
of the function. E.g. use:

    p/x __csock_socket
    hbreak *...

where '...' should be replaced by the address shown by the print command,
e.g. 'hbreak *0x123456'.

Once you reach a breakpoint, try single-stepping. Use the next command,
'n', since you're probably not that interested in what the called
functions do. That is, unless the crash is in one of them. ;)

Debugging is generally time-independent, but I have seen some differences
in behaviour between short (1/10 second) and long (several seconds) time
periods. This can make debugging complicated.

Some other points to bear in mind: gcc's optimised output sometimes shows
non-linear progress when debugging - 'n' may actually move you backwards
a line, then forward two lines, etc. Do not expect the order of execution
to be exactly as written in the C code. If you want the order of execution
to be the same as the code, recompile the library with '-O0' instead of
'-O2'. Also, some variables may have been optimised out (e.g. into
a register), so you may not be able to display their values. You can often see
which register contains the variable by using the disassembly - try
'disass __csock_socket' for instance.

If you have any questions on debugging libsocket, please feel free to mail
me.

Richard Dawe <richdawe@bigfoot.com> 2000-08-20