Chris Gottbrath from TotalView Technologies has just posted a new whitepaper [PDF] on the TotalView site, “Deterministically Troubleshooting Network Distributed Applications.”
This paper will look at three different ways to approach debugging a client-server application: tracing, interactive debugging, and replay debugging. The application I am looking at is a simple multi-threaded, multi-machine, memory status monitoring application written in C using the UNIX socket interface. The purpose is to share some techniques that may give you new ideas on how to tackle bugs you may encounter with network programs.
The paper is a useful overview of the major ways of debugging a parallel application. Of course, it’s from TotalView, so you might expect it to make a case in favor of TotalView products:
This article has looked at three different approaches to finding out what is going on inside of a program. Tracing program execution provides you with a way of looking at the behavior of your program over time. Strace provides easy access to limited information; print statements give you more detail about what your program is doing, but require recompilation; and tvscript provides flexibility along with detailed information but at the cost of some overhead.
The best way to explore the behavior of the program interactively is to use a graphical source code debugger. Remote desktop software makes it possible to have a graphical debugging experience even when working on applications running at distant sites. The control that a full-featured debugger gives you over program execution can be vital for solving hard-to-reproduce bugs.
And you won’t be disappointed if that’s your expectation. But overall I think it’s a helpful paper, and worth a read.