The goal of this article is to enable you to extend gdb. It is not an exhaustive treatise on gdb, but if your task is similar it may help you narrow down which parts of the sources to study.
Warning 1: I am a beginner with the internals of gdb; I only learned about the parts I needed for this job.
FIXME: this document may well be too short to be useful.
The patch works fine for gdb-4.18.
At our company we have our own layer for persistence. In that implementation, all pointers to persistent objects need to be smart pointers, adding an extra layer of indirection. This makes debugging tedious. What we want to be able to do if we have a smart pointer p is:
The patch deals with three smart pointers: OSPtr<T>, OViewIterator<T> and OViewConstIterator<T>. These three have about the same layout. Here are the relevant parts of the classes the extension needs to deal with:
All objects that are persistent need to derive from Persistent and implement some member functions to enable serialisation (using the reader/writer pattern).
So if we have a class Foo, deriving from Persistent, we can have smart pointers to objects of Foo.
Without extending the debugger we would need to type:
This is difficult to remember, especially for those who do not know how the persistence layer works. Such users do not want to be bothered with the implementation details of the persistence layer:
The patch hooks into the expression evaluator of gdb.
Evaluation of an expression string consists of two phases. First the string is parsed. The parser outputs an array of commands. In the second phase this array gets executed.
The print command of gdb first parses the expression string using gdb/c-exp.y (for C and C++ at least). gdb/c-exp.y converts it to a simple stack-based language (see struct expression in gdb/expression.h). The struct expression contains a member elts (an array of exp_elements), containing opcodes and operands.
For example the expression "*p", where p is a variable, is parsed into an array of 5 elements:
|0||opcode=UNOP_IND||indirection applied to the sub-expression that follows|
|1||opcode=OP_VAR_VALUE||specification of variable p|
|2||block = 0x0||evaluate the symbol relative to the innermost frame|
|3||symbol = ...||points to symbol information about p|
|4||opcode=OP_VAR_VALUE||marks the end of this operation|
The expression array of "p->some_member" is 9 elements long:
|0||opcode=STRUCTOP_PTR||member access through a pointer|
|1||length||length of the string "some_member" + 1|
|2||string[0..12]||"some_member" ('\0' inclusive)|
|3||...||continuation of the above string|
|4||opcode=STRUCTOP_PTR||marks the end of this operation|
|5||opcode=OP_VAR_VALUE||specification of variable p|
|6||block=0x0||evaluate the symbol relative to the innermost frame|
|7||symbol = ...||points to symbol information on variable p|
|8||opcode=OP_VAR_VALUE||marks the end of this operation|
Notice that OP_VAR_VALUE/STRUCTOP_PTR is mentioned twice. This way the expression array can be traversed in both directions. Forward evaluation would typically be implemented by a recursive function that uses the program stack of gdb to store intermediate values. Backward evaluation would be possible but would need a little more work. I did not find any part of gdb that uses this backward evaluation feature.
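To make the point about the repeated opcode concrete, here is a minimal toy model in C. It is not gdb code: every name starting with toy_ is made up for illustration, the opcode values are arbitrary, and it assumes that slot 0 of the array for "*p" holds the indirection opcode (UNOP_IND). It only shows how a trailing copy of the opcode lets you find the start of an operation when walking backward.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for gdb's opcodes: the names mirror the article,
   the numeric values are arbitrary. */
enum toy_opcode { TOY_UNOP_IND, TOY_OP_VAR_VALUE };

/* One slot of the array holds either an opcode or an operand. */
union toy_element {
    enum toy_opcode opcode;
    const void *block;
    const char *symbol;
};

/* Slots occupied by one operation, excluding its sub-expressions. */
size_t toy_operation_length(enum toy_opcode op)
{
    switch (op) {
    case TOY_OP_VAR_VALUE:
        return 4;               /* opcode, block, symbol, opcode */
    case TOY_UNOP_IND:
        return 1;               /* a single slot is its own start and end */
    }
    assert(!"unknown opcode");
    return 0;
}

/* Forward: from the first slot of an operation to the slot after it. */
size_t toy_skip_forward(const union toy_element *elts, size_t first)
{
    return first + toy_operation_length(elts[first].opcode);
}

/* Backward: from the last slot of an operation to its first slot.
   This only works because the last slot repeats the opcode. */
size_t toy_operation_start(const union toy_element *elts, size_t last)
{
    return last + 1 - toy_operation_length(elts[last].opcode);
}

/* Fill `elts` with the five slots described for "*p" (assumed layout). */
void toy_build_deref_p(union toy_element elts[5])
{
    elts[0].opcode = TOY_UNOP_IND;
    elts[1].opcode = TOY_OP_VAR_VALUE;
    elts[2].block = NULL;       /* innermost frame */
    elts[3].symbol = "p";       /* stands in for the symbol pointer */
    elts[4].opcode = TOY_OP_VAR_VALUE;
}
```

With the array built by toy_build_deref_p, toy_skip_forward(elts, 1) gives 5 (one past the variable operation) and toy_operation_start(elts, 4) gives 1, so both directions agree on where the OP_VAR_VALUE operation lives.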
Note: debugging gdb itself is easy. Start the debugger in the gdb source directory, which contains a useful .gdbinit (it adds breakpoints and changes the prompt):
The expression struct obtained this way is then evaluated by the function evaluate_subexp (gdb/eval.c). evaluate_subexp simply delegates to the language-specific evaluator. For C/C++ this is evaluate_subexp_standard (also in gdb/eval.c).
The function evaluate_subexp_standard recursively evaluates the expression struct. Among others, it takes these arguments: an expression struct "exp" and an index "pos" into the elts array. It examines the opcode at index pos and branches in a large switch statement (you can't miss it, as it fills most of eval.c). The UNOP_IND case, for example, evaluates the expression after pos (by calling evaluate_subexp) and performs the indirection on the returned value.
Values are returned as a pointer to struct value (value_ptr, gdb/value.h). Struct value looks complex, because it can represent values from various sources (register/memory/immediate/...). Fortunately, most functions operate on or create a value_ptr as a whole, so it can be treated as opaque most of the time.
The struct value contains type information (member type), which can be extracted by calling "check_typedef(VALUE_TYPE(v))". This returns a pointer to "struct type" (gdb/gdbtypes.h).
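The recursive scheme can be sketched with a toy evaluator. This is a simplified model, not the code in eval.c: the toy_ names, the use of plain long values instead of value_ptr, and the small array standing in for the debuggee's memory are all assumptions made for illustration. The function takes only the expression and a pointer to the position, and advances the position as it consumes slots.

```c
#include <assert.h>

/* Toy opcodes named after gdb's; the values are arbitrary. */
enum toy_opcode { TOY_OP_LONG, TOY_UNOP_IND, TOY_BINOP_ADD };

union toy_element {
    enum toy_opcode opcode;
    long longconst;
};

struct toy_expression {
    int nelts;
    const union toy_element *elts;
};

/* Stand-in for the memory of the program being debugged. */
static const long toy_memory[4] = { 10, 20, 30, 40 };

/* Evaluate the sub-expression starting at *pos and leave *pos
   pointing just past it. */
long toy_evaluate_subexp(const struct toy_expression *exp, int *pos)
{
    enum toy_opcode op;
    long arg1, arg2;

    assert(*pos < exp->nelts);
    op = exp->elts[*pos].opcode;
    (*pos)++;

    switch (op) {
    case TOY_OP_LONG:
        arg1 = exp->elts[*pos].longconst;
        *pos += 2;              /* skip the operand and the trailing opcode */
        return arg1;
    case TOY_UNOP_IND:
        /* Evaluate the operand first, then do the indirection on it. */
        arg1 = toy_evaluate_subexp(exp, pos);
        assert(arg1 >= 0 && arg1 < 4);
        return toy_memory[arg1];
    case TOY_BINOP_ADD:
        arg1 = toy_evaluate_subexp(exp, pos);
        arg2 = toy_evaluate_subexp(exp, pos);
        return arg1 + arg2;
    }
    assert(!"unknown opcode");
    return 0;
}
```

For an eight-slot array encoding "*(1 + 2)" (UNOP_IND, BINOP_ADD, then two OP_LONG operations of three slots each), the function returns toy_memory[3] and leaves the position at 8. Intermediate values live on the C call stack, as described above for forward evaluation.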
The main chunk is in the function dereference_osptr in gdb/osptr.c. It takes a value as its argument. It returns a null pointer if it decides the value has nothing to do with the smart pointer types; otherwise it returns the object the smart pointer points to.
To do the indirection I changed the UNOP_IND case in the function evaluate_subexp_standard. By calling dereference_osptr I check whether the argument is an osptr and, if so, return the object; otherwise the original code is evaluated.
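The shape of this hook can be shown with a small toy model. It is only a sketch under loud assumptions: the real dereference_osptr works on gdb's struct value and struct type, while this toy recognises smart pointers by a type-name prefix and models the extra layer of indirection as one additional pointer hop. Neither detail is taken from the patch, and all toy_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy value: a type name plus whatever the value points at. */
struct toy_value {
    const char *type_name;      /* e.g. "OSPtr<Foo>" or "Foo *" */
    struct toy_value *target;   /* next hop, or NULL for a plain object */
};

static int toy_has_prefix(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Stand-in for dereference_osptr: NULL means "not one of ours".
   Recognition by name prefix is an assumption of this sketch. */
struct toy_value *toy_dereference_osptr(struct toy_value *v)
{
    if (toy_has_prefix(v->type_name, "OSPtr<")
        || toy_has_prefix(v->type_name, "OViewIterator<")
        || toy_has_prefix(v->type_name, "OViewConstIterator<")) {
        /* Assumed layout: smart pointer -> internal handle -> object. */
        assert(v->target != NULL && v->target->target != NULL);
        return v->target->target;
    }
    return NULL;
}

/* Stand-in for the original UNOP_IND code: one plain pointer hop. */
struct toy_value *toy_plain_indirection(struct toy_value *v)
{
    return v->target;
}

/* The hook: try the smart pointer path first, then fall back. */
struct toy_value *toy_unop_ind(struct toy_value *v)
{
    struct toy_value *object = toy_dereference_osptr(v);

    if (object != NULL)
        return object;
    return toy_plain_indirection(v);
}
```

The point of the null return is that the existing code path stays untouched for every value that is not a smart pointer, so the change to the UNOP_IND case remains a few lines.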
Implementing operator-> is less clean, because there is an ambiguity. The named member could refer to a member of the smart pointer or to a member of the object referred to by the smart pointer. I resolve this by first searching through the members of the smart pointer (and its bases) and, if no member with that name matches, dereferencing the smart pointer and searching the resulting object. The code present in the STRUCTOP_PTR branch of evaluate_subexp_standard does not allow handling this cleanly, because error reporting is done in value_struct_elt (gdb/valops.c). Because I didn't want to disrupt much of the debugger code, I decided to hook it into value_struct_elt itself.
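The lookup order can be illustrated with a toy model. This is not the code hooked into value_struct_elt: the toy_ names and the member names used in the usage note are hypothetical, base classes are left out, and a null result stands in for the place where gdb would report an error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy object: a NULL-terminated list of member names, plus the object
   behind it when it is a smart pointer (NULL otherwise). */
struct toy_object {
    const char *const *members;
    const struct toy_object *pointee;
};

static int toy_has_member(const struct toy_object *obj, const char *name)
{
    size_t i;

    for (i = 0; obj->members[i] != NULL; i++)
        if (strcmp(obj->members[i], name) == 0)
            return 1;
    return 0;
}

/* Return the object that owns `name`: the smart pointer itself if it has
   such a member, else the object it points to, else NULL. */
const struct toy_object *toy_find_member_owner(const struct toy_object *sp,
                                               const char *name)
{
    assert(sp != NULL && name != NULL);

    if (toy_has_member(sp, name))
        return sp;              /* members of the smart pointer win */
    if (sp->pointee != NULL && toy_has_member(sp->pointee, name))
        return sp->pointee;     /* fall back to the pointed-to object */
    return NULL;                /* gdb would report an error here */
}
```

If both the smart pointer and the object have a member called "handle", the lookup returns the smart pointer; "some_member", present only in the object, resolves to the object. The trade-off of this order is that a member of the object is hidden whenever the smart pointer happens to have a member with the same name.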