Liam White
GDB remote serial protocol is bad

Here are several annoying problems with the GDB remote serial protocol.

Packets are delineated by ASCII characters

Specifically, a packet starts with the transmission of $ and ends with the transmission of #. Except not quite, because a checksum of two hex digits is tacked on after the #. The checksum is the sum of all the bytes between the $ and the # (not including the delimiters themselves), mod 256.
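
As a minimal sketch of the framing just described (the function names are mine, not part of any GDB API, and the payload is assumed to be escaped already):

```python
def checksum(payload: bytes) -> int:
    """Sum of all payload bytes, modulo 256."""
    return sum(payload) % 256


def frame_packet(payload: bytes) -> bytes:
    """Wrap a payload as $<payload>#<two hex digits of checksum>."""
    return b"$" + payload + b"#" + f"{checksum(payload):02x}".encode("ascii")
```

For example, frame_packet(b"OK") gives b"$OK#9a", since 0x4f + 0x4b = 0x9a.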

This design necessitates the existence of escape characters so that literal $ and # in the payload are not processed as delimiters. Instead of \, which is relatively standard as an escape character across the programming world, } is used, and the byte that follows it is not the original byte but the original byte XORed with 0x20, so every value that may require escaping gets its own encoding:

Character byte    Escaped bytes
"#"               "}\x03"
"$"               "}\x04"
"*"               "}\x0a"
"}"               "}\x5d"

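All four rows of the table are consistent with one rule: send } and then the original byte XORed with 0x20. A rough sketch of both directions, with helper names that are my own invention:

```python
ESCAPE = 0x7D  # "}"
NEEDS_ESCAPE = frozenset(b"#$*}")


def escape(data: bytes) -> bytes:
    """Replace each special byte with "}" followed by (byte XOR 0x20)."""
    out = bytearray()
    for byte in data:
        if byte in NEEDS_ESCAPE:
            out.append(ESCAPE)
            out.append(byte ^ 0x20)
        else:
            out.append(byte)
    return bytes(out)


def unescape(data: bytes) -> bytes:
    """Undo escape(): the byte after "}" is XORed with 0x20 again."""
    out = bytearray()
    it = iter(data)
    for byte in it:
        if byte == ESCAPE:
            following = next(it, None)
            if following is None:
                raise ValueError("dangling escape byte at end of payload")
            out.append(following ^ 0x20)
        else:
            out.append(byte)
    return bytes(out)
```

With this, escape(b"#") yields b"}\x03", matching the first row, and unescape(escape(x)) returns x for any input.
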
Did you also notice that $ is used to indicate the start of the packet? In PCRE, the $ metacharacter is used to indicate the end of a line. This definitely did not confuse me while implementing a parser for this protocol.

The * character is also used to implement run-length encoding for the 9600 baud serial ports this protocol was originally made for.
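
For reference, a run is written as the byte to repeat, then *, then a count character whose value is the number of extra repeats plus 29. Here is a hedged sketch of a decoder; the function name is hypothetical and it assumes the input contains no escape sequences:

```python
def rle_decode(data: bytes) -> bytes:
    """Expand <byte>*<count> runs, where count = ord(count_char) - 29."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == ord("*"):
            if not out or i + 1 >= len(data):
                raise ValueError("run-length marker without a preceding byte or a count")
            repeat = data[i + 1] - 29
            if repeat < 1:
                raise ValueError("invalid repeat count")
            # Repeat the most recently emitted byte `repeat` more times.
            out.extend(out[-1:] * repeat)
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)
```

So b"0* " decodes to b"0000": the space is ASCII 32, and 32 - 29 = 3 extra copies of the 0.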

A design that wasn't implemented for 9600 baud serial ports with 50% loss would use a simpler mechanism for binary framing.

ACK and NACK

By default, every packet must be acknowledged by its receiver with either an ACK (+) or a NACK (-), sent as its own transmission. This is not a packet, so it requires special handling. This is useless on a reliable transport like TCP, so if the server advertises support for QStartNoAckMode, the client will immediately attempt to disable the sending of these acknowledgments.
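
A sketch of what the receiving side has to do for each frame while acknowledgments are still enabled. The function name is made up, and it assumes the whole frame has already been read into one buffer:

```python
def ack_for(frame: bytes) -> bytes:
    """Return b"+" if frame is $<payload>#<xx> with a matching checksum, else b"-"."""
    if len(frame) < 4 or frame[:1] != b"$" or frame[-3:-2] != b"#":
        return b"-"
    payload = frame[1:-3]
    try:
        expected = int(frame[-2:], 16)
    except ValueError:
        return b"-"
    return b"+" if sum(payload) % 256 == expected else b"-"
```

The returned byte goes back on the wire bare, without any $...# framing around it.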

int3

As a presumed tribute to the x86 architecture, sending the literal byte \x03, as its own transmission, signals to the server to interrupt the running process. This is not a packet, so it requires special handling.
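
To illustrate the special handling, here is a rough sketch of splitting an incoming byte stream into bare ACK, NACK, and interrupt bytes versus framed packets. The function and event names are my own, and checksum verification is deliberately left out:

```python
from typing import List, Tuple


def split_stream(buf: bytes) -> Tuple[List[Tuple[str, bytes]], bytes]:
    """Return (events, unconsumed bytes). Packets keep their raw payload."""
    events: List[Tuple[str, bytes]] = []
    i = 0
    while i < len(buf):
        b = buf[i:i + 1]
        if b == b"+":
            events.append(("ack", b""))
            i += 1
        elif b == b"-":
            events.append(("nack", b""))
            i += 1
        elif b == b"\x03":
            events.append(("interrupt", b""))
            i += 1
        elif b == b"$":
            end = buf.find(b"#", i + 1)
            if end == -1 or end + 2 >= len(buf):
                break  # incomplete packet; wait for more data
            events.append(("packet", buf[i + 1:end]))
            i = end + 3  # skip "#" and the two checksum digits
        else:
            i += 1  # ignore stray bytes between transmissions
    return events, buf[i:]
```

Note that a \x03 inside a packet (for instance as the second half of an escaped #) is part of the payload and must not be treated as an interrupt, which is why the packet branch consumes everything up to the checksum in one go.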

XML

A majority of GDB commands are ASCII-based queries that generate XML responses. Depending on the architecture and client, the client may also require a full XML register map to operate correctly. This introduces additional complexity underneath the layering caused by the custom packet architecture, because output from many commands now requires XML escaping. This also has implications for Unicode, as any codepoint above 0x7f must be encoded as an entity or the client will reject the response.
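
One way to cope with both requirements at once, sketched with the Python standard library (the function name is hypothetical, and the ASCII-only requirement is my observation from above rather than anything this code verifies):

```python
from xml.sax.saxutils import escape


def xml_text(value: str) -> bytes:
    """Escape &, <, > and turn every non-ASCII codepoint into a numeric entity."""
    return escape(value).encode("ascii", "xmlcharrefreplace")
```

For example, xml_text("a<b & é") produces b"a&lt;b &amp; &#233;". The result still has to go through the packet-level escaping before it is sent.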

Brittleness

If specific features are not advertised by the server, several functions on the client will break. The most important one is vContSupported, without which hardware single-stepping is assumed to not be supported by the server, and is emulated with the client inserting breakpoints.
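
Features are advertised in the reply to the client's qSupported query as a semicolon-separated list of name+, name-, or name=value items. A minimal sketch of building that reply, where the helper name and the chosen PacketSize are assumptions for illustration:

```python
from typing import Dict, Union


def qsupported_reply(features: Dict[str, Union[bool, str]]) -> bytes:
    """Render features as name+ / name- / name=value joined by semicolons."""
    parts = []
    for name, value in features.items():
        if value is True:
            parts.append(f"{name}+")
        elif value is False:
            parts.append(f"{name}-")
        else:
            parts.append(f"{name}={value}")
    return ";".join(parts).encode("ascii")
```

Calling it with {"PacketSize": "1000", "QStartNoAckMode": True, "vContSupported": True} yields b"PacketSize=1000;QStartNoAckMode+;vContSupported+".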

All GDB client functionality assumes the remote operating system is Linux and files to debug are ELF. So even when attempting to list libraries on a target where that could theoretically be supported, it does not load symbols for them unless the client can get access to the libraries as ELF files. Can I not just send the client the .text section and relocation information for it to figure out?

Extensibility

The only way I could find to extend the server protocol to support target-specific functionality is the qRcmd command. This is issued by the client when the user issues monitor commands, which are sent to the server as hex-encoded strings.
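
The command text arrives hex-encoded after the qRcmd, prefix, and any output goes back hex-encoded as well, or as a plain OK when there is nothing to print. A hedged sketch of both halves, with function names of my own choosing:

```python
from typing import Optional


def parse_qrcmd(payload: bytes) -> Optional[str]:
    """Return the decoded monitor command, or None if this is not a qRcmd packet."""
    prefix = b"qRcmd,"
    if not payload.startswith(prefix):
        return None
    raw = bytes.fromhex(payload[len(prefix):].decode("ascii"))
    return raw.decode("utf-8", "replace")


def qrcmd_reply(output: str) -> bytes:
    """Hex-encode command output, or answer OK when there is none."""
    if not output:
        return b"OK"
    return output.encode("utf-8").hex().encode("ascii")
```

For example, a user typing monitor reset results in the payload b"qRcmd,7265736574", which parse_qrcmd turns back into "reset".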

Not exhaustive

There are a lot of problems with the remote serial protocol. This is just what I managed to find in my short time working with it.