July 30, 2020
Several months back I needed to troubleshoot a problem that looked like a slow query, but the query wasn't appearing in the slow query log. It was as if there was some kind of pause between preparing and executing the query. I suspected that the Java application code was up to some funny business but I couldn't prove it. The system was (still is) quite busy and many of the queries are prepared at daemon start time so "normal" packet sniffing didn't help me much.
I tried a few things including some Wireshark PostgreSQL plugins, but they weren't stateful. I really needed something to link the relevant TCP packets together for me. In a fit of pique I hacked up a first sketch of pgtrace, the stateful packet sniffer. I used pgtrace to figure out what was going on with the weird pausing queries. Since then I've used pgtrace several times to troubleshoot other anomalies faster than I could with a generic packet sniffer.
Tested on CentOS 7 after compiling with GCC.
There's not much to it.
pgtrace device_or_file pcap_filter_string
pgtrace works whether you're sniffing on the client or the server machine. It assumes that the PostgreSQL server is listening on port 5432 and it uses this assumption to figure out which side of the communication is the server.
pgtrace eth0 "port 2222"
will sniff all network traffic on port 2222. In this scenario pgtrace will assume that 2222 is the client's source port and that the server is listening on port 5432. This would be useful if you have a long-running PostgreSQL connection that you need to sniff.
pgtrace eth0 "port 5432"
will sniff all network traffic on port 5432.
pgtrace capture.pcap "port 5432"
will replay all the data from the packet capture file, as prepared by tcpdump. This can be useful if you're troubleshooting a busy system that generates packets faster than pgtrace can parse them.
It helps to have the PostgreSQL Frontend/Backend Protocol specs on hand for this.
# pgtrace lo "port 5432"
pgtrace: 0 Self-test complete. device_or_file='lo' filter='port 5432'
pgtrace: 1596111995311480 source_port=17386 dest_port=5432 seq=3376136114 ack=2164505585 window=3635 size_payload=70 flags=0x18
1596111995311480 17386 fe Bind 15 .S_1.......
1596111995311480 17386 fe Execute 9 .....
pgtrace: 1596111995312776 source_port=16792 dest_port=5432 seq=3795050389 ack=3794399569 window=3635 size_payload=52 flags=0x18
1596111995312776 16792 fe Parse 16 .SELECT 1...
1596111995312776 16792 fe Bind 12 ........
1596111995312776 16792 fe Describe 6 P.
1596111995312776 16792 fe Execute 9 .....
Walking through the above...
# pgtrace lo "port 5432"
pgtrace listening on the loopback device on port 5432
pgtrace: 0 Self-test complete. device_or_file='lo' filter='port 5432'
The first output line. pgtrace runs a brief self-test on every run. Lines that aren't PostgreSQL messages start with "pgtrace: ".
pgtrace: 1596111995311480 source_port=17386 dest_port=5432 seq=3376136114 ack=2164505585 window=3635 size_payload=70 flags=0x18
pgtrace has sniffed a TCP packet at 1596111995311480 microseconds after the Unix epoch. pgtrace uses the timestamp supplied by libpcap so that the timestamps in pcap files make sense.
1596111995311480 17386 fe Bind 15 .S_1.......
Bind a prepared query previously saved as S_1. The message was received on port 17386.
1596111995311480 17386 fe Execute 9 .....
Execute the prepared query.
pgtrace: 1596111995312776 source_port=16792 dest_port=5432 seq=3795050389 ack=3794399569 window=3635 size_payload=52 flags=0x18
pgtrace has sniffed another TCP packet.
1596111995312776 16792 fe Parse 16 .SELECT 1...
1596111995312776 16792 fe Bind 12 ........
1596111995312776 16792 fe Describe 6 P.
1596111995312776 16792 fe Execute 9 .....
Parse, bind, describe, and execute the "SELECT 1" SQL statement, all carried in a single TCP packet.
pgtrace expects the server to be listening on port 5432.
pgtrace writes out only the first 4096 bytes of messages. This is a function of pgtrace's zero-alloc design.
pgtrace does not cope well with dropped or reordered packets. This area of the code in particular could use some love. The best workaround I've found for this is to pre-sniff a packet capture file using tcpdump and a big -B (buffer size) argument. This reduces the chance of missed packets.
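Something along these lines, where the interface, buffer size, and file name are examples rather than prescriptions:

```shell
# Capture with a 64 MiB kernel buffer (-B takes KiB) so packets are
# less likely to be dropped under load, then replay at leisure.
tcpdump -i eth0 -B 65536 -w capture.pcap "port 5432"
pgtrace capture.pcap "port 5432"
```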
pgtrace's performance isn't completely woeful but it could use some improvement. It's implemented as a big state machine that processes each message byte by byte. This suited my original troubleshooting deadline but is not the fastest way to skin this cat.
pgtrace parses message types and lengths but doesn't parse their internals; it just makes a best effort to write out the printable characters.