From Oscilloscope to Wireshark: A UDP Story

A deep dive into decoding UDP packets directly from raw voltage waveforms on a hardware circuit, tracing the journey from the physical layer (L1) to the transport layer (L4).

UDP is a transport-level protocol for sending messages through an IP network.

It sits at level 4 in the OSI model:

| Layer | Name |
|---|---|
| 7 | Application |
| 6 | Presentation |
| 5 | Session |
| 4 | Transport |
| 3 | Network |
| 2 | Data link |
| 1 | Physical |

Like many of you, I've got hardware on my desk that's sending UDP packets, and the time has come to take a closer look at them.

Most "low-level" networking tutorials will bottom out somewhere at "use tcpdump to see raw packets". We'll be starting a bit lower in the stack; specifically, here:

This is a high-speed active differential probe soldered to an Oxide Computer Company rack switch. We're going all the way down to the metal.

(Huge thanks to Eric for the careful soldering that made this possible!)

Looking at the signals on an oscilloscope, we see data zooming down the wires:

The rest of this post will take us from these raw voltage waveforms all the way to decoded UDP packets. Hold on tight, we're going from L1 all the way to L4.

First, a bit of context

I work at Oxide Computer Company, writing embedded software for a rack-scale computer.

Over the past few months, I've been focused on the management network, which is a low-speed network between each server's Service Processor. The service processor is roughly equivalent to a baseboard management controller; it allows for lights-out management of the rack.

The heart of the management network is the VSC7448, a 52-port, 80G ethernet switch chip. This is the slower switch, and it's still a beast; here's the dev kit:

This decoding work was part of tracking down a nasty bug which caused a subset of links to only work some of the time. The root cause turned out to be a misconfiguration of the switch IC, but the hunt was an interesting dive into the physical layer of modern networking.

Loading the waveforms

That's enough context, back to work!

The oscilloscope doesn't have a built-in QSGMII analyzer (and we'll want to do fairly sophisticated processing of the data), so I wanted to export waveform data to my computer.

How much data should I capture? Analog waveforms can easily add up to multiple gigabytes, so I'd like to capture a small amount while still catching a packet or two.

I knew that a device on the network was emitting about 30K UDP packets per second, or one packet every 33 µs. I configured the oscilloscope to collect 100M samples at 1 TSPS (tera-samples per second, 10^12), which works out to 100 µs of data; this means we should catch 1-3 UDP packets.
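The capture-window arithmetic can be double-checked in a few lines of Python. This is only a back-of-the-envelope sketch; the variable names are made up for illustration and the numbers are the ones quoted above.

```python
# Back-of-the-envelope check of the capture window (names are illustrative).
sample_rate = 1_000_000_000_000   # 1 TSPS, in samples per second
num_samples = 100_000_000         # 100M samples
packet_rate = 30_000              # roughly 30K UDP packets per second

capture_us = num_samples / sample_rate * 1e6   # length of the capture, in microseconds
packet_period_us = 1e6 / packet_rate           # time between packets, in microseconds
packets_in_capture = capture_us / packet_period_us

print(f"capture window: {capture_us:.0f} us")
print(f"packet period:  {packet_period_us:.1f} us")
print(f"packet periods per capture: {packets_in_capture:.1f}")
```

The window spans about three packet periods, so how many whole packets land inside it depends on where the capture happens to start.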

After hunting down a USB key, I ended up with a 191M .wfm file to process.

Fortunately, the .wfm file format is documented by Tektronix.

In about 400 lines of code, I implemented a simple parser using nom (which has great support for this kind of binary format). The parser is overkill: it decodes the entire file based on the specification. In practice, we only care about two things:

- The sample waveform (which is an array of i16)
- The sample rate (in seconds per sample)

It's also possible to write a decoder in three lines of Python, once you know where the data stream starts in the file:

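import numpy as np
# Read the whole capture into memory
data = open('udp-spam.wfm', 'rb').read()
# The data stream starts at byte 904 in this file; the slice also drops the
# final byte. Samples are 16-bit signed integers.
pts = np.frombuffer(data[904:-1], dtype=np.int16)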
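A rough sanity check, using only the numbers quoted so far: 100M samples of two bytes each should account for nearly the whole file. This sketch assumes the "191M" file size is in MiB, and the variable names are illustrative.

```python
# Does the sample payload account for the file size? (illustrative names)
num_samples = 100_000_000      # 100M samples, as configured on the scope
bytes_per_sample = 2           # each sample is an i16

payload_bytes = num_samples * bytes_per_sample
payload_mib = payload_bytes / 2**20

print(f"{payload_bytes} bytes = {payload_mib:.1f} MiB")
```

That comes to roughly 191 MiB, which is consistent with the file being almost entirely sample data plus a small header.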

Plotting a chunk of this data, it looks like what I was seeing on the oscilloscope:

Now, we have to figure out what it means.

QSGMII for dummies

Thinking back to the probes soldered to our PCB, we are probing the link between the main VSC7448 switch and a VSC8504 PHY. The PHY is acting as a port expander: it's taking four ports from the switch (combined into a single Tx/Rx channel), and splitting them into four separate channels.

(This is only necessary because the VSC7448 does not have enough pins to drive all 52 of its ports directly)

The link between the switch and the PHY is using QSGMII, which stands for quad serial gigabit media-independent interface. QSGMII is a protocol for communication between a Media Access Control (MAC) block and an ethernet PHY.

Specifically, QSGMII is a way to pack four SGMII channels into a single Tx/Rx pair (hence the "quad"), running 4× as fast (5 Gbps instead of 1.25 Gbps).
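If the 1.25 Gbps figure for a "gigabit" interface looks odd, the bookkeeping below may help. It is a small sketch with illustrative names, and it leans on the 8b/10b encoding described in the next section, where every 8-bit byte travels as 10 bits on the wire.

```python
# Line-rate bookkeeping for SGMII and QSGMII (names are illustrative).
SGMII_LINE_RATE = 1_250_000_000   # bits per second on the wire, per SGMII channel
CHANNELS = 4                      # the "quad" in QSGMII
DATA_BITS = 8                     # 8b/10b: 8 data bits ...
CODE_GROUP_BITS = 10              # ... are sent as a 10-bit code-group

qsgmii_line_rate = SGMII_LINE_RATE * CHANNELS
data_rate_per_channel = SGMII_LINE_RATE * DATA_BITS // CODE_GROUP_BITS

print(qsgmii_line_rate)        # 5000000000
print(data_rate_per_channel)   # 1000000000
```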

The SGMII and QSGMII standards are both freely available, and are both quite readable (especially compared to IEEE 802.3-2015).

Let's start with the encoding.

8b/10b decoding

Both SGMII and QSGMII use 8b/10b encoding, which is a way to pack a stream of (8-bit) bytes into 10-bit "code-groups" (sometimes called "symbols") with various desirable properties:

On average, there are the same number of 0s and 1s in the stream
There are enough bit transitions to recover the clock
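Both properties are easy to measure on individual code-groups. The helper names below are my own; the sketch counts the imbalance between 1s and 0s and the longest run of identical bits for three well-known 10-bit code-groups.

```python
def disparity(bits: str) -> int:
    """Number of 1s minus number of 0s in a bit string."""
    return bits.count("1") - bits.count("0")

def longest_run(bits: str) -> int:
    """Length of the longest run of identical consecutive bits."""
    best = run = 1
    for prev, cur in zip(bits, bits[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

code_groups = [
    "0011111010",  # K28.5 control code-group, one polarity
    "1100000101",  # K28.5 control code-group, the other polarity
    "1010101010",  # D21.5 data code-group
]
for cg in code_groups:
    print(cg, disparity(cg), longest_run(cg))
```

Each code-group is either balanced or off by two, and the encoder alternates between the two polarities of an unbalanced code-group so that the imbalance cancels out over time.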

The code-groups are all smushed together, so when you look at the raw data, it's not obvious where code-groups begin or end.

To recover the code-group framing, we need to look for comma characters, which are characters of the form 1100000 or 0011111. These are (almost) the only characters which have five 0s or 1s in a row.
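To make the framing idea concrete, here is a toy sketch that works on an already-digitized bit string rather than on voltage samples. The stream is made up (three junk bits, then 1100000101, then 1010101010), and the function name is hypothetical.

```python
COMMAS = ("0011111", "1100000")

def comma_offsets(bits: str) -> list:
    """Return every offset at which a 7-bit comma pattern begins."""
    return [i for i in range(len(bits) - 6) if bits[i:i + 7] in COMMAS]

# Made-up stream: 3 junk bits, a code-group starting with a comma, then
# one more code-group.
stream = "101" + "1100000101" + "1010101010"

offsets = comma_offsets(stream)
start = offsets[0]
# Once we know where one code-group starts, the rest follow every 10 bits.
groups = [stream[i:i + 10] for i in range(start, len(stream) - 9, 10)]

print(offsets)  # [3]
print(groups)   # ['1100000101', '1010101010']
```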

First, let's convert the i16 values to binary:

let pts: Vec<bool> = t.pts.iter().map(|p| *p < 0).collect();

Next, we'll pick out places where the signal changes from 0 to 1 or vice versa:

// Each entry is the index of the last sample before a transition
let crossings: Vec<usize> = pts
    .iter()
    .zip(&pts[1..])
    .enumerate()
    .filter(|(_, (a, b))| a != b)
    .map(|(i, _)| i)
    .collect();
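For readers who would rather poke at this in a REPL, here is the same thresholding and crossing detection as a Python sketch on a handful of invented sample values. The function names are mine, and the sign convention (negative maps to true) simply mirrors the Rust.

```python
def to_bits(samples):
    """Threshold signed samples at zero; negative maps to True."""
    return [s < 0 for s in samples]

def find_crossings(bits):
    """Indices i where bits[i] differs from bits[i + 1]."""
    return [i for i, (a, b) in enumerate(zip(bits, bits[1:])) if a != b]

# Invented sample values, not taken from the real capture
samples = [120, 95, 110, -80, -105, -90, -100, 130, 115, -70]

bits = to_bits(samples)
crossings = find_crossings(bits)
print(crossings)  # [2, 6, 8]
```

Note that each index points at the last sample before the transition, and that the distance between neighbouring crossings is the length of a run of identical samples.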

We know our sample rate (1 TSPS) and the nominal QSGMII bit rate (5 Gbps); this means that a single-bit pulse (e.g. 010) should be a 200-sample pulse. In turn, we expect the run of five identical bits in a comma character to be roughly 1000 samples long (200 × 5).
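Spelled out as arithmetic (a trivial sketch with illustrative names):

```python
sample_rate = 1_000_000_000_000   # 1 TSPS, samples per second
bit_rate = 5_000_000_000          # nominal QSGMII line rate, bits per second

samples_per_bit = sample_rate // bit_rate
comma_run_samples = 5 * samples_per_bit   # five identical bits in a row

print(samples_per_bit, comma_run_samples)  # 200 1000
```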

Here's a chunk of data (normalized to 0-1); can you spot the comma character?

We can detect comma characters by picking out places where the signal doesn't change for at least 900 samples, i.e. 4.5 bit lengths:

// We detect a potential comma if there are >= 4.5 bit lengths between
// transitions, since a comma has 5 identical bits in a row.
let comma_length = samples_per_clock * 9 / 2;
let commas: Vec<usize> = crossings
    .iter()
    .zip(&crossings[1..])
    .filter(|&(a, b)| b - a >= comma_length)
    // Back up by two clock periods, skipping any run that starts too close
    // to the beginning of the capture for that to be possible.
    .filter_map(|(a, _b)| a.checked_sub(samples_per_clock * 2))
    .filter(|a| {
        // A comma is either 0011111 or 1100000. We detected the crossing
        // at the beginning of the 11111 or 00000 run, so we backed up by
        // two clock periods (in the `filter_map` above). Now, we advance
        // by 1/2 clock period to land in the middle of the bit.
        let i = a + samples_per_clock / 2;
        let mut value = 0;
        for j in 0..7 {
            value = (value << 1) | (pts[i + j * samples_per_clock] as u16);
        }
        value == 0b1100000 || value == 0b0011111
    })
    .collect();

(Notice that we check for 0011111/1100000, not just 11111/00000; certain other data patterns can produce a run of 5 identical bits, but lack the leading 00/11)
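Here is a Python sketch of the same detection logic, run on a synthetic waveform so the result can be checked by hand. Everything in it is made up for illustration: the function names, the bit patterns, and the oversampling factor of 4 samples per bit instead of 200.

```python
def oversample(bits: str, samples_per_clock: int) -> list:
    """Repeat each bit samples_per_clock times to mimic an ideal capture."""
    return [b == "1" for b in bits for _ in range(samples_per_clock)]

def detect_commas(pts, samples_per_clock):
    """Return sample indices pointing at the start of each comma pattern."""
    crossings = [i for i, (a, b) in enumerate(zip(pts, pts[1:])) if a != b]
    comma_length = samples_per_clock * 9 // 2  # 4.5 bit lengths
    commas = []
    for a, b in zip(crossings, crossings[1:]):
        if b - a < comma_length:
            continue  # run is too short to be 5 identical bits
        start = a - 2 * samples_per_clock  # back up over the leading 00 or 11
        if start < 0:
            continue  # too close to the start of the capture
        i = start + samples_per_clock // 2  # middle of the first bit
        value = "".join(
            "1" if pts[i + j * samples_per_clock] else "0" for j in range(7)
        )
        if value in ("0011111", "1100000"):
            commas.append(start)
    return commas

spc = 4
# A real comma: the 00000 run is preceded by 11.
good = oversample("1010" + "1100000101" + "1010101010", spc)
# A made-up impostor: the 00000 run is preceded by 01, not 11.
bad = oversample("1010" + "0100000101", spc)

print(detect_commas(good, spc))  # [15]
print(detect_commas(bad, spc))   # []
```

The returned index is the crossing just before the comma's first bit (sample 16 here), matching the convention of the crossing list.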

The comma character is right here in our data:

(I'm going to stop showing the analog trace now, to keep the graphs readable)

Once synchronized with the comma character, we know that bits are spaced one bit period apart (200 samples at our sample rate), so we can read the 10-bit code-group by sampling at that interval. In this case, it's 1100000101.
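A minimal sketch of that naive fixed-interval read, again on a synthetic waveform with 4 samples per bit. The helper names and the starting index are assumptions for the sake of the example.

```python
def oversample(bits: str, samples_per_clock: int) -> list:
    """Repeat each bit samples_per_clock times to mimic an ideal capture."""
    return [b == "1" for b in bits for _ in range(samples_per_clock)]

def read_code_group(pts, start, samples_per_clock):
    """Sample 10 bits at fixed spacing, beginning half a bit after start."""
    first = start + samples_per_clock // 2
    return "".join(
        "1" if pts[first + j * samples_per_clock] else "0" for j in range(10)
    )

spc = 4
pts = oversample("1010" + "1100000101" + "1010101010", spc)
comma_start = 15  # crossing just before the comma's first bit

first_group = read_code_group(pts, comma_start, spc)
second_group = read_code_group(pts, comma_start + 10 * spc, spc)
print(first_group)   # 1100000101
print(second_group)  # 1010101010
```

This works on an ideal waveform because the bit spacing is exactly 4 samples everywhere.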

However, it's not quite that easy!

The oscilloscope and switch may not have exactly the same clock rate. If we go a long time between comma characters, we may end up sampling at the wrong position in the waveform!
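To get a feel for the scale of the problem, here is a rough estimate. The 100 ppm mismatch is purely an assumed figure for illustration, not something measured on this hardware.

```python
samples_per_bit = 200    # 1 TSPS sampling of a 5 Gbps signal
bit_rate = 5e9           # bits per second
mismatch_ppm = 100       # ASSUMED clock mismatch, for illustration only

# Each bit, our sampling point slips by this many samples
drift_per_bit = samples_per_bit * mismatch_ppm / 1e6
# After this many bits we have slipped by half a bit and sit on an edge
bits_until_half_bit = (samples_per_bit / 2) / drift_per_bit
time_until_half_bit_us = bits_until_half_bit / bit_rate * 1e6

print(f"{drift_per_bit:.2f} samples of drift per bit")
print(f"half a bit off after about {bits_until_half_bit:.0f} bits")
print(f"which is about {time_until_half_bit_us:.1f} us of signal")
```

Under that assumption, a fixed-interval sampler would go wrong well within a 100 µs capture.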

It turns out that we need to synchronize in two places:

Comma characters tell us when a new code-group starts
Bit transitions help us keep the clock in sync

Here's what that looks like:

// The comma iterator points to the bit transition at the beginning of the
// comma codegroup, i.e. 0011111 or 1100000
let mut iter_comma = commas.iter().cloned().peekable();
// Index at which to sample the data. We start at a half-cycle offset from
// the bit transition at the beginning of the comma character.
let mut i = iter_comma.next().unwrap() + samples_per_clock / 2;
// The zero-crossing iterator points to bit transitions, so we ca

Source: Hacker News