Linux extreme performance H1 load generator

gcannon is a high-performance HTTP/1.1 and WebSocket load generator built on Linux io_uring, offering microsecond-resolution latency tracking and extreme throughput.


Official load generator for Http Arena

Requires Linux 6.1+, gcc, liburing-dev 2.5+

sudo apt install build-essential liburing-dev

git clone https://github.com/MDA2AV/gcannon.git && cd gcannon && make

Designed to be the fastest HTTP load generator available: it is built on io_uring's batched async I/O to maximize requests per second from a single machine.

Per-request latency tracking with microsecond-resolution histograms via CLOCK_MONOTONIC. Every response is recorded, so percentiles are computed from the full set of responses rather than estimated from a sample.

Pass --tui for a rich terminal interface with live progress, throughput graph, and colored results.

During execution, the TUI displays a progress bar, real-time throughput stats, and a sparkline graph showing req/s over time. The display updates every second.

Percentile latencies are displayed in a clean box-drawn table with color coding: cyan for normal, yellow for p99, red for p99.9.

When using -r N, each connection closes and reconnects after N request/response pairs. The results show the total reconnect count and confirm that every response was latency-sampled.

The histogram automatically zooms into your data range. Bucket boundaries are computed from p0 to p99.9 of the actual latency distribution, divided into equal-width slices. Control granularity with -b.
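As a rough illustration of the equal-width slicing described above, the sketch below maps a latency value to a display bucket. The function name display_bucket, its parameters, and the clamping of out-of-range values are assumptions made for this example, not gcannon's actual code; lo_us and hi_us stand in for the p0 and p99.9 latencies, and n for the bucket count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: maps a latency (in microseconds) to one of n
 * equal-width display buckets spanning [lo_us, hi_us]. Values outside
 * the range are clamped to the first or last bucket (an assumption). */
uint32_t display_bucket(uint64_t us, uint64_t lo_us, uint64_t hi_us, uint32_t n)
{
    assert(n > 0 && hi_us >= lo_us);

    if (us <= lo_us)
        return 0;
    if (us >= hi_us)
        return n - 1;

    /* Here lo_us < us < hi_us, so span is non-zero and the result is < n. */
    uint64_t span = hi_us - lo_us;
    return (uint32_t)(((us - lo_us) * n) / span);
}
```

With lo_us = 100, hi_us = 200 and n = 10, each bucket is 10 μs wide, so a 150 μs sample lands in bucket 5.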

Results are saved after every run to ~/.gcannon/history.bin (up to 100 runs). In TUI mode, bar graphs show req/s and avg latency trends across the last 10 runs. The current run is highlighted in green.

When using multiple --raw request files, pass --per-tpl-latency to track latency histograms per template. Each template gets its own percentile breakdown (avg, p50, p99, p99.9).

Machine-readable output for scripts, CI pipelines, and dashboards. Pass --json to get a single JSON object on stdout with no banner or progress output.

In WebSocket mode (--ws), additional ws_upgrades and ws_frames fields are included.

Latency numbers are only useful if they're measured correctly. Glass Cannon tracks per-request latency with microsecond resolution using a two-tier histogram.

All timestamps use the Linux kernel's CLOCK_MONOTONIC via clock_gettime(). On modern x86_64, this reads the TSC register via the kernel's vDSO, so it doesn't even require a syscall — it's a fast userspace read with nanosecond resolution.
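A minimal sketch of this kind of timestamping, using the standard POSIX clock_gettime() call. The helper names now_ns and elapsed_us are illustrative only and are not taken from the gcannon source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic timestamp in nanoseconds. CLOCK_MONOTONIC is never set
 * backwards, so the difference between two readings is a valid
 * elapsed time. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: elapsed whole microseconds between two
 * now_ns() readings, truncating any sub-microsecond remainder. */
uint64_t elapsed_us(uint64_t start_ns, uint64_t end_ns)
{
    assert(end_ns >= start_ns);
    return (end_ns - start_ns) / 1000u;
}
```

A request would be timed by calling now_ns() just before the send and again when the response completes, then passing both readings to elapsed_us().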

Each latency sample is recorded in a two-tier histogram. Tier 1 covers 0–10ms at 1μs resolution (10,000 buckets). Tier 2 covers 10ms–5s at 100μs resolution (49,900 buckets). Percentiles are therefore accurate to the bucket width (1μs below 10ms, 100μs above it) without storing individual samples or doing any heap allocation.
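The two-tier layout can be sketched as follows. The bucket counts and widths come from the description above; the struct and function names, the overflow counter for samples of 5 s or more, and the choice to report a bucket's lower bound are assumptions made for this example and may differ from gcannon's implementation.

```c
#include <assert.h>
#include <stdint.h>

#define T1_BUCKETS  10000u    /* 0 to 10 ms in 1 us steps     */
#define T2_BUCKETS  49900u    /* 10 ms to 5 s in 100 us steps */
#define T1_LIMIT_US 10000u
#define T2_STEP_US  100u
#define MAX_US      5000000u

/* Hypothetical struct: fixed-size arrays, so recording never allocates. */
struct latency_hist {
    uint64_t tier1[T1_BUCKETS];
    uint64_t tier2[T2_BUCKETS];
    uint64_t overflow;        /* samples >= 5 s (assumed handling) */
    uint64_t count;
};

/* Record one latency sample, given in microseconds. */
void hist_record(struct latency_hist *h, uint64_t us)
{
    if (us < T1_LIMIT_US)
        h->tier1[us]++;
    else if (us < MAX_US)
        h->tier2[(us - T1_LIMIT_US) / T2_STEP_US]++;
    else
        h->overflow++;
    h->count++;
}

/* Lower bound (in us) of the bucket that holds quantile q, 0 < q <= 1. */
uint64_t hist_percentile(const struct latency_hist *h, double q)
{
    assert(h->count > 0 && q > 0.0 && q <= 1.0);

    /* Rank of the target sample: ceil(q * count), at least 1. */
    double exact = q * (double)h->count;
    uint64_t target = (uint64_t)exact;
    if ((double)target < exact)
        target++;
    if (target == 0)
        target = 1;

    uint64_t seen = 0;
    for (uint32_t i = 0; i < T1_BUCKETS; i++) {
        seen += h->tier1[i];
        if (seen >= target)
            return i;
    }
    for (uint32_t i = 0; i < T2_BUCKETS; i++) {
        seen += h->tier2[i];
        if (seen >= target)
            return T1_LIMIT_US + (uint64_t)i * T2_STEP_US;
    }
    return MAX_US;            /* target falls in the overflow range */
}
```

The struct is roughly 480 KB, so in this sketch it is best declared with static storage (or one per worker thread) rather than on the stack. Recording is a single array increment, and a percentile lookup is a linear scan over at most 59,900 counters.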

Source: Hacker News