Zero-Cost POSIX Compliance: Encoding the Socket State Machine in Lean's Types

Learn how Lean 4 uses dependent types to enforce POSIX socket protocols at compile time with zero runtime overhead, making logic errors unrepresentable.

The best runtime check is the one that never runs.

The problem

The POSIX socket API is a state machine. A socket must be created, then bound, then set to listen, before it can accept connections. Calling operations in the wrong order — send on an unbound socket, accept before listen, close twice — is undefined behaviour in C.

Every production socket library deals with this in one of three ways:

Runtime checks— assert the state at every call, throw on violation (Python, Java, Go).
Documentation— trust the programmer to read the man page (C, Rust).
Ignore it— let the OS return EBADF and hope someone checks the return code.

All three push the bug to runtime. Lean 4 offers a fourth option: make the bug unrepresentable at the type level, then erase the proof at compile time so the generated code is identical to raw C.

The state machine

Five states, seven transitions, and one proof obligation. That is the entire POSIX socket protocol. Let us encode it.

Step 1: States as an inductive type

inductive SocketState where
  | fresh -- socket() returned, nothing else happened
  | bound -- bind() succeeded
  | listening -- listen() succeeded
  | connected -- connect() or accept() produced this socket
  | closed -- close() succeeded — terminal state
  deriving DecidableEq

DecidableEq gives us by decide for free — the compiler can prove any two concrete states are distinct without any user effort.

Step 2: The socket carries its state as a phantom parameter

structure Socket (state : SocketState) where
  protected mk :: raw : RawSocket -- opaque FFI handle (lean_alloc_external)

The state parameter exists only at the type level. It is erased at runtime: a Socket .fresh and a Socket .connected have the exact same memory layout (a single pointer to the OS file descriptor). Zero overhead.

The constructor is protected to prevent casual state fabrication.

Step 3: Each function declares its pre- and post-state

-- Creation: produces .fresh def socket (fam : Family) (typ : SocketType) : IO (Socket .fresh)

-- Binding: requires .fresh, produces .bound def bind (s : Socket .fresh) (addr : SockAddr) : IO (Socket .bound)

-- Listening: requires .bound, produces .listening def listen (s : Socket .bound) (backlog : Nat) : IO (Socket .listening)

-- Accepting: requires .listening, produces .connected def accept (s : Socket .listening) : IO (Socket .connected × SockAddr)

-- Connecting: requires .fresh, produces .connected def connect (s : Socket .fresh) (addr : SockAddr) : IO (Socket .connected)

-- Sending/receiving: requires .connected def send (s : Socket .connected) (data : ByteArray) : IO Nat def recv (s : Socket .connected) (maxlen : Nat) : IO ByteArray

The Lean 4 kernel threads these constraints through the program. If you write send freshSocket data, the kernel sees Socket .fresh where it expects Socket .connected, and reports a type error. No runtime check. No assertion. No exception. No branch in the generated code.

Step 4: Double-close prevention via proof obligation

def close (s : Socket state) (_h : state ≠ .closed := by decide) : IO (Socket .closed)

This is where dependent types shine brightest. The second parameter is a proof that the socket is not already closed. Let us trace what happens for each concrete state:

| Call | Proof obligation | by decide | Result | |---|---|---|---| | close (s : Socket .fresh) | .fresh ≠ .closed | trivially true | compiles | | close (s : Socket .bound) | .bound ≠ .closed | trivially true | compiles | | close (s : Socket .listening) | .listening ≠ .closed | trivially true | compiles | | close (s : Socket .connected) | .connected ≠ .closed | trivially true | compiles | | close (s : Socket .closed) | .closed ≠ .closed | impossible | type error |

For the first four, the default tactic by decide discharges the proof automatically — the caller writes nothing. For the fifth, the proposition .closed ≠ .closed is logically false: no proof can exist, so the program is rejected at compile time.

The proof is erased during compilation. The generated C code is:

lean_object* close(lean_object* socket) {
    close_fd(lean_get_external_data(socket));
    return lean_io_result_mk_ok(lean_box(0));
}

No branch. No flag. No state field. The proof did its job during type-checking and vanished.

What the compiler actually rejects

-- ❌ send on a fresh socket
let s ← socket .inet .stream
send s "hello".toUTF8 
-- Error: type mismatch — expected Socket .connected, got Socket .fresh

-- ❌ accept before listen
let s ← socket .inet .stream
let s ← bind s addr
accept s 
-- Error: type mismatch — expected Socket .listening, got Socket .bound

-- ❌ double close
let s ← socket .inet .stream
let s ← close s
close s 
-- Error: state ≠ .closed — proposition .closed ≠ .closed is false

-- ✅ correct sequence
let s ← socket .inet .stream
let s ← bind s ⟨"0.0.0.0", 8080⟩
let s ← listen s 128
let (conn, addr) ← accept s
let _ ← send conn "HTTP/1.1 200 OK\r\n\r\n".toUTF8
let _ ← close conn
let _ ← close s

The punchline

| Approach | Lines of state-checking code | Runtime cost | Catches at | |---|---|---|---| | C (man page) | 0 | 0 | never (UB) | | Python (runtime) | ~50 | branch per call | runtime | | Rust (typestate) | ~30 | 0 | compile-time | | Lean 4 (dependent types) | 0 | 0 | compile-time |

Lean 4 is unique: the proof obligation state ≠ .closed is a real logical proposition that the kernel verifies. It is not a lint, not a static analysis heuristic, not a convention. It is a mathematical proof of protocol compliance, checked by the same kernel that verifies Mathlib's theorems — and then thrown away so the generated code runs at the speed of C.

Source: Hacker News