SBCL Fibers – Lightweight Cooperative Threads

SBCL Fibers are a work-in-progress implementation of lightweight, user-space cooperative threads designed to handle high-concurrency workloads efficiently while maintaining a simple, sequential programming model.

This is a work-in-progress draft documentdescribing lightweight userland cooperative threads for SBCL. The implementation is under active development and details may change. This is a living document — you can view its revision history. You can try it out on thefibers-v2

branch at github.com/atgreen/sbcl.

Introduction- Motivation: Why Fibers?

Design Goals
Terminology

Programming API- Creating and Running Fibers

Yielding and Waiting
Fiber Sleep and Timed Waits
Fiber Parking (Condition-Based Suspend/Resume)
Fiber Join
Spawning Fibers from Fibers
Multi-Carrier Scheduling run-fibers

(Simple, Blocking)start-fibers

/submit-fiber

/finish-fibers

(Dynamic, Long-Lived)

Fiber-Aware Blocking Primitives
Mutex Acquisition
Condition Variables
Semaphores
I/O ( wait-until-fd-usable

) sleep

andwait-for

Fiber Pinning (Preventing Yield)
Introspection ( list-all-fibers

,fiber-state

, Backtraces) - Idle Hooks

API Reference Summary

Architecture Overview- Carrier Threads and Schedulers

Fiber Lifecycle State Machine
Data Flow: Yield and Resume

Context Switching- Register Save/Restore Convention

The fiber_switch

Assembly Routine - Stack Frame Initialization for New Fibers

The Entry Trampoline (Assembly to C to Lisp)
Zero-Allocation Design (No SAPs, No GC Pressure)
Thread Register Patching on Carrier Migration

Stack Management- Control Stack Layout and Guard Pages

Binding Stack (Separate Allocation)
Stack Pooling ( madvise(MADV_DONTNEED)

Recycling) - Stack Overflow Detection in Signal Handlers

Stack Size Tradeoffs (No Dynamic Growth)

Dynamic Variable Bindings (TLS)- The Problem: unbind_to_here

Zeroes Entries - TLS Overlay Arrays (Save/Restore Without Unbinding)

Carrier Value Update on Migration
Catch Block and Unwind-Protect Chain Save/Restore
Empty Binding Stack Fast Path
Same-Carrier Resume Optimization
The Problem: Garbage Collector Integration- The Two-List Design: Suspended Fibers vs. Active Contexts fiber_gc_info

: Conservative Control Stack Scanningactive_fiber_context

: Carrier + Fiber Stack Visibility- Precise Binding Stack Scavenging

Persistent Carrier Context (Lock-Free Hot Path)
The fiber-context

Thread Struct Slot (O(1) Lookup) - GC Safety Windows ( without-interrupts

vs.without-gcing

) - Correctness Argument: Why Partial Updates Are Safe

Scheduler Design- The Scheduler Loop

Fast Path: Skip Maintenance When Deque Is Hot
Maintenance: Pending Queue Drain, Deadline Expiry, Wake Checks
Post-Switch Dispatch (Suspended, Dead)
Idle Detection and Carrier Parking
Maintenance Frequency Backstop (Every 64 Fibers)

Work Stealing- Chase-Lev Lock-Free Deque

Owner Operations (Push/Pop from Bottom, LIFO)
Thief Operations (Steal from Top, FIFO, CAS)
Buffer Growth (Power-of-Two Circular Array)
Random Victim Selection
Fiber Migration and Thread Register Fixup

I/O Multiplexing- Platform Abstraction (epoll, kqueue, poll Fallback)

Edge-Triggered Mode with One-Shot ( EPOLLET | EPOLLONESHOT

) - Indexed I/O Waiters (fd-to-Fiber Table)

Bounded Epoll Drain Loop fd-ready-p

andwait-until-fd-usable

Integration- The Default I/O Idle Hook

Batched FD Polling vs. Per-Fiber Polling

Deadline Scheduling- Binary Min-Heap with Inline Index

O(log N) Insert and Remove
Batch Expiry (Pop All Expired in One Pass)
Interaction with I/O Waiters (Dual-Indexed Fibers)

Fiber Death and Cleanup- The Lisp Trampoline ( fiber-trampoline

) - Error Handling and Result Capture

Binding Stack Cleanup on Death (Without unbind_to_here

) - GC Info Unregistration Timing

Stack Return to Pool
The Lisp Trampoline ( Integration with SBCL- Thread Struct Extension ( fiber-context

Slot) serve-event

Dispatch (Fiber-Awarewait-until-fd-usable

)sleep

andwait-for

Dispatch- Mutex and Condition Variable Dispatch

Pinned Blocking Fallback to OS Primitives *pinned-blocking-action*

Warning/Error Policyinterrupt-thread

Debugger Integration
Thread Struct Extension ( Performance- HTTP Benchmark
Memory Efficiency
Scalability Under High Connection Counts
Context Switch Microbenchmark
GC Impact

Platform Support- x86-64 (Linux, macOS, Windows)

ARM64
ARM32
PPC64
PPC32
RISC-V
Feature Flag: :sb-fiber
Platform-Specific Assembly and I/O Backends

Appendix A: Using Hunchentoot with Fibers

The Fiber Taskmaster
Taskmaster Methods
Starting the Server
How It Works
SSL
Considerations

1. Introduction

1.1 Motivation: Why Fibers?

Many server workloads are concurrent but not parallel. A web server handling 10,000 connections spends almost all of its time waiting for network I/O; the actual computation per request is trivial. The natural programming model is one thread of control per connection — read a request, compute a response, write it back — but OS threads are too expensive to use this way at scale.

Each OS thread in SBCL carries a full-sized control stack (typically 8 MB), a binding stack, signal handling infrastructure, and a kernel task_struct. Creating a thread requires mmap

, clone

, and TLS setup; destroying one requires the reverse. Context switching between threads requires a kernel transition, TLB management, and scheduler bookkeeping. At 10,000 concurrent connections, this means 80 GB of virtual address space for stacks alone, and the kernel scheduler — designed for dozens to hundreds of runnable tasks — begins to degrade.

The conventional alternative is event-driven programming: a single thread multiplexes connections using epoll

or kqueue

, dispatching callbacks on I/O readiness. This scales well but inverts control flow. Sequential logic must be broken into state machines or continuation chains. Error handling, resource cleanup, and debugging all become harder. Backtraces show the event loop, not the logical call stack of the request being served.

Fibers offer a third option. A fiber is a user-space thread with its own control stack and binding stack, scheduled cooperatively by a library-level scheduler rather than the kernel. Fibers preserve the sequential programming model — the code reads top-to-bottom like a normal thread — while achieving the resource efficiency of event-driven I/O. A fiber’s control stack is 256 KB by default (not 8 MB), context switching saves and restores six registers in user space (not a full kernel transition), and thousands of fibers can share a small pool of OS carrier threads.

Common Lisp complicates this picture because the language carries more implicit per-thread state than most: special variable bindings live in a separate binding stack, non-local exits thread through catch tags and unwind-protect

chains, and all of these must be saved and restored correctly across suspension points. A fiber implementation that only switches the control stack will corrupt this state. SBCL’s fiber implementation handles all of it: dynamic bindings made within a fiber are local to that fiber, unwind-protect

cleanup forms run on fiber death, and handler-case

/handler-bind

work as expected.

1.2 Design Goals

The implementation is guided by several priorities, roughly in order:

Correctness under GC. SBCL uses a generational, compacting, stop-the-world garbage collector. The GC must be able to find every live Lisp object, including those on suspended fiber stacks and in fiber binding stacks. Getting this wrong means silent heap corruption. Every design decision in the fiber runtime is constrained by the requirement that GC can fire at almost any point (the only exclusion is without-gcing

regions) and must see a consistent view of all fiber state.

Transparent integration. Existing SBCL code should work inside fibers without modification. grab-mutex

, condition-wait

, wait-until-fd-usable

, sleep

, and wait-for

all detect when they ar

Source: Hacker News