SBCL Fibers – Lightweight Cooperative Threads

SBCL Fibers are a work-in-progress implementation of lightweight, user-space cooperative threads designed to handle high-concurrency workloads efficiently while maintaining a simple, sequential programming model.
This is a work-in-progress draft documentdescribing lightweight userland cooperative threads for SBCL. The implementation is under active development and details may change. This is a living document — you can view its revision history. You can try it out on thefibers-v2
branch at github.com/atgreen/sbcl.
Table of Contents
Introduction- Motivation: Why Fibers?
- Design Goals
- Terminology
Programming API- Creating and Running Fibers
- Yielding and Waiting
- Fiber Sleep and Timed Waits
- Fiber Parking (Condition-Based Suspend/Resume)
- Fiber Join
- Spawning Fibers from Fibers
- Multi-Carrier Scheduling
run-fibers
(Simple, Blocking)start-fibers
/submit-fiber
/finish-fibers
(Dynamic, Long-Lived)
- Fiber-Aware Blocking Primitives
- Mutex Acquisition
- Condition Variables
- Semaphores
- I/O (
wait-until-fd-usable
) sleep
andwait-for
- Fiber Pinning (Preventing Yield)
- Introspection (
list-all-fibers
,fiber-state
, Backtraces) - Idle Hooks
- API Reference Summary
Architecture Overview- Carrier Threads and Schedulers
- Fiber Lifecycle State Machine
- Data Flow: Yield and Resume
Context Switching- Register Save/Restore Convention
- The
fiber_switch
Assembly Routine - Stack Frame Initialization for New Fibers
- The Entry Trampoline (Assembly to C to Lisp)
- Zero-Allocation Design (No SAPs, No GC Pressure)
- Thread Register Patching on Carrier Migration
Stack Management- Control Stack Layout and Guard Pages
- Binding Stack (Separate Allocation)
- Stack Pooling (
madvise(MADV_DONTNEED)
Recycling) - Stack Overflow Detection in Signal Handlers
- Stack Size Tradeoffs (No Dynamic Growth)
Dynamic Variable Bindings (TLS)- The Problem:
unbind_to_here
Zeroes Entries - TLS Overlay Arrays (Save/Restore Without Unbinding)
-
Carrier Value Update on Migration
-
Catch Block and Unwind-Protect Chain Save/Restore
-
Empty Binding Stack Fast Path
-
Same-Carrier Resume Optimization
-
The Problem: Garbage Collector Integration- The Two-List Design: Suspended Fibers vs. Active Contexts
fiber_gc_info
: Conservative Control Stack Scanningactive_fiber_context
: Carrier + Fiber Stack Visibility- Precise Binding Stack Scavenging
- Persistent Carrier Context (Lock-Free Hot Path)
- The
fiber-context
Thread Struct Slot (O(1) Lookup) - GC Safety Windows (
without-interrupts
vs.without-gcing
) - Correctness Argument: Why Partial Updates Are Safe
Scheduler Design- The Scheduler Loop
- Fast Path: Skip Maintenance When Deque Is Hot
- Maintenance: Pending Queue Drain, Deadline Expiry, Wake Checks
- Post-Switch Dispatch (Suspended, Dead)
- Idle Detection and Carrier Parking
- Maintenance Frequency Backstop (Every 64 Fibers)
Work Stealing- Chase-Lev Lock-Free Deque
- Owner Operations (Push/Pop from Bottom, LIFO)
- Thief Operations (Steal from Top, FIFO, CAS)
- Buffer Growth (Power-of-Two Circular Array)
- Random Victim Selection
- Fiber Migration and Thread Register Fixup
I/O Multiplexing- Platform Abstraction (epoll, kqueue, poll Fallback)
- Edge-Triggered Mode with One-Shot (
EPOLLET | EPOLLONESHOT
) - Indexed I/O Waiters (fd-to-Fiber Table)
- Bounded Epoll Drain Loop
fd-ready-p
andwait-until-fd-usable
Integration- The Default I/O Idle Hook
- Batched FD Polling vs. Per-Fiber Polling
Deadline Scheduling- Binary Min-Heap with Inline Index
- O(log N) Insert and Remove
- Batch Expiry (Pop All Expired in One Pass)
- Interaction with I/O Waiters (Dual-Indexed Fibers)
Fiber Death and Cleanup- The Lisp Trampoline (
fiber-trampoline
) - Error Handling and Result Capture
- Binding Stack Cleanup on Death (Without
unbind_to_here
) - GC Info Unregistration Timing
-
Stack Return to Pool
-
The Lisp Trampoline ( Integration with SBCL- Thread Struct Extension (
fiber-context
Slot) serve-event
Dispatch (Fiber-Awarewait-until-fd-usable
)sleep
andwait-for
Dispatch- Mutex and Condition Variable Dispatch
- Pinned Blocking Fallback to OS Primitives
*pinned-blocking-action*
Warning/Error Policyinterrupt-thread
-
Debugger Integration
-
Thread Struct Extension ( Performance- HTTP Benchmark
-
Memory Efficiency
-
Scalability Under High Connection Counts
-
Context Switch Microbenchmark
-
GC Impact
Platform Support- x86-64 (Linux, macOS, Windows)
-
ARM64
-
ARM32
-
PPC64
-
PPC32
-
RISC-V
-
Feature Flag:
:sb-fiber -
Platform-Specific Assembly and I/O Backends
Appendix A: Using Hunchentoot with Fibers
- The Fiber Taskmaster
- Taskmaster Methods
- Starting the Server
- How It Works
- SSL
- Considerations
1. Introduction
1.1 Motivation: Why Fibers?
Many server workloads are concurrent but not parallel. A web server handling 10,000 connections spends almost all of its time waiting for network I/O; the actual computation per request is trivial. The natural programming model is one thread of control per connection — read a request, compute a response, write it back — but OS threads are too expensive to use this way at scale.
Each OS thread in SBCL carries a full-sized control stack (typically
8 MB), a binding stack, signal handling infrastructure, and a kernel
task_struct. Creating a thread requires mmap
, clone
, and TLS setup; destroying one requires the reverse. Context switching between threads requires a kernel transition, TLB management, and scheduler bookkeeping. At 10,000 concurrent connections, this means 80 GB of virtual address space for stacks alone, and the kernel scheduler — designed for dozens to hundreds of runnable tasks — begins to degrade.
The conventional alternative is event-driven programming: a single
thread multiplexes connections using epoll
or kqueue
, dispatching callbacks on I/O readiness. This scales well but inverts control flow. Sequential logic must be broken into state machines or continuation chains. Error handling, resource cleanup, and debugging all become harder. Backtraces show the event loop, not the logical call stack of the request being served.
Fibers offer a third option. A fiber is a user-space thread with its own control stack and binding stack, scheduled cooperatively by a library-level scheduler rather than the kernel. Fibers preserve the sequential programming model — the code reads top-to-bottom like a normal thread — while achieving the resource efficiency of event-driven I/O. A fiber’s control stack is 256 KB by default (not 8 MB), context switching saves and restores six registers in user space (not a full kernel transition), and thousands of fibers can share a small pool of OS carrier threads.
Common Lisp complicates this picture because the language carries
more implicit per-thread state than most: special variable bindings
live in a separate binding stack, non-local exits thread through
catch tags and unwind-protect
chains, and all of these must be
saved and restored correctly across suspension points. A fiber
implementation that only switches the control stack will corrupt this
state. SBCL’s fiber implementation handles all of it: dynamic
bindings made within a fiber are local to that fiber,
unwind-protect
cleanup forms run on fiber death, and
handler-case
/handler-bind
work as expected.
1.2 Design Goals
The implementation is guided by several priorities, roughly in order:
Correctness under GC. SBCL uses a generational, compacting,
stop-the-world garbage collector. The GC must be able to find every
live Lisp object, including those on suspended fiber stacks and in
fiber binding stacks. Getting this wrong means silent heap corruption.
Every design decision in the fiber runtime is constrained by the
requirement that GC can fire at almost any point (the only exclusion is
without-gcing
regions) and must see a consistent view of all fiber state.
Transparent integration. Existing SBCL code should work inside
fibers without modification. grab-mutex
, condition-wait
,
wait-until-fd-usable
, sleep
, and wait-for
all detect when they ar
Source: Hacker News










