Show HN: AI Subroutines – Run automation scripts inside your browser tab

AI Subroutines sidestep the authentication problem in web automation by running recorded scripts directly inside the page context, which cuts token cost and improves reliability compared with driving every action through an AI agent.

AI Subroutines: Browser Automations That Run Inside the Page

Most web agents solve the wrong half of the problem. You can get an LLM to post on X, DM on Instagram, or send a LinkedIn connection request — once. The moment you need to do it a thousand times, the economics break: tokens per invocation, latency per invocation, non-determinism per invocation. On outreach, CRM updates, and bulk posting, "the agent clicked the wrong button this time" is not a quirk. It's a failure mode.

The obvious fix is to skip the UI and call the site's internal API directly. That's correct, and it's where most "just call the API" projects die. Because the hard problem isn't the endpoint. It's auth.

Auth is the actual hard problem

Authenticated web requests carry some combination of cookies, rotating CSRF tokens, session tokens, bearer headers, anti-replay nonces, fingerprint-bound parameters, and request-signing hashes computed in the site's own JS at request time. Some are set by the server. Some are derived in the browser. Some rotate per request.

Out-of-process scrapers (Node workers, Playwright workers, cloud functions) have to rebuild all of that out of band. That reconstruction is what breaks the moment a site rotates a header or ships a new signing scheme. Most HAR-replay tooling ends its useful life right here.

The trick: record in the extension, replay inside the webpage

In rtrvr, both the recording and the replay happen inside the user's browser, from within the webpage itself.

The extension intercepts the network requests the tab makes while you perform the task. Two layers: a MAIN-world fetch/XHR patch installed before any page script runs, with Chrome's webRequest API as a correlated fallback for the CORS and service-worker paths the in-page patch can't see. Request bodies — FormData, Blob, raw bytes, not just JSON — are captured too.
When the script runs later, those requests are dispatched from the page's own execution context — same origin, same cookies, same TLS session, same JS that computes the signed headers.
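
One way to picture the in-page half of the recording layer is a wrapper around `fetch` that logs each call before forwarding it unchanged. This is a minimal sketch, not rtrvr's implementation: `installFetchRecorder` and the log entry fields are hypothetical names, and the XHR patch and the webRequest correlation are left out.

```javascript
// Hypothetical sketch: wrap target.fetch so every call is logged, then
// forwarded to the original implementation without modification.
function installFetchRecorder(target, log) {
  const originalFetch = target.fetch;

  target.fetch = async function recordedFetch(input, init = {}) {
    const entry = {
      // input may be a string, a URL object, or a Request-like object
      url: typeof input === "string" ? input : (input.url ?? String(input)),
      method: String(init.method ?? input.method ?? "GET").toUpperCase(),
      body: init.body ?? null,
      startedAt: Date.now(),
      status: null, // filled in once the response arrives
    };
    log.push(entry);

    const response = await originalFetch.call(target, input, init);
    entry.status = response.status;
    return response;
  };

  // Return an uninstall function so the patch can be removed after recording.
  return () => {
    target.fetch = originalFetch;
  };
}
```

In a page, `target` would be `window`; passing it in explicitly keeps the sketch testable outside a browser.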

No Puppeteer driver. No headless worker. No separate TLS stack. The browser does what it always does: attach the cookies, run the site's own JS to compute the headers, ship the request.

Auth, CSRF, signing, and fingerprinting all propagate for free. The agent never touches any of it. No key extraction, no session rebuild, no proxy rotation.
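
The replay side can be sketched the same way. The point is what the code does not contain: no cookie jar, no token store, no signing logic. This is an illustrative sketch under assumptions: `replayRequest` and the shape of `recorded` are hypothetical, and only the headers the page set explicitly are assumed to be stored.

```javascript
// Hypothetical sketch: re-issue a recorded request from inside the page.
// Cookies are never stored or passed in; the browser attaches them itself.
async function replayRequest(recorded, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(recorded.url, {
    method: recorded.method,
    headers: recorded.headers,
    body: recorded.body,
    credentials: "include", // send the session cookies the tab already holds
  });
  if (!response.ok) {
    throw new Error(`Replay failed with status ${response.status}`);
  }
  return response;
}
```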

This sounds like a footnote. It's the whole architecture.

Ranking and trimming the network capture

There's a second problem hiding inside "just record the network." A typical minute of browsing fires dozens to hundreds of requests per tab — analytics beacons, RUM pings, feature-flag polls, third-party pixels, prefetches, media chunks, hot-module reload pokes. The actual API call you care about is often 3 requests out of 300.

You cannot hand all of that to an LLM to figure out which one is the tool. It does not fit in the context window, and even if you paid to stretch it, the signal drowns in the noise.

So before the generator sees anything, we rank and trim the capture. Requests are scored on a handful of weighted signals:

First-party vs. third-party origin (+20 / −15). A known telemetry host (Sentry, Segment, Hotjar, RUM collectors, the usual suspects) is a flat −80.
Temporal correlation to the DOM event (+28 within 800ms, +16 within 2.5s). A POST that fires 40ms after you click "Send" is almost certainly the send.
Method and payload shape (mutating POST/PUT/PATCH/DELETE: +35; GET: +5; with a request body: +8; OPTIONS/HEAD/perf entries: −40).
Response quality (2xx: +12; 4xx+: −25; non-empty body: +4).
Volatile operation identifiers (−18). Requests that carry a GraphQL queryId, doc_id, operationHash, or any build-specific hash in the URL or body are penalized, because those identifiers can change whenever the site ships a new build.
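
The signals above can be read as an additive scoring function. The following is a minimal sketch under that assumption, using the weights listed; the field names (`msAfterEvent`, `requestBody`, `responseBody`), the telemetry host list, the exact-hostname first-party check, and treating the telemetry penalty as an early return are all illustrative choices, not rtrvr's actual ranker.

```javascript
// Illustrative list; a real ranker would carry a much longer one.
const TELEMETRY_HOSTS = ["sentry.io", "segment.io", "hotjar.com"];
const VOLATILE_ID = /queryId|doc_id|operationHash/;

function scoreRequest(req, pageHost) {
  const host = new URL(req.url).hostname;

  // Known telemetry hosts get a flat score, regardless of other signals.
  if (TELEMETRY_HOSTS.some((h) => host === h || host.endsWith("." + h))) {
    return -80;
  }

  // Origin: simplified to an exact hostname match.
  let score = host === pageHost ? 20 : -15;

  // Temporal correlation to the triggering DOM event.
  if (req.msAfterEvent != null) {
    if (req.msAfterEvent <= 800) score += 28;
    else if (req.msAfterEvent <= 2500) score += 16;
  }

  // Method and payload shape.
  const method = req.method.toUpperCase();
  if (["POST", "PUT", "PATCH", "DELETE"].includes(method)) score += 35;
  else if (method === "GET") score += 5;
  else if (method === "OPTIONS" || method === "HEAD") score -= 40;
  if (req.requestBody) score += 8;

  // Response quality.
  if (req.status != null) {
    if (req.status >= 200 && req.status < 300) score += 12;
    else if (req.status >= 400) score -= 25;
  }
  if (req.responseBody) score += 4;

  // Volatile operation identifiers (string bodies assumed).
  if (VOLATILE_ID.test(req.url) || VOLATILE_ID.test(req.requestBody || "")) {
    score -= 18;
  }
  return score;
}
```

With this reading, a first-party POST 80ms after a click scores 83 on origin, timing, and method alone (20 + 28 + 35); the response and body bonuses would stack on top if the signals are simply summed, which is an assumption here.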

Concretely: a first-party mutating POST that fires 80ms after a click lands around +83 on origin, timing, and method alone (20 + 28 + 35), before any bonus for a 200 response or a body. A generic analytics beacon is −80. Everything in between gets ordered and the top five survive. Those five plus the DOM interactions around them get rendered into a 12,000-character context for the generator.
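
The trimming step itself is small. A hedged sketch of it, assuming each request already carries a `score`: the line format, the `domEvents` shape, and the hard truncation at the budget are illustrative assumptions, since the post only states the top-five cut and the 12,000-character size.

```javascript
const TOP_N = 5;
const CONTEXT_BUDGET = 12000; // characters, per the figure above

// Hypothetical sketch: keep the highest-scoring requests, render them with the
// surrounding DOM interactions, and cap the result at the character budget.
function buildGeneratorContext(scoredRequests, domEvents) {
  const kept = [...scoredRequests]
    .sort((a, b) => b.score - a.score)
    .slice(0, TOP_N);

  const lines = [
    "# DOM interactions",
    ...domEvents.map((e) => `${e.type} ${e.target}`),
    "# Candidate requests",
    ...kept.map((r) =>
      `[${r.score}] ${r.method} ${r.url} ${r.requestBody ?? ""}`.trim()
    ),
  ];

  // Naive cap: cut at the budget even if that splits a line.
  return lines.join("\n").slice(0, CONTEXT_BUDGET);
}
```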

Subroutines are tool calls, not macros

A recorded task — a Subroutine — is registered as a callable tool in the agent's tool set, next to search and fetch.

Point the agent at a sheet of 500 rows. It picks parameters per row. The Subroutine runs. The LLM is invoked exactly once per row — for parameter selection — and the action itself is a script.

Zero token cost on the hot path. The replay is a fetch, not an inference.
Deterministic. Same input, same output, every time.
Low detection surface. Requests come from the same origin, with the same headers, in the same user session that sent the original.
LLM-callable from natural language. The agent reaches for a Subroutine the same way it reaches for any other tool.
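
The per-row flow can be sketched as a loop in which the model is consulted once for parameters and the Subroutine does the rest. All names here (`registerSubroutine`, `runOverRows`, `pickParams`) are hypothetical stand-ins for illustration, not rtrvr's tool API.

```javascript
// Hypothetical sketch: a tool registry plus a per-row driver loop.
function registerSubroutine(tools, name, run) {
  tools.set(name, run);
}

async function runOverRows(rows, tools, toolName, pickParams) {
  const run = tools.get(toolName);
  if (!run) throw new Error(`Unknown tool: ${toolName}`);

  const results = [];
  for (const row of rows) {
    const params = await pickParams(row); // the only model call for this row
    results.push(await run(params));      // scripted replay, no inference
  }
  return results;
}
```

Here `pickParams` stands in for the LLM call that maps a sheet row to Subroutine arguments; everything after it is plain script execution.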

Inside a Subroutine: the `rtrvr` helpers

A Subroutine is a small async JavaScript function that runs in the tab. The parameters the agent passes are injected as const declarations above your code. Inside the body, an rtrvr.* helper namespace covers the common moves you need on real sites:
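
To make the "injected as const declarations" step concrete, here is one way such injection could work. This is a sketch of the general technique, not rtrvr's mechanism: `compileSubroutine` is a hypothetical name, parameters are assumed to be JSON-serializable, and a real extension would likely need a different injection path because content security policies often block dynamic code evaluation.

```javascript
// The constructor for async functions is not a global, so derive it.
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

// Hypothetical sketch: prepend one `const` per parameter to the Subroutine body
// and compile the result into an async function that receives `rtrvr`.
function compileSubroutine(body, params) {
  const prelude = Object.entries(params)
    .map(([name, value]) => {
      // Reject names that are not plain identifiers to avoid code injection.
      if (!/^[A-Za-z_$][\w$]*$/.test(name)) {
        throw new Error(`Invalid parameter name: ${name}`);
      }
      // JSON.stringify yields a valid JS literal for JSON-serializable values.
      return `const ${name} = ${JSON.stringify(value)};`;
    })
    .join("\n");

  return new AsyncFunction("rtrvr", `${prelude}\n${body}`);
}
```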

| Helper | Use |
|---|---|
| rtrvr.find({ role, name, text, placeholder }) | Find a semantic page target |
| rtrvr.click(handleOrTarget) | Click a previously found handle |
| rtrvr.type(handleOrTarget, value, { clear, submit }) | Type into inputs or editors |
| rtrvr.waitFor(targetOrFn, { timeoutMs }) | Wait for the next UI state |
| rtrvr.request(url, init) | Make authenticated in-page requests |
| rtrvr.getCsrfToken() | Read the current-page CSRF token |
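
Put together, a Subroutine might look like one of the two sketches below. The helper names and argument shapes come from the table; everything else is assumed for illustration: that the helpers return promises, that `rtrvr.request` resolves to a fetch-style response, and the `/api/messages` endpoint, the `x-csrf-token` header name, and the placeholder text. The functions take `rtrvr` and the parameters as arguments only so the sketch runs standalone; in a real Subroutine both would already be in scope.

```javascript
// UI-driven variant: find the editor, type, click Send, wait for the result.
async function sendMessageViaUi(rtrvr, { message }) {
  const box = await rtrvr.find({ role: "textbox", placeholder: "Write a message" });
  await rtrvr.type(box, message, { clear: true });

  const sendButton = await rtrvr.find({ role: "button", name: "Send" });
  await rtrvr.click(sendButton);

  // Wait until the sent text shows up on the page.
  await rtrvr.waitFor({ text: message }, { timeoutMs: 5000 });
}

// Request-driven variant: skip the UI and call the (hypothetical) endpoint.
async function sendMessageViaRequest(rtrvr, { recipientId, message }) {
  const csrf = await rtrvr.getCsrfToken();
  const response = await rtrvr.request("/api/messages", {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-csrf-token": csrf, // header name is an assumption; sites differ
    },
    body: JSON.stringify({ recipientId, message }),
  });
  if (!response.ok) {
    throw new Error(`Send failed: ${response.status}`);
  }
  return response.status;
}
```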

Source: Hacker News