Building AI Agents That Actually Work: Lessons from Jason Lemkin, Jeanne DeWitt Grosser (Vercel), Amelia Lerutte & Amjad Masad (Replit)

A realistic look at running a company on AI agents from SaaStr AI Annual 2026, moving past the hype to address real costs, bugs, and strategies. Key insights from SaaStr and Vercel leaders reveal how to build, deploy, and scale agents effectively.

The half-day that kicked off SaaStr AI Annual 2026 was the most concrete look we’ve put on a stage at what running a company on AI agents actually looks like. Not the hype version. The real version, with the costs, the bugs, the drift, and the wins.

Four sessions, four very different angles: a hands-on agent build for people who have never built one, a behind-the-scenes look at how Vercel automated core go-to-market functions, a live build of SaaStr’s own AI VP of Marketing, and a fireside with the founder of the platform most of it runs on. Below is the full breakdown of each, the top 3 takeaways from each, and then the themes that showed up across all four.

Session 1: AI Agents 101 with Jason Lemkin, Founder SaaStr AI

The opening session was deliberately basic. The pitch was simple: if you have already deployed 100 agents and have eight Mac minis running open models at home, go get lunch. This one was for everyone who buys tools, plays with tools, and watches their team use tools, but has never built one themselves.

The whole exercise was building a digital clone on Delphi. SaaStr started its own agent journey here about 14 months ago with a single agent, a digital version of Jason that people could ask questions about SaaStr. What was surprising at the time: the agent answered better than most of the humans helping us. It knew what a gold sponsorship cost, how ticket prices changed over time, what the WiFi password was. One sponsor bought a $50,000 sponsorship by interacting with Digital Jason alone.

Building one took minutes. You connect a website, a YouTube channel, an X handle, and the tool ingests and chunks the content. The version built live for the session pulled in 8.3 million words from SaaStr.com and social in a few minutes and started answering real founder questions reasonably well.

The core framework was “stair-step it.” Three rungs:

Build something good today with an off-the-shelf product. You can do this in five minutes.
Let it run for weeks. Read every output. When it gets something wrong, correct it with a line of text and let it re-ingest. After the first few hundred messages, it gets sharp.
Only then, if you have a reason, build a custom version. The custom Digital Jason runs on a vector database, chunked content, Voyage embeddings, and Claude Haiku. It sounds better and embeds cleaner, but it breaks and needs maintenance.
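
The custom stack in the third rung (chunked content, embeddings, a vector store, a small model) can be illustrated with a toy retrieval sketch. This is not the actual Digital Jason code: the function names, chunk sizes, and sample documents are all assumptions, and a simple word-count cosine similarity stands in for real Voyage embeddings and a vector database. The retrieved chunks would then be handed to a model such as Claude Haiku, a step that is omitted here.

```python
import math
import re
from collections import Counter


def chunk_words(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of `size` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9$]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Hypothetical content, invented for illustration only.
docs = [
    "Sponsorship packages are listed on the sponsor page with booth size and pricing.",
    "Ticket prices go up as the event gets closer, so early registration is cheaper.",
    "The venue WiFi network name and password are printed on the back of every badge.",
]
chunks = [c for d in docs for c in chunk_words(d)]
best = top_chunks("Where do I find the WiFi password?", chunks, k=1)
print(best[0])
```

Correcting the agent "with a line of text" maps onto this picture as adding one more document and re-chunking it, which is why the off-the-shelf loop is cheap to run.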

That maintenance point became the meta-lesson. When asked whether the custom version was still ingesting recent content, the platform admitted it had quietly stopped pulling YouTube transcripts. Nobody noticed because nothing broke loudly. That is drift. Things fall out of sync, a new model ships, something stops working, and unless the agent is mission critical, you miss it. Which is why the repeated advice was buy, don’t build. SaaStr uses at least eight of the vendors on the floor: Artisan, Monaco, Qualified, Agentforce, and more. We only build what we can’t buy.
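
Silent drift of this kind is the sort of thing a small freshness check can surface. The sketch below is an assumption about how one might do it, not something shown in the session: the source names, timestamps, and the seven-day threshold are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_sources(
    last_ingested: dict[str, datetime],
    now: datetime,
    max_age: timedelta,
) -> list[str]:
    """Return the names of sources whose last successful ingest is older than max_age."""
    return sorted(name for name, ts in last_ingested.items() if now - ts > max_age)


# Hypothetical timestamps, invented for illustration only.
now = datetime(2026, 5, 1, tzinfo=timezone.utc)
last_ingested = {
    "website": now - timedelta(days=1),
    "x_posts": now - timedelta(days=2),
    "youtube_transcripts": now - timedelta(days=45),
}

stale = stale_sources(last_ingested, now, max_age=timedelta(days=7))
for name in stale:
    print(f"WARNING: no successful ingest from {name} in over 7 days")
```

Run on a schedule and routed to wherever the team already looks, a check like this turns "nothing broke loudly" into an explicit alert.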

On guardrails: when Digital Jason first launched 15 months ago, someone asked it for Jason’s email and home address and to book a meeting, and it actually set four meetings on his calendar. The vendor patched it fast, and guardrails have improved every month since. The lesson stands, though: the more you build yourself, the more guardrail risk you own. Agents are goal-seeking, they want to make you happy, prompt injections work, and a capable agent will eventually share data it shouldn’t. Start with your least sensitive information and add PII slowly, or just buy from a vendor whose guardrails are better than anything you’ll vibe code yourself.
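
For teams that do build their own, two of the simplest guardrails are redacting obvious PII from outgoing answers and gating which tools the agent may call without a human. The sketch below is a minimal illustration under stated assumptions: the tool names are hypothetical, the email pattern is deliberately simple, and none of this reflects how any vendor mentioned here implements guardrails.

```python
import re

# Simple email pattern; real PII detection needs far more than this.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Hypothetical tool names. Read-only tools run freely; anything that
# acts on a calendar or CRM waits for a person to approve it.
ALLOWED_TOOLS = {"search_knowledge_base"}
NEEDS_APPROVAL = {"book_meeting"}


def redact(text: str) -> str:
    """Replace email addresses in an outgoing answer with a placeholder."""
    return EMAIL.sub("[redacted email]", text)


def authorize(tool: str) -> str:
    """Decide whether a requested tool call may run."""
    if tool in ALLOWED_TOOLS:
        return "allow"
    if tool in NEEDS_APPROVAL:
        return "needs_human_approval"
    return "deny"  # default-deny anything not explicitly listed


print(redact("Reach him at jason@example.com today."))
print(authorize("book_meeting"))
```

The default-deny branch is the important design choice: a goal-seeking agent that is talked into calling a new tool gets stopped unless someone has explicitly listed it.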

On time: getting an agent going takes minutes, initial training takes the first two to four weeks of reading every output, and then maintenance never really stops. Someone on the team needs an hour or two a day on your agents. The flip side is you get pulled in, because you keep seeing more productivity to add.

On pricing your own agents: customers want a product that works and certainty on cost. Nobody in B2B wants to pay per token. Charge a fair price tied to outcomes, and force yourself to charge enough that you have to deliver real value. HubSpot recently moved to outcome-based pricing because their agents now resolve 90% of tickets.

Top 3 takeaways from Session 1

**Simple but trained beats complex.** A five-minute agent that you correct daily for a month outperforms a sophisticated build you set and forget. The training loop matters more than the architecture.
**Buy, don’t build, and only build what you can’t buy.** Vendors who have been in market have better guardrails than anything you’ll create. Building it yourself means owning drift, maintenance, and data-leak risk.
**Agents replace work, so price for outcomes.** A founder will pay $50,000 for an AI SDR that lands real customers but not for an email automation tool. Charge for the result, not the tokens.

Session 2: Building Business Agents at Scale with Jeanne DeWitt Grosser, COO of Vercel

Jeanne DeWitt Grosser scaled go-to-market at both Google and Stripe for roughly a decade each before joining Vercel as COO. Six weeks in, she stood up a go-to-market engineering team in June 2025, before that phrase was common, with one mandate: bring AI and agents to everything GTM. Ten months later, Vercel has automated a real chunk of core company functions.

The numbers she shared:

The customer support agent now handles 93% of total case load, and Vercel’s cases are highly technical.
The content agent did 96% of major content updates last quarter.
The lead qualification agent, launched in August, started as 20% of one engineer’s time. With a human in the loop over six weeks, it took the team on that function from 10 people to one in the US plus 20% of a person covering all of Europe and APAC.

That lead agent runs about $5,000 a year between infrastructure and tokens and takes 20% of one engineer to maintain. Against 10 salaries, that is a 32x ROI, and it runs 24/7 with faster speed-to-lead and human-equivalent quality. The people who came off that function moved into higher-value roles.
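
The shape of that ROI calculation can be sketched as savings on headcount divided by the agent's running and maintenance cost. The talk did not break down the salary inputs, so the per-person costs below are placeholders rather than Vercel's numbers, and the output is not expected to reproduce the 32x figure.

```python
def roi_multiple(
    headcount_before: float,
    headcount_after: float,
    cost_per_head: float,
    run_cost: float,
    maintainer_fraction: float,
    maintainer_cost: float,
) -> float:
    """Annual headcount savings divided by the annual cost of running the agent."""
    savings = (headcount_before - headcount_after) * cost_per_head
    agent_cost = run_cost + maintainer_fraction * maintainer_cost
    return savings / agent_cost


multiple = roi_multiple(
    headcount_before=10,       # from the talk
    headcount_after=1.2,       # one person in the US plus 20% of a person
    cost_per_head=100_000,     # placeholder, not a figure from the talk
    run_cost=5_000,            # infrastructure and tokens, from the talk
    maintainer_fraction=0.2,   # 20% of one engineer, from the talk
    maintainer_cost=200_000,   # placeholder, not a figure from the talk
)
print(f"ROI multiple under these assumptions: {multiple:.1f}x")
```

Whether the maintaining engineer's time counts as a cost, and what a fully loaded salary is, moves the result a lot, so any headline multiple is only as good as those two inputs.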

The build method is a tripod: a GTM engineer, a data scientist, and the single best subject-matter expert for that function. They sit shoulder to shoulder and document best practice, then encode it into workflows. For the lead agent, an engineer literally shadowed Vercel’s best SDR for days, watching every tab she opened (LinkedIn, BuiltWith, the company site, CRM, Slack history), and converted each into a step in a tool-calling workflow. The agent ran in shadow mode for six weeks with that SDR reviewing every output, until she couldn’t improve it anymore. Then they pulled the human. The result performs like a 90th-percentile rep 100% of the time. The same framework went to 30 different SDR workflows, and SDR quotas rose 30% that quarter. A single engineer prototyped the first version over a weekend and shipped it six weeks later.
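
The shadow-mode phase can be thought of as a review log plus a graduation rule: the agent drafts, the expert records what she would have sent, and the human comes out of the loop only once recent drafts need no corrections. The sketch below is an assumption about how to express that, not Vercel's implementation; the `Review` type, the window size, and the threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Review:
    agent_output: str
    human_output: str  # what the expert would have sent instead

    @property
    def corrected(self) -> bool:
        """True when the expert changed the agent's draft."""
        return self.agent_output.strip() != self.human_output.strip()


def ready_to_graduate(
    reviews: list[Review],
    window: int = 50,
    max_correction_rate: float = 0.0,
) -> bool:
    """Check whether the most recent `window` reviews needed few enough corrections."""
    if len(reviews) < window:
        return False  # not enough evidence yet
    recent = reviews[-window:]
    rate = sum(r.corrected for r in recent) / window
    return rate <= max_correction_rate


# Hypothetical log: three early corrections, then five clean drafts.
log = [Review("draft", "better draft")] * 3 + [Review("ok", "ok")] * 5
print(ready_to_graduate(log, window=5))
print(ready_to_graduate(log, window=8))
```

A threshold of zero corrections mirrors the "until she couldn’t improve it anymore" bar; a looser threshold trades review time for some tolerated error.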

She walked through several agents:

Deal One, the meeting intelligence agent, ingests every call, generates notes and action items, posts coaching to Slack, proposes CRM updates, tracks competitive mentions, and runs loss post-mortems. Reps mention it in Slack, it queries other agents as sub-agents, pulls Gong transcripts, searches the knowledge base, and streams answers back. The rep never leaves Slack. The agent has no UI.
The Playbook Platform turns the instincts of the best reps into automation. A signal fires (a usage spike, a high-intent pricing page visit), the platform matches it to a play, generates personalized outreach, and surfaces it for a single-click review.
D0, the most popular agent in the company, is a data analyst everyone reaches through Slack. Questions that used to take a week and a ticket now get answered in under a minute, because it translates plain English into SQL against a semantic layer the head of data science built on top of a model of the business.
Vertex, the customer service

Source: SaaStr
