How Moda Builds Production-Grade AI Design Agents with Deep Agents

Moda leverages a multi-agent system built on Deep Agents and LangSmith to overcome the challenges of visual design, creating a seamless, iterative AI design experience.

Moda is an AI-native design platform for non-designers like marketers, founders, salespeople, and small business owners who need professional-grade presentations, social media posts, brochures, and PDFs without a design background. Think Canva or Figma, but with a Cursor-style AI sidebar that builds and iterates on designs directly on a fully editable 2D vector canvas.

At the core of Moda is a multi-agent system built with Deep Agents, with LangSmith providing the observability layer that lets the team iterate quickly and ship with confidence.

The Challenge: Making AI Good at Visual Design

AI code generation works well partly because HTML and CSS already have layout abstractions like Flexbox and grid. You describe relationships and relative sizes, not pixel coordinates.

Visual design has no equivalent. The closest thing to a standard is PowerPoint's XML spec, a 40-year-old format packed with verbose, absolute XY coordinates that LLMs are notoriously bad at reasoning about. Tools that use XML produce designs that look like every other AI-generated deck.

Moda needed a system that could generate designs that actually look good, and an agent architecture capable of handling complex, multi-turn, visually-grounded tasks at production quality.

Agent Harness: Built on Deep Agents

Moda's AI system consists of three agents:

Design Agent: the primary agent behind the Cursor-style sidebar, handling all real-time design creation and iteration on the canvas Research Agent: fetches and stores structured content from external sources (e.g., a company's website) into a per-user file system within Moda Brand Kit Agent: ingests brand assets (colors, fonts, logos, brand voice) from websites, uploaded brand books, or existing slide decks, so every design feels on-brand from the start

The Research Agent and Brand Kit Agent both run on Deep Agents. These are the team's newest agents, which they've invested in heavily. The Design Agent runs on a custom LangGraph loop — an older implementation built before Deep Agents — and the team is actively evaluating migrating it as well.

All three agents share the same overall architecture: a lightweight triage step, a main agent loop, dynamic context loading, and full tracing in LangSmith.

Context Engineering: The Details That Matter

Getting a design agent to produce genuinely good output that is visually coherent and brand-accurate (not just technically correct) required a lot of intentional context engineering.

A Custom DSL Instead of Raw Scene Graph

One of the hardest parts of building a design agent is figuring out how to represent visual layouts in a way LLMs can reason about effectively. Raw canvas state is verbose, coordinate-heavy, and token-expensive — not a natural fit for how models think about structure and layout.

Moda developed a context representation layer that gives the agent a cleaner, more compact view of what's on the canvas, which reduces token cost and improves output quality. The specifics are proprietary, but the general principle is the same one that makes LLMs effective at web development: give the model layout abstractions it can reason about, rather than raw numerical coordinates.

"LLMs are not good at math. PowerPoint's XML spec has a bunch of XY coordinates — that's a fine representation of the data, but it's not a great way for an LLM to describe where it wants things to live." — Ravi Parikh, Co-Founder, Moda.app

Triage → Skills → Main Loop

Every request passes through a lightweight triage node (using fast and cheap Haiku models) before reaching the main agent. The triage node classifies the output format (slide deck, PDF, LinkedIn carousel, logo, etc.) and pre-loads the relevant skills, which are Markdown documents containing design best practices, format guidelines, and task-specific creative instructions.

Skills are injected as human messages, with prompt caching breakpoints placed after the system prompt and after the skills block. This keeps the system prompt fixed and always cached while allowing dynamic context injection per request.

Dynamic Tool Loading

The design agent runs with 12–15 core tools in context at all times. An additional ~30 tools are available on demand via a RequestToolActivation tool the agent can call when it recognizes a specialized need, like parsing an uploaded PowerPoint file.

UX: The Cursor Moment for Design

One of Moda's most deliberate product choices is the interaction model. Rather than a generate-and-replace flow, where AI produces a static output and hands it off, Moda's AI works directly on a fully editable 2D vector canvas. Every element the agent creates is immediately selectable, movable, resizable, and styleable by the user.

Observability with LangSmith

Because all three agents are traced through LangSmith, Ravi has full visibility into every execution. Key workflows include prompt & tool iteration, cost tracking, cache hit analysis, and error diagnosis. "It's made us move faster." — Ravi Parikh

Source: LangChain Blog