EVENT RECAP

Two models for AI agents in enterprise video—and why video shelf life is the forcing function

Every enterprise creative team is being asked the same question right now: how do you use AI agents in your video workflow? The honest answer, for most teams, is that they're still figuring it out. But the outlines of two very different models are starting to emerge.

Session 3 of Capsule's Video First Summit brought together Oscar Estrada (AI video specialist, HubSpot), Victor Duran (creative director, Zendesk), and Alex Marston and Brad Dehaven (creative directors at ServiceNow) to talk about where they're seeing AI agents actually change things—and where the promise still outpaces the practice.

Two models for using AI agents in enterprise video production

Capsule CEO Champ Bennett framed the session around a distinction that kept coming up in his conversations with creative leaders:

The first model is agent-as-guide: AI that helps a non-video person—a product marketer, an instructional designer, a sales rep—make a video that they couldn't have made without it. The agent provides direction, structure, and guardrails at every step, functioning like an expert sitting over the person's shoulder.

The second model is fully autonomous production: AI agents that generate video at scale from existing data sources, without a human touching the output at all.

Both models are real, and both are already in use. But they solve different problems, and the organizations pursuing each one are in different places.

Why video shelf life is the forcing function

Before getting into what agents can do, it helps to understand what's driving the urgency.

At Zendesk, Duran described a problem that's become increasingly common at fast-moving software companies: videos are outdated almost before they're published.

"Historically, a video would take two to three weeks to execute," he said. "Maybe by the time it's out on YouTube, it's already outdated—there's a new feature or a new set of components that are available."

The product is evolving faster than the production timeline allows. And as video becomes the primary medium for explaining software to customers, that gap gets more expensive. A knowledge base article can be edited in minutes. A produced video takes weeks to redo.

This dynamic—short shelf life, high demand, slow production—is the environment that makes AI agents genuinely useful rather than just interesting.

What the agent-as-guide model looks like in practice

At Zendesk, Duran ran a pilot in which product marketers used Capsule to create their own video content. The results, he said, were mixed.

"Some of our product marketers are really good storytellers. They know the product inside and out, and some of them made really great videos." But others—who had never touched video editing software—found it difficult even with guardrails built in. Good templates and brand controls reduce the technical barrier, but they don't replace the decision-making that goes into an edit.

That gap is exactly where an agent-as-guide model has the most to offer. Instead of giving a non-editor a simplified tool and leaving them to work it out, you give them an agent that moves through the production process with them—asking the right questions, making the structural decisions, handling the tedious parts.

"If there was an agent guiding them," Duran said, "that's going to make it so much faster."

Estrada at HubSpot framed it similarly: the goal is to use AI to keep the creative team's quality standards intact while making it possible for people outside the creative org to produce within those standards.

What fully autonomous AI video production looks like at scale

The second model operates at a different scale and solves a different problem.

At ServiceNow, Dehaven and Marston are exploring how to pull from large datasets of existing content (documentation, internal knowledge bases, recorded presentations) and generate short-form video clips at volume, with limited or no human review per clip.

This is the use case where AI agents become production infrastructure, not just a production aid. A content library that previously required weeks of editing work to process could yield hundreds of short assets in hours.

The consumer behavior context matters here too. As Marston pointed out, audiences now expect a Netflix-level experience: video everywhere, not dripped out one piece at a time. A learner moving through a course, a prospect researching a product, a customer trying to solve a support issue: none of them expects to find just one video. They expect a video-first environment. Meeting that expectation at the required volume makes some form of autonomous production increasingly necessary.

How AI agents change the enterprise creative team's role

The common thread across all three companies: AI agents don't replace the creative team. They change what the creative team is responsible for.

When agents handle the production mechanics—formatting, captioning, template population, basic assembly—the creative team's responsibilities shift toward:

  1. Systems design — building the guardrails, templates, and quality criteria that agents operate within
  2. Quality oversight — reviewing outputs that require genuine storytelling judgment
  3. Agent training — encoding the team's standards into what the agent looks for and flags
  4. Defining the human line — identifying which decisions must stay with a person, regardless of efficiency gains

"The question," as Bennett put it, "is how do we bake that domain expertise into an agent to guide more and more people to do it?"

That's the design problem enterprise creative teams are working on right now. The tools are arriving faster than the frameworks for using them well.

Session 3 of Capsule's Video First Summit was hosted by Champ Bennett, co-founder and CEO of Capsule. The Video First Summit brought together enterprise creative and marketing leaders to share what they're learning as video becomes a company-wide function.