With the magic of AI, now anyone can add B-roll to their videos—no money or special skills required. A step-by-step guide for how to do it.
Nearly every great thing that’s happened to Capsule—our largest volume of inbound leads, press coverage, raising a seed round—has come as a result of the videos we’ve made.
One key component that’s made these videos perform so well? Engaging, relevant B-roll.
Here you’ll learn what B-roll is, why professionals use it to make videos more engaging, and how AI now lets you add it to your own videos without special skills, extra time, or a budget.
Video is working wonders for content teams—and not just our team.
Video is the most-used format for marketers, and it also delivers the highest marketing ROI of any format by far.
But many content teams are still publishing either:
📈👎 Lots of videos, but they’re unengaging or low-quality
🌟🙅 High-quality videos, but they don’t have the resources to make as many as they need
We’ll first cover how adding B-roll can address the first issue (making videos more engaging), and then how using AI can help with the second issue (scaling higher-quality production).
When you're telling a story with video, you’re not relying on the precision of a single frame the way you would with a still image; you're stitching together hundreds, or even hundreds of thousands, of images over a period of time. As you tell that linear story, bringing in other assets helps support the narrative.
B-roll refers to any supplemental photos or videos that are cut in or layered on top of the primary footage.
Using B-roll harnesses the force that makes video the ultimate format: it combines our auditory brain (the words spoken) with our visual brain (the B-roll) to leave a stronger impression than either component could on its own.
Benefits of using B-roll in your videos:
Research shows that human attention spans have been decreasing. At the same time, engaging storytelling can hold our attention for longer than ever (👋 hi, binge-watching 6 hours of Succession).
B-roll is a simple way to hold viewers’ attention in both cases: it adds variety to short-form videos, and it adds layers of meaning to longer-form videos to create more resonant stories.
What does B-roll look like in action? Here are a few different types of videos that benefit from adding B-roll:
Bloomberg posts news updates to its YouTube Shorts channel and relies on B-roll to help add more meaning to a story or establish its setting.
This video about gold pricing uses a combination of stock footage, text, and custom charts to visually support the script and tell a more engaging story.
A video about Singapore’s soaring rents uses B-roll footage of Singapore to give the story more context and depth.
B-roll is the perfect (and necessary) addition for a video that's promoting or recapping an event. Here, Capsule customer Twilio uses only B-roll from previous events, plus some text and motion graphics, to promote an upcoming event series.
Adding B-roll to podcast video clips is a simple way to add some visual variation to an otherwise static talking head (where the subject is talking directly into the camera).
While B-roll clearly adds a layer of depth to videos, most content teams can’t keep up with the demand for producing more engaging and professional-looking videos.
Marketers often run into the same three constraints of not having enough:
No matter which method you use for sourcing B-roll, the process has always been time-consuming.
The 3 traditional methods for sourcing B-roll:
Here’s how those benefits break down when you’re trying to make videos that are cost-efficient, high-quality, and quick to produce:
Thanks to the explosion in popularity of generative AI tools like OpenAI’s ChatGPT and the image generator Midjourney, content teams can generate an entirely new class of assets for their videos quickly and cheaply.
Below is a step-by-step process for stitching these AI tools together to generate completely customized B-roll imagery.
Choose the moments, phrases, or stories in your video that you want highlighted with B-roll. Make sure these moments are spaced out in a way that will keep your viewers engaged throughout the video.
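If you have timestamps for your candidate moments, a few lines of Python can check the spacing for you. This is a minimal sketch, not part of any tool mentioned here: the function name `pick_spaced_moments`, the `(start_seconds, phrase)` input format, and the 10-second minimum gap are all assumptions chosen for illustration.

```python
def pick_spaced_moments(candidates, min_gap_s):
    """Keep moments that start at least min_gap_s after the last kept one.

    candidates: list of (start_seconds, phrase) tuples (hypothetical format).
    """
    kept = []
    # Walk the candidates in chronological order.
    for start_s, phrase in sorted(candidates):
        if not kept or start_s - kept[-1][0] >= min_gap_s:
            kept.append((start_s, phrase))
    return kept


# Example candidates from a 40-second video (made-up timestamps).
candidates = [
    (2.0, "a great story to tell"),
    (5.0, "content creators"),
    (14.0, "using AI"),
    (31.0, "tell more stories"),
]
moments = pick_spaced_moments(candidates, min_gap_s=10)
for start_s, phrase in moments:
    print(f"{start_s:>5.1f}s  {phrase}")
```

The sketch always keeps the earliest candidate and drops anything that follows too closely, so treat the result as a starting point and adjust the gap to suit the pacing of your video.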
Open ChatGPT and ask it to create a prompt for your B-roll image. Here’s an example you can use as a template:
You're a video editor in the process of editing a 40-second video. The video is for content creators and it's about “the importance of storytelling and using AI to tell more stories” and has an “energetic” tone to it. Can you suggest a prompt to generate a B-roll image that supports this dialogue:
“The only thing that matters is whether or not you have a great story to tell.”
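If you're writing requests like this for several moments in the same video, you can fill in the template programmatically. The sketch below only builds the text to paste into ChatGPT; it does not call any API. The function name `build_broll_request` and its parameters are hypothetical, and the template always uses "an" before the tone, as in the example above.

```python
def build_broll_request(duration_s, audience, topic, tone, line):
    """Fill the template above with the details of one video moment."""
    return (
        f"You're a video editor in the process of editing a {duration_s}-second video. "
        f"The video is for {audience} and it's about \"{topic}\" "
        f"and has an \"{tone}\" tone to it. "
        "Can you suggest a prompt to generate a B-roll image "
        "that supports this dialogue:\n"
        f"\"{line}\""
    )


request = build_broll_request(
    duration_s=40,
    audience="content creators",
    topic="the importance of storytelling and using AI to tell more stories",
    tone="energetic",
    line="The only thing that matters is whether or not you have a great story to tell.",
)
print(request)
```

Only the `line` argument needs to change from moment to moment, which keeps the audience, topic, and tone consistent across every image you generate for the video.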
Open Midjourney (or another AI image generator like Stable Diffusion or DreamStudio) and copy and paste the prompt from ChatGPT. Keep tweaking the prompt until you get an image you’re happy with.
Using your video editing software of choice, add your image in at the appropriate time. For static images, it’s best to add in some movement effects as well.
You can recreate the “Ken Burns” effect, which adds panning and zooming to a still image, by adding keyframes to the Zoom and Position effects.
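Under the hood, an effect like this comes down to interpolating between two keyframes. The sketch below shows one simple way to do that with linear interpolation. It is an illustration, not how any particular editor implements the effect: the `Keyframe` class, the `crop_at` function, and the convention that zoom 1.0 shows the whole image are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Keyframe:
    time_s: float    # when this keyframe applies
    zoom: float      # 1.0 shows the whole image; 2.0 shows half its width and height
    center_x: float  # horizontal center of the view, 0.0 (left) to 1.0 (right)
    center_y: float  # vertical center of the view, 0.0 (top) to 1.0 (bottom)


def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def crop_at(start, end, time_s, image_w, image_h):
    """Return the (left, top, width, height) crop box at time_s.

    Assumes zoom >= 1.0 at both keyframes, so the box fits inside the image.
    """
    span = end.time_s - start.time_s
    # Progress between the two keyframes, clamped to [0, 1].
    t = 0.0 if span <= 0 else min(max((time_s - start.time_s) / span, 0.0), 1.0)
    zoom = lerp(start.zoom, end.zoom, t)
    cx = lerp(start.center_x, end.center_x, t) * image_w
    cy = lerp(start.center_y, end.center_y, t) * image_h
    w, h = image_w / zoom, image_h / zoom
    # Clamp so the crop box never leaves the image.
    left = min(max(cx - w / 2, 0.0), image_w - w)
    top = min(max(cy - h / 2, 0.0), image_h - h)
    return left, top, w, h


# A slow 4-second push-in on the center of a 1920x1080 image.
start = Keyframe(time_s=0.0, zoom=1.0, center_x=0.5, center_y=0.5)
end = Keyframe(time_s=4.0, zoom=2.0, center_x=0.5, center_y=0.5)
for second in range(5):
    print(second, crop_at(start, end, second, 1920, 1080))
```

Each crop box would then be scaled back up to the output resolution for that frame. Changing `center_x` or `center_y` between the two keyframes adds a pan to the zoom.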
The technology outlined above is exciting and fun to play with, but the process can still be fairly complex. Capsule has automated all of those steps so that you can achieve the same outcome in seconds.
Here’s how Capsule uses three bits of technology to instantly add custom B-roll to your video:
Now, all you have to do is upload your video into Capsule, highlight the text from the auto-generated transcript that you want to create an image for, tweak the prompt, click Use image, and your completely custom image appears in your video.
The process looks essentially like this:
To see how Capsule’s AI Studio automates this whole process, watch the demo below:
While generative AI is excellent for visualizing abstract concepts and adding depth to videos, it won’t meet the B-roll needs of every type of video.
Here are a few examples where generative AI may not be necessary or helpful:
Certain news coverage and stories about historical events (including documentaries) should use archival images and video when they’re available.
Likewise, if you’re covering or recapping an event, you’ll want actual footage from that event: the setting, the people there, the presentations, the food.
Here’s how you could add that type of B-roll instead:
In general, hyper-specific references won’t gain much from AI-generated B-roll. But an abstract or generalized concept can benefit from layering in an AI-generated image.
Video is the richest, most contextual, and engaging format for storytelling—and platforms will continue to prioritize it. But video is also the most challenging format to create.
With AI, content teams can now bypass hours of tedious, manual work and focus on what really matters: telling a great story.
Our mission at Capsule is to allow anyone to create professional-looking video without any expertise, and we’re excited about how AI will continue to eliminate those barriers for non-professionals.
To get access to our AI-assisted video editor that automates B-roll sourcing and editing, join the waitlist here.