Captions & Subtitles in Zella
Quick answer: Open the Captions tab in Zella’s editor and click Transcribe audio — Zella transcribes on-device (no upload) into a word-level transcript. Pick one of 6 viral styles (Word Pop, Hormozi, Karaoke, Pop Box, Neon, Clean), set size/color/position, and the captions burn into your export. Need a file? Export SRT, VTT, or CSV. Captions stay aligned even after you cut the video.
On this page: why captions matter · transcribe · styles · active-word highlight · size & position · edit text · ripple · export SRT/VTT · impact · FAQ
Why add captions to a video?
Much social video is watched with the sound off, so captions are one of the highest-leverage edits for watch-time and accessibility. Zella transcribes on-device, renders viral-style animated captions, and exports standard subtitle files, so no third-party caption app is needed.
Figure: the Captions tab. ① Transcribe, ② pick a preset, ③ set size & position, ④ export SRT/VTT/CSV.
How to transcribe a video
- Open the Captions tab (right inspector).
- Confirm the Language (default English; one language per pass).
- Click Transcribe audio.
Zella transcribes on-device and builds a word-level transcript: every word in the clip, in order, with ascending timestamps, covering the whole recording rather than only the last phrase. It then aligns caption lines to the speech.
The clip needs spoken audio; a silent screen capture has nothing to caption.
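To make the idea of a word-level transcript concrete, here is a minimal sketch of how timed words could be grouped into caption lines. It is an illustration only, not Zella's actual algorithm: the `Word` and `CaptionLine` types, the `group_words` function, and the `max_chars` and `max_gap` thresholds are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the clip
    end: float


@dataclass
class CaptionLine:
    words: list

    @property
    def start(self) -> float:
        return self.words[0].start

    @property
    def end(self) -> float:
        return self.words[-1].end

    @property
    def text(self) -> str:
        return " ".join(w.text for w in self.words)


def group_words(words, max_chars=32, max_gap=0.6):
    """Group timed words into caption lines (hypothetical rule).

    A new line starts when adding the next word would exceed max_chars,
    or when the pause before it is longer than max_gap seconds.
    """
    lines = []
    current = []
    for word in words:
        if current:
            length = len(" ".join(w.text for w in current)) + 1 + len(word.text)
            gap = word.start - current[-1].end
            if length > max_chars or gap > max_gap:
                lines.append(CaptionLine(current))
                current = []
        current.append(word)
    if current:
        lines.append(CaptionLine(current))
    return lines


transcript = [
    Word("Hello", 0.0, 0.4),
    Word("world", 0.45, 0.9),
    Word("again", 2.0, 2.4),  # long pause before this word
]
caption_lines = group_words(transcript)
```

With these sample words, the long pause before "again" starts a second caption line, so each line inherits its start and end time from the words it contains.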
How to choose a caption style
Zella ships 6 viral caption presets — pick the one that fits your channel:
| Preset | Look | Best for |
|---|---|---|
| Word Pop | Words pop in one at a time | High-energy social |
| Hormozi | Bold, high-contrast, punchy | Hooks, sales |
| Karaoke | Active word highlighted as spoken | Retention |
| Pop Box | Words in a filled box | Clean + bold |
| Neon | Glowing neon text | Stylized reels |
| Clean | Minimal subtitles | Tutorials, courses |
Select one in the Captions tab; each preset has a visibly distinct look.
What is active-word highlighting?
With Karaoke and the other word-by-word styles, the active word is highlighted as it is spoken, and the highlight advances word by word within each line. This is the “retention caption” look that helps keep viewers watching, and it is on automatically for styles that use it.
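As a rough sketch of how active-word highlighting can work, the function below looks up which word is being spoken at a given playback time. It assumes each word is a `(text, start, end)` tuple sorted by start time; the function name and data layout are hypothetical and are not taken from Zella.

```python
from bisect import bisect_right


def active_word_index(words, t):
    """Return the index of the word spoken at time t, or None.

    words: list of (text, start, end) tuples sorted by start time.
    A word is active while start <= t < end.
    """
    starts = [start for _, start, _ in words]
    i = bisect_right(starts, t) - 1  # last word that starts at or before t
    if i >= 0 and t < words[i][2]:
        return i
    return None  # before the first word, or in a pause between words


line = [("Keep", 0.0, 0.3), ("it", 0.3, 0.45), ("simple", 0.5, 1.0)]
highlighted = active_word_index(line, 0.35)
```

A renderer could call this once per frame and draw the returned word in the highlight color; returning `None` during pauses leaves the whole line in its base style.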
How to set caption font size, color, and position
In the Captions tab set:
- Font size: larger for vertical/social, smaller for desktop tutorials.
- Color: match your brand (captions keep their color even on a black-and-white grade).
- Position: bottom (classic) or center (reels).
- Fit/Fill: controls how captions sit inside a reframed vertical video so the text stays in frame without overflowing.
How to edit caption text
Transcription is accurate but can mishear names, jargon, and brand terms. To fix a line:
- Click a caption line in the Captions tab and edit the text.
- Your edit is saved and appears in both the preview and the export.
Do captions stay aligned after cuts?
Yes. Captions are time-aware: if you ripple-delete a chunk (chapter 9) or run Remove Silences/Fillers (chapter 10), later captions slide earlier to match the new timeline and stay on the right words.
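The ripple behavior can be pictured with a small sketch. It assumes captions are `(start, end, text)` tuples in seconds and that a cut removes the span from `cut_start` to `cut_end`; the function name and the rule for captions that overlap the cut are assumptions for illustration, not a description of Zella's internals.

```python
def ripple_captions(cues, cut_start, cut_end):
    """Shift captions after removing [cut_start, cut_end) from the timeline.

    cues: list of (start, end, text) tuples in seconds.
    Captions before the cut are unchanged, captions after it slide earlier
    by the cut duration, and captions inside the cut are dropped.
    """
    removed = cut_end - cut_start

    def remap(t):
        if t <= cut_start:
            return t
        if t >= cut_end:
            return t - removed
        return cut_start  # a time inside the cut collapses onto the cut point

    result = []
    for start, end, text in cues:
        new_start, new_end = remap(start), remap(end)
        if new_end > new_start:  # drop captions with no remaining duration
            result.append((new_start, new_end, text))
    return result


cues = [(0.0, 2.0, "Intro"), (3.0, 5.0, "Um, so"), (6.0, 8.0, "Main point")]
rippled = ripple_captions(cues, cut_start=2.5, cut_end=5.5)
```

Here the middle caption falls entirely inside the removed span and disappears, while "Main point" moves three seconds earlier and stays on the same words.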
How to export SRT, VTT, or CSV subtitles
For YouTube’s caption upload, a web player, or localization:
- In the Captions tab, use the export option to save the track as SRT, VTT, or CSV.
- The exported file follows the standard format, so it opens in common subtitle tools, and it contains one cue per caption line.
You can also import an SRT to rebuild a caption track from an external transcript.
What impact captions have
For content creators and video editors, captions pay off in several ways:
- Higher watch-time and completion: sound-off viewers stay because they can follow along, and animated word-by-word captions help hold attention.
- Wider reach and accessibility: captioned video is watchable by deaf and hard-of-hearing viewers and in sound-sensitive places (offices, transit), expanding your audience.
- Better SEO and discovery: an exported SRT can feed platform transcripts and search.
- Faster production: on-device transcription plus presets replaces a separate captioning tool and hours of manual timing.
Captions FAQ
How do I add subtitles to a video on a Mac for free? Open Zella’s Captions tab → Transcribe audio → pick a style. It’s on-device, no subscription.
Are captions burned in or a separate file? Both — they burn into MP4/MOV/GIF exports, and you can also export SRT/VTT/CSV.
Do captions keep their color on a black-and-white video? Yes — the footage desaturates while captions retain their color.
Will captions stay in sync if I cut the video? Yes — they ripple with the timeline and stay on the right words.
Can I fix a misheard word? Yes — click the caption line and edit the text; it sticks and renders.
Pro tips & gotchas
- Transcription runs per-utterance and chunks long videos — give it a moment on big files.
- Pick a preset first, then tweak size/position; use Fit to keep text on one line or Fill for emphasis.
- Export SRT/VTT to upload captions to YouTube, or CSV to edit the transcript in a spreadsheet.
- Captions stay in sync across multi-segment edits — Zella renders them through a video composition automatically.
Related: AI cleanup (silences/fillers) → · Reframe for vertical → · Color & black-and-white → · Export →