Skip to main content
The captions.animate job returns a video with word-level animated captions burned in. Speech is automatically transcribed, segmented into timed cues, and rendered as styled subtitles burned onto the source.
Accepts: video · Type: captions.animate

Request

import { createClient, outputUrl } from "@rendobar/sdk";

const client = createClient({ apiKey: "rb_YOUR_KEY" });

const job = await client.jobs.create({
  type: "captions.animate",
  inputs: {
    source: "https://cdn.rendobar.com/assets/examples/sample.mp4",
  },
  params: { preset: "hormozi", mode: "balanced" },
});

const result = await client.jobs.wait(job.id);
console.log(outputUrl(result));

Parameters

preset
enum
default:"hormozi"
Built-in caption style. One of: hormozi, mrbeast, tiktok, pill, karaoke, word-pop, minimal, subtitle-block, neon, gold, reveal.
language
string
ISO-639-1 language hint (e.g. en, es). Absent = English. Non-English forces the large-v3-turbo model regardless of mode.
mode
enum
default:"balanced"
Speed-vs-accuracy tradeoff for transcription.
  • fast: base.en, English only, ~3× faster, ~7-9% WER
  • balanced: small.en, default, ~5-7% WER
  • accurate: large-v3-turbo, ~3× slower, ~3-4% WER
style
object
Style overrides layered on top of preset. Any field set here wins. Unset fields fall through to the preset. Key fields:
  • font: font family name. A bundled family or the family name of a custom font you pass via the font input.
  • layout: phrase (up to two balanced lines) or word (one word at a time).
  • position: top, center, or bottom.
  • entrance: fade, pop, or bounce. Each word reveals at its spoken moment.
  • highlight: active-word treatment. { "mode": "color" | "box", "color": "#RRGGBB", "textColor": "#RRGGBB", "scale": 100-200 }.
  • fill, stroke, shadow, bold, case, marginV: colour, outline, and weight controls.
autoEmphasis
boolean
default:"false"
Let an in-stack model mark the important words so they render in the accent colour (the “gold words” look).
translateTo
enum
Translate captions before animating. One of es, fr, de, pt, it, nl. Omit to keep the original language.
captionData
boolean
default:"false"
Also return the word-level transcript plus SRT and VTT on output.data.

Inputs

source
string
required
URL of the source video. An uploaded asset’s content URL works too.
subtitles
string
URL of an SRT or VTT file. When provided, its text is animated and transcription is skipped.
font
string
URL of a TTF or OTF font file. Set style.font to that font’s family name to render captions in it. See Fonts.

Fonts

Eleven presets ship with sensible default fonts. To pick a different one, set style.font to a bundled family. No upload needed.
Bundled familyPass as style.font
Inter"Inter"
Bebas Neue"Bebas Neue"
Anton"Anton"
Montserrat"Montserrat"
Poppins"Poppins"
Roboto"Roboto"
Oswald"Oswald"
To use your own font, pass two values together:
  1. The font file URL as the font input.
  2. The font’s real family name as style.font.
{
  "type": "captions.animate",
  "inputs": {
    "source": "https://cdn.rendobar.com/assets/examples/sample.mp4",
    "font": "https://example.com/MyBrand.ttf"
  },
  "params": {
    "preset": "gold",
    "style": { "font": "My Brand" }
  }
}
style.font must match the family name baked into the file, not the filename or URL. If it does not match, the renderer falls back to the preset font. Like every job, this returns the standard response. See Job output for the shape and how to read the result. To burn existing subtitles instead, see caption.burn.