Animated word-level captions

The captions.animate job returns a video with word-level animated captions burned in. Speech is automatically transcribed, segmented into timed cues, and rendered as styled subtitles burned onto the source.

Accepts: video · Type: captions.animate

Request

import { createClient, outputUrl } from "@rendobar/sdk";

const client = createClient({ apiKey: "rb_YOUR_KEY" });

const job = await client.jobs.create({
  type: "captions.animate",
  inputs: {
    source: "https://cdn.rendobar.com/assets/examples/sample.mp4",
  },
  params: { preset: "hormozi", mode: "balanced" },
});

const result = await client.jobs.wait(job.id);
console.log(outputUrl(result));

Parameters

preset

enum

default:"hormozi"

Built-in caption style. One of: hormozi, mrbeast, tiktok, pill, karaoke, word-pop, minimal, subtitle-block, neon, gold, reveal.

language

string

ISO-639-1 language hint (e.g. en, es). Absent = English. Non-English forces the large-v3-turbo model regardless of mode.

mode

enum

default:"balanced"

Speed-vs-accuracy tradeoff for transcription.

fast: base.en, English only, ~3× faster, ~7-9% WER
balanced: small.en, default, ~5-7% WER
accurate: large-v3-turbo, ~3× slower, ~3-4% WER

style

object

Style overrides layered on top of preset. Any field set here wins. Unset fields fall through to the preset. Key fields:

font: font family name. A bundled family or the family name of a custom font you pass via the font input.
layout: phrase (up to two balanced lines) or word (one word at a time).
position: top, center, or bottom.
entrance: fade, pop, or bounce. Each word reveals at its spoken moment.
highlight: active-word treatment. { "mode": "color" | "box", "color": "#RRGGBB", "textColor": "#RRGGBB", "scale": 100-200 }.
fill, stroke, shadow, bold, case, marginV: colour, outline, and weight controls.

autoEmphasis

boolean

default:"false"

Let an in-stack model mark the important words so they render in the accent colour (the “gold words” look).

translateTo

enum

Translate captions before animating. One of es, fr, de, pt, it, nl. Omit to keep the original language.

captionData

boolean

default:"false"

Also return the word-level transcript plus SRT and VTT on output.data.

Inputs

source

string

required

URL of the source video. An uploaded asset’s content URL works too.

subtitles

string

URL of an SRT or VTT file. When provided, its text is animated and transcription is skipped.

font

string

URL of a TTF or OTF font file. Set style.font to that font’s family name to render captions in it. See Fonts.

Fonts

Eleven presets ship with sensible default fonts. To pick a different one, set style.font to a bundled family. No upload needed.

Bundled family	Pass as `style.font`
Inter	`"Inter"`
Bebas Neue	`"Bebas Neue"`
Anton	`"Anton"`
Montserrat	`"Montserrat"`
Poppins	`"Poppins"`
Roboto	`"Roboto"`
Oswald	`"Oswald"`

To use your own font, pass two values together:

The font file URL as the font input.
The font’s real family name as style.font.

{
  "type": "captions.animate",
  "inputs": {
    "source": "https://cdn.rendobar.com/assets/examples/sample.mp4",
    "font": "https://example.com/MyBrand.ttf"
  },
  "params": {
    "preset": "gold",
    "style": { "font": "My Brand" }
  }
}

style.font must match the family name baked into the file, not the filename or URL. If it does not match, the renderer falls back to the preset font. Like every job, this returns the standard response. See Job output for the shape and how to read the result. To burn existing subtitles instead, see caption.burn.

​Request

​Parameters

​Inputs

​Fonts

Request

Parameters

Inputs

Fonts