Kaltura VOD Avatar Studio API

The VOD Avatar Studio lets you create pre-recorded avatar video presentations programmatically. You can select an AI avatar, write scenes with narration text, optionally use AI to compose scripts from existing video content, and generate a professional video of the avatar delivering the content. The generated video is saved as a standard Kaltura media entry.

Server-Side API Base URL: https://video-avatar.$REGION.ovp.kaltura.com/api/v1 (default region: nvp1)
Auth: Authorization: Bearer $KS header
Format: JSON request/response

Widget Base URL: https://unisphere.nvp1.ovp.kaltura.com/v1 (for browser embedding)

This guide covers two integration paths:

  • Server-side API (sections 4–10) — Full programmatic control over avatar videos: create, compose, generate, manage
  • Widget embed (section 11) — Drop-in browser UI for end users via the Unisphere framework

For real-time conversational avatars that hold live AI-powered conversations, see the Conversational Avatar Embed.

1. When to Use

  • Training video production — Generate professional training videos with AI presenters without recording equipment or on-camera talent
  • Content localization — Create avatar-narrated versions of content in multiple languages from translated scripts
  • Executive communications — Produce avatar-delivered announcements, updates, or presentations from written scripts
  • Session highlights — Turn recorded webinars or meetings into short avatar-narrated summary videos using AI composition
  • Video explainers — Generate explainer videos from documents, video captions, or a text brief using AI composition
  • Automated video pipelines — Build server-side workflows that create avatar videos without any browser UI

2. Prerequisites

  • A valid Kaltura Session (KS) — a user-level session (type=0) is sufficient; no admin privileges required. See Session Guide
  • The VOD Avatar feature enabled on your account — contact your Kaltura account manager
  • For AI composition: source entries must have captions or transcripts available

3. Architecture

The VOD Avatar system has two layers:

Layer URL Pattern Purpose
Server-side API https://video-avatar.$REGION.ovp.kaltura.com/api/v1/ Video project CRUD, AI composition, video generation, avatar management
Unisphere widget https://unisphere.$REGION.ovp.kaltura.com/v1/ Browser-based studio UI (uses the server-side API internally)

Server-side API flow:

  1. List avatar templates — avatarTemplate/list returns the 36 available AI presenters
  2. Create an avatar — avatar/upsert configures a template with a background
  3. Create a video project — video/add creates a project with scenes and narration
  4. Optionally compose with AI — video/compose generates scenes from source content
  5. Preview audio — video/previewAudio lets you hear the TTS narration before generating
  6. Generate the video — video/generate starts rendering; poll video/get until status is ready
  7. Retrieve the Kaltura entry — The entryId field on the completed video links to the generated media entry

Video status lifecycle:

draft ──→ composing ──→ composed ──→ generating ──→ ready
  ↑          │                          │
  │          ↓                          ↓
  │       compose-error            generate-error
  │          │                          │
  └──────────┴──── resetStatus ─────────┘
Status Meaning
draft New project, scenes can be edited
composing AI is generating scenes from source content (read-only)
composed AI composition complete, scenes populated and editable
compose-error AI composition failed — use resetStatus to return to draft
generating Video is being rendered (read-only)
ready Video generation complete, entryId populated with the Kaltura media entry
generate-error Generation failed — use resetStatus to return to composed or draft

Scenes cannot be modified while the video is in composing or generating status.

4. Auth & Headers

All server-side API endpoints require a valid Kaltura Session (KS). A user-level KS (type=0) is sufficient — no admin privileges are required. The service authenticates the KS and extracts the partnerId and userId to scope all data: each user only sees and manages their own videos and avatars.

# Generate a KS (type=0 user session is sufficient)
KS=$(curl -s -X POST "$KALTURA_SERVICE_URL/service/session/action/start" \
  -d "format=1" \
  -d "secret=$KALTURA_ADMIN_SECRET" \
  -d "partnerId=$KALTURA_PARTNER_ID" \
  -d "type=0" \
  -d "userId=creator@example.com" \
  -d "expiry=86400" | tr -d '"')

# All API calls use Bearer auth with JSON body
AVATAR_API="https://video-avatar.nvp1.ovp.kaltura.com/api/v1"

Every request uses:

  • Method: POST
  • Header: Authorization: Bearer $KS
  • Header: Content-Type: application/json
  • Body: JSON

KS requirements:

  • No special privilege strings are needed (no disableentitlement or custom privileges)
  • Both type=0 (USER) and type=2 (ADMIN) sessions work — there is no session-type check
  • Data isolation is per-user: an admin KS does not grant cross-user visibility in this service
  • The partner account must have the VOD Avatar feature enabled — contact your Kaltura account manager if avatar or video endpoints return authorization errors
  • If the KS contains a urirestrict privilege, the restricted URI pattern must match the API path

5. Avatar Templates & Configuration

Before creating a video, you need an avatar — a specific AI presenter with a chosen background. Avatars are built in two steps:

  1. Pick a template — Each template is a predefined AI character with a unique face, voice, and speaking style. You cannot create custom characters; you choose from the available set.
  2. Configure it as an avatar — Combine the template with a background (solid color, library image, or custom image from your Kaltura account). This creates a reusable avatar configuration tied to your user.

The avatar ID is then passed to video/add to assign the presenter for that video project.

Step 1: List Available Templates

Call avatarTemplate/list to get the full set of available AI characters. Each template has an id (used when creating avatars) and a display name:

curl -s -X POST "$AVATAR_API/avatarTemplate/list" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d '{}'

Response:

{
  "objects": [
    { "id": "jane", "name": "Jane" },
    { "id": "adam", "name": "Adam" },
    { "id": "amir", "name": "Amir" }
  ],
  "totalCount": 36
}

The id field (e.g., "jane", "adam") is what you pass as templateId when creating an avatar. The full set of 36 templates: adam, amir, ben, cristina, david, derek, dylan, elizabeth, gloria, harper, harry, henry, james, jane, jason, jennifer, julia, kevin, larry, lisa, maria, maya, mia, miguel, ming, rita, sam, sara, sharon, sophia, taylor, theodore, tim, victoria, william, yasmin.

Step 2: Create an Avatar (avatar/upsert)

An avatar pairs a template with a background. The upsert action is idempotent — if an avatar with the same template + background combination already exists for your user, it returns the existing one instead of creating a duplicate. This means you can safely call upsert every time without checking for existing avatars first.

AVATAR_RESULT=$(curl -s -X POST "$AVATAR_API/avatar/upsert" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d '{
    "templateId": "jane",
    "background": { "type": "color", "color": "#CEEEDB" }
  }')
AVATAR_ID=$(echo "$AVATAR_RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
echo "Avatar ID: $AVATAR_ID"

Request fields:

Field Type Required Description
templateId string yes One of the template IDs from avatarTemplate/list (e.g., "jane", "adam")
background object yes Background configuration — structure depends on the type field (see below)

background object:

Field Type Required Description
type string yes One of: "color", "library", "entry"
color string if type="color" Hex color code (e.g., "#CEEEDB", "#FFFFFF")
id string if type="library" Predefined background image ID (lowercase alphanumeric and hyphens only, e.g., "office-1")
entryId string if type="entry" Kaltura entry ID of a custom background image from your account

Background type examples:

# Solid color background
'{ "templateId": "adam", "background": { "type": "color", "color": "#1A1A2E" } }'

# Predefined library image
'{ "templateId": "adam", "background": { "type": "library", "id": "office-1" } }'

# Custom image from your Kaltura account
'{ "templateId": "adam", "background": { "type": "entry", "entryId": "0_bg7x9k2m" } }'

Response fields:

Field Type Description
id string The avatar ID — pass this as avatarId when creating video projects
templateId string The template used
background object The background configuration
createdAt string ISO 8601 creation timestamp
updatedAt string ISO 8601 last update timestamp

Get an Avatar

Retrieve an existing avatar configuration by ID:

curl -s -X POST "$AVATAR_API/avatar/get" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{ \"id\": \"$AVATAR_ID\" }"

Returns the same response structure as avatar/upsert.

Preview an Avatar

Get a PNG image showing how the avatar looks with its configured background. Use this to display a visual preview before creating videos:

curl -s -X POST "$AVATAR_API/avatar/preview" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{ \"id\": \"$AVATAR_ID\" }" \
  --output avatar_preview.png

Returns image/png binary data. The preview shows the avatar character composited on the configured background.

6. Video Project Management

A video project is the central object — it holds the avatar assignment, an ordered list of scenes (each with narration text and an optional layout), and tracks the generation status. You create a project, populate its scenes (manually or via AI composition), then generate the final video.

Create a Video Project

The video/add endpoint creates a new project. You must provide a name and an avatarId (from section 5). Scenes can be included at creation time or added later via video/update.

VIDEO_RESULT=$(curl -s -X POST "$AVATAR_API/video/add" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{
    \"name\": \"Q1 Training Overview\",
    \"avatarId\": \"$AVATAR_ID\",
    \"scenes\": [
      {
        \"layoutType\": \"full-screen\",
        \"narration\": { \"text\": \"Welcome to the Q1 training overview.\" }
      },
      {
        \"layoutType\": \"broll\",
        \"narration\": { \"text\": \"Here we see the key metrics from last quarter.\" },
        \"broll\": {
          \"entryId\": \"$BROLL_ENTRY_ID\",
          \"startTime\": 30
        }
      }
    ]
  }")
VIDEO_ID=$(echo "$VIDEO_RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
echo "Video ID: $VIDEO_ID"

Top-Level Request Fields

Field Type Required Description
name string yes Display name for the video project
avatarId string yes The avatar ID returned by avatar/upsert — determines which AI presenter appears in the video
scenes array of scene objects no Ordered list of scenes. Can be empty at creation and populated later via video/update or video/compose

Scene Object

Each element in the scenes array represents one segment of the video. A video can have up to 20 scenes.

Field Type Required Description
layoutType string enum no How the scene is displayed. Default: "full-screen"
narration object no The spoken content for this scene (see narration fields below)
broll object no Background video configuration (see broll fields below). The broll data is stored regardless of layoutType — you can set it up front and switch layoutType to "broll" later

layoutType enum:

Value Visual Description
"full-screen" Avatar fills the frame The avatar character is rendered full-screen with its configured background. Use for introductions, conclusions, and talking-head segments
"broll" Avatar overlaid on video The avatar is composited as a smaller overlay on top of a background video clip. Use when referencing visual content like charts, demos, or slides

Narration Object

Field Type Required Description
text string yes (if narration provided) The script text the avatar will speak. This is converted to audio via text-to-speech during generation
avatarId string no Override the video-level avatar for this specific scene. Omit to use the project's default avatarId. Useful for multi-presenter videos where different scenes feature different characters

Broll Object

Field Type Required Description
entryId string yes (if broll provided) Kaltura entry ID of the background video to display behind the avatar
startTime number yes (if broll provided) Start time in seconds within the background video. The clip plays from this point for the duration of the scene's narration

Response Fields

The response returns the full video object. Key fields:

Field Type Description
id string The video project ID — use as $VIDEO_ID in all subsequent calls
partnerId number Your Kaltura partner ID
userId string The KS user who created the project
status string enum Current status — starts as "draft" (see status lifecycle in section 3)
name string The project name
avatarId string The assigned avatar ID
scenes array The scenes array as submitted
entryId string or null Kaltura media entry ID of the generated video — populated only when status is "ready"
composeParams object or null The compose parameters if AI composition was used (see section 7)
createdAt string ISO 8601 creation timestamp
updatedAt string ISO 8601 last update timestamp

Scene Examples

Full-screen scene — avatar talks directly to camera:

{ "layoutType": "full-screen", "narration": { "text": "Let me introduce the agenda." } }

B-roll scene — avatar overlaid on a video clip starting at the 45-second mark:

{ "layoutType": "broll", "narration": { "text": "As you can see in this demo..." }, "broll": { "entryId": "1_xyz789", "startTime": 45 } }

Scene with per-scene avatar override — different presenter for this scene:

{ "layoutType": "full-screen", "narration": { "text": "Hi, I am Adam.", "avatarId": "ADAM_AVATAR_ID" } }

Minimal scene — layout defaults to "full-screen":

{ "narration": { "text": "This scene uses the default full-screen layout." } }

Get a Video Project

curl -s -X POST "$AVATAR_API/video/get" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{ \"id\": \"$VIDEO_ID\" }"

Returns the full VideoDto including status, entryId (if generated), and all scenes.

Update a Video Project

curl -s -X POST "$AVATAR_API/video/update" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{
    \"id\": \"$VIDEO_ID\",
    \"name\": \"Q1 Training — Updated\",
    \"scenes\": [
      {
        \"layoutType\": \"full-screen\",
        \"narration\": { \"text\": \"Updated welcome message for Q1 training.\" }
      }
    ]
  }"

Request fields:

Field Type Required Description
id string yes Video project ID
name string no Updated name
avatarId string no Updated avatar ID
scenes array no Replaces all scenes (removed trailing scenes are cleaned up)

Scenes cannot be modified while status is composing or generating — the API returns VIDEO_IS_PROCESSING.
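Since an update attempt during processing fails with VIDEO_IS_PROCESSING, it can help to check the status before editing. A minimal sketch (can_edit_scenes is a hypothetical helper, not part of the API):

```shell
# Succeed only when the project's scenes are editable,
# per the status lifecycle in section 3
can_edit_scenes() {
  case "$1" in
    composing|generating) return 1 ;;  # read-only while processing
    *) return 0 ;;                     # draft, composed, ready, error states
  esac
}

# Usage: fetch the current status, then update only if editable
# STATUS=$(curl -s -X POST "$AVATAR_API/video/get" \
#   -H "Authorization: Bearer $KS" -H "Content-Type: application/json" \
#   -d "{ \"id\": \"$VIDEO_ID\" }" \
#   | python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
# can_edit_scenes "$STATUS" && curl -s -X POST "$AVATAR_API/video/update" ...
```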

List Video Projects

curl -s -X POST "$AVATAR_API/video/list" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d '{
    "filter": { "orderBy": "-createdAt" },
    "pager": { "offset": 0, "limit": 10 }
  }'

Filter options:

Field Type Values
orderBy string "-createdAt", "createdAt", "-updatedAt", "updatedAt" (default: "-createdAt")

Pager options:

Field Type Description
offset number Number of results to skip (0-based)
limit number Maximum number of results to return

Response:

{
  "objects": [ ... ],
  "totalCount": 42
}
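When a user has more projects than fit in one page, the pager can be walked by advancing offset until totalCount is exhausted. A sketch assuming the response shape above (list_all_video_ids is a hypothetical helper name):

```shell
# Print the ID of every video project, fetching one page at a time
list_all_video_ids() {
  offset=0; limit=50; total=1
  while [ "$offset" -lt "$total" ]; do
    RESULT=$(curl -s -X POST "$AVATAR_API/video/list" \
      -H "Authorization: Bearer $KS" \
      -H "Content-Type: application/json" \
      -d "{\"filter\":{\"orderBy\":\"-createdAt\"},\"pager\":{\"offset\":$offset,\"limit\":$limit}}")
    echo "$RESULT" | python3 -c "import sys,json; [print(o['id']) for o in json.load(sys.stdin)['objects']]"
    total=$(echo "$RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin)['totalCount'])")
    offset=$((offset + limit))
  done
}
```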

Delete a Video Project

curl -s -X POST "$AVATAR_API/video/delete" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{ \"id\": \"$VIDEO_ID\" }"

7. AI Composition

The compose action uses AI to generate scenes from source video content. It analyzes captions and transcripts from the provided entries and creates a structured narration script.

Compose Scenes from Content

curl -s -X POST "$AVATAR_API/video/compose" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{
    \"id\": \"$VIDEO_ID\",
    \"formatType\": \"session-highlights\",
    \"duration\": 120,
    \"entryIds\": [\"$SOURCE_ENTRY_1\", \"$SOURCE_ENTRY_2\"],
    \"userBrief\": \"Focus on the product roadmap announcements\",
    \"generateName\": true
  }"

Request fields:

Field Type Required Description
id string yes Video project ID
formatType string yes "session-highlights" or "video-explainer" (see below)
duration number yes Target video duration in seconds. Min: 1, max: 1200 (20 minutes)
entryIds array yes Kaltura entry IDs with captions to analyze. Max: 5 entries
userBrief string no Describes the video goals, style, or focus areas for the AI
generateName boolean no Auto-generate a video name from the content

Format types:

Format Source Content Output
session-highlights Video captions only Short highlights video narrated by the avatar summarizing the key points
video-explainer Video captions + documents Explainer video combining multiple sources into a coherent narrative

The compose action:

  1. Transitions the video status to composing
  2. Extracts captions and documents from the source entries
  3. Uses AI (AWS Bedrock Claude) to generate a structured scene-by-scene narration
  4. Populates the video's scenes array with the generated content
  5. Transitions to composed on success, or compose-error on failure

Source entries must have captions or transcripts — the API returns CAPTIONS_NOT_FOUND if no text content is available.

Response: Returns the video with status composing. Poll video/get until status changes to composed.
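Polling for composition mirrors the generation polling loop in section 9. A sketch wrapped as a function (wait_for_compose is a hypothetical helper name):

```shell
# Poll every 10 seconds until AI composition succeeds or fails;
# prints the terminal status ("composed" or "compose-error")
wait_for_compose() {
  while true; do
    STATUS=$(curl -s -X POST "$AVATAR_API/video/get" \
      -H "Authorization: Bearer $KS" \
      -H "Content-Type: application/json" \
      -d "{ \"id\": \"$1\" }" \
      | python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
    echo "Status: $STATUS" >&2
    case "$STATUS" in
      composed|compose-error) echo "$STATUS"; return ;;
    esac
    sleep 10
  done
}

# RESULT=$(wait_for_compose "$VIDEO_ID")   # "composed" or "compose-error"
```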

8. Audio Preview

Preview the text-to-speech narration for a specific scene before generating the full video:

# Returns audio/mpeg binary
curl -s -X POST "$AVATAR_API/video/previewAudio" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{ \"id\": \"$VIDEO_ID\", \"sceneId\": 0 }" \
  --output scene_preview.mp3

Request fields:

Field Type Required Description
id string yes Video project ID
sceneId number yes Scene index (0-based)

The scene must have non-empty narration text — returns SCENE_EMPTY_NARRATION otherwise.

Use previewAudioStream for streaming playback instead of downloading the full file:

curl -s -X POST "$AVATAR_API/video/previewAudioStream" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{ \"id\": \"$VIDEO_ID\", \"sceneId\": 0 }" \
  --output scene_stream.mp3

9. Video Generation

Generate the Video

Once scenes are ready (status is draft or composed), generate the final video:

curl -s -X POST "$AVATAR_API/video/generate" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{ \"id\": \"$VIDEO_ID\" }"

The generate action:

  1. Transitions the video status to generating
  2. For each scene: generates TTS audio via ElevenLabs, then renders the avatar video
  3. Stitches all scene videos together with green-screen replacement and resolution normalization (1920×1080)
  4. Uploads the final video as a Kaltura media entry
  5. Sets entryId on the video and transitions to ready

Response: Returns the video with status generating. Poll video/get until status becomes ready.

Poll for Completion

# Poll every 10 seconds until status is "ready" or an error
while true; do
  RESULT=$(curl -s -X POST "$AVATAR_API/video/get" \
    -H "Authorization: Bearer $KS" \
    -H "Content-Type: application/json" \
    -d "{ \"id\": \"$VIDEO_ID\" }")

  STATUS=$(echo "$RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
  echo "Status: $STATUS"

  if [ "$STATUS" = "ready" ]; then
    ENTRY_ID=$(echo "$RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin).get('entryId',''))")
    echo "Generated entry: $ENTRY_ID"
    break
  elif [ "$STATUS" = "generate-error" ]; then
    echo "Generation failed"
    break
  fi

  sleep 10
done

Generation time depends on the number of scenes and narration length. Expect 1–5 minutes for typical videos.
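For unattended pipelines, a bounded variant of the polling loop avoids hanging forever if generation stalls. A sketch (poll_until_ready is a hypothetical helper, not part of the API):

```shell
# Poll with a deadline; prints "ready", "generate-error", or "timeout"
poll_until_ready() {
  deadline=$(( $(date +%s) + ${2:-600} ))   # default: give up after 10 minutes
  while [ "$(date +%s)" -lt "$deadline" ]; do
    STATUS=$(curl -s -X POST "$AVATAR_API/video/get" \
      -H "Authorization: Bearer $KS" \
      -H "Content-Type: application/json" \
      -d "{ \"id\": \"$1\" }" \
      | python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
    case "$STATUS" in
      ready|generate-error) echo "$STATUS"; return 0 ;;
    esac
    sleep 10
  done
  echo "timeout"; return 1
}

# FINAL=$(poll_until_ready "$VIDEO_ID" 900)   # wait up to 15 minutes
```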

Reset Status After Error

If composition or generation fails, reset the status to allow retrying:

curl -s -X POST "$AVATAR_API/video/resetStatus" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{ \"id\": \"$VIDEO_ID\" }"

Reset behavior:

  • compose-error → resets to draft
  • generate-error → resets to composed (if previously composed) or draft
  • Other statuses → returns CANNOT_RESET_STATUS

10. Complete Server-Side Workflow

This example creates an avatar video from scratch using only the server-side API:

AVATAR_API="https://video-avatar.nvp1.ovp.kaltura.com/api/v1"

# 1. Create an avatar with a color background
AVATAR=$(curl -s -X POST "$AVATAR_API/avatar/upsert" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d '{
    "templateId": "jane",
    "background": { "type": "color", "color": "#CEEEDB" }
  }')
AVATAR_ID=$(echo "$AVATAR" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")

# 2. Create a video project with scenes
VIDEO=$(curl -s -X POST "$AVATAR_API/video/add" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{
    \"name\": \"Product Update\",
    \"avatarId\": \"$AVATAR_ID\",
    \"scenes\": [
      {
        \"layoutType\": \"full-screen\",
        \"narration\": { \"text\": \"Hello! Today I will walk you through our latest product updates.\" }
      },
      {
        \"layoutType\": \"full-screen\",
        \"narration\": { \"text\": \"We have three major features to cover. Let us get started.\" }
      }
    ]
  }")
VIDEO_ID=$(echo "$VIDEO" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")

# 3. Preview audio for the first scene
curl -s -X POST "$AVATAR_API/video/previewAudio" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{\"id\": \"$VIDEO_ID\", \"sceneId\": 0}" \
  --output scene0_preview.mp3

# 4. Generate the video
curl -s -X POST "$AVATAR_API/video/generate" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{\"id\": \"$VIDEO_ID\"}"

# 5. Poll until ready
while true; do
  RESULT=$(curl -s -X POST "$AVATAR_API/video/get" \
    -H "Authorization: Bearer $KS" \
    -H "Content-Type: application/json" \
    -d "{\"id\": \"$VIDEO_ID\"}")
  STATUS=$(echo "$RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
  echo "Status: $STATUS"
  [ "$STATUS" = "ready" ] || [ "$STATUS" = "generate-error" ] && break
  sleep 10
done

# 6. Get the generated Kaltura entry ID
ENTRY_ID=$(echo "$RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin).get('entryId',''))")
echo "Generated Kaltura entry: $ENTRY_ID"

Manual Storyboard with Multi-Source B-Roll

When you need precise control over the narrative and which video clips appear in each scene, author the scenes yourself instead of using AI composition (section 7). This approach lets you pick exact source entries, set b-roll start times, and interleave full-screen and b-roll layouts in any order.

The key difference: with AI composition (video/compose), you provide source entries and the AI decides how to structure the narrative and which clips to reference. With a manual storyboard, you write each scene's narration and explicitly assign b-roll entries and timestamps — the output matches your storyboard exactly.

AVATAR_API="https://video-avatar.nvp1.ovp.kaltura.com/api/v1"

# 1. Create (or reuse) an avatar
AVATAR=$(curl -s -X POST "$AVATAR_API/avatar/upsert" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d '{
    "templateId": "jane",
    "background": { "type": "color", "color": "#1A1A2E" }
  }')
AVATAR_ID=$(echo "$AVATAR" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")

# 2. Create a video with manually authored scenes mixing two source entries
#    - Scenes 0 and 5: full-screen (avatar on background, no b-roll)
#    - Scenes 1 and 4: b-roll from $SOURCE_ENTRY_A (e.g., a keynote recording)
#    - Scenes 2 and 3: b-roll from $SOURCE_ENTRY_B (e.g., a tutorial)
VIDEO=$(curl -s -X POST "$AVATAR_API/video/add" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{
    \"name\": \"Cloud AI Meets Neural Networks\",
    \"avatarId\": \"$AVATAR_ID\",
    \"scenes\": [
      {
        \"layoutType\": \"full-screen\",
        \"narration\": { \"text\": \"Welcome to this deep dive into two pillars of modern artificial intelligence. Today we connect the dots between cloud infrastructure powering AI at scale and the neural network architectures that make it all possible.\" }
      },
      {
        \"layoutType\": \"broll\",
        \"narration\": { \"text\": \"At AWS re:Invent 2023, Amazon unveiled its vision for generative AI infrastructure. Purpose-built chips like Trainium and Inferentia are redefining how we train and deploy large language models in the cloud.\" },
        \"broll\": { \"entryId\": \"$SOURCE_ENTRY_A\", \"startTime\": 30 }
      },
      {
        \"layoutType\": \"broll\",
        \"narration\": { \"text\": \"But what exactly are these AI models learning? At their core, neural networks process data through layers of interconnected nodes, each layer extracting increasingly abstract features from raw input.\" },
        \"broll\": { \"entryId\": \"$SOURCE_ENTRY_B\", \"startTime\": 10 }
      },
      {
        \"layoutType\": \"broll\",
        \"narration\": { \"text\": \"Consider digit recognition. A neural network takes pixel values as input, detects edges and curves in hidden layers, and outputs a prediction. This elegant architecture is the foundation of modern computer vision.\" },
        \"broll\": { \"entryId\": \"$SOURCE_ENTRY_B\", \"startTime\": 60 }
      },
      {
        \"layoutType\": \"broll\",
        \"narration\": { \"text\": \"Now scale that up to the cloud. Enterprises can run these neural networks across thousands of custom accelerators, making real-time AI inference accessible to any application, anywhere in the world.\" },
        \"broll\": { \"entryId\": \"$SOURCE_ENTRY_A\", \"startTime\": 180 }
      },
      {
        \"layoutType\": \"full-screen\",
        \"narration\": { \"text\": \"The convergence of scalable cloud infrastructure and intelligent neural architectures is accelerating AI innovation faster than ever. Thank you for watching.\" }
      }
    ]
  }")
VIDEO_ID=$(echo "$VIDEO" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")

# 3. Preview a b-roll scene's narration audio before committing to generation
curl -s -X POST "$AVATAR_API/video/previewAudio" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{\"id\": \"$VIDEO_ID\", \"sceneId\": 1}" \
  --output scene1_preview.mp3

# 4. Generate — skips compose entirely, goes straight from draft to generating
curl -s -X POST "$AVATAR_API/video/generate" \
  -H "Authorization: Bearer $KS" \
  -H "Content-Type: application/json" \
  -d "{\"id\": \"$VIDEO_ID\"}"

# 5. Poll until ready
while true; do
  RESULT=$(curl -s -X POST "$AVATAR_API/video/get" \
    -H "Authorization: Bearer $KS" \
    -H "Content-Type: application/json" \
    -d "{\"id\": \"$VIDEO_ID\"}")
  STATUS=$(echo "$RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
  echo "Status: $STATUS"
  [ "$STATUS" = "ready" ] || [ "$STATUS" = "generate-error" ] && break
  sleep 10
done

ENTRY_ID=$(echo "$RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin).get('entryId',''))")
echo "Generated Kaltura entry: $ENTRY_ID"

When to use each approach:

Approach When to use
Manual storyboard (above) You know exactly what the avatar should say in each scene, which source video clips to show, and at which timestamps. Use for curated presentations, training modules, or any video where the storyboard is predetermined
AI composition (section 7) You have source entries and want the AI to analyze their captions and generate a coherent narrative automatically. Use for quick highlights, summaries, or when you do not have a specific script in mind
Hybrid Use AI composition to generate a first draft, then call video/update to refine the scenes — rewrite narration text, swap b-roll entries, adjust start times, or reorder scenes before generating
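The hybrid approach amounts to a small post-compose transformation: pull the composed project, rewrite a scene's narration, and push the scenes back with video/update. A sketch assuming the video object shape from section 6 (rewrite_scene_text is a hypothetical helper):

```shell
# Read a full video/get JSON object on stdin; print the scenes array
# with scene $1's narration text replaced by $2
rewrite_scene_text() {
  python3 -c "
import sys, json
video = json.load(sys.stdin)
video['scenes'][int(sys.argv[1])]['narration']['text'] = sys.argv[2]
print(json.dumps(video['scenes']))" "$1" "$2"
}

# Usage against the live API:
# SCENES=$(curl -s -X POST "$AVATAR_API/video/get" \
#   -H "Authorization: Bearer $KS" -H "Content-Type: application/json" \
#   -d "{ \"id\": \"$VIDEO_ID\" }" | rewrite_scene_text 0 "New opening line.")
# curl -s -X POST "$AVATAR_API/video/update" \
#   -H "Authorization: Bearer $KS" -H "Content-Type: application/json" \
#   -d "{ \"id\": \"$VIDEO_ID\", \"scenes\": $SCENES }"
```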

11. Widget Embedding

The VOD Avatar Studio is also available as a drop-in browser widget via the Unisphere framework. The widget uses the server-side API internally and provides a full UI for avatar selection, script editing, AI composition, and video generation.

Basic Embed

<div id="avatar-studio" style="width: 100%; height: 100vh;"></div>
<script type="module">
  import { loader } from "https://unisphere.nvp1.ovp.kaltura.com/v1/loader/index.esm.js";

  const workspace = await loader({
    serverUrl: "https://unisphere.nvp1.ovp.kaltura.com/v1",
    appId: "my-app",
    appVersion: "1.0.0",
    session: { ks: "$KALTURA_KS", partnerId: $KALTURA_PARTNER_ID },
    runtimes: [{
      widgetName: "unisphere.widget.vod-avatars",
      runtimeName: "studio",
      settings: {
        ks: "$KALTURA_KS",
        partnerId: $KALTURA_PARTNER_ID,
        kalturaServerURI: "https://www.kaltura.com"
      },
      visuals: [{
        type: "page",
        target: "avatar-studio",
        settings: {}
      }]
    }]
  });

  const studio = await workspace.getRuntimeAsync(
    "unisphere.widget.vod-avatars",
    "studio"
  );
</script>

Runtime Settings

Parameter Type Required Description
ks string yes Kaltura Session token — user-level (type=0) is sufficient
partnerId number yes Partner ID — must be a number, not a string
kalturaServerURI string yes Kaltura API server URL (e.g., https://www.kaltura.com)
entryLink function no (entryId: string) => string — returns a URL for navigating to an entry in the host application
handleShare function no (entryId: string) => void — called when the user clicks share on a generated video
allowedProjectTypes array no Restricts available project types (see below). Default: all types available
initialView string no "videoLibrary" (default) or "projectBuilder" — which view to show on load
additionalEsearchFilters object no Extra eSearch filters for the media picker when selecting source content
loadThumbnailWithKS boolean no Append KS to thumbnail URLs for access-controlled thumbnails

Project Types

The widget supports three project creation flows, controlled by allowedProjectTypes:

Value Label Description
"fromScratch" Start from scratch Create an avatar video by writing scenes manually
"session-highlights" Create session highlights AI composes a highlights video from recorded session captions
"video-explainer" Generate a video on any topic AI composes an explainer from video captions and documents
// Only allow manual creation (no AI composition)
settings: {
  ks: "$KALTURA_KS",
  partnerId: $KALTURA_PARTNER_ID,
  kalturaServerURI: "https://www.kaltura.com",
  allowedProjectTypes: ["fromScratch"]
}

Host-Page Callbacks

settings: {
  ks: "$KALTURA_KS",
  partnerId: $KALTURA_PARTNER_ID,
  kalturaServerURI: "https://www.kaltura.com",
  entryLink: (entryId) => `https://myapp.com/media/${entryId}`,
  handleShare: (entryId) => {
    navigator.clipboard.writeText(`https://myapp.com/share/${entryId}`);
  }
}

Workspace Lifecycle

// Refresh the KS when it approaches expiry
workspace.session.setData(prev => ({ ...prev, ks: "new-ks-value" }));

// Destroy the workspace when the user navigates away
workspace.kill();

Widget Behavior

  • Auto-save: Scene edits are auto-saved after a 5-second debounce
  • Polling: The widget polls video/get every 10 seconds during generation
  • Max scenes: 20 scenes per video
  • Default avatar: jane template with #CEEEDB background

12. Error Handling

Server-Side API Errors

| Error Code | Meaning | Resolution |
| --- | --- | --- |
| VIDEO_IS_PROCESSING | Scenes cannot be modified while composing or generating | Wait for the current operation to complete |
| VIDEO_CANNOT_COMPOSE | Video status does not allow composition | Use resetStatus if in an error state, or wait for the current operation |
| VIDEO_CANNOT_GENERATE | Video status does not allow generation | Ensure the video is in draft or composed status |
| VIDEO_IS_BEING_GENERATED | A generation is already in progress | Wait for it to complete |
| CANNOT_RESET_STATUS | The video is not in a resettable error status | Only compose-error and generate-error can be reset |
| SCENE_NOT_FOUND | Scene index out of range | Check the scene count in the video |
| SCENE_EMPTY_NARRATION | Scene has no narration text | Add narration text before previewing audio |
| CAPTIONS_NOT_FOUND | Source entries have no captions | Add captions/transcripts to source entries before composing |
| TOO_MANY_SOURCES | More than 5 source entries | Reduce to 5 or fewer entry IDs |
| AVATAR_NOT_FOUND | Invalid avatar ID | Create an avatar with avatar.upsert first |
| AVATAR_TEMPLATE_NOT_FOUND | Invalid template ID | Use an ID from avatarTemplate.list |
| BACKGROUND_NOT_FOUND | Invalid library background ID | Use a valid background ID from the asset library |
| VIDEO_AVATAR_NOT_CONFIGURED | Video has no avatar set | Set avatarId when creating or updating the video |
| INVALID_STATUS_TRANSITION | Status change not allowed | Follow the status lifecycle diagram in section 3 |
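Server-side callers can branch on these codes to decide how to recover. A minimal sketch; the grouping into transient, resettable, and caller-error codes and the action names are this sketch's own, not part of the API:

```javascript
// Illustrative recovery helper: map a VOD Avatar API error code to a next
// step. The codes come from the table above; the returned action names are
// placeholders defined by this sketch, not values returned by the API.
function recoveryAction(errorCode) {
  // Transient: another operation is running, so wait and retry unchanged.
  const transient = new Set(["VIDEO_IS_PROCESSING", "VIDEO_IS_BEING_GENERATED"]);
  if (transient.has(errorCode)) return "retry-later";
  // Possibly stuck in compose-error / generate-error: try resetStatus first.
  const resettable = new Set(["VIDEO_CANNOT_COMPOSE", "VIDEO_CANNOT_GENERATE"]);
  if (resettable.has(errorCode)) return "reset-then-retry";
  // Everything else is a request problem: fix the input and resend.
  return "fix-request";
}
```

Validation errors such as TOO_MANY_SOURCES or SCENE_EMPTY_NARRATION should never be retried blindly; correct the request first.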

Widget Errors

  • Blank studio — Verify the KS is valid and partnerId is a number (not string). Check browser console for API errors
  • No avatars available — The account needs VOD Avatar feature provisioning
  • Generation fails — Ensure the KS is valid and not expired. If generation repeatedly fails, contact your Kaltura account manager to verify the VOD Avatar rendering pipeline is provisioned
  • KS expiry — Update reactively: workspace.session.setData(prev => ({ ...prev, ks: "new-ks" }))

13. Best Practices

  • Generate the KS server-side. The KS is visible in client-side code — generate it on your backend and pass it to the widget
  • Set partnerId as a number. The VOD Avatar widget requires partnerId as a number type, not a string
  • Ensure captions before composing. Source entries need captions or transcripts for AI composition. Use REACH to add captions first
  • Poll at 10-second intervals. The widget uses 10-second polling; match this in server-side integrations
  • Handle error states. Use resetStatus to recover from compose-error or generate-error before retrying
  • Preview audio before generating. Use previewAudio to verify narration quality — generation is more expensive
  • Limit source entries. AI composition accepts at most 5 source entries. Select the most relevant content
  • Process generated videos. The resulting Kaltura entry can be enriched via REACH (captions, translation), Content Lab (chapters, summaries), or Agents (automated workflows)
  • Use HTTPS. The Unisphere loader and all widget bundles require HTTPS
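Several of these practices (10-second polling, resetStatus recovery before retrying) combine in a server-side wait loop. A minimal sketch, assuming the video is fetched via a GET /video/{id} endpoint on the server-side API and that the response carries a status field; "ready" is used as a placeholder terminal status, so check the status lifecycle in section 3 for the actual values:

```javascript
// Poll a video's status every 10 seconds until generation finishes.
// Assumptions (not confirmed by this guide): GET /video/{id} returns the
// video as JSON with a `status` field, and "ready" marks completion.
async function waitForGeneration(ks, videoId, {
  intervalMs = 10000,   // match the widget's 10-second cadence
  maxAttempts = 90,     // give up after roughly 15 minutes
  fetchFn = fetch,      // injectable for testing
} = {}) {
  const base = "https://video-avatar.nvp1.ovp.kaltura.com/api/v1";
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetchFn(`${base}/video/${videoId}`, {
      headers: { Authorization: `Bearer ${ks}` },
    });
    const video = await res.json();
    if (video.status === "generate-error") {
      throw new Error("Generation failed; call resetStatus before retrying");
    }
    if (video.status === "ready") return video; // placeholder terminal status
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Video ${videoId} not ready after ${maxAttempts} polls`);
}
```

Injecting `fetchFn` keeps the loop unit-testable without hitting the live API.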

14. Multi-Region

| Region | Server-Side API | Widget URL |
| --- | --- | --- |
| NVP1 (US, default) | https://video-avatar.nvp1.ovp.kaltura.com/api/v1 | https://unisphere.nvp1.ovp.kaltura.com/v1 |
| IRP2 (EU) | https://video-avatar.irp2.ovp.kaltura.com/api/v1 | https://unisphere.irp2.ovp.kaltura.com/v1 |
| FRP2 (DE) | https://video-avatar.frp2.ovp.kaltura.com/api/v1 | https://unisphere.frp2.ovp.kaltura.com/v1 |
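Both base URLs follow the same pattern, so region selection can be centralized in one place. A small hypothetical helper; the function name and the validation against the three regions listed above are this sketch's own:

```javascript
// Hypothetical helper: build the server-side API and widget base URLs
// for a Kaltura region code (nvp1 is the default per this guide).
function avatarStudioUrls(region = "nvp1") {
  const supported = ["nvp1", "irp2", "frp2"]; // regions from the table above
  if (!supported.includes(region)) {
    throw new Error(`Unsupported region: ${region}`);
  }
  return {
    api: `https://video-avatar.${region}.ovp.kaltura.com/api/v1`,
    widget: `https://unisphere.${region}.ovp.kaltura.com/v1`,
  };
}
```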

15. Related Guides

  • Conversational Avatar Embed — Real-time AI avatar conversations via iframe SDK or WebRTC — the live counterpart to this pre-recorded studio
  • Unisphere Framework — The micro-frontend framework that powers the widget embed: loader, workspace lifecycle, services
  • Experience Components Overview — Index of all embeddable components with shared guidelines
  • REACH API — Add captions and transcripts to source entries before AI composition, or enrich generated avatar videos
  • Content Lab API — Generate summaries, chapters, or clips from avatar videos
  • Session Guide — KS generation and privilege management
  • AppTokens API — Production token management for secure KS generation