Image to Video
Create Task
POST
/v1/videos/image2video
cURL
curl --location --request POST 'https://api-singapore.klingai.com/v1/videos/image2video' \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--data-raw '{
    "model_name": "kling-v2-6",
    "image": "https://p2-kling.klingai.com/kcdn/cdn-kcdn112452/kling-qa-test/multi-2.png",
    "image_tail": "https://p2-kling.klingai.com/kcdn/cdn-kcdn112452/kling-qa-test/multi-1.png",
    "prompt": "Camera zooms out, the girl smiles",
    "negative_prompt": "",
    "duration": "5",
    "mode": "pro",
    "sound": "off",
    "callback_url": "",
    "external_task_id": ""
}'
200
{
  "code": 0, // Error codes; Specific definitions can be found in "Error Code"
  "message": "string", // Error information
  "request_id": "string", // Request ID, generated by the system
  "data": {
    "task_id": "string", // Task ID, generated by the system
    "task_info": { // Task creation parameters
      "external_task_id": "string" // Customer-defined task ID
    },
    "task_status": "string", // Task status, Enum values: submitted, processing, succeed, failed
    "created_at": 1722769557708, // Task creation time, Unix timestamp, unit ms
    "updated_at": 1722769557708 // Task update time, Unix timestamp, unit ms
  }
}
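The request and response above can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names `build_create_task_request` and `parse_create_task_response` are hypothetical, the host is copied from the cURL sample, and treating `code == 0` as success is an assumption based on the 200 example.

```python
import json
import urllib.request

# Host copied from the cURL sample above; other regions may use a different host.
API_BASE = "https://api-singapore.klingai.com"


def build_create_task_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for /v1/videos/image2video."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/v1/videos/image2video",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def parse_create_task_response(raw: bytes) -> str:
    """Return the task_id from a create-task response body.

    Assumption: code 0 means success; any other value is treated as an error.
    """
    reply = json.loads(raw)
    if reply["code"] != 0:
        raise RuntimeError(
            f"request {reply.get('request_id')} failed: {reply.get('message')}"
        )
    return reply["data"]["task_id"]
```

To send the request, pass the returned object to `urllib.request.urlopen` and feed the response bytes to `parse_create_task_response`.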
Please note that in order to maintain naming consistency, the original model field has been changed to model_name. Please use this field to specify the model version in the future.
We maintain backward compatibility. If you continue using the original model field, it will not affect API calls and will be equivalent to the default behavior when model_name is empty (i.e., calling the V1 model).

Request Header
Content-Type
string
Required
Default to application/json

Data Exchange Format

Authorization
string
Required

Authentication information, refer to API authentication

Request Body
model_name
string
Optional
Default to kling-v1

Model Name

Enum values:
kling-v1
kling-v1-5
kling-v1-6
kling-v2-master
kling-v2-1
kling-v2-1-master
kling-v2-5-turbo
kling-v2-6
kling-v3
image
string
Optional

Reference Image

Supports image Base64 encoding or image URL (ensure accessibility)
Important: When using Base64, do NOT add any prefix like data:image/png;base64,. Submit only the raw Base64 string.
Correct Base64 format:
iVBORw0KGgoAAAANSUhEUgAAAAUA...
Incorrect Base64 format (with data: prefix):
data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA...
Supported image formats: .jpg / .jpeg / .png
File size: ≤10MB, dimensions: min 300px, aspect ratio: 1:2.5 ~ 2.5:1
At least one of image or image_tail must be provided; both cannot be empty

Support varies by model version and video mode. See Capability Map for details.
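The Base64 rules above can be sketched as two small helpers. This is an illustrative sketch: the function names are hypothetical, and interpreting "10MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
import base64

# "File size: <=10MB" from the field description; reading MB as MiB is an assumption.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def to_raw_base64(image_bytes: bytes) -> str:
    """Encode image bytes as a raw Base64 string with no data: prefix."""
    if len(image_bytes) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the 10MB limit")
    return base64.b64encode(image_bytes).decode("ascii")


def strip_data_uri_prefix(value: str) -> str:
    """Remove a leading 'data:<mime>;base64,' prefix if one is present."""
    if value.startswith("data:") and ";base64," in value:
        return value.split(";base64,", 1)[1]
    return value
```

`strip_data_uri_prefix` is useful when the Base64 string comes from a browser or another tool that emits data URIs.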

image_tail
string
Optional

Reference Image - End frame control

Supports image Base64 encoding or image URL (ensure accessibility)
Important: When using Base64, do NOT add any prefix like data:image/png;base64,. Submit only the raw Base64 string.
Supported image formats: .jpg / .jpeg / .png
File size: ≤10MB, dimensions: min 300px
At least one of image or image_tail must be provided; both cannot be empty
image_tail, dynamic_masks/static_mask, and camera_control are mutually exclusive - only one can be used at a time

Support varies by model version and video mode. See Capability Map for details.
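The constraints on image and image_tail can be checked on the client before a request is sent. The following is a minimal sketch of such a check; `check_frame_inputs` is a hypothetical helper, and the server remains the authority on what is accepted.

```python
def check_frame_inputs(payload: dict) -> None:
    """Raise ValueError if the frame-related fields break the documented rules."""
    # At least one of image or image_tail must be provided.
    if not payload.get("image") and not payload.get("image_tail"):
        raise ValueError("provide at least one of image or image_tail")

    # image_tail, dynamic_masks/static_mask, and camera_control are mutually exclusive.
    groups = {
        "image_tail": bool(payload.get("image_tail")),
        "dynamic_masks/static_mask": bool(
            payload.get("dynamic_masks") or payload.get("static_mask")
        ),
        "camera_control": bool(payload.get("camera_control")),
    }
    used = [name for name, present in groups.items() if present]
    if len(used) > 1:
        raise ValueError(f"mutually exclusive fields used together: {', '.join(used)}")
```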

multi_shot
boolean
Optional
Default to false

Whether to generate multi-shot video

When true: the top-level prompt parameter is ignored (unless shot_type is intelligence, where it is required), and first/end frame generation is not supported.

When false: the shot_type and multi_prompt parameters are ignored

shot_type
string
Optional

Storyboard method

Enum values:
customize
intelligence

When multi_shot is true, this parameter is required

prompt
string
Optional

Positive text prompt

With the Omni model, a prompt can reference elements, images, videos, and other content to achieve various capabilities:

Specify elements/images/videos using <<<>>> format, e.g.: <<<element_1>>>, <<<image_1>>>, <<<video_1>>>
For detailed capabilities, see: KLING Omni Model User Guide, Kling VIDEO 3.0 Omni Model User Guide
Cannot exceed 2500 characters
When multi_shot is false or shot_type is intelligence, this parameter must not be empty.
Use <<<voice_1>>> to specify voice, with the sequence matching the voice_list parameter order
A video generation task can reference up to 2 voices; when specifying a voice, the sound parameter must be "on"
The simpler the syntax structure, the better. Example: The man<<<voice_1>>> said: "Hello"
When voice_list is not empty and prompt references voice ID, the task will be billed as "with specified voice"

Support varies by model version and video mode. See Capability Map for details.

multi_prompt
array
Optional

Information about each storyboard, such as prompts and duration

Define each shot's sequence number, prompt, and duration through the index, prompt, and duration parameters:

Supports 1 to 6 storyboards.
The prompt for each storyboard cannot exceed 512 characters.
The duration of each storyboard must be at least 1 and cannot exceed the total duration.
The durations of all storyboards must add up to the total duration of the current task.

Pass the array in the following format:

"multi_prompt":[
{"index":int,"prompt":"string","duration":"5"},
{"index":int,"prompt":"string","duration":"5"}
]

When multi_shot is true and shot_type is customize, this parameter is required.
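The storyboard rules above can be verified before submission. This is a sketch under the stated rules only; `check_multi_prompt` is a hypothetical name, and durations are parsed from strings because the samples pass them as strings.

```python
def check_multi_prompt(multi_prompt: list[dict], total_duration: int) -> None:
    """Raise ValueError if the storyboard list breaks the documented limits."""
    if not 1 <= len(multi_prompt) <= 6:
        raise ValueError("multi_prompt must contain 1 to 6 storyboards")

    for shot in multi_prompt:
        if len(shot["prompt"]) > 512:
            raise ValueError(f"storyboard {shot['index']}: prompt exceeds 512 characters")
        seconds = int(shot["duration"])
        if not 1 <= seconds <= total_duration:
            raise ValueError(f"storyboard {shot['index']}: duration out of range")

    # The per-shot durations must add up to the task's total duration.
    if sum(int(shot["duration"]) for shot in multi_prompt) != total_duration:
        raise ValueError("storyboard durations do not add up to the total duration")
```

For example, six shots lasting 2, 3, 3, 3, 3, and 1 seconds pass the check for a total duration of 15.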

negative_prompt
string
Optional

Negative text prompt

Cannot exceed 2500 characters
It is recommended to supplement the negative prompt with negative sentences inside the positive prompt
element_list
array
Optional

Reference Element List, based on element ID from element library

Supports up to 3 reference elements

Elements are divided into video customization elements (Video Character Elements) and image customization elements (Multi-Image Elements), each with a distinct scope of application. Take care to distinguish between them. See Kling Element Library User Guide.

Pass the array in the following format:
"element_list":[
  { "element_id": long },
  { "element_id": long }
]

Support varies by model version and video mode. See Capability Map for details.

element_id
long
Required

Element ID from element library

voice_list
array
Optional

List of voices referenced when generating videos

A video generation task can reference up to 2 voices
When voice_list is not empty and prompt references voice ID, the task will be billed as "with specified voice"
voice_id comes from the voice customization API or from the system preset voices (see Custom Voices API). It is NOT the voice_id of the Lip-Sync API
element_list and voice_list are mutually exclusive and cannot coexist

Example:

"voice_list":[
  {"voice_id":"voice_id_1"},
  {"voice_id":"voice_id_2"}
]

The support range for different model versions and video modes varies. For details, see Capability Map
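The voice rules can be combined into one client-side check. This is a sketch with a hypothetical helper name; the check that every `<<<voice_N>>>` tag has a matching voice_list entry is an assumption inferred from the statement that tag order follows voice_list order.

```python
import re

# Matches tags such as <<<voice_1>>> and captures the number.
VOICE_TAG = re.compile(r"<<<voice_(\d+)>>>")


def check_voice_usage(payload: dict) -> None:
    """Raise ValueError if voice-related fields break the documented rules."""
    voices = payload.get("voice_list") or []
    if len(voices) > 2:
        raise ValueError("a task can reference up to 2 voices")
    if voices and payload.get("element_list"):
        raise ValueError("element_list and voice_list cannot coexist")

    referenced = {int(n) for n in VOICE_TAG.findall(payload.get("prompt") or "")}
    # Assumption: <<<voice_N>>> refers to the N-th entry of voice_list (1-based).
    if any(n < 1 or n > len(voices) for n in referenced):
        raise ValueError("prompt references a voice that is not in voice_list")
    if referenced and payload.get("sound") != "on":
        raise ValueError('sound must be "on" when the prompt specifies a voice')
```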

sound
string
Optional
Default to off

Whether to generate sound when generating video

Enum values:
on
off

The support range for different model versions and video modes varies. For details, see Capability Map

cfg_scale
float
Optional
Default to 0.5

Flexibility in video generation; higher value means lower model flexibility and stronger relevance to user prompt

Value range: [0, 1]

kling-v2.x models do not support this parameter

mode
string
Optional
Default to std

Video generation mode

Enum values:
std
pro
std: Standard Mode - basic mode, cost-effective
pro: Professional Mode (High Quality) - high performance mode, better video quality

Support varies by model version and video mode. See Capability Map for details.

static_mask
string
Optional

Static brush mask area (mask image created by user using motion brush)

The "Motion Brush" feature includes Dynamic Brush (dynamic_masks) and Static Brush (static_mask)

Supports image Base64 encoding or image URL (same format requirements as image field)
Supported image formats: .jpg / .jpeg / .png
Aspect ratio must match the input image (image field), otherwise task will fail
Resolution of static_mask and dynamic_masks.mask must be identical, otherwise task will fail

Support varies by model version and video mode. See Capability Map for details.

dynamic_masks
array
Optional

Dynamic brush configuration list

Can configure multiple groups (up to 6), each containing "mask area" and "motion trajectory" sequence

Support varies by model version and video mode. See Capability Map for details.

mask
string
Required

Dynamic brush mask area (mask image created by user using motion brush)

Supports image Base64 encoding or image URL (same format requirements as image field)
Supported image formats: .jpg / .jpeg / .png
Aspect ratio must match the input image (image field), otherwise task will fail
Resolution of static_mask and dynamic_masks.mask must be identical, otherwise task will fail
trajectories
array
Required

Motion trajectory coordinate sequence

For 5s video, trajectory length ≤77, coordinate count range: [2, 77]
Coordinate system uses bottom-left corner of image as origin

Note 1: More coordinate points = more accurate trajectory. 2 points = straight line between them

Note 2: Trajectory direction follows input order. First coordinate is start point, subsequent coordinates are connected sequentially

x
int
Required

X coordinate of trajectory point (pixel coordinate with image bottom-left as origin)

y
int
Required

Y coordinate of trajectory point (pixel coordinate with image bottom-left as origin)
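Most image tools report pixel positions with the top-left corner as origin, while trajectories use the bottom-left corner. The sketch below converts between the two. The helper name is hypothetical, and the `height - 1 - y` flip assumes zero-based integer pixel indices, which the page does not state explicitly.

```python
def to_trajectories(points_top_left: list[tuple[int, int]], image_height: int) -> list[dict]:
    """Convert (x, y) points with a top-left origin into trajectory entries.

    Assumption: coordinates are zero-based pixel indices, so the vertical
    flip is image_height - 1 - y.
    """
    # The documented coordinate count range is [2, 77].
    if not 2 <= len(points_top_left) <= 77:
        raise ValueError("a trajectory needs between 2 and 77 points")
    return [{"x": x, "y": image_height - 1 - y} for x, y in points_top_left]
```

The order of the input points is preserved, so the first point stays the start of the trajectory.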

camera_control
object
Optional

Camera movement control protocol (if not specified, model will intelligently match based on input text/images)

Support varies by model version and video mode. See Capability Map for details.

type
string
Required

Predefined camera movement type

Enum values:
simple
down_back
forward_up
right_turn_forward
left_turn_forward
simple: Simple camera movement, can choose one of six options in "config"
down_back: Camera descends and moves backward ➡️ Pan down and zoom out. config parameter not required
forward_up: Camera moves forward and tilts up ➡️ Zoom in and pan up. config parameter not required
right_turn_forward: Rotate right then move forward ➡️ Right rotation advance. config parameter not required
left_turn_forward: Rotate left then move forward ➡️ Left rotation advance. config parameter not required
config
object
Optional

Contains 6 fields to specify camera movement in different directions

Required when type is "simple"; leave empty for other types
Choose only one parameter to be non-zero; rest must be 0
horizontal
float
Optional

Horizontal movement - camera translation along x-axis

Value range: [-10, 10]. Negative = left, Positive = right
vertical
float
Optional

Vertical movement - camera translation along y-axis

Value range: [-10, 10]. Negative = down, Positive = up
pan
float
Optional

Horizontal pan - camera rotation around y-axis

Value range: [-10, 10]. Negative = rotate left, Positive = rotate right
tilt
float
Optional

Vertical tilt - camera rotation around x-axis

Value range: [-10, 10]. Negative = tilt down, Positive = tilt up
roll
float
Optional

Roll - camera rotation around z-axis

Value range: [-10, 10]. Negative = counterclockwise, Positive = clockwise
zoom
float
Optional

Zoom - controls camera focal length change, affects field of view

Value range: [-10, 10]. Negative = longer focal length (narrower FOV), Positive = shorter focal length (wider FOV)
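Because exactly one config field may be non-zero for the simple type, building the object through a helper avoids accidental combinations. This is a minimal sketch; `simple_camera_control` is a hypothetical name.

```python
CONFIG_FIELDS = ("horizontal", "vertical", "pan", "tilt", "roll", "zoom")


def simple_camera_control(field: str, value: float) -> dict:
    """Build a camera_control object of type "simple" with one non-zero field."""
    if field not in CONFIG_FIELDS:
        raise ValueError(f"unknown config field: {field}")
    if value == 0 or not -10 <= value <= 10:
        raise ValueError("value must be non-zero and within [-10, 10]")
    # All six fields are sent; only the chosen one is non-zero.
    config = {name: 0 for name in CONFIG_FIELDS}
    config[field] = value
    return {"type": "simple", "config": config}
```

For instance, `simple_camera_control("zoom", -5)` requests a longer focal length with every other movement set to 0.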
duration
string
Optional
Default to 5

Video duration in seconds

Enum values:
3
4
5
6
7
8
9
10
11
12
13
14
15

Support varies by model version and video mode. See Capability Map for details.

watermark_info
object
Optional

Whether to generate watermarked results simultaneously

Defined by the enabled parameter, format:
  "watermark_info": { "enabled": boolean } 
true: generate watermarked result, false: do not generate
Custom watermarks are not currently supported
callback_url
string
Optional

Callback notification URL for task result. If configured, server will notify when task status changes.

For specific message schema, see Callback Protocol
external_task_id
string
Optional

Customized Task ID

Will not overwrite system-generated task ID, but supports querying task by this ID
Must be unique within a single user account
Scenario invocation examples
Image to video with multi-shot
curl --location 'https://xxx/v1/videos/image2video' \
--header 'Authorization: Bearer xxx' \
--header 'Content-Type: application/json' \
--data '{
    "model_name": "kling-v3",
    "image": "xxx",
    "prompt": "",
    "multi_shot": true,
    "shot_type": "customize",
    "multi_prompt": [
        {
            "index": 1,
            "prompt": "Two friends talking under a streetlight at night.  Warm glow, casual poses, no dialogue.",
            "duration": "2"
        },
        {
            "index": 2,
            "prompt": "A runner sprinting through a forest, leaves flying.  Low-angle shot, focus on movement.",
            "duration": "3"
        },
        {
            "index": 3,
            "prompt": "A woman hugging a cat, smiling.  Soft sunlight, cozy home setting, emphasize warmth.",
            "duration": "3"
        },
        {
            "index": 4,
            "prompt": "A door creaking open, shadowy hallway.  Dark tones, minimal details, eerie mood.",
            "duration": "3"
        },
        {
            "index": 5,
            "prompt": "A man slipping on a banana peel, shocked expression.  Exaggerated pose, bright colors.",
            "duration": "3"
        },
        {
            "index": 6,
            "prompt": "A sunset over mountains, small figure walking away.  Wide angle, peaceful atmosphere.",
            "duration": "1"
        }
    ],
    "negative_prompt": "",
    "duration": "15",
    "mode": "pro",
    "sound": "on",
    "callback_url": "",
    "external_task_id": ""
}'
Image to video with element
curl --location 'https://api-singapore.klingai.com/v1/videos/image2video' \
--header 'Authorization: Bearer xxx' \
--header 'Content-Type: application/json' \
--data '{
    "model_name": "kling-v3",
    "image": "xxx",
    "prompt": "<<<element_1>>> and <<<element_2>>> greet <<<element_3>>> in the scene from the image",
    "element_list": [
        {
            "element_id": 160
        },
        {
            "element_id": 161
        },
        {
            "element_id": 159
        }
    ],
    "external_task_id": "",
    "callback_url": ""
}'
curl --location 'https://xxx/v1/videos/text2video' \
--header 'Authorization: Bearer xxx' \
--header 'Content-Type: application/json' \
--data '{
    "model_name": "kling-v3",
    "prompt": "",
    "multi_prompt": [
        {
            "index": 1,
            "prompt": "Two friends talking under a streetlight at night.  Warm glow, casual poses, no dialogue.",
            "duration": "2"
        },
        {
            "index": 2,
            "prompt": "A runner sprinting through a forest, leaves flying.  Low-angle shot, focus on movement.",
            "duration": "3"
        },
        {
            "index": 3,
            "prompt": "A woman hugging a cat, smiling.  Soft sunlight, cozy home setting, emphasize warmth.",
            "duration": "3"
        },
        {
            "index": 4,
            "prompt": "A door creaking open, shadowy hallway.  Dark tones, minimal details, eerie mood.",
            "duration": "3"
        },
        {
            "index": 5,
            "prompt": "A man slipping on a banana peel, shocked expression.  Exaggerated pose, bright colors.",
            "duration": "3"
        },
        {
            "index": 6,
            "prompt": "A sunset over mountains, small figure walking away.  Wide angle, peaceful atmosphere.",
            "duration": "1"
        }
    ],
    "multi_shot": true,
    "shot_type": "customize",
    "duration": "15",
    "mode": "pro",
    "sound": "on",
    "aspect_ratio": "9:16",
    "callback_url": "",
    "external_task_id": ""
}'
Generate video with voice control
If the prompt contains specific dialogue, enclose the dialogue in quotation marks.
curl --location 'https://api-singapore.klingai.com/v1/videos/image2video' \
--header 'Authorization: Bearer {Replace your token}' \
--header 'Content-Type: application/json; charset=utf-8' \
--data '{
    "model_name": "kling-v2-6",
    "image": "Replace the URL of image",
    "prompt": "<<<voice_1>>>Ask the people in the picture to say the following words, '\''Welcome everyone'\''",
    "voice_list": [
        {
            "voice_id": "Replace the ID of voice"
        }
    ],
    "duration": "5",
    "mode": "pro",
    "sound": "on",
    "callback_url": "",
    "external_task_id": ""
}'
Query Task (Single)
GET
/v1/videos/image2video/{id}
cURL
curl --request GET \
  --url https://api-singapore.klingai.com/v1/videos/image2video/{task_id} \
  --header 'Authorization: Bearer <token>'
200
{
  "code": 0, // Error codes; Specific definitions can be found in "Error Code"
  "message": "string", // Error information
  "request_id": "string", // Request ID, generated by the system, is used to track requests and troubleshoot problems
  "data": {
    "task_id": "string", // Task ID, generated by the system
    "task_status": "string", // Task status, Enum values: submitted, processing, succeed, failed
    "task_status_msg": "string", // Task status information, displaying the failure reason when the task fails (such as triggering the content risk control of the platform, etc.)
    "watermark_info": {
      "enabled": boolean
    },
    "task_result": {
      "videos": [
        {
          "id": "string", // Generated video ID; globally unique
          "url": "string", // URL of the generated video (for information security, generated images and videos are cleared after 30 days; save them promptly)
          "watermark_url": "string", // Watermarked video download URL, anti-leech format
          "duration": "string" // Total video duration, unit: s
        }
      ]
    },
    "task_info": { // Task creation parameters
      "external_task_id": "string" // Customer-defined task ID
    },
    "final_unit_deduction": "string", // Units deducted for the task
    "created_at": 1722769557708, // Task creation time, Unix timestamp, unit: ms
    "updated_at": 1722769557708 // Task update time, Unix timestamp, unit: ms
  }
}
Request Header
Content-Type
string
Required
Default to application/json

Data Exchange Format

Authorization
string
Required

Authentication information, refer to API authentication

Path Parameters
task_id
string
Optional

Task ID for image to video

Request path parameter, fill value directly in request path
Query by either task_id or external_task_id
external_task_id
string
Optional

Customized Task ID for image to video

The external_task_id provided when creating the task
Query by either task_id or external_task_id
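Since tasks move through submitted, processing, succeed, and failed, a client usually polls the single-task query until a terminal status appears. The sketch below shows one way to do that. `wait_for_task` is a hypothetical helper, the `fetch` callable stands in for whatever code performs the GET request and returns the parsed JSON, and the 5-second interval and 120-poll limit are arbitrary assumptions rather than documented values.

```python
import time
from typing import Callable


def wait_for_task(
    fetch: Callable[[], dict],
    interval_s: float = 5.0,   # assumption: not a documented polling interval
    max_polls: int = 120,      # assumption: arbitrary upper bound
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict]:
    """Poll until the task succeeds and return task_result.videos."""
    for _ in range(max_polls):
        data = fetch()["data"]
        status = data["task_status"]
        if status == "succeed":
            return data["task_result"]["videos"]
        if status == "failed":
            # task_status_msg carries the failure reason.
            raise RuntimeError(data.get("task_status_msg") or "task failed")
        sleep(interval_s)
    raise TimeoutError("task did not finish within the polling limit")
```

Configuring callback_url when creating the task is an alternative to polling.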
Query Task (List)
GET
/v1/videos/image2video
cURL
curl --request GET \
  --url 'https://api-singapore.klingai.com/v1/videos/image2video?pageNum=1&pageSize=30' \
  --header 'Authorization: Bearer <token>'
200
{
  "code": 0, // Error codes; Specific definitions can be found in Error codes
  "message": "string", // Error information
  "request_id": "string", // Request ID, generated by the system, to track requests and troubleshoot problems
  "data": [
    {
      "task_id": "string", // Task ID, generated by the system
      "task_status": "string", // Task status, Enum values: submitted, processing, succeed, failed
      "task_status_msg": "string", // Task status information, displaying the failure reason when the task fails (such as triggering the content risk control of the platform, etc.)
      "task_info": { // Task creation parameters
        "external_task_id": "string" // Customer-defined task ID
      },
      "task_result": {
        "videos": [
          {
            "id": "string", // Generated video ID; globally unique
            "url": "string", // URL of the generated video (for information security, generated images and videos are cleared after 30 days; save them promptly)
            "watermark_url": "string", // Watermarked video download URL, anti-leech format
            "duration": "string" // Total video duration, unit: s (seconds)
          }
        ]
      },
      "watermark_info": {
        "enabled": boolean
      },
      "final_unit_deduction": "string", // Units deducted for the task
      "created_at": 1722769557708, // Task creation time, Unix timestamp, unit: ms
      "updated_at": 1722769557708 // Task update time, Unix timestamp, unit: ms
    }
  ]
}
Request Header
Content-Type
string
Required
Default to application/json

Data Exchange Format

Authorization
string
Required

Authentication information, refer to API authentication

Query Parameters
pageNum
int
Optional
Default to 1

Page number

Value range: [1, 1000]
pageSize
int
Optional
Default to 30

Data volume per page

Value range: [1, 500]
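The pageNum and pageSize parameters can be wrapped in a generator that walks the list page by page. This is a sketch: `iter_tasks` is a hypothetical helper, `fetch_page` stands in for code that performs the GET request and returns the parsed JSON, and stopping at the first page shorter than pageSize is an assumption about how the end of the list appears.

```python
from typing import Callable, Iterator


def iter_tasks(
    fetch_page: Callable[[int, int], dict], page_size: int = 30
) -> Iterator[dict]:
    """Yield tasks from the list endpoint, one page at a time."""
    if not 1 <= page_size <= 500:
        raise ValueError("pageSize must be within [1, 500]")
    # pageNum is documented as [1, 1000].
    for page_num in range(1, 1001):
        tasks = fetch_page(page_num, page_size)["data"]
        yield from tasks
        # Assumption: a short (or empty) page means there are no more results.
        if len(tasks) < page_size:
            return
```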
The Kling 3.0 Series Models API is Now Fully Available
– All in One, One for All!

Models Available in This Release

Kling 3.0 Motion Control, Kling Video 3.0, Kling Video 3.0 Omni, Kling Image 3.0, Kling Image 3.0 Omni

Refer to <Kling AI Series 3.0 Model API Specification>

Key Highlights of the Models

3.0 All-in-One: A unified model for multi-modal input and output.

Most powerful consistency across the universe: Subject consistency (supports cameo, subject with voice control, i2v + subject) and text consistency.
Narrative control at your fingertips: More freedom, precision, and control—up to 15 seconds long, video scene cuts, ultra-high-definition storyboards/images, custom seconds.
Upgraded native audio-visual output: Supports multiple speakers and languages (with accents).

Kling 3.0 Motion Control

Consistent Facial Identity from any angle
Complex Emotions faithfully reproduced
High fidelity Restoration, Even with Face Occlusions
Consistent Facial Clarity Across Dynamic Framing

Kling Video 3.0

Compared to 2.6, expected improvements:

Supports subject upload in I2V scenarios for enhanced consistency
Significant improvement in multi-character referencing, especially for three-person scenarios
Supports Japanese, Korean, and Spanish in addition to Chinese and English
Capable of generating certain dialects and accents
Better distinction and control over different types of audio (speech, sound effects, BGM)
Improved text retention in I2V scenarios
Supports scene transitions, with up to 6 shots and customizable storyboarding

Kling Video 3.0 Omni

Compared to O1, expected improvements:

Native audio-visual synchronization
Supports video subject creation
Further improved consistency in reference-based tasks, especially for characters and products
Combined capabilities of reference + storyboarding + audio-visual sync significantly enhance usability
Supports scene transitions, with up to 6 shots
Extended generation duration up to 15 seconds

Kling Image 3.0

Highly consistent feature retention
Precise response to detail modifications
Accurate control over style and tone
Rich imaginative capabilities

Kling Image 3.0 Omni

Enhanced narrative sense
New storyboard image set generation, retaining reference image features with scene relevance
Direct output of 2K/4K ultra-high-definition images
Further improved detail consistency
