Image Recognize

POST /v1/videos/image-recognize

cURL

    curl --request POST \
      --url https://api-singapore.klingai.com/v1/videos/image-recognize \
      --header 'Authorization: Bearer ' \
      --header 'Content-Type: application/json' \
      --data '{
        "image": "https://p2-kling.klingai.com/kcdn/cdn-kcdn112452/kling-qa-test/multi-1.png"
      }'

200

    {
      "code": 0, // Error code; see Error codes for definitions
      "message": "string", // Error information
      "request_id": "string", // Request ID, generated by the system; used to track requests and troubleshoot problems
      "data": {
        "final_unit_deduction": "string", // The deduction units of the task
        "task_result": {
          "images": [
            {
              "type": "object_seg", // Subject recognition result
              "is_contain": true, // Whether the subject was identified; Boolean value
              "url": "string" // URL of the generated result, such as https://p1.a.kwimgs.com/bs2/upload-ylab-stunt/special-effect/output/HB1_PROD_ai_web_46554461/-2878350957757294165/output.png
            },
            {
              "type": "head_seg", // Face recognition result, hair included
              "is_contain": true, // Whether the subject was identified; Boolean value
              "url": "string" // URL of the generated result
            },
            {
              "type": "face_seg", // Face recognition result, hair excluded
              "is_contain": true, // Whether the subject was identified; Boolean value
              "url": "string" // URL of the generated result
            },
            {
              "type": "cloth_seg", // Clothing recognition result
              "is_contain": true, // Whether the subject was identified; Boolean value
              "url": "string" // URL of the generated result
            }
          ]
        }
      }
    }

Note: to protect information security, generated images and videos are cleared after 30 days. Save them elsewhere promptly.

Request Header

Content-Type    string    Required    Default: application/json
    Data exchange format.

Authorization    string    Required
    Authentication information; refer to API authentication.

Request Body

image    string    Required
    Image to be recognized.
    Accepts either a Base64-encoded image or an image URL (the URL must be accessible).
    When using Base64, pass only the Base64-encoded string itself. Do not add a prefix such as data:image/png;base64, or the system cannot parse the data correctly.
    Supported image formats: .jpg / .jpeg / .png.
    The image file size cannot exceed 10MB, the width and height must each be at least 300px, and the aspect ratio must be between 1:2.5 and 2.5:1.
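The request and response rules above can be sketched in Python. This is a minimal illustration, not an official client: the helper names (normalize_image_param, constraint_violations, found_segments) are hypothetical, 10MB is assumed to mean 10 * 1024 * 1024 bytes, and the response shape is taken from the 200 example on this page. The service may apply its own validation differently.

```python
import base64
import re
from typing import Dict, List

MAX_BYTES = 10 * 1024 * 1024  # assumption: "10MB" means 10 * 1024 * 1024 bytes
MIN_SIDE_PX = 300             # minimum width and height stated in the docs

# Matches a data-URI prefix such as "data:image/png;base64,"
_DATA_URI_PREFIX = re.compile(r"^data:image/[A-Za-z0-9.+-]+;base64,")


def normalize_image_param(value: str) -> str:
    """Return a value for the `image` field: a URL as-is, or bare Base64."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        return value
    stripped = _DATA_URI_PREFIX.sub("", value)
    # Raises binascii.Error (a ValueError subclass) if this is not valid Base64.
    base64.b64decode(stripped, validate=True)
    return stripped


def constraint_violations(width: int, height: int, size_bytes: int) -> List[str]:
    """List which documented limits an image breaks (empty list means none)."""
    problems = []
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 10MB")
    if min(width, height) < MIN_SIDE_PX:
        problems.append("width or height below 300px")
    # Aspect ratio between 1:2.5 and 2.5:1, checked with integers (2.5 == 5/2).
    if width > 0 and height > 0:
        if not (5 * width >= 2 * height and 2 * width <= 5 * height):
            problems.append("aspect ratio outside 1:2.5 to 2.5:1")
    return problems


def found_segments(response: dict) -> Dict[str, str]:
    """Map each segment type to its URL where is_contain is true."""
    images = response.get("data", {}).get("task_result", {}).get("images", [])
    return {img["type"]: img["url"] for img in images if img.get("is_contain")}


if __name__ == "__main__":
    payload = {"image": normalize_image_param(
        "https://p2-kling.klingai.com/kcdn/cdn-kcdn112452/kling-qa-test/multi-1.png"
    )}
    print(payload)
    print(constraint_violations(width=1024, height=768, size_bytes=2_000_000))
```

The payload built in the usage section is the JSON body for POST /v1/videos/image-recognize; sending it still requires the Authorization header described under Request Header. Because result URLs are cleared after 30 days, a caller would typically download the URLs returned by found_segments soon after the request.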