# Kling API Reference

> **Doc maintenance:** This file is indexed in `docs/current/README.md`. If you change this file's role, scope, status, or filename, update `docs/current/README.md` in the same edit.


Written: 2026-03-28  
Status: **Complete document (working reference)**

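## Purpose
This document is a **practical API reference** built from **the official Kling documentation structure, pages actually captured with Playwright, and cross-validation against wrapper/repackaged APIs**.

Use it as the baseline before designing the pipeline.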

---

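# 1. Document Reliability Rules

## Labels
- **[Official capture confirmed]**: confirmed in the body of the official documentation as rendered by Playwright
- **[Wrapper cross-validated]**: confirmed to match external documentation such as Freepik / mcp-kling
- **[Inferred / needs reinforcement]**: almost certain from the structure, but the full original text was not completely verified

## Current conclusions
- Core video API docs: **captured at a level suitable for practical use**
- callback / account docs: **captured at a level usable for operational design**
- rate limits / billing / policy docs: **sufficient as operational reference, but partly summary/supplementary**
- Some policy/overview docs: supplementary reference only

## 1.1 Additional detailed documents
- `kling-api-series-3-spec.md`: deep dive on the Kling 3.0 Qingque spec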

---

# 2. Overall Structure

## 2.1 Top-level structure
The following structure was confirmed in the official documentation menu. **[Official capture confirmed]**

- Quick Start
- Changelog
- API Reference
  - General Info
  - Rate Limits
  - Callback Schema
  - Video Generation
    - Models
    - Video Omni
    - Text to Video
    - Image to Video
    - Reference to Video
    - Motion Control
    - Multi-elements to video
    - Extend Video
    - Lip Sync
    - Avatar
    - Text to Audio
    - Video to Audio
    - Text to Speech
    - Voice Clone
    - Image Recognize
    - Element
    - Video Effects
  - Image Generation
    - Models
    - Image Omni
    - Image Generation
    - Reference to Image
    - Extend Image
    - AI Multi-Shot
    - Virtual Try-On
  - Others
    - Query user info
- Pricing
  - Billing Info
  - Prepaid Resource Packs
- Protocols
  - Privacy Policy of API Service
  - Terms of API Service
  - API Service Level Agreement

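## 2.2 Capability axes that matter for design
A key finding of this investigation is that Kling's video features are separated as follows. **[Official capture confirmed]**

1. **Basic generation**
   - `textToVideo`
   - `imageToVideo`
   - `OmniVideo`
2. **Reference-based generation**
   - `multiImageToVideo`
3. **Extension generation**
   - `videoExtension`
4. **Advanced controlled generation**
   - `motionControl`
   - `multiElements`
   - `lipSync`
5. **Operations / state management**
   - `callbackProtocol`
   - `rateLimits`
   - `accountInfoInquiry`

This structure can be used directly as the capability separation criterion in the pipeline.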

---

# 3. Common Asynchronous Model

## 3.1 Create Task pattern
Generation APIs generally follow the **Create Task** pattern. **[Official capture confirmed]**

Common response example:
```json
{
  "code": 0,
  "message": "string",
  "request_id": "string",
  "data": {
    "task_id": "string",
    "task_status": "submitted|processing|succeed|failed",
    "created_at": 1722769557708,
    "updated_at": 1722769557708,
    "task_info": {
      "external_task_id": "string"
    }
  }
}
```

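### Common interpretation
- Asynchronous generation structure
- Status must be tracked by `task_id`
- `task_status` must be handled as a state machine
- `external_task_id` is well suited for mapping to internal pipeline IDs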
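
As a minimal sketch of the state handling described above: the status values and response keys come from the common response example, while the names `TaskStatus`, `parse_create_task_response`, and `is_terminal` are hypothetical helpers, and treating `succeed` / `failed` as the only terminal states is an assumption drawn from the four values shown.

```python
from enum import Enum


class TaskStatus(str, Enum):
    # Values taken from the task_status field in the common response example.
    SUBMITTED = "submitted"
    PROCESSING = "processing"
    SUCCEED = "succeed"
    FAILED = "failed"


# Assumption: only these two states end a task.
TERMINAL_STATES = {TaskStatus.SUCCEED, TaskStatus.FAILED}


def parse_create_task_response(body: dict) -> tuple:
    """Return (task_id, TaskStatus) or raise if the API reported an error code."""
    if body.get("code") != 0:
        raise RuntimeError(
            f"create task failed: code={body.get('code')} "
            f"message={body.get('message')} request_id={body.get('request_id')}"
        )
    data = body["data"]
    return data["task_id"], TaskStatus(data["task_status"])


def is_terminal(status: TaskStatus) -> bool:
    """True once the task no longer needs polling or callbacks."""
    return status in TERMINAL_STATES
```

A worker would typically store the returned `task_id` next to its own job record and stop tracking once `is_terminal` returns true.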

## 3.2 Common fields
Fields confirmed repeatedly across multiple video endpoints: **[Official capture confirmed]**
- `model_name`
- `prompt`
- `negative_prompt`
- `duration`
- `mode`
- `aspect_ratio`
- `callback_url`
- `external_task_id`

### Pipeline application notes
- Link `external_task_id` to `video_job_id`, `scene_id`, and `sequence_id`
- Link `callback_url` to the worker callback handler

## 3.3 Authentication (AK/SK → JWT Bearer)
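**[Official screenshot confirmed]**

Kling API authentication does not appear to use a simple fixed Bearer token. Instead, an **Access Key / Secret Key pair is used to generate a JWT API Token**, which is then placed in the request header.

### Confirmed flow
1. Issue an Access Key / Secret Key
2. Generate a JWT
   - `alg`: `HS256`
   - `typ`: `JWT`
3. JWT payload example
   - `iss`: AccessKey
   - `exp`: current time + 1800 seconds
   - `nbf`: current time - 5 seconds
4. Use the header on each request
   - `Authorization: Bearer <API Token>`

### Practical implications
- Do not put `KLING_ACCESS_KEY` / `KLING_SECRET_KEY` directly in the API header; a JWT must be generated at runtime.
- Token expiry and clock skew must be handled.
- The auth smoke test stage should verify the actual token generation logic first.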
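
The flow above can be sketched with the Python standard library only. The header and payload claims follow the confirmed flow; the function names `make_api_token` and `auth_headers` are hypothetical, and signing with the Secret Key as the HS256 key is the assumption implied by the AK/SK description rather than something quoted from the official docs.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(raw: bytes) -> str:
    """Base64url without padding, as used by JWT."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_api_token(access_key: str, secret_key: str, now: Optional[int] = None) -> str:
    """Build an HS256 JWT with iss / exp / nbf as listed in the confirmed flow."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"iss": access_key, "exp": now + 1800, "nbf": now - 5}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    # Assumption: the Secret Key is the HMAC-SHA256 signing key.
    signature = hmac.new(
        secret_key.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"


def auth_headers(token: str) -> dict:
    """Request headers carrying the generated API token."""
    return {"Authorization": f"Bearer {token}"}
```

Regenerating the token shortly before the 1800-second expiry, rather than per request, keeps signing cost low while still tolerating small clock skew through the `nbf` offset.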

---

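# 4. Core Operational Docs

## 4.1 General Info
**Doc path**: `apiReference/commonInfo`  
**Status**: **[Official capture confirmed]**

This appears to be the foundational document describing common errors, basic rules, and call concepts.

### Purpose
- Summarize common rules
- Organize the error-handling policy
- Confirm the baseline semantics of the whole documentation

## 4.2 Callback Schema
**Doc path**: `apiReference/callbackProtocol`  
**Status**: **[Official capture confirmed]**

Confirmed:
- Callbacks are supported for asynchronous tasks (image generation / video generation / virtual try-on)
- `callback_url` can be set when creating a task
- The server performs a callback when the status changes

### Confirmed payload fields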
- `task_id`
- `task_status`
- `task_status_msg`
- `created_at`
- `updated_at`
- `final_unit_deduction`
- `task_info`
- `parent_video.id`
- `parent_video.url`
- `parent_video.duration`

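### Practical implications
- A callback-first structure fits better than polling-only
- The failure reason (`task_status_msg`) must be written to operational logs
- Result asset URLs must be saved before they expire
- Result URLs have a limited retention period (30 days according to the docs/screenshots), so a downloader is effectively mandatory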
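
A minimal sketch of turning a callback body into a log record, under stated assumptions: the payload is taken to be a JSON object with the confirmed fields at the top level and `external_task_id` nested under `task_info`, which matches the create-task response but has not been verified against a real callback. `summarize_callback` and the `needs_download` flag are hypothetical names, and callback authentication is deliberately out of scope here.

```python
import json


def summarize_callback(raw_body: bytes) -> dict:
    """Extract the fields worth logging from a callback request body."""
    payload = json.loads(raw_body)
    status = payload.get("task_status")
    task_info = payload.get("task_info") or {}
    record = {
        "task_id": payload.get("task_id"),
        "task_status": status,
        "external_task_id": task_info.get("external_task_id"),
        "final_unit_deduction": payload.get("final_unit_deduction"),
        # Successful tasks should be handed to the downloader before URLs expire.
        "needs_download": status == "succeed",
    }
    if status == "failed":
        # Keep the vendor-provided failure reason for operational logs.
        record["failure_reason"] = payload.get("task_status_msg", "")
    return record
```

The record can be written to the job log as-is, and `needs_download` can be used to enqueue the asset download step.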

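### Caution (callback auth)
The current scaffold has shared-secret-based callback protection logic, but which callback authentication header or signature scheme Kling actually uses has not yet been verified through live integration. Treat the current implementation as a **temporary protection layer** and finalize it after observing real callbacks.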

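## 4.3 Rate Limits / Concurrency Rules
**Doc path**: `apiReference/rateLimits`  
**Status**: **[Official capture confirmed / partial summary]**

Key points confirmed:
- For the Kling API, **concurrency rules matter more than QPS**
- Only task creation interfaces occupy concurrency
- Query interfaces do not occupy concurrency
- Calculated per account
- Calculated independently per resource pack type (video / image / virtual try-on)
- The highest concurrency value among active packages applies

### Very important statements
- **Independent of QPS limits**
- **Concurrency is occupied from the submitted state until the task ends**

### Caution
This document is not the kind of callable-API doc with rich request/response examples; it is closer to an explanation of operational rules. It is sufficient as a basis for concurrency design, but detailed figures and policies may change, so re-verify before going to production.

### Practical implications
- For bulk video generation, controlling the number of concurrently running tasks matters more than requests per second
- A scheduler/queue is required
- Status queries can be run relatively freely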
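
The concurrency rule can be sketched as a small in-memory gate that a scheduler consults before creating a task. `ConcurrencyGate` is a hypothetical helper, the limit value is whatever the active resource pack allows (not a number taken from the docs), and a real deployment would keep separate gates per resource pack type and persist the active set.

```python
class ConcurrencyGate:
    """Tracks tasks that occupy a slot from creation until a terminal status."""

    TERMINAL = {"succeed", "failed"}

    def __init__(self, max_concurrent: int) -> None:
        if max_concurrent < 1:
            raise ValueError("max_concurrent must be at least 1")
        self.max_concurrent = max_concurrent
        self._active = set()

    @property
    def active(self) -> int:
        """Number of tasks currently holding a slot."""
        return len(self._active)

    def can_submit(self) -> bool:
        """True if creating one more task stays within the limit."""
        return self.active < self.max_concurrent

    def on_created(self, task_id: str) -> None:
        """Record a task that was just created (status 'submitted')."""
        if not self.can_submit():
            raise RuntimeError("no free concurrency slot")
        self._active.add(task_id)

    def on_status(self, task_id: str, task_status: str) -> None:
        """Release the slot only when the task reaches a terminal status."""
        if task_status in self.TERMINAL:
            self._active.discard(task_id)
```

Status queries never touch the gate, which mirrors the rule that query interfaces do not occupy concurrency.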

## 4.4 Query user info
**Doc path**: `apiReference/accountInfoInquiry`  
**Status**: **[Official capture confirmed]**

Endpoint documentation for querying account information. Usable for operational monitoring and checking resource status.

---

# 5. Video Generation APIs

## 5.1 Video Models
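**Doc path**: `apiReference/model/videoModels`  
**Status**: **[Official capture confirmed / partial summary]**

This appears to be the video model catalog document.

### Practical implications
- Check allowed `model_name` values
- Check differences between supported models and per-generation capabilities
- A model mapping table should be written before implementation

## 5.2 Video Omni
**Doc path**: `apiReference/model/OmniVideo`  
**Status**: **[Official capture confirmed]**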

### Create Task
```http
POST /v1/videos/omni-video
```

### Confirmed request example
```json
{
  "model_name": "kling-video-o1",
  "prompt": "Make the person in <<<image_1>>> wave to the camera",
  "image_list": [
    {
      "image_url": "https://.../multi-1.png"
    }
  ],
  "duration": "5",
  "mode": "pro",
  "aspect_ratio": "16:9",
  "callback_url": "",
  "external_task_id": ""
}
```

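### Confirmed common response structure
- `code`
- `message`
- `request_id`
- `data.task_id`
- `data.task_info`
- `data.task_status`
- `data.created_at`
- `data.updated_at`

### Interpretation
- OmniVideo appears to be the top-level / flagship API among Kling's video capabilities
- Accepting an `image_list` alongside the prompt helps lock style and character
- Allows richer conditioning than plain text-to-video generation
- Important source-of-truth (SOT) limits for actual use: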
  - with no reference video: `number(reference images) + number(reference elements) <= 7`
  - with reference video: `number(reference images) + number(reference elements) <= 4`
  - `end_frame` is not supported when there are more than 2 images
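
The three limits above lend themselves to a pre-flight check before a request is sent. The sketch below encodes only those stated rules; the function name and its parameters are hypothetical, and any other Omni constraints are not covered.

```python
def check_omni_reference_limits(
    n_images: int,
    n_elements: int,
    has_reference_video: bool,
    uses_end_frame: bool = False,
) -> list:
    """Return a list of human-readable violations (empty if the request fits)."""
    problems = []
    # Limit is 4 with a reference video, 7 without one.
    limit = 4 if has_reference_video else 7
    total = n_images + n_elements
    if total > limit:
        problems.append(
            f"images + elements = {total} exceeds limit {limit} "
            f"({'with' if has_reference_video else 'without'} reference video)"
        )
    # end_frame is not supported when there are more than 2 images.
    if uses_end_frame and n_images > 2:
        problems.append("end_frame is not supported with more than 2 images")
    return problems
```

Returning all violations at once, instead of raising on the first, makes validator output easier to surface in job logs.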

### Pipeline application
- Key scene generation
- Style anchor generation
- Sequence starting-point generation
- Multi-reference consistency experiments

### Usage guide (current-production interpretation)
- **You want to produce a single scene quickly** -> start with Omni `image_list[].image_url`
  - especially when the goal is one-scene reference-guided generation
  - `first_frame` is the strongest current anchor
- **You want to direct separate shots within one clip, like a drama** -> use Omni `multi_shot` + `multi_prompt[]`
  - treat this as storyboard sequencing inside one generated clip
  - do not overclaim seamless continuity across shot boundaries unless verified for the specific workflow
- **You want to reuse the same character/object repeatedly** -> move toward `element_list[].element_id`
  - this is the stronger reusable identity / asset workflow in the docs structure
- **You want broader continuity from scene A to scene B** -> use `video_list` continuation / remote video reference
  - this is for cross-scene continuity reference, not for pretending several clips are one hidden single scene
- **You want to extend the same clip further** -> no current-production route confirmed
  - historical `video-extend` exists in the broader docs corpus, but preserved support notes restrict it to older video generations and it is not part of this repo's active production route set

### `video_list` vs `element_list`
- `video_list` can carry previous scene / clip context
- `element_list` can carry recurring character or object identity
- current SOT limits imply they can coexist when reference-video rules are respected
- practical reading: they are complementary, not mutually exclusive

## 5.3 Text to Video
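**Doc path**: `apiReference/model/textToVideo`  
**Status**: **[Official capture confirmed]**

### Characteristics
- Create Task / cURL / request example are present
- Basic text-based video generation

### Practical role
- The simplest baseline generation
- For standalone scenes without reference assets
- Better suited to independent shots than to continuity

## 5.4 Image to Video
**Doc path**: `apiReference/model/imageToVideo`  
**Status**: **[Official capture confirmed]**

### Characteristics
- Create Task / cURL / request example are present
- Generates video from a starting image

### Practical role
- Preferred over T2V when locking style matters
- Useful for thumbnail/keyframe-based shot generation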

## 5.5 Reference to Video / Multi-Image to Video
**Doc path**: `apiReference/model/multiImageToVideo`  
**Status**: **[Official capture confirmed + historical probe evidence]**

> Important: the historical PASS / probe traces below are not the same as this repo's current production-default truth. The current production-facing verified pair is `kling-v3` + `kling-v3-omni`, and `reference2video` is still not a current production-default route; read it as a target for later verification / refinement.

### Create Task
```http
POST /v1/videos/multi-image2video
```

### Confirmed request example
```json
{
  "model_name": "kling-v1-6",
  "image_list": [
    { "image": "https://.../dog.png" },
    { "image": "https://.../dog_cloth.png" }
  ],
  "prompt": "A white Bichon Frise wearing a red Northeast-style floral cotton jacket, licking its paw",
  "negative_prompt": "",
  "mode": "pro",
  "duration": "5",
  "aspect_ratio": "16:9",
  "callback_url": "",
  "external_task_id": ""
}
```

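### Interpretation
- The documentation menu name is `Reference to Video`
- The actual endpoint name is `multi-image2video`
- Can be read as consistency control based on multiple reference images

### Pipeline application
- Maintain visual consistency across scenes
- recurring character / object / style reference
- multi-reference conditioning instead of a start image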

## 5.6 Motion Control
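**Doc path**: `apiReference/model/motionControl`  
**Status**: **[Official capture confirmed]**

### Practical role
- Generation with control over action / camera / movement
- Use when more motion control is needed than general generation offers
- Candidate advanced shot generator for v2

## 5.7 Multi-elements to video
**Doc path**: `apiReference/model/multiElements`  
**Status**: **[Official capture confirmed]**

### Practical role
- Maintain consistency across multiple elements (people / objects / brand elements)
- For composing multi-asset scenes

## 5.8 Extend Video / Video Extension
**Doc path**: `apiReference/model/videoExtension`  
**Status**: **[Official capture confirmed]**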

### Create Task
```http
POST /v1/videos/video-extend
```

### Confirmed request example
```json
{
  "prompt": "A puppy appears",
  "video_id": "743211632612511839",
  "negative_prompt": "",
  "callback_url": ""
}
```

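### Key notes
- Extends text-to-video / image-to-video results
- **Each extension adds 4 to 5 seconds**
- Model and mode cannot be selected directly

### Practical implications
This matters a great deal for YouTube long-form design.

- Existing generation results can be chained onward
- Candidate core capability for chaining long sequences
- However, it is constrained to 4 to 5 second increments

### Pipeline application
- Generate an `initial clip`, then run an extension loop keyed on `video_id`
- A scene-chain approach can be considered
- Extension quality / cost / speed need to be tested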
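
For planning purposes, the 4 to 5 second increment translates into a range of extension calls for a given target length. The sketch below is arithmetic only: `extension_calls_needed` is a hypothetical helper, it assumes every call succeeds and adds between 4 and 5 seconds, and it says nothing about whether a given video is eligible for extension.

```python
import math


def extension_calls_needed(initial_seconds: float, target_seconds: float) -> tuple:
    """Return (fewest, most) extension calls to reach the target length."""
    remaining = max(0.0, target_seconds - initial_seconds)
    fewest = math.ceil(remaining / 5)  # every extension adds the full 5 seconds
    most = math.ceil(remaining / 4)    # every extension adds only 4 seconds
    return fewest, most
```

For a 5-second initial clip and a 30-second target, this gives between 5 and 7 calls, which is a useful input for cost and latency estimates.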

## 5.9 Lip Sync
**Doc path**: `apiReference/model/lipSync`  
**Status**: **[Official capture confirmed]**

Actual parameters and examples are present. Usable when a scene needs a speaking character.

## 5.10 Avatar
**Doc path**: `apiReference/model/avatar`  
**Status**: **[Official capture confirmed]**

Avatar-style generation capability.

## 5.11 Text to Audio / Video to Audio / TTS / Voice Clone
**Doc paths**:
- `textToAudio`
- `videoToAudio`
- `TTS`
- `customVoices`

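**Status**: **[Official capture confirmed / some partially summarized]**

### Practical implications
- Kling has its own audio capabilities
- YouTube automation v1 currently prioritizes external TTS, but these remain important as candidates for extending the callback/async structure and voice capabilities

### Caution
The existence and rough function of these sections were confirmed, but this document has not fully normalized their detailed fields to the degree of the core video docs. Additional work is needed before putting them into an actual audio pipeline.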

---

# 6. Image Generation APIs

## 6.1 Image Models
**Doc path**: `apiReference/model/imageModels`  
**Status**: **[Official capture confirmed]**

Image model catalog document.

## 6.2 Image Omni
**Doc path**: `apiReference/model/OmniImage`  
**Status**: **[Official capture confirmed]**

Image Omni capability exists.

## 6.3 Image Generation
**Doc path**: `apiReference/model/imageGeneration`  
**Status**: **[Official capture confirmed]**

Basic image generation endpoint documentation.

## 6.4 Reference to Image / Multi-Image to Image
**Doc path**: `apiReference/model/multiImageToImage`  
**Status**: **[Official capture confirmed]**

Appears to be a generation structure based on multiple reference images.

## 6.5 Extend Image / Image Expansion
**Doc path**: `apiReference/model/imageExpansion`  
**Status**: **[Official capture confirmed]**

Image expansion capability.

## 6.6 AI Multi-Shot
**Doc path**: `apiReference/model/aiMultiShot`  
**Status**: **[Official capture confirmed]**

Multi-shot composition capability.

## 6.7 Virtual Try-On
**Doc path**: `apiReference/model/virtualTryOn`  
**Status**: **[Official capture confirmed]**

Virtual try-on capability.

### Practical implications of the image family
Direct priority is low for YouTube automation v1 today, but these are likely to be useful for:
- keyframe generation
- style guide generation
- character reference asset generation

---

# 7. Pricing / Billing / Protocols

## 7.1 Billing Info
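**Doc path**: `productBilling/billingMethod`  
**Status**: **[Official capture confirmed / partial summary]**

Document for understanding the billing structure.

### Caution
This is closer to an operations/product-structure reference than actual API endpoint documentation. Use it for cost planning and operational policy review rather than implementation.

## 7.2 Prepaid Resource Packs
**Doc path**: `productBilling/prePaidResourcePackage`  
**Status**: **[Official capture confirmed / partial summary]**

Document for understanding resource packs, concurrency, and operational limits.

### Caution
This is an operational reference rather than callable-API documentation, and actual packages/policies may change, so re-check immediately before purchase or production use.

## 7.3 Protocols
- Privacy Policy of API Service
- Terms of API Service
- API Service Level Agreement

For reference on operational/commercial use. Policy reference rather than implementation detail.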

---

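# 8. Reinforcement Points from Wrapper Cross-Validation

## 8.1 Freepik
**[Wrapper cross-validated]**

Confirmed:
- Offers Kling 3 Omni as an actual API product
- reference-to-video is a separate endpoint
- Uses `webhook_url`
- Explicitly documents an async task flow

### Meaning
- Kling's capability separation closely resembles the actual wrapper design
- Raises confidence in the interpretation of the reference / callback structure

## 8.2 mcp-kling
**[Wrapper cross-validated]**

Confirmed capability map:
- Text to Video
- Image to Video
- Multi-Image to Video
- Video Extension
- Callback Protocol
- Account Information Inquiry

### Meaning
- High consistency with the official documentation structure
- Supporting evidence that the current reading of the documentation structure is not far off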

---

# 9. Key Points to Apply Directly to Pipeline Design

## 9.1 Required capabilities for v1
- `OmniVideo`
- `textToVideo`
- `imageToVideo`
- `multiImageToVideo`
- `videoExtension`
- `callbackProtocol`
- `rateLimits`

## 9.2 Extension capabilities for v2
- `motionControl`
- `multiElements`
- `lipSync`
- internal audio family
- image-side reference/expand/multishot

## 9.3 Operationally important constraints
- Centered on **concurrency** rather than QPS
- Task creation consumes concurrency
- Queries do not consume concurrency
- Extension works in 4 to 5 second increments
- Asynchronous processing via `callback_url` is recommended

---

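# 10. Completeness and Limits of This Document

## What can be used with confidence
- Documentation structure
- Core paths
- Major endpoint paths
- Create Task pattern
- Callback concept
- Concurrency-centered operational concept
- Omni / Reference / Extend capability separation
- Request examples and applied interpretation for the core video generation APIs

## What should be treated as partial summary / operational reference
- Detailed operational figures in the Rate Limits doc
- Billing / Resource Pack docs
- Policy / Terms / SLA docs
- Detailed field normalization for audio capabilities (partly reinforced by this round's deeper investigation)

## What still has room for reinforcement
- Normalizing the full parameter table for some endpoints
- Correcting missing fields in some response schemas
- Detailing billing/resource pack figures and policies
- Writing an allowed-values table for every model
- Expanding detailed parameter tables for image/audio sub-capabilities

Even so, the current state is good enough to serve as the **baseline document before starting pipeline design**.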

---

# 11. Evidence for Deliverables
- Body capture: `youtube-automation/samples/kling-api/pages/`
- Full revalidation: `docs/history/20260328_research_kling-api-full-revalidation.md`
- Wrapper validation: `docs/history/20260328_research_kling-repackaging-validation.md`
- Structure analysis: `docs/history/20260328_research_kling-api-phase1-final.md`


---

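# 12. Additional Deep-Dive Findings (2026-03-28)

In this round, beyond the existing five core docs, sub-capabilities potentially relevant to YouTube automation were additionally captured in detail with Playwright.  
As a result, the 17 documents below were secured at **strong quality**.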

## 12.1 Video / control extensions

### Motion Control
- **POST** `/v1/videos/motion-control`
- Confirmed fields: `model_name`, `prompt`, `duration`, `mode`, `callback_url`, `external_task_id`
- Sections captured: `Create Task`, `Parameters`, `Callback Protocol`
- Meaning: appears to be a capability for controlling scene motion / camera / action; useful for strengthening v2 shot quality.

### Multi-elements to video
- **POST** `/v1/videos/multi-elements/init-selection`
- Confirmed fields: `model_name`, `prompt`, `negative_prompt`, `duration`, `mode`, `callback_url`, `external_task_id`, `video_id`, `image_list`
- Sections captured: `Create Task`, `Parameters`, `Callback Protocol`
- Meaning: advanced consistency-control capability that handles multiple elements/assets at once.

### Video Effects
- **POST** `/v1/videos/effects`
- Confirmed fields: `prompt`, `duration`, `mode`, `callback_url`, `external_task_id`
- Sections captured: `Create Task`, `Parameters`, `Request Example`, `Callback Protocol`
- Meaning: candidate workflow for applying effects/styles after plain generation.

### Lip Sync
- **POST** `/v1/videos/identify-face`
- Confirmed fields: `prompt`, `duration`, `mode`, `callback_url`, `external_task_id`, `video_id`
- Sections captured: `Create Task`, `Parameters`, `Callback Protocol`
- Meaning: can be connected to speaking scenes / character-driven content.

### Avatar
- **POST** `/v1/videos/avatar/image2video`
- Confirmed fields: `prompt`, `duration`, `mode`, `callback_url`, `external_task_id`
- Sections captured: `Create Task`, `Parameters`, `Callback Protocol`
- Meaning: better suited to anchor/character-driven channels than to a general faceless pipeline.

## 12.2 Image family

### Image Omni
- **POST** `/v1/images/omni-image`
- Confirmed fields: `model_name`, `prompt`, `duration`, `mode`, `aspect_ratio`, `callback_url`, `external_task_id`, `image_list`
- Sections captured: `Create Task`, `Parameters`, `Callback Protocol`
- Meaning: useful for generating reference / keyframe / style-bible assets.

### Image Generation
- **POST** `/v1/images/generations`
- Confirmed fields: `model_name`, `prompt`, `negative_prompt`, `duration`, `mode`, `aspect_ratio`, `callback_url`, `external_task_id`
- Sections captured: `Create Task`, `Parameters`, `Callback Protocol`

### Reference to Image
- **POST** `/v1/images/multi-image2image`
- Confirmed fields: `model_name`, `prompt`, `negative_prompt`, `duration`, `mode`, `aspect_ratio`, `callback_url`, `external_task_id`, `image_list`
- Sections captured: `Create Task`, `Parameters`, `Callback Protocol`

### Extend Image
- **POST** `/v1/images/editing/expand`
- Confirmed fields: `prompt`, `duration`, `mode`, `aspect_ratio`, `callback_url`, `external_task_id`
- Sections captured: `Create Task`, `Parameters`, `Callback Protocol`

### AI Multi-Shot
- **POST** `/v1/general/ai-multi-shot`
- Confirmed fields: `prompt`, `duration`, `mode`, `callback_url`, `external_task_id`
- Sections captured: `Create Task`, `Parameters`, `Callback Protocol`

### Virtual Try-On
- **POST** `/v1/images/kolors-virtual-try-on`
- Confirmed fields: `model_name`, `prompt`, `duration`, `mode`, `callback_url`, `external_task_id`
- Sections captured: `Create Task`, `Parameters`, `Callback Protocol`
- Meaning: low direct relevance to a general-purpose YouTube pipeline, but may matter for fashion/commerce channels.

## 12.3 Audio / other auxiliary capabilities

### Text to Audio
- **POST** `/v1/audio/text-to-audio`
- Sections captured: `Create Task`, `Parameters`, `Callback Protocol`

### Video to Audio
- **POST** `/v1/audio/video-to-audio`
- Sections captured: `Create Task`, `Parameters`, `Callback Protocol`

### Text to Speech
- **POST** `/v1/audio/tts`
- Confirmed fields: `text`, `voice_id`
- Sections captured: `Create Task`

### Voice Clone
- **POST** `/v1/general/custom-voices`
- Confirmed fields: `callback_url`, `external_task_id`, `voice_id`
- Sections captured: `Parameters`, `Callback Protocol`

### Image Recognize
- **POST** `/v1/videos/image-recognize`
- Sections captured: `Parameters`

### Element
- **POST** `/v1/general/advanced-custom-elements`
- Confirmed fields: `callback_url`, `external_task_id`, `image_list`
- Sections captured: `Parameters`, `Callback Protocol`

---



# 13. 2026-03-29 live smoke test findings

**[Live API verified]**

An actual create-task smoke test was run against Omni Video, and the following was confirmed.

## Confirmed
- AK/SK → JWT Bearer authentication succeeded in practice
- `POST /v1/videos/omni-video` succeeded in practice
- `model_name='kling-v3-omni'` succeeded in practice
- Response structure confirmed:
  - `code: 0`
  - `message: SUCCEED`
  - `request_id`
  - `data.task_id`
  - `data.task_status='submitted'`

## Important mismatch corrected
The first attempt sent `mode='standard'`, and the live API returned the following error.
- `code: 1201`
- `message: mode value 'standard' is invalid`

After changing it to `mode='std'`, the request succeeded.

### Therefore, now confirmed
- The `mode` enum must be handled as **`std` / `pro`**, not `standard/professional`.
- Related builders / validators / implementation specs are aligned to this measured result.
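
A validator aligned with this finding might look like the sketch below. The accepted values `std` / `pro` come from the live result above; `normalize_mode` and the alias table that maps the long names to the short ones are hypothetical conveniences, not API behavior.

```python
VALID_MODES = ("std", "pro")

# Hypothetical convenience aliases for values that older builder code may still emit.
_MODE_ALIASES = {"standard": "std", "professional": "pro"}


def normalize_mode(value: str) -> str:
    """Return a mode value the live API accepted, or raise ValueError."""
    key = value.strip().lower()
    key = _MODE_ALIASES.get(key, key)
    if key not in VALID_MODES:
        raise ValueError(f"unsupported mode {value!r}; expected one of {VALID_MODES}")
    return key
```

Calling this in the request builder keeps the `code: 1201` mismatch from reaching the API again.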


## 2026-03-29 multi-endpoint smoke sequence
- text2video: FAIL
- image2video: FAIL
- reference2video: FAIL
- query_lists: PASS
- extend_video: FAIL

### Live mismatches observed
- text2video: model kling-video-o1 not supported
- text2video: model kling-v3-omni not supported
- image2video: model kling-v3-omni not supported
- image2video: model kling-video-o1 not supported
- reference2video: request body shape still mismatched for current variant


## 2026-03-29 deep contract probe
- text2video_probe PASS via payload: {'prompt': 'A calm sunrise over the ocean, cinematic, realistic', 'model_name': 'kling-v1', 'duration': '5', 'mode': 'std', 'aspect_ratio': '16:9'}
- image2video_probe PASS via payload: {'image': 'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/x8AAwMCAO+y3ioAAAAASUVORK5CYII=', 'prompt': 'Subtle cinematic camera motion, realistic', 'model_name': 'kling-v1-6', 'duration': '5', 'mode': 'std', 'aspect_ratio': '16:9'}
- reference2video_probe PASS via payload: {'image_list': [{'image': 'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/x8AAwMCAO+y3ioAAAAASUVORK5CYII='}], 'prompt': 'Keep subject consistent with subtle camera motion', 'model_name': 'kling-v1-6', 'duration': '5', 'mode': 'std', 'aspect_ratio': '16:9'}
- extend_video_probe fail: {"code":1201,"message":"This video not supported extend-video","request_id":"dcdc71ba-ffee-4586-8b00-9bdb5a797e4d"}


## 2026-03-29 final verification round
- text2video_5s: PASS
- image2video_5s: PASS
- reference2video_5s: PASS
- omni_multishot_15s: FAIL
- callback files observed: 0


## 2026-03-29 retry round after valid image + multishot validator fix
- image2video_valid_image: PASS
- reference2video_valid_image: PASS
- omni_multishot_15s_retry: FAIL
- callback externally reachable configured: NO


## 2026-03-29 contract refinements from deeper live probing
- image2video_img256_png: failed (Image pixel is invalid)
- image2video_img512_png: submitted ()
- image2video_img512_jpg: submitted ()
- reference2video_img256_png: processing ()
- reference2video_img512_png: processing ()
- reference2video_img512_jpg: processing ()
- omni_multishot_probe_3: create/query succeeded after adding index fields to multi_prompt

## 2026-03-29 Omni image reference structure refinement
- Live probing showed that `POST /v1/videos/omni-video` accepts:
  - text-only body
  - `image=<base64>` body (create/succeed possible)
  - `image_list[].image_url + type='first_frame'` body (create/succeed confirmed)
- Live probing also showed that Omni rejected the older reference-style body:
  - `image_list=[{'image': <base64>}]`
  - `image_list=[{'index': 1, 'image': <base64>}]`
  with `code=1201`, message=`Failed to resolve the request body`
- Important interpretation:
  - `image=<base64>` can succeed but did not prove strong reference adherence in our test
  - `image_list[].image_url + type='first_frame'` is the strongest currently confirmed Omni image-reference structure
- Current best-known Omni first-frame payload shape:
```json
{
  "model_name": "kling-v3-omni",
  "prompt": "...",
  "duration": "5",
  "mode": "std",
  "aspect_ratio": "16:9",
  "image_list": [
    {
      "image_url": "<base64-or-url>",
      "type": "first_frame"
    }
  ]
}
```


## 2026-03-29 additional doc-derived Series 3 structures
- `video_list` observed in Qingque capture:
```json
"video_list": [
  {
    "video_url": "video_url",
    "refer_type": "base",
    "keep_original_sound": "yes"
  }
]
```
- `element_list` observed in Qingque capture:
```json
"element_list": [
  {
    "element_id": long
  }
]
```
- These should be treated as doc-derived SOT structures pending exact live confirmation in their intended endpoint contexts.


## 2026-03-29 production policy note
Based on current pipeline decisions, 15 seconds is now treated as the maximum target length for a single scene.
This is a pipeline rule, not a vendor/API rule.
Implication:
- use one clip for one scene whenever possible
- reserve multi-clip continuity logic for transitions between scenes rather than as the default substitute for a single scene

## 2026-03-30 production policy refinement
- production model pair for the repo is `kling-v3` + `kling-v3-omni`
- verified production-facing routes are now recorded explicitly:
  - `text2video` with `kling-v3` = verified
  - `image2video` with `kling-v3` = verified
  - `omni` with `kling-v3-omni` = verified
- `kling-video-o1` should not be treated as the current 3.0 base production model
- production image-input principle:
  - keep source-of-truth assets on our side
  - send actual image attachments we control as raw base64 in requests by default
  - do not rely on remote image URLs as the default production image-input strategy
- preserve endpoint-specific image field shapes exactly:
  - `image2video` uses top-level `image` / optional `image_tail`
  - Omni uses `image_list[].image_url`
- verified production-facing image input methods:
  - `image2video` with `kling-v3` -> top-level `image=<raw base64>`
  - `omni` with `kling-v3-omni` -> `image_list[].image_url=<raw base64>` with `type='first_frame'`
- remote image URLs remain part of the broader documented value surface where allowed, but they are not this repo’s default production choice for image inputs
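
The endpoint-specific image field shapes above can be kept apart with two small builders. This is a sketch under stated assumptions: the function names are hypothetical, "raw base64" is read here as plain base64 text without a `data:` URI prefix, and the default `duration` / `mode` / `aspect_ratio` values simply mirror the examples in this document.

```python
import base64
from typing import Optional


def to_raw_base64(image_bytes: bytes) -> str:
    """Encode image bytes as plain base64 text (assumed: no data: URI prefix)."""
    return base64.b64encode(image_bytes).decode("ascii")


def build_image2video_body(
    prompt: str,
    image_bytes: bytes,
    image_tail_bytes: Optional[bytes] = None,
    duration: str = "5",
    mode: str = "std",
) -> dict:
    """image2video: top-level `image`, optional top-level `image_tail`."""
    body = {
        "model_name": "kling-v3",
        "prompt": prompt,
        "image": to_raw_base64(image_bytes),
        "duration": duration,
        "mode": mode,
    }
    if image_tail_bytes is not None:
        body["image_tail"] = to_raw_base64(image_tail_bytes)
    return body


def build_omni_first_frame_body(
    prompt: str,
    image_bytes: bytes,
    duration: str = "5",
    mode: str = "std",
    aspect_ratio: str = "16:9",
) -> dict:
    """Omni: image goes in image_list[].image_url with type='first_frame'."""
    return {
        "model_name": "kling-v3-omni",
        "prompt": prompt,
        "duration": duration,
        "mode": mode,
        "aspect_ratio": aspect_ratio,
        "image_list": [
            {"image_url": to_raw_base64(image_bytes), "type": "first_frame"}
        ],
    }
```

Keeping one builder per endpoint avoids accidentally sending the `image_list=[{'image': ...}]` shape that Omni rejected in live probing.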