# Kling Phase 3 expanded verification plan

Status: planned, documentation-only
Captured: 2026-03-30
Purpose: define the smallest practical Phase 3 live-verification plan that expands beyond Omni while staying aligned with current docs, code, and billable-create guardrails. Current-production priority now excludes legacy/non-current route families from the default verification core.

## Scope
This plan is for the next **approved** live-verification round only.
It does **not** authorize execution by itself.
It converts the current repo state into an actionable matrix for the current production-relevant families:
- Omni
- text2video
- image2video

Legacy/non-current families such as `reference2video` may still be documented separately, but they are no longer part of the default current-production matrix.

## Ground rules
1. Every create call is billable.
2. One immediate purpose = one create by default.
3. First successful create for that purpose stops further variants.
4. No hypothesis-only payloads unless explicitly approved.
5. Prefer doc-derived payloads that already match current builders/validators when possible.
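
Rules 1 to 3 can be enforced mechanically before any create is sent. The following is a minimal sketch, not existing repo code: `CreateLedger`, `may_create`, and `record` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CreateLedger:
    """Hypothetical helper: tracks billable creates per immediate purpose."""

    attempts: dict[str, int] = field(default_factory=dict)
    succeeded: set[str] = field(default_factory=set)

    def may_create(self, purpose: str, extra_approved: bool = False) -> bool:
        # Rule 3: once a purpose has a successful create, no further variants.
        if purpose in self.succeeded:
            return False
        # Rule 2: one create per purpose unless another attempt was explicitly approved.
        return self.attempts.get(purpose, 0) == 0 or extra_approved

    def record(self, purpose: str, success: bool) -> None:
        # Rule 1: every create call counts, whether or not it succeeded.
        self.attempts[purpose] = self.attempts.get(purpose, 0) + 1
        if success:
            self.succeeded.add(purpose)
```

Calling `may_create` before each request and `record` after it keeps the default at one create per purpose, with any second attempt requiring an explicit approval flag.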

## Phase 3 current priority note
- `std` remains the default operating mode.
- Use `pro` only when output quality or resolution is explicitly part of the decision, or when a `std` run needs escalation.
- The next highest-value current-production verification target is `video_list` as scene-reference continuity.
- After that, the strongest remaining identity workflow to validate is Elements / `element_list`, and then the combination of `video_list` + `element_list`.
- Historical `video-extend` is no longer part of the current-production verification core because the docs explicitly restrict it to older video generations.

## What changed since the earlier Phase 3 framing
Earlier Phase 3 language was effectively Omni-first and deferred non-Omni work.
Current repo state now justifies a broader verification plan because:
- Omni has the strongest synchronized path.
- non-Omni endpoints have historical PASS evidence in logs.
- current scaffold still treats non-Omni model policy as provisional.
- therefore the next useful question is not “do non-Omni endpoints exist?” but “what is the minimum safe verification set that can tighten current scaffold policy without drifting into multi-create spend.”

## Current state summary that drives this plan
### Omni
- strongest current path: `image_list[].image_url + type='first_frame'`
- stronger continuation path: `video_list[].video_url=<remote url>`
- multi-shot is real, but the earlier failure was due to missing `multi_prompt[].index`; later logs show create/query success after adding `index`

### Non-Omni
- historical create PASS evidence exists for older non-Omni families, including routes that surfaced with legacy model generations (`kling-v1`, `kling-v1-6`)
- current scaffold policy intentionally does **not** accept legacy model naming on non-Omni endpoints
- current-production verification should stay centered on `text2video` and `image2video`
- separately documented legacy/non-current families should not remain in the default production matrix just because the broader docs corpus still exposes them
- therefore Phase 3 should treat non-Omni verification as a **policy-tightening round** for current production routes, not as a fully open exploration round across mixed-generation endpoints

---

# Verification matrix

## A. Omni — first-frame anchored baseline
- Endpoint: `POST /v1/videos/omni-video`
- Immediate purpose: confirm the current safest one-scene Omni path still behaves as the anchor baseline for the repo
- Candidate models:
  - primary: `kling-v3-omni`
  - secondary only if explicitly approved later: model candidate under re-check (`kling-v3` vs prior `kling-video-o1` assumption)
- Minimal payload shape:
```json
{
  "model_name": "kling-v3-omni",
  "prompt": "<one-scene prompt>",
  "duration": "5",
  "mode": "std",
  "aspect_ratio": "16:9",
  "image_list": [
    {
      "image_url": "<base64-or-url>",
      "type": "first_frame"
    }
  ]
}
```
- Success criteria:
  - create returns `code=0`
  - query reaches `succeed`
  - output is visually consistent with the supplied first frame
  - response/log shape still matches current builder + validator assumptions
- Stop conditions:
  - first successful create ends the row
  - if create fails due to auth/platform/transient error, stop and fix environment before any variant
  - if create fails because the documented payload itself is rejected, stop and reopen docs/code review before retrying
- Cost-control notes:
  - this is the baseline row most worth paying for because it validates the repo’s current safe default
  - keep it at 5s and `std`
- Reopen criteria:
  - only reopen if builder/validator changes, model allowlist changes, or first-frame adherence regresses unexpectedly
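
As a rough illustration of the row A body, the sketch below assembles the same payload shape in Python. It is not the repo's actual builder: `build_omni_first_frame_payload` is a hypothetical name, and the only validation shown is an assumed non-empty check.

```python
import json


def build_omni_first_frame_payload(prompt: str, image: str) -> dict:
    """Hypothetical builder for the row A body; `image` is a URL or base64 string."""
    if not prompt or not image:
        raise ValueError("prompt and image are both required")
    return {
        "model_name": "kling-v3-omni",
        "prompt": prompt,
        "duration": "5",  # kept as a string, matching the payload shape above
        "mode": "std",    # row A stays at 5s and std for cost control
        "aspect_ratio": "16:9",
        "image_list": [{"image_url": image, "type": "first_frame"}],
    }


if __name__ == "__main__":
    body = build_omni_first_frame_payload("<one-scene prompt>", "<base64-or-url>")
    print(json.dumps(body, indent=2))
```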

## B. text2video — scaffold-policy verification
- Endpoint: `POST /v1/videos/text2video`
- Immediate purpose: determine whether the current scaffold-safe minimal body works **without** pinned `model_name`
- Candidate models:
  - current scaffold primary: omit `model_name`
  - historical fallback candidate for a separately approved later round: `kling-v1`
- Minimal payload shape:
```json
{
  "prompt": "<simple one-scene prompt>",
  "duration": "5",
  "mode": "std",
  "aspect_ratio": "16:9"
}
```
- Success criteria:
  - create returns `code=0`
  - query reaches `succeed`
  - output is a normal 5s text-generated clip
  - no hidden requirement emerges for scaffold-disallowed fields
- Stop conditions:
  - if minimal no-model payload succeeds, stop; do not also test legacy model-pinned variants in the same round
  - if minimal no-model payload fails with explicit model-required or unsupported-default semantics, stop and document that exact error before asking approval for one legacy-model fallback
  - do not test multiple model names in one approval block for the same purpose
- Cost-control notes:
  - this row is valuable only because it resolves a current scaffold policy question
  - keep prompt simple to avoid moderation/noise confusion
- Reopen criteria:
  - reopen only if the no-model attempt fails with a clearly model-policy-specific error, or if the endpoint later becomes a real routing candidate in production
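
A pre-flight check can confirm that a row B body really is the no-model minimal shape before a billable create is sent. This is a sketch under assumptions: `check_text2video_payload` is a hypothetical helper, and the allowlist is simply the four fields shown in the payload above, not the scaffold's real validator.

```python
# Assumed allowlist: exactly the fields in the row B minimal payload.
ROW_B_FIELDS = {"prompt", "duration", "mode", "aspect_ratio"}


def check_text2video_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the body matches row B."""
    problems: list[str] = []
    if "model_name" in payload:
        # Row B tests the scaffold policy specifically without a pinned model.
        problems.append("model_name must be omitted for this row")
    missing = sorted(ROW_B_FIELDS - set(payload))
    if missing:
        problems.append("missing fields: " + ", ".join(missing))
    extra = sorted(set(payload) - ROW_B_FIELDS - {"model_name"})
    if extra:
        problems.append("fields outside the minimal shape: " + ", ".join(extra))
    return problems
```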

## C. image2video — scaffold-policy + asset-floor verification
- Endpoint: `POST /v1/videos/image2video`
- Immediate purpose: verify the scaffold-safe minimal body using a known-good image floor rather than rediscovering image validity
- Candidate models:
  - current scaffold primary: omit `model_name`
  - historical fallback candidate for a separately approved later round: `kling-v1-6`
- Minimal payload shape:
```json
{
  "image": "<known-valid-512px-base64-jpg-or-png>",
  "prompt": "<simple motion prompt>",
  "duration": "5",
  "mode": "std",
  "aspect_ratio": "16:9"
}
```
- Success criteria:
  - create returns `code=0`
  - query reaches `succeed`
  - output respects the source image at least at a basic subject/scene level
  - no extra required fields emerge beyond the current scaffold shape
- Stop conditions:
  - if minimal no-model payload succeeds, stop
  - if failure is `Image pixel is invalid` or equivalent asset-quality error, stop and replace the asset first; do not infer endpoint contract failure
  - if failure explicitly points to model/version policy, stop and request approval for exactly one legacy-model retry later
- Cost-control notes:
  - reuse a previously validated 512px image asset
  - do not spend this row on image-format rediscovery
- Reopen criteria:
  - reopen only for model-policy clarification or if asset-validation rules appear to have changed materially
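
The row C stop conditions amount to a small decision table. The sketch below assumes the failure has already been classified by a person into one of this plan's failure classes; `next_step_for_row_c` and the returned strings are hypothetical.

```python
from typing import Optional


def next_step_for_row_c(succeeded: bool, failure_class: Optional[str] = None) -> str:
    """Hypothetical mapping from a row C outcome to the next allowed step."""
    if succeeded:
        return "stop: row complete, no further variants"
    if failure_class == "asset-quality issue":
        # e.g. `Image pixel is invalid`: fix the asset, not the endpoint contract.
        return "stop: replace the asset before any retry"
    if failure_class == "model-policy mismatch":
        return "stop: request approval for exactly one legacy-model retry"
    # Anything else is not retried until it has been classified and reviewed.
    return "stop: classify the failure before requesting any retry"
```

Every branch returns a stop; the function only decides what has to happen before another create may be requested.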

## D. Legacy/non-current route note — `reference2video`
- Endpoint: `POST /v1/videos/multi-image2video`
- Current classification: legacy/non-current route candidate, not part of the default current-production verification core
- Why it is no longer in the default production matrix:
  - dashboard/runtime observations suggest it may still map to older model generations (for example 1.6-class behavior)
  - the broader docs corpus still exposes it, but that is not sufficient reason to keep it in the current production-facing layer
  - current multi-reference-like production behavior is better explained through Omni `image_list`
- Handling rule:
  - keep historical notes and provenance, but do not present this route as a current production-default choice
  - if future work revisits it, do so explicitly as legacy-route analysis, not as part of the primary production path set
  - do not use `image2video` + `image_tail` results as a substitute for this route
  - do not keep this route in the current-production core just because it remains callable

---

# Cross-row approval logic

## Recommended execution order
1. Omni first-frame baseline
2. text2video minimal no-model
3. image2video minimal no-model with known-good asset
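
`reference2video` is intentionally absent from this order: it is a legacy/non-current route (section D) and is revisited only as explicitly approved legacy-route analysis.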

Reason:
- start with the most synchronized row
- then answer the narrowest unresolved policy question for each non-Omni family
- avoid spending on quality-comparison or multi-variant batches before basic scaffold policy is tightened

## Row-level interpretation rules
- A row success means: the minimal current-plan contract is viable.
- A row failure does **not** automatically mean the endpoint is unusable.
- Failures must be classified as one of:
  - environment/auth/transient
  - asset-quality issue
  - model-policy mismatch
  - body-shape mismatch
  - moderation/content-risk

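Only `model-policy mismatch` and `body-shape mismatch` justify schema/policy reopening. Environment, asset-quality, and moderation/content-risk failures are resolved outside the schema and do not reopen it.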

### Preserved future-verification scenario: Omni lingerie stress attempt
Keep the following run as a preserved boundary case for future review, without over-reading it as an API-shape failure:
- model: `kling-v3-omni`
- duration: `10`
- attachment method: three user-provided reference images sent as raw base64 via `image_list[].image_url`
- semantic roles of the three references: host1 / host2 / lingerie object
- exact preserved prompt:

> Two female hosts stand in a private bedroom and notice a lingerie piece hanging on a rack. They react with subtle surprise and admiration. One host lifts the lingerie from the hanger and holds it against her body, then tries it on in place with visible, natural dressing motion. She adjusts smooths the fit while the second host watches with an impressed meaningful smile. The final scene concludes with a woman trying on the lingerie posing with a meaningful smile, followed by a close-up of the garment. smooth and seamless movement, vivid product images, and realistic body and clothing movements blend together.

- preserved execution outcome:
  - create accepted and returned task `867314916859576411`
  - final query later returned `task_status='failed'`
  - failure message was `Failure to pass the risk control system`
  - `final_unit_deduction='0'`
- current interpretation:
  - classify this as `moderation/content-risk`
  - do **not** classify it as `body-shape mismatch`
  - do **not** treat it as evidence that raw-base64 `image_list[].image_url` attachments failed structurally
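
To keep this interpretation repeatable, known failure messages can be mapped to failure classes up front. The sketch below covers only the two messages quoted in this plan; `classify_failure` is a hypothetical helper, and anything it does not recognize is left for manual classification rather than guessed.

```python
# Only messages quoted in this plan are listed; extend only with observed messages.
KNOWN_FAILURE_MESSAGES = {
    "Failure to pass the risk control system": "moderation/content-risk",
    "Image pixel is invalid": "asset-quality issue",
}


def classify_failure(message: str) -> str:
    """Map a failure message to a failure class, or 'unclassified' if unknown."""
    lowered = message.lower()
    for needle, label in KNOWN_FAILURE_MESSAGES.items():
        if needle.lower() in lowered:
            return label
    # Unknown messages need a human decision before any retry is requested.
    return "unclassified"
```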

## What not to do in this round
- no `pro` vs `std` comparisons
- no multiple candidate model names for the same row
- no Omni multi-shot retesting in the same approval batch
- no element-backed generation yet
- no audio / voice fields yet
- no callback reachability investigation mixed into create verification

---

# Minimum approved experiment set to request
If only one compact approval block is requested, the minimum useful current-production set is:
1. **1x Omni** first-frame 5s baseline create
2. **1x text2video** 5s minimal no-model create
3. **1x image2video** 5s minimal no-model create with a known-good 512px asset

Total planned create count: **3**

Why this is the minimum set:
- it covers the current production-facing route families without drifting into legacy/non-current endpoint analysis
- more than three creates would start mixing current-production tightening with historical-route cleanup
- legacy/non-current routes such as `reference2video` are intentionally out of scope for this minimum set

## If approval must be even smaller
Approve rows in this priority order and stop at whatever create count is approved:
1. Omni first-frame baseline
2. text2video minimal no-model
3. image2video minimal no-model

That ordering preserves the highest decision value per create.
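
Read as a rule, the fallback order means an approval for N creates covers the first N rows. A minimal sketch, with `rows_for_budget` as a hypothetical helper name:

```python
# Priority order copied from the fallback list above.
PRIORITY_ORDER = [
    "Omni first-frame baseline",
    "text2video minimal no-model",
    "image2video minimal no-model",
]


def rows_for_budget(approved_creates: int) -> list[str]:
    """Return the rows to run for a given number of approved creates."""
    if approved_creates < 0:
        raise ValueError("approved_creates must not be negative")
    # Slicing caps the result at the three planned rows.
    return PRIORITY_ORDER[:approved_creates]
```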

---

# Deliverables to capture for each approved row
- exact payload
- endpoint
- whether payload was doc-derived, live-confirmed, or policy-test
- create response
- query response
- final asset URL if any
- local download if any
- final unit deduction
- one-sentence verdict: `tightened`, `unchanged`, or `reopened`
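
One way to keep these deliverables uniform across rows is a small record type that rejects unknown provenance or verdict labels. This is a sketch only: `RowRecord` and its field names are hypothetical, chosen to mirror the list above.

```python
from dataclasses import dataclass
from typing import Optional

PROVENANCE = {"doc-derived", "live-confirmed", "policy-test"}
VERDICTS = {"tightened", "unchanged", "reopened"}


@dataclass
class RowRecord:
    """Hypothetical per-row capture of the deliverables listed above."""

    endpoint: str
    payload: dict
    provenance: str
    create_response: dict
    query_response: dict
    final_unit_deduction: str
    verdict: str
    verdict_note: str = ""            # the one-sentence explanation
    asset_url: Optional[str] = None   # final asset URL, if any
    local_path: Optional[str] = None  # local download, if any

    def __post_init__(self) -> None:
        if self.provenance not in PROVENANCE:
            raise ValueError(f"unknown provenance: {self.provenance!r}")
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict!r}")
```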

## End state this plan is trying to reach
After this round, the repo should be able to say one of two things clearly:
1. non-Omni minimal no-model requests are viable and current scaffold policy can tighten safely, or
2. non-Omni routes still require explicit legacy model handling and should remain caller-explicit / provisional

Either answer is useful. The wasteful outcome would be mixing multiple variants before that question is answered.