Concurrency Rules


What is Kling API concurrency?

Kling API concurrency is the maximum number of generation tasks that an account can process in parallel at any given time. The limit is determined by the account's resource package. A higher concurrency level lets you keep more generation tasks in progress at once (each call to the task creation interface starts a new generation task).

Notes

  • The concurrency limit applies only to the task creation interface; query interfaces do not consume concurrency.
  • The limit concerns the number of concurrent tasks and is unrelated to Queries Per Second (QPS); the system imposes no QPS limit.

Core Rules

Dimension | Rule Description
Application Scope | Applied at the account level. Calculated independently per resource pack type (video/image/virtual try-on). All API keys under the same account share the same concurrency quota.
Occupancy Logic | A task occupies concurrency from entering submitted status until completion (including failures). Released immediately after the task ends.
Quota Calculation | Determined by the highest concurrency value among all active resource packages of the same type. Example: if a 5-concurrency and a 10-concurrency video package are both active → video concurrency capacity = 10.
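
The quota calculation rule can be illustrated with a short Python sketch. The package records below (the `type`, `concurrency`, and `active` fields) are hypothetical and are not part of the Kling API; they only model the "highest value among active packages of the same type" rule.

```python
from collections import defaultdict


def effective_concurrency(packages):
    """Return {type: capacity}, using the highest value among active packages of each type."""
    capacity = defaultdict(int)
    for pkg in packages:
        if pkg["active"]:  # expired or inactive packages do not count
            capacity[pkg["type"]] = max(capacity[pkg["type"]], pkg["concurrency"])
    return dict(capacity)


# Hypothetical account holdings, mirroring the example in the table above.
packages = [
    {"type": "video", "concurrency": 5, "active": True},
    {"type": "video", "concurrency": 10, "active": True},
    {"type": "image", "concurrency": 9, "active": True},
    {"type": "image", "concurrency": 20, "active": False},
]

print(effective_concurrency(packages))  # {'video': 10, 'image': 9}
```

Note that the values are not summed: two active video packages of 5 and 10 give a capacity of 10, not 15.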

Special Notes

  • Video / Virtual Try-on tasks: Each task occupies 1 concurrency.
  • Image generation tasks: Concurrency used = the n value in the API request parameter. (Example: n = 9 → occupies 9 concurrency)
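
As a rough illustration of these occupancy rules, the sketch below computes how much concurrency a task occupies and mirrors the usage of one resource pack type on the client side. The task type labels (`"video"`, `"image"`, `"virtual_try_on"`) and the `SlotTracker` class are illustrative assumptions, not names defined by the API.

```python
def slots_required(task_type: str, n: int = 1) -> int:
    """Concurrency a single task occupies under the rules above."""
    if task_type == "image":
        if n < 1:
            raise ValueError("n must be at least 1")
        return n  # image tasks occupy n concurrency
    if task_type in ("video", "virtual_try_on"):
        return 1  # video and virtual try-on tasks occupy 1 concurrency each
    raise ValueError(f"unknown task type: {task_type}")


class SlotTracker:
    """Client-side mirror of the concurrency in use for one resource pack type."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.in_use = 0

    def try_acquire(self, cost: int) -> bool:
        """Reserve `cost` slots if they fit within the capacity; otherwise do nothing."""
        if self.in_use + cost > self.capacity:
            return False
        self.in_use += cost
        return True

    def release(self, cost: int) -> None:
        """Free slots once a task completes or fails."""
        self.in_use = max(0, self.in_use - cost)


image_slots = SlotTracker(capacity=10)
first = slots_required("image", n=9)
print(image_slots.try_acquire(first))                         # True
print(image_slots.try_acquire(slots_required("image", n=2)))  # False
image_slots.release(first)
print(image_slots.in_use)                                     # 0
```

Because concurrency is calculated independently per resource pack type, a client using this approach would keep one tracker per type.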

Over-limit Error Mechanism

When the number of running tasks reaches the concurrency limit, submitting a new task creation request returns the following error:

{
	"code": 1303,
	"message": "parallel task over resource pack limit",
	"request_id": "9984d27b-a408-4073-ae28-17ca6a13622d" //uuid
}
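
A client can recognize this condition by checking the `code` field of the response body. The following is a minimal sketch; the helper name `is_concurrency_limit_error` is hypothetical, and it assumes the real response is plain JSON (the `//uuid` annotation in the sample above is treated as documentation only).

```python
import json

CONCURRENCY_LIMIT_CODE = 1303  # error code from the sample response above


def is_concurrency_limit_error(body: str) -> bool:
    """Return True if the response body is the over-limit error."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # not JSON, so not the error we are looking for
    return isinstance(payload, dict) and payload.get("code") == CONCURRENCY_LIMIT_CODE


sample = """{
    "code": 1303,
    "message": "parallel task over resource pack limit",
    "request_id": "9984d27b-a408-4073-ae28-17ca6a13622d"
}"""

print(is_concurrency_limit_error(sample))  # True
```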

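Since this error is triggered by load (the account has reached its concurrency limit) rather than by parameter issues, the same request can succeed once running tasks finish. It is recommended to: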

  1. Backoff Retry Strategy: Use an exponential backoff algorithm to delay retries (recommended initial delay ≥ 1 second).
  2. Queue Management: Control the submission rate through a task queue and dynamically adapt to available concurrency.
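
The backoff recommendation above could be sketched as follows. This is one possible approach, not an official client: `submit` is a hypothetical callable that performs the task creation request and returns the parsed response as a dict, and the retry count, delay cap, and small random jitter are assumptions chosen for illustration.

```python
import random
import time

CONCURRENCY_LIMIT_CODE = 1303  # "parallel task over resource pack limit"


def submit_with_backoff(submit, max_attempts=6, initial_delay=1.0,
                        max_delay=60.0, sleep=time.sleep):
    """Call submit() until it stops returning the over-limit error.

    The wait doubles after each over-limit response, starting at
    initial_delay (at least 1 second is recommended) and capped at max_delay.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        response = submit()
        if response.get("code") != CONCURRENCY_LIMIT_CODE:
            return response  # success, or an error that retrying will not fix
        if attempt == max_attempts:
            break
        # Add up to 10% jitter so parallel clients do not retry in lockstep.
        sleep(delay + random.uniform(0, delay * 0.1))
        delay = min(delay * 2, max_delay)
    raise RuntimeError("concurrency limit still exceeded after retries")


# Demonstration with a stand-in submit function: two over-limit responses,
# then a placeholder response (not the real API's success shape).
responses = iter([
    {"code": 1303, "message": "parallel task over resource pack limit"},
    {"code": 1303, "message": "parallel task over resource pack limit"},
    {"code": 0, "message": "stand-in success"},
])
delays = []  # record the waits instead of actually sleeping
result = submit_with_backoff(lambda: next(responses), sleep=delays.append)
print(result["message"], len(delays))  # stand-in success 2
```

Only the over-limit code is retried here; other responses are returned to the caller unchanged, since retrying does not help with parameter errors.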