{
  "Create Task": "Create Task\nPOST\n/v1/audio/tts\ncURL\nCopy\nCollapse\ncurl --request POST \\\n--url https://api-singapore.klingai.com/v1/audio/tts \\\n--header 'Authorization: Bearer <token>' \\\n--header 'Content-Type: application/json' \\\n--data '{\n\"text\": \"Throughout my time in college, several memorable event left a significant impact on my life\",\n\"voice_id\": \"oversea_male1\",\n\"voice_language\": \"en\",\n\"voice_speed\": 1\n}'\n200\nCopy\nCollapse\n{\n\"code\": 0, // Error codes; Specific definitions can be found in Error codes\n\"message\": \"string\", // Error information\n\"request_id\": \"string\", // Request ID, generated by the system, is used to track requests and troubleshoot problems\n\"data\": {\n\"task_id\": \"string\", // Task ID, generated by the system\n\"task_status\": \"string\", // Task status, Enum values: submitted, processing, succeed, failed\n\"task_status_msg\": \"string\", // Task status information, displaying the failure reason when the task fails (such as triggering the content risk control of the platform, etc.)\n\"task_result\": {\n\"audios\": [\n{\n\"id\": \"string\", // Generated sound ID; globally unique, will be cleared after 30 days\n\"url\": \"string\", // URL for generating sounds，such as https://p1.a.kwimgs.com/bs2/upload-ylab-stunt/special-effect/output/HB1_PROD_ai_web_46554461/-2878350957757294165/output.mp3(To ensure information security, generated images/videos will be cleared after 30 days. 
Please make sure to save them promptly.)\n\"duration\": \"string\" // Total audio duration, unit: s (seconds)\n}\n]\n},\n\"final_unit_deduction\": \"string\", // Units deducted for the task\n\"created_at\": 1722769557708, // Task creation time, Unix timestamp, unit: ms\n\"updated_at\": 1722769557708 // Task update time, Unix timestamp, unit: ms\n}\n}\nText-to-Speech synthesis API for generating audio from text.\nRequest Header\nContent-Type\nstring\nRequired\nDefault to application/json\nData Exchange Format\nAuthorization\nstring\nRequired\nAuthentication information; refer to API authentication\nRequest Body\ntext\nstring\nRequired\nText Content for Audio Synthesis\nThe text may be at most 1000 characters; longer text returns an error code and related information.\nThe system validates the text content; if validation fails, it returns an error code and related information.\nvoice_id\nstring\nRequired\nVoice ID\nThe system offers a variety of voice options to choose from. For specific voice effects, voice IDs, and corresponding voice languages, see Voice Guide. 
Voice previews do not support custom scripts.\nVoice preview file naming convention: Voice Name#Voice ID#Voice Language\nvoice_language\nstring\nRequired\nDefault to zh\nVoice Language\nEnum values:\nzh\nen\nThe voice language corresponds to the Voice ID, as detailed above.\nvoice_speed\nfloat\nOptional\nDefault to 1.0\nSpeech Rate\nValid range: [0.8, 2.0], accurate to one decimal place; values outside this range will be automatically rounded."
}