mingyan
SoundClone - Create Audio Task
Create an audio generation task with a voice cloning model ID
Create an audio generation task with the modelId returned by a voice cloning preview task. The modelId is valid for 3 days and cannot be used after it expires. The first time a valid modelId is used with this API, it is automatically converted to a permanent model and can then be used for audio generation indefinitely.
Authentication
All requests must include a Bearer token in the request header:
Authorization: Bearer {{key}}
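A minimal Python sketch of building this header. The helper name auth_headers and the environment variable SOUNDCLONE_API_KEY are assumptions made for illustration; they are not defined by the API.

```python
import os


def auth_headers(key: str) -> dict:
    """Build the request headers carrying the Bearer token."""
    if not key or key != key.strip():
        raise ValueError("key must be non-empty, without surrounding whitespace")
    return {"Authorization": f"Bearer {key}"}


if __name__ == "__main__":
    # SOUNDCLONE_API_KEY is a hypothetical variable name, not part of the API.
    key = os.environ.get("SOUNDCLONE_API_KEY", "")
    if key:
        print(auth_headers(key))
```

Reading the key from the environment keeps it out of source files; any other secret store would work equally well.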
Base URL
https://zcbservice.aizfw.cn/kyyReactApiServer
baseUrl is the shared prefix for all public APIs. The api field in the current page frontmatter shows the full endpoint. Use this baseUrl as the common prefix when reading or composing request paths.
Request Parameters
modelId (body, string, required)
Voice model ID returned by the preview task query result.

contentText (body, string, required)
Text content to generate audio from. The content must be shorter than 10000 characters.
To control pauses in the speech, insert <#x#> between characters. x is the pause length in seconds, supports 0.01-99.99, and can contain up to two decimal places.

soundVersion (body, string)
Voice model version.
- v1: Model 1, supports 24 languages
- v2: Model 2, supports 40 languages

language (body, string)
Language type. Defaults to auto when omitted.
Supported by both v1 and v2: Chinese, Chinese,Yue, English, Arabic, Russian, Spanish, French, Portuguese, German, Turkish, Dutch, Ukrainian, Vietnamese, Indonesian, Japanese, Italian, Korean, Thai, Polish, Romanian, Greek, Czech, Finnish, Hindi.
The following languages require soundVersion to be v2: Bulgarian, Danish, Hebrew, Malay, Persian, Slovak, Swedish, Croatian, Filipino, Hungarian, Norwegian, Slovenian, Catalan, Nynorsk, Tamil, Afrikaans, auto.
Example: Chinese.

emotion (body, string)
Emotion type. Defaults to neutral when omitted.
Supported values: happy, sad, angry, fearful, disgusted, surprised, neutral.
Example: happy.

speed (body, BigDecimal)
Speech speed. Valid range: [0.5,2]. Defaults to 1.0 when omitted. A larger value means faster speech.
Example: 1.2.

vol (body, BigDecimal)
Volume. Valid range: (0,10]. Defaults to 1.0 when omitted. A larger value means louder audio.
Example: 2.5.

pitch (body, integer)
Pitch. Valid range: [-12,12]. Defaults to 0 when omitted. 0 outputs the original voice tone. The value must be an integer.
Example: 5.

subtitleEnable (body, boolean)
Whether to generate subtitles. Defaults to false when omitted.

subtitleType (body, string)
Subtitle type. This parameter can be passed when subtitle generation is enabled.
- Omitted: sentence-level subtitles
- word: word-level subtitles
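The parameter rules above can be checked on the client before a request is sent. The following Python sketch is one possible way to do that; the names pause and build_payload are hypothetical helpers, not part of the API. It assumes the character limit applies to the raw contentText string including any pause markers, which this page does not state. It also only checks the v2 requirement when language is passed explicitly, because the page does not say which soundVersion is used when the field is omitted.

```python
from decimal import Decimal

# Language names copied from the parameter description above.
V1_LANGUAGES = {
    "Chinese", "Chinese,Yue", "English", "Arabic", "Russian", "Spanish",
    "French", "Portuguese", "German", "Turkish", "Dutch", "Ukrainian",
    "Vietnamese", "Indonesian", "Japanese", "Italian", "Korean", "Thai",
    "Polish", "Romanian", "Greek", "Czech", "Finnish", "Hindi",
}
V2_ONLY_LANGUAGES = {
    "Bulgarian", "Danish", "Hebrew", "Malay", "Persian", "Slovak", "Swedish",
    "Croatian", "Filipino", "Hungarian", "Norwegian", "Slovenian", "Catalan",
    "Nynorsk", "Tamil", "Afrikaans", "auto",
}
EMOTIONS = {"happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"}


def pause(seconds) -> str:
    """Return a pause marker <#x#>, with x in seconds (0.01 to 99.99)."""
    value = Decimal(str(seconds))
    if not Decimal("0.01") <= value <= Decimal("99.99"):
        raise ValueError("pause must be between 0.01 and 99.99 seconds")
    if value != value.quantize(Decimal("0.01")):
        raise ValueError("pause allows at most two decimal places")
    return f"<#{value}#>"


def build_payload(model_id, content_text, *, sound_version=None, language=None,
                  emotion=None, speed=None, vol=None, pitch=None,
                  subtitle_enable=None, subtitle_type=None) -> dict:
    """Validate the documented constraints and return the request body."""
    if not model_id:
        raise ValueError("modelId is required")
    # Assumption: the limit is checked against the raw string, markers included.
    if not content_text or len(content_text) >= 10000:
        raise ValueError("contentText must be non-empty and under 10000 characters")
    payload = {"modelId": model_id, "contentText": content_text}
    if sound_version is not None:
        if sound_version not in ("v1", "v2"):
            raise ValueError("soundVersion must be v1 or v2")
        payload["soundVersion"] = sound_version
    if language is not None:
        if language not in V1_LANGUAGES | V2_ONLY_LANGUAGES:
            raise ValueError(f"unsupported language: {language}")
        if language in V2_ONLY_LANGUAGES and sound_version != "v2":
            raise ValueError(f"{language} requires soundVersion v2")
        payload["language"] = language
    if emotion is not None:
        if emotion not in EMOTIONS:
            raise ValueError(f"unsupported emotion: {emotion}")
        payload["emotion"] = emotion
    if speed is not None:
        if not 0.5 <= speed <= 2:  # closed range [0.5, 2]
            raise ValueError("speed must be within [0.5, 2]")
        payload["speed"] = speed
    if vol is not None:
        if not 0 < vol <= 10:  # half-open range (0, 10]
            raise ValueError("vol must be within (0, 10]")
        payload["vol"] = vol
    if pitch is not None:
        if isinstance(pitch, bool) or not isinstance(pitch, int):
            raise ValueError("pitch must be an integer")
        if not -12 <= pitch <= 12:
            raise ValueError("pitch must be within [-12, 12]")
        payload["pitch"] = pitch
    if subtitle_enable is not None:
        payload["subtitleEnable"] = bool(subtitle_enable)
    if subtitle_type is not None:
        # Only "word" is documented; omit the field for sentence-level subtitles.
        if not subtitle_enable or subtitle_type != "word":
            raise ValueError("subtitleType is only 'word', with subtitleEnable true")
        payload["subtitleType"] = subtitle_type
    return payload
```

For example, build_payload("my-model-id", "Hello" + pause(0.5) + "world", sound_version="v2", language="Danish") returns a dictionary that can be serialized with json.dumps and sent as the request body. Omitted optional fields are left out of the payload so that the server-side defaults apply.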
Response Parameters
id (string)
Task ID, used to query the task status later.

object (string)
Object type, fixed as audio.

created (integer)
Task creation timestamp.

model (string)
Model name used by the task. For audio generation tasks, it is soundCloningAudio.

status (string)
Task status. It is usually queued after creation.

error (string)
Error message, returned when the task fails.
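As a rough sketch, the response fields above could be read into a small Python structure. It assumes the response body is a flat JSON object with these fields at the top level, which this page does not confirm. AudioTask and parse_task are hypothetical names, and the sample values in the usage note are invented for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioTask:
    id: Optional[str]
    object: Optional[str]
    created: Optional[int]
    model: Optional[str]
    status: Optional[str]
    error: Optional[str] = None

    @property
    def failed(self) -> bool:
        """True when the response carries a non-empty error message."""
        return bool(self.error)


def parse_task(body: str) -> AudioTask:
    """Parse a response body, assuming a flat JSON object (not confirmed)."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return AudioTask(
        id=data.get("id"),
        object=data.get("object"),
        created=data.get("created"),
        model=data.get("model"),
        status=data.get("status"),
        error=data.get("error"),
    )
```

Given a body such as {"id": "task-1", "object": "audio", "created": 1700000000, "model": "soundCloningAudio", "status": "queued"}, parse_task returns an AudioTask whose id can be stored for the later status query. The created value is kept as received, since the page does not say whether it is in seconds or milliseconds.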

