Videos - Create Task

POST /kyyReactApiServer/v1/videos/videos

Create video generation tasks with the Videos model family. This API supports the videos, videos_stable, and videos_stable_fast models and covers text-to-video, first/last frame control, reference images, reference videos, and reference audio. videos accepts up to 3 audio references; videos_stable and videos_stable_fast accept up to 1.

Authentication

All requests require a Bearer token in the request header:
Authorization: Bearer {{key}}

Base URL

https://zcbservice.aizfw.cn/kyyReactApiServer
baseUrl is the shared prefix for all public APIs; use it as the common prefix when reading or composing request paths. The endpoint path shown at the top of this page already includes this prefix.
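
As an illustration of how the Bearer header and the base URL fit together, here is a minimal Python sketch that builds (but does not send) the create-task request. The function name `build_create_request` is hypothetical, and the JSON body with a `Content-Type: application/json` header is an assumption, since this page does not state the request encoding.

```python
import json
import urllib.request

BASE_URL = "https://zcbservice.aizfw.cn/kyyReactApiServer"
CREATE_PATH = "/v1/videos/videos"  # endpoint path from this page, minus the base prefix


def build_create_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request for creating a video task without sending it."""
    return urllib.request.Request(
        url=BASE_URL + CREATE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumption: JSON request body
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` when you are ready to submit the task.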

Model Overview

Supported models:
  • videos - Videos standard model, billed per request, supports wider aspect ratios and audio references
  • videos_stable - Video2 full model, billed per request
  • videos_stable_fast - Video2 fast model, billed per request

Request Parameters

model (string, required)
Model name.
Available values:
  • videos - Videos standard model, supports audio references
  • videos_stable - Video2 full model
  • videos_stable_fast - Video2 fast model
prompt (string, required)
Video generation prompt. videos_stable / videos_stable_fast accept up to 5000 characters; this 5000-character limit does not apply to videos.
Avoid prohibited, infringing, political, or explicit content.
Example: "A cute kitten playing in the grass"
duration (integer, required)
Video duration in seconds. Supported range: 4-15.
ratio (string, optional)
Output aspect ratio. Defaults to 16:9.
Supported values:
  • videos: 21:9, 16:9, 4:3, 1:1, 3:4, 9:16
  • videos_stable / videos_stable_fast: 16:9, 9:16, 1:1
resolution (string, optional)
Output resolution. Defaults to 720p.
Currently supported:
  • 720p
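
The base parameters above can be checked on the client before a request is sent. The following is a minimal sketch of such a pre-check, not the service's own validation; the function name `build_base_payload` and the error messages are hypothetical, while the limits are copied from the parameter descriptions above.

```python
# Aspect ratios per model, as listed under the ratio parameter.
RATIOS = {
    "videos": {"21:9", "16:9", "4:3", "1:1", "3:4", "9:16"},
    "videos_stable": {"16:9", "9:16", "1:1"},
    "videos_stable_fast": {"16:9", "9:16", "1:1"},
}
PROMPT_LIMITED_MODELS = {"videos_stable", "videos_stable_fast"}


def build_base_payload(model, prompt, duration, ratio="16:9", resolution="720p"):
    """Validate the base parameters and return a request payload dict."""
    if model not in RATIOS:
        raise ValueError(f"unknown model: {model}")
    if not prompt:
        raise ValueError("prompt is required")
    if model in PROMPT_LIMITED_MODELS and len(prompt) > 5000:
        raise ValueError("prompt exceeds 5000 characters for this model")
    if not isinstance(duration, int) or not 4 <= duration <= 15:
        raise ValueError("duration must be an integer from 4 to 15 seconds")
    if ratio not in RATIOS[model]:
        raise ValueError(f"ratio {ratio} is not supported by {model}")
    if resolution != "720p":
        raise ValueError("only 720p is currently supported")
    return {
        "model": model,
        "prompt": prompt,
        "duration": duration,
        "ratio": ratio,
        "resolution": resolution,
    }
```

For a text-to-video task, `build_base_payload("videos", "A cute kitten playing in the grass", 5)` yields a payload that uses the documented defaults for ratio and resolution.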

First/Last Frame Mode

first_image (string, optional)
First frame image URL.
  • Only used for first/last frame scenarios
  • Must be used together with last_image
Cannot be used together with referenceImages or referenceVideos; first/last frame mode does not support referenceAudios
last_image (string, optional)
Last frame image URL.
  • Must be used together with first_image
Cannot be used together with referenceImages or referenceVideos; first/last frame mode does not support referenceAudios
Compatibility notes:
  • The legacy fields image and lastFrameImage are still supported
  • When both new and legacy fields are provided, first_image / last_image take priority
  • If both new and legacy fields are provided with different values, the API returns a parameter conflict
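
One way to read the compatibility notes is sketched below in Python: the new field wins when it is present, a differing legacy value is treated as a conflict, and the two frames must appear together. This is a client-side interpretation under those assumptions, not the server's implementation; `resolve_frames` and `_pick` are hypothetical names.

```python
def _pick(params, new_key, legacy_key):
    """Return the new field if set, else the legacy one; reject mismatches."""
    new, legacy = params.get(new_key), params.get(legacy_key)
    if new is not None and legacy is not None and new != legacy:
        raise ValueError(f"parameter conflict: {new_key} and {legacy_key} differ")
    return new if new is not None else legacy


def resolve_frames(params):
    """Resolve (first, last) frame URLs, or (None, None) if neither is given."""
    first = _pick(params, "first_image", "image")
    last = _pick(params, "last_image", "lastFrameImage")
    if (first is None) != (last is None):
        raise ValueError("first_image and last_image must be used together")
    return first, last
```

Running this check before submission surfaces a mismatched legacy field locally instead of waiting for the API's parameter-conflict response.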

Reference Material Mode

referenceImages (array, optional)
Reference image URL list.
Rules:
  • videos supports up to 9 images
  • videos_stable / videos_stable_fast support up to 4 images
  • Each image must not exceed 20MB
  • Can be combined with referenceVideos for image/video guidance
  • Can be combined with referenceAudios; videos supports up to 3 audios, while videos_stable / videos_stable_fast support up to 1 audio
Cannot be used together with first_image / last_image or the legacy image / lastFrameImage fields
referenceVideos (array, optional)
Reference video URL list.
Rules:
  • Up to 3 videos
  • Total duration must not exceed 15 seconds
  • Total size must not exceed 200MB
  • Each video must have a resolution between 720px and 2160px
Cannot be used together with first_image / last_image or the legacy image / lastFrameImage fields
referenceAudios (array, optional)
Reference audio URL list.
Rules:
  • videos supports up to 3 audios
  • videos_stable / videos_stable_fast support up to 1 audio
  • Total duration must not exceed 15 seconds
  • Used to provide audio style, rhythm, or sound references for video generation
  • Can be combined with referenceImages and referenceVideos for multimodal guidance
Passing more audio URLs than the selected model supports returns an error; first/last frame mode does not support audio references
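
The per-model count limits above lend themselves to a small lookup table. The sketch below checks only how many URLs are passed; file size, duration, and resolution limits would require inspecting the media itself and are out of scope here. `REFERENCE_LIMITS` and `check_reference_counts` are hypothetical names.

```python
# Maximum number of reference URLs per model, from the rules above.
REFERENCE_LIMITS = {
    "videos": {"referenceImages": 9, "referenceVideos": 3, "referenceAudios": 3},
    "videos_stable": {"referenceImages": 4, "referenceVideos": 3, "referenceAudios": 1},
    "videos_stable_fast": {"referenceImages": 4, "referenceVideos": 3, "referenceAudios": 1},
}


def check_reference_counts(model, payload):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for field, limit in REFERENCE_LIMITS[model].items():
        count = len(payload.get(field) or [])
        if count > limit:
            problems.append(f"{field}: {count} given, {model} allows up to {limit}")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every exceeded limit at once.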

Response Parameters

id (string)
Unique identifier for the video generation task, used for subsequent status queries.
object (string)
Object type, always video.
created (integer)
Task creation timestamp.
model (string)
Model name used for generation.
status (string)
Task status:
  • queued - Queued
  • processing - Processing
  • completed - Completed
  • failed - Failed
error (string)
Error message, returned when the task has failed.
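
A response with these fields could be mapped onto a small typed object as sketched below. The sketch assumes the fields arrive as a flat JSON object at the top level of the response body, which this page does not confirm; `VideoTask` and `parse_task` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional

TERMINAL_STATUSES = {"completed", "failed"}


@dataclass
class VideoTask:
    id: str
    object: str
    created: int
    model: str
    status: str
    error: Optional[str] = None  # only expected when status is "failed"

    @property
    def is_terminal(self) -> bool:
        """True once the task can no longer change state."""
        return self.status in TERMINAL_STATUSES


def parse_task(body: str) -> VideoTask:
    """Parse a JSON response body into a VideoTask (assumed flat layout)."""
    data = json.loads(body)
    return VideoTask(
        id=data["id"],
        object=data["object"],
        created=data["created"],
        model=data["model"],
        status=data["status"],
        error=data.get("error"),
    )
```

Keeping the returned `id` on the parsed object makes it easy to store for later status queries.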

Parameter Selection Rules

Mutually Exclusive Rules:
  • First/last frame mode: first_image and last_image must be used together
  • Reference material mode: referenceImages and referenceVideos can be used individually or together; referenceAudios supports up to 3 audios with videos, and up to 1 audio with videos_stable / videos_stable_fast
  • Mode exclusivity: First/last frame mode cannot be combined with reference material mode or audio references
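
The exclusivity rules above amount to classifying a payload into one of three modes. A minimal sketch, assuming the field names listed on this page (including the legacy frame fields); `detect_mode` and the returned mode labels are hypothetical.

```python
FRAME_FIELDS = ("first_image", "last_image", "image", "lastFrameImage")
REFERENCE_FIELDS = ("referenceImages", "referenceVideos", "referenceAudios")


def detect_mode(payload):
    """Classify a payload as 'text', 'first_last_frame', or 'reference'."""
    has_frames = any(payload.get(key) for key in FRAME_FIELDS)
    has_refs = any(payload.get(key) for key in REFERENCE_FIELDS)
    if has_frames and has_refs:
        raise ValueError(
            "first/last frame mode cannot be combined with reference materials"
        )
    if has_frames:
        return "first_last_frame"
    if has_refs:
        return "reference"
    return "text"
```

Empty lists and missing keys are both treated as "not provided", so a payload with `"referenceImages": []` still counts as text-only.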

Use Cases

Text to Video

Use only prompt, model, duration, and other base parameters

First/Last Frame

Use first_image and last_image for precise start and end frame control

Multimodal References

Combine referenceImages, referenceVideos, and referenceAudios for stronger generation guidance

Model Comparison

| Model | Billing | Duration Range | Reference Support | Aspect Ratios |
| --- | --- | --- | --- | --- |
| videos | Per request | 4-15 seconds | Up to 9 images, 3 videos, and 3 audios | 21:9, 16:9, 4:3, 1:1, 3:4, 9:16 |
| videos_stable | Per request | 4-15 seconds | Up to 4 images, 3 videos, and 1 audio | 16:9, 9:16, 1:1 |
| videos_stable_fast | Per request | 4-15 seconds | Up to 4 images, 3 videos, and 1 audio | 16:9, 9:16, 1:1 |

Best Practices:
  1. Video generation is asynchronous, so save the returned id for later queries.
  2. Include scene, subject, motion, camera, and style details in the prompt.
  3. In first/last frame mode, make sure the two images are visually continuous.
  4. When using video or audio references, prefer short, clear, and subject-focused source materials.
  5. For the Video2 fast model, use videos_stable_fast.
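
Because generation is asynchronous, a client typically polls until the task reaches completed or failed. The status query endpoint is not described on this page, so the sketch below takes a caller-supplied `fetch_status` function (hypothetical) that returns the task as a dict containing a `status` key; the interval and timeout defaults are arbitrary.

```python
import time


def wait_for_task(task_id, fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_status(task_id) until the task is completed or failed."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status(task_id)
        if task.get("status") in ("completed", "failed"):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        sleep(interval)  # still queued or processing; wait before retrying
```

Passing `sleep` as a parameter keeps the loop testable without real delays.
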
Prompt Suggestions:
  • Describe the subject, action, and camera movement
  • Add style keywords such as "cinematic", "realistic", or "dreamlike"
  • If you need camera motion or scene transitions, describe them explicitly