© 2026 ModelStream Inc. All rights reserved.

Create Chat Completion

Model: deepseek-v4-flash

deepseek-v4-flash is a lightweight, efficient MoE (mixture-of-experts) model with 284 billion total parameters, of which 13 billion are activated per inference. It natively supports context windows of up to one million tokens and offers fast, low-latency, cost-effective inference with well-balanced overall performance. It is designed for high-concurrency, lightweight workloads such as everyday dialogue, content creation, basic RAG applications, and batch text processing.

Base URL: https://api.modelstream.ai
Endpoint: POST /v1/chat/completions

Authentication

BearerAuth
Authorization: Bearer <token>

All API requests must be authenticated with a Bearer token in the Authorization header. Make sure your API key is active.

Example: Authorization: Bearer sk-xxxxxx

Parameter Location: Header
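
The header described above can be assembled as in the following minimal Python sketch. The function name build_headers and the environment variable MODELSTREAM_API_KEY are assumptions made for illustration; neither is defined by this page.

```python
import os


def build_headers(api_key: str) -> dict[str, str]:
    """Return the HTTP headers for an authenticated JSON request."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# MODELSTREAM_API_KEY is a hypothetical variable name; the placeholder
# fallback only keeps the sketch runnable without a real key.
headers = build_headers(os.environ.get("MODELSTREAM_API_KEY", "sk-xxxxxx"))
```

Reading the key from the environment rather than hard-coding it keeps it out of source control.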

Request Body

application/json

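These parameters come from the selected model's form_schema. Switching models updates this list and the request example. Note that the request example on this page sends a model field and a messages array (system and user roles) rather than system_prompt and prompt.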

system_prompt (string, optional)
Global instructions or persona for the model.
Example Value: You are a helpful and expert assistant.
Placeholder: e.g., You are a senior software architect...

prompt (string, required)
Example Value: Hello! What can you do?
Placeholder: Enter your question or instructions...

temperature (number, optional)
Higher values make output more random; lower values make it more deterministic.
Example Value: 1
Value Range: 0 ≤ value ≤ 2, step: 0.1

top_p (number, optional)
Nucleus sampling threshold; an alternative to temperature.
Example Value: 1
Value Range: 0 ≤ value ≤ 1, step: 0.05

presence_penalty (number, optional)
Increases the tendency to talk about new topics.
Example Value: 0
Value Range: -2 ≤ value ≤ 2, step: 0.1

frequency_penalty (number, optional)
Reduces the likelihood of repeating the same text verbatim.
Example Value: 0
Value Range: -2 ≤ value ≤ 2, step: 0.1

max_tokens (number, optional)
Example Value: 4096
Value Range: 1 ≤ value ≤ 65536

response_format (string, optional)
Example Value: text
Options: text (Text), json_object (JSON Object)

stream (boolean, optional)
Example Value: true
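
The value ranges listed above can be checked on the client before a request is sent. The following Python sketch is one way to do that; the names RANGES, RESPONSE_FORMATS, and validate_params are hypothetical, and the page does not say whether the server enforces these ranges in the same way.

```python
# Ranges copied from the parameter list above: name -> (minimum, maximum).
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "presence_penalty": (-2, 2),
    "frequency_penalty": (-2, 2),
    "max_tokens": (1, 65536),
}
RESPONSE_FORMATS = {"text", "json_object"}


def validate_params(params: dict) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if not params.get("prompt"):
        problems.append("prompt is required")
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name} must be between {low} and {high}, got {value}")
    fmt = params.get("response_format")
    if fmt is not None and fmt not in RESPONSE_FORMATS:
        problems.append(f"response_format must be one of {sorted(RESPONSE_FORMATS)}")
    return problems
```

Returning every problem at once, rather than raising on the first, lets a caller report all invalid fields in a single pass.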

Response Parameters

application/json
200 Success

id (string, optional)
Identifier of the completion.

object (string, optional)
Object type, for example chat.completion.

created (integer, optional)
Creation time of the completion, as an integer timestamp.

model (string, optional)
Model ID used.

choices (array, optional)
Generated choices. Each entry carries an index, a message, and a finish_reason.

usage (object, optional)
Token usage for the request. It contains the following fields:

  prompt_tokens (integer, optional)
  Number of tokens in the prompt.

  completion_tokens (integer, optional)
  Number of tokens in the completion.

  total_tokens (integer, optional)
  Total number of tokens for the request.

  prompt_tokens_details (object, optional)
  Breakdown of prompt tokens: cached_tokens, text_tokens, audio_tokens, image_tokens.

  completion_tokens_details (object, optional)
  Breakdown of completion tokens: text_tokens, audio_tokens, reasoning_tokens.

system_fingerprint (string, optional)
System fingerprint reported with the response.
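
Because every response field above is marked optional, code that reads token counts should tolerate missing keys. The sketch below shows one defensive way to read the usage object in Python; summarize_usage is a hypothetical helper name, and defaulting missing counts to 0 is a design choice of this example rather than documented behavior.

```python
def summarize_usage(response: dict) -> dict[str, int]:
    """Pull the top-level token counts out of a parsed response body."""
    usage = response.get("usage") or {}
    return {
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }
```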

400 Bad Request (invalid parameters)

error (object, optional)
Error details. It contains the following fields:

  message (string, optional)
  Error message.

  type (string, optional)
  Error type.

  param (string, optional)
  Parameter related to the error.

  code (string, optional)
  Error code.

429 Rate Limited

error (object, optional)
Error details. It contains the following fields:

  message (string, optional)
  Error message.

  type (string, optional)
  Error type.

  param (string, optional)
  Parameter related to the error.

  code (string, optional)
  Error code.
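
The 400 and 429 responses share one error shape, so a single helper can format both. The following Python sketch assumes that message, type, param, and code are nested inside the error object; describe_error and is_retryable are hypothetical names, and retrying only on 429 is a choice made for this example, not a policy stated on this page.

```python
import json


def describe_error(status: int, body: str) -> str:
    """Format an error response body as a single log-friendly line."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        parsed = None
    error = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(error, dict):
        return f"HTTP {status}: unrecognized error body"
    parts = [f"HTTP {status}", error.get("message") or "no message"]
    for field in ("type", "param", "code"):
        if error.get(field):
            parts.append(f"{field}={error[field]}")
    return " | ".join(parts)


def is_retryable(status: int) -> bool:
    """Treat 429 (rate limited) as retryable and 400 (bad parameters) as not."""
    return status == 429
```

A 400 points at a problem in the request itself, so sending the same payload again is unlikely to help, while a 429 may succeed after a pause.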

Request Example

curl -X POST "https://api.modelstream.ai/v1/chat/completions" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "deepseek-v4-flash",
  "messages": [
    {
      "role": "system",
      "content": "You are a senior software architect."
    },
    {
      "role": "user",
      "content": "Explain the trade-offs between microservices and a monolith."
    }
  ],
  "temperature": 0.7,
  "top_p": 1,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "max_tokens": 4096,
  "stream": true
}'
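
The same request can be built in Python with only the standard library, as sketched below. The names API_URL and build_request are hypothetical, and setting stream to false to receive one JSON body instead of a stream is an assumption that this page does not confirm.

```python
import json
import urllib.request

API_URL = "https://api.modelstream.ai/v1/chat/completions"


def build_request(api_key: str, system_prompt: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request shaped like the curl example."""
    payload = {
        "model": "deepseek-v4-flash",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.7,
        "max_tokens": 4096,
        # Assumption: false yields a single JSON body rather than a stream.
        "stream": False,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending is left commented out so the sketch runs without a key or network access:
# with urllib.request.urlopen(build_request("sk-xxxxxx", "...", "...")) as resp:
#     body = json.load(resp)
```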
Response Example (200)

{
  "id": "string",
  "object": "chat.completion",
  "created": 0,
  "model": "string",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "system",
        "content": null,
        "name": "string",
        "tool_calls": [
          {
            "id": "string",
            "type": "function",
            "function": {
              "name": "string",
              "arguments": "string"
            }
          }
        ],
        "tool_call_id": "string",
        "reasoning_content": "string"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "text_tokens": 0,
      "audio_tokens": 0,
      "image_tokens": 0
    },
    "completion_tokens_details": {
      "text_tokens": 0,
      "audio_tokens": 0,
      "reasoning_tokens": 0
    }
  },
  "system_fingerprint": "string"
}
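
A response shaped like the example above can be unpacked as in the following Python sketch. extract_reply is a hypothetical helper name, and treating function.arguments as a JSON-encoded string is an assumption; when decoding fails, the raw string is kept as is.

```python
import json


def extract_reply(response: dict) -> dict:
    """Collect the text, reasoning text, and tool calls from the first choice."""
    choices = response.get("choices") or []
    if not choices:
        return {"content": None, "reasoning": None, "tool_calls": [], "finish_reason": None}
    first = choices[0]
    message = first.get("message") or {}
    tool_calls = []
    for call in message.get("tool_calls") or []:
        function = call.get("function") or {}
        raw_args = function.get("arguments") or ""
        try:
            # Assumption: arguments holds JSON text; fall back to the raw string.
            args = json.loads(raw_args)
        except json.JSONDecodeError:
            args = raw_args
        tool_calls.append({"id": call.get("id"), "name": function.get("name"), "arguments": args})
    return {
        "content": message.get("content"),
        "reasoning": message.get("reasoning_content"),
        "tool_calls": tool_calls,
        "finish_reason": first.get("finish_reason"),
    }
```

Since content can be null in the example, callers should check tool_calls before assuming that a text reply is present.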