Chat Completions
Generate text completions with state-of-the-art language models
Generate chat completions using GPT-4, Claude, Gemini, Llama, and 200+ other top language models. This endpoint is fully OpenAI SDK compatible.
OpenAI Compatible: This endpoint works with the OpenAI SDK - just change the base URL to `https://www.novakit.ai/api/v1`.
Endpoint
```
POST /chat/completions
POST /chat/completions?stream=true
```

Required scope: `chat`
Request Body
```json
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "model": "openai/gpt-4o-mini",
  "temperature": 0.7,
  "max_tokens": 2048,
  "stream": false,
  "web_search": false,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0
}
```

Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `messages` | array | Yes | - | Array of message objects with `role` and `content` |
| `model` | string | No | `openai/gpt-4o-mini` | Model ID from the `/models` endpoint |
| `temperature` | number | No | 0.7 | Sampling temperature (0-2). Higher = more creative |
| `max_tokens` | number | No | 2048 | Maximum tokens to generate |
| `stream` | boolean | No | false | Enable Server-Sent Events (SSE) streaming |
| `web_search` | boolean | No | false | Enable web search for real-time information |
| `top_p` | number | No | 1 | Nucleus sampling threshold (0-1) |
| `frequency_penalty` | number | No | 0 | Reduces token repetition (-2 to 2) |
| `presence_penalty` | number | No | 0 | Encourages topic diversity (-2 to 2) |
| `stop` | string \| array | No | - | Stop sequences that end generation |
| `user` | string | No | - | Unique end-user ID for tracking |
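As an illustration of how the table above fits together, client code often merges caller-supplied options over the documented defaults before sending. This is a sketch, not part of the NovaKit SDK; the default values mirror the table, and `build_payload` is a hypothetical helper:

```python
# Sketch: merge caller options over the documented defaults.
# Defaults mirror the parameters table above; only "messages" is required.
DEFAULTS = {
    "model": "openai/gpt-4o-mini",
    "temperature": 0.7,
    "max_tokens": 2048,
    "stream": False,
    "web_search": False,
    "top_p": 1,
    "frequency_penalty": 0,
    "presence_penalty": 0,
}

def build_payload(messages, **options):
    """Return a /chat/completions request body dict."""
    if not messages:
        raise ValueError("messages must be a non-empty array")
    return {**DEFAULTS, **options, "messages": messages}

payload = build_payload(
    [{"role": "user", "content": "Hello!"}],
    temperature=0.2,
    stop=["\n\n"],
)
```

Anything not passed explicitly falls back to the documented default, so the request stays minimal.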
Message Object
| Field | Type | Required | Description |
|---|---|---|---|
| `role` | string | Yes | One of: `system`, `user`, `assistant`, `tool` |
| `content` | string | Yes | The message content |
| `images` | array | No | Array of image URLs for vision models (up to 4) |
| `name` | string | No | Optional name for the participant |
Vision Support: When using `images`, make sure you're using a vision-capable model such as `openai/gpt-4o`, `anthropic/claude-3.5-sonnet`, or `google/gemini-1.5-pro`.
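Because the `assistant` role carries prior model replies, multi-turn chat is just a messages array you keep appending to between requests. A minimal sketch of that bookkeeping (local list handling only; no request is sent here, and the helper names are illustrative):

```python
# Sketch: maintain a multi-turn conversation as a growing messages array.
history = [{"role": "system", "content": "You are a helpful assistant."}]

def add_user_turn(history, text):
    """Append the end user's next message."""
    history.append({"role": "user", "content": text})

def add_assistant_turn(history, text):
    # In real use, `text` comes from choices[0].message.content
    # of the previous response.
    history.append({"role": "assistant", "content": text})

add_user_turn(history, "Hello!")
add_assistant_turn(history, "Hi! How can I help?")
add_user_turn(history, "Tell me a joke.")
```

Send the full `history` as `messages` on each request so the model sees the whole conversation.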
Response
```json
{
  "id": "chatcmpl-1234567890",
  "object": "chat.completion",
  "created": 1703123456,
  "model": "openai/gpt-4o-mini",
  "model_tier": "standard",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20,
    "credits_used": 5,
    "credits_breakdown": {"base": 5},
    "tokens_remaining_estimate": 99980
  }
}
```

Examples
cURL

```bash
curl -X POST https://www.novakit.ai/api/v1/chat/completions \
  -H "Authorization: Bearer sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful coding assistant."},
      {"role": "user", "content": "Write a Python function to calculate factorial"}
    ],
    "model": "openai/gpt-4o-mini",
    "max_tokens": 500
  }'
```

Python (requests)

```python
import requests

response = requests.post(
    "https://www.novakit.ai/api/v1/chat/completions",
    headers={
        "Authorization": "Bearer sk_your_api_key",
        "Content-Type": "application/json"
    },
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful coding assistant."},
            {"role": "user", "content": "Write a Python function to calculate factorial"}
        ],
        "model": "openai/gpt-4o-mini",
        "max_tokens": 500
    }
)
print(response.json()["choices"][0]["message"]["content"])
```

JavaScript

```javascript
const response = await fetch(
  "https://www.novakit.ai/api/v1/chat/completions",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer sk_your_api_key",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      messages: [
        { role: "system", content: "You are a helpful coding assistant." },
        { role: "user", content: "Write a Python function to calculate factorial" }
      ],
      model: "openai/gpt-4o-mini",
      max_tokens: 500,
    }),
  }
);
const data = await response.json();
console.log(data.choices[0].message.content);
```

OpenAI SDK (Python)

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk_your_api_key",
    base_url="https://www.novakit.ai/api/v1"
)
response = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to calculate factorial"}
    ],
    max_tokens=500
)
print(response.choices[0].message.content)
```

Streaming

Enable streaming to receive tokens as they're generated:

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk_your_api_key",
    base_url="https://www.novakit.ai/api/v1"
)
stream = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a short story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

Vision (Image Analysis)

Send images for analysis with vision-capable models:

```python
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{
        "role": "user",
        "content": "What's in this image?",
        "images": ["https://example.com/image.jpg"]
    }]
)
```

Web Search

Enable web search for up-to-date information:

```bash
curl -X POST https://www.novakit.ai/api/v1/chat/completions \
  -H "Authorization: Bearer sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "What are the latest AI news?"}],
    "web_search": true
  }'
```

Available Models
NovaKit provides access to 200+ models via OpenRouter. Here are some popular choices:
OpenAI Models
| Model | Context | Best For | Tier |
|---|---|---|---|
| `openai/gpt-4o` | 128K | Complex reasoning, vision, coding | Standard |
| `openai/gpt-4o-mini` | 128K | Fast, cost-effective, general use | Basic |
| `openai/gpt-4-turbo` | 128K | Highest quality OpenAI model | Powerful |
| `openai/o1-preview` | 128K | Advanced reasoning, math | Powerful |
| `openai/o1-mini` | 128K | Fast reasoning tasks | Standard |
Anthropic Models
| Model | Context | Best For | Tier |
|---|---|---|---|
| `anthropic/claude-3.5-sonnet` | 200K | Best balance of speed & quality | Standard |
| `anthropic/claude-3-opus` | 200K | Most capable, complex analysis | Powerful |
| `anthropic/claude-3-haiku` | 200K | Ultra-fast, cost-effective | Basic |
Google Models
| Model | Context | Best For | Tier |
|---|---|---|---|
| `google/gemini-1.5-pro` | 1M | Massive context, multimodal | Standard |
| `google/gemini-1.5-flash` | 1M | Fast, large context | Basic |
| `google/gemini-2.0-flash` | 1M | Latest Gemini, fast | Basic |
Open Source Models
| Model | Context | Best For | Tier |
|---|---|---|---|
| `meta-llama/llama-3.1-405b` | 128K | Best open-source, reasoning | Powerful |
| `meta-llama/llama-3.1-70b` | 128K | Strong general purpose | Standard |
| `meta-llama/llama-3.1-8b` | 128K | Fast, lightweight | Basic |
| `mistralai/mixtral-8x22b` | 64K | Coding, analysis | Standard |
| `deepseek/deepseek-chat` | 64K | Coding, math | Standard |
Use the Models endpoint to get the complete list of 200+ available models with real-time pricing and availability.
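When selecting from that list programmatically, you typically filter by tier. The response shape below (a list of objects with `id` and `tier` fields) is an assumption for illustration; check the Models endpoint for the actual schema:

```python
# Sketch: pick models by tier from a /models-style listing.
# The listing shape (objects with "id" and "tier") is an assumption,
# seeded here with entries from the tables above.
SAMPLE_MODELS = [
    {"id": "openai/gpt-4o", "tier": "standard"},
    {"id": "openai/gpt-4o-mini", "tier": "basic"},
    {"id": "anthropic/claude-3-opus", "tier": "powerful"},
]

def models_in_tier(models, tier):
    """Return model IDs in the given tier."""
    return [m["id"] for m in models if m["tier"] == tier]

basic_models = models_in_tier(SAMPLE_MODELS, "basic")
```

In practice you would fetch the live list first and fall back to a cheaper tier when a preferred model is unavailable.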
Model Tiers & Credits
Different model tiers consume credits at different rates:
| Tier | Multiplier | Description |
|---|---|---|
| Basic | 1x | Fast, cost-effective models |
| Standard | 1.5-2x | Balanced quality and speed |
| Powerful | 2-3x | Highest capability models |
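The multipliers above translate into a simple cost estimate. This is a sketch only, assuming credits scale linearly with the tier multiplier and using the upper bound of each range; NovaKit's actual billing formula and base rates are not shown here:

```python
# Sketch: estimate credit cost from the tier multipliers in the table above.
# Upper bounds of each range are used; actual billing may differ.
TIER_MULTIPLIER = {"basic": 1.0, "standard": 2.0, "powerful": 3.0}

def estimate_credits(base_credits, tier):
    """Worst-case credit estimate for a request at the given tier."""
    return base_credits * TIER_MULTIPLIER[tier]

worst_case = estimate_credits(5, "powerful")
```

So a request costing 5 base credits on a Basic model could cost up to 15 on a Powerful one.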
Error Handling
| Status | Code | Description |
|---|---|---|
| 400 | invalid_request | Malformed request body |
| 401 | unauthorized | Invalid or missing API key |
| 402 | quota_exceeded | Token quota exhausted |
| 403 | forbidden | API key missing chat scope |
| 429 | rate_limited | Too many requests |
| 500 | server_error | Internal error (retry safe) |
Error responses use this shape:

```json
{
  "error": "Quota exceeded for chat_tokens",
  "code": "quota_exceeded",
  "remaining": 0
}
```
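Since 429 and 500 are the retryable statuses in the table above, a common client pattern is exponential backoff. A sketch, using a pluggable `send` callable so the retry logic can be shown without a live API call (the helper and its signature are illustrative, not part of any NovaKit SDK):

```python
import time

# Sketch: retry 429 (rate_limited) and 500 (server_error) with
# exponential backoff; all other statuses are returned immediately.
RETRYABLE = {429, 500}

def post_with_retry(send, max_attempts=4, base_delay=0.5):
    """`send` performs one request and returns (status, body)."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    return status, body

# Simulated transport: rate-limited twice, then succeeds.
calls = iter([(429, None), (429, None), (200, {"ok": True})])
status, body = post_with_retry(lambda: next(calls), base_delay=0.0)
```

Errors like 401 and 402 are not retried, since repeating the request cannot fix a bad key or an exhausted quota.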