Chat Completions
Generate text completions with state-of-the-art language models
Generate chat completions using GPT-4, Claude, Gemini, Llama, and 200+ other top language models. This endpoint is fully OpenAI SDK compatible.
OpenAI Compatible: This endpoint works with the OpenAI SDK - just change the base URL to `https://www.novakit.ai/api/v1`.
Endpoint
```
POST /chat/completions
POST /chat/completions?stream=true
```

Required scope: `chat`
Request Body
```json
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "model": "openai/gpt-4o-mini",
  "temperature": 0.7,
  "max_tokens": 2048,
  "stream": false,
  "web_search": false,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0
}
```

Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `messages` | array | Yes | - | Array of message objects with `role` and `content` |
| `model` | string | No | `openai/gpt-4o-mini` | Model ID from the `/models` endpoint |
| `temperature` | number | No | 0.7 | Sampling temperature (0-2). Higher = more creative |
| `max_tokens` | number | No | 2048 | Maximum tokens to generate |
| `stream` | boolean | No | false | Enable Server-Sent Events (SSE) streaming |
| `web_search` | boolean | No | false | Enable web search for real-time information |
| `top_p` | number | No | 1 | Nucleus sampling threshold (0-1) |
| `frequency_penalty` | number | No | 0 | Reduces token repetition (-2 to 2) |
| `presence_penalty` | number | No | 0 | Encourages topic diversity (-2 to 2) |
| `stop` | string \| array | No | - | Stop sequences that end generation |
| `user` | string | No | - | Unique end-user ID for tracking |
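As an illustration of how the table above fits together, client code often merges caller-supplied options over the documented defaults before sending. This is a sketch, not part of the NovaKit SDK; the default values mirror the table, and `build_payload` is a hypothetical helper:

```python
# Sketch: merge caller options over the documented defaults.
# Defaults mirror the parameters table above; only "messages" is required.
DEFAULTS = {
    "model": "openai/gpt-4o-mini",
    "temperature": 0.7,
    "max_tokens": 2048,
    "stream": False,
    "web_search": False,
    "top_p": 1,
    "frequency_penalty": 0,
    "presence_penalty": 0,
}

def build_payload(messages, **options):
    """Return a /chat/completions request body dict."""
    if not messages:
        raise ValueError("messages must be a non-empty array")
    return {**DEFAULTS, **options, "messages": messages}

payload = build_payload(
    [{"role": "user", "content": "Hello!"}],
    temperature=0.2,
    stop=["\n\n"],
)
```

Anything not passed explicitly falls back to the documented default, so the request stays minimal.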
Message Object
| Field | Type | Required | Description |
|---|---|---|---|
| `role` | string | Yes | One of: `system`, `user`, `assistant`, `tool` |
| `content` | string | Yes | The message content |
| `images` | array | No | Array of image URLs for vision models (up to 4) |
| `name` | string | No | Optional name for the participant |
Vision Support: When using `images`, make sure you're using a vision-capable model such as `openai/gpt-4o`, `anthropic/claude-3.5-sonnet`, or `google/gemini-1.5-pro`.
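Because the `assistant` role carries prior model replies, multi-turn chat is just a messages array you keep appending to between requests. A minimal sketch of that bookkeeping (local list handling only; no request is sent here, and the helper names are illustrative):

```python
# Sketch: maintain a multi-turn conversation as a growing messages array.
history = [{"role": "system", "content": "You are a helpful assistant."}]

def add_user_turn(history, text):
    """Append the end user's next message."""
    history.append({"role": "user", "content": text})

def add_assistant_turn(history, text):
    # In real use, `text` comes from choices[0].message.content
    # of the previous response.
    history.append({"role": "assistant", "content": text})

add_user_turn(history, "Hello!")
add_assistant_turn(history, "Hi! How can I help?")
add_user_turn(history, "Tell me a joke.")
```

Send the full `history` as `messages` on each request so the model sees the whole conversation.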
Response
```json
{
  "id": "chatcmpl-1234567890",
  "object": "chat.completion",
  "created": 1703123456,
  "model": "openai/gpt-4o-mini",
  "model_tier": "standard",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20,
    "credits_used": 5,
    "credits_breakdown": {"base": 5},
    "tokens_remaining_estimate": 99980
  }
}
```

Examples
cURL

```bash
curl -X POST https://www.novakit.ai/api/v1/chat/completions \
  -H "Authorization: Bearer sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful coding assistant."},
      {"role": "user", "content": "Write a Python function to calculate factorial"}
    ],
    "model": "openai/gpt-4o-mini",
    "max_tokens": 500
  }'
```

Python (requests)

```python
import requests

response = requests.post(
    "https://www.novakit.ai/api/v1/chat/completions",
    headers={
        "Authorization": "Bearer sk_your_api_key",
        "Content-Type": "application/json"
    },
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful coding assistant."},
            {"role": "user", "content": "Write a Python function to calculate factorial"}
        ],
        "model": "openai/gpt-4o-mini",
        "max_tokens": 500
    }
)
print(response.json()["choices"][0]["message"]["content"])
```

JavaScript

```javascript
const response = await fetch(
  "https://www.novakit.ai/api/v1/chat/completions",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer sk_your_api_key",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      messages: [
        { role: "system", content: "You are a helpful coding assistant." },
        { role: "user", content: "Write a Python function to calculate factorial" }
      ],
      model: "openai/gpt-4o-mini",
      max_tokens: 500,
    }),
  }
);
const data = await response.json();
console.log(data.choices[0].message.content);
```

OpenAI SDK (Python)

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk_your_api_key",
    base_url="https://www.novakit.ai/api/v1"
)
response = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to calculate factorial"}
    ],
    max_tokens=500
)
print(response.choices[0].message.content)
```

Streaming

Enable streaming to receive tokens as they're generated:

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk_your_api_key",
    base_url="https://www.novakit.ai/api/v1"
)
stream = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a short story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

Vision (Image Analysis)

Send images for analysis with vision-capable models:

```python
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{
        "role": "user",
        "content": "What's in this image?",
        "images": ["https://example.com/image.jpg"]
    }]
)
```

Web Search

Enable web search for up-to-date information:

```bash
curl -X POST https://www.novakit.ai/api/v1/chat/completions \
  -H "Authorization: Bearer sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "What are the latest AI news?"}],
    "web_search": true
  }'
```

Available Models
NovaKit provides access to 200+ models via OpenRouter. Here are some popular choices:
OpenAI Models
| Model | Context | Best For | Tier |
|---|---|---|---|
| `openai/gpt-4o` | 128K | Complex reasoning, vision, coding | Standard |
| `openai/gpt-4o-mini` | 128K | Fast, cost-effective, general use | Basic |
| `openai/gpt-4-turbo` | 128K | Highest quality OpenAI model | Powerful |
| `openai/o1-preview` | 128K | Advanced reasoning, math | Powerful |
| `openai/o1-mini` | 128K | Fast reasoning tasks | Standard |
Anthropic Models
| Model | Context | Best For | Tier |
|---|---|---|---|
| `anthropic/claude-3.5-sonnet` | 200K | Best balance of speed & quality | Standard |
| `anthropic/claude-3-opus` | 200K | Most capable, complex analysis | Powerful |
| `anthropic/claude-3-haiku` | 200K | Ultra-fast, cost-effective | Basic |
Google Models
| Model | Context | Best For | Tier |
|---|---|---|---|
| `google/gemini-1.5-pro` | 1M | Massive context, multimodal | Standard |
| `google/gemini-1.5-flash` | 1M | Fast, large context | Basic |
| `google/gemini-2.0-flash` | 1M | Latest Gemini, fast | Basic |
Open Source Models
| Model | Context | Best For | Tier |
|---|---|---|---|
| `meta-llama/llama-3.1-405b` | 128K | Best open-source, reasoning | Powerful |
| `meta-llama/llama-3.1-70b` | 128K | Strong general purpose | Standard |
| `meta-llama/llama-3.1-8b` | 128K | Fast, lightweight | Basic |
| `mistralai/mixtral-8x22b` | 64K | Coding, analysis | Standard |
| `deepseek/deepseek-chat` | 64K | Coding, math | Standard |
Use the Models endpoint to get the complete list of 200+ available models with real-time pricing and availability.
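When selecting from that list programmatically, you typically filter by tier. The response shape below (a list of objects with `id` and `tier` fields) is an assumption for illustration; check the Models endpoint for the actual schema:

```python
# Sketch: pick models by tier from a /models-style listing.
# The listing shape (objects with "id" and "tier") is an assumption,
# seeded here with entries from the tables above.
SAMPLE_MODELS = [
    {"id": "openai/gpt-4o", "tier": "standard"},
    {"id": "openai/gpt-4o-mini", "tier": "basic"},
    {"id": "anthropic/claude-3-opus", "tier": "powerful"},
]

def models_in_tier(models, tier):
    """Return model IDs in the given tier."""
    return [m["id"] for m in models if m["tier"] == tier]

basic_models = models_in_tier(SAMPLE_MODELS, "basic")
```

In practice you would fetch the live list first and fall back to a cheaper tier when a preferred model is unavailable.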
Model Tiers & Credits
Different model tiers consume credits at different rates:
| Tier | Multiplier | Description |
|---|---|---|
| Basic | 1x | Fast, cost-effective models |
| Standard | 1.5-2x | Balanced quality and speed |
| Powerful | 2-3x | Highest capability models |
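The multipliers above translate into a simple cost estimate. This is a sketch only, assuming credits scale linearly with the tier multiplier and using the upper bound of each range; NovaKit's actual billing formula and base rates are not shown here:

```python
# Sketch: estimate credit cost from the tier multipliers in the table above.
# Upper bounds of each range are used; actual billing may differ.
TIER_MULTIPLIER = {"basic": 1.0, "standard": 2.0, "powerful": 3.0}

def estimate_credits(base_credits, tier):
    """Worst-case credit estimate for a request at the given tier."""
    return base_credits * TIER_MULTIPLIER[tier]

worst_case = estimate_credits(5, "powerful")
```

So a request costing 5 base credits on a Basic model could cost up to 15 on a Powerful one.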
Error Handling
| Status | Code | Description |
|---|---|---|
| 400 | invalid_request | Malformed request body |
| 401 | unauthorized | Invalid or missing API key |
| 402 | quota_exceeded | Token quota exhausted |
| 403 | forbidden | API key missing chat scope |
| 429 | rate_limited | Too many requests |
| 500 | server_error | Internal error (retry safe) |
Error responses use this shape:

```json
{
  "error": "Quota exceeded for chat_tokens",
  "code": "quota_exceeded",
  "remaining": 0
}
```
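Since 429 and 500 are the retryable statuses in the table above, a common client pattern is exponential backoff. A sketch, using a pluggable `send` callable so the retry logic can be shown without a live API call (the helper and its signature are illustrative, not part of any NovaKit SDK):

```python
import time

# Sketch: retry 429 (rate_limited) and 500 (server_error) with
# exponential backoff; all other statuses are returned immediately.
RETRYABLE = {429, 500}

def post_with_retry(send, max_attempts=4, base_delay=0.5):
    """`send` performs one request and returns (status, body)."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    return status, body

# Simulated transport: rate-limited twice, then succeeds.
calls = iter([(429, None), (429, None), (200, {"ok": True})])
status, body = post_with_retry(lambda: next(calls), base_delay=0.0)
```

Errors like 401 and 402 are not retried, since repeating the request cannot fix a bad key or an exhausted quota.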