Upload documents and chat with them using AI. Get answers with citations from PDFs, URLs, YouTube videos, and more.

Document Chat

Chat with your documents using AI. Upload PDFs, paste URLs, import YouTube videos, or paste text — then ask questions and get intelligent answers with citations.

Document Chat is available on Pro and higher plans. Access it in AI → Documents.

Overview

Document Chat uses Retrieval-Augmented Generation (RAG) to provide accurate, grounded answers from your content. Every response includes citations linking back to the exact source passages.

Supported Sources

Source Type	Description	Formats
File Upload	Upload documents directly	PDF, DOCX, MD, TXT, CSV
URL Import	Extract content from web pages	Any public URL
YouTube	Import video transcripts	Any YouTube video with captions
Paste Text	Paste content directly	Plain text, Markdown

Getting Started

Add a Source

Navigate to AI → Documents and click Add Source. Choose from:

Upload File: Drag & drop or select files
URL: Paste a web page URL
YouTube: Paste a YouTube video URL
Paste Text: Paste text content directly

Wait for Processing

Each source is processed automatically in four stages:

Parsing: Content is extracted from the source
Chunking: Content is split into semantic chunks
Embedding: Chunks are converted to vector embeddings
Indexing: Content is indexed for semantic search
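
To make the four stages concrete, here is a minimal, self-contained sketch of a chunk, embed, index, and search loop. It is not the product's implementation. Chunking is done on paragraph boundaries, and a bag-of-words count stands in for a real embedding model. All function names (`chunk_text`, `embed`, `build_index`, `search`) and the `max_chars` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Pack whole paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept as one oversized chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if current and len(current) + len(para) + 1 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase term counts (an assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(text: str, max_chars: int = 200) -> list[tuple[str, Counter]]:
    """Index = list of (chunk, embedding) pairs held in memory."""
    return [(chunk, embed(chunk)) for chunk in chunk_text(text, max_chars)]


def search(index: list[tuple[str, Counter]], query: str, top_k: int = 3) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]
```

Calling `search(build_index(document_text), "pricing per month", top_k=1)` returns the single chunk whose wording overlaps most with the question. A production system would replace `embed` with a learned embedding model and the in-memory list with a vector index.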

Start Chatting

Click on any document to open the chat interface. Ask questions in natural language and get AI-powered answers with citations.

Source Types

File Upload

Upload documents directly from your computer. Supported formats:

Format	Extension	Max Size	Notes
PDF	`.pdf`	10MB	Scanned PDFs not supported
Word	`.docx`	10MB	Modern Word format only
Markdown	`.md`	10MB	GitHub-flavored Markdown
Text	`.txt`	10MB	Plain text files
CSV	`.csv`	10MB	Tabular data

Scanned PDFs (image-based) are not currently supported. Documents must contain selectable text.
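
If you upload files programmatically, a client-side pre-check along these lines can catch the two most common rejections before the upload starts. This is a sketch under assumptions: the function name `check_upload` is hypothetical, and "10MB" is interpreted as 10 × 1024 × 1024 bytes. It does not detect scanned PDFs, which requires inspecting the file contents.

```python
from pathlib import Path

# Assumption: the documented 10MB limit is treated as 10 MiB.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".md", ".txt", ".csv"}


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems: list[str] = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"Unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("File too large")
    return problems
```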

URL Import

Import content from any public web page:

# Supported URLs
https://example.com/article
https://docs.example.com/guide
https://blog.example.com/post

The system will:

Fetch the page content
Extract the main article/content
Remove navigation, ads, and boilerplate
Preserve headings, lists, and formatting

Pages behind authentication or paywalls cannot be imported. The URL must be publicly accessible.
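
The extraction steps listed above can be illustrated with a small sketch built on Python's standard `html.parser` module. It is only an approximation of the idea, not the importer's actual logic: it drops everything inside a fixed set of boilerplate tags and keeps headings, paragraphs, and list items. The class name, the tag sets, and the Markdown-style output are all assumptions.

```python
from html.parser import HTMLParser

# Assumption: these tags are treated as boilerplate and skipped entirely.
SKIP_TAGS = {"nav", "header", "footer", "aside", "script", "style"}
# Assumption: only these block tags carry main content; value is the output prefix.
BLOCK_PREFIX = {"h1": "# ", "h2": "## ", "h3": "### ", "p": "", "li": "- "}


class MainContentExtractor(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.skip_depth = 0      # > 0 while inside a boilerplate tag
        self.current_tag = None  # block tag currently being collected
        self.blocks: list[tuple[str, list[str]]] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_PREFIX and self.skip_depth == 0:
            self.current_tag = tag
            self.blocks.append((tag, []))

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == self.current_tag:
            self.current_tag = None

    def handle_data(self, data):
        # Inline tags such as <em> or <a> do not change current_tag,
        # so their text is appended to the enclosing block.
        if self.skip_depth == 0 and self.current_tag is not None:
            self.blocks[-1][1].append(data)


def extract_main_text(html: str) -> str:
    """Return headings, paragraphs, and list items as Markdown-like lines."""
    parser = MainContentExtractor()
    parser.feed(html)
    parser.close()
    lines = []
    for tag, pieces in parser.blocks:
        text = " ".join("".join(pieces).split())  # collapse whitespace
        if text:
            lines.append(BLOCK_PREFIX[tag] + text)
    return "\n".join(lines)
```

Real pages are messier than this handles (nested blocks, content in `div` elements, client-rendered text), so treat it as a way to reason about what survives extraction rather than a replacement for it.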

YouTube Import

Import YouTube video transcripts:

# Supported URL formats
https://www.youtube.com/watch?v=VIDEO_ID
https://youtu.be/VIDEO_ID

Requirements:

Video must have captions (auto-generated or manual)
Videos without captions cannot be imported

The system imports the full transcript with timestamps, allowing you to ask questions about any part of the video.
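
If you collect video links before importing them, you may want to normalize the two supported URL formats to a video ID first. The helper below is a sketch using only the standard library. The function name `extract_video_id` is hypothetical, and it deliberately recognizes only the two formats listed above.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for a supported YouTube URL, or None otherwise."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in ("www.youtube.com", "youtube.com"):
        # Long form: https://www.youtube.com/watch?v=VIDEO_ID
        if parsed.path == "/watch":
            ids = parse_qs(parsed.query).get("v")
            return ids[0] if ids else None
        return None
    if host == "youtu.be":
        # Short form: https://youtu.be/VIDEO_ID
        return parsed.path.lstrip("/") or None
    return None
```

A `None` result means the URL does not match either format. It says nothing about whether the video has captions, which can only be determined at import time.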

Paste Text

Paste any text content directly:

Meeting notes
Email threads
Code snippets
Research notes

Chat Interface

Asking Questions

Ask questions in natural language:

"What are the key findings in this report?"
"Summarize the main arguments"
"What does it say about X?"
"Compare the approaches mentioned in sections 2 and 3"

Citations

Every AI response includes numbered citations:

The report indicates revenue grew 35% year-over-year [1],
driven primarily by enterprise expansion [2].

Click any citation to:

View the original passage
See the page number or location
Navigate to that section in the document

Chat Settings

Customize chat behavior:

Setting	Options	Description
Model	GPT-4, Claude 3.5, etc.	AI model for responses
Temperature	0.0 - 1.0	Response creativity
Response Length	Concise, Balanced, Detailed	Output verbosity
Max Sources	1-10	Number of chunks to retrieve
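
If you keep these settings in your own code, a small validated container can reject out-of-range values early. The sketch below mirrors the ranges in the table. The field names, the lowercase option strings, and every default value are assumptions for illustration, not documented API parameters.

```python
from dataclasses import dataclass

RESPONSE_LENGTHS = ("concise", "balanced", "detailed")


@dataclass(frozen=True)
class ChatSettings:
    # All defaults below are illustrative assumptions.
    model: str = "gpt-4o"
    temperature: float = 0.2
    response_length: str = "balanced"
    max_sources: int = 5

    def __post_init__(self) -> None:
        if not 0.0 <= self.temperature <= 1.0:
            raise ValueError("temperature must be between 0.0 and 1.0")
        if self.response_length not in RESPONSE_LENGTHS:
            raise ValueError(f"response_length must be one of {RESPONSE_LENGTHS}")
        if not 1 <= self.max_sources <= 10:
            raise ValueError("max_sources must be between 1 and 10")
```

For example, `ChatSettings(temperature=0.0, max_sources=10)` is accepted, while `ChatSettings(max_sources=11)` raises `ValueError`.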

Collections

Organize documents into collections for:

Project Organization: Group related documents
Multi-Document Chat: Chat across an entire collection
Team Collaboration: Share collections with team members

Creating Collections

Click New Collection in the sidebar
Enter a name and optional description
Choose a color for visual organization
Click Create

Moving Documents

Drag documents into collections
Use the document menu → Move to Collection
A document can belong to only one collection at a time

Collection Chat

Chat with all documents in a collection at once:

Select a collection in the sidebar
Click Chat with Collection
Ask questions that span multiple documents

API Access

Access Document Chat via the REST API.

Upload Document

curl -X POST https://api.novakit.ai/v1/documents/upload \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F "collection_id=optional-collection-id"

Import from URL

curl -X POST https://api.novakit.ai/v1/documents/import \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_type": "url",
    "url": "https://example.com/article",
    "name": "Example Article"
  }'

Import YouTube

curl -X POST https://api.novakit.ai/v1/documents/import \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_type": "youtube",
    "url": "https://www.youtube.com/watch?v=VIDEO_ID"
  }'
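
Both import calls share one endpoint and differ only in `source_type`, so the JSON body can be built by a single helper. The sketch below only constructs the body and sends nothing. The function name `build_import_body` is hypothetical, and treating `name` as optional is an assumption based on the YouTube example omitting it.

```python
import json
from typing import Optional

IMPORT_SOURCE_TYPES = ("url", "youtube")


def build_import_body(source_type: str, url: str, name: Optional[str] = None) -> str:
    """Return the JSON body for POST /v1/documents/import."""
    if source_type not in IMPORT_SOURCE_TYPES:
        raise ValueError(f"source_type must be one of {IMPORT_SOURCE_TYPES}")
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an http(s) URL")
    body = {"source_type": source_type, "url": url}
    if name is not None:
        # Assumption: "name" may be omitted, as in the YouTube example.
        body["name"] = name
    return json.dumps(body)
```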

Chat with Document

curl -X POST https://api.novakit.ai/v1/documents/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_ids": ["doc_123", "doc_456"],
    "message": "What are the key findings?",
    "model": "gpt-4o"
  }'
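
The same chat call can be prepared from Python with the standard library. This sketch builds the request object without sending it, so you can inspect it first. The endpoint, headers, and body fields come from the curl example, while the function name `build_chat_request` and the default `model` value are assumptions.

```python
import json
import urllib.request

API_BASE = "https://api.novakit.ai/v1"


def build_chat_request(api_key: str, document_ids: list[str], message: str,
                       model: str = "gpt-4o") -> urllib.request.Request:
    """Build (but do not send) a POST request to /documents/chat."""
    payload = {
        "document_ids": list(document_ids),
        "message": message,
        "model": model,
    }
    return urllib.request.Request(
        url=f"{API_BASE}/documents/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen` and decode the JSON response body.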

Response Format

{
  "content": "The key findings include...[1]...[2]",
  "citations": [
    {
      "index": 1,
      "chunk_id": "chunk_abc",
      "document_id": "doc_123",
      "text": "Original passage...",
      "page": 5
    }
  ],
  "usage": {
    "input_tokens": 1234,
    "output_tokens": 567
  }
}
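
When you render a response yourself, you will typically want to connect each `[n]` marker in `content` to the matching entry in `citations`. The sketch below assumes the response shape shown above and that `index` corresponds to the marker number. The function name `resolve_citations` is hypothetical.

```python
import re


def resolve_citations(response: dict) -> dict:
    """Map each [n] marker found in content to its citation entry.

    Markers with no matching citation map to None, so callers can
    decide how to render them.
    """
    by_index = {c["index"]: c for c in response.get("citations", [])}
    markers = [int(m) for m in re.findall(r"\[(\d+)\]", response.get("content", ""))]
    # dict.fromkeys removes duplicate markers while keeping first-seen order.
    return {n: by_index.get(n) for n in dict.fromkeys(markers)}
```

Each resolved entry carries `document_id`, `text`, and `page`, which is enough to show the source passage next to the answer.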

Billing

Document Chat uses credits from your Chat Tokens quota:

Operation	Credits
Document embedding	~1 credit per 1000 characters
Chat query	Standard chat rates by model

Check your quota usage in Dashboard → Usage.
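
For a rough budget before uploading, the embedding rate can be turned into a quick estimate. Because the documented rate is approximate, treat the result as a ballpark figure. Rounding up to a whole credit is an assumption, and the function name `estimate_embedding_credits` is hypothetical.

```python
import math

CHARS_PER_CREDIT = 1000  # documented as approximate ("~1 credit per 1000 characters")


def estimate_embedding_credits(text: str) -> int:
    """Estimate embedding credits for a text, rounding up (an assumption)."""
    return math.ceil(len(text) / CHARS_PER_CREDIT)
```

Under these assumptions, a 2,500-character document is estimated at 3 credits. Chat queries are billed separately at the model's standard rate.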

Best Practices

Document Preparation

Clean formatting: Remove unnecessary headers/footers
Text-based PDFs: Ensure PDFs contain selectable text
Reasonable length: Very long documents may have lower retrieval accuracy

Effective Questions

Be specific: Ask "What does section 3 say about pricing?" rather than "Tell me about pricing"
Reference context: "Based on the Q4 report, what were the growth metrics?"
Ask follow-ups: Build on previous answers for deeper exploration

Collection Strategy

Group by project or topic
Keep collections focused (5-20 documents)
Use descriptive names

Troubleshooting

Document Processing Failed

Error	Cause	Solution
"Unsupported file type"	Wrong format	Use PDF, DOCX, MD, TXT, or CSV
"File too large"	Exceeds 10MB	Split into smaller files
"No text content"	Scanned PDF	Use text-based documents
"Failed to fetch URL"	Inaccessible page	Check URL is public
"No transcript available"	No YouTube captions	Use video with captions

Chat Issues

Issue	Solution
Irrelevant answers	Rephrase question, be more specific
Missing citations	Increase "Max Sources" setting
Slow responses	Reduce response length setting

Retry Failed Documents

For URL and YouTube imports that failed:

Go to AI → Documents
Find the failed document
Click Retry to attempt processing again

File uploads cannot be retried. You'll need to upload the file again.

Limits

Limit	Free	Pro	Team
Documents	5	Unlimited	Unlimited
File size	10MB	10MB	10MB
Collections	1	Unlimited	Unlimited
API access	No	Yes	Yes
