The Knowledge Base gives your agent access to a collection of documents. When a user asks a question, the agent retrieves the most relevant passages and uses them to construct an accurate, grounded answer. This technique is called RAG — Retrieval-Augmented Generation.

Prerequisites

An Embeddings Model must be configured in Settings before you can use the Knowledge Base. If you don’t see an embeddings selector here, add one first.

Uploading documents

Drag and drop files into the upload area or click to browse. Uploaded documents are:
  1. Processed and chunked into passages
  2. Embedded using the selected embeddings model
  3. Stored in the vector database (Qdrant)
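As a rough illustration of the chunking step, one common scheme splits text into fixed-size overlapping windows. The `chunk_text` helper below is a hypothetical sketch, not the platform's actual chunker; the embed-and-store steps are shown only as comments because they depend on your configured embeddings model and Qdrant instance.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (one simple chunking scheme).

    Overlap keeps a passage's tail visible at the start of the next chunk,
    so sentences cut at a boundary still appear whole somewhere.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# In the real pipeline, each chunk would then be embedded and upserted, e.g.:
#   vectors = [embeddings_model.embed(chunk) for chunk in chunks]
#   qdrant.upsert(collection, points=zip(ids, vectors, chunks))
```

Production chunkers usually split on sentence or paragraph boundaries rather than raw character counts, but the window-plus-overlap idea is the same.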

Retrieval parameters

These parameters control how the agent searches your documents when answering a question.

Search type

| Type | Description | Best for |
| --- | --- | --- |
| similarity | Pure semantic similarity: finds the most relevant chunks | Specific, focused questions |
| mmr | Maximal Marginal Relevance: balances relevance with diversity | Broad questions, avoiding repetitive passages |

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| K | 4 | Number of document chunks retrieved per query |
| Score Threshold | none | Minimum similarity score (0.0–1.0). Chunks below this are excluded. |
| Fetch K | | MMR only: candidates to evaluate before selecting the final K |
| Lambda Mult | | MMR only: diversity weight (0 = max diversity, 1 = max relevance) |
Start with similarity, K=4, and no threshold. Run a few test queries in the Test Drawer. If answers are repetitive, switch to mmr. If answers include irrelevant content, add a score threshold (try 0.7 first).
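For intuition, MMR can be sketched in a few lines of plain Python. The `mmr` function below is an illustrative implementation of the standard algorithm, not the platform's code; `lambda_mult` behaves as in the table above (0 = max diversity, 1 = max relevance).

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def mmr(query_vec, candidates, k=4, lambda_mult=0.5):
    """Select k candidate indices, trading query relevance against
    redundancy with results already selected."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lambda_mult` near 1 a near-duplicate of an already-selected chunk can still win on relevance alone; lowering the value penalizes that redundancy and pulls in more varied passages.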

How retrieval works at runtime

When the agent receives a message:
  1. The query is embedded using the same embeddings model
  2. The vector database finds the K most similar document chunks
  3. Those chunks are injected into the agent’s prompt as context
  4. The LLM uses this context to generate a grounded answer
The agent cites information from your documents rather than relying solely on what the LLM learned during training.
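The four steps above can be sketched end to end. Everything here (`store`, `embed`, `llm`, and the prompt wording) is a hypothetical stand-in for the platform's internals, shown only to make the data flow concrete.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def answer(query, store, embed, llm, k=4, score_threshold=None):
    """store: list of (vector, chunk_text) pairs; embed/llm: callables."""
    query_vec = embed(query)                              # 1. embed the query
    scored = sorted(((cosine(query_vec, vec), text)
                     for vec, text in store), reverse=True)
    hits = scored[:k]                                     # 2. K most similar chunks
    if score_threshold is not None:                       #    optional score filter
        hits = [(s, t) for s, t in hits if s >= score_threshold]
    context = "\n\n".join(text for _, text in hits)       # 3. inject as context
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm(prompt)                                    # 4. grounded generation
```

In the real service the vector search runs inside Qdrant rather than in a Python loop, but the shape of the flow is the same.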

Next steps

Embeddings Models

Configure the embeddings model required for the knowledge base.

Memory

Add conversation memory alongside document retrieval.