The Knowledge Base gives your agent access to a collection of documents. When a user asks a question, the agent retrieves the most relevant passages and uses them to construct an accurate, grounded answer. This technique is called RAG — Retrieval-Augmented Generation.

Prerequisites

An Embeddings Model must be configured in Settings before you can use the Knowledge Base. If you don’t see an embeddings selector here, add one first.

Uploading documents

Drag and drop files into the upload area or click to browse. Uploaded documents are:
  1. Processed and chunked into passages
  2. Embedded using the selected embeddings model
  3. Stored in the vector database (Qdrant)
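As a rough illustration of the chunking step, one common scheme splits text into fixed-size overlapping windows. The `chunk_text` helper below is a hypothetical sketch, not the platform's actual chunker; the embed-and-store steps are shown only as comments because they depend on your configured embeddings model and Qdrant instance.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (one simple chunking scheme).

    Overlap keeps a passage's tail visible at the start of the next chunk,
    so sentences cut at a boundary still appear whole somewhere.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# In the real pipeline, each chunk would then be embedded and upserted, e.g.:
#   vectors = [embeddings_model.embed(chunk) for chunk in chunks]
#   qdrant.upsert(collection, points=zip(ids, vectors, chunks))
```

Production chunkers usually split on sentence or paragraph boundaries rather than raw character counts, but the window-plus-overlap idea is the same.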

Retrieval parameters

These parameters control how the agent searches your documents when answering a question.

Search type

| Type | Description | Best for |
| --- | --- | --- |
| similarity | Pure semantic similarity: finds the most relevant chunks | Specific, focused questions |
| mmr | Maximal Marginal Relevance: balances relevance with diversity | Broad questions, avoiding repetitive passages |

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| K | 4 | Number of document chunks retrieved per query |
| Score Threshold | none | Minimum similarity score (0.0–1.0). Chunks below this are excluded. |
| Fetch K | | MMR only: candidates to evaluate before selecting the final K |
| Lambda Mult | | MMR only: diversity weight (0 = max diversity, 1 = max relevance) |
Start with similarity, K=4, and no threshold. Run a few test queries in the Test Drawer. If answers are repetitive, switch to mmr. If answers include irrelevant content, add a score threshold (try 0.7 first).
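For intuition, MMR can be sketched in a few lines of plain Python. The `mmr` function below is an illustrative implementation of the standard algorithm, not the platform's code; `lambda_mult` behaves as in the table above (0 = max diversity, 1 = max relevance).

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def mmr(query_vec, candidates, k=4, lambda_mult=0.5):
    """Select k candidate indices, trading query relevance against
    redundancy with results already selected."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lambda_mult` near 1 a near-duplicate of an already-selected chunk can still win on relevance alone; lowering the value penalizes that redundancy and pulls in more varied passages.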

How retrieval works at runtime

When the agent receives a message:
  1. The query is embedded using the same embeddings model
  2. The vector database finds the K most similar document chunks
  3. Those chunks are injected into the agent’s prompt as context
  4. The LLM uses this context to generate a grounded answer
The agent cites information from your documents rather than relying solely on what the LLM learned during training.
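The four steps above can be sketched end to end. Everything here (`store`, `embed`, `llm`, and the prompt wording) is a hypothetical stand-in for the platform's internals, shown only to make the data flow concrete.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def answer(query, store, embed, llm, k=4, score_threshold=None):
    """store: list of (vector, chunk_text) pairs; embed/llm: callables."""
    query_vec = embed(query)                              # 1. embed the query
    scored = sorted(((cosine(query_vec, vec), text)
                     for vec, text in store), reverse=True)
    hits = scored[:k]                                     # 2. K most similar chunks
    if score_threshold is not None:                       #    optional score filter
        hits = [(s, t) for s, t in hits if s >= score_threshold]
    context = "\n\n".join(text for _, text in hits)       # 3. inject as context
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm(prompt)                                    # 4. grounded generation
```

In the real service the vector search runs inside Qdrant rather than in a Python loop, but the shape of the flow is the same.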

Next steps

Embeddings Models

Configure the embeddings model required for the knowledge base.

Memory

Add conversation memory alongside document retrieval.