Inference selects which LLM the agent uses and lets you fine-tune its behavior with model parameters.

Model selection

The Model dropdown shows all models registered in Settings → Models Registry. Select the one this agent should use. Different agents in the same workspace can use different models. Common patterns:
  • Use a powerful model (e.g., Claude Sonnet, GPT-4o) for complex reasoning agents
  • Use a fast, cheap model (e.g., Groq Llama, Claude Haiku) for simple classification or routing agents
  • Use a local Ollama model for development and testing
If the dropdown is empty, no models have been configured yet. Go to Settings → Models Registry and add at least one.
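The per-agent patterns above can be sketched as a simple role-to-model lookup. This is a hypothetical illustration, not the platform's actual configuration API; the role names and model identifiers are placeholders for entries in your Models Registry.

```python
# Hypothetical sketch: different agents in one workspace using different models.
# Role names and model identifiers below are illustrative placeholders.
AGENT_MODELS = {
    "reasoner": "claude-sonnet",   # powerful model for complex reasoning
    "router": "groq-llama",        # fast, cheap model for classification/routing
    "dev": "ollama-llama3",        # local model for development and testing
}

def model_for(agent_role: str, default: str = "claude-sonnet") -> str:
    """Return the model a given agent should use, falling back to a default."""
    return AGENT_MODELS.get(agent_role, default)
```

A router agent would resolve to the fast model (`model_for("router")`), while any unregistered role falls back to the default.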

Model parameters

Model parameters are optional key-value pairs passed to the LLM at inference time. They let you tune the model’s behavior per-agent without changing the model itself. Common parameters:
Parameter          Type              Description
temperature        float (0.0–2.0)   Creativity vs. determinism. 0 = highly deterministic, 1+ = more creative
max_tokens         integer           Maximum number of tokens in the response
top_p              float (0.0–1.0)   Nucleus sampling; an alternative to temperature for controlling randomness
top_k              integer           Limits sampling to the top K most likely tokens
frequency_penalty  float             Reduces repetition of frequently used words
presence_penalty   float             Encourages the model to introduce new topics
Supported parameters vary by provider. Unsupported parameters are silently ignored by the runtime. Refer to your provider’s API documentation for the full list.
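The silent-ignore behavior can be modeled as a filter over a provider's supported keys. A minimal sketch, assuming a hypothetical `SUPPORTED` set; the real runtime's filtering and the exact set of keys depend on your provider.

```python
# Hypothetical sketch: per-agent parameters filtered against a provider's
# supported set, mirroring the runtime's silent-ignore behavior.
SUPPORTED = {
    "temperature", "max_tokens", "top_p", "top_k",
    "frequency_penalty", "presence_penalty",
}

def build_params(agent_params: dict) -> dict:
    """Keep only parameters the provider supports; drop the rest silently."""
    return {k: v for k, v in agent_params.items() if k in SUPPORTED}

# An unknown key ("mystery") is dropped without error:
params = build_params({"temperature": 0.2, "max_tokens": 1024, "mystery": 1})
```

Here `params` contains only `temperature` and `max_tokens`; the unrecognized key is discarded rather than raising an error.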

Next steps

With a model selected, your agent is ready to run. From here you can optionally add:

MCP Tools

Connect external tools and APIs.

Knowledge Base

Add document retrieval for RAG-based answers.