Studio is where you register cloud LLMs (Anthropic, OpenAI, Groq, etc.): open Settings → Models Registry, add each model, and paste its API key; credentials are stored as secrets. Set Base URL when you point at an OpenAI-compatible gateway, proxy, or self-hosted endpoint; you are not limited to a fixed list baked into .env. (The optional Ollama workflow below still uses Compose for local pull-and-register; see Local models with Ollama.)
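For reference, a Base URL entry typically points at the `/v1` root of an OpenAI-compatible API. The hosts below are illustrative placeholders only; substitute your own gateway or endpoint:

```
# Illustrative Base URL values (not shipped defaults):
#   Gateway/proxy (OpenAI-compatible, e.g. LiteLLM):  https://llm-gateway.example.com/v1
#   Self-hosted vLLM (OpenAI-compatible API):         http://vllm.internal:8000/v1
```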
The stack includes optional Ollama support for running inference locally (CPU-based):
```
# Add to .env before starting
OLLAMA_MODELS=qwen2.5:0.5b nomic-embed-text

# Start with the local models profile
docker compose --profile with-local-models up -d
```
Naming convention for `OLLAMA_MODELS`:

- Models with `embed` in the name → registered as Embeddings Models in Studio
- All others → registered as chat models
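For example, given the value used above, studio-init would classify the two models like this (a sketch of the convention, not new behavior):

```
OLLAMA_MODELS=qwen2.5:0.5b nomic-embed-text
#   qwen2.5:0.5b     → chat model (name does not contain "embed")
#   nomic-embed-text → Embeddings Model (name contains "embed")
```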
Ollama runs on CPU inside Docker. Models larger than 3B parameters require significant RAM. Recommended: use 0.5B–3B models for local development.
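To add another model after the stack is already up, pull it into the container and re-run the registration step: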
```
# Pull a new model into the running Ollama container
docker compose exec ollama ollama pull llama3.2:3b

# Re-run studio-init to register it in Studio
docker compose --profile with-local-models run --rm studio-init
```
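If you want to double-check what the container has pulled, Ollama's own CLI is available through Compose (`ollama list` shows downloaded models):

```
# List models currently available in the Ollama container
docker compose exec ollama ollama list
```

Other useful commands for day-to-day management of the stack: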
```
# View logs for all services
docker compose logs -f

# View logs for a specific service
docker compose logs -f studio

# Stop all services (data is preserved)
docker compose down

# Stop and delete all data (volumes)
docker compose down -v

# Restart a single service
docker compose restart studio
```