Use a local model

You can run KAOS against a local, self-hosted model server (Ollama, vLLM, LM Studio) instead of a cloud provider — useful for privacy, cost, or offline operation.

Point the client at your local OpenAI-compatible endpoint. Because the endpoint is not HTTPS to a known provider, you must explicitly acknowledge the insecure base URL by setting KAOS_LLM_ALLOW_INSECURE_BASE_URL:

export KAOS_LLM_ALLOW_INSECURE_BASE_URL=1 # required for http://localhost endpoints
export OPENAI_BASE_URL=http://localhost:11434/v1 # e.g. Ollama

Then create the client in Python:

from kaos_llm_client import create_client
client = create_client("openai:llama3.1") # served by your local endpoint
print(client.chat([{"role": "user", "content": "Hi"}]).text)
  • The KAOS_LLM_ALLOW_INSECURE_BASE_URL gate is an SSRF safeguard: KAOS refuses non-HTTPS / non-provider base URLs by default so a misconfiguration can’t quietly send prompts somewhere unexpected. You opt in deliberately.
  • Everything downstream (typed programs, agents) works identically; only the client changes, which is the same reason the FunctionClient seam works.
  • For fully-offline deterministic runs (tests, demos), prefer FunctionClient over a local model — it needs no server at all.
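
Before creating the client, it can help to confirm that the local server is up and actually serves the model you plan to name. The sketch below is not part of KAOS. It assumes your server exposes the OpenAI-style model listing at `<base URL>/models` (Ollama, vLLM, and LM Studio generally do on their OpenAI-compatible endpoints), and the helper names `models_url`, `parse_model_ids`, and `list_local_models` are hypothetical, chosen for illustration only.

```python
import json
import os
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style {"data": [{"id": ...}]} body."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_local_models(base_url: str, timeout: float = 3.0) -> list[str]:
    """Ask the local server which models it serves.

    Raises OSError (urllib's URLError is a subclass) if the server
    cannot be reached within the timeout.
    """
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))


if __name__ == "__main__":
    # Same variable the shell snippet exports; the default is the Ollama example.
    base = os.environ.get("OPENAI_BASE_URL", "http://localhost:11434/v1")
    try:
        print(list_local_models(base))
    except OSError as exc:
        print(f"Local endpoint not reachable at {base}: {exc}")
```

If the model you pass to create_client (for example llama3.1) is missing from the printed list, pull or load it on the server first. Some servers report ids with a tag suffix, so compare against what the listing actually returns.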