Use a local model

You can run KAOS against a local, self-hosted model server (Ollama, vLLM, LM Studio) instead of a cloud provider — useful for privacy, cost, or offline operation.

Configure the endpoint

Point the client at your local OpenAI-compatible endpoint. Because a local endpoint is usually plain HTTP and is not one of the known providers, you must explicitly acknowledge the insecure base URL:

export KAOS_LLM_ALLOW_INSECURE_BASE_URL=1     # required for http://localhost endpoints
export OPENAI_BASE_URL=http://localhost:11434/v1   # e.g. Ollama

Then create the client in Python:

from kaos_llm_client import create_client

client = create_client("openai:llama3.1")   # served by your local endpoint
print(client.chat([{"role": "user", "content": "Hi"}]).text)
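
Before creating the client, it can help to confirm that the local server is reachable and to see which model names it serves. The following is a minimal sketch using only the Python standard library, not part of KAOS. It assumes the server exposes the OpenAI-style model listing at `<base URL>/models`, as OpenAI-compatible servers generally do; the helper names `models_url`, `parse_model_ids`, and `list_local_models` are hypothetical.

```python
import json
import os
import urllib.error
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style {"data": [{"id": ...}]} body."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_local_models(base_url: str, timeout: float = 3.0) -> list[str]:
    """Return the model ids the server reports, or [] if it is unreachable."""
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            return parse_model_ids(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON response body.
        return []


if __name__ == "__main__":
    # Falls back to the Ollama default used above if the variable is unset.
    base = os.environ.get("OPENAI_BASE_URL", "http://localhost:11434/v1")
    ids = list_local_models(base)
    print(ids if ids else f"No models reported by {base}; is the server running?")
```

If the name you pass after `openai:` does not appear in the reported list, the server most likely has not pulled or loaded that model yet.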

Notes

The KAOS_LLM_ALLOW_INSECURE_BASE_URL gate is an SSRF safeguard: KAOS refuses non-HTTPS / non-provider base URLs by default so a misconfiguration can’t quietly send prompts somewhere unexpected. You opt in deliberately.
Everything downstream (typed programs, agents) works identically; only the client changes. This is the same seam that lets FunctionClient be swapped in.
For fully offline, deterministic runs (tests, demos), prefer FunctionClient over a local model: it needs no server at all.
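
The opt-in gate described in the first note can be pictured with a small sketch. This is an illustration of the idea only, not KAOS's actual implementation: the function name `base_url_allowed` and the `KNOWN_PROVIDER_HOSTS` allow-list are hypothetical, and the real set of trusted hosts is not stated here. Only the environment variable name comes from the text above.

```python
import os
from urllib.parse import urlsplit

# Hypothetical allow-list for illustration; not KAOS's real list.
KNOWN_PROVIDER_HOSTS = {"api.openai.com"}


def base_url_allowed(base_url: str, env=None) -> bool:
    """Accept HTTPS URLs on known provider hosts, or anything after explicit opt-in."""
    env = os.environ if env is None else env
    parts = urlsplit(base_url)
    trusted = parts.scheme == "https" and parts.hostname in KNOWN_PROVIDER_HOSTS
    # Assumption: the gate is opened by setting the variable to "1", as in the export above.
    opted_in = env.get("KAOS_LLM_ALLOW_INSECURE_BASE_URL") == "1"
    return trusted or opted_in
```

With a default-deny check of this shape, a mistyped or injected base URL fails loudly instead of silently receiving prompts, and a local `http://localhost` endpoint works only once the operator has set the variable on purpose.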