Prefetch models for offline use
kaos-nlp-transformers downloads ONNX models on first use. To run fully offline
afterward (in CI, in an air-gapped environment, or just for deterministic runs),
pre-warm the cache once, then enforce offline mode.
Prefetch, then go offline
    # Download the models you'll use into the cache (one time, needs network)
    kaos-nlp-transformers prefetch --include embedding --include reranker
    # ...or a specific model
    kaos-nlp-transformers prefetch --model BAAI/bge-small-en-v1.5

    # Afterwards, force offline so no network fetch is attempted
    export KAOS_NLP_TRANSFORMERS_OFFLINE=1
    kaos-nlp-transformers info   # confirm what's cached and the active device

- The vendored static model minishlab/potion-base-8M (the [model2vec] extra) needs no prefetch at all: it loads with no download, which is why the embeddings how-to and clustering how-to run offline out of the box.
- Models are license-vetted and SHA-pinned; prefetch respects the registry.
- Cache location follows KAOS_NLP_TRANSFORMERS_CACHE_DIR / HF_HOME (see environment variables).
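
For a CI job, the steps above could be wrapped roughly as follows. This is a minimal sketch, not a documented recipe: the function names resolve_cache_dir and prefetch_then_offline are hypothetical, the ./.model-cache fallback is purely illustrative, and the precedence of KAOS_NLP_TRANSFORMERS_CACHE_DIR over HF_HOME is an assumption that should be checked against the environment variables reference. The CLI invocations themselves are the ones shown earlier on this page.

```shell
#!/bin/sh
# Hypothetical helper: decide which cache directory the job should persist.
# ASSUMPTION: KAOS_NLP_TRANSFORMERS_CACHE_DIR takes precedence over HF_HOME.
resolve_cache_dir() {
  if [ -n "${KAOS_NLP_TRANSFORMERS_CACHE_DIR:-}" ]; then
    printf '%s\n' "$KAOS_NLP_TRANSFORMERS_CACHE_DIR"
  elif [ -n "${HF_HOME:-}" ]; then
    printf '%s\n' "$HF_HOME"
  else
    # Illustrative fallback only; not the tool's real default.
    printf '%s\n' "$PWD/.model-cache"
  fi
}

# Hypothetical wrapper: warm the cache once (needs network), then go offline.
prefetch_then_offline() {
  KAOS_NLP_TRANSFORMERS_CACHE_DIR="$(resolve_cache_dir)"
  export KAOS_NLP_TRANSFORMERS_CACHE_DIR

  # Same prefetch command as above; stop if the download fails.
  kaos-nlp-transformers prefetch --include embedding --include reranker || return 1

  # From here on, no network fetch should be attempted.
  export KAOS_NLP_TRANSFORMERS_OFFLINE=1
  kaos-nlp-transformers info
}
```

Sourcing this file defines the two functions without running anything. A warm-up step would call prefetch_then_offline once, and the CI cache would then persist whatever directory resolve_cache_dir prints.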