Find money, dates & figures (no model)
Goal: sometimes you don’t need an LLM or even a NER model —
you need to find the sentences that contain a figure and put them in front of a reviewer.
kaos-content’s entity filters locate money, dates, durations, percentages, and numbers
over a document using built-in patterns: deterministic, instant, offline, zero dependencies
beyond the parser.
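To make "built-in patterns" concrete, here is a rough standard-library sketch of the general technique: one compiled regular expression per figure type, run over each sentence, keeping the matched spans. The regexes, the `PATTERNS` table, and the `find_figures` helper are illustrative assumptions, not the patterns or API that kaos-content ships, and they cover far fewer formats than a real filter would.

```python
import re

# Hypothetical patterns for illustration only; NOT kaos-content's own.
PATTERNS = {
    "money": re.compile(r"[$€£]\s?\d+(?:,\d{3})*(?:\.\d+)?"),
    "dates": re.compile(
        r"\b(?:January|February|March|April|May|June|July|August|"
        r"September|October|November|December)\s+\d{1,2},\s+\d{4}\b"
    ),
    "percents": re.compile(r"\d+(?:\.\d+)?\s?%"),
    "durations": re.compile(r"\b\d+\s+(?:day|week|month|year)s?\b", re.IGNORECASE),
}

SENTENCES = [
    "This Master Services Agreement is effective as of March 1, 2026.",
    "The total contract value is $4,500,000 payable in quarterly installments.",
    "Either party may terminate on 90 days written notice.",
    "Late payments accrue interest at 1.5% per month.",
    "The parties agree to act in good faith.",  # no figures
]


def find_figures(sentences):
    """Map each label to a list of (sentence, spans) pairs.

    Each span is a (start, end, matched_text) tuple, so a reviewer can see
    exactly which characters triggered the hit.
    """
    results = {label: [] for label in PATTERNS}
    for sentence in sentences:
        for label, pattern in PATTERNS.items():
            spans = [(m.start(), m.end(), m.group()) for m in pattern.finditer(sentence)]
            if spans:
                results[label].append((sentence, spans))
    return results


if __name__ == "__main__":
    for label, hits in find_figures(SENTENCES).items():
        print(f"{label} ({len(hits)}):")
        for sentence, spans in hits:
            print(f"  - {sentence}  {[text for _, _, text in spans]}")
```

Because nothing here loads a model, the cost is a handful of regex scans per sentence, which is what makes this kind of filter practical as a first pass over a large corpus.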
```shell
uv run examples/find-financial-terms.py
```

```
money (1):
  - The total contract value is $4,500,000 payable in quarterly installments.
dates (1):
  - This Master Services Agreement is effective as of March 1, 2026.
percents (1):
  - Late payments accrue interest at 1.5% per month.
durations (1):
  - Either party may terminate on 90 days written notice.
```

```python
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.13"
# dependencies = ["kaos-content>=0.1.6,<0.2", "kaos-nlp-core>=0.1.6,<0.2"]
# ///
"""Locate money, dates, percentages, and durations in a document, with no model.

Sometimes you don't need an LLM or even a NER model; you need to *find the
sentences that contain a figure* and review them. `kaos-content`'s entity filters
locate money, dates, durations, percentages, and numbers over a document view
using built-in patterns: deterministic, instant, offline.

Run it:

    uv run examples/find-financial-terms.py
"""

from __future__ import annotations

import kaos_content as kc
from kaos_content.views import DocumentView, entity_filters as ef
from kaos_nlp_core._defaults import get_default_punkt_tokenizer

CONTRACT = [
    "This Master Services Agreement is effective as of March 1, 2026.",
    "The total contract value is $4,500,000 payable in quarterly installments.",
    "Either party may terminate on 90 days written notice.",
    "Late payments accrue interest at 1.5% per month.",
    "The parties agree to act in good faith.",  # no figures
]


def build_view() -> DocumentView:
    b = kc.DocumentBuilder()
    b.heading(1, "Master Services Agreement")
    for para in CONTRACT:
        b.paragraph(para)
    return DocumentView(b.build(), sentence_segmenter=get_default_punkt_tokenizer())


def main() -> dict[str, int]:
    view = build_view()

    finders = {
        "money": ef.sentences_with_money,
        "dates": ef.sentences_with_dates,
        "percents": ef.sentences_with_percents,
        "durations": ef.sentences_with_durations,
    }

    counts = {}
    for label, finder in finders.items():
        hits = list(finder(view))
        counts[label] = len(hits)
        print(f"{label} ({len(hits)}):")
        for hit in hits:
            print(f"  - {hit.sentence.text}")
    return counts


if __name__ == "__main__":
    counts = main()
    # Each figure type is located in its sentence; the prose-only line matches none.
    assert counts["money"] >= 1
    assert counts["dates"] >= 1
    assert counts["percents"] >= 1
    assert counts["durations"] >= 1
```

What to notice
- `sentences_with_money`, `sentences_with_dates`, `sentences_with_percents`, `sentences_with_durations`, `sentences_with_numbers` (and `paragraphs_with_*` variants) each return hits carrying the `sentence` and the matched spans, so you keep provenance.
- Pattern-based, not model-based. No download, no inference: instant over a whole corpus. Use it as a cheap first pass, then send the flagged sentences to a NER model or a typed `Call` for exact values.
- Runs on a `DocumentView`, so it composes with any parsed document: PDF, DOCX, or web.
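The "cheap first pass" step can be sketched as a small merge: gather the hits from every finder into a single review queue with one entry per sentence, so a sentence flagged for both money and a date reaches the downstream model only once. This is a minimal sketch, not part of kaos-content. `build_review_queue` and `fake_hit` are hypothetical helpers, and the only thing assumed about a hit is the `hit.sentence.text` access used in the script above; the stand-in objects and sample sentences exist only so the example runs without the library.

```python
from types import SimpleNamespace


def build_review_queue(hits_by_label):
    """Merge per-label hits into one entry per sentence, in first-seen order.

    `hits_by_label` maps a label such as "money" to an iterable of hits.
    Only `hit.sentence.text` is read from each hit.
    """
    labels_by_text = {}
    for label, hits in hits_by_label.items():
        for hit in hits:
            labels = labels_by_text.setdefault(hit.sentence.text, [])
            if label not in labels:  # tolerate a finder yielding a sentence twice
                labels.append(label)
    return [
        {"sentence": text, "labels": labels}
        for text, labels in labels_by_text.items()
    ]


def fake_hit(text):
    """Stand-in for a library hit (hypothetical), exposing .sentence.text only."""
    return SimpleNamespace(sentence=SimpleNamespace(text=text))


if __name__ == "__main__":
    # Illustrative data, not output from kaos-content.
    hits_by_label = {
        "money": [fake_hit("A fee of $500 is due on March 1, 2026.")],
        "dates": [
            fake_hit("A fee of $500 is due on March 1, 2026."),
            fake_hit("The term ends on June 30, 2027."),
        ],
    }
    for item in build_review_queue(hits_by_label):
        print(", ".join(item["labels"]), "->", item["sentence"])
```

With the real filters, the input would be something like `{label: list(finder(view)) for label, finder in finders.items()}`, and each queue entry could then be handed to whichever extractor produces the exact values.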