Solutions · Legal

Retrieval for contracts that don't fit in a vector.

A typical M&A agreement runs 80–300 pages. A single embedding per document collapses indemnification, governing law, assignment, and termination into one averaged point — and the clause you actually need disappears into the mean.

ckem keeps the clause. Fine-grained ranking over passages, typed edges between definitions and their use, supersession modelled as a first-class relation. The retrieval engine returns the span — not a thirty-page document with the span buried inside.


The problem

Why flat retrieval gets contracts wrong.

Multi-topic by design.
A single agreement covers payment terms, IP assignment, limitation of liability, governing law, and termination — in one file. A single document embedding mashes them together. A query for “cap on liability” comes back averaged against the other twenty topics.
Defined terms drift from their use.
“Confidential Information” is defined on page 4 and used on pages 11, 23, and 47. ckem walks the typed graph from the use site back to the definition and returns both together — flat retrieval has to hope they land in the same top-K.
Amendments supersede.
Schedule A as amended in the side letter is not the same as Schedule A in the master agreement. ckem models supersession as a first-class edge — queries default to current, but you can pin a date and get the corpus as it stood. Originals stay retrievable with provenance.
The span is the answer.
“Yes, there's a cap” is not the answer. “Section 9.4: liability is capped at the fees paid in the twelve months preceding the claim” is. ckem returns the passage, not a pointer to the document that contains it.
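
To make the definition and supersession ideas above concrete, here is a minimal Python sketch of a typed clause graph. It is an illustration under stated assumptions, not ckem's API: the `Clause` class, the edge names `uses_definition` and `supersedes`, the clause IDs, the clause text, and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Clause:
    clause_id: str
    text: str
    effective: date  # date the clause took effect


# Hypothetical corpus: a master agreement plus a later side letter.
CLAUSES = {
    c.clause_id: c
    for c in [
        Clause("msa/1.1", '"Confidential Information" means any non-public information.', date(2023, 1, 15)),
        Clause("msa/11.2", "Recipient shall protect Confidential Information.", date(2023, 1, 15)),
        Clause("msa/schedule-a", "Schedule A (original fee table).", date(2023, 1, 15)),
        Clause("side-letter/schedule-a", "Schedule A (amended fee table).", date(2024, 3, 1)),
    ]
}

# Typed edges as (source, edge_type, target). Edge names are assumptions.
EDGES = [
    ("msa/11.2", "uses_definition", "msa/1.1"),
    ("side-letter/schedule-a", "supersedes", "msa/schedule-a"),
]


def current_version(clause_id: str, as_of: Optional[date] = None) -> str:
    """Follow supersedes edges to the newest version in force on as_of.

    With as_of=None, every amendment counts, so the latest version wins.
    """
    current = clause_id
    while True:
        newer = [
            src
            for src, kind, dst in EDGES
            if kind == "supersedes"
            and dst == current
            and (as_of is None or CLAUSES[src].effective <= as_of)
        ]
        if not newer:
            return current
        # If several amendments qualify, take the most recent one.
        current = max(newer, key=lambda cid: CLAUSES[cid].effective)


def with_definitions(clause_id: str) -> List[str]:
    """Return the clause followed by every definition it uses."""
    defs = [dst for src, kind, dst in EDGES
            if src == clause_id and kind == "uses_definition"]
    return [clause_id] + defs
```

In this sketch, `current_version("msa/schedule-a")` resolves to the side-letter version, while passing `as_of=date(2023, 6, 1)` returns the original, which is the behaviour that pinning a date implies. `with_definitions("msa/11.2")` returns the use site together with the definition it depends on.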

What ckem does

Built for the structure of a contract.

Passage-level scoring. Indexed and ranked at the clause level. The graph keeps the document boundary visible so the matched clause carries its section heading and parent agreement.
Typed edges between clauses. References, definitions, supersession, derived_from. The retrieval engine returns the clause and its neighbours, not one without the other.
Provenance survives every merge. Near-duplicate clauses (the same indemnification language across a portfolio of agreements) auto-merge with derived_from edges back to every original. Audit-grade by default.
Your contracts never leave. Local sentence-transformer embeddings; no third-party embedding API in the default path. Self-host via Docker, run in your own AWS account with the included Terraform, or use the managed service. Same code path.
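
The near-duplicate merge described above can be sketched in a few lines of Python. This is an illustration only, not ckem's implementation: it swaps in the standard library's `difflib.SequenceMatcher` as a stand-in for embedding similarity, and the function name, the 0.9 threshold, and the clause IDs and text are assumptions made for the example.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

# Hypothetical clauses from two agreements in one portfolio.
CLAUSES = {
    "acme-msa/9.1": "Supplier shall indemnify Customer against all "
                    "third-party claims arising from breach of this Agreement.",
    "globex-msa/12.3": "Supplier shall indemnify Customer against all "
                       "third party claims arising from breach of this Agreement.",
    "acme-msa/14": "This Agreement is governed by the laws of England and Wales.",
}


def merge_near_duplicates(
    clauses: Dict[str, str], threshold: float = 0.9
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Greedily fold near-identical clauses into one canonical clause.

    Returns (canonical, derived_from), where derived_from maps each
    canonical clause ID to every original clause ID merged into it.
    """
    canonical: Dict[str, str] = {}
    derived_from: Dict[str, List[str]] = {}
    for clause_id, text in clauses.items():
        for canon_id, canon_text in canonical.items():
            # String similarity stands in for embedding similarity here.
            score = SequenceMatcher(None, text.lower(), canon_text.lower()).ratio()
            if score >= threshold:
                derived_from[canon_id].append(clause_id)  # keep provenance
                break
        else:
            # No close match found: this clause becomes its own canonical.
            canonical[clause_id] = text
            derived_from[clause_id] = [clause_id]
    return canonical, derived_from


canonical, derived_from = merge_near_duplicates(CLAUSES)
```

The point of the `derived_from` map is that merging never discards an original: the two indemnification clauses collapse into one canonical entry, but both source IDs remain reachable from it.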

Measured on LegalBench-RAG

The benchmark, not a demo.

LegalBench-RAG is a 776-pair retrieval benchmark assembled by ZeroEntropy across four legal QA datasets: PrivacyQA, CUAD, MAUD, and ContractNLI. Every query has a ground-truth answer span inside a specific contract. We run ckem on the upstream corpus and judgement files verbatim — no question filtering, no held-out subsets we picked ourselves. The full benchmark write-up is on arXiv (2408.10343).

ckem · overall (776 queries)

Metric	Value
Hit@10	71.3%
Recall@10 (span)	67.6%
F1@10 (span)	16.2%
Precision@10 (span)	9.4%

ckem vs. paper baseline (per subset)

Subset	ckem · Hit@10	Paper · Recall@16
PrivacyQA	100.0%	42.5%
CUAD	85.1%	51.0%
ContractNLI	71.1%	56.8%
MAUD	28.9%	13.2%

Baseline: LegalBench-RAG paper (arXiv:2408.10343), Table 4 “Naive Method” Recall@16, the strongest non-proprietary number the authors publish. ckem uses K=10, a tighter budget. The two metrics are not identical: Hit@10 credits any single matching span, so it can differ from Recall@K when a query has multiple gold spans. See the reading notes below.

What each subset tests

Subset	What it tests
PrivacyQA	Privacy-policy questions; document-level retrieval.
CUAD	Commercial contract clauses across 41 categories.
ContractNLI	NDA hypotheses against contract text.
MAUD	M&A definitive agreements; deal-term retrieval.

Reading the numbers: LegalBench-RAG's relevance judgements are span-level. Hit@10 credits the system when the right span lands inside one of the returned passages, which makes it a generous metric. Precision@10 is the stricter read: of the text in the 10 passages we returned, what fraction fell inside a relevant span.
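
A small Python sketch shows why a high Hit@10 and a low Precision@10 can coexist. It is one plausible reading of the span-level metrics, not the benchmark's reference scorer: spans are assumed to be half-open character ranges within a single document, and the function names and example offsets are hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # half-open character range [start, end)


def overlap(a: Span, b: Span) -> int:
    """Number of characters shared by two spans."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def span_metrics(retrieved: List[Span], gold: List[Span]) -> Dict[str, float]:
    """Character-level precision, recall, F1, and hit for one query.

    Assumes the spans inside each list do not overlap one another.
    """
    covered = sum(overlap(r, g) for r in retrieved for g in gold)
    returned_chars = sum(end - start for start, end in retrieved)
    gold_chars = sum(end - start for start, end in gold)

    precision = covered / returned_chars if returned_chars else 0.0
    recall = covered / gold_chars if gold_chars else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0

    # Hit: at least one gold span sits entirely inside a returned passage.
    hit = any(r[0] <= g[0] and g[1] <= r[1] for r in retrieved for g in gold)
    return {"precision": precision, "recall": recall, "f1": f1, "hit": float(hit)}


# One 100-character gold span, two 500-character passages returned.
metrics = span_metrics(retrieved=[(1000, 1500), (4000, 4500)], gold=[(1200, 1300)])
# precision = 0.1, recall = 1.0, hit = 1.0
```

Here the gold span is fully recovered, so recall and hit are both perfect, yet precision is only 10% because most of the returned text surrounds the answer rather than being the answer. Passage-sized results around short gold spans pull precision, and therefore F1, down in the same way.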

Ongoing work: MAUD is the long tail. M&A definitive agreements are dense with deal-specific terminology (escrow, MAE carve-outs, fiduciary outs) where the right clause and a near-twin from a different deal embed close together. Three workstreams target this directly: a contracts-domain LoRA on top of the Qwen3 encoder, a learned reranker trained on the LegalBench-RAG judgements themselves, and graph-aware traversal that follows cross-reference edges to retrieve the cited definition rather than just its mention. The per-subset numbers are published so the tradeoff stays visible as those land.

In practice

Where teams put ckem in their legal stack.

Diligence review.
An agent walks a target's contracts looking for change-of-control, assignment, exclusivity, and limitation clauses. ckem returns the span with its parent agreement and section heading; the agent reads the surrounding two clauses if it needs to.
Playbook compliance.
A new contract gets checked against the firm's standard positions. ckem indexes the playbook and the in-flight draft; the typed graph surfaces every place the draft deviates and links to the relevant playbook entry.
Cross-portfolio question answering.
“Which of our MSAs cap data-breach liability at less than three times annual fees?” That's a clause-level question across thousands of documents. Flat retrieval averages it into noise; the graph keeps the clauses ranked on their own merits.

Run ckem on your contracts.

Bring a folder of agreements and a labeled query set. We'll run ckem against your current retrieval and walk through the graph together over MCP.

