Company

Built because stapling chunks isn't retrieval.

When a vector store returns the top-K chunks across your corpus, document boundaries vanish. Ask a corpus of research papers a question and you get a methodology paragraph from one paper, a results table from another, and a conclusion from a third — fragments an LLM will fuse into an answer no single paper actually supports.

ckem keeps the structure of each document visible: it returns ranked passages tied back to their source by typed edges. The retrieval engine is our own; the engineering around it (local embeddings, soft-archive provenance, MCP-native access) is what ckem ships.

Why “ckem”?

Say it like “seek 'em”, the command you give a dog to go find something. That's what we're asking of the product: go seek the documents.

Short, easy to say, easy to remember. That's the whole story.

How it fits

Where ckem fits in your stack.

Self-contained retrieval. ckem owns the path from passage to ranked result — local sentence-transformer embeddings, the typed graph, fine-grained scoring, and the lifecycle around them. No third-party encoder or external vector store in the default deployment.
For documents. Long, static, multi-topic, cross-referencing. ckem is the retrieval graph your agent reaches into when it needs to ground itself in the corpus.
Bring your own LLM. ckem returns ranked, graph-aware context over MCP. What your agent does with it — generate, summarize, reason — is your call.
The retrieval layer for vertical AI. Built so teams shipping agents on regulated or long-document corpora can ground them without a months-long retrieval-stack build.

Security & deployment

Built so procurement isn't the bottleneck.

Local embeddings by default. Soft-archive provenance. Project-isolated in the schema. HIPAA BAA available.

Your documents never leave

Local sentence-transformer embeddings by default — no third-party embedding API in the default path. Documents are encoded, indexed, and queried inside your project.

Isolation in the schema

Tenancy is structural: User → Team → Project → Branch → Session with cascading deletes between layers. Every passage, edge, and embedding is scoped to a project. The schema does not allow cross-project reads.
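As a rough illustration of what structural tenancy with cascading deletes can look like, here is a minimal sketch using SQLite from the Python standard library. The table and column names are hypothetical and are not ckem's actual schema; only the User → Team → Project → Branch → Session hierarchy, the cascading deletes, and the per-project scoping of passages come from the description above.

```python
import sqlite3

# Hypothetical table and column names. Only the hierarchy, the cascading
# deletes, and the project scoping of passages come from the text above.
SCHEMA = """
CREATE TABLE users    (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
CREATE TABLE teams    (id INTEGER PRIMARY KEY,
                       user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE);
CREATE TABLE projects (id INTEGER PRIMARY KEY,
                       team_id INTEGER NOT NULL REFERENCES teams(id) ON DELETE CASCADE);
CREATE TABLE branches (id INTEGER PRIMARY KEY,
                       project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE);
CREATE TABLE sessions (id INTEGER PRIMARY KEY,
                       branch_id INTEGER NOT NULL REFERENCES branches(id) ON DELETE CASCADE);
CREATE TABLE passages (id INTEGER PRIMARY KEY,
                       project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
                       body TEXT NOT NULL);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with foreign-key enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys (and cascades) when this is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def passages_for(conn: sqlite3.Connection, project_id: int) -> list:
    """Every read is filtered by project_id, so one project never sees another's rows."""
    rows = conn.execute(
        "SELECT body FROM passages WHERE project_id = ? ORDER BY id",
        (project_id,),
    )
    return [row[0] for row in rows]
```

In this sketch, deleting a row near the top of the hierarchy removes everything beneath it, and a passage cannot exist without a project to own it.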

Authentication

API keys are hashed with SHA-256 at rest. RBAC spans four roles: viewer, contributor, admin, and owner. Every key resolution updates last_used_at, so stale keys are easy to spot and revoke.
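The flow described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ckem's implementation: the function names, the in-memory dict used as a key store, and the strict ordering of the four roles are all hypothetical. Only the SHA-256 hashing, the role names, and the last_used_at update come from the text.

```python
import hashlib
import secrets
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional

# Assumption: the four roles form a strict ladder, lowest to highest privilege.
ROLE_RANK = {"viewer": 0, "contributor": 1, "admin": 2, "owner": 3}


@dataclass
class KeyRecord:
    key_hash: str
    role: str
    last_used_at: Optional[datetime] = None


def hash_key(raw_key: str) -> str:
    """Return the SHA-256 hex digest; only this digest is stored."""
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


def issue_key(store: Dict[str, KeyRecord], role: str) -> str:
    """Create a random key, store its hash, and return the raw key once."""
    if role not in ROLE_RANK:
        raise ValueError(f"unknown role: {role}")
    raw_key = secrets.token_urlsafe(32)
    digest = hash_key(raw_key)
    store[digest] = KeyRecord(key_hash=digest, role=role)
    return raw_key


def resolve_key(store: Dict[str, KeyRecord], raw_key: str) -> Optional[KeyRecord]:
    """Look a key up by its hash and stamp last_used_at on every successful resolution."""
    record = store.get(hash_key(raw_key))
    if record is None:
        return None
    record.last_used_at = datetime.now(timezone.utc)
    return record


def allows(record: KeyRecord, required_role: str) -> bool:
    """True when the key's role is at least as privileged as the required role."""
    return ROLE_RANK[record.role] >= ROLE_RANK[required_role]
```

Because the store is keyed by digest, the raw key is never persisted, and a key whose last_used_at is old or empty is an obvious candidate for revocation.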

Audit & deletion

Soft-archive by default. Every auto-merge writes derived_from edges to its sources, so originals stay retrievable for audit. Retention is type-aware: decisions and people are preserved, while transient nodes age out according to policy.
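To make the soft-archive idea concrete, here is a small in-memory sketch. The Node and Graph classes, their method names, and the way merged text is produced (simple concatenation) are hypothetical placeholders; only the derived_from edge type and the rule that sources are archived rather than deleted come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    id: str
    text: str
    archived: bool = False


@dataclass
class Graph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    # Each edge is (source_id, edge_type, target_id).
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add(self, node_id: str, text: str) -> Node:
        node = Node(node_id, text)
        self.nodes[node_id] = node
        return node

    def merge(self, merged_id: str, source_ids: List[str]) -> Node:
        """Merge sources into a new node; sources are archived, never deleted."""
        # Placeholder merge strategy: concatenate the source texts.
        text = " ".join(self.nodes[s].text for s in source_ids)
        merged = self.add(merged_id, text)
        for s in source_ids:
            self.nodes[s].archived = True
            self.edges.append((merged_id, "derived_from", s))
        return merged

    def live(self) -> List[Node]:
        """Nodes that have not been soft-archived."""
        return [n for n in self.nodes.values() if not n.archived]

    def sources_of(self, node_id: str) -> List[Node]:
        """Follow derived_from edges back to the archived originals."""
        return [
            self.nodes[dst]
            for src, kind, dst in self.edges
            if src == node_id and kind == "derived_from"
        ]
```

After a merge, only the merged node shows up in the live set, but an auditor can still walk its derived_from edges to read each original exactly as it was ingested.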

Deployment

Self-hosted via Docker Compose, run in your own AWS account with the included Terraform (ECS Fargate + RDS), or managed by us. Same code path; you pick who operates it.

Looking for pilot teams.

Bring a sample of your documents — we'll run ckem on them and walk through the graph and retrieval together over MCP.

hello@ckem.ca