Proof, not positioning.
This page publishes the current in-repo WorkMemory regression package: the benchmark artifact path, current aggregate score, recall latency, benchmark subscores, comparison artifacts, and the exact local command used to reproduce it.
Benchmark breakdown
Waiting for proof payload…
Ability to retrieve older facts after filler turns and topic drift.
Ability to preserve facts across separate sessions and later queries.
Accuracy of turning raw inputs into durable, searchable memories.
Point-in-time and event-time correctness under explicit temporal questions.
Correct handling of supersession and newer facts overriding older ones.
Graceful refusal when no matching memory should be returned.
Latency tables
Derived from the committed primary regression artifact.

| Query class | Count | P50 | P95 | P99 | Range |
|---|---|---|---|---|---|
| Waiting for proof payload… | | | | | |
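The columns above can be read as per-query-class summaries of raw recall timings. Below is a minimal sketch of how such a row might be derived, assuming the nearest-rank percentile definition; the page does not say which percentile method the harness uses, and `percentile`, `summarize`, and the sample timings are hypothetical illustration values, not benchmark results.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so very small pct values still return the minimum.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples):
    """Build one latency-table row (Count, P50, P95, P99, Range) from raw timings."""
    return {
        "count": len(samples),
        "p50": percentile(samples, 50),
        "p95": percentile(samples, 95),
        "p99": percentile(samples, 99),
        "range": (min(samples), max(samples)),
    }


# Made-up timings in milliseconds, for illustration only.
latencies_ms = [12.0, 15.0, 11.0, 40.0, 13.0, 14.0, 90.0, 12.5, 13.5, 16.0]
summary = summarize(latencies_ms)
print(summary)
```

With only ten samples, P95 and P99 both collapse to the maximum, which is why the Count column matters when reading the tail percentiles.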
Broader surface latencies
Repo-local proof-pack artifact not loaded yet.

| Surface | Case | Route | Coverage | Count | P50 | P95 | P99 | Outcome |
|---|---|---|---|---|---|---|---|---|
| Waiting for proof payload… | | | | | | | | |
SDK parity checks
Canonical HTTP to SDK parity checks not loaded yet.

| Surface | HTTP | SDK | Parity |
|---|---|---|---|
| Waiting for proof payload… | | | |
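A parity check of this kind compares what the canonical HTTP route returns with what an SDK wrapper returns for the same call. The sketch below shows one way that comparison could work, assuming both results are already decoded into dictionaries; `parity_report`, the field names, and the sample payloads are hypothetical and are not taken from the WorkMemory harness.

```python
_MISSING = object()  # Sentinel so an absent key is not confused with an explicit None.


def parity_report(http_payload, sdk_payload, ignore=()):
    """Compare two decoded responses key by key, skipping fields expected to differ."""
    keys = (set(http_payload) | set(sdk_payload)) - set(ignore)
    mismatched = sorted(
        key
        for key in keys
        if http_payload.get(key, _MISSING) != sdk_payload.get(key, _MISSING)
    )
    return {"parity": not mismatched, "mismatched": mismatched}


# Hypothetical payloads: same content, different per-request identifier.
http_result = {"memories": ["likes tea"], "count": 1, "request_id": "a1"}
sdk_result = {"memories": ["likes tea"], "count": 1, "request_id": "b2"}

report = parity_report(http_result, sdk_result, ignore=("request_id",))
print(report)
```

Ignoring volatile fields such as request identifiers keeps the Parity column focused on the contract itself rather than on per-call noise.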
Usage reference
Current canonical metering formulas and settled usage signals.

| Surface | Route | Signal | Formula | Notes |
|---|---|---|---|---|
| Waiting for proof payload… | | | | |
Optional LLM metering
Representative query-expansion and reranking estimate rows not loaded yet.

| Surface | Workload | Model | Prompt | Completion | Reasoning | Run estimate | Metered estimate | Notes |
|---|---|---|---|---|---|---|---|---|
| Waiting for proof payload… | | | | | | | | |
Reference matrix
Secondary comparison artifacts on the same harness.
Waiting for secondary benchmark runs from the proof payload.
Official-dataset publication
Official-dataset publication not loaded yet.
Current developer surfaces
Shipped today, not roadmap fiction
`/memory/v1/remember`, `/recall`, `/forget`, `/profile`, sessions, and staged uploads are the primary contract.
`workmemory-mcp --stdio` is the installable bridge entrypoint, and `python -m src.mcp --stdio` remains the repo-local server path. HTTP MCP can run with tenant-header mode or explicit API-key enforcement.
TypeScript, Python, OpenAI-compatible, LangChain, and Vercel wrappers all target the same canonical shared-host namespace.
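For readers who want to see what targeting the canonical route looks like at the HTTP level, here is a minimal sketch that builds (but does not send) a request to `/memory/v1/remember` using only the Python standard library. Everything except the route path is an assumption: the base URL, the use of POST with a JSON body, the `content` field, and the helper name `build_post` are hypothetical and should be checked against the actual contract.

```python
import json
from urllib.request import Request

# Assumption: a locally running host. The real base URL is not stated on this page.
BASE_URL = "http://localhost:8000"


def build_post(path, payload, base_url=BASE_URL):
    """Build a JSON POST request for a route; the caller decides whether to send it."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # Assumption: the route accepts POST.
    )


# Hypothetical body shape, for illustration only.
req = build_post("/memory/v1/remember", {"content": "prefers dark mode"})
print(req.get_method(), req.full_url)
```

Sending the request would be a separate `urllib.request.urlopen(req)` call, plus whatever tenant header or API key the deployment enforces.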
Reproduce locally
Same command used to generate the committed artifact:
`scripts/evals/run-workmemory-regression.sh`