Matryoshka Reduces Token Costs For Document Analysis
Matryoshka is a document-analysis tool that achieves over 80% token savings by caching and reusing past analysis results, which allows interactive, exploratory examination of large codebases without re-sending file contents. It combines three pieces: Nucleus, a declarative S-expression query language; pointer-based server-side REPL state, which returns aggregated answers instead of raw text; and synthesis-from-examples. It was demonstrated on the anki-connect codebase, where it reduced token costs and mitigated context rot during multi-pass workflows.
Key Points
- Implements pointer-based REPL state that stores query results server-side and returns pointers rather than raw document text
- Reduces token transmission and prevents context rot, preserving model performance on large documents
- Enables iterative, multi-pass analysis of codebases, with over 80% token cost savings for agents
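The pointer-based state described above can be illustrated with a minimal sketch. This is not Matryoshka's actual API: the `ResultStore` class, its `grep`, `filter`, and `count` methods, and the `$r1`-style pointer names are all hypothetical, chosen only to show how a server could keep results in memory and hand the client a short handle plus an aggregate instead of the matching text.

```python
import itertools


class ResultStore:
    """Hypothetical server-side store: keeps query results, hands out pointers."""

    def __init__(self, document: str):
        self._lines = document.splitlines()
        self._results = {}                 # pointer -> stored result (never sent in full)
        self._ids = itertools.count(1)

    def _save(self, value) -> str:
        # Store the result and return only a short handle to it.
        pointer = f"$r{next(self._ids)}"
        self._results[pointer] = value
        return pointer

    def grep(self, needle: str) -> str:
        # Search the document; the matching lines stay on the server.
        return self._save([line for line in self._lines if needle in line])

    def filter(self, pointer: str, needle: str) -> str:
        # Refine an earlier result by pointer, without re-sending any text.
        return self._save([line for line in self._results[pointer] if needle in line])

    def count(self, pointer: str) -> int:
        # Aggregated answer: a single number instead of raw lines.
        return len(self._results[pointer])


doc = (
    "def add(a, b):\n    return a + b\n"
    "def sub(a, b):\n    return a - b\n"
    "class Deck:\n    pass\n"
)
store = ResultStore(doc)
defs = store.grep("def ")          # first pass
adds = store.filter(defs, "add")   # second pass reuses the stored result
print(defs, store.count(defs))     # $r1 2
print(adds, store.count(adds))     # $r2 1
```

In this sketch, each pass costs the client a pointer and a count rather than the file contents, which is the general idea behind the token savings claimed above.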
Scoring Rationale
Strengths: a practical, directly usable token-saving approach with results demonstrated on a real codebase. Weaknesses: limited independent validation and unclear adoption.
