ChromaSQL
ChromaSQL is a SQL-inspired domain-specific language that maps cleanly to ChromaDB’s vector and metadata query surfaces. It bundles everything you need to parse user-facing queries, validate their semantics, inspect execution plans, and run them against a Chroma collection.
Architecture
ChromaSQL follows a compiler-style pipeline:
- Grammar – chromasql/grammar.py defines the surface language using Lark.
- Parser – chromasql/parser.py turns the parse tree into typed AST nodes.
- Planner – chromasql/planner.py converts AST nodes into validated plans.
- Executor – chromasql/executor.py runs those plans against ChromaDB.
- Explain & Analysis – helpers for tooling, routing, and introspection.
The planner and executor operate on frozen dataclasses (chromasql/ast.py and
chromasql/plan.py), making the system easy to test and reason about.
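To make the AST-to-plan step concrete, here is a minimal sketch of a planner operating on frozen dataclasses. Every name in it (`Predicate`, `Query`, `QueryPlan`, `PlanError`, `plan`) is hypothetical and chosen for illustration; the real node types in chromasql/ast.py and chromasql/plan.py may look quite different. The only external fact assumed is ChromaDB's metadata filter format, where a comparison is written as `{"field": {"$gt": value}}`.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional, Tuple, Union

Scalar = Union[str, int, float, bool]


# Hypothetical AST nodes (illustration only, not ChromaSQL's actual classes).
@dataclass(frozen=True)
class Predicate:
    field: str
    op: str  # one of "=", "!=", ">", "<"
    value: Scalar


@dataclass(frozen=True)
class Query:
    collection: str
    embed_text: Optional[str]
    predicate: Optional[Predicate]
    limit: int


# Hypothetical plan node: fields are named after the arguments that
# ChromaDB's collection.query() accepts.
@dataclass(frozen=True)
class QueryPlan:
    collection: str
    query_texts: Tuple[str, ...]
    where: Optional[dict]
    n_results: int


class PlanError(ValueError):
    """Raised when an AST node fails semantic validation."""


# Map surface operators to ChromaDB metadata filter operators.
_OPS = {"=": "$eq", "!=": "$ne", ">": "$gt", "<": "$lt"}


def plan(query: Query) -> QueryPlan:
    """Validate an AST node and lower it into an immutable plan."""
    if query.limit <= 0:
        raise PlanError("LIMIT must be positive")

    where = None
    if query.predicate is not None:
        pred = query.predicate
        if pred.op not in _OPS:
            raise PlanError(f"unsupported operator: {pred.op!r}")
        where = {pred.field: {_OPS[pred.op]: pred.value}}

    texts = (query.embed_text,) if query.embed_text is not None else ()
    return QueryPlan(query.collection, texts, where, query.limit)


ast_node = Query("docs", "vector search", Predicate("year", ">", 2020), 5)
query_plan = plan(ast_node)
print(query_plan)
```

Because both node types are frozen, assigning to a field after construction raises `FrozenInstanceError`, so a plan cannot drift between validation and execution, and tests can compare whole plans with `==`. An executor along these lines could pass `query_texts`, `where`, and `n_results` straight to `collection.query(...)` on a Chroma collection.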
Key Features
- Rich projection, filtering, ordering, pagination, and rerank clauses.
- Embedding support for inline text, literal vectors, and batches.
- Optional explain output for debugging query plans.
- Multi-collection routing primitives and adapters for sharded deployments.
- 100% unit test coverage to keep language changes predictable.
Next Steps
Pick the path that best suits your workflow. Happy querying!
- Using ChromaSQL – Learn the language through the narrative tutorial and clause reference.
- Using ChromaSQL Python Package – Install the Python SDK, execute plans, and integrate multi-collection routing.
- Using ChromaSQL with LLMs – Drop this authoring prompt into your assistants so they emit valid queries.
- Building Extensions – Extend the grammar, planner, executor, and packaging workflow safely.