Architecture decision records

Every load-bearing architectural choice in OpenCodeHub is recorded as an ADR under docs/adr/ in the repo. This page is the index. Click through to the source ADR for the full context, candidates considered, and consequences.

DuckDB via @duckdb/node-api plus the hnsw_acorn community extension for filter-aware vector search and the official fts extension for BM25. SQLite + sqlite-vec (with FTS5 for text search) was rejected because that stack has no filtered-HNSW story. Superseded as the default by ADR 0011 + ADR 0013, but DuckDB remains the temporal-store backend and the single-file opt-in fallback.

Read ADR 0001

v2.0 ships pure TypeScript. A Rust NAPI-RS native core is deferred until measured numbers force the move; the latency, memory, and cold-analyze budgets all sit comfortably below their reopen triggers.

Read ADR 0002

ADR 0004 — Hierarchical embeddings with filter-aware HNSW

One embeddings table with a granularity discriminator column (symbol | file | community) and a single HNSW index. Filter-aware traversal pushes the granularity predicate into the graph walk. ColBERT-style and RAPTOR were rejected.

Read ADR 0004
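
As a rough illustration of the single-table design in ADR 0004, the sketch below keeps one collection of rows with a granularity discriminator and applies the granularity predicate before ranking. It is a brute-force stand-in, not an HNSW index; the names EmbeddingRow and searchFiltered and the sample rows are hypothetical and do not come from the ADR.

```typescript
type Granularity = "symbol" | "file" | "community";

// One row shape for every granularity; the discriminator replaces separate tables.
interface EmbeddingRow {
  id: string;
  granularity: Granularity;
  vector: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// The predicate is applied before scoring, so rows of other granularities
// never compete for the top-k slots (brute force here, not a graph walk).
function searchFiltered(
  rows: EmbeddingRow[],
  query: number[],
  granularity: Granularity,
  k: number,
): EmbeddingRow[] {
  return rows
    .filter((row) => row.granularity === granularity)
    .map((row) => ({ row, score: cosine(row.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.row);
}

// Hypothetical sample data.
const rows: EmbeddingRow[] = [
  { id: "sym:parse", granularity: "symbol", vector: [1, 0] },
  { id: "file:parser.ts", granularity: "file", vector: [0.9, 0.1] },
  { id: "sym:emit", granularity: "symbol", vector: [0, 1] },
];
const hits = searchFiltered(rows, [1, 0], "symbol", 1);
```

Ranking all rows first and filtering afterwards could return fewer than k matches when other granularities crowd the top of the list, which is the usual argument for pushing the predicate into the search itself.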

Per-LSP phases and @opencodehub/lsp-oracle are deleted in favour of a single scip-index phase backed by @opencodehub/scip-ingest. Oracle-edge provenance switches to scip:<indexer>@<version>.

Read ADR 0005
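
The provenance format scip:<indexer>@<version> from ADR 0005 is simple enough to sketch as a formatter and parser. The helper names below are hypothetical, and the sketch assumes indexer names contain no @ character; the ADR itself is the authority on the exact grammar.

```typescript
interface ScipProvenance {
  indexer: string;
  version: string;
}

function formatProvenance(p: ScipProvenance): string {
  return `scip:${p.indexer}@${p.version}`;
}

// Returns null for strings that do not follow the scip:<indexer>@<version> shape.
function parseProvenance(s: string): ScipProvenance | null {
  const match = /^scip:([^@]+)@(.+)$/.exec(s);
  const indexer = match?.[1];
  const version = match?.[2];
  if (indexer === undefined || version === undefined) return null;
  return { indexer, version };
}

// The version number here is a placeholder, not a pinned value from ADR 0006.
const tag = formatProvenance({ indexer: "scip-typescript", version: "0.0.0" });
const parsed = parseProvenance(tag);
```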

The pin table for every per-language SCIP indexer plus install channel. New indexers (scip-clang, scip-dotnet, scip-kotlin, scip-ruby) are appended to the same table as they land.

Read ADR 0006

The artifact-generation skill family inside plugins/opencodehub/ that turns the graph into committed Markdown. Four P0 skills, subagents, Phase 0 precompute, .docmeta.json, deterministic Phase E assembler.

Read ADR 0007

The four-phase document pattern (Phase 0 precompute → Phase AB parallel content → Phase CD parallel diagrams + specialty → Phase E deterministic assembler), adapted for OpenCodeHub.

Read ADR 0008

Single authoritative output contract. .codehub/docs/ gitignored default; --committed opts in to docs/codehub/. Backtick citation grammar. .docmeta.json schema v1. Mermaid-only diagrams. 20-node diagram cap with a Legend table for overflow.

Read ADR 0009
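
Two rules in the ADR 0009 contract lend themselves to a small sketch: the output directory switch and the 20-node diagram cap. The function names are hypothetical, and the overflow policy shown here (keep the first 20 nodes in the diagram, list the rest in the Legend table) is an assumption; the ADR defines the actual selection rule.

```typescript
const DIAGRAM_NODE_CAP = 20;

// Default output is gitignored; --committed opts in to the tracked directory.
function resolveDocsDir(committed: boolean): string {
  return committed ? "docs/codehub/" : ".codehub/docs/";
}

// Splits nodes into those drawn in the Mermaid diagram and those that
// overflow into the Legend table. Assumed policy: first-come, first-drawn.
function splitForDiagram<T>(
  nodes: T[],
  cap: number = DIAGRAM_NODE_CAP,
): { drawn: T[]; legend: T[] } {
  return { drawn: nodes.slice(0, cap), legend: nodes.slice(cap) };
}

const nodes = Array.from({ length: 23 }, (_, i) => `node${i}`);
const split = splitForDiagram(nodes);
```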

ADR 0010 — Three dogfood findings from 2026-04-27

Three small fixes after dogfooding codehub init and the artifact factory: a default for parallel embedding workers, a health column in codehub list, and a schema preflight in Phase 0.

Read ADR 0010

Adds @ladybugdb/core as the LadybugDB graph backend behind the IGraphStore seam. Motivation: recursive-CTE traversals on the polymorphic relations table do not get faster, and the predicate cannot be pushed into the graph walk.

Read ADR 0011

ADR 0012 — Repo as a first-class graph node

Promote repo_uri, default_branch, and group to typed graph attributes on a Repo node. Backs the cross-repo federation surface (group_query, group_status, group_contracts, group_list, group_cross_repo_links) and the structured AMBIGUOUS_REPO envelope returned by per-repo tools.

Read ADR 0012

ADR 0013 — Storage default + interface segregation

LadybugDB is the default backend and IGraphStore is segregated from ITemporalStore. The temporal half (cochanges, summary cache) lives on DuckDB. The community-adapter escape hatch (AGE / Memgraph / Neo4j / Neptune) keeps OCH from locking users into LadybugDB.

Read ADR 0013
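
The interface segregation in ADR 0013 might look roughly like the sketch below. Only the interface names IGraphStore and ITemporalStore come from this page; every method name, the in-memory classes, and the sample data are hypothetical stand-ins for illustration.

```typescript
// Hypothetical method names; the real interfaces are defined by the ADR.
interface IGraphStore {
  addEdge(from: string, to: string, kind: string): void;
  neighbors(id: string, kind: string): string[];
}

interface ITemporalStore {
  recordCochange(a: string, b: string): void;
  cochangeCount(a: string, b: string): number;
}

class InMemoryGraphStore implements IGraphStore {
  private edges: Array<{ from: string; to: string; kind: string }> = [];

  addEdge(from: string, to: string, kind: string): void {
    this.edges.push({ from, to, kind });
  }

  neighbors(id: string, kind: string): string[] {
    return this.edges
      .filter((e) => e.from === id && e.kind === kind)
      .map((e) => e.to);
  }
}

class InMemoryTemporalStore implements ITemporalStore {
  private counts = new Map<string, number>();

  // Order-independent key so (a, b) and (b, a) count as the same pair.
  private key(a: string, b: string): string {
    return [a, b].sort().join("|");
  }

  recordCochange(a: string, b: string): void {
    const k = this.key(a, b);
    this.counts.set(k, (this.counts.get(k) ?? 0) + 1);
  }

  cochangeCount(a: string, b: string): number {
    return this.counts.get(this.key(a, b)) ?? 0;
  }
}

// A consumer that only walks the graph depends on IGraphStore alone,
// so swapping the graph backend does not touch temporal code.
function callees(graph: IGraphStore, symbol: string): string[] {
  return graph.neighbors(symbol, "CALLS");
}

const graph = new InMemoryGraphStore();
graph.addEdge("main", "parse", "CALLS");
const temporal = new InMemoryTemporalStore();
temporal.recordCochange("a.ts", "b.ts");
temporal.recordCochange("b.ts", "a.ts");
```

Keeping the two seams separate is what would let a community adapter implement only the graph half while the temporal half stays on DuckDB.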

ADR 0013 — Parse runtime: WASM default, native opt-in

Sibling ADR sharing the number 0013 (authored on a parallel branch). WASM (web-tree-sitter) is the default parse runtime on Node 22 and Node 24. Native (tree-sitter N-API addon) is opt-in via OCH_NATIVE_PARSER=1 on Node 22.

Read ADR 0013 (parse runtime)
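
A minimal sketch of the runtime selection described above, assuming the opt-in is honoured only on Node 22 and that any other combination falls back to WASM. The function name and the fallback behaviour on Node 24 are assumptions; only the OCH_NATIVE_PARSER variable and its value 1 come from this page.

```typescript
type ParseRuntime = "wasm" | "native";

// env is passed in rather than read from process.env so the sketch stays
// self-contained and easy to test.
function selectParseRuntime(
  env: Record<string, string | undefined>,
  nodeMajor: number,
): ParseRuntime {
  const wantsNative = env["OCH_NATIVE_PARSER"] === "1";
  // Assumption: the native addon is only selected on Node 22.
  return wantsNative && nodeMajor === 22 ? "native" : "wasm";
}

const defaultRuntime = selectParseRuntime({}, 22);
const optedIn = selectParseRuntime({ OCH_NATIVE_PARSER: "1" }, 22);
const optedInOnNode24 = selectParseRuntime({ OCH_NATIVE_PARSER: "1" }, 24);
```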

ADR 0014 — SCIP REFERENCES + TYPE_OF emission, embedder fingerprint

Two unrelated holes shipped together because they share a one-time fixture-regeneration cost. Wire up SCIP REFERENCES and TYPE_OF edge emission alongside the existing CALLS and IMPLEMENTS. Persist the embedder modelId in store metadata; refuse a query when the configured embedder differs from the one that produced the stored vectors (an override is available via a documented force flag).

Read ADR 0014
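
The embedder fingerprint check in ADR 0014 could be sketched as a guard that runs before any vector query. The metadata shape, the error class, the force parameter name, and the model ids below are all hypothetical; the ADR documents the real field and flag.

```typescript
interface StoreMetadata {
  // Model id recorded when the stored vectors were produced.
  embedderModelId: string;
}

class EmbedderMismatchError extends Error {
  constructor(stored: string, configured: string) {
    super(`stored vectors use "${stored}" but the configured embedder is "${configured}"`);
    this.name = "EmbedderMismatchError";
  }
}

// Throws when the configured embedder differs from the stored fingerprint,
// unless the caller explicitly forces the query through.
function assertEmbedderMatches(
  meta: StoreMetadata,
  configuredModelId: string,
  force: boolean = false,
): void {
  if (force) return;
  if (meta.embedderModelId !== configuredModelId) {
    throw new EmbedderMismatchError(meta.embedderModelId, configuredModelId);
  }
}

function refusesQuery(meta: StoreMetadata, configured: string, force: boolean): boolean {
  try {
    assertEmbedderMatches(meta, configured, force);
    return false;
  } catch {
    return true;
  }
}

const meta: StoreMetadata = { embedderModelId: "model-a" };
```

Comparing vectors from two different models produces meaningless similarity scores, so failing loudly is a safer default than returning quietly degraded results.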

Superseded by ADR 0006. The gopls pin matrix is historical — OCH no longer runs long-running language servers; oracle edges come from SCIP.

Read ADR 0003

New architectural decisions go under docs/adr/NNNN-slug.md using the next numeric prefix. Keep the headings: Status, Date, Context, Decision, Consequences, plus any ADR-specific sections.

If a new decision supersedes an older one, update the superseded ADR’s status line with a forward link and add a reverse link from the new ADR’s context section.
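
As a small aid for the naming rule above, the sketch below derives the next docs/adr/NNNN-slug.md path from a list of existing file names. The helper and the file names are hypothetical; it takes the highest existing prefix plus one, which also sidesteps the shared 0013 number.

```typescript
function nextAdrPath(existing: string[], slug: string): string {
  const numbers = existing
    .map((name) => /^(\d{4})-/.exec(name)?.[1])
    .filter((prefix): prefix is string => prefix !== undefined)
    .map((prefix) => Number.parseInt(prefix, 10));
  // Numbering starts at 0001 when no ADR exists yet.
  const next = numbers.length === 0 ? 1 : Math.max(...numbers) + 1;
  return `docs/adr/${String(next).padStart(4, "0")}-${slug}.md`;
}

// Hypothetical file names, including two files that share a prefix.
const existingAdrs = ["0013-storage.md", "0013-parse-runtime.md", "0014-scip-edges.md", "README.md"];
const nextPath = nextAdrPath(existingAdrs, "example-decision");
```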