
Configuration

OpenCodeHub honours a small, stable set of environment variables. Each variable is read from process.env at the entry point that owns it (CLI, MCP server, ingestion phase, embedder backend); none of them mutate global state.

| Variable | Purpose |
| --- | --- |
| CODEHUB_STORE | lbug forces LadybugDB; duck forces the single-file DuckDB layout. Unset (the default) means probe @ladybugdb/core and use LadybugDB when the binding is importable, otherwise fall back to DuckDB. |
| CODEHUB_HOME | Override ~/.codehub/ (where the registry, embedder weights, and global state live). |
| OCH_VERBOSE | Set to 1 to surface the storage-backend probe advisory in non-TTY environments. |
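
The CODEHUB_STORE rule above can be sketched as a small pure function. This is only an illustration, not OpenCodeHub's code: selectStore and StoreBackend are hypothetical names, and the probe of @ladybugdb/core is passed in as a boolean so the sketch stays self-contained. The handling of unrecognized values is not documented here; the sketch treats them like unset, which is an assumption.

```typescript
type StoreBackend = "ladybug" | "duckdb";

// Hypothetical helper mirroring the documented CODEHUB_STORE rule.
// `ladybugImportable` stands in for the real probe of @ladybugdb/core.
function selectStore(
  env: Record<string, string | undefined>,
  ladybugImportable: boolean,
): StoreBackend {
  const forced = env["CODEHUB_STORE"];
  if (forced === "lbug") return "ladybug"; // explicit LadybugDB
  if (forced === "duck") return "duckdb"; // explicit single-file DuckDB
  // Unset (the default): the probe result decides.
  // Assumption: unknown values fall through to the same probe.
  return ladybugImportable ? "ladybug" : "duckdb";
}
```

In a real entry point the first argument would presumably be process.env, and the second the outcome of attempting to import the binding.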

ADR 0013 (docs/adr/0013-m7-default-flip-and-abstraction.md) records the LadybugDB-default decision and the IGraphStore / ITemporalStore interface segregation.

| Variable | Purpose |
| --- | --- |
| OCH_NATIVE_PARSER | Set to 1 on Node 22 to opt into the native tree-sitter N-API addon. The default runtime on Node 22 and Node 24 is web-tree-sitter (WASM). |

The --native-parser CLI flag is equivalent. ADR 0013-parse-runtime-wasm-default records the WASM-default decision.

Embedder backends resolve in a fixed cascade: SageMaker → HTTP → ONNX. The first variable group that resolves wins; the others are ignored.

| Variable | Purpose |
| --- | --- |
| CODEHUB_EMBEDDING_SAGEMAKER_ENDPOINT | SigV4-authenticated SageMaker endpoint name. When set, the SageMaker backend wins. |
| CODEHUB_EMBEDDING_SAGEMAKER_REGION | Override the AWS region for the SageMaker call. |
| CODEHUB_EMBEDDING_URL | Base URL for an OpenAI-compatible HTTP endpoint (Infinity, vLLM, TEI, Ollama, LM Studio, OpenAI). /embeddings is appended. |
| CODEHUB_EMBEDDING_MODEL | Model id passed through to the HTTP endpoint verbatim. |
| CODEHUB_EMBEDDING_DIMS | Dimensionality of the embedding model. Default 768. |
| CODEHUB_EMBEDDING_API_KEY | Bearer token sent as Authorization: Bearer .... |

When none of the above are set, the local ONNX backend (gte-modernbert-base, deterministic, offline-safe) is used.
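
The cascade can be sketched as a resolver that checks each variable group in order. This is a hedged illustration: resolveEmbedder and EmbedderChoice are hypothetical names. The sketch also makes three assumptions: CODEHUB_EMBEDDING_URL is the variable that activates the HTTP group, a trailing slash on the base URL is stripped before /embeddings is appended, and CODEHUB_EMBEDDING_DIMS is read only for the HTTP backend.

```typescript
type Env = Record<string, string | undefined>;

type EmbedderChoice =
  | { backend: "sagemaker"; endpoint: string; region: string | undefined }
  | { backend: "http"; url: string; model: string | undefined; apiKey: string | undefined; dims: number }
  | { backend: "onnx"; model: string };

// Hypothetical resolver for the SageMaker -> HTTP -> ONNX cascade.
function resolveEmbedder(env: Env): EmbedderChoice {
  // 1. SageMaker wins whenever an endpoint name is set.
  const endpoint = env["CODEHUB_EMBEDDING_SAGEMAKER_ENDPOINT"];
  if (endpoint) {
    return { backend: "sagemaker", endpoint, region: env["CODEHUB_EMBEDDING_SAGEMAKER_REGION"] };
  }
  // 2. HTTP: assumed here to be activated by the base URL alone.
  const base = env["CODEHUB_EMBEDDING_URL"];
  if (base) {
    return {
      backend: "http",
      // /embeddings is appended; stripping trailing slashes is an assumption.
      url: base.replace(/\/+$/, "") + "/embeddings",
      model: env["CODEHUB_EMBEDDING_MODEL"],
      apiKey: env["CODEHUB_EMBEDDING_API_KEY"],
      dims: Number(env["CODEHUB_EMBEDDING_DIMS"] ?? "768"), // documented default
    };
  }
  // 3. Nothing set: local ONNX backend.
  return { backend: "onnx", model: "gte-modernbert-base" };
}
```

Because the SageMaker check runs first, setting both the endpoint and the URL still yields the SageMaker backend, matching the "others are ignored" rule.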

| Variable | Purpose |
| --- | --- |
| CODEHUB_DISABLE_SCIP | Set to 1 to make the scip-index ingestion phase a no-op. Heuristic edges still flow. |
| CODEHUB_ALLOW_BUILD_SCRIPTS | Set to 1 to allow SCIP indexers that require a build (Rust, Java) to run. Off by default for clean-room safety. |
| CODEHUB_BEDROCK_DISABLED | Set to 1 to disable the LLM summarize phase. Equivalent to --no-summaries. |
| NO_COLOR | Standard convention; disables colored console output. |
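
As a rough sketch of how these switches might be read, the helper below treats the three CODEHUB_* variables as strict "set to 1" flags and follows the NO_COLOR convention, under which any non-empty value disables color. The names isOn, colorEnabled, and readIngestionFlags are hypothetical, and treating values other than 1 as "off" is an assumption; the table documents only the value 1.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical: a flag counts as on only when it is exactly "1".
function isOn(env: Env, name: string): boolean {
  return env[name] === "1";
}

// NO_COLOR convention: present and non-empty disables color.
function colorEnabled(env: Env): boolean {
  const value = env["NO_COLOR"];
  return value === undefined || value === "";
}

interface IngestionFlags {
  runScip: boolean; // false makes the scip-index phase a no-op
  allowBuildScripts: boolean; // off by default
  summarize: boolean; // false skips the LLM summarize phase
  color: boolean;
}

function readIngestionFlags(env: Env): IngestionFlags {
  return {
    runScip: !isOn(env, "CODEHUB_DISABLE_SCIP"),
    allowBuildScripts: isOn(env, "CODEHUB_ALLOW_BUILD_SCRIPTS"),
    summarize: !isOn(env, "CODEHUB_BEDROCK_DISABLED"),
    color: colorEnabled(env),
  };
}
```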

codehub analyze writes everything under <repo-root>/.codehub/. The exact files depend on the backend selected at index time.

LadybugDB layout:

| Path | Purpose |
| --- | --- |
| graph.lbug | LadybugDB graph store: nodes, edges, embeddings. |
| temporal.duckdb | Sibling DuckDB file: temporal store (cochanges, symbol-summary cache). |
| meta.json | Index metadata: graph hash, node counts, CLI version, backend, embedder model id. |
| scan.sarif | SARIF output from codehub scan. |
| sbom.cyclonedx.json / sbom.spdx.json | SBOMs when codehub analyze --sbom has run. |

DuckDB layout:

| Path | Purpose |
| --- | --- |
| graph.duckdb | Single DuckDB file: nodes, edges, embeddings, and temporal views in one place. |
| meta.json | Same shape as the LadybugDB layout. |
| scan.sarif | SARIF output from codehub scan. |

When both graph.lbug and graph.duckdb exist as siblings, the newer-mtime file wins.
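
The newer-mtime rule amounts to taking the maximum over modification times. The sketch below is illustrative only: pickIndexFile and IndexFile are hypothetical names, and the caller is assumed to supply modification times in milliseconds (for example from the mtimeMs field of Node's fs.Stats). Tie-breaking is not documented; this sketch keeps the first file listed.

```typescript
interface IndexFile {
  name: "graph.lbug" | "graph.duckdb";
  mtimeMs: number; // modification time in milliseconds since the epoch
}

// Hypothetical helper: returns the most recently modified index file,
// or undefined when neither file exists.
function pickIndexFile(found: IndexFile[]): IndexFile | undefined {
  let winner: IndexFile | undefined;
  for (const file of found) {
    // Strict ">" keeps the earlier entry on a tie (an assumption).
    if (winner === undefined || file.mtimeMs > winner.mtimeMs) {
      winner = file;
    }
  }
  return winner;
}
```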

The .codehub/ directory is safe to delete and rebuild at any time: run codehub clean, then codehub analyze.

The registry maps each registered repo to its index path. It is consulted by:

  • Every per-repo MCP tool that accepts an optional repo argument.
  • codehub list, codehub status, codehub clean --all.
  • codehub group create when resolving repo names.

CODEHUB_HOME relocates the parent directory (~/.codehub/ by default).
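
A minimal sketch of that relocation, assuming an empty CODEHUB_HOME is treated the same as unset: resolveCodehubHome is a hypothetical name, and the user's home directory is passed in explicitly so the function has no dependencies. The registry's file name inside this directory is not specified here, so the sketch stops at the directory.

```typescript
// Hypothetical helper: resolves the directory that holds the registry,
// embedder weights, and global state.
function resolveCodehubHome(
  env: Record<string, string | undefined>,
  homeDir: string,
): string {
  const override = env["CODEHUB_HOME"];
  if (override) return override; // explicit relocation
  // Default: ~/.codehub, with trailing separators on homeDir removed first.
  return homeDir.replace(/[\\/]+$/, "") + "/.codehub";
}
```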

Each editor writer has a fixed target path and merges a codehub entry non-destructively:

| Editor | Path | Format |
| --- | --- | --- |
| claude-code | <project>/.mcp.json | JSON |
| cursor | ~/.cursor/mcp.json | JSON |
| codex | ~/.codex/config.toml | TOML |
| windsurf | ~/.codeium/windsurf/mcp_config.json | JSON |
| opencode | <project>/opencode.json | JSON |

--undo removes only the codehub entry each writer added; other entries are preserved.
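
For the JSON targets, a non-destructive merge and its undo could look roughly like the sketch below. It is not the writers' actual code: mergeEntry and removeEntry are hypothetical names, the section key is a parameter because each editor nests its servers under its own key, and the entry contents in the usage example are invented placeholders.

```typescript
type JsonObject = { [key: string]: unknown };

// Reads a nested object section, or an empty object if absent or malformed.
function readSection(config: JsonObject, section: string): JsonObject {
  const value = config[section];
  return typeof value === "object" && value !== null && !Array.isArray(value)
    ? (value as JsonObject)
    : {};
}

// Adds or replaces one named entry; every other key is carried over untouched.
function mergeEntry(config: JsonObject, section: string, name: string, entry: JsonObject): JsonObject {
  return { ...config, [section]: { ...readSection(config, section), [name]: entry } };
}

// Undo: drops only the named entry and preserves its siblings.
function removeEntry(config: JsonObject, section: string, name: string): JsonObject {
  const servers = readSection(config, section);
  const rest: JsonObject = {};
  for (const key of Object.keys(servers)) {
    if (key !== name) rest[key] = servers[key];
  }
  return { ...config, [section]: rest };
}

// Usage with placeholder values (the entry shape is an assumption).
const existing: JsonObject = { mcpServers: { other: { command: "other-tool" } } };
const withCodehub = mergeEntry(existing, "mcpServers", "codehub", { command: "codehub" });
const undone = removeEntry(withCodehub, "mcpServers", "codehub");
```

Both functions return new objects rather than mutating their input, which makes it easy to compare the file contents before and after a write.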