Architecture Overview

System Design

flowchart TD
    A[CLI: cyclopts] --> B[run_analysis]
    B --> C[create_agent]
    C --> D[Strands Agent<br/>Opus 4.6 + adaptive thinking]
    D --> E[Jinja2 System Prompt]
    D --> F[HookProviders<br/>quality + efficiency + fail-fast]
    D --> G[AnalysisResult<br/>structured output]
    D --> H[Tool Execution]
    H --> I[Discovery<br/>ripgrep, repomix]
    H --> J[LSP<br/>ty, ts-server, rust-analyzer]
    H --> K[AST<br/>ast-grep patterns]
    H --> L[Graph<br/>NetworkX analysis]
    H --> M[Git<br/>coupling, churn, blame]
    H --> N[Shell<br/>bounded execution]
    H --> O[Output Files<br/>.code-context/ directory]
    H --> P[context7 MCP<br/>library docs]
    D -.-> Q[FastMCP Server<br/>MCP protocol]
    Q --> R[Claude Code / Cursor<br/>MCP clients]

Component Layout

src/code_context_agent/
├── cli.py              # CLI entry point (cyclopts)
├── config.py           # Configuration (pydantic-settings)
├── agent/              # Agent orchestration
│   ├── factory.py      # Agent creation with tools + structured output
│   ├── runner.py       # Analysis runner with event streaming
│   ├── prompts.py      # Jinja2 template rendering
│   └── hooks.py        # HookProviders: quality, efficiency, fail-fast (full mode)
├── templates/          # Jinja2 prompt templates
│   ├── system.md.j2    # Unified system prompt
│   ├── partials/       # Composable prompt sections
│   └── steering/       # Quality guidance fragments
├── models/             # Pydantic models
│   ├── base.py         # StrictModel, FrozenModel
│   └── output.py       # AnalysisResult, BusinessLogicItem, etc.
├── mcp/                # FastMCP v3 server
│   ├── __init__.py     # Package init
│   └── server.py       # MCP tools, resources, and server definition
├── consumer/           # Event display (Rich TUI)
│   ├── base.py         # EventConsumer ABC
│   ├── phases.py       # 10-phase detection, discovery events (v7)
│   ├── rich_consumer.py # Dashboard with phase indicator + discovery feed
│   └── state.py        # AgentDisplayState with phase/discovery tracking
├── tools/              # Analysis tools (45+)
│   ├── discovery.py    # ripgrep, repomix, write_file (11 tools)
│   ├── astgrep.py      # ast-grep (3 tools)
│   ├── git.py          # git history (7 tools)
│   ├── shell_tool.py   # Shell with security hardening
│   ├── clones.py       # Clone detection via jscpd
│   ├── validation.py   # Input validation (path traversal, injection prevention)
│   ├── lsp/            # LSP integration (8 tools)
│   └── graph/          # NetworkX analysis (14 tools)
└── rules/              # ast-grep rule packs

Key Design Decisions

Agent Framework: Strands

The agent is built on the Strands Agents SDK and runs Claude Opus 4.6 via Amazon Bedrock. Strands provides:

  • Tool registration and dispatch
  • Structured output via Pydantic models
  • Event streaming for phase-aware progress display (10 phases + discovery feed)
  • HookProviders for quality and efficiency guardrails

Prompt Architecture: Jinja2 Templates

The system prompt is composed from modular Jinja2 templates:

  • system.md.j2 -- Unified entry point that includes all partials
  • partials/ -- Composable sections (rules, business logic, output format, tool-specific guidance)
  • steering/ -- Quality fragments (size limits, conciseness, anti-patterns, tool efficiency)

This allows the prompt to adapt based on detected codebase characteristics without maintaining multiple monolithic prompts.
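
As a rough illustration of this composition, the sketch below renders a single entry template that pulls in partials and picks one steering fragment. It uses in-memory templates instead of the files under templates/. The template bodies, the `partials/rules.md.j2` name, the placement of the two steering files under `steering/`, and the `"standard"`/`"full"` mode values are assumptions made for the example, not the project's actual content.

```python
from jinja2 import DictLoader, Environment

# Hypothetical in-memory stand-ins for the files under templates/.
TEMPLATES = {
    "system.md.j2": (
        "You are a code analysis agent.\n"
        "{% include 'partials/rules.md.j2' %}\n"
        "{% if mode == 'full' %}"
        "{% include 'steering/_full_mode.md.j2' %}"
        "{% else %}"
        "{% include 'steering/_size_limits.md.j2' %}"
        "{% endif %}"
    ),
    "partials/rules.md.j2": "Rules: cite a file path for every claim.",
    "steering/_size_limits.md.j2": "Keep the summary short.",
    "steering/_full_mode.md.j2": "Cover every module exhaustively.",
}

env = Environment(loader=DictLoader(TEMPLATES))


def get_prompt(mode: str = "standard") -> str:
    """Render the single entry template; partials are selected inside it."""
    return env.get_template("system.md.j2").render(mode=mode)


if __name__ == "__main__":
    print(get_prompt("full"))
```

Keeping the branching inside the entry template means callers pass variables only and never assemble prompt text themselves.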

Five Signal Layers

The analysis combines five distinct signal sources, following Tenet 2: Layer signals, read less:

  1. Static structure (AST/types) -- ast-grep patterns, LSP symbols
  2. Dynamic relationships (call graphs) -- LSP references, definitions
  3. Temporal evolution (git history) -- churn, coupling, blame
  4. Compressed abstractions (signatures) -- Tree-sitter compression via repomix
  5. Human intent (naming, commits) -- commit messages, file naming patterns
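
To make the temporal layer (signal 3) concrete, here is a minimal sketch of how churn and coupling could be derived from commit history. It assumes the history has already been reduced to one list of changed files per commit, for example from `git log --name-only`. The function names and the sample commits are invented for illustration and are not the project's git tools.

```python
from collections import Counter
from itertools import combinations


def churn(commits: list[list[str]]) -> Counter:
    """Count how many commits touched each file."""
    return Counter(path for files in commits for path in set(files))


def co_change_counts(commits: list[list[str]]) -> Counter:
    """Count how often each pair of files changed in the same commit."""
    pairs: Counter = Counter()
    for files in commits:
        # Sort so that (a, b) and (b, a) land on the same key.
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs


# Made-up history: each inner list is the set of files in one commit.
commits = [
    ["cli.py", "config.py"],
    ["cli.py", "config.py", "agent/runner.py"],
    ["agent/runner.py", "agent/factory.py"],
]

coupling = co_change_counts(commits)
file_churn = churn(commits)
```

Pairs with a high co-change count relative to their individual churn are candidates for hidden coupling that imports alone would not reveal.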

Graph-First Ranking

Files are ranked by graph metrics rather than heuristics, following Tenet 1: Measure, don't guess:

  • Betweenness centrality -- identifies bridge/bottleneck files
  • PageRank/TrustRank -- identifies foundational modules
  • Louvain/Leiden communities -- detects module boundaries
  • Triangle detection -- finds tightly coupled triads
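
The idea behind ranking by PageRank can be sketched without dependencies. The block below is a plain power-iteration PageRank over a toy import graph. It is a stand-in for illustration, not the NetworkX-based implementation the project uses, and the edge direction (importer points to imported module) and the sample graph are assumptions.

```python
def pagerank(
    graph: dict[str, list[str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> dict[str, float]:
    """Power-iteration PageRank over an adjacency-list graph."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in graph.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Toy import graph: an edge points from the importing file to the imported one.
imports = {
    "cli.py": ["runner.py", "config.py"],
    "runner.py": ["factory.py", "config.py"],
    "factory.py": ["config.py"],
    "config.py": [],
}

ranks = pagerank(imports)
ranked = sorted(ranks, key=ranks.get, reverse=True)
```

With edges pointing at imported modules, a file that many others depend on accumulates rank, which is why `config.py` comes out on top in this toy graph.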

Structured Output

The agent produces a Pydantic-typed AnalysisResult rather than freeform text, following Tenet 5: Machines read it first. This enables downstream agents to parse the output programmatically.

MCP Server (FastMCP v3)

The mcp/ package exposes the project's distinguishing capabilities (analysis runs and code-graph queries) over the Model Context Protocol, so coding agents (Claude Code, Cursor, etc.) can call them directly:

  • Tools: start_analysis/check_analysis (kickoff/poll), query_code_graph (10 algorithms), explore_code_graph (progressive disclosure), get_graph_stats
  • Resources: Read-only access to analysis artifacts via analysis:// URI templates
  • Transport: stdio (default, for local MCP clients) or HTTP (for networked access)

Commodity tools (ripgrep, LSP, git, ast-grep) are intentionally not exposed -- they are already available in every coding agent's MCP marketplace.

context7 MCP Integration

The analysis agent loads context7 documentation tools via strands.tools.mcp.MCPClient, enabling library documentation lookup during analysis. This is controlled by CODE_CONTEXT_CONTEXT7_ENABLED (default: true) and requires npx.

Mode-Aware Pipeline (v7)

The --full flag triggers exhaustive analysis. The mode is threaded through the entire pipeline:

CLI (--full) → _derive_mode() → run_analysis(mode=) → create_agent(mode=)
                                      ↓                       ↓
                              RichEventConsumer(mode=)   get_prompt(mode=)
                              (phase indicator,          create_all_hooks(full_mode=)
                               discovery feed,           (+ FailFastHook)
                               mode badge)

  • FailFastHook raises FullModeToolError when non-exempt tools return errors in full mode
  • Phase detection maps tool calls to 10 analysis phases for TUI progress display
  • Discovery feed extracts notable findings from tool results (file counts, symbol counts, hotspots)
  • Mode-aware prompt switches between size-limited (_size_limits.md.j2) and exhaustive (_full_mode.md.j2) steering directives
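
The fail-fast rule and the phase mapping described above can be sketched as two small functions. The exemption list, the tool names, the prefix table, and the phase labels are hypothetical. The real hook plugs into Strands' hook mechanism and the real detector distinguishes 10 phases, neither of which is reproduced here.

```python
class FullModeToolError(RuntimeError):
    """Raised when a non-exempt tool fails during a full-mode run."""


# Hypothetical exemption list; the real one is not shown in this document.
EXEMPT_TOOLS = frozenset({"optional_docs_lookup"})

# Hypothetical prefix-to-phase table; the real pipeline has 10 phases.
PHASE_BY_PREFIX = {
    "rg_": "discovery",
    "lsp_": "semantic analysis",
    "git_": "history",
}


def check_tool_result(tool_name: str, status: str, *, full_mode: bool) -> None:
    """Abort a full-mode run on the first error from a non-exempt tool."""
    if full_mode and status == "error" and tool_name not in EXEMPT_TOOLS:
        raise FullModeToolError(f"{tool_name} failed in full mode")


def detect_phase(tool_name: str, default: str = "unknown") -> str:
    """Map a tool call to a display phase by its name prefix."""
    for prefix, phase in PHASE_BY_PREFIX.items():
        if tool_name.startswith(prefix):
            return phase
    return default
```

Outside full mode the check is a no-op, so a failed tool degrades the result instead of ending the run.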

See Full Mode and TUI Phases for details.


Security Model

The agent operates within a defense-in-depth security boundary:

  • Shell allowlist -- Only a curated set of read-only programs can execute via the shell tool. Write operations, network commands, and shell operators are blocked.
  • Input validation -- All tool inputs (paths, patterns, globs) pass through validation functions that prevent path traversal and command injection.
  • Path containment -- File operations are validated to stay within the repository root.
  • Git read-only -- Git subcommands are restricted to read-only operations (log, diff, blame, status, etc.).
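
A minimal sketch of how the allowlist, git, and containment checks might look, assuming a deliberately conservative policy. The program allowlist and the operator list are invented for illustration, and the git subcommands are limited to the four named above. The project's actual rules live in shell_tool.py and validation.py and may differ.

```python
import shlex
from pathlib import Path

# Hypothetical allowlist of read-only programs.
ALLOWED_PROGRAMS = frozenset({"ls", "cat", "wc", "head", "rg"})
READ_ONLY_GIT = frozenset({"log", "diff", "blame", "status"})
# Coarse check: any of these anywhere in the command rejects it.
SHELL_OPERATORS = (";", "&", "|", ">", "<", "`", "$(")


def is_command_allowed(command: str) -> bool:
    """Accept only allowlisted programs with no shell operators."""
    if any(op in command for op in SHELL_OPERATORS):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return False
    if not argv:
        return False
    program, *args = argv
    if program == "git":
        return bool(args) and args[0] in READ_ONLY_GIT
    return program in ALLOWED_PROGRAMS


def is_within_repo(repo_root: Path, candidate: str) -> bool:
    """Reject paths that resolve outside the repository root."""
    root = repo_root.resolve()
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

Resolving the path before comparing is what defeats `..` segments and absolute paths. Checking operators on the raw string is stricter than necessary (it also rejects a quoted `|` inside a search pattern), which is an acceptable trade-off for a sketch.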

See the Security documentation for full details on the security model and CI pipeline.