Architecture Overview

System Design

flowchart TD
    A[CLI: cyclopts] --> B[run_analysis]
    B --> C[create_agent]
    C --> D[Strands Agent<br/>Opus 4.6 + adaptive thinking]
    D --> E[Jinja2 System Prompt]
    D --> F[HookProviders<br/>quality + efficiency + fail-fast]
    D --> G[AnalysisResult<br/>structured output]
    D --> H[Tool Execution]
    H --> I[Discovery<br/>ripgrep, repomix]
    H --> J[LSP<br/>ty, ts-server, rust-analyzer]
    H --> K[AST<br/>ast-grep patterns]
    H --> L[Graph<br/>NetworkX analysis]
    H --> M[Git<br/>coupling, churn, blame]
    H --> N[Shell<br/>bounded execution]
    H --> O[Output Files<br/>.code-context/ directory]
    H --> P[context7 MCP<br/>library docs]
    D -.-> Q[FastMCP Server<br/>MCP protocol]
    Q --> R[Claude Code / Cursor<br/>MCP clients]

Component Layout

src/code_context_agent/
├── cli.py              # CLI entry point (cyclopts)
├── config.py           # Configuration (pydantic-settings)
├── agent/              # Agent orchestration
│   ├── factory.py      # Agent creation with tools + structured output
│   ├── runner.py       # Analysis runner with event streaming
│   ├── prompts.py      # Jinja2 template rendering
│   └── hooks.py        # HookProviders: quality, efficiency, fail-fast (full mode)
├── templates/          # Jinja2 prompt templates
│   ├── system.md.j2    # Unified system prompt
│   ├── partials/       # Composable prompt sections
│   └── steering/       # Quality guidance fragments
├── models/             # Pydantic models
│   ├── base.py         # StrictModel, FrozenModel
│   └── output.py       # AnalysisResult, BusinessLogicItem, etc.
├── mcp/                # FastMCP v3 server
│   ├── __init__.py     # Package init
│   └── server.py       # MCP tools, resources, and server definition
├── consumer/           # Event display (Rich TUI)
│   ├── base.py         # EventConsumer ABC
│   ├── phases.py       # 10-phase detection, discovery events (v7)
│   ├── rich_consumer.py # Dashboard with phase indicator + discovery feed
│   └── state.py        # AgentDisplayState with phase/discovery tracking
├── tools/              # Analysis tools (45+)
│   ├── discovery.py    # ripgrep, repomix, write_file (11 tools)
│   ├── astgrep.py      # ast-grep (3 tools)
│   ├── git.py          # git history (7 tools)
│   ├── shell_tool.py   # Shell with security hardening
│   ├── clones.py       # Clone detection via jscpd
│   ├── validation.py   # Input validation (path traversal, injection prevention)
│   ├── lsp/            # LSP integration (8 tools)
│   └── graph/          # NetworkX analysis (14 tools)
└── rules/              # ast-grep rule packs

Key Design Decisions

Agent Framework: Strands

The agent uses the Strands Agents SDK with Claude Opus 4.6 via Amazon Bedrock. Strands provides:

Tool registration and dispatch
Structured output via Pydantic models
Event streaming for phase-aware progress display (10 phases + discovery feed)
HookProviders for quality and efficiency guardrails

Prompt Architecture: Jinja2 Templates

The system prompt is composed from modular Jinja2 templates:

system.md.j2 -- Unified entry point that includes all partials
partials/ -- Composable sections (rules, business logic, output format, tool-specific guidance)
steering/ -- Quality fragments (size limits, conciseness, anti-patterns, tool efficiency)

This allows the prompt to adapt based on detected codebase characteristics without maintaining multiple monolithic prompts.
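The composition described above can be sketched with Jinja2's `DictLoader`, which serves templates from memory so the example runs without the real `templates/` directory. Only `system.md.j2`, `_size_limits.md.j2` and `_full_mode.md.j2` are named in this document; the rules partial, the directory placement of the steering files, the `render_system_prompt` helper, the `"standard"` mode name and all template text are assumptions made for illustration.

```python
from jinja2 import DictLoader, Environment

# In-memory stand-ins for the files under templates/. The rules partial,
# the file placement and every line of template text are illustrative.
TEMPLATES = {
    "system.md.j2": (
        "You analyze a codebase and report its structure.\n"
        "{% include 'partials/_rules.md.j2' %}\n"
        "{% if mode == 'full' %}"
        "{% include 'steering/_full_mode.md.j2' %}"
        "{% else %}"
        "{% include 'steering/_size_limits.md.j2' %}"
        "{% endif %}"
    ),
    "partials/_rules.md.j2": "Rules: cite file paths for every claim.",
    "steering/_size_limits.md.j2": "Steering: keep each section short.",
    "steering/_full_mode.md.j2": "Steering: cover every module exhaustively.",
}

env = Environment(loader=DictLoader(TEMPLATES))


def render_system_prompt(mode: str = "standard") -> str:
    """Render the unified entry point; includes are resolved by the loader."""
    return env.get_template("system.md.j2").render(mode=mode)


print(render_system_prompt("full"))
```

Because the entry point only includes fragments, adding a new steering rule means adding one file and one `include` line rather than editing several full prompts.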

Five Signal Layers

The analysis combines five distinct signal sources, following Tenet 2 ("Layer signals, read less"):

Static structure (AST/types) -- ast-grep patterns, LSP symbols
Dynamic relationships (call graphs) -- LSP references, definitions
Temporal evolution (git history) -- churn, coupling, blame
Compressed abstractions (signatures) -- Tree-sitter compression via repomix
Human intent (naming, commits) -- commit messages, file naming patterns
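As one illustration of the temporal layer, churn and coupling can be derived from commit history alone. The sketch below parses text in the shape printed by `git log --name-only --format="commit %h"` and counts how often each file changes (churn) and how often two files change in the same commit (coupling). The sample log and the function names are hypothetical, and the project's own git tools may compute these metrics differently.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample, shaped like: git log --name-only --format="commit %h"
SAMPLE_LOG = """\
commit a1

src/cli.py
src/config.py

commit b2

src/cli.py
src/agent/runner.py

commit c3

src/cli.py
src/config.py
"""


def parse_commits(log_text):
    """Return one set of changed file paths per commit."""
    commits = []
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("commit "):
            current = set()
            commits.append(current)
        elif line and current is not None:
            current.add(line)
    return commits


def churn_and_coupling(commits):
    """Count per-file changes and per-pair co-changes."""
    churn = Counter()
    coupling = Counter()
    for files in commits:
        churn.update(files)
        # Sorting makes (a, b) and (b, a) land on the same key.
        for pair in combinations(sorted(files), 2):
            coupling[pair] += 1
    return churn, coupling


churn, coupling = churn_and_coupling(parse_commits(SAMPLE_LOG))
print(churn.most_common(1))
print(coupling.most_common(1))
```

Files that change together often but share no import edge are a typical case where this layer adds information the static layers miss.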

Graph-First Ranking

Files are ranked by graph metrics rather than heuristics, following Tenet 1 ("Measure, don't guess"):

Betweenness centrality -- identifies bridge/bottleneck files
PageRank/TrustRank -- identifies foundational modules
Louvain/Leiden communities -- detects module boundaries
Triangle detection -- finds tightly coupled triads
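To make the ranking idea concrete, here is a minimal sketch of PageRank by power iteration over a small import graph. The project runs these metrics through NetworkX; this version is written in plain Python so that it has no dependencies, and the module names and edges are hypothetical. An edge points from the importing module to the imported one, so rank accumulates on modules that many others depend on.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly,
        # which keeps the ranks summing to 1.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical import graph: importer -> imported
IMPORTS = [
    ("cli", "runner"),
    ("cli", "config"),
    ("runner", "factory"),
    ("runner", "config"),
    ("factory", "config"),
    ("factory", "hooks"),
]

ranks = pagerank(IMPORTS)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{name:8s} {ranks[name]:.3f}")
```

In this toy graph `config` ranks first because three other modules import it, which matches the "foundational modules" reading of PageRank given above.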

Structured Output

The agent produces a Pydantic-typed AnalysisResult rather than freeform text, following Tenet 5 ("Machines read it first"). This enables downstream agents to parse the output programmatically.
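A sketch of what programmatic parsing can look like on the consumer side, using Pydantic v2. The class names `StrictModel`, `AnalysisResult` and `BusinessLogicItem` come from the component layout, but every field shown here, and the choice to forbid unknown keys, is an assumption made for illustration rather than the project's actual schema.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class StrictModel(BaseModel):
    # Assumed behaviour: unknown keys are rejected so that malformed
    # output fails loudly instead of being silently dropped.
    model_config = ConfigDict(extra="forbid")


class BusinessLogicItem(StrictModel):
    name: str  # hypothetical field
    file: str  # hypothetical field


class AnalysisResult(StrictModel):
    summary: str  # hypothetical field
    business_logic: list[BusinessLogicItem] = []  # hypothetical field


raw = (
    '{"summary": "CLI code analysis agent", '
    '"business_logic": [{"name": "run_analysis", "file": "agent/runner.py"}]}'
)
result = AnalysisResult.model_validate_json(raw)
print(result.business_logic[0].file)

try:
    AnalysisResult.model_validate_json('{"summary": "x", "unexpected": 1}')
except ValidationError as exc:
    print(f"rejected: {exc.error_count()} error(s)")
```

A downstream agent gets typed attribute access and a clear validation error, rather than having to scrape values out of prose.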

MCP Server (FastMCP v3)

The mcp/ package exposes the core differentiators via the Model Context Protocol, enabling coding agents (Claude Code, Cursor, etc.) to use the analysis capabilities directly:

Tools: start_analysis/check_analysis (kickoff/poll), query_code_graph (10 algorithms), explore_code_graph (progressive disclosure), get_graph_stats
Resources: Read-only access to analysis artifacts via analysis:// URI templates
Transport: stdio (default, for local MCP clients) or HTTP (for networked access)

Commodity tools (ripgrep, LSP, git, ast-grep) are intentionally not exposed -- they are already available in every coding agent's MCP marketplace.

context7 MCP Integration

The analysis agent loads context7 documentation tools via strands.tools.mcp.MCPClient, enabling library documentation lookup during analysis. This is controlled by CODE_CONTEXT_CONTEXT7_ENABLED (default: true) and requires npx.

Mode-Aware Pipeline (v7)

The --full flag triggers exhaustive analysis. The mode is threaded through the entire pipeline:

CLI (--full) → _derive_mode() → run_analysis(mode=) → create_agent(mode=)
                                      ↓                       ↓
                              RichEventConsumer(mode=)   get_prompt(mode=)
                              (phase indicator,          create_all_hooks(full_mode=)
                               discovery feed,           (+ FailFastHook)
                               mode badge)

FailFastHook raises FullModeToolError when non-exempt tools return errors in full mode
Phase detection maps tool calls to 10 analysis phases for TUI progress display
Discovery feed extracts notable findings from tool results (file counts, symbol counts, hotspots)
Mode-aware prompt switches between size-limited (_size_limits.md.j2) and exhaustive (_full_mode.md.j2) steering directives
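The fail-fast behaviour can be sketched independently of the Strands hook API. In the example below, `FailFastHook`, `FullModeToolError` and `create_all_hooks(full_mode=)` are names from this document, while the `after_tool` method, the `status` key on the tool result and the contents of the exemption list are assumptions chosen to keep the sketch self-contained.

```python
class FullModeToolError(RuntimeError):
    """Raised when a non-exempt tool fails during a full-mode run."""


# Hypothetical exemption list: tools whose failure should not abort a run.
EXEMPT_TOOLS = frozenset({"context7_lookup"})


class FailFastHook:
    def __init__(self, exempt_tools=EXEMPT_TOOLS):
        self.exempt_tools = exempt_tools

    def after_tool(self, tool_name, result):
        """Abort the run on the first error from a non-exempt tool."""
        failed = result.get("status") == "error"
        if failed and tool_name not in self.exempt_tools:
            raise FullModeToolError(f"{tool_name} failed in full mode")


def create_all_hooks(full_mode=False):
    hooks = []  # quality and efficiency hooks omitted from this sketch
    if full_mode:
        hooks.append(FailFastHook())
    return hooks


for hook in create_all_hooks(full_mode=True):
    hook.after_tool("context7_lookup", {"status": "error"})  # exempt, ignored
    try:
        hook.after_tool("rg_search", {"status": "error"})
    except FullModeToolError as exc:
        print(exc)
```

Outside full mode the hook is never registered, so a failing tool is left for the agent to work around instead of ending the run.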

See Full Mode and TUI Phases for details.

Security Model

The agent operates within a defense-in-depth security boundary:

Shell allowlist -- Only a curated set of read-only programs can execute via the shell tool. Write operations, network commands, and shell operators are blocked.
Input validation -- All tool inputs (paths, patterns, globs) pass through validation functions that prevent path traversal and command injection.
Path containment -- File operations are validated to stay within the repository root.
Git read-only -- Git subcommands are restricted to read-only operations (log, diff, blame, status, etc.).
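A minimal sketch of how three of these layers can be enforced, assuming a small allowlist and a blunt operator check. The program list, the operator list and both function names are hypothetical, the git subcommands are limited to the four named above, and the project's `validation.py` and `shell_tool.py` may work differently.

```python
import shlex
from pathlib import Path

ALLOWED_PROGRAMS = frozenset({"ls", "cat", "wc", "rg", "git"})  # hypothetical
READ_ONLY_GIT = frozenset({"log", "diff", "blame", "status"})
SHELL_OPERATORS = (";", "&", "|", ">", "<", "`", "$(")


def validate_command(command: str) -> list[str]:
    """Return argv for an allowed read-only command, else raise ValueError."""
    if any(op in command for op in SHELL_OPERATORS):
        raise ValueError("shell operators are not allowed")
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_PROGRAMS:
        raise ValueError("program is not on the allowlist")
    if argv[0] == "git" and (len(argv) < 2 or argv[1] not in READ_ONLY_GIT):
        raise ValueError("git subcommand is not read-only")
    return argv


def resolve_inside(root: Path, candidate: str) -> Path:
    """Resolve candidate against root and reject anything that escapes it."""
    root = root.resolve()
    resolved = (root / candidate).resolve()
    if not resolved.is_relative_to(root):  # Python 3.9+
        raise ValueError("path escapes the repository root")
    return resolved


print(validate_command("git log --oneline"))
print(resolve_inside(Path.cwd(), "src/cli.py"))
```

Returning an argv list means the command can be run without a shell, so operators that slip past the string check are never interpreted. The operator check is deliberately crude: it also rejects a quoted regex alternation, which a fuller implementation would need to handle.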

See the Security documentation for full details on the security model and CI pipeline.