code_context_agent.tools.graph

Code graph analysis package.

Provides tools for building and analyzing code graphs using NetworkX:

- Graph construction from LSP, AST-grep, and ripgrep results
- Analysis algorithms (clustering, centrality, traversal)
- Progressive disclosure for AI context generation
- Export to Mermaid and JSON formats

CodeAnalyzer

CodeAnalyzer(graph)

Analyzer for code graphs using NetworkX algorithms.

Provides methods for finding important code (centrality), detecting logical modules (clustering), and analyzing relationships between code elements.

Initialize the analyzer with a code graph.

Parameters:

    graph (CodeGraph, required): The CodeGraph to analyze.
Source code in src/code_context_agent/tools/graph/analysis.py
def __init__(self, graph: CodeGraph) -> None:
    """Initialize the analyzer with a code graph.

    Args:
        graph: The CodeGraph to analyze
    """
    self.graph = graph

find_hotspots

find_hotspots(top_k=10)

Find code hotspots using betweenness centrality.

Hotspots are code elements that lie on many shortest paths between other elements - they are often bottlenecks or central integration points.

Parameters:

    top_k (int, default 10): Number of top hotspots to return.

Returns:

    list[dict[str, Any]]: List of dictionaries with node info and betweenness score.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_hotspots(self, top_k: int = 10) -> list[dict[str, Any]]:
    """Find code hotspots using betweenness centrality.

    Hotspots are code elements that lie on many shortest paths
    between other elements - they are often bottlenecks or
    central integration points.

    Args:
        top_k: Number of top hotspots to return

    Returns:
        List of dictionaries with node info and betweenness score
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.REFERENCES])

    if view.number_of_nodes() == 0:
        return []

    try:
        betweenness = nx.betweenness_centrality(view, weight="weight")
    except nx.NetworkXError:
        return []

    return self._format_ranked_results(betweenness, top_k)
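The core of find_hotspots can be sketched with plain NetworkX (the graph and names here are illustrative, not the package's CodeGraph):

```python
import networkx as nx

# Tiny directed call graph: "handler" sits between the entry points and the
# helpers, so every shortest path runs through it and it scores highest.
g = nx.DiGraph()
g.add_edges_from([
    ("cli", "handler"), ("api", "handler"),
    ("handler", "db"), ("handler", "cache"),
])

betweenness = nx.betweenness_centrality(g)
top = max(betweenness, key=betweenness.get)
print(top)  # handler
```

In the real method the view is restricted to CALLS and REFERENCES edges and weighted, but the ranking idea is the same.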

find_foundations

find_foundations(top_k=10)

Find foundational code using PageRank.

Foundations are code elements that are heavily depended upon by other important code - the core infrastructure.

Parameters:

    top_k (int, default 10): Number of top foundations to return.

Returns:

    list[dict[str, Any]]: List of dictionaries with node info and PageRank score.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_foundations(self, top_k: int = 10) -> list[dict[str, Any]]:
    """Find foundational code using PageRank.

    Foundations are code elements that are heavily depended upon
    by other important code - the core infrastructure.

    Args:
        top_k: Number of top foundations to return

    Returns:
        List of dictionaries with node info and PageRank score
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.IMPORTS])

    if view.number_of_nodes() == 0:
        return []

    try:
        pagerank = nx.pagerank(view, alpha=0.85, weight="weight")
    except nx.NetworkXError:
        return []

    return self._format_ranked_results(pagerank, top_k)
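A minimal standalone sketch of the PageRank idea behind find_foundations (illustrative graph, not the package's API): with edges pointing caller to callee, rank mass flows toward heavily depended-upon code.

```python
import networkx as nx

# Everything ultimately calls into "config", so it should rank first.
g = nx.DiGraph()
g.add_edges_from([
    ("app", "service"), ("app", "config"),
    ("service", "config"), ("worker", "config"),
])

scores = nx.pagerank(g, alpha=0.85)
top = max(scores, key=scores.get)
print(top)  # config
```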

find_trusted_foundations

find_trusted_foundations(seed_nodes=None, top_k=10)

Find foundational code using TrustRank (noise-resistant PageRank).

TrustRank propagates trust from seed nodes, making it more resistant to noise than standard PageRank. If no seed nodes provided, uses entry points as seeds.

Parameters:

    seed_nodes (list[str] | None, default None): List of trusted node IDs (defaults to entry points).
    top_k (int, default 10): Number of top results to return.

Returns:

    list[dict[str, Any]]: List of dictionaries with node info and trust score.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_trusted_foundations(
    self,
    seed_nodes: list[str] | None = None,
    top_k: int = 10,
) -> list[dict[str, Any]]:
    """Find foundational code using TrustRank (noise-resistant PageRank).

    TrustRank propagates trust from seed nodes, making it more resistant
    to noise than standard PageRank. If no seed nodes provided, uses
    entry points as seeds.

    Args:
        seed_nodes: List of trusted node IDs (defaults to entry points)
        top_k: Number of top results to return

    Returns:
        List of dictionaries with node info and trust score
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.IMPORTS])

    if view.number_of_nodes() == 0:
        return []

    # Use entry points as default seeds
    if not seed_nodes:
        entry_points = self.find_entry_points()
        seed_nodes = [ep["id"] for ep in entry_points[:5]]

    if not seed_nodes:
        return self.find_foundations(top_k)

    # Build personalization dict for TrustRank
    trust = dict.fromkeys(view.nodes(), 0.0)
    for seed in seed_nodes:
        if seed in trust:
            trust[seed] = 1.0 / len(seed_nodes)

    try:
        scores = nx.pagerank(view, alpha=0.85, personalization=trust, weight="weight")
    except nx.NetworkXError:
        return []

    return self._format_ranked_results(scores, top_k)
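TrustRank here is personalized PageRank: the random jump returns only to trusted seeds, so scores decay with distance from them and disconnected noise stays near zero. A standalone sketch (node names are illustrative):

```python
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([("main", "auth"), ("auth", "db"), ("orphan", "orphan2")])

# Build the personalization dict exactly as the method does:
# uniform trust over the seeds, zero everywhere else.
seeds = ["main"]
trust = dict.fromkeys(g.nodes(), 0.0)
for s in seeds:
    trust[s] = 1.0 / len(seeds)

scores = nx.pagerank(g, alpha=0.85, personalization=trust)
# "orphan" is unreachable from the seed, so its trust stays near zero.
print(scores["orphan"] < scores["db"])  # True
```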

find_entry_points

find_entry_points()

Find likely entry points in the code.

Entry points are nodes with no incoming call edges but outgoing calls - they initiate execution flow.

Returns:

    list[dict[str, Any]]: List of dictionaries with entry point node info.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_entry_points(self) -> list[dict[str, Any]]:
    """Find likely entry points in the code.

    Entry points are nodes with no incoming call edges but
    outgoing calls - they initiate execution flow.

    Returns:
        List of dictionaries with entry point node info
    """
    view = self.graph.get_view([EdgeType.CALLS])

    entry_points = []
    for node in view.nodes():
        in_deg = view.in_degree(node)
        out_deg = view.out_degree(node)

        # Entry point: no callers but makes calls
        if in_deg == 0 and out_deg > 0:
            node_data = self.graph.get_node_data(node)
            entry_points.append(
                {
                    "id": node,
                    "out_degree": out_deg,
                    **(node_data or {}),
                },
            )

    # Also check for main/run/start patterns
    for node, data in self.graph.nodes(data=True):
        name = str(data.get("name", "")).lower()
        if any(p in name for p in ("main", "__main__", "run", "start", "app", "cli")):
            if not any(ep["id"] == node for ep in entry_points):
                entry_points.append(
                    {
                        "id": node,
                        "out_degree": view.out_degree(node) if view.has_node(node) else 0,
                        **data,
                    },
                )

    # Sort by out_degree (more calls = more significant entry point)
    entry_points.sort(key=lambda x: x.get("out_degree", 0), reverse=True)

    return entry_points
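The degree test at the heart of this method is easy to reproduce on a bare NetworkX graph (names are illustrative):

```python
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([("main", "parse"), ("main", "run"), ("parse", "tokenize")])

# Entry point: no incoming call edges, at least one outgoing call.
entry_points = [n for n in g.nodes() if g.in_degree(n) == 0 and g.out_degree(n) > 0]
print(entry_points)  # ['main']
```

The name-pattern pass ("main", "run", "cli", ...) then catches entry points that are invoked by a framework rather than by other code in the graph.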

detect_modules

detect_modules(resolution=1.0)

Detect logical modules using Louvain community detection.

Uses the Louvain algorithm to find communities of densely connected code elements.

Parameters:

    resolution (float, default 1.0): Clustering resolution (< 1 = larger clusters, > 1 = smaller).

Returns:

    list[dict[str, Any]]: List of module dictionaries with members and metrics.

Source code in src/code_context_agent/tools/graph/analysis.py
def detect_modules(self, resolution: float = 1.0) -> list[dict[str, Any]]:
    """Detect logical modules using Louvain community detection.

    Uses the Louvain algorithm to find communities of densely
    connected code elements.

    Args:
        resolution: Clustering resolution (< 1 = larger clusters, > 1 = smaller)

    Returns:
        List of module dictionaries with members and metrics
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.IMPORTS])

    if view.number_of_nodes() < 2:
        return []

    # Louvain requires undirected graph
    undirected = view.to_undirected()

    try:
        # Try Leiden first (better community quality, requires backend)
        communities = nx.community.leiden_communities(undirected, resolution=resolution, seed=42)
    except (NotImplementedError, nx.NetworkXError, ValueError, RuntimeError):
        try:
            # Fallback to Louvain (pure NetworkX)
            communities = nx.community.louvain_communities(undirected, resolution=resolution, seed=42)
        except (nx.NetworkXError, ValueError, RuntimeError):
            return []

    modules = []
    for i, community in enumerate(communities):
        community_list = list(community)

        # Get key nodes (highest PageRank within community)
        subgraph = view.subgraph(community_list)
        if subgraph.number_of_nodes() > 0:
            try:
                local_pr = nx.pagerank(subgraph)
                key_nodes = sorted(local_pr.items(), key=lambda x: x[1], reverse=True)[:3]
            except (nx.NetworkXError, ValueError, RuntimeError):
                key_nodes = [(n, 0) for n in community_list[:3]]
        else:
            key_nodes = []

        # Calculate cohesion (internal/external edge ratio)
        cohesion = self._calculate_cohesion(view, community)

        modules.append(
            {
                "module_id": i,
                "size": len(community_list),
                "key_nodes": [{"id": n, "score": s} for n, s in key_nodes],
                "members": community_list,
                "cohesion": cohesion,
            },
        )

    # Sort by size (largest modules first)
    modules.sort(key=lambda x: x["size"], reverse=True)

    return modules
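The Louvain fallback path can be sketched directly (toy graph, not the package's CodeGraph): two dense groups joined by a single weak edge should come back as two communities.

```python
import networkx as nx

# Two triangles bridged by one edge.
g = nx.Graph()
g.add_edges_from([
    ("a", "b"), ("b", "c"), ("a", "c"),   # cluster 1
    ("x", "y"), ("y", "z"), ("x", "z"),   # cluster 2
    ("c", "x"),                            # weak bridge
])

communities = nx.community.louvain_communities(g, resolution=1.0, seed=42)
print(sorted(len(c) for c in communities))  # [3, 3]
```

Raising resolution above 1.0 biases the algorithm toward more, smaller communities; lowering it merges them.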

find_clusters_by_pattern

find_clusters_by_pattern(rule_id)

Find clusters of nodes matching a specific AST-grep rule.

Groups nodes by their rule_id metadata to find related business logic patterns.

Parameters:

    rule_id (str, required): The rule identifier to filter by.

Returns:

    list[dict[str, Any]]: List of matching nodes grouped by file.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_clusters_by_pattern(self, rule_id: str) -> list[dict[str, Any]]:
    """Find clusters of nodes matching a specific AST-grep rule.

    Groups nodes by their rule_id metadata to find related
    business logic patterns.

    Args:
        rule_id: The rule identifier to filter by

    Returns:
        List of matching nodes grouped by file
    """
    matching_nodes: dict[str, list[dict[str, Any]]] = {}

    for node_id, data in self.graph.nodes(data=True):
        if data.get("rule_id") == rule_id:
            file_path = data.get("file_path", "unknown")
            if file_path not in matching_nodes:
                matching_nodes[file_path] = []
            matching_nodes[file_path].append({"id": node_id, **data})

    return [{"file": f, "matches": m, "count": len(m)} for f, m in matching_nodes.items()]

find_clusters_by_category

find_clusters_by_category(category)

Find all nodes matching a business logic category.

Parameters:

    category (str, required): Category to filter by (e.g., "db", "auth", "http").

Returns:

    list[dict[str, Any]]: List of matching nodes with their locations.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_clusters_by_category(self, category: str) -> list[dict[str, Any]]:
    """Find all nodes matching a business logic category.

    Args:
        category: Category to filter by (e.g., "db", "auth", "http")

    Returns:
        List of matching nodes with their locations
    """
    matches = []

    for node_id, data in self.graph.nodes(data=True):
        if data.get("category") == category:
            matches.append({"id": node_id, **data})

    return matches

find_triangles

find_triangles(top_k=10)

Find tightly-coupled code triads using triangle detection.

Triangles in the call/import graph indicate three pieces of code that all depend on each other — potential cohesion or coupling issues.

Parameters:

    top_k (int, default 10): Maximum number of triangles to return.

Returns:

    list[dict[str, Any]]: List of triangle dictionaries with the three node IDs.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_triangles(self, top_k: int = 10) -> list[dict[str, Any]]:
    """Find tightly-coupled code triads using triangle detection.

    Triangles in the call/import graph indicate three pieces of code
    that all depend on each other — potential cohesion or coupling issues.

    Args:
        top_k: Maximum number of triangles to return

    Returns:
        List of triangle dictionaries with the three node IDs
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.IMPORTS])
    undirected = view.to_undirected()

    triangles = []
    try:
        for triangle in nx.enumerate_all_cliques(undirected):
            if len(triangle) == 3:
                triangles.append(
                    {
                        "nodes": list(triangle),
                        "node_details": [{"id": n, **(self.graph.get_node_data(n) or {})} for n in triangle],
                    },
                )
                if len(triangles) >= top_k:
                    break
    except nx.NetworkXError:
        pass  # graph structure doesn't support triangle detection (e.g. directed)

    return triangles
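The clique enumeration trick used here is worth seeing in isolation (illustrative graph): the directed view is flattened to undirected, and every 3-clique is a triad of mutually connected elements.

```python
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")])
undirected = g.to_undirected()

# enumerate_all_cliques yields cliques in increasing size;
# keep only the 3-cliques (triangles).
triangles = [c for c in nx.enumerate_all_cliques(undirected) if len(c) == 3]
print([sorted(t) for t in triangles])  # [['a', 'b', 'c']]
```

Note "d" touches only "a", so it forms no triangle. Because cliques are yielded smallest-first, the method can break out early once top_k triangles are collected.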

get_similar_nodes

get_similar_nodes(node_id, top_k=5)

Find nodes similar to a given node based on graph structure.

Uses personalized PageRank to find nodes closely related to the target node.

Parameters:

    node_id (str, required): The node to find similar nodes for.
    top_k (int, default 5): Number of similar nodes to return.

Returns:

    list[dict[str, Any]]: List of similar nodes with similarity scores.

Source code in src/code_context_agent/tools/graph/analysis.py
def get_similar_nodes(self, node_id: str, top_k: int = 5) -> list[dict[str, Any]]:
    """Find nodes similar to a given node based on graph structure.

    Uses personalized PageRank to find nodes closely related
    to the target node.

    Args:
        node_id: The node to find similar nodes for
        top_k: Number of similar nodes to return

    Returns:
        List of similar nodes with similarity scores
    """
    view = self.graph.get_view()

    if not view.has_node(node_id):
        return []

    try:
        # Personalized PageRank with target node as seed
        ppr = nx.pagerank(view, personalization={node_id: 1}, alpha=0.85)
    except nx.NetworkXError:
        return []

    # Remove self, sort by score
    del ppr[node_id]
    ranked = sorted(ppr.items(), key=lambda x: x[1], reverse=True)[:top_k]

    return [{"id": n, "similarity": s, **(self.graph.get_node_data(n) or {})} for n, s in ranked if s > 0]

calculate_coupling

calculate_coupling(node_a, node_b)

Calculate coupling strength between two nodes.

Considers shared neighbors, direct edges, and path length.

Parameters:

    node_a (str, required): First node ID.
    node_b (str, required): Second node ID.

Returns:

    dict[str, Any]: Dictionary with coupling metrics.

Source code in src/code_context_agent/tools/graph/analysis.py
def calculate_coupling(self, node_a: str, node_b: str) -> dict[str, Any]:
    """Calculate coupling strength between two nodes.

    Considers shared neighbors, direct edges, and path length.

    Args:
        node_a: First node ID
        node_b: Second node ID

    Returns:
        Dictionary with coupling metrics
    """
    view = self.graph.get_view()

    if not view.has_node(node_a) or not view.has_node(node_b):
        return {"error": "Node not found", "coupling": 0.0}

    # Direct edge count
    direct_edges = 0
    if view.has_edge(node_a, node_b):
        direct_edges += 1
    if view.has_edge(node_b, node_a):
        direct_edges += 1

    # Shared neighbors
    neighbors_a = set(view.successors(node_a)) | set(view.predecessors(node_a))
    neighbors_b = set(view.successors(node_b)) | set(view.predecessors(node_b))
    shared = neighbors_a & neighbors_b

    # Shortest path length
    try:
        path_length = nx.shortest_path_length(view.to_undirected(), node_a, node_b)
    except nx.NetworkXNoPath:
        path_length = float("inf")

    # Calculate coupling score (higher = more coupled)
    coupling = direct_edges * 2.0 + len(shared) * 0.5 + (1.0 / (path_length + 1))

    return {
        "node_a": node_a,
        "node_b": node_b,
        "direct_edges": direct_edges,
        "shared_neighbors": len(shared),
        "path_length": path_length if path_length != float("inf") else None,
        "coupling": coupling,
    }

get_dependency_chain

get_dependency_chain(
    node_id, direction="outgoing", max_depth=5
)

Get the dependency chain from/to a node.

Parameters:

    node_id (str, required): Starting node.
    direction (str, default "outgoing"): "outgoing" (what this depends on) or "incoming" (what depends on this).
    max_depth (int, default 5): Maximum depth to traverse.

Returns:

    dict[str, Any]: Dictionary with nodes and edges in the chain.

Source code in src/code_context_agent/tools/graph/analysis.py
def get_dependency_chain(self, node_id: str, direction: str = "outgoing", max_depth: int = 5) -> dict[str, Any]:
    """Get the dependency chain from/to a node.

    Args:
        node_id: Starting node
        direction: "outgoing" (what this depends on) or "incoming" (what depends on this)
        max_depth: Maximum depth to traverse

    Returns:
        Dictionary with nodes and edges in the chain
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.IMPORTS])

    if not view.has_node(node_id):
        return {"error": "Node not found"}

    if direction == "outgoing":
        nodes = dict(nx.single_source_shortest_path_length(view, node_id, cutoff=max_depth))
    else:
        # Incoming: traverse reverse graph
        reverse = view.reverse()
        nodes = dict(nx.single_source_shortest_path_length(reverse, node_id, cutoff=max_depth))

    # Get edges within the discovered nodes
    subgraph = view.subgraph(nodes.keys())
    edges = list(subgraph.edges(data=True))

    return {
        "root": node_id,
        "direction": direction,
        "depth": max_depth,
        "nodes": [{"id": n, "distance": d, **(self.graph.get_node_data(n) or {})} for n, d in nodes.items()],
        "edges": [{"source": u, "target": v, **d} for u, v, d in edges],
    }

find_unused_symbols

find_unused_symbols(node_types=None, exclude_patterns=None)

Find symbols with zero incoming cross-file references.

Identifies functions, classes, and methods that are defined but never referenced from other files — dead code candidates.

Parameters:

    node_types (list[str] | None, default None): Filter to specific types (default: function, class, method).
    exclude_patterns (list[str] | None, default None): Regex patterns to exclude from results.

Returns:

    list[dict[str, Any]]: List of unused symbol dicts with id, name, file_path, node_type.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_unused_symbols(
    self,
    node_types: list[str] | None = None,
    exclude_patterns: list[str] | None = None,
) -> list[dict[str, Any]]:
    """Find symbols with zero incoming cross-file references.

    Identifies functions, classes, and methods that are defined but
    never referenced from other files — dead code candidates.

    Args:
        node_types: Filter to specific types (default: function, class, method)
        exclude_patterns: Regex patterns to exclude from results

    Returns:
        List of unused symbol dicts with id, name, file_path, node_type
    """
    target_types = (
        set(node_types)
        if node_types
        else {
            NodeType.FUNCTION.value,
            NodeType.CLASS.value,
            NodeType.METHOD.value,
        }
    )
    default_excludes = [r"^test_", r"^_", r"__init__", r"__main__"]
    excludes = [re.compile(p) for p in (exclude_patterns or default_excludes)]

    view = self.graph.get_view([EdgeType.REFERENCES, EdgeType.CALLS, EdgeType.IMPORTS])

    unused = []
    for node_id, data in self.graph.nodes(data=True):
        if data.get("node_type") not in target_types:
            continue

        name = str(data.get("name", ""))
        if any(pat.search(name) for pat in excludes):
            continue

        node_file = data.get("file_path", "")
        if not node_file:
            continue

        # Count incoming edges from OTHER files
        cross_file_refs = 0
        if view.has_node(node_id):
            for pred in view.predecessors(node_id):
                pred_data = self.graph.get_node_data(pred)
                pred_file = (pred_data or {}).get("file_path", "")
                if pred_file and pred_file != node_file:
                    cross_file_refs += 1
                    break  # One is enough to disqualify

        if cross_file_refs == 0:
            unused.append(
                {
                    "id": node_id,
                    "name": name,
                    "file_path": node_file,
                    "node_type": data.get("node_type"),
                    "line_start": data.get("line_start", 0),
                },
            )

    unused.sort(key=lambda x: (x["file_path"], x.get("line_start", 0)))
    return unused

find_refactoring_candidates

find_refactoring_candidates(top_k=10)

Identify refactoring opportunities by combining multiple signals.

Combines three signals:

- Clone pairs (SIMILAR_TO edges) -> "extract shared helper"
- Code smell pattern matches (rule_id contains "code_smell") -> structural issues
- Unused symbols -> "dead code removal"

Parameters:

    top_k (int, default 10): Maximum number of candidates to return.

Returns:

    list[dict[str, Any]]: Ranked list of refactoring candidates with type, files, and rationale.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_refactoring_candidates(self, top_k: int = 10) -> list[dict[str, Any]]:  # noqa: C901
    """Identify refactoring opportunities by combining multiple signals.

    Combines:
    - Clone pairs (SIMILAR_TO edges) -> "extract shared helper"
    - Code smell pattern matches (rule_id contains "code_smell") -> structural issues
    - Unused symbols -> "dead code removal"

    Args:
        top_k: Maximum number of candidates to return

    Returns:
        Ranked list of refactoring candidates with type, files, and rationale.
    """
    candidates: list[dict[str, Any]] = []

    # 1. Clone groups from SIMILAR_TO edges
    similar_edges = self.graph.get_edges_by_type(EdgeType.SIMILAR_TO)
    clone_groups: dict[str, list[str]] = {}
    for source, target, data in similar_edges:
        key = f"{source}|{target}" if source < target else f"{target}|{source}"
        if key not in clone_groups:
            clone_groups[key] = [source, target]
            candidates.append(
                {
                    "type": "extract_helper",
                    "pattern": f"Duplicate code between {source} and {target}",
                    "files": [source, target],
                    "occurrence_count": 2,
                    "duplicated_lines": int(data.get("duplicated_lines", 0)),
                    "score": int(data.get("duplicated_lines", 5)) * 2.0,
                },
            )

    # 2. Code smell patterns
    smell_counts: dict[str, list[str]] = {}
    for node_id, data in self.graph.nodes(data=True):
        rule_id = data.get("rule_id", "")
        note = data.get("note", "")
        if "code_smell" in note or "code_smell" in rule_id:
            if rule_id not in smell_counts:
                smell_counts[rule_id] = []
            smell_counts[rule_id].append(data.get("file_path", node_id))

    for rule_id, files in smell_counts.items():
        candidates.append(
            {
                "type": "code_smell",
                "pattern": rule_id,
                "files": list(set(files)),
                "occurrence_count": len(files),
                "duplicated_lines": 0,
                "score": len(files) * 1.5,
            },
        )

    # 3. Unused symbols
    unused = self.find_unused_symbols()
    if unused:
        # Group by file
        by_file: dict[str, list[str]] = {}
        for sym in unused:
            fp = sym["file_path"]
            if fp not in by_file:
                by_file[fp] = []
            by_file[fp].append(sym["name"])

        for fp, names in by_file.items():
            candidates.append(
                {
                    "type": "dead_code",
                    "pattern": f"{len(names)} unused symbol(s) in {fp}",
                    "files": [fp],
                    "occurrence_count": len(names),
                    "duplicated_lines": 0,
                    "score": len(names) * 1.0,
                },
            )

    # Sort by score descending, return top_k
    candidates.sort(key=lambda x: x["score"], reverse=True)
    return candidates[:top_k]

ProgressiveExplorer

ProgressiveExplorer(graph, analyzer=None)

Staged exploration of code graph for AI context generation.

Tracks what has been explored and suggests next steps for progressive disclosure of codebase structure.

Initialize the explorer.

Parameters:

    graph (CodeGraph, required): The CodeGraph to explore.
    analyzer (CodeAnalyzer | None, default None): Optional CodeAnalyzer (created if not provided).
Source code in src/code_context_agent/tools/graph/disclosure.py
def __init__(self, graph: CodeGraph, analyzer: CodeAnalyzer | None = None) -> None:
    """Initialize the explorer.

    Args:
        graph: The CodeGraph to explore
        analyzer: Optional CodeAnalyzer (created if not provided)
    """
    self.graph = graph
    self.analyzer = analyzer or CodeAnalyzer(graph)
    self.explored: set[str] = set()

get_overview

get_overview()

Get high-level codebase structure (Level 0).

Provides entry points, hotspots, modules, and foundations for initial orientation.

Returns:

    dict[str, Any]: Dictionary with overview information.

Source code in src/code_context_agent/tools/graph/disclosure.py
def get_overview(self) -> dict[str, Any]:
    """Get high-level codebase structure (Level 0).

    Provides entry points, hotspots, modules, and foundations
    for initial orientation.

    Returns:
        Dictionary with overview information
    """
    entry_points = self.analyzer.find_entry_points()[:5]
    hotspots = self.analyzer.find_hotspots(5)
    modules = self.analyzer.detect_modules()
    foundations = self.analyzer.find_foundations(5)

    # Mark overview nodes as explored
    for ep in entry_points:
        self.explored.add(ep["id"])
    for hs in hotspots:
        self.explored.add(hs["id"])
    for found in foundations:
        self.explored.add(found["id"])

    return {
        "total_nodes": self.graph.node_count,
        "total_edges": self.graph.edge_count,
        "entry_points": entry_points,
        "hotspots": hotspots,
        "modules": [
            {
                "module_id": m["module_id"],
                "size": m["size"],
                "key_nodes": m["key_nodes"],
                "cohesion": m["cohesion"],
            }
            for m in modules
        ],
        "foundations": foundations,
        "explored_count": len(self.explored),
    }

expand_node

expand_node(node_id, depth=1)

Expand exploration from a specific node (Level 1+).

Uses BFS to discover nodes within the specified depth.

Parameters:

    node_id (str, required): The node to expand from.
    depth (int, default 1): Number of hops to expand.

Returns:

    dict[str, Any]: Dictionary with discovered nodes, edges, and suggestions.

Source code in src/code_context_agent/tools/graph/disclosure.py
def expand_node(self, node_id: str, depth: int = 1) -> dict[str, Any]:
    """Expand exploration from a specific node (Level 1+).

    Uses BFS to discover nodes within the specified depth.

    Args:
        node_id: The node to expand from
        depth: Number of hops to expand

    Returns:
        Dictionary with discovered nodes, edges, and suggestions
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.REFERENCES])

    if not view.has_node(node_id):
        return {"error": f"Node not found: {node_id}"}

    # BFS expansion
    try:
        distances = dict(nx.single_source_shortest_path_length(view, node_id, cutoff=depth))
    except nx.NetworkXError:
        distances = {node_id: 0}

    # Get the subgraph
    subgraph = view.subgraph(distances.keys())

    # Mark as explored
    self.explored.update(distances.keys())

    # Get node data
    discovered_nodes = []
    for n, dist in distances.items():
        node_data = self.graph.get_node_data(n) or {}
        discovered_nodes.append(
            {
                "id": n,
                "distance": dist,
                **node_data,
            },
        )

    # Get edges
    edges = [{"source": u, "target": v, **d} for u, v, d in subgraph.edges(data=True)]

    # Suggest next nodes to explore (high-degree nodes not yet explored)
    suggested_next = self._suggest_next_exploration(view, distances)

    return {
        "center": node_id,
        "depth": depth,
        "discovered_nodes": discovered_nodes,
        "edges": edges,
        "suggested_next": suggested_next,
        "explored_count": len(self.explored),
    }

expand_module

expand_module(module_id)

Explore an entire detected module.

Parameters:

    module_id (int, required): The module ID from detect_modules().

Returns:

    dict[str, Any]: Dictionary with module details and internal structure.

Source code in src/code_context_agent/tools/graph/disclosure.py
def expand_module(self, module_id: int) -> dict[str, Any]:
    """Explore an entire detected module.

    Args:
        module_id: The module ID from detect_modules()

    Returns:
        Dictionary with module details and internal structure
    """
    modules = self.analyzer.detect_modules()

    if module_id < 0 or module_id >= len(modules):
        return {"error": f"Module not found: {module_id}"}

    module = modules[module_id]
    members = module["members"]

    # Mark module members as explored
    self.explored.update(members)

    # Get internal structure
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.REFERENCES])
    subgraph = view.subgraph(members)

    # Detailed node info
    nodes = []
    for n in members:
        node_data = self.graph.get_node_data(n) or {}
        in_deg = subgraph.in_degree(n)
        out_deg = subgraph.out_degree(n)
        nodes.append(
            {
                "id": n,
                "in_degree": in_deg,
                "out_degree": out_deg,
                **node_data,
            },
        )

    # Sort by degree (most connected first)
    nodes.sort(key=lambda x: x["in_degree"] + x["out_degree"], reverse=True)

    # Internal edges
    edges = [{"source": u, "target": v, **d} for u, v, d in subgraph.edges(data=True)]

    # External connections (edges to/from outside the module)
    external_in = []
    external_out = []
    for member in members:
        for pred in view.predecessors(member):
            if pred not in members:
                external_in.append({"from": pred, "to": member})
        for succ in view.successors(member):
            if succ not in members:
                external_out.append({"from": member, "to": succ})

    return {
        "module_id": module_id,
        "size": len(members),
        "key_nodes": module["key_nodes"],
        "cohesion": module["cohesion"],
        "nodes": nodes[:20],  # Limit to top 20 by degree
        "edges": edges,
        "external_incoming": external_in[:10],
        "external_outgoing": external_out[:10],
        "explored_count": len(self.explored),
    }
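
The boundary scan above can be sketched with plain networkx (illustrative node IDs, not the package API): edges crossing the module boundary are found by checking each member's predecessors and successors against the member set.

```python
import networkx as nx

# Sketch of the external-connection scan in expand_module(): an edge is
# external when exactly one endpoint belongs to the module.
g = nx.DiGraph()
g.add_edge("mod/a.py:f", "mod/b.py:g")       # internal edge
g.add_edge("other.py:caller", "mod/a.py:f")  # external incoming
g.add_edge("mod/b.py:g", "other.py:sink")    # external outgoing

members = {"mod/a.py:f", "mod/b.py:g"}
external_in = [(p, m) for m in members for p in g.predecessors(m) if p not in members]
external_out = [(m, s) for m in members for s in g.successors(m) if s not in members]
print(external_in)   # [('other.py:caller', 'mod/a.py:f')]
print(external_out)  # [('mod/b.py:g', 'other.py:sink')]
```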

get_path_between

get_path_between(source, target)

Find shortest path between two nodes.

Parameters:

Name Type Description Default
source str

Source node ID

required
target str

Target node ID

required

Returns:

Type Description
dict[str, Any]

Dictionary with path information

Source code in src/code_context_agent/tools/graph/disclosure.py
def get_path_between(self, source: str, target: str) -> dict[str, Any]:
    """Find shortest path between two nodes.

    Args:
        source: Source node ID
        target: Target node ID

    Returns:
        Dictionary with path information
    """
    view = self.graph.get_view()

    if not view.has_node(source):
        return {"error": f"Source node not found: {source}"}
    if not view.has_node(target):
        return {"error": f"Target node not found: {target}"}

    try:
        path = nx.shortest_path(view, source, target)
    except nx.NetworkXNoPath:
        return {"path": None, "message": "No path found between nodes"}

    # Mark path as explored
    self.explored.update(path)

    # Get node data along path
    path_nodes = []
    for i, n in enumerate(path):
        node_data = self.graph.get_node_data(n) or {}
        path_nodes.append({"id": n, "position": i, **node_data})

    # Get edges along path
    path_edges = []
    for i in range(len(path) - 1):
        edge_data = {}
        if view.has_edge(path[i], path[i + 1]):
            edge_data = dict(view[path[i]][path[i + 1]])
        path_edges.append({"source": path[i], "target": path[i + 1], **edge_data})

    return {
        "path": path,
        "length": len(path) - 1,
        "nodes": path_nodes,
        "edges": path_edges,
        "explored_count": len(self.explored),
    }
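
The error handling above hinges on a NetworkX convention worth knowing: `nx.shortest_path` raises `NetworkXNoPath` for disconnected nodes rather than returning `None`. A minimal sketch with illustrative node IDs:

```python
import networkx as nx

# Mirror the pattern get_path_between() relies on: catch NetworkXNoPath
# explicitly instead of testing the return value.
g = nx.DiGraph()
g.add_edge("api.py:handler", "svc.py:process")
g.add_edge("svc.py:process", "db.py:save")
g.add_node("util.py:orphan")  # reachable from nothing

path = nx.shortest_path(g, "api.py:handler", "db.py:save")
print(path)  # ['api.py:handler', 'svc.py:process', 'db.py:save']

try:
    nx.shortest_path(g, "api.py:handler", "util.py:orphan")
except nx.NetworkXNoPath:
    print("no path")
```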

explore_category

explore_category(category)

Explore all nodes in a business logic category.

Parameters:

Name Type Description Default
category str

Category to explore (e.g., "db", "auth", "http")

required

Returns:

Type Description
dict[str, Any]

Dictionary with categorized nodes

Source code in src/code_context_agent/tools/graph/disclosure.py
def explore_category(self, category: str) -> dict[str, Any]:
    """Explore all nodes in a business logic category.

    Args:
        category: Category to explore (e.g., "db", "auth", "http")

    Returns:
        Dictionary with categorized nodes
    """
    matches = self.analyzer.find_clusters_by_category(category)

    # Mark as explored
    for m in matches:
        self.explored.add(m["id"])

    # Group by file
    by_file: dict[str, list[dict[str, Any]]] = {}
    for m in matches:
        file_path = m.get("file_path", "unknown")
        if file_path not in by_file:
            by_file[file_path] = []
        by_file[file_path].append(m)

    return {
        "category": category,
        "total_count": len(matches),
        "files_count": len(by_file),
        "by_file": [{"file": f, "matches": m, "count": len(m)} for f, m in by_file.items()],
        "explored_count": len(self.explored),
    }

get_exploration_status

get_exploration_status()

Get the current exploration status.

Returns:

Type Description
dict[str, Any]

Dictionary with exploration statistics

Source code in src/code_context_agent/tools/graph/disclosure.py
def get_exploration_status(self) -> dict[str, Any]:
    """Get the current exploration status.

    Returns:
        Dictionary with exploration statistics
    """
    total = self.graph.node_count
    explored = len(self.explored)

    return {
        "total_nodes": total,
        "explored_nodes": explored,
        "unexplored_nodes": total - explored,
        "coverage_percent": (explored / total * 100) if total > 0 else 0,
    }

reset_exploration

reset_exploration()

Reset exploration state to start fresh.

Source code in src/code_context_agent/tools/graph/disclosure.py
def reset_exploration(self) -> None:
    """Reset exploration state to start fresh."""
    self.explored.clear()

CodeEdge

Bases: FrozenModel

An edge in the code graph representing a relationship.

Attributes:

Name Type Description
source str

Source node ID

target str

Target node ID

edge_type EdgeType

Classification of the relationship

weight float

Edge weight for algorithms (default 1.0)

metadata dict[str, Any]

Additional properties (line where relationship occurs, etc.)

to_dict

to_dict()

Convert to dictionary for serialization.

Source code in src/code_context_agent/tools/graph/model.py
def to_dict(self) -> dict[str, Any]:
    """Convert to dictionary for serialization."""
    result = self.model_dump()
    result["edge_type"] = self.edge_type.value
    return result

CodeGraph

CodeGraph()

Multi-layer code graph supporting multiple relationship types.

Wraps a NetworkX MultiDiGraph to support: - Multiple edge types between the same node pair - Node/edge attributes for metadata - Filtered views for specific relationship types

Initialize an empty code graph.

Source code in src/code_context_agent/tools/graph/model.py
def __init__(self) -> None:
    """Initialize an empty code graph."""
    self._graph: nx.MultiDiGraph = nx.MultiDiGraph()

node_count property

node_count

Return the number of nodes.

edge_count property

edge_count

Return the number of edges.

add_node

add_node(node)

Add a node to the graph.

Parameters:

Name Type Description Default
node CodeNode

The CodeNode to add

required
Source code in src/code_context_agent/tools/graph/model.py
def add_node(self, node: CodeNode) -> None:
    """Add a node to the graph.

    Args:
        node: The CodeNode to add
    """
    self._graph.add_node(
        node.id,
        name=node.name,
        node_type=node.node_type.value,
        file_path=node.file_path,
        line_start=node.line_start,
        line_end=node.line_end,
        **node.metadata,
    )

add_edge

add_edge(edge)

Add an edge to the graph.

Parameters:

Name Type Description Default
edge CodeEdge

The CodeEdge to add

required
Source code in src/code_context_agent/tools/graph/model.py
def add_edge(self, edge: CodeEdge) -> None:
    """Add an edge to the graph.

    Args:
        edge: The CodeEdge to add
    """
    self._graph.add_edge(
        edge.source,
        edge.target,
        key=edge.edge_type.value,
        edge_type=edge.edge_type.value,
        weight=edge.weight,
        **edge.metadata,
    )

has_node

has_node(node_id)

Check if a node exists in the graph.

Source code in src/code_context_agent/tools/graph/model.py
def has_node(self, node_id: str) -> bool:
    """Check if a node exists in the graph."""
    return self._graph.has_node(node_id)

has_edge

has_edge(source, target, edge_type=None)

Check if an edge exists in the graph.

Parameters:

Name Type Description Default
source str

Source node ID

required
target str

Target node ID

required
edge_type EdgeType | None

Optional edge type to check for specifically

None
Source code in src/code_context_agent/tools/graph/model.py
def has_edge(self, source: str, target: str, edge_type: EdgeType | None = None) -> bool:
    """Check if an edge exists in the graph.

    Args:
        source: Source node ID
        target: Target node ID
        edge_type: Optional edge type to check for specifically
    """
    if edge_type is None:
        return self._graph.has_edge(source, target)
    return self._graph.has_edge(source, target, key=edge_type.value)
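
The keyed lookup above uses MultiDiGraph's `key` parameter: the edge key assigned at insertion time distinguishes parallel edges of different types. A small sketch (illustrative IDs):

```python
import networkx as nx

# Sketch of the keyed check behind has_edge(): omitting key matches any
# edge between the pair; passing key matches only that edge type.
mg = nx.MultiDiGraph()
mg.add_edge("a.py:f", "b.py:g", key="calls")

print(mg.has_edge("a.py:f", "b.py:g"))                 # True (any type)
print(mg.has_edge("a.py:f", "b.py:g", key="calls"))    # True
print(mg.has_edge("a.py:f", "b.py:g", key="imports"))  # False
```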

get_node_data

get_node_data(node_id)

Get the data associated with a node.

Parameters:

Name Type Description Default
node_id str

The node ID to look up

required

Returns:

Type Description
dict[str, Any] | None

Dictionary of node attributes or None if not found

Source code in src/code_context_agent/tools/graph/model.py
def get_node_data(self, node_id: str) -> dict[str, Any] | None:
    """Get the data associated with a node.

    Args:
        node_id: The node ID to look up

    Returns:
        Dictionary of node attributes or None if not found
    """
    if not self._graph.has_node(node_id):
        return None
    return dict(self._graph.nodes[node_id])

get_nodes_by_type

get_nodes_by_type(node_type)

Get all node IDs of a specific type.

Parameters:

Name Type Description Default
node_type NodeType

The type to filter by

required

Returns:

Type Description
list[str]

List of node IDs matching the type

Source code in src/code_context_agent/tools/graph/model.py
def get_nodes_by_type(self, node_type: NodeType) -> list[str]:
    """Get all node IDs of a specific type.

    Args:
        node_type: The type to filter by

    Returns:
        List of node IDs matching the type
    """
    return [n for n, d in self._graph.nodes(data=True) if d.get("node_type") == node_type.value]

get_edges_by_type

get_edges_by_type(edge_type)

Get all edges of a specific type.

Parameters:

Name Type Description Default
edge_type EdgeType

The type to filter by

required

Returns:

Type Description
list[tuple[str, str, dict[str, Any]]]

List of (source, target, data) tuples

Source code in src/code_context_agent/tools/graph/model.py
def get_edges_by_type(self, edge_type: EdgeType) -> list[tuple[str, str, dict[str, Any]]]:
    """Get all edges of a specific type.

    Args:
        edge_type: The type to filter by

    Returns:
        List of (source, target, data) tuples
    """
    return [(u, v, d) for u, v, k, d in self._graph.edges(keys=True, data=True) if k == edge_type.value]
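
Because `add_edge` stores the edge type as the MultiDiGraph key, type filtering is a comprehension over `edges(keys=True)`. A minimal sketch (illustrative IDs, not the package API):

```python
import networkx as nx

# Sketch of the key-based filter in get_edges_by_type(): the key assigned
# at add_edge time doubles as the edge-type discriminator.
mg = nx.MultiDiGraph()
mg.add_edge("a.py:f", "b.py:g", key="calls")
mg.add_edge("a.py:f", "b.py:g", key="references")

calls = [(u, v) for u, v, k in mg.edges(keys=True) if k == "calls"]
print(calls)  # [('a.py:f', 'b.py:g')]
```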

get_view

get_view(edge_types=None)

Get a filtered view of the graph for analysis algorithms.

Creates a simple DiGraph (not Multi) with only the specified edge types. Multiple edges between the same nodes are aggregated by summing weights.

Parameters:

Name Type Description Default
edge_types list[EdgeType] | None

List of edge types to include (None = all types)

None

Returns:

Type Description
DiGraph

A NetworkX DiGraph suitable for analysis algorithms

Source code in src/code_context_agent/tools/graph/model.py
def get_view(self, edge_types: list[EdgeType] | None = None) -> nx.DiGraph:
    """Get a filtered view of the graph for analysis algorithms.

    Creates a simple DiGraph (not Multi) with only the specified edge types.
    Multiple edges between the same nodes are aggregated by summing weights.

    Args:
        edge_types: List of edge types to include (None = all types)

    Returns:
        A NetworkX DiGraph suitable for analysis algorithms
    """
    view = nx.DiGraph()

    # Copy all nodes with their attributes
    view.add_nodes_from(self._graph.nodes(data=True))

    # Filter and aggregate edges
    for u, v, k, d in self._graph.edges(keys=True, data=True):
        if edge_types is None or EdgeType(k) in edge_types:
            if view.has_edge(u, v):
                # Aggregate weights
                view[u][v]["weight"] += d.get("weight", 1.0)
            else:
                view.add_edge(u, v, weight=d.get("weight", 1.0), types=[k])

    return view
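
The aggregation step above can be demonstrated standalone: two typed edges between the same pair collapse into one edge whose weight is the sum, which is what centrality algorithms expect. A sketch in plain networkx (illustrative IDs):

```python
import networkx as nx

# Sketch of the get_view() aggregation: parallel "calls" and "references"
# edges between the same nodes become one weighted DiGraph edge.
mg = nx.MultiDiGraph()
mg.add_edge("a.py:f", "b.py:g", key="calls", weight=1.0)
mg.add_edge("a.py:f", "b.py:g", key="references", weight=1.0)

view = nx.DiGraph()
view.add_nodes_from(mg.nodes(data=True))
for u, v, k, d in mg.edges(keys=True, data=True):
    if view.has_edge(u, v):
        view[u][v]["weight"] += d.get("weight", 1.0)  # aggregate weights
    else:
        view.add_edge(u, v, weight=d.get("weight", 1.0), types=[k])

print(view["a.py:f"]["b.py:g"]["weight"])  # 2.0
```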

nodes

nodes(data=False)

Return nodes, optionally with data.

Parameters:

Name Type Description Default
data bool

If True, return (node_id, data) tuples

False

Returns:

Type Description
Any

Node view from underlying NetworkX graph

Source code in src/code_context_agent/tools/graph/model.py
def nodes(self, data: bool = False) -> Any:
    """Return nodes, optionally with data.

    Args:
        data: If True, return (node_id, data) tuples

    Returns:
        Node view from underlying NetworkX graph
    """
    return self._graph.nodes(data=data)

edges

edges(data=False)

Return edges, optionally with data.

Parameters:

Name Type Description Default
data bool

If True, return (source, target, data) tuples

False

Returns:

Type Description
Any

Edge view from underlying NetworkX graph

Source code in src/code_context_agent/tools/graph/model.py
def edges(self, data: bool = False) -> Any:
    """Return edges, optionally with data.

    Args:
        data: If True, return (source, target, data) tuples

    Returns:
        Edge view from underlying NetworkX graph
    """
    return self._graph.edges(data=data)

to_node_link_data

to_node_link_data()

Export graph as node-link JSON format.

Returns:

Type Description
dict[str, Any]

Dictionary suitable for JSON serialization

Source code in src/code_context_agent/tools/graph/model.py
def to_node_link_data(self) -> dict[str, Any]:
    """Export graph as node-link JSON format.

    Returns:
        Dictionary suitable for JSON serialization
    """
    return nx.node_link_data(self._graph)

from_node_link_data classmethod

from_node_link_data(data)

Create a CodeGraph from node-link JSON format.

Handles both old NetworkX format ("links" key) and new 3.6+ format ("edges" key) for backward compatibility with saved graphs.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary from node_link_data or JSON

required

Returns:

Type Description
CodeGraph

New CodeGraph instance

Source code in src/code_context_agent/tools/graph/model.py
@classmethod
def from_node_link_data(cls, data: dict[str, Any]) -> "CodeGraph":
    """Create a CodeGraph from node-link JSON format.

    Handles both old NetworkX format ("links" key) and new 3.6+
    format ("edges" key) for backward compatibility with saved graphs.

    Args:
        data: Dictionary from node_link_data or JSON

    Returns:
        New CodeGraph instance
    """
    graph = cls()
    # Handle both old ("links") and new ("edges") format
    if "links" in data and "edges" not in data:
        graph._graph = nx.node_link_graph(data, edges="links")
    else:
        graph._graph = nx.node_link_graph(data)
    return graph
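
The round trip behind save and load is plain NetworkX: `node_link_data` produces a JSON-serializable dict whose edge-list key ("links" vs "edges") depends on the NetworkX version, which is exactly why `from_node_link_data` checks for both. A sketch with illustrative IDs:

```python
import networkx as nx

# Round-trip sketch: the dict embeds "directed" and "multigraph" flags,
# so node_link_graph reconstructs the same graph class. Recent NetworkX
# versions may emit a FutureWarning about the edge-list key rename.
g = nx.MultiDiGraph()
g.add_edge("a.py:f", "b.py:g", key="calls", weight=1.0)

data = nx.node_link_data(g)
restored = nx.node_link_graph(data)
print(restored.has_edge("a.py:f", "b.py:g"))  # True
```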

describe

describe()

Get a quick summary of the graph.

Returns:

Type Description
dict[str, Any]

Dictionary with node count, edge count, type distributions, and density.

Source code in src/code_context_agent/tools/graph/model.py
def describe(self) -> dict[str, Any]:
    """Get a quick summary of the graph.

    Returns:
        Dictionary with node count, edge count, type distributions, and density.
    """
    node_types: dict[str, int] = {}
    for _, data in self._graph.nodes(data=True):
        nt = data.get("node_type", "unknown")
        node_types[nt] = node_types.get(nt, 0) + 1

    edge_types: dict[str, int] = {}
    for _, _, k, _ in self._graph.edges(keys=True, data=True):
        edge_types[k] = edge_types.get(k, 0) + 1

    return {
        "node_count": self.node_count,
        "edge_count": self.edge_count,
        "node_types": node_types,
        "edge_types": edge_types,
        "density": nx.density(self._graph),
    }

CodeNode

Bases: FrozenModel

A node in the code graph representing a code element.

Attributes:

Name Type Description
id str

Unique identifier (typically "file_path:symbol_name" or "file_path:line")

name str

Human-readable display name

node_type NodeType

Classification of the code element

file_path str

Absolute path to the source file

line_start int

Starting line number (0-indexed)

line_end int

Ending line number (0-indexed)

metadata dict[str, Any]

Additional properties (docstring, visibility, rule_id, etc.)

to_dict

to_dict()

Convert to dictionary for serialization.

Source code in src/code_context_agent/tools/graph/model.py
def to_dict(self) -> dict[str, Any]:
    """Convert to dictionary for serialization."""
    result = self.model_dump()
    result["node_type"] = self.node_type.value
    return result

EdgeType

Bases: Enum

Types of relationships between code elements.

NodeType

Bases: Enum

Types of nodes in the code graph.

code_graph_analyze

code_graph_analyze(
    graph_id,
    analysis_type,
    top_k=10,
    node_a="",
    node_b="",
    resolution=1.0,
    category="",
)

Run graph algorithms to surface structural insights about the codebase.

USE THIS TOOL: - After populating graph with code_graph_ingest_* tools - To find important code that isn't obvious from file names - To understand code relationships and architecture

DO NOT USE: - On an empty graph (ingest data first) - For simple lookups (use code_graph_explore instead)

Analysis types provide different perspectives:

Centrality (finds important code): - "hotspots": Betweenness centrality. Finds bottleneck code that many paths go through. High score = integration point, likely to cause cascading changes. Use for: risk assessment, refactoring targets. - "foundations": PageRank. Finds core infrastructure that other important code depends on. High score = foundational code. Use for: understanding dependencies, documentation priority. - "entry_points": Nodes with no incoming edges but outgoing calls. These start execution flows. Use for: understanding app structure.

Clustering (finds groupings): - "modules": Louvain community detection. Finds densely connected groups = logical modules/layers. Use for: architecture diagrams, understanding boundaries.

Relationships (between specific nodes): - "coupling": Measures how tightly two nodes are connected. Use for: understanding change impact, identifying tight coupling. - "similar": Personalized PageRank from a node. Finds related code. Use for: understanding a node's neighborhood. - "dependencies": BFS from a node. Shows what it depends on. Use for: understanding impact of changes.

Filtering: - "category": Finds all nodes in a business logic category. Use for: focused analysis on db/auth/validation/etc.

Code Health: - "unused_symbols": Finds functions/classes/methods with zero cross-file references. Dead code candidates. Use category param for node type filter. - "refactoring": Combines clone detection, code smells, and unused symbols into ranked refactoring opportunities.

Parameters:

Name Type Description Default
graph_id str

ID of the graph to analyze (must have data from ingestion)

required
analysis_type str

Algorithm to run. One of: - "hotspots": Returns ranked list by betweenness score - "foundations": Returns ranked list by PageRank score - "entry_points": Returns list of entry point nodes - "modules": Returns list of detected modules with members - "coupling": Returns coupling metrics (requires node_a, node_b) - "similar": Returns similar nodes (requires node_a) - "category": Returns nodes in category (requires category) - "dependencies": Returns dependency chain (requires node_a) - "trust": TrustRank-based foundations (noise-resistant PageRank from entry points) - "triangles": Find tightly-coupled code triads - "unused_symbols": Dead code detection (zero cross-file references) - "refactoring": Combined refactoring opportunity ranking

required
top_k int

Maximum results for ranked analyses (hotspots, foundations, similar). Default 10. Use 20-30 for comprehensive analysis.

10
node_a str

Required for "coupling", "similar", "dependencies". Node ID format: "file_path:symbol_name"

''
node_b str

Required for "coupling" analysis. Second node to compare.

''
resolution float

For "modules" only. Controls cluster granularity: - < 1.0: Fewer, larger clusters (e.g., 0.5 for high-level layers) - = 1.0: Default clustering - > 1.0: More, smaller clusters (e.g., 1.5 for fine-grained)

1.0
category str

Required for "category" analysis. Category name from AST-grep rule packs: "db", "auth", "http", "validation", etc.

''

Returns:

Type Description
str

JSON with analysis results. Format varies by type:

hotspots/foundations:
{"results": [{"id": "...", "score": 0.85, "name": "...", ...}]}

modules:
{"module_count": 5, "results": [{"module_id": 0, "size": 15, "key_nodes": [...], "cohesion": 0.8}]}

coupling:
{"results": {"coupling": 2.5, "shared_neighbors": 3, "path_length": 2}}

Output Size: 1-10KB depending on top_k and analysis type

Workflow Examples:

Find bottleneck code

hotspots = code_graph_analyze("main", "hotspots", top_k=15)

Results ranked by betweenness - top items are integration points

Detect architecture layers

modules = code_graph_analyze("main", "modules", resolution=0.8)

Each module is a logical grouping - name based on key_nodes

Understand coupling

coupling = code_graph_analyze("main", "coupling", node_a="src/api.py:handler", node_b="src/db.py:repository")

High coupling score = tightly connected, changes propagate

Find all database operations

db_ops = code_graph_analyze("main", "category", category="db")

Returns all nodes tagged as "db" from AST-grep ingestion

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_analyze(  # noqa: C901
    graph_id: str,
    analysis_type: str,
    top_k: int = 10,
    node_a: str = "",
    node_b: str = "",
    resolution: float = 1.0,
    category: str = "",
) -> str:
    """Run graph algorithms to surface structural insights about the codebase.

    USE THIS TOOL:
    - After populating graph with code_graph_ingest_* tools
    - To find important code that isn't obvious from file names
    - To understand code relationships and architecture

    DO NOT USE:
    - On an empty graph (ingest data first)
    - For simple lookups (use code_graph_explore instead)

    Analysis types provide different perspectives:

    **Centrality (finds important code):**
    - "hotspots": Betweenness centrality. Finds bottleneck code that many
      paths go through. High score = integration point, likely to cause
      cascading changes. Use for: risk assessment, refactoring targets.
    - "foundations": PageRank. Finds core infrastructure that other
      important code depends on. High score = foundational code.
      Use for: understanding dependencies, documentation priority.
    - "entry_points": Nodes with no incoming edges but outgoing calls.
      These start execution flows. Use for: understanding app structure.

    **Clustering (finds groupings):**
    - "modules": Louvain community detection. Finds densely connected
      groups = logical modules/layers. Use for: architecture diagrams,
      understanding boundaries.

    **Relationships (between specific nodes):**
    - "coupling": Measures how tightly two nodes are connected.
      Use for: understanding change impact, identifying tight coupling.
    - "similar": Personalized PageRank from a node. Finds related code.
      Use for: understanding a node's neighborhood.
    - "dependencies": BFS from a node. Shows what it depends on.
      Use for: understanding impact of changes.

    **Filtering:**
    - "category": Finds all nodes in a business logic category.
      Use for: focused analysis on db/auth/validation/etc.

    **Code Health:**
    - "unused_symbols": Finds functions/classes/methods with zero cross-file
      references. Dead code candidates. Use category param for node type filter.
    - "refactoring": Combines clone detection, code smells, and unused symbols
      into ranked refactoring opportunities.

    Args:
        graph_id: ID of the graph to analyze (must have data from ingestion)
        analysis_type: Algorithm to run. One of:
            - "hotspots": Returns ranked list by betweenness score
            - "foundations": Returns ranked list by PageRank score
            - "entry_points": Returns list of entry point nodes
            - "modules": Returns list of detected modules with members
            - "coupling": Returns coupling metrics (requires node_a, node_b)
            - "similar": Returns similar nodes (requires node_a)
            - "category": Returns nodes in category (requires category)
            - "dependencies": Returns dependency chain (requires node_a)
            - "trust": TrustRank-based foundations (noise-resistant PageRank from entry points)
            - "triangles": Find tightly-coupled code triads
            - "unused_symbols": Dead code detection (zero cross-file references)
            - "refactoring": Combined refactoring opportunity ranking
        top_k: Maximum results for ranked analyses (hotspots, foundations,
            similar). Default 10. Use 20-30 for comprehensive analysis.
        node_a: Required for "coupling", "similar", "dependencies".
            Node ID format: "file_path:symbol_name"
        node_b: Required for "coupling" analysis. Second node to compare.
        resolution: For "modules" only. Controls cluster granularity:
            - < 1.0: Fewer, larger clusters (e.g., 0.5 for high-level layers)
            - = 1.0: Default clustering
            - > 1.0: More, smaller clusters (e.g., 1.5 for fine-grained)
        category: Required for "category" analysis. Category name from
            AST-grep rule packs: "db", "auth", "http", "validation", etc.

    Returns:
        JSON with analysis results. Format varies by type:

        hotspots/foundations:
        {"results": [{"id": "...", "score": 0.85, "name": "...", ...}]}

        modules:
        {"module_count": 5, "results": [
            {"module_id": 0, "size": 15, "key_nodes": [...], "cohesion": 0.8}
        ]}

        coupling:
        {"results": {"coupling": 2.5, "shared_neighbors": 3, "path_length": 2}}

    Output Size: 1-10KB depending on top_k and analysis type

    Workflow Examples:

    Find bottleneck code:
        hotspots = code_graph_analyze("main", "hotspots", top_k=15)
        # Results ranked by betweenness - top items are integration points

    Detect architecture layers:
        modules = code_graph_analyze("main", "modules", resolution=0.8)
        # Each module is a logical grouping - name based on key_nodes

    Understand coupling:
        coupling = code_graph_analyze("main", "coupling",
                                       node_a="src/api.py:handler",
                                       node_b="src/db.py:repository")
        # High coupling score = tightly connected, changes propagate

    Find all database operations:
        db_ops = code_graph_analyze("main", "category", category="db")
        # Returns all nodes tagged as "db" from AST-grep ingestion
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    analyzer = CodeAnalyzer(graph)

    if analysis_type == "hotspots":
        results = analyzer.find_hotspots(top_k)
        return _json_response({"status": "success", "analysis": "hotspots", "results": results})

    if analysis_type == "foundations":
        results = analyzer.find_foundations(top_k)
        return _json_response({"status": "success", "analysis": "foundations", "results": results})

    if analysis_type == "entry_points":
        results = analyzer.find_entry_points()
        return _json_response({"status": "success", "analysis": "entry_points", "results": results})

    if analysis_type == "modules":
        results = analyzer.detect_modules(resolution)
        return _json_response(
            {"status": "success", "analysis": "modules", "module_count": len(results), "results": results},
        )

    if analysis_type == "coupling":
        if not node_a or not node_b:
            return _json_response({"status": "error", "message": "node_a and node_b required for coupling"})
        results = analyzer.calculate_coupling(node_a, node_b)
        return _json_response({"status": "success", "analysis": "coupling", "results": results})

    if analysis_type == "similar":
        if not node_a:
            return _json_response({"status": "error", "message": "node_a required for similar analysis"})
        results = analyzer.get_similar_nodes(node_a, top_k)
        return _json_response({"status": "success", "analysis": "similar", "results": results})

    if analysis_type == "category":
        if not category:
            return _json_response({"status": "error", "message": "category required for category analysis"})
        results = analyzer.find_clusters_by_category(category)
        return _json_response({"status": "success", "analysis": "category", "category": category, "results": results})

    if analysis_type == "dependencies":
        if not node_a:
            return _json_response({"status": "error", "message": "node_a required for dependencies"})
        direction = "outgoing"  # What does this node depend on
        results = analyzer.get_dependency_chain(node_a, direction)
        return _json_response({"status": "success", "analysis": "dependencies", "results": results})

    if analysis_type == "trust":
        results = analyzer.find_trusted_foundations(top_k=top_k)
        return _json_response({"status": "success", "analysis": "trust", "results": results})

    if analysis_type == "triangles":
        results = analyzer.find_triangles(top_k=top_k)
        return _json_response({"status": "success", "analysis": "triangles", "results": results})

    if analysis_type == "unused_symbols":
        node_type_filter = category.split(",") if category else None
        results = analyzer.find_unused_symbols(node_types=node_type_filter)
        return _json_response(
            {
                "status": "success",
                "analysis": "unused_symbols",
                "results": results,
                "count": len(results),
            },
        )

    if analysis_type == "refactoring":
        results = analyzer.find_refactoring_candidates(top_k=top_k)
        return _json_response(
            {
                "status": "success",
                "analysis": "refactoring",
                "results": results,
                "count": len(results),
            },
        )

    return _json_response({"status": "error", "message": f"Unknown analysis_type: {analysis_type}"})

code_graph_create

code_graph_create(graph_id, description='')

Initialize an empty code graph for structural analysis of a codebase.

USE THIS TOOL: - At the start of analysis, BEFORE running LSP/AST-grep tools - When you need to unify results from multiple discovery tools - When you want to run graph algorithms (hotspots, modules, coupling)

DO NOT USE: - If a graph with this ID already exists (will overwrite it) - For simple single-file analysis (use LSP tools directly)

The graph is stored in memory for the session. Populate it using: - code_graph_ingest_lsp: Add symbols, references, definitions from LSP - code_graph_ingest_astgrep: Add business logic patterns - code_graph_ingest_tests: Add test coverage relationships

Parameters:

Name Type Description Default
graph_id str

Unique identifier for this graph. Use descriptive names: - "main": Primary analysis graph for the whole codebase - "feature_auth": Graph focused on authentication code - "module_api": Graph for API layer only

required
description str

Human-readable description of what this graph represents. Helps when managing multiple graphs.

''

Returns:

Name Type Description
JSON str

{"status": "success", "graph_id": "...", "message": "..."}

Output Size: ~100 bytes

Workflow
  1. code_graph_create("main") # Initialize
  2. lsp_start(...) + lsp_document_symbols(...) # Discover
  3. code_graph_ingest_lsp(...) # Populate
  4. code_graph_analyze("main", "hotspots") # Analyze
  5. code_graph_save("main", ".code-context/graph.json") # Persist
Example

code_graph_create("main", "Full codebase analysis")
code_graph_create("backend", "Backend services only")

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_create(
    graph_id: str,
    description: str = "",
) -> str:
    """Initialize an empty code graph for structural analysis of a codebase.

    USE THIS TOOL:
    - At the start of analysis, BEFORE running LSP/AST-grep tools
    - When you need to unify results from multiple discovery tools
    - When you want to run graph algorithms (hotspots, modules, coupling)

    DO NOT USE:
    - If a graph with this ID already exists (will overwrite it)
    - For simple single-file analysis (use LSP tools directly)

    The graph is stored in memory for the session. Populate it using:
    - code_graph_ingest_lsp: Add symbols, references, definitions from LSP
    - code_graph_ingest_astgrep: Add business logic patterns
    - code_graph_ingest_tests: Add test coverage relationships

    Args:
        graph_id: Unique identifier for this graph. Use descriptive names:
            - "main": Primary analysis graph for the whole codebase
            - "feature_auth": Graph focused on authentication code
            - "module_api": Graph for API layer only
        description: Human-readable description of what this graph represents.
            Helps when managing multiple graphs.

    Returns:
        JSON: {"status": "success", "graph_id": "...", "message": "..."}

    Output Size: ~100 bytes

    Workflow:
        1. code_graph_create("main")           # Initialize
        2. lsp_start(...) + lsp_document_symbols(...)  # Discover
        3. code_graph_ingest_lsp(...)          # Populate
        4. code_graph_analyze("main", "hotspots")  # Analyze
        5. code_graph_save("main", ".code-context/graph.json")  # Persist

    Example:
        code_graph_create("main", "Full codebase analysis")
        code_graph_create("backend", "Backend services only")
    """
    _graphs[graph_id] = CodeGraph()
    # Reset explorer for this graph
    _explorers.pop(graph_id, None)

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "description": description,
            "message": f"Created new code graph: {graph_id}",
        },
    )

code_graph_explore

code_graph_explore(
    graph_id,
    action,
    node_id="",
    module_id=-1,
    target_node="",
    depth=1,
    category="",
)

Progressively explore the code graph to build context incrementally.

USE THIS TOOL:
  • ALWAYS start with "overview" action first
  • When you need to understand the codebase step by step
  • To get suggestions on where to explore next
  • To track what you've already explored

DO NOT USE:
  • For running analysis algorithms (use code_graph_analyze instead)
  • On an empty graph (ingest data first)

Progressive disclosure pattern:
  1. "overview" → Get entry points, hotspots, modules, foundations
  2. Pick interesting nodes from overview
  3. "expand_node" → See neighbors and relationships
  4. Repeat until sufficient context is gathered

The explorer tracks visited nodes and suggests what to explore next.
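The overview-then-expand loop can be sketched as follows. `fake_explore` is a stub standing in for the real `code_graph_explore` tool (which returns JSON strings of the shapes documented below); the node IDs are hypothetical:

```python
import json

def fake_explore(graph_id: str, action: str, node_id: str = "", depth: int = 1) -> str:
    """Stub standing in for code_graph_explore; returns canned JSON."""
    if action == "overview":
        return json.dumps({
            "status": "success",
            "hotspots": [{"id": "src/api.py:handler", "score": 0.42}],
            "explored_count": 25,
        })
    return json.dumps({
        "status": "success",
        "center": node_id,
        "suggested_next": ["src/db.py:query"],
        "explored_count": 40,
    })

# 1. Orient with the overview, then pick the top hotspot.
overview = json.loads(fake_explore("main", "overview"))
top = overview["hotspots"][0]["id"]

# 2. Expand the hotspot and follow the explorer's suggestions.
expansion = json.loads(fake_explore("main", "expand_node", node_id=top, depth=2))
frontier = expansion["suggested_next"]
```

The same loop applies with the real tool: each expansion returns a `suggested_next` list that feeds the next call.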

Actions:

Starting point:
  • "overview": Returns high-level structure. Includes:
    • entry_points: Where execution starts
    • hotspots: Bottleneck code (top 5)
    • modules: Detected clusters with key nodes
    • foundations: Core infrastructure (top 5)
    Always start here to orient yourself.

Drill-down:
  • "expand_node": BFS expansion from a node. See immediate neighbors and their relationships. Good for understanding a specific area.
  • "expand_module": Deep-dive into a detected module. Shows internal structure and external connections.
  • "category": Explore all nodes in a business logic category. Groups results by file.

Navigation:
  • "path": Find shortest path between two nodes. Useful for understanding how components connect.
  • "status": Check exploration coverage (% of nodes visited).
  • "reset": Clear exploration state to start fresh.

Parameters:

Name Type Description Default
graph_id str

ID of the graph to explore (must have data from ingestion)

required
action str

Exploration action. One of:
  • "overview": No additional params needed
  • "expand_node": Requires node_id, optional depth
  • "expand_module": Requires module_id (from overview/modules analysis)
  • "path": Requires node_id (source) and target_node
  • "category": Requires category (e.g., "db", "auth")
  • "status": No additional params
  • "reset": No additional params

required
node_id str

For "expand_node": Node ID to expand from. For "path": Source node. Format: "file_path:symbol_name"

''
module_id int

For "expand_module": Module ID from detect_modules results. Typically 0, 1, 2, etc. from the overview.

-1
target_node str

For "path": Destination node ID.

''
depth int

For "expand_node": How many hops to expand. - depth=1: Direct neighbors only (fast, focused) - depth=2: Neighbors of neighbors (broader context) - depth=3+: Rarely needed, can be large

1
category str

For "category": Business logic category name. Values from AST-grep: "db", "auth", "http", "validation", etc.

''

Returns:

JSON with exploration results. Always includes "explored_count".

For "overview":

    {
        "entry_points": [...],
        "hotspots": [...],
        "modules": [{"module_id": 0, "size": 15, "key_nodes": [...]}],
        "foundations": [...],
        "explored_count": 25
    }

For "expand_node":

    {
        "center": "src/api.py:handler",
        "discovered_nodes": [...],
        "edges": [...],
        "suggested_next": [...],  # What to explore next
        "explored_count": 40
    }

Output Size: 2-20KB depending on action and graph size

Workflow Example:

    # 1. Start with overview
    overview = code_graph_explore("main", "overview")
    # Look at entry_points and hotspots

    # 2. Expand from interesting hotspot
    details = code_graph_explore("main", "expand_node",
                                 node_id=overview["hotspots"][0]["id"],
                                 depth=2)
    # See neighbors and suggested_next

    # 3. Explore a module
    module_details = code_graph_explore("main", "expand_module", module_id=0)
    # See internal structure and external connections

    # 4. Check coverage
    status = code_graph_explore("main", "status")
    # coverage_percent shows how much of graph was explored

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_explore(  # noqa: C901
    graph_id: str,
    action: str,
    node_id: str = "",
    module_id: int = -1,
    target_node: str = "",
    depth: int = 1,
    category: str = "",
) -> str:
    """Progressively explore the code graph to build context incrementally.

    USE THIS TOOL:
    - ALWAYS start with "overview" action first
    - When you need to understand the codebase step by step
    - To get suggestions on where to explore next
    - To track what you've already explored

    DO NOT USE:
    - For running analysis algorithms (use code_graph_analyze instead)
    - On an empty graph (ingest data first)

    Progressive disclosure pattern:
    1. "overview" → Get entry points, hotspots, modules, foundations
    2. Pick interesting nodes from overview
    3. "expand_node" → See neighbors and relationships
    4. Repeat until sufficient context is gathered

    The explorer tracks visited nodes and suggests what to explore next.

    Actions:

    **Starting point:**
    - "overview": Returns high-level structure. Includes:
      - entry_points: Where execution starts
      - hotspots: Bottleneck code (top 5)
      - modules: Detected clusters with key nodes
      - foundations: Core infrastructure (top 5)
      Always start here to orient yourself.

    **Drill-down:**
    - "expand_node": BFS expansion from a node. See immediate neighbors
      and their relationships. Good for understanding a specific area.
    - "expand_module": Deep-dive into a detected module. Shows internal
      structure and external connections.
    - "category": Explore all nodes in a business logic category.
      Groups results by file.

    **Navigation:**
    - "path": Find shortest path between two nodes. Useful for
      understanding how components connect.
    - "status": Check exploration coverage (% of nodes visited).
    - "reset": Clear exploration state to start fresh.

    Args:
        graph_id: ID of the graph to explore (must have data from ingestion)
        action: Exploration action. One of:
            - "overview": No additional params needed
            - "expand_node": Requires node_id, optional depth
            - "expand_module": Requires module_id (from overview/modules analysis)
            - "path": Requires node_id (source) and target_node
            - "category": Requires category (e.g., "db", "auth")
            - "status": No additional params
            - "reset": No additional params
        node_id: For "expand_node": Node ID to expand from.
            For "path": Source node. Format: "file_path:symbol_name"
        module_id: For "expand_module": Module ID from detect_modules results.
            Typically 0, 1, 2, etc. from the overview.
        target_node: For "path": Destination node ID.
        depth: For "expand_node": How many hops to expand.
            - depth=1: Direct neighbors only (fast, focused)
            - depth=2: Neighbors of neighbors (broader context)
            - depth=3+: Rarely needed, can be large
        category: For "category": Business logic category name.
            Values from AST-grep: "db", "auth", "http", "validation", etc.

    Returns:
        JSON with exploration results. Always includes "explored_count".

        overview:
        {
            "entry_points": [...],
            "hotspots": [...],
            "modules": [{"module_id": 0, "size": 15, "key_nodes": [...]}],
            "foundations": [...],
            "explored_count": 25
        }

        expand_node:
        {
            "center": "src/api.py:handler",
            "discovered_nodes": [...],
            "edges": [...],
            "suggested_next": [...],  # What to explore next
            "explored_count": 40
        }

    Output Size: 2-20KB depending on action and graph size

    Workflow Example:

    # 1. Start with overview
    overview = code_graph_explore("main", "overview")
    # Look at entry_points and hotspots

    # 2. Expand from interesting hotspot
    details = code_graph_explore("main", "expand_node",
                                  node_id=overview["hotspots"][0]["id"],
                                  depth=2)
    # See neighbors and suggested_next

    # 3. Explore a module
    module_details = code_graph_explore("main", "expand_module", module_id=0)
    # See internal structure and external connections

    # 4. Check coverage
    status = code_graph_explore("main", "status")
    # coverage_percent shows how much of graph was explored
    """
    explorer = _get_explorer(graph_id)
    if explorer is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    if action == "overview":
        results = explorer.get_overview()
        return _json_response({"status": "success", "action": "overview", **results})

    if action == "expand_node":
        if not node_id:
            return _json_response({"status": "error", "message": "node_id required for expand_node"})
        results = explorer.expand_node(node_id, depth)
        return _json_response({"status": "success", "action": "expand_node", **results})

    if action == "expand_module":
        if module_id < 0:
            return _json_response({"status": "error", "message": "module_id required for expand_module"})
        results = explorer.expand_module(module_id)
        return _json_response({"status": "success", "action": "expand_module", **results})

    if action == "path":
        if not node_id or not target_node:
            return _json_response({"status": "error", "message": "node_id and target_node required for path"})
        results = explorer.get_path_between(node_id, target_node)
        return _json_response({"status": "success", "action": "path", **results})

    if action == "category":
        if not category:
            return _json_response({"status": "error", "message": "category required for category exploration"})
        results = explorer.explore_category(category)
        return _json_response({"status": "success", "action": "category", **results})

    if action == "status":
        results = explorer.get_exploration_status()
        return _json_response({"status": "success", "action": "status", **results})

    if action == "reset":
        explorer.reset_exploration()
        return _json_response({"status": "success", "action": "reset", "message": "Exploration state reset"})

    return _json_response({"status": "error", "message": f"Unknown action: {action}"})

code_graph_export

code_graph_export(
    graph_id,
    format="json",
    include_metadata=True,
    max_nodes=100,
)

Export the code graph for visualization or external analysis.

USE THIS TOOL:
  • To generate Mermaid diagrams for CONTEXT.md architecture section
  • To save graph data for external visualization tools
  • After analysis, to capture the graph structure

DO NOT USE:
  • For persistence (use code_graph_save instead)
  • On empty graphs (ingest data first)

Export formats:

"mermaid" (recommended for documentation): Generates Mermaid diagram syntax that can be embedded in markdown. - Selects top nodes by degree (most connected = most important) - Uses shapes based on node type: - [name]: Classes (rectangles) - (name): Functions/methods (rounded) - [[name]]: Files (stadium shape) - Edge styles by relationship: - → : calls - -.-> : imports - ==> : inherits

"json" (for external tools): NetworkX node-link format. Can be loaded into other graph tools.

Parameters:

Name Type Description Default
graph_id str

ID of the graph to export (must exist)

required
format str

Export format:
  • "mermaid": Mermaid diagram syntax (for markdown embedding)
  • "json": NetworkX node-link JSON (for external tools)

'json'
include_metadata bool

For "json" format only. Whether to include node/edge metadata (file_path, line numbers, etc.). Set False for smaller output.

True
max_nodes int

For "mermaid" format only. Maximum nodes to include. Mermaid diagrams become unreadable with too many nodes. Recommended: 15 for CONTEXT.md, up to 50 for detailed diagrams. Nodes are selected by degree (most connected first).

100

Returns:

For "mermaid":

    {
        "status": "success",
        "format": "mermaid",
        "diagram": "graph TD\n    node1[Name] --> node2..."
    }

For "json":

    {
        "status": "success",
        "format": "json",
        "graph": {"nodes": [...], "links": [...]}
    }

Output Size
  • mermaid: 1-5KB (limited by max_nodes)
  • json: 10-500KB (depends on graph size)

Workflow Example:

    # Export for CONTEXT.md architecture diagram
    result = code_graph_export("main", format="mermaid", max_nodes=15)
    mermaid_code = result["diagram"]
    # Embed in markdown:
    # ```mermaid
    # {mermaid_code}
    # ```

    # Export for external visualization
    result = code_graph_export("main", format="json", include_metadata=True)
    # Use with Gephi, D3.js, etc.
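To make the edge-style legend above concrete, here is a small sketch of turning a few edges into Mermaid flowchart syntax — similar in spirit to what the "mermaid" format produces, though the node IDs and edge tuples here are hypothetical, not the tool's internal representation:

```python
# Hypothetical (source, target, relationship) edges.
edges = [
    ("handler", "query", "calls"),
    ("handler", "models", "imports"),
]

# Arrow styles matching the legend: calls, imports, inherits.
arrows = {"calls": "-->", "imports": "-.->", "inherits": "==>"}

lines = ["graph TD"]
for src, dst, kind in edges:
    lines.append(f"    {src} {arrows[kind]} {dst}")
diagram = "\n".join(lines)

print(diagram)
```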

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_export(
    graph_id: str,
    format: str = "json",
    include_metadata: bool = True,
    max_nodes: int = 100,
) -> str:
    """Export the code graph for visualization or external analysis.

    USE THIS TOOL:
    - To generate Mermaid diagrams for CONTEXT.md architecture section
    - To save graph data for external visualization tools
    - After analysis, to capture the graph structure

    DO NOT USE:
    - For persistence (use code_graph_save instead)
    - On empty graphs (ingest data first)

    Export formats:

    **"mermaid"** (recommended for documentation):
    Generates Mermaid diagram syntax that can be embedded in markdown.
    - Selects top nodes by degree (most connected = most important)
    - Uses shapes based on node type:
      - [name]: Classes (rectangles)
      - (name): Functions/methods (rounded)
      - [[name]]: Files (stadium shape)
    - Edge styles by relationship:
      - --> : calls
      - -.-> : imports
      - ==> : inherits

    **"json"** (for external tools):
    NetworkX node-link format. Can be loaded into other graph tools.

    Args:
        graph_id: ID of the graph to export (must exist)
        format: Export format:
            - "mermaid": Mermaid diagram syntax (for markdown embedding)
            - "json": NetworkX node-link JSON (for external tools)
        include_metadata: For "json" format only. Whether to include
            node/edge metadata (file_path, line numbers, etc.).
            Set False for smaller output.
        max_nodes: For "mermaid" format only. Maximum nodes to include.
            Mermaid diagrams become unreadable with too many nodes.
            Recommended: 15 for CONTEXT.md, up to 50 for detailed diagrams.
            Nodes are selected by degree (most connected first).

    Returns:
        For "mermaid":
        {
            "status": "success",
            "format": "mermaid",
            "diagram": "graph TD\\n    node1[Name] --> node2..."
        }

        For "json":
        {
            "status": "success",
            "format": "json",
            "graph": {"nodes": [...], "links": [...]}
        }

    Output Size:
        - mermaid: 1-5KB (limited by max_nodes)
        - json: 10-500KB (depends on graph size)

    Workflow Example:

    # Export for CONTEXT.md architecture diagram
    result = code_graph_export("main", format="mermaid", max_nodes=15)
    mermaid_code = result["diagram"]
    # Embed in markdown:
    # ```mermaid
    # {mermaid_code}
    # ```

    # Export for external visualization
    result = code_graph_export("main", format="json", include_metadata=True)
    # Use with Gephi, D3.js, etc.
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    if format == "json":
        data = graph.to_node_link_data()
        if not include_metadata:
            # Strip metadata
            for node in data.get("nodes", []):
                node.pop("metadata", None)
            for link in data.get("links", []):
                link.pop("metadata", None)
        return _json_response({"status": "success", "format": "json", "graph": data})

    if format == "mermaid":
        mermaid = _export_mermaid(graph, max_nodes)
        return _json_response({"status": "success", "format": "mermaid", "diagram": mermaid})

    return _json_response({"status": "error", "message": f"Unknown format: {format}"})

code_graph_ingest_astgrep

code_graph_ingest_astgrep(
    graph_id, astgrep_result, result_type="rule_pack"
)

Add AST-grep pattern matches to the graph as categorized business logic nodes.

USE THIS TOOL:
  • After running astgrep_scan_rule_pack to add business logic patterns
  • After running astgrep_scan for custom pattern matches
  • When you want graph analysis to consider business logic categories

DO NOT USE:
  • Before code_graph_create (graph must exist first)
  • With empty AST-grep results (check match count first)

AST-grep matches become nodes with rich metadata:
  • category: "db", "auth", "http", "validation", etc.
  • severity: "error" (writes), "warning" (reads), "hint" (definitions)
  • rule_id: The specific pattern that matched

This metadata enables category-based analysis:
  • code_graph_analyze("main", "category", category="db")
  • code_graph_explore("main", "category", category="auth")

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
astgrep_result str

The raw JSON string output from astgrep_scan or astgrep_scan_rule_pack. Pass the exact return value.

required
result_type str

Source of the AST-grep result:
  • "rule_pack" (default): From astgrep_scan_rule_pack. Results include category, severity, rule_id metadata. Use this for business logic detection.
  • "scan": From astgrep_scan ad-hoc patterns. Results have pattern info but no category metadata.

'rule_pack'

Returns:

JSON with ingestion results:

    {
        "status": "success",
        "nodes_added": 25,
        "categories": ["db", "auth", "validation"],
        "total_nodes": 175
    }

Output Size: ~300 bytes

Common Errors
  • "Graph not found": Call code_graph_create first
  • "Invalid JSON": AST-grep result is malformed
Workflow Example:

    # Run rule pack for Python business logic
    matches = astgrep_scan_rule_pack("py_business_logic", repo_path)

    # Ingest into graph
    code_graph_ingest_astgrep("main", matches, "rule_pack")

    # Now analyze by category
    db_ops = code_graph_analyze("main", "category", category="db")
    auth_ops = code_graph_analyze("main", "category", category="auth")

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_astgrep(
    graph_id: str,
    astgrep_result: str,
    result_type: str = "rule_pack",
) -> str:
    """Add AST-grep pattern matches to the graph as categorized business logic nodes.

    USE THIS TOOL:
    - After running astgrep_scan_rule_pack to add business logic patterns
    - After running astgrep_scan for custom pattern matches
    - When you want graph analysis to consider business logic categories

    DO NOT USE:
    - Before code_graph_create (graph must exist first)
    - With empty AST-grep results (check match count first)

    AST-grep matches become nodes with rich metadata:
    - category: "db", "auth", "http", "validation", etc.
    - severity: "error" (writes), "warning" (reads), "hint" (definitions)
    - rule_id: The specific pattern that matched

    This metadata enables category-based analysis:
    - code_graph_analyze("main", "category", category="db")
    - code_graph_explore("main", "category", category="auth")

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        astgrep_result: The raw JSON string output from astgrep_scan or
            astgrep_scan_rule_pack. Pass the exact return value.
        result_type: Source of the AST-grep result:
            - "rule_pack" (default): From astgrep_scan_rule_pack.
              Results include category, severity, rule_id metadata.
              Use this for business logic detection.
            - "scan": From astgrep_scan ad-hoc patterns.
              Results have pattern info but no category metadata.

    Returns:
        JSON with ingestion results:
        {
            "status": "success",
            "nodes_added": 25,
            "categories": ["db", "auth", "validation"],
            "total_nodes": 175
        }

    Output Size: ~300 bytes

    Common Errors:
        - "Graph not found": Call code_graph_create first
        - "Invalid JSON": AST-grep result is malformed

    Workflow Example:
        # Run rule pack for Python business logic
        matches = astgrep_scan_rule_pack("py_business_logic", repo_path)

        # Ingest into graph
        code_graph_ingest_astgrep("main", matches, "rule_pack")

        # Now analyze by category
        db_ops = code_graph_analyze("main", "category", category="db")
        auth_ops = code_graph_analyze("main", "category", category="auth")
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        result = json.loads(astgrep_result)
    except json.JSONDecodeError as e:
        return _json_response({"status": "error", "message": f"Invalid JSON: {e}"})

    nodes_added = 0
    categories: set[str] = set()

    if result_type == "rule_pack":
        nodes = ingest_astgrep_rule_pack(result)
    else:
        nodes = ingest_astgrep_matches(result)

    for node in nodes:
        if not graph.has_node(node.id):
            graph.add_node(node)
            nodes_added += 1
            if "category" in node.metadata:
                categories.add(node.metadata["category"])

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "result_type": result_type,
            "nodes_added": nodes_added,
            "categories": list(categories),
            "total_nodes": graph.node_count,
        },
    )

code_graph_ingest_clones

code_graph_ingest_clones(graph_id, clone_result)

Add clone detection results to the graph as SIMILAR_TO edges.

USE THIS TOOL:
  • After calling detect_clones to find duplicate code blocks
  • To enable refactoring candidate analysis in code_graph_analyze

DO NOT USE:
  • Before code_graph_create (graph must exist first)
  • With empty clone results

Creates SIMILAR_TO edges between files sharing duplicate code. These edges are used by:
  • code_graph_analyze("main", "refactoring") for refactoring candidates

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
clone_result str

The raw JSON string output from detect_clones tool.

required

Returns:

    JSON: {"status": "success", "edges_added": N, "total_edges": M}

Output Size: ~150 bytes
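To illustrate how clone groups become SIMILAR_TO edges, here is a sketch that pairs up every two files sharing a duplicated block. The JSON shape of `clone_result` is assumed for illustration (the real detect_clones output may differ), and this is not the actual ingest_clone_results implementation:

```python
import itertools
import json

# Hypothetical detect_clones output: one group of three files
# sharing an 18-line duplicated block.
clone_result = json.dumps({
    "clones": [
        {"files": ["src/a.py", "src/b.py", "src/c.py"], "lines": 18},
    ],
})

# One SIMILAR_TO edge per unordered pair of files in each clone group.
edges = []
for group in json.loads(clone_result)["clones"]:
    for src, dst in itertools.combinations(sorted(group["files"]), 2):
        edges.append((src, "SIMILAR_TO", dst))

print(len(edges))  # → 3
```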

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_clones(
    graph_id: str,
    clone_result: str,
) -> str:
    """Add clone detection results to the graph as SIMILAR_TO edges.

    USE THIS TOOL:
    - After calling detect_clones to find duplicate code blocks
    - To enable refactoring candidate analysis in code_graph_analyze

    DO NOT USE:
    - Before code_graph_create (graph must exist first)
    - With empty clone results

    Creates SIMILAR_TO edges between files sharing duplicate code.
    These edges are used by:
    - code_graph_analyze("main", "refactoring") for refactoring candidates

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        clone_result: The raw JSON string output from detect_clones tool.

    Returns:
        JSON: {"status": "success", "edges_added": N, "total_edges": M}

    Output Size: ~150 bytes
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        result = json.loads(clone_result)
    except json.JSONDecodeError as e:
        return _json_response({"status": "error", "message": f"Invalid JSON: {e}"})

    edges = ingest_clone_results(result)
    edges_added = 0

    for edge in edges:
        graph.add_edge(edge)
        edges_added += 1

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "edges_added": edges_added,
            "total_edges": graph.edge_count,
        },
    )

code_graph_ingest_git

code_graph_ingest_git(
    graph_id,
    git_result,
    result_type,
    source_file="",
    min_percentage=20.0,
)

Add git history data to the code graph as nodes, edges, or metadata.

USE THIS TOOL:
  • After calling git_hotspots to add churn metadata to FILE nodes
  • After calling git_files_changed_together to add COCHANGES edges
  • After calling git_contributors or git_blame_summary to add ownership metadata

DO NOT USE:
  • Before code_graph_create (graph must exist first)
  • With error-status git results

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
git_result str

The raw JSON string output from a git tool. Pass the exact return value from git_hotspots, git_files_changed_together, git_contributors, or git_blame_summary.

required
result_type str

Type of git result being ingested:
  • "hotspots": From git_hotspots. Creates/updates FILE nodes with churn metadata.
  • "cochanges": From git_files_changed_together. Creates COCHANGES edges. Uses min_percentage to filter low-coupling pairs.
  • "contributors": From git_contributors or git_blame_summary. Returns ownership metadata dict.

required
source_file str

For "contributors" type. If provided and the node exists, attaches contributor metadata to the FILE node at this path.

''
min_percentage float

For "cochanges" type. Minimum co-change percentage to create an edge (default 20.0). Lower = more edges.

20.0

Returns:

    JSON with ingestion results varying by type.

Output Size: ~200 bytes

Workflow Examples:

    # Ingesting hotspots (creates/updates FILE nodes)
    hotspots = git_hotspots(repo_path, limit=30)
    code_graph_ingest_git("main", hotspots, "hotspots")

    # Ingesting co-changes (creates COCHANGES edges)
    coupling = git_files_changed_together(repo_path, "src/auth.py")
    code_graph_ingest_git("main", coupling, "cochanges", min_percentage=15.0)

    # Ingesting contributors (returns metadata)
    blame = git_blame_summary(repo_path, "src/auth.py")
    code_graph_ingest_git("main", blame, "contributors", source_file="src/auth.py")
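The min_percentage filter can be sketched as follows. The field names in `cochange_result` are assumptions for illustration; the real git_files_changed_together output may name them differently:

```python
import json

# Hypothetical git_files_changed_together output.
cochange_result = json.dumps({
    "source": "src/auth.py",
    "cochanges": [
        {"file": "src/session.py", "percentage": 45.0},
        {"file": "README.md", "percentage": 8.0},
    ],
})

# Only pairs at or above the threshold become COCHANGES edges;
# lowering min_percentage keeps more (weaker) couplings.
min_percentage = 20.0
kept = [
    entry["file"]
    for entry in json.loads(cochange_result)["cochanges"]
    if entry["percentage"] >= min_percentage
]

print(kept)  # → ['src/session.py']
```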

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_git(
    graph_id: str,
    git_result: str,
    result_type: str,
    source_file: str = "",
    min_percentage: float = 20.0,
) -> str:
    """Add git history data to the code graph as nodes, edges, or metadata.

    USE THIS TOOL:
    - After calling git_hotspots to add churn metadata to FILE nodes
    - After calling git_files_changed_together to add COCHANGES edges
    - After calling git_contributors or git_blame_summary to add ownership metadata

    DO NOT USE:
    - Before code_graph_create (graph must exist first)
    - With error-status git results

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        git_result: The raw JSON string output from a git tool.
            Pass the exact return value from git_hotspots,
            git_files_changed_together, git_contributors, or git_blame_summary.
        result_type: Type of git result being ingested:
            - "hotspots": From git_hotspots. Creates/updates FILE nodes with churn metadata.
            - "cochanges": From git_files_changed_together. Creates COCHANGES edges.
              Uses min_percentage to filter low-coupling pairs.
            - "contributors": From git_contributors or git_blame_summary.
              Returns ownership metadata dict.
        source_file: For "contributors" type. If provided and the node exists,
            attaches contributor metadata to the FILE node at this path.
        min_percentage: For "cochanges" type. Minimum co-change percentage
            to create an edge (default 20.0). Lower = more edges.

    Returns:
        JSON with ingestion results varying by type.

    Output Size: ~200 bytes

    Workflow Examples:

    Ingesting hotspots (creates/updates FILE nodes):
        hotspots = git_hotspots(repo_path, limit=30)
        code_graph_ingest_git("main", hotspots, "hotspots")

    Ingesting co-changes (creates COCHANGES edges):
        coupling = git_files_changed_together(repo_path, "src/auth.py")
        code_graph_ingest_git("main", coupling, "cochanges", min_percentage=15.0)

    Ingesting contributors (returns metadata):
        blame = git_blame_summary(repo_path, "src/auth.py")
        code_graph_ingest_git("main", blame, "contributors", source_file="src/auth.py")
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        result = json.loads(git_result)
    except json.JSONDecodeError as e:
        return _json_response({"status": "error", "message": f"Invalid JSON: {e}"})

    if result_type == "hotspots":
        nodes = ingest_git_hotspots(result)
        nodes_added = 0
        nodes_updated = 0
        for node in nodes:
            if graph.has_node(node.id):
                # Merge churn metadata into existing node
                existing = graph._graph.nodes[node.id]
                existing.setdefault("metadata", {}).update(node.metadata)
                nodes_updated += 1
            else:
                graph.add_node(node)
                nodes_added += 1
        return _json_response(
            {
                "status": "success",
                "graph_id": graph_id,
                "result_type": "hotspots",
                "nodes_added": nodes_added,
                "nodes_updated": nodes_updated,
                "total_nodes": graph.node_count,
            },
        )

    if result_type == "cochanges":
        edges = ingest_git_cochanges(result, min_percentage=min_percentage)
        edges_added = 0
        for edge in edges:
            graph.add_edge(edge)
            edges_added += 1
        return _json_response(
            {
                "status": "success",
                "graph_id": graph_id,
                "result_type": "cochanges",
                "edges_added": edges_added,
                "total_edges": graph.edge_count,
            },
        )

    if result_type == "contributors":
        metadata = ingest_git_contributors(result)
        if source_file and graph.has_node(source_file):
            graph._graph.nodes[source_file].setdefault("metadata", {}).update(metadata)
        return _json_response(
            {
                "status": "success",
                "graph_id": graph_id,
                "result_type": "contributors",
                "contributor_count": metadata.get("contributor_count", 0),
            },
        )

    return _json_response({"status": "error", "message": f"Unknown result_type: {result_type}"})

code_graph_ingest_inheritance

code_graph_ingest_inheritance(
    graph_id, hover_content, class_node_id, file_path
)

Add class inheritance/implementation edges from LSP hover information.

USE THIS TOOL:
- After lsp_hover on a class to capture extends/implements relationships
- When building class hierarchy for OOP codebases
- In DEEP mode for comprehensive type analysis

DO NOT USE:
- On non-class symbols (functions, variables)
- Without first creating the class node via code_graph_ingest_lsp

Parses class signatures to create edges:
- "inherits" edges: class Foo extends Bar → Foo --inherits--> Bar
- "implements" edges: class Foo implements IBar → Foo --implements--> IBar

Works with common patterns:
- TypeScript/JavaScript: extends, implements
- Python: class Foo(Bar, Baz)
- Java: extends, implements
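The kind of signature parsing this relies on can be sketched with a few regexes. This is an illustrative sketch only, not the actual ingest_inheritance implementation; the helper name parse_inheritance is hypothetical:

```python
import re

def parse_inheritance(signature: str) -> dict[str, list[str]]:
    """Extract base classes from a class signature string (sketch)."""
    result: dict[str, list[str]] = {"inherits": [], "implements": []}
    # TypeScript/Java style: class Foo extends Bar implements IBar, IBaz
    m = re.search(r"\bextends\s+([\w.]+)", signature)
    if m:
        result["inherits"].append(m.group(1))
    m = re.search(r"\bimplements\s+([\w.,\s]+)", signature)
    if m:
        result["implements"] += [s.strip() for s in m.group(1).split(",") if s.strip()]
    # Python style: class Foo(Bar, Baz)
    m = re.search(r"\bclass\s+\w+\s*\(([^)]*)\)", signature)
    if m:
        result["inherits"] += [s.strip() for s in m.group(1).split(",") if s.strip()]
    return result

print(parse_inheritance("class UserService extends BaseService implements IUserService"))
```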

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
hover_content str

The markdown/text content from lsp_hover result. Extract the "value" field from the hover response. Example: "class UserService extends BaseService implements IUserService"

required
class_node_id str

The node ID of the class in the graph. Format: "file_path:ClassName" (e.g., "src/services/user.ts:UserService") Must match the ID created by code_graph_ingest_lsp.

required
file_path str

Path to the file containing this class. Used to resolve base class locations.

required

Returns:

Name Type Description
JSON str

{"status": "success", "edges_added": N, "edge_types": ["inherits", "implements"]}

Output Size: ~200 bytes

Workflow Example

Get class symbols

symbols = lsp_document_symbols(session_id, "src/user.ts")
code_graph_ingest_lsp("main", symbols, "symbols", source_file="src/user.ts")

For each class, get hover info and ingest inheritance

hover = lsp_hover(session_id, "src/user.ts", class_line, 0)
hover_content = hover["hover"]["contents"]["value"]
code_graph_ingest_inheritance("main", hover_content, "src/user.ts:UserService", "src/user.ts")

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_inheritance(
    graph_id: str,
    hover_content: str,
    class_node_id: str,
    file_path: str,
) -> str:
    """Add class inheritance/implementation edges from LSP hover information.

    USE THIS TOOL:
    - After lsp_hover on a class to capture extends/implements relationships
    - When building class hierarchy for OOP codebases
    - In DEEP mode for comprehensive type analysis

    DO NOT USE:
    - On non-class symbols (functions, variables)
    - Without first creating the class node via code_graph_ingest_lsp

    Parses class signatures to create edges:
    - "inherits" edges: class Foo extends Bar → Foo --inherits--> Bar
    - "implements" edges: class Foo implements IBar → Foo --implements--> IBar

    Works with common patterns:
    - TypeScript/JavaScript: extends, implements
    - Python: class Foo(Bar, Baz)
    - Java: extends, implements

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        hover_content: The markdown/text content from lsp_hover result.
            Extract the "value" field from the hover response.
            Example: "class UserService extends BaseService implements IUserService"
        class_node_id: The node ID of the class in the graph.
            Format: "file_path:ClassName" (e.g., "src/services/user.ts:UserService")
            Must match the ID created by code_graph_ingest_lsp.
        file_path: Path to the file containing this class.
            Used to resolve base class locations.

    Returns:
        JSON: {"status": "success", "edges_added": N, "edge_types": ["inherits", "implements"]}

    Output Size: ~200 bytes

    Workflow Example:
        # Get class symbols
        symbols = lsp_document_symbols(session_id, "src/user.ts")
        code_graph_ingest_lsp("main", symbols, "symbols", source_file="src/user.ts")

        # For each class, get hover info and ingest inheritance
        hover = lsp_hover(session_id, "src/user.ts", class_line, 0)
        hover_content = hover["hover"]["contents"]["value"]
        code_graph_ingest_inheritance("main", hover_content,
                                      "src/user.ts:UserService", "src/user.ts")
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    edges = ingest_inheritance(hover_content, class_node_id, file_path)
    edges_added = 0

    for edge in edges:
        graph.add_edge(edge)
        edges_added += 1

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "edges_added": edges_added,
            "edge_types": ["inherits", "implements"],
            "total_edges": graph.edge_count,
        },
    )

code_graph_ingest_lsp

code_graph_ingest_lsp(
    graph_id,
    lsp_result,
    result_type,
    source_file="",
    source_symbol="",
)

Add LSP tool results to the code graph as nodes and edges.

USE THIS TOOL:
- After calling lsp_document_symbols to add function/class nodes
- After calling lsp_references to add "references" edges (fan-in data)
- After calling lsp_definition to add "calls" edges (call relationships)

DO NOT USE:
- Before calling code_graph_create (graph must exist first)
- With invalid/empty LSP results (check LSP tool status first)

Converts raw LSP data into graph structure:
- "symbols" → Creates nodes for functions, classes, methods, variables
- "references" → Creates edges showing where a symbol is used
- "definition" → Creates edges showing what a symbol calls/uses

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
lsp_result str

The raw JSON string output from an LSP tool. Pass the exact return value from lsp_document_symbols, lsp_references, or lsp_definition.

required
result_type str

Type of LSP result being ingested: - "symbols": From lsp_document_symbols. Creates nodes. REQUIRES source_file parameter. - "references": From lsp_references. Creates reference edges. REQUIRES source_symbol parameter (format: "file:name"). - "definition": From lsp_definition. Creates call/import edges.

required
source_file str

Required for "symbols" type. The file path that was analyzed (e.g., "src/main.py"). Used to create node IDs.

''
source_symbol str

Required for "references" type. The symbol ID that references point TO (format: "src/main.py:my_function").

''

Returns:

Type Description
str

JSON with ingestion results:

{
    "status": "success",
    "nodes_added": 15,      # New nodes created
    "edges_added": 8,       # New edges created
    "total_nodes": 150,     # Graph totals
    "total_edges": 200
}

Output Size: ~200 bytes

Common Errors
  • "Graph not found": Call code_graph_create first
  • "source_file required": Must provide source_file for "symbols"
  • "source_symbol required": Must provide source_symbol for "references"
  • "Invalid JSON": LSP result is malformed

Workflow Examples:

Ingesting symbols (creates nodes):
    symbols = lsp_document_symbols(session_id, "src/api.py")
    code_graph_ingest_lsp("main", symbols, "symbols", source_file="src/api.py")

Ingesting references (creates edges showing fan-in):
    refs = lsp_references(session_id, "src/api.py", 10, 5)
    code_graph_ingest_lsp("main", refs, "references", source_symbol="src/api.py:handle_request")

Ingesting definitions (creates call edges):
    defn = lsp_definition(session_id, "src/api.py", 15, 20)
    code_graph_ingest_lsp("main", defn, "definition")

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_lsp(  # noqa: C901
    graph_id: str,
    lsp_result: str,
    result_type: str,
    source_file: str = "",
    source_symbol: str = "",
) -> str:
    """Add LSP tool results to the code graph as nodes and edges.

    USE THIS TOOL:
    - After calling lsp_document_symbols to add function/class nodes
    - After calling lsp_references to add "references" edges (fan-in data)
    - After calling lsp_definition to add "calls" edges (call relationships)

    DO NOT USE:
    - Before calling code_graph_create (graph must exist first)
    - With invalid/empty LSP results (check LSP tool status first)

    Converts raw LSP data into graph structure:
    - "symbols" → Creates nodes for functions, classes, methods, variables
    - "references" → Creates edges showing where a symbol is used
    - "definition" → Creates edges showing what a symbol calls/uses

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        lsp_result: The raw JSON string output from an LSP tool.
            Pass the exact return value from lsp_document_symbols,
            lsp_references, or lsp_definition.
        result_type: Type of LSP result being ingested:
            - "symbols": From lsp_document_symbols. Creates nodes.
              REQUIRES source_file parameter.
            - "references": From lsp_references. Creates reference edges.
              REQUIRES source_symbol parameter (format: "file:name").
            - "definition": From lsp_definition. Creates call/import edges.
        source_file: Required for "symbols" type. The file path that was
            analyzed (e.g., "src/main.py"). Used to create node IDs.
        source_symbol: Required for "references" type. The symbol ID that
            references point TO (format: "src/main.py:my_function").

    Returns:
        JSON with ingestion results:
        {
            "status": "success",
            "nodes_added": 15,      # New nodes created
            "edges_added": 8,       # New edges created
            "total_nodes": 150,     # Graph totals
            "total_edges": 200
        }

    Output Size: ~200 bytes

    Common Errors:
        - "Graph not found": Call code_graph_create first
        - "source_file required": Must provide source_file for "symbols"
        - "source_symbol required": Must provide source_symbol for "references"
        - "Invalid JSON": LSP result is malformed

    Workflow Examples:

    Ingesting symbols (creates nodes):
        symbols = lsp_document_symbols(session_id, "src/api.py")
        code_graph_ingest_lsp("main", symbols, "symbols", source_file="src/api.py")

    Ingesting references (creates edges showing fan-in):
        refs = lsp_references(session_id, "src/api.py", 10, 5)
        code_graph_ingest_lsp("main", refs, "references",
                              source_symbol="src/api.py:handle_request")

    Ingesting definitions (creates call edges):
        defn = lsp_definition(session_id, "src/api.py", 15, 20)
        code_graph_ingest_lsp("main", defn, "definition")
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        result = json.loads(lsp_result)
    except json.JSONDecodeError as e:
        return _json_response({"status": "error", "message": f"Invalid JSON: {e}"})

    nodes_added = 0
    edges_added = 0

    if result_type == "symbols":
        if not source_file:
            return _json_response({"status": "error", "message": "source_file required for symbols"})
        nodes, edges = ingest_lsp_symbols(result, source_file)
        for node in nodes:
            if not graph.has_node(node.id):
                graph.add_node(node)
                nodes_added += 1
        for edge in edges:
            graph.add_edge(edge)
            edges_added += 1

    elif result_type == "references":
        if not source_symbol:
            return _json_response({"status": "error", "message": "source_symbol required for references"})
        edges = ingest_lsp_references(result, source_symbol)
        for edge in edges:
            graph.add_edge(edge)
            edges_added += 1

    elif result_type == "definition":
        # Extract source location from result
        from_file = result.get("file", source_file)
        from_line = result.get("position", {}).get("line", 0)
        edges = ingest_lsp_definition(result, from_file, from_line)
        for edge in edges:
            graph.add_edge(edge)
            edges_added += 1

    else:
        return _json_response({"status": "error", "message": f"Unknown result_type: {result_type}"})

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "result_type": result_type,
            "nodes_added": nodes_added,
            "edges_added": edges_added,
            "total_nodes": graph.node_count,
            "total_edges": graph.edge_count,
        },
    )

code_graph_ingest_rg

code_graph_ingest_rg(graph_id, rg_result)

Add ripgrep search matches to the graph as preliminary nodes.

USE THIS TOOL:
- When LSP doesn't cover a language/pattern
- For text-based patterns (SQL keywords, config values, comments)
- As a fallback when semantic analysis isn't available

DO NOT USE:
- When LSP symbols are available (prefer code_graph_ingest_lsp)
- For structural patterns (prefer code_graph_ingest_astgrep)

Creates lightweight nodes from text matches. These nodes have:
- file_path and line number
- matched text content
- No semantic type information (unlike LSP nodes)

Ripgrep nodes are useful for:
- Finding TODO/FIXME comments
- Locating hardcoded values
- Identifying SQL queries in strings
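A minimal sketch of turning such matches into lightweight nodes might look like this. The {"matches": [...]} input shape and the helper name are assumptions for illustration, not the real rg_search output format:

```python
import json

def rg_matches_to_nodes(rg_json: str) -> list[dict]:
    """Turn ripgrep matches into lightweight node dicts (sketch).

    Assumes a shape like {"matches": [{"file": ..., "line": ..., "text": ...}]};
    the actual rg_search output format may differ.
    """
    result = json.loads(rg_json)
    nodes = []
    for m in result.get("matches", []):
        nodes.append({
            "id": f"{m['file']}:{m['line']}",  # file:line as a stable node ID
            "file_path": m["file"],
            "line": m["line"],
            "content": m["text"].strip(),
            "node_type": "pattern_match",      # no semantic type info
        })
    return nodes

sample = json.dumps({"matches": [{"file": "src/db.py", "line": 42, "text": "  SELECT * FROM users"}]})
print(rg_matches_to_nodes(sample))
```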

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
rg_result str

The raw JSON string output from rg_search tool. Pass the exact return value.

required

Returns:

Name Type Description
JSON str

{"status": "success", "nodes_added": N, "total_nodes": M}

Output Size: ~150 bytes

Workflow Example

Find all SQL queries

sql_matches = rg_search("SELECT|INSERT|UPDATE|DELETE", repo_path)
code_graph_ingest_rg("main", sql_matches)

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_rg(
    graph_id: str,
    rg_result: str,
) -> str:
    """Add ripgrep search matches to the graph as preliminary nodes.

    USE THIS TOOL:
    - When LSP doesn't cover a language/pattern
    - For text-based patterns (SQL keywords, config values, comments)
    - As a fallback when semantic analysis isn't available

    DO NOT USE:
    - When LSP symbols are available (prefer code_graph_ingest_lsp)
    - For structural patterns (prefer code_graph_ingest_astgrep)

    Creates lightweight nodes from text matches. These nodes have:
    - file_path and line number
    - matched text content
    - No semantic type information (unlike LSP nodes)

    Ripgrep nodes are useful for:
    - Finding TODO/FIXME comments
    - Locating hardcoded values
    - Identifying SQL queries in strings

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        rg_result: The raw JSON string output from rg_search tool.
            Pass the exact return value.

    Returns:
        JSON: {"status": "success", "nodes_added": N, "total_nodes": M}

    Output Size: ~150 bytes

    Workflow Example:
        # Find all SQL queries
        sql_matches = rg_search("SELECT|INSERT|UPDATE|DELETE", repo_path)
        code_graph_ingest_rg("main", sql_matches)
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        result = json.loads(rg_result)
    except json.JSONDecodeError as e:
        return _json_response({"status": "error", "message": f"Invalid JSON: {e}"})

    nodes = ingest_rg_matches(result)
    nodes_added = 0

    for node in nodes:
        if not graph.has_node(node.id):
            graph.add_node(node)
            nodes_added += 1

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "nodes_added": nodes_added,
            "total_nodes": graph.node_count,
        },
    )

code_graph_ingest_tests

code_graph_ingest_tests(
    graph_id, test_files, production_files
)

Add test-to-production file mappings as "tests" edges in the graph.

USE THIS TOOL:
- After identifying test files (via rg_search for test patterns)
- To enable test coverage analysis on business logic
- To find untested hotspots in the codebase

DO NOT USE:
- With unfiltered file lists (only include actual test files)
- Before adding production file nodes to the graph

Creates "tests" edges based on naming convention matching:
- test_foo.py → foo.py
- foo.test.ts → foo.ts
- FooTest.java → Foo.java
- __tests__/foo.test.js → src/foo.js

These edges enable:
- Finding untested business logic (nodes without incoming test edges)
- Understanding test coverage per module
- Prioritizing testing efforts on hotspots

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
test_files str

JSON array of test file paths as a string. Example: '["tests/test_user.py", "tests/test_auth.py"]' Obtain from rg_search or file manifest filtering.

required
production_files str

JSON array of production file paths as a string. Example: '["src/user.py", "src/auth.py"]' Should include all files you want to map tests to.

required

Returns:

Name Type Description
JSON str

{"status": "success", "edges_added": N, "total_edges": M}

Output Size: ~150 bytes

Workflow Example

Find test files

test_matches = rg_search("def test_|it\\(|describe\\(", repo_path)
test_files = extract_unique_files(test_matches)

Get production files from manifest

prod_files = filter_non_test_files(manifest)

Create test mapping edges

code_graph_ingest_tests("main", json.dumps(test_files), json.dumps(prod_files))

Find untested hotspots

hotspots = code_graph_analyze("main", "hotspots", top_k=10)

Check which have no incoming "tests" edges

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_tests(
    graph_id: str,
    test_files: str,
    production_files: str,
) -> str:
    """Add test-to-production file mappings as "tests" edges in the graph.

    USE THIS TOOL:
    - After identifying test files (via rg_search for test patterns)
    - To enable test coverage analysis on business logic
    - To find untested hotspots in the codebase

    DO NOT USE:
    - With unfiltered file lists (only include actual test files)
    - Before adding production file nodes to the graph

    Creates "tests" edges based on naming convention matching:
    - test_foo.py → foo.py
    - foo.test.ts → foo.ts
    - FooTest.java → Foo.java
    - __tests__/foo.test.js → src/foo.js

    These edges enable:
    - Finding untested business logic (nodes without incoming test edges)
    - Understanding test coverage per module
    - Prioritizing testing efforts on hotspots

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        test_files: JSON array of test file paths as a string.
            Example: '["tests/test_user.py", "tests/test_auth.py"]'
            Obtain from rg_search or file manifest filtering.
        production_files: JSON array of production file paths as a string.
            Example: '["src/user.py", "src/auth.py"]'
            Should include all files you want to map tests to.

    Returns:
        JSON: {"status": "success", "edges_added": N, "total_edges": M}

    Output Size: ~150 bytes

    Workflow Example:
        # Find test files
        test_matches = rg_search("def test_|it\\(|describe\\(", repo_path)
        test_files = extract_unique_files(test_matches)

        # Get production files from manifest
        prod_files = filter_non_test_files(manifest)

        # Create test mapping edges
        code_graph_ingest_tests("main",
                                json.dumps(test_files),
                                json.dumps(prod_files))

        # Find untested hotspots
        hotspots = code_graph_analyze("main", "hotspots", top_k=10)
        # Check which have no incoming "tests" edges
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        tests = json.loads(test_files)
        prods = json.loads(production_files)
    except json.JSONDecodeError as e:
        return _json_response({"status": "error", "message": f"Invalid JSON: {e}"})

    edges = ingest_test_mapping(tests, prods)
    edges_added = 0

    for edge in edges:
        graph.add_edge(edge)
        edges_added += 1

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "edges_added": edges_added,
            "total_edges": graph.edge_count,
        },
    )

code_graph_load

code_graph_load(graph_id, file_path)

Load a previously saved code graph from disk.

USE THIS TOOL:
- At the start of a session if .code-context/code_graph.json exists
- To resume analysis from a previous session
- To skip re-running LSP/AST-grep data collection

DO NOT USE:
- If graph file doesn't exist (check with file system first)
- When you need fresh analysis (create new graph instead)

Loading a saved graph restores:
- All nodes with their metadata
- All edges with their types
- Ready for immediate analysis (code_graph_analyze, code_graph_explore)

Note: Loading replaces any existing graph with the same ID. The explorer state is reset (tracked exploration cleared).
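Conceptually, the save/load pair round-trips NetworkX node-link data. A standalone sketch using plain networkx (not the CodeGraph wrapper) looks roughly like:

```python
import json
import networkx as nx

# Build a tiny directed graph standing in for a code graph
g = nx.DiGraph()
g.add_node("src/user.py:UserService", node_type="class")
g.add_node("src/base.py:BaseService", node_type="class")
g.add_edge("src/user.py:UserService", "src/base.py:BaseService", edge_type="inherits")

# Serialize to node-link JSON (roughly what a saved graph file contains)
text = json.dumps(nx.node_link_data(g), indent=2)

# Restore and verify nothing was lost
restored = nx.node_link_graph(json.loads(text))
print(restored.number_of_nodes(), restored.number_of_edges())
```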

Parameters:

Name Type Description Default
graph_id str

ID to assign to the loaded graph. Use: - "main": For the primary codebase graph - Descriptive names for scoped graphs

required
file_path str

Path to the saved graph file. Standard location: ".code-context/code_graph.json"

required

Returns:

Name Type Description
JSON str

{
    "status": "success",
    "graph_id": "main",
    "path": ".code-context/code_graph.json",
    "nodes": 150,
    "edges": 200
}

Output Size: ~100 bytes

Common Errors
  • "Load failed": File not found or invalid JSON

Workflow Example:

Check if saved graph exists

If .code-context/code_graph.json exists:

code_graph_load("main", ".code-context/code_graph.json")

Graph is ready for analysis

hotspots = code_graph_analyze("main", "hotspots")
overview = code_graph_explore("main", "overview")

No need to re-run lsp_* or astgrep_* tools!

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_load(
    graph_id: str,
    file_path: str,
) -> str:
    """Load a previously saved code graph from disk.

    USE THIS TOOL:
    - At the start of a session if .code-context/code_graph.json exists
    - To resume analysis from a previous session
    - To skip re-running LSP/AST-grep data collection

    DO NOT USE:
    - If graph file doesn't exist (check with file system first)
    - When you need fresh analysis (create new graph instead)

    Loading a saved graph restores:
    - All nodes with their metadata
    - All edges with their types
    - Ready for immediate analysis (code_graph_analyze, code_graph_explore)

    Note: Loading replaces any existing graph with the same ID.
    The explorer state is reset (tracked exploration cleared).

    Args:
        graph_id: ID to assign to the loaded graph. Use:
            - "main": For the primary codebase graph
            - Descriptive names for scoped graphs
        file_path: Path to the saved graph file.
            Standard location: ".code-context/code_graph.json"

    Returns:
        JSON: {
            "status": "success",
            "graph_id": "main",
            "path": ".code-context/code_graph.json",
            "nodes": 150,
            "edges": 200
        }

    Output Size: ~100 bytes

    Common Errors:
        - "Load failed": File not found or invalid JSON

    Workflow Example:

    # Check if saved graph exists
    # If .code-context/code_graph.json exists:
    code_graph_load("main", ".code-context/code_graph.json")

    # Graph is ready for analysis
    hotspots = code_graph_analyze("main", "hotspots")
    overview = code_graph_explore("main", "overview")

    # No need to re-run lsp_* or astgrep_* tools!
    """
    try:
        from ..validation import ValidationError, validate_file_path

        try:
            path = validate_file_path(file_path)
        except ValidationError as e:
            return _json_response({"status": "error", "message": str(e)})
        data = json.loads(path.read_text())
        graph = CodeGraph.from_node_link_data(data)
        _graphs[graph_id] = graph
        # Reset explorer
        _explorers.pop(graph_id, None)
    except (OSError, ValueError, TypeError, KeyError) as e:
        return _json_response({"status": "error", "message": f"Load failed: {e}"})

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "path": str(path),
            "nodes": graph.node_count,
            "edges": graph.edge_count,
        },
    )

code_graph_save

code_graph_save(graph_id, file_path)

Persist the code graph to disk for reuse in future sessions.

USE THIS TOOL:
- After completing graph analysis (DEEP mode)
- When you want to preserve analysis results
- Before ending a session with valuable graph data

DO NOT USE:
- For exporting to visualization formats (use code_graph_export)
- On empty graphs (waste of disk space)

Saves the complete graph structure including:
- All nodes with metadata (file_path, line numbers, categories)
- All edges with types (calls, references, imports, inherits)
- All analysis-relevant data

Saved graphs can be reloaded with code_graph_load, avoiding the need to re-run LSP/AST-grep tools.

Parameters:

Name Type Description Default
graph_id str

ID of the graph to save (must exist)

required
file_path str

Destination file path. Recommended locations: - ".code-context/code_graph.json": Standard location for main graph - ".code-context/{name}_graph.json": For named/scoped graphs Parent directories are created automatically.

required

Returns:

Name Type Description
JSON str

{
    "status": "success",
    "graph_id": "main",
    "path": ".code-context/code_graph.json",
    "nodes": 150,
    "edges": 200
}

Output Size: ~100 bytes (file size varies: 10KB-1MB)

Common Errors
  • "Graph not found": Graph ID doesn't exist
  • "Save failed": File system error (permissions, disk full)

Workflow Example:

After comprehensive analysis in DEEP mode

code_graph_create("main")

... ingest LSP, AST-grep data ...

... run analysis ...

Save for future sessions

code_graph_save("main", ".code-context/code_graph.json")

In future session:

code_graph_load("main", ".code-context/code_graph.json")

Graph restored with all nodes/edges

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_save(
    graph_id: str,
    file_path: str,
) -> str:
    """Persist the code graph to disk for reuse in future sessions.

    USE THIS TOOL:
    - After completing graph analysis (DEEP mode)
    - When you want to preserve analysis results
    - Before ending a session with valuable graph data

    DO NOT USE:
    - For exporting to visualization formats (use code_graph_export)
    - On empty graphs (waste of disk space)

    Saves the complete graph structure including:
    - All nodes with metadata (file_path, line numbers, categories)
    - All edges with types (calls, references, imports, inherits)
    - All analysis-relevant data

    Saved graphs can be reloaded with code_graph_load, avoiding
    the need to re-run LSP/AST-grep tools.

    Args:
        graph_id: ID of the graph to save (must exist)
        file_path: Destination file path. Recommended locations:
            - ".code-context/code_graph.json": Standard location for main graph
            - ".code-context/{name}_graph.json": For named/scoped graphs
            Parent directories are created automatically.

    Returns:
        JSON: {
            "status": "success",
            "graph_id": "main",
            "path": ".code-context/code_graph.json",
            "nodes": 150,
            "edges": 200
        }

    Output Size: ~100 bytes (file size varies: 10KB-1MB)

    Common Errors:
        - "Graph not found": Graph ID doesn't exist
        - "Save failed": File system error (permissions, disk full)

    Workflow Example:

    # After comprehensive analysis in DEEP mode
    code_graph_create("main")
    # ... ingest LSP, AST-grep data ...
    # ... run analysis ...

    # Save for future sessions
    code_graph_save("main", ".code-context/code_graph.json")

    # In future session:
    code_graph_load("main", ".code-context/code_graph.json")
    # Graph restored with all nodes/edges
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        from ..validation import ValidationError, validate_file_path

        try:
            path = validate_file_path(file_path, must_exist=False)
        except ValidationError as e:
            return _json_response({"status": "error", "message": str(e)})
        data = graph.to_node_link_data()
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(data, indent=2, default=str))
    except (OSError, ValueError, TypeError) as e:
        return _json_response({"status": "error", "message": f"Save failed: {e}"})

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "path": str(path),
            "nodes": graph.node_count,
            "edges": graph.edge_count,
        },
    )

code_graph_stats

code_graph_stats(graph_id)

Get summary statistics about a code graph.

USE THIS TOOL:
- To verify graph was populated correctly after ingestion
- To understand graph composition before analysis
- For the completion signal (graph node/edge counts)

DO NOT USE:
- For detailed analysis (use code_graph_analyze)
- For exploration (use code_graph_explore)

Returns counts broken down by type:
- Nodes by type: function, class, method, variable, pattern_match
- Edges by type: calls, references, imports, inherits, tests

This helps verify:
- LSP ingestion worked (function/class nodes exist)
- AST-grep ingestion worked (pattern_match nodes exist)
- Reference tracking worked (references edges exist)
- Test mapping worked (tests edges exist)
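Because the tool returns a JSON string, these checks require parsing it first; for example, with a hypothetical stats payload in the documented shape:

```python
import json

# A hypothetical stats payload matching the documented return shape
stats_json = json.dumps({
    "status": "success",
    "nodes_by_type": {"function": 0, "class": 3},
    "edges_by_type": {"references": 12},
})

stats = json.loads(stats_json)
# Flag missing LSP symbol ingestion
if stats["nodes_by_type"].get("function", 0) == 0:
    print("warning: no function nodes - LSP symbols may not be ingested")
```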

Parameters:

Name Type Description Default
graph_id str

ID of the graph to get stats for (must exist)

required

Returns:

Name Type Description
JSON str

{
    "status": "success",
    "graph_id": "main",
    "total_nodes": 150,
    "total_edges": 200,
    "nodes_by_type": {
        "function": 80,
        "class": 20,
        "method": 40,
        "pattern_match": 10
    },
    "edges_by_type": {
        "calls": 100,
        "references": 60,
        "imports": 30,
        "tests": 10
    }
}

Output Size: ~300 bytes

Workflow Example:

After ingestion, verify graph state

stats = json.loads(code_graph_stats("main"))

Check ingestion worked

if stats["nodes_by_type"]["function"] == 0:
    # LSP symbols not ingested properly

if stats["edges_by_type"]["references"] == 0:
    # LSP references not ingested

Use in completion signal

Graph: {stats["total_nodes"]} nodes, {stats["total_edges"]} edges

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_stats(
    graph_id: str,
) -> str:
    """Get summary statistics about a code graph.

    USE THIS TOOL:
    - To verify graph was populated correctly after ingestion
    - To understand graph composition before analysis
    - For the completion signal (graph node/edge counts)

    DO NOT USE:
    - For detailed analysis (use code_graph_analyze)
    - For exploration (use code_graph_explore)

    Returns counts broken down by type:
    - Nodes by type: function, class, method, variable, pattern_match
    - Edges by type: calls, references, imports, inherits, tests

    This helps verify:
    - LSP ingestion worked (function/class nodes exist)
    - AST-grep ingestion worked (pattern_match nodes exist)
    - Reference tracking worked (references edges exist)
    - Test mapping worked (tests edges exist)

    Args:
        graph_id: ID of the graph to get stats for (must exist)

    Returns:
        JSON: {
            "status": "success",
            "graph_id": "main",
            "total_nodes": 150,
            "total_edges": 200,
            "nodes_by_type": {
                "function": 80,
                "class": 20,
                "method": 40,
                "pattern_match": 10
            },
            "edges_by_type": {
                "calls": 100,
                "references": 60,
                "imports": 30,
                "tests": 10
            }
        }

    Output Size: ~300 bytes

    Workflow Example:

    # After ingestion, verify graph state (the tool returns a JSON string)
    stats = json.loads(code_graph_stats("main"))

    # Check ingestion worked; absent types are omitted from the count dicts,
    # so use .get() rather than direct indexing
    if stats["nodes_by_type"].get("function", 0) == 0:
        ...  # LSP symbols not ingested properly

    if stats["edges_by_type"].get("references", 0) == 0:
        ...  # LSP references not ingested

    # Use in completion signal
    # Graph: {stats["total_nodes"]} nodes, {stats["total_edges"]} edges
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    # Count nodes by type
    node_types: dict[str, int] = {}
    for _, data in graph.nodes(data=True):
        ntype = data.get("node_type", "unknown")
        node_types[ntype] = node_types.get(ntype, 0) + 1

    # Count edges by type
    edge_types: dict[str, int] = {}
    for _, _, data in graph.edges(data=True):
        etype = data.get("edge_type", "unknown")
        edge_types[etype] = edge_types.get(etype, 0) + 1

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "total_nodes": graph.node_count,
            "total_edges": graph.edge_count,
            "nodes_by_type": node_types,
            "edges_by_type": edge_types,
        },
    )