code_context_agent.tools.graph

Code graph analysis package.

Provides tools for building and analyzing code graphs using NetworkX:

- Graph construction from LSP, AST-grep, and ripgrep results
- Analysis algorithms (clustering, centrality, traversal)
- Progressive disclosure for AI context generation
- Export to Mermaid and JSON formats

CodeAnalyzer

CodeAnalyzer(graph)

Analyzer for code graphs using NetworkX algorithms.

Provides methods for finding important code (centrality), detecting logical modules (clustering), and analyzing relationships between code elements.

Initialize the analyzer with a code graph.

Parameters:

    graph (CodeGraph, required): The CodeGraph to analyze.
Source code in src/code_context_agent/tools/graph/analysis.py
def __init__(self, graph: CodeGraph) -> None:
    """Initialize the analyzer with a code graph.

    Args:
        graph: The CodeGraph to analyze
    """
    self.graph = graph

find_hotspots

find_hotspots(top_k=10)

Find code hotspots using betweenness centrality.

Hotspots are code elements that lie on many shortest paths between other elements - they are often bottlenecks or central integration points.

Parameters:

    top_k (int, default 10): Number of top hotspots to return.

Returns:

    list[dict[str, Any]]: List of dictionaries with node info and betweenness score.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_hotspots(self, top_k: int = 10) -> list[dict[str, Any]]:
    """Find code hotspots using betweenness centrality.

    Hotspots are code elements that lie on many shortest paths
    between other elements - they are often bottlenecks or
    central integration points.

    Args:
        top_k: Number of top hotspots to return

    Returns:
        List of dictionaries with node info and betweenness score
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.REFERENCES])

    if view.number_of_nodes() == 0:
        return []

    try:
        betweenness = nx.betweenness_centrality(view, weight="weight")
    except nx.NetworkXError:
        return []

    return self._format_ranked_results(betweenness, top_k)
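The core of find_hotspots can be sketched with plain NetworkX (the graph and names here are illustrative, not the package's CodeGraph):

```python
import networkx as nx

# Tiny directed call graph: "handler" sits between the entry points and the
# helpers, so every shortest path runs through it and it scores highest.
g = nx.DiGraph()
g.add_edges_from([
    ("cli", "handler"), ("api", "handler"),
    ("handler", "db"), ("handler", "cache"),
])

betweenness = nx.betweenness_centrality(g)
top = max(betweenness, key=betweenness.get)
print(top)  # handler
```

In the real method the view is restricted to CALLS and REFERENCES edges and weighted, but the ranking idea is the same.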

find_foundations

find_foundations(top_k=10)

Find foundational code using PageRank.

Foundations are code elements that are heavily depended upon by other important code - the core infrastructure.

Parameters:

    top_k (int, default 10): Number of top foundations to return.

Returns:

    list[dict[str, Any]]: List of dictionaries with node info and PageRank score.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_foundations(self, top_k: int = 10) -> list[dict[str, Any]]:
    """Find foundational code using PageRank.

    Foundations are code elements that are heavily depended upon
    by other important code - the core infrastructure.

    Args:
        top_k: Number of top foundations to return

    Returns:
        List of dictionaries with node info and PageRank score
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.IMPORTS])

    if view.number_of_nodes() == 0:
        return []

    try:
        pagerank = nx.pagerank(view, alpha=0.85, weight="weight")
    except nx.NetworkXError:
        return []

    return self._format_ranked_results(pagerank, top_k)
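A minimal standalone sketch of the PageRank idea behind find_foundations (illustrative graph, not the package's API): with edges pointing caller to callee, rank mass flows toward heavily depended-upon code.

```python
import networkx as nx

# Everything ultimately calls into "config", so it should rank first.
g = nx.DiGraph()
g.add_edges_from([
    ("app", "service"), ("app", "config"),
    ("service", "config"), ("worker", "config"),
])

scores = nx.pagerank(g, alpha=0.85)
top = max(scores, key=scores.get)
print(top)  # config
```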

find_trusted_foundations

find_trusted_foundations(seed_nodes=None, top_k=10)

Find foundational code using TrustRank (noise-resistant PageRank).

TrustRank propagates trust from seed nodes, making it more resistant to noise than standard PageRank. If no seed nodes provided, uses entry points as seeds.

Parameters:

    seed_nodes (list[str] | None, default None): List of trusted node IDs (defaults to entry points).
    top_k (int, default 10): Number of top results to return.

Returns:

    list[dict[str, Any]]: List of dictionaries with node info and trust score.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_trusted_foundations(
    self,
    seed_nodes: list[str] | None = None,
    top_k: int = 10,
) -> list[dict[str, Any]]:
    """Find foundational code using TrustRank (noise-resistant PageRank).

    TrustRank propagates trust from seed nodes, making it more resistant
    to noise than standard PageRank. If no seed nodes provided, uses
    entry points as seeds.

    Args:
        seed_nodes: List of trusted node IDs (defaults to entry points)
        top_k: Number of top results to return

    Returns:
        List of dictionaries with node info and trust score
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.IMPORTS])

    if view.number_of_nodes() == 0:
        return []

    # Use entry points as default seeds
    if not seed_nodes:
        entry_points = self.find_entry_points()
        seed_nodes = [ep["id"] for ep in entry_points[:5]]

    if not seed_nodes:
        return self.find_foundations(top_k)

    # Build personalization dict for TrustRank
    trust = dict.fromkeys(view.nodes(), 0.0)
    for seed in seed_nodes:
        if seed in trust:
            trust[seed] = 1.0 / len(seed_nodes)

    try:
        scores = nx.pagerank(view, alpha=0.85, personalization=trust, weight="weight")
    except nx.NetworkXError:
        return []

    return self._format_ranked_results(scores, top_k)
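TrustRank here is personalized PageRank: the random jump returns only to trusted seeds, so scores decay with distance from them and disconnected noise stays near zero. A standalone sketch (node names are illustrative):

```python
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([("main", "auth"), ("auth", "db"), ("orphan", "orphan2")])

# Build the personalization dict exactly as the method does:
# uniform trust over the seeds, zero everywhere else.
seeds = ["main"]
trust = dict.fromkeys(g.nodes(), 0.0)
for s in seeds:
    trust[s] = 1.0 / len(seeds)

scores = nx.pagerank(g, alpha=0.85, personalization=trust)
# "orphan" is unreachable from the seed, so its trust stays near zero.
print(scores["orphan"] < scores["db"])  # True
```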

find_entry_points

find_entry_points()

Find likely entry points in the code.

Entry points are nodes with no incoming call edges but outgoing calls - they initiate execution flow.

Returns:

    list[dict[str, Any]]: List of dictionaries with entry point node info.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_entry_points(self) -> list[dict[str, Any]]:
    """Find likely entry points in the code.

    Entry points are nodes with no incoming call edges but
    outgoing calls - they initiate execution flow.

    Returns:
        List of dictionaries with entry point node info
    """
    view = self.graph.get_view([EdgeType.CALLS])

    entry_points = []
    for node in view.nodes():
        in_deg = view.in_degree(node)
        out_deg = view.out_degree(node)

        # Entry point: no callers but makes calls
        if in_deg == 0 and out_deg > 0:
            node_data = self.graph.get_node_data(node)
            entry_points.append(
                {
                    "id": node,
                    "out_degree": out_deg,
                    **(node_data or {}),
                },
            )

    # Also check for main/run/start patterns
    for node, data in self.graph.nodes(data=True):
        name = str(data.get("name", "")).lower()
        if any(p in name for p in ("main", "__main__", "run", "start", "app", "cli")):
            if not any(ep["id"] == node for ep in entry_points):
                entry_points.append(
                    {
                        "id": node,
                        "out_degree": view.out_degree(node) if view.has_node(node) else 0,
                        **data,
                    },
                )

    # Sort by out_degree (more calls = more significant entry point)
    entry_points.sort(key=lambda x: x.get("out_degree", 0), reverse=True)

    return entry_points
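The degree test at the heart of this method is easy to reproduce on a bare NetworkX graph (names are illustrative):

```python
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([("main", "parse"), ("main", "run"), ("parse", "tokenize")])

# Entry point: no incoming call edges, at least one outgoing call.
entry_points = [n for n in g.nodes() if g.in_degree(n) == 0 and g.out_degree(n) > 0]
print(entry_points)  # ['main']
```

The name-pattern pass ("main", "run", "cli", ...) then catches entry points that are invoked by a framework rather than by other code in the graph.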

detect_modules

detect_modules(resolution=1.0)

Detect logical modules using Louvain community detection.

Uses the Louvain algorithm to find communities of densely connected code elements.

Parameters:

    resolution (float, default 1.0): Clustering resolution (< 1 = larger clusters, > 1 = smaller).

Returns:

    list[dict[str, Any]]: List of module dictionaries with members and metrics.

Source code in src/code_context_agent/tools/graph/analysis.py
def detect_modules(self, resolution: float = 1.0) -> list[dict[str, Any]]:
    """Detect logical modules using Louvain community detection.

    Uses the Louvain algorithm to find communities of densely
    connected code elements.

    Args:
        resolution: Clustering resolution (< 1 = larger clusters, > 1 = smaller)

    Returns:
        List of module dictionaries with members and metrics
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.IMPORTS])

    if view.number_of_nodes() < 2:
        return []

    # Louvain requires undirected graph
    undirected = view.to_undirected()

    try:
        # Try Leiden first (better community quality, requires backend)
        communities = nx.community.leiden_communities(undirected, resolution=resolution, seed=42)
    except (NotImplementedError, nx.NetworkXError, ValueError, RuntimeError):
        try:
            # Fallback to Louvain (pure NetworkX)
            communities = nx.community.louvain_communities(undirected, resolution=resolution, seed=42)
        except (nx.NetworkXError, ValueError, RuntimeError):
            return []

    modules = []
    for i, community in enumerate(communities):
        community_list = list(community)

        # Get key nodes (highest PageRank within community)
        subgraph = view.subgraph(community_list)
        if subgraph.number_of_nodes() > 0:
            try:
                local_pr = nx.pagerank(subgraph)
                key_nodes = sorted(local_pr.items(), key=lambda x: x[1], reverse=True)[:3]
            except (nx.NetworkXError, ValueError, RuntimeError):
                key_nodes = [(n, 0) for n in community_list[:3]]
        else:
            key_nodes = []

        # Calculate cohesion (internal/external edge ratio)
        cohesion = self._calculate_cohesion(view, community)

        modules.append(
            {
                "module_id": i,
                "size": len(community_list),
                "key_nodes": [{"id": n, "score": s} for n, s in key_nodes],
                "members": community_list,
                "cohesion": cohesion,
            },
        )

    # Sort by size (largest modules first)
    modules.sort(key=lambda x: x["size"], reverse=True)

    return modules
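The Louvain fallback path can be sketched directly (toy graph, not the package's CodeGraph): two dense groups joined by a single weak edge should come back as two communities.

```python
import networkx as nx

# Two triangles bridged by one edge.
g = nx.Graph()
g.add_edges_from([
    ("a", "b"), ("b", "c"), ("a", "c"),   # cluster 1
    ("x", "y"), ("y", "z"), ("x", "z"),   # cluster 2
    ("c", "x"),                            # weak bridge
])

communities = nx.community.louvain_communities(g, resolution=1.0, seed=42)
print(sorted(len(c) for c in communities))  # [3, 3]
```

Raising resolution above 1.0 biases the algorithm toward more, smaller communities; lowering it merges them.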

find_clusters_by_pattern

find_clusters_by_pattern(rule_id)

Find clusters of nodes matching a specific AST-grep rule.

Groups nodes by their rule_id metadata to find related business logic patterns.

Parameters:

    rule_id (str, required): The rule identifier to filter by.

Returns:

    list[dict[str, Any]]: List of matching nodes grouped by file.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_clusters_by_pattern(self, rule_id: str) -> list[dict[str, Any]]:
    """Find clusters of nodes matching a specific AST-grep rule.

    Groups nodes by their rule_id metadata to find related
    business logic patterns.

    Args:
        rule_id: The rule identifier to filter by

    Returns:
        List of matching nodes grouped by file
    """
    matching_nodes: dict[str, list[dict[str, Any]]] = {}

    for node_id, data in self.graph.nodes(data=True):
        if data.get("rule_id") == rule_id:
            file_path = data.get("file_path", "unknown")
            if file_path not in matching_nodes:
                matching_nodes[file_path] = []
            matching_nodes[file_path].append({"id": node_id, **data})

    return [{"file": f, "matches": m, "count": len(m)} for f, m in matching_nodes.items()]

find_clusters_by_category

find_clusters_by_category(category)

Find all nodes matching a business logic category.

Parameters:

    category (str, required): Category to filter by (e.g., "db", "auth", "http").

Returns:

    list[dict[str, Any]]: List of matching nodes with their locations.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_clusters_by_category(self, category: str) -> list[dict[str, Any]]:
    """Find all nodes matching a business logic category.

    Args:
        category: Category to filter by (e.g., "db", "auth", "http")

    Returns:
        List of matching nodes with their locations
    """
    matches = []

    for node_id, data in self.graph.nodes(data=True):
        if data.get("category") == category:
            matches.append({"id": node_id, **data})

    return matches

find_triangles

find_triangles(top_k=10)

Find tightly-coupled code triads using triangle detection.

Triangles in the call/import graph indicate three pieces of code that all depend on each other — potential cohesion or coupling issues.

Parameters:

    top_k (int, default 10): Maximum number of triangles to return.

Returns:

    list[dict[str, Any]]: List of triangle dictionaries with the three node IDs.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_triangles(self, top_k: int = 10) -> list[dict[str, Any]]:
    """Find tightly-coupled code triads using triangle detection.

    Triangles in the call/import graph indicate three pieces of code
    that all depend on each other — potential cohesion or coupling issues.

    Args:
        top_k: Maximum number of triangles to return

    Returns:
        List of triangle dictionaries with the three node IDs
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.IMPORTS])
    undirected = view.to_undirected()

    triangles = []
    try:
        for triangle in nx.enumerate_all_cliques(undirected):
            if len(triangle) == 3:
                triangles.append(
                    {
                        "nodes": list(triangle),
                        "node_details": [{"id": n, **(self.graph.get_node_data(n) or {})} for n in triangle],
                    },
                )
                if len(triangles) >= top_k:
                    break
    except nx.NetworkXError:
        pass  # graph structure doesn't support triangle detection (e.g. directed)

    return triangles
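The clique enumeration trick used here is worth seeing in isolation (illustrative graph): the directed view is flattened to undirected, and every 3-clique is a triad of mutually connected elements.

```python
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")])
undirected = g.to_undirected()

# enumerate_all_cliques yields cliques in increasing size;
# keep only the 3-cliques (triangles).
triangles = [c for c in nx.enumerate_all_cliques(undirected) if len(c) == 3]
print([sorted(t) for t in triangles])  # [['a', 'b', 'c']]
```

Note "d" touches only "a", so it forms no triangle. Because cliques are yielded smallest-first, the method can break out early once top_k triangles are collected.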

get_similar_nodes

get_similar_nodes(node_id, top_k=5)

Find nodes similar to a given node based on graph structure.

Uses personalized PageRank to find nodes closely related to the target node.

Parameters:

    node_id (str, required): The node to find similar nodes for.
    top_k (int, default 5): Number of similar nodes to return.

Returns:

    list[dict[str, Any]]: List of similar nodes with similarity scores.

Source code in src/code_context_agent/tools/graph/analysis.py
def get_similar_nodes(self, node_id: str, top_k: int = 5) -> list[dict[str, Any]]:
    """Find nodes similar to a given node based on graph structure.

    Uses personalized PageRank to find nodes closely related
    to the target node.

    Args:
        node_id: The node to find similar nodes for
        top_k: Number of similar nodes to return

    Returns:
        List of similar nodes with similarity scores
    """
    view = self.graph.get_view()

    if not view.has_node(node_id):
        return []

    try:
        # Personalized PageRank with target node as seed
        ppr = nx.pagerank(view, personalization={node_id: 1}, alpha=0.85)
    except nx.NetworkXError:
        return []

    # Remove self, sort by score
    del ppr[node_id]
    ranked = sorted(ppr.items(), key=lambda x: x[1], reverse=True)[:top_k]

    return [{"id": n, "similarity": s, **(self.graph.get_node_data(n) or {})} for n, s in ranked if s > 0]

calculate_coupling

calculate_coupling(node_a, node_b)

Calculate coupling strength between two nodes.

Considers shared neighbors, direct edges, and path length.

Parameters:

    node_a (str, required): First node ID.
    node_b (str, required): Second node ID.

Returns:

    dict[str, Any]: Dictionary with coupling metrics.

Source code in src/code_context_agent/tools/graph/analysis.py
def calculate_coupling(self, node_a: str, node_b: str) -> dict[str, Any]:
    """Calculate coupling strength between two nodes.

    Considers shared neighbors, direct edges, and path length.

    Args:
        node_a: First node ID
        node_b: Second node ID

    Returns:
        Dictionary with coupling metrics
    """
    view = self.graph.get_view()

    if not view.has_node(node_a) or not view.has_node(node_b):
        return {"error": "Node not found", "coupling": 0.0}

    # Direct edge count
    direct_edges = 0
    if view.has_edge(node_a, node_b):
        direct_edges += 1
    if view.has_edge(node_b, node_a):
        direct_edges += 1

    # Shared neighbors
    neighbors_a = set(view.successors(node_a)) | set(view.predecessors(node_a))
    neighbors_b = set(view.successors(node_b)) | set(view.predecessors(node_b))
    shared = neighbors_a & neighbors_b

    # Shortest path length
    try:
        path_length = nx.shortest_path_length(view.to_undirected(), node_a, node_b)
    except nx.NetworkXNoPath:
        path_length = float("inf")

    # Calculate coupling score (higher = more coupled)
    coupling = direct_edges * 2.0 + len(shared) * 0.5 + (1.0 / (path_length + 1))

    return {
        "node_a": node_a,
        "node_b": node_b,
        "direct_edges": direct_edges,
        "shared_neighbors": len(shared),
        "path_length": path_length if path_length != float("inf") else None,
        "coupling": coupling,
    }

get_dependency_chain

get_dependency_chain(
    node_id, direction="outgoing", max_depth=5
)

Get the dependency chain from/to a node.

Parameters:

    node_id (str, required): Starting node.
    direction (str, default "outgoing"): "outgoing" (what this depends on) or "incoming" (what depends on this).
    max_depth (int, default 5): Maximum depth to traverse.

Returns:

    dict[str, Any]: Dictionary with nodes and edges in the chain.

Source code in src/code_context_agent/tools/graph/analysis.py
def get_dependency_chain(self, node_id: str, direction: str = "outgoing", max_depth: int = 5) -> dict[str, Any]:
    """Get the dependency chain from/to a node.

    Args:
        node_id: Starting node
        direction: "outgoing" (what this depends on) or "incoming" (what depends on this)
        max_depth: Maximum depth to traverse

    Returns:
        Dictionary with nodes and edges in the chain
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.IMPORTS])

    if not view.has_node(node_id):
        return {"error": "Node not found"}

    if direction == "outgoing":
        nodes = dict(nx.single_source_shortest_path_length(view, node_id, cutoff=max_depth))
    else:
        # Incoming: traverse reverse graph
        reverse = view.reverse()
        nodes = dict(nx.single_source_shortest_path_length(reverse, node_id, cutoff=max_depth))

    # Get edges within the discovered nodes
    subgraph = view.subgraph(nodes.keys())
    edges = list(subgraph.edges(data=True))

    return {
        "root": node_id,
        "direction": direction,
        "depth": max_depth,
        "nodes": [{"id": n, "distance": d, **(self.graph.get_node_data(n) or {})} for n, d in nodes.items()],
        "edges": [{"source": u, "target": v, **d} for u, v, d in edges],
    }

find_unused_symbols

find_unused_symbols(node_types=None, exclude_patterns=None)

Find symbols with zero incoming cross-file references.

Identifies functions, classes, and methods that are defined but never referenced from other files — dead code candidates.

Parameters:

    node_types (list[str] | None, default None): Filter to specific types (default: function, class, method).
    exclude_patterns (list[str] | None, default None): Regex patterns to exclude from results.

Returns:

    list[dict[str, Any]]: List of unused symbol dicts with id, name, file_path, node_type.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_unused_symbols(
    self,
    node_types: list[str] | None = None,
    exclude_patterns: list[str] | None = None,
) -> list[dict[str, Any]]:
    """Find symbols with zero incoming cross-file references.

    Identifies functions, classes, and methods that are defined but
    never referenced from other files — dead code candidates.

    Args:
        node_types: Filter to specific types (default: function, class, method)
        exclude_patterns: Regex patterns to exclude from results

    Returns:
        List of unused symbol dicts with id, name, file_path, node_type
    """
    target_types = (
        set(node_types)
        if node_types
        else {
            NodeType.FUNCTION.value,
            NodeType.CLASS.value,
            NodeType.METHOD.value,
        }
    )
    default_excludes = [r"^test_", r"^_", r"__init__", r"__main__"]
    excludes = [re.compile(p) for p in (exclude_patterns or default_excludes)]

    view = self.graph.get_view([EdgeType.REFERENCES, EdgeType.CALLS, EdgeType.IMPORTS])

    unused = []
    for node_id, data in self.graph.nodes(data=True):
        if data.get("node_type") not in target_types:
            continue

        name = str(data.get("name", ""))
        if any(pat.search(name) for pat in excludes):
            continue

        node_file = data.get("file_path", "")
        if not node_file:
            continue

        # Count incoming edges from OTHER files
        cross_file_refs = 0
        if view.has_node(node_id):
            for pred in view.predecessors(node_id):
                pred_data = self.graph.get_node_data(pred)
                pred_file = (pred_data or {}).get("file_path", "")
                if pred_file and pred_file != node_file:
                    cross_file_refs += 1
                    break  # One is enough to disqualify

        if cross_file_refs == 0:
            unused.append(
                {
                    "id": node_id,
                    "name": name,
                    "file_path": node_file,
                    "node_type": data.get("node_type"),
                    "line_start": data.get("line_start", 0),
                },
            )

    unused.sort(key=lambda x: (x["file_path"], x.get("line_start", 0)))
    return unused

find_refactoring_candidates

find_refactoring_candidates(top_k=10)

Identify refactoring opportunities by combining multiple signals.

Combines three signals:

- Clone pairs (SIMILAR_TO edges) -> "extract shared helper"
- Code smell pattern matches (rule_id contains "code_smell") -> structural issues
- Unused symbols -> "dead code removal"

Parameters:

    top_k (int, default 10): Maximum number of candidates to return.

Returns:

    list[dict[str, Any]]: Ranked list of refactoring candidates with type, files, and rationale.

Source code in src/code_context_agent/tools/graph/analysis.py
def find_refactoring_candidates(self, top_k: int = 10) -> list[dict[str, Any]]:  # noqa: C901
    """Identify refactoring opportunities by combining multiple signals.

    Combines:
    - Clone pairs (SIMILAR_TO edges) -> "extract shared helper"
    - Code smell pattern matches (rule_id contains "code_smell") -> structural issues
    - Unused symbols -> "dead code removal"

    Args:
        top_k: Maximum number of candidates to return

    Returns:
        Ranked list of refactoring candidates with type, files, and rationale.
    """
    candidates: list[dict[str, Any]] = []

    # 1. Clone groups from SIMILAR_TO edges
    similar_edges = self.graph.get_edges_by_type(EdgeType.SIMILAR_TO)
    clone_groups: dict[str, list[str]] = {}
    for source, target, data in similar_edges:
        key = f"{source}|{target}" if source < target else f"{target}|{source}"
        if key not in clone_groups:
            clone_groups[key] = [source, target]
            candidates.append(
                {
                    "type": "extract_helper",
                    "pattern": f"Duplicate code between {source} and {target}",
                    "files": [source, target],
                    "occurrence_count": 2,
                    "duplicated_lines": int(data.get("duplicated_lines", 0)),
                    "score": int(data.get("duplicated_lines", 5)) * 2.0,
                },
            )

    # 2. Code smell patterns
    smell_counts: dict[str, list[str]] = {}
    for node_id, data in self.graph.nodes(data=True):
        rule_id = data.get("rule_id", "")
        note = data.get("note", "")
        if "code_smell" in note or "code_smell" in rule_id:
            if rule_id not in smell_counts:
                smell_counts[rule_id] = []
            smell_counts[rule_id].append(data.get("file_path", node_id))

    for rule_id, files in smell_counts.items():
        candidates.append(
            {
                "type": "code_smell",
                "pattern": rule_id,
                "files": list(set(files)),
                "occurrence_count": len(files),
                "duplicated_lines": 0,
                "score": len(files) * 1.5,
            },
        )

    # 3. Unused symbols
    unused = self.find_unused_symbols()
    if unused:
        # Group by file
        by_file: dict[str, list[str]] = {}
        for sym in unused:
            fp = sym["file_path"]
            if fp not in by_file:
                by_file[fp] = []
            by_file[fp].append(sym["name"])

        for fp, names in by_file.items():
            candidates.append(
                {
                    "type": "dead_code",
                    "pattern": f"{len(names)} unused symbol(s) in {fp}",
                    "files": [fp],
                    "occurrence_count": len(names),
                    "duplicated_lines": 0,
                    "score": len(names) * 1.0,
                },
            )

    # Sort by score descending, return top_k
    candidates.sort(key=lambda x: x["score"], reverse=True)
    return candidates[:top_k]

ProgressiveExplorer

ProgressiveExplorer(graph, analyzer=None)

Staged exploration of code graph for AI context generation.

Tracks what has been explored and suggests next steps for progressive disclosure of codebase structure.

Initialize the explorer.

Parameters:

    graph (CodeGraph, required): The CodeGraph to explore.
    analyzer (CodeAnalyzer | None, default None): Optional CodeAnalyzer (created if not provided).
Source code in src/code_context_agent/tools/graph/disclosure.py
def __init__(self, graph: CodeGraph, analyzer: CodeAnalyzer | None = None) -> None:
    """Initialize the explorer.

    Args:
        graph: The CodeGraph to explore
        analyzer: Optional CodeAnalyzer (created if not provided)
    """
    self.graph = graph
    self.analyzer = analyzer or CodeAnalyzer(graph)
    self.explored: set[str] = set()

get_overview

get_overview()

Get high-level codebase structure (Level 0).

Provides entry points, hotspots, modules, and foundations for initial orientation.

Returns:

    dict[str, Any]: Dictionary with overview information.

Source code in src/code_context_agent/tools/graph/disclosure.py
def get_overview(self) -> dict[str, Any]:
    """Get high-level codebase structure (Level 0).

    Provides entry points, hotspots, modules, and foundations
    for initial orientation.

    Returns:
        Dictionary with overview information
    """
    entry_points = self.analyzer.find_entry_points()[:5]
    hotspots = self.analyzer.find_hotspots(5)
    modules = self.analyzer.detect_modules()
    foundations = self.analyzer.find_foundations(5)

    # Mark overview nodes as explored
    for ep in entry_points:
        self.explored.add(ep["id"])
    for hs in hotspots:
        self.explored.add(hs["id"])
    for found in foundations:
        self.explored.add(found["id"])

    return {
        "total_nodes": self.graph.node_count,
        "total_edges": self.graph.edge_count,
        "entry_points": entry_points,
        "hotspots": hotspots,
        "modules": [
            {
                "module_id": m["module_id"],
                "size": m["size"],
                "key_nodes": m["key_nodes"],
                "cohesion": m["cohesion"],
            }
            for m in modules
        ],
        "foundations": foundations,
        "explored_count": len(self.explored),
    }

expand_node

expand_node(node_id, depth=1)

Expand exploration from a specific node (Level 1+).

Uses BFS to discover nodes within the specified depth.

Parameters:

    node_id (str, required): The node to expand from.
    depth (int, default 1): Number of hops to expand.

Returns:

    dict[str, Any]: Dictionary with discovered nodes, edges, and suggestions.

Source code in src/code_context_agent/tools/graph/disclosure.py
def expand_node(self, node_id: str, depth: int = 1) -> dict[str, Any]:
    """Expand exploration from a specific node (Level 1+).

    Uses BFS to discover nodes within the specified depth.

    Args:
        node_id: The node to expand from
        depth: Number of hops to expand

    Returns:
        Dictionary with discovered nodes, edges, and suggestions
    """
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.REFERENCES])

    if not view.has_node(node_id):
        return {"error": f"Node not found: {node_id}"}

    # BFS expansion
    try:
        distances = dict(nx.single_source_shortest_path_length(view, node_id, cutoff=depth))
    except nx.NetworkXError:
        distances = {node_id: 0}

    # Get the subgraph
    subgraph = view.subgraph(distances.keys())

    # Mark as explored
    self.explored.update(distances.keys())

    # Get node data
    discovered_nodes = []
    for n, dist in distances.items():
        node_data = self.graph.get_node_data(n) or {}
        discovered_nodes.append(
            {
                "id": n,
                "distance": dist,
                **node_data,
            },
        )

    # Get edges
    edges = [{"source": u, "target": v, **d} for u, v, d in subgraph.edges(data=True)]

    # Suggest next nodes to explore (high-degree nodes not yet explored)
    suggested_next = self._suggest_next_exploration(view, distances)

    return {
        "center": node_id,
        "depth": depth,
        "discovered_nodes": discovered_nodes,
        "edges": edges,
        "suggested_next": suggested_next,
        "explored_count": len(self.explored),
    }

expand_module

expand_module(module_id)

Explore an entire detected module.

Parameters:

    module_id (int, required): The module ID from detect_modules().

Returns:

    dict[str, Any]: Dictionary with module details and internal structure.

Source code in src/code_context_agent/tools/graph/disclosure.py
def expand_module(self, module_id: int) -> dict[str, Any]:
    """Explore an entire detected module.

    Args:
        module_id: The module ID from detect_modules()

    Returns:
        Dictionary with module details and internal structure
    """
    modules = self.analyzer.detect_modules()

    if module_id < 0 or module_id >= len(modules):
        return {"error": f"Module not found: {module_id}"}

    module = modules[module_id]
    members = module["members"]

    # Mark module members as explored
    self.explored.update(members)

    # Get internal structure
    view = self.graph.get_view([EdgeType.CALLS, EdgeType.REFERENCES])
    subgraph = view.subgraph(members)

    # Detailed node info
    nodes = []
    for n in members:
        node_data = self.graph.get_node_data(n) or {}
        in_deg = subgraph.in_degree(n)
        out_deg = subgraph.out_degree(n)
        nodes.append(
            {
                "id": n,
                "in_degree": in_deg,
                "out_degree": out_deg,
                **node_data,
            },
        )

    # Sort by degree (most connected first)
    nodes.sort(key=lambda x: x["in_degree"] + x["out_degree"], reverse=True)

    # Internal edges
    edges = [{"source": u, "target": v, **d} for u, v, d in subgraph.edges(data=True)]

    # External connections (edges to/from outside the module)
    external_in = []
    external_out = []
    for member in members:
        for pred in view.predecessors(member):
            if pred not in members:
                external_in.append({"from": pred, "to": member})
        for succ in view.successors(member):
            if succ not in members:
                external_out.append({"from": member, "to": succ})

    return {
        "module_id": module_id,
        "size": len(members),
        "key_nodes": module["key_nodes"],
        "cohesion": module["cohesion"],
        "nodes": nodes[:20],  # Limit to top 20 by degree
        "edges": edges,
        "external_incoming": external_in[:10],
        "external_outgoing": external_out[:10],
        "explored_count": len(self.explored),
    }
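
The boundary scan above can be sketched with plain networkx (illustrative node IDs, not the package API): edges crossing the module boundary are found by checking each member's predecessors and successors against the member set.

```python
import networkx as nx

# Sketch of the external-connection scan in expand_module(): an edge is
# external when exactly one endpoint belongs to the module.
g = nx.DiGraph()
g.add_edge("mod/a.py:f", "mod/b.py:g")       # internal edge
g.add_edge("other.py:caller", "mod/a.py:f")  # external incoming
g.add_edge("mod/b.py:g", "other.py:sink")    # external outgoing

members = {"mod/a.py:f", "mod/b.py:g"}
external_in = [(p, m) for m in members for p in g.predecessors(m) if p not in members]
external_out = [(m, s) for m in members for s in g.successors(m) if s not in members]
print(external_in)   # [('other.py:caller', 'mod/a.py:f')]
print(external_out)  # [('mod/b.py:g', 'other.py:sink')]
```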

get_path_between

get_path_between(source, target)

Find shortest path between two nodes.

Parameters:

Name Type Description Default
source str

Source node ID

required
target str

Target node ID

required

Returns:

Type Description
dict[str, Any]

Dictionary with path information

Source code in src/code_context_agent/tools/graph/disclosure.py
def get_path_between(self, source: str, target: str) -> dict[str, Any]:
    """Find shortest path between two nodes.

    Args:
        source: Source node ID
        target: Target node ID

    Returns:
        Dictionary with path information
    """
    view = self.graph.get_view()

    if not view.has_node(source):
        return {"error": f"Source node not found: {source}"}
    if not view.has_node(target):
        return {"error": f"Target node not found: {target}"}

    try:
        path = nx.shortest_path(view, source, target)
    except nx.NetworkXNoPath:
        return {"path": None, "message": "No path found between nodes"}

    # Mark path as explored
    self.explored.update(path)

    # Get node data along path
    path_nodes = []
    for i, n in enumerate(path):
        node_data = self.graph.get_node_data(n) or {}
        path_nodes.append({"id": n, "position": i, **node_data})

    # Get edges along path
    path_edges = []
    for i in range(len(path) - 1):
        edge_data = {}
        if view.has_edge(path[i], path[i + 1]):
            edge_data = dict(view[path[i]][path[i + 1]])
        path_edges.append({"source": path[i], "target": path[i + 1], **edge_data})

    return {
        "path": path,
        "length": len(path) - 1,
        "nodes": path_nodes,
        "edges": path_edges,
        "explored_count": len(self.explored),
    }
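
The error handling above hinges on a NetworkX convention worth knowing: `nx.shortest_path` raises `NetworkXNoPath` for disconnected nodes rather than returning `None`. A minimal sketch with illustrative node IDs:

```python
import networkx as nx

# Mirror the pattern get_path_between() relies on: catch NetworkXNoPath
# explicitly instead of testing the return value.
g = nx.DiGraph()
g.add_edge("api.py:handler", "svc.py:process")
g.add_edge("svc.py:process", "db.py:save")
g.add_node("util.py:orphan")  # reachable from nothing

path = nx.shortest_path(g, "api.py:handler", "db.py:save")
print(path)  # ['api.py:handler', 'svc.py:process', 'db.py:save']

try:
    nx.shortest_path(g, "api.py:handler", "util.py:orphan")
except nx.NetworkXNoPath:
    print("no path")
```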

explore_category

explore_category(category)

Explore all nodes in a business logic category.

Parameters:

Name Type Description Default
category str

Category to explore (e.g., "db", "auth", "http")

required

Returns:

Type Description
dict[str, Any]

Dictionary with categorized nodes

Source code in src/code_context_agent/tools/graph/disclosure.py
def explore_category(self, category: str) -> dict[str, Any]:
    """Explore all nodes in a business logic category.

    Args:
        category: Category to explore (e.g., "db", "auth", "http")

    Returns:
        Dictionary with categorized nodes
    """
    matches = self.analyzer.find_clusters_by_category(category)

    # Mark as explored
    for m in matches:
        self.explored.add(m["id"])

    # Group by file
    by_file: dict[str, list[dict[str, Any]]] = {}
    for m in matches:
        file_path = m.get("file_path", "unknown")
        if file_path not in by_file:
            by_file[file_path] = []
        by_file[file_path].append(m)

    return {
        "category": category,
        "total_count": len(matches),
        "files_count": len(by_file),
        "by_file": [{"file": f, "matches": m, "count": len(m)} for f, m in by_file.items()],
        "explored_count": len(self.explored),
    }

get_exploration_status

get_exploration_status()

Get the current exploration status.

Returns:

Type Description
dict[str, Any]

Dictionary with exploration statistics

Source code in src/code_context_agent/tools/graph/disclosure.py
def get_exploration_status(self) -> dict[str, Any]:
    """Get the current exploration status.

    Returns:
        Dictionary with exploration statistics
    """
    total = self.graph.node_count
    explored = len(self.explored)

    return {
        "total_nodes": total,
        "explored_nodes": explored,
        "unexplored_nodes": total - explored,
        "coverage_percent": (explored / total * 100) if total > 0 else 0,
    }

reset_exploration

reset_exploration()

Reset exploration state to start fresh.

Source code in src/code_context_agent/tools/graph/disclosure.py
def reset_exploration(self) -> None:
    """Reset exploration state to start fresh."""
    self.explored.clear()

CodeEdge

Bases: FrozenModel

An edge in the code graph representing a relationship.

Attributes:

Name Type Description
source str

Source node ID

target str

Target node ID

edge_type EdgeType

Classification of the relationship

weight float

Edge weight for algorithms (default 1.0)

metadata dict[str, Any]

Additional properties (line where relationship occurs, etc.)

to_dict

to_dict()

Convert to dictionary for serialization.

Source code in src/code_context_agent/tools/graph/model.py
def to_dict(self) -> dict[str, Any]:
    """Convert to dictionary for serialization."""
    result = self.model_dump()
    result["edge_type"] = self.edge_type.value
    return result

CodeGraph

CodeGraph()

Multi-layer code graph supporting multiple relationship types.

Wraps a NetworkX MultiDiGraph to support: - Multiple edge types between the same node pair - Node/edge attributes for metadata - Filtered views for specific relationship types

Initialize an empty code graph.

Source code in src/code_context_agent/tools/graph/model.py
def __init__(self) -> None:
    """Initialize an empty code graph."""
    self._graph: nx.MultiDiGraph = nx.MultiDiGraph()

node_count property

node_count

Return the number of nodes.

edge_count property

edge_count

Return the number of edges.

add_node

add_node(node)

Add a node to the graph.

Parameters:

Name Type Description Default
node CodeNode

The CodeNode to add

required
Source code in src/code_context_agent/tools/graph/model.py
def add_node(self, node: CodeNode) -> None:
    """Add a node to the graph.

    Args:
        node: The CodeNode to add
    """
    self._graph.add_node(
        node.id,
        name=node.name,
        node_type=node.node_type.value,
        file_path=node.file_path,
        line_start=node.line_start,
        line_end=node.line_end,
        **node.metadata,
    )

add_edge

add_edge(edge)

Add an edge to the graph.

Parameters:

Name Type Description Default
edge CodeEdge

The CodeEdge to add

required
Source code in src/code_context_agent/tools/graph/model.py
def add_edge(self, edge: CodeEdge) -> None:
    """Add an edge to the graph.

    Args:
        edge: The CodeEdge to add
    """
    self._graph.add_edge(
        edge.source,
        edge.target,
        key=edge.edge_type.value,
        edge_type=edge.edge_type.value,
        weight=edge.weight,
        **edge.metadata,
    )

has_node

has_node(node_id)

Check if a node exists in the graph.

Source code in src/code_context_agent/tools/graph/model.py
def has_node(self, node_id: str) -> bool:
    """Check if a node exists in the graph."""
    return self._graph.has_node(node_id)

has_edge

has_edge(source, target, edge_type=None)

Check if an edge exists in the graph.

Parameters:

Name Type Description Default
source str

Source node ID

required
target str

Target node ID

required
edge_type EdgeType | None

Optional edge type to check for specifically

None
Source code in src/code_context_agent/tools/graph/model.py
def has_edge(self, source: str, target: str, edge_type: EdgeType | None = None) -> bool:
    """Check if an edge exists in the graph.

    Args:
        source: Source node ID
        target: Target node ID
        edge_type: Optional edge type to check for specifically
    """
    if edge_type is None:
        return self._graph.has_edge(source, target)
    return self._graph.has_edge(source, target, key=edge_type.value)
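
The keyed lookup above uses MultiDiGraph's `key` parameter: the edge key assigned at insertion time distinguishes parallel edges of different types. A small sketch (illustrative IDs):

```python
import networkx as nx

# Sketch of the keyed check behind has_edge(): omitting key matches any
# edge between the pair; passing key matches only that edge type.
mg = nx.MultiDiGraph()
mg.add_edge("a.py:f", "b.py:g", key="calls")

print(mg.has_edge("a.py:f", "b.py:g"))                 # True (any type)
print(mg.has_edge("a.py:f", "b.py:g", key="calls"))    # True
print(mg.has_edge("a.py:f", "b.py:g", key="imports"))  # False
```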

get_node_data

get_node_data(node_id)

Get the data associated with a node.

Parameters:

Name Type Description Default
node_id str

The node ID to look up

required

Returns:

Type Description
dict[str, Any] | None

Dictionary of node attributes or None if not found

Source code in src/code_context_agent/tools/graph/model.py
def get_node_data(self, node_id: str) -> dict[str, Any] | None:
    """Get the data associated with a node.

    Args:
        node_id: The node ID to look up

    Returns:
        Dictionary of node attributes or None if not found
    """
    if not self._graph.has_node(node_id):
        return None
    return dict(self._graph.nodes[node_id])

get_nodes_by_type

get_nodes_by_type(node_type)

Get all node IDs of a specific type.

Parameters:

Name Type Description Default
node_type NodeType

The type to filter by

required

Returns:

Type Description
list[str]

List of node IDs matching the type

Source code in src/code_context_agent/tools/graph/model.py
def get_nodes_by_type(self, node_type: NodeType) -> list[str]:
    """Get all node IDs of a specific type.

    Args:
        node_type: The type to filter by

    Returns:
        List of node IDs matching the type
    """
    return [n for n, d in self._graph.nodes(data=True) if d.get("node_type") == node_type.value]

get_edges_by_type

get_edges_by_type(edge_type)

Get all edges of a specific type.

Parameters:

Name Type Description Default
edge_type EdgeType

The type to filter by

required

Returns:

Type Description
list[tuple[str, str, dict[str, Any]]]

List of (source, target, data) tuples

Source code in src/code_context_agent/tools/graph/model.py
def get_edges_by_type(self, edge_type: EdgeType) -> list[tuple[str, str, dict[str, Any]]]:
    """Get all edges of a specific type.

    Args:
        edge_type: The type to filter by

    Returns:
        List of (source, target, data) tuples
    """
    return [(u, v, d) for u, v, k, d in self._graph.edges(keys=True, data=True) if k == edge_type.value]
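
Because `add_edge` stores the edge type as the MultiDiGraph key, type filtering is a comprehension over `edges(keys=True)`. A minimal sketch (illustrative IDs, not the package API):

```python
import networkx as nx

# Sketch of the key-based filter in get_edges_by_type(): the key assigned
# at add_edge time doubles as the edge-type discriminator.
mg = nx.MultiDiGraph()
mg.add_edge("a.py:f", "b.py:g", key="calls")
mg.add_edge("a.py:f", "b.py:g", key="references")

calls = [(u, v) for u, v, k in mg.edges(keys=True) if k == "calls"]
print(calls)  # [('a.py:f', 'b.py:g')]
```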

get_view

get_view(edge_types=None)

Get a filtered view of the graph for analysis algorithms.

Creates a simple DiGraph (not Multi) with only the specified edge types. Multiple edges between the same nodes are aggregated by summing weights.

Parameters:

Name Type Description Default
edge_types list[EdgeType] | None

List of edge types to include (None = all types)

None

Returns:

Type Description
DiGraph

A NetworkX DiGraph suitable for analysis algorithms

Source code in src/code_context_agent/tools/graph/model.py
def get_view(self, edge_types: list[EdgeType] | None = None) -> nx.DiGraph:
    """Get a filtered view of the graph for analysis algorithms.

    Creates a simple DiGraph (not Multi) with only the specified edge types.
    Multiple edges between the same nodes are aggregated by summing weights.

    Args:
        edge_types: List of edge types to include (None = all types)

    Returns:
        A NetworkX DiGraph suitable for analysis algorithms
    """
    view = nx.DiGraph()

    # Copy all nodes with their attributes
    view.add_nodes_from(self._graph.nodes(data=True))

    # Filter and aggregate edges
    for u, v, k, d in self._graph.edges(keys=True, data=True):
        if edge_types is None or EdgeType(k) in edge_types:
            if view.has_edge(u, v):
                # Aggregate weights
                view[u][v]["weight"] += d.get("weight", 1.0)
            else:
                view.add_edge(u, v, weight=d.get("weight", 1.0), types=[k])

    return view
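
The aggregation step above can be demonstrated standalone: two typed edges between the same pair collapse into one edge whose weight is the sum, which is what centrality algorithms expect. A sketch in plain networkx (illustrative IDs):

```python
import networkx as nx

# Sketch of the get_view() aggregation: parallel "calls" and "references"
# edges between the same nodes become one weighted DiGraph edge.
mg = nx.MultiDiGraph()
mg.add_edge("a.py:f", "b.py:g", key="calls", weight=1.0)
mg.add_edge("a.py:f", "b.py:g", key="references", weight=1.0)

view = nx.DiGraph()
view.add_nodes_from(mg.nodes(data=True))
for u, v, k, d in mg.edges(keys=True, data=True):
    if view.has_edge(u, v):
        view[u][v]["weight"] += d.get("weight", 1.0)  # aggregate weights
    else:
        view.add_edge(u, v, weight=d.get("weight", 1.0), types=[k])

print(view["a.py:f"]["b.py:g"]["weight"])  # 2.0
```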

nodes

nodes(data=False)

Return nodes, optionally with data.

Parameters:

Name Type Description Default
data bool

If True, return (node_id, data) tuples

False

Returns:

Type Description
Any

Node view from underlying NetworkX graph

Source code in src/code_context_agent/tools/graph/model.py
def nodes(self, data: bool = False) -> Any:
    """Return nodes, optionally with data.

    Args:
        data: If True, return (node_id, data) tuples

    Returns:
        Node view from underlying NetworkX graph
    """
    return self._graph.nodes(data=data)

edges

edges(data=False)

Return edges, optionally with data.

Parameters:

Name Type Description Default
data bool

If True, return (source, target, data) tuples

False

Returns:

Type Description
Any

Edge view from underlying NetworkX graph

Source code in src/code_context_agent/tools/graph/model.py
def edges(self, data: bool = False) -> Any:
    """Return edges, optionally with data.

    Args:
        data: If True, return (source, target, data) tuples

    Returns:
        Edge view from underlying NetworkX graph
    """
    return self._graph.edges(data=data)

to_node_link_data

to_node_link_data()

Export graph as node-link JSON format.

Returns:

Type Description
dict[str, Any]

Dictionary suitable for JSON serialization

Source code in src/code_context_agent/tools/graph/model.py
def to_node_link_data(self) -> dict[str, Any]:
    """Export graph as node-link JSON format.

    Returns:
        Dictionary suitable for JSON serialization
    """
    return nx.node_link_data(self._graph)

from_node_link_data classmethod

from_node_link_data(data)

Create a CodeGraph from node-link JSON format.

Handles both old NetworkX format ("links" key) and new 3.6+ format ("edges" key) for backward compatibility with saved graphs.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary from node_link_data or JSON

required

Returns:

Type Description
CodeGraph

New CodeGraph instance

Source code in src/code_context_agent/tools/graph/model.py
@classmethod
def from_node_link_data(cls, data: dict[str, Any]) -> "CodeGraph":
    """Create a CodeGraph from node-link JSON format.

    Handles both old NetworkX format ("links" key) and new 3.6+
    format ("edges" key) for backward compatibility with saved graphs.

    Args:
        data: Dictionary from node_link_data or JSON

    Returns:
        New CodeGraph instance
    """
    graph = cls()
    # Handle both old ("links") and new ("edges") format
    if "links" in data and "edges" not in data:
        graph._graph = nx.node_link_graph(data, edges="links")
    else:
        graph._graph = nx.node_link_graph(data)
    return graph
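
The round trip behind save and load is plain NetworkX: `node_link_data` produces a JSON-serializable dict whose edge-list key ("links" vs "edges") depends on the NetworkX version, which is exactly why `from_node_link_data` checks for both. A sketch with illustrative IDs:

```python
import networkx as nx

# Round-trip sketch: the dict embeds "directed" and "multigraph" flags,
# so node_link_graph reconstructs the same graph class. Recent NetworkX
# versions may emit a FutureWarning about the edge-list key rename.
g = nx.MultiDiGraph()
g.add_edge("a.py:f", "b.py:g", key="calls", weight=1.0)

data = nx.node_link_data(g)
restored = nx.node_link_graph(data)
print(restored.has_edge("a.py:f", "b.py:g"))  # True
```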

describe

describe()

Get a quick summary of the graph.

Returns:

Type Description
dict[str, Any]

Dictionary with node count, edge count, type distributions, and density.

Source code in src/code_context_agent/tools/graph/model.py
def describe(self) -> dict[str, Any]:
    """Get a quick summary of the graph.

    Returns:
        Dictionary with node count, edge count, type distributions, and density.
    """
    node_types: dict[str, int] = {}
    for _, data in self._graph.nodes(data=True):
        nt = data.get("node_type", "unknown")
        node_types[nt] = node_types.get(nt, 0) + 1

    edge_types: dict[str, int] = {}
    for _, _, k, _ in self._graph.edges(keys=True, data=True):
        edge_types[k] = edge_types.get(k, 0) + 1

    return {
        "node_count": self.node_count,
        "edge_count": self.edge_count,
        "node_types": node_types,
        "edge_types": edge_types,
        "density": nx.density(self._graph),
    }

CodeNode

Bases: FrozenModel

A node in the code graph representing a code element.

Attributes:

Name Type Description
id str

Unique identifier (typically "file_path:symbol_name" or "file_path:line")

name str

Human-readable display name

node_type NodeType

Classification of the code element

file_path str

Absolute path to the source file

line_start int

Starting line number (0-indexed)

line_end int

Ending line number (0-indexed)

metadata dict[str, Any]

Additional properties (docstring, visibility, rule_id, etc.)

to_dict

to_dict()

Convert to dictionary for serialization.

Source code in src/code_context_agent/tools/graph/model.py
def to_dict(self) -> dict[str, Any]:
    """Convert to dictionary for serialization."""
    result = self.model_dump()
    result["node_type"] = self.node_type.value
    return result

EdgeType

Bases: Enum

Types of relationships between code elements.

NodeType

Bases: Enum

Types of nodes in the code graph.

code_graph_analyze

code_graph_analyze(
    graph_id,
    analysis_type,
    top_k=10,
    node_a="",
    node_b="",
    resolution=1.0,
    category="",
)

Run graph algorithms to surface structural insights about the codebase.

USE THIS TOOL: - After populating graph with code_graph_ingest_* tools - To find important code that isn't obvious from file names - To understand code relationships and architecture

DO NOT USE: - On an empty graph (ingest data first) - For simple lookups (use code_graph_explore instead)

Analysis types provide different perspectives:

Centrality (finds important code): - "hotspots": Betweenness centrality. Finds bottleneck code that many paths go through. High score = integration point, likely to cause cascading changes. Use for: risk assessment, refactoring targets. - "foundations": PageRank. Finds core infrastructure that other important code depends on. High score = foundational code. Use for: understanding dependencies, documentation priority. - "entry_points": Nodes with no incoming edges but outgoing calls. These start execution flows. Use for: understanding app structure.

Clustering (finds groupings): - "modules": Louvain community detection. Finds densely connected groups = logical modules/layers. Use for: architecture diagrams, understanding boundaries.

Relationships (between specific nodes): - "coupling": Measures how tightly two nodes are connected. Use for: understanding change impact, identifying tight coupling. - "similar": Personalized PageRank from a node. Finds related code. Use for: understanding a node's neighborhood. - "dependencies": BFS from a node. Shows what it depends on. Use for: understanding impact of changes.

Filtering: - "category": Finds all nodes in a business logic category. Use for: focused analysis on db/auth/validation/etc.

Code Health: - "unused_symbols": Finds functions/classes/methods with zero cross-file references. Dead code candidates. Use category param for node type filter. - "refactoring": Combines clone detection, code smells, and unused symbols into ranked refactoring opportunities.

Parameters:

Name Type Description Default
graph_id str

ID of the graph to analyze (must have data from ingestion)

required
analysis_type str

Algorithm to run. One of: - "hotspots": Returns ranked list by betweenness score - "foundations": Returns ranked list by PageRank score - "entry_points": Returns list of entry point nodes - "modules": Returns list of detected modules with members - "coupling": Returns coupling metrics (requires node_a, node_b) - "similar": Returns similar nodes (requires node_a) - "category": Returns nodes in category (requires category) - "dependencies": Returns dependency chain (requires node_a) - "trust": TrustRank-based foundations (noise-resistant PageRank from entry points) - "triangles": Find tightly-coupled code triads - "unused_symbols": Dead code detection (zero cross-file references) - "refactoring": Combined refactoring opportunity ranking

required
top_k int

Maximum results for ranked analyses (hotspots, foundations, similar). Default 10. Use 20-30 for comprehensive analysis.

10
node_a str

Required for "coupling", "similar", "dependencies". Node ID format: "file_path:symbol_name"

''
node_b str

Required for "coupling" analysis. Second node to compare.

''
resolution float

For "modules" only. Controls cluster granularity: - < 1.0: Fewer, larger clusters (e.g., 0.5 for high-level layers) - = 1.0: Default clustering - > 1.0: More, smaller clusters (e.g., 1.5 for fine-grained)

1.0
category str

Required for "category" analysis. Category name from AST-grep rule packs: "db", "auth", "http", "validation", etc.

''

Returns:

Type Description
str

JSON with analysis results. Format varies by type:

hotspots/foundations:
{"results": [{"id": "...", "score": 0.85, "name": "...", ...}]}

modules:
{"module_count": 5, "results": [{"module_id": 0, "size": 15, "key_nodes": [...], "cohesion": 0.8}]}

coupling:
{"results": {"coupling": 2.5, "shared_neighbors": 3, "path_length": 2}}

Output Size: 1-10KB depending on top_k and analysis type

Workflow Examples:

Find bottleneck code

hotspots = code_graph_analyze("main", "hotspots", top_k=15)

Results ranked by betweenness - top items are integration points

Detect architecture layers

modules = code_graph_analyze("main", "modules", resolution=0.8)

Each module is a logical grouping - name based on key_nodes

Understand coupling

coupling = code_graph_analyze("main", "coupling", node_a="src/api.py:handler", node_b="src/db.py:repository")

High coupling score = tightly connected, changes propagate

Find all database operations

db_ops = code_graph_analyze("main", "category", category="db")

Returns all nodes tagged as "db" from AST-grep ingestion

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_analyze(  # noqa: C901
    graph_id: str,
    analysis_type: str,
    top_k: int = 10,
    node_a: str = "",
    node_b: str = "",
    resolution: float = 1.0,
    category: str = "",
) -> str:
    """Run graph algorithms to surface structural insights about the codebase.

    USE THIS TOOL:
    - After populating graph with code_graph_ingest_* tools
    - To find important code that isn't obvious from file names
    - To understand code relationships and architecture

    DO NOT USE:
    - On an empty graph (ingest data first)
    - For simple lookups (use code_graph_explore instead)

    Analysis types provide different perspectives:

    **Centrality (finds important code):**
    - "hotspots": Betweenness centrality. Finds bottleneck code that many
      paths go through. High score = integration point, likely to cause
      cascading changes. Use for: risk assessment, refactoring targets.
    - "foundations": PageRank. Finds core infrastructure that other
      important code depends on. High score = foundational code.
      Use for: understanding dependencies, documentation priority.
    - "entry_points": Nodes with no incoming edges but outgoing calls.
      These start execution flows. Use for: understanding app structure.

    **Clustering (finds groupings):**
    - "modules": Louvain community detection. Finds densely connected
      groups = logical modules/layers. Use for: architecture diagrams,
      understanding boundaries.

    **Relationships (between specific nodes):**
    - "coupling": Measures how tightly two nodes are connected.
      Use for: understanding change impact, identifying tight coupling.
    - "similar": Personalized PageRank from a node. Finds related code.
      Use for: understanding a node's neighborhood.
    - "dependencies": BFS from a node. Shows what it depends on.
      Use for: understanding impact of changes.

    **Filtering:**
    - "category": Finds all nodes in a business logic category.
      Use for: focused analysis on db/auth/validation/etc.

    **Code Health:**
    - "unused_symbols": Finds functions/classes/methods with zero cross-file
      references. Dead code candidates. Use category param for node type filter.
    - "refactoring": Combines clone detection, code smells, and unused symbols
      into ranked refactoring opportunities.

    Args:
        graph_id: ID of the graph to analyze (must have data from ingestion)
        analysis_type: Algorithm to run. One of:
            - "hotspots": Returns ranked list by betweenness score
            - "foundations": Returns ranked list by PageRank score
            - "entry_points": Returns list of entry point nodes
            - "modules": Returns list of detected modules with members
            - "coupling": Returns coupling metrics (requires node_a, node_b)
            - "similar": Returns similar nodes (requires node_a)
            - "category": Returns nodes in category (requires category)
            - "dependencies": Returns dependency chain (requires node_a)
            - "trust": TrustRank-based foundations (noise-resistant PageRank from entry points)
            - "triangles": Find tightly-coupled code triads
            - "unused_symbols": Dead code detection (zero cross-file references)
            - "refactoring": Combined refactoring opportunity ranking
        top_k: Maximum results for ranked analyses (hotspots, foundations,
            similar). Default 10. Use 20-30 for comprehensive analysis.
        node_a: Required for "coupling", "similar", "dependencies".
            Node ID format: "file_path:symbol_name"
        node_b: Required for "coupling" analysis. Second node to compare.
        resolution: For "modules" only. Controls cluster granularity:
            - < 1.0: Fewer, larger clusters (e.g., 0.5 for high-level layers)
            - = 1.0: Default clustering
            - > 1.0: More, smaller clusters (e.g., 1.5 for fine-grained)
        category: Required for "category" analysis. Category name from
            AST-grep rule packs: "db", "auth", "http", "validation", etc.

    Returns:
        JSON with analysis results. Format varies by type:

        hotspots/foundations:
        {"results": [{"id": "...", "score": 0.85, "name": "...", ...}]}

        modules:
        {"module_count": 5, "results": [
            {"module_id": 0, "size": 15, "key_nodes": [...], "cohesion": 0.8}
        ]}

        coupling:
        {"results": {"coupling": 2.5, "shared_neighbors": 3, "path_length": 2}}

    Output Size: 1-10KB depending on top_k and analysis type

    Workflow Examples:

    Find bottleneck code:
        hotspots = code_graph_analyze("main", "hotspots", top_k=15)
        # Results ranked by betweenness - top items are integration points

    Detect architecture layers:
        modules = code_graph_analyze("main", "modules", resolution=0.8)
        # Each module is a logical grouping - name based on key_nodes

    Understand coupling:
        coupling = code_graph_analyze("main", "coupling",
                                       node_a="src/api.py:handler",
                                       node_b="src/db.py:repository")
        # High coupling score = tightly connected, changes propagate

    Find all database operations:
        db_ops = code_graph_analyze("main", "category", category="db")
        # Returns all nodes tagged as "db" from AST-grep ingestion
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    analyzer = CodeAnalyzer(graph)

    if analysis_type == "hotspots":
        results = analyzer.find_hotspots(top_k)
        return _json_response({"status": "success", "analysis": "hotspots", "results": results})

    if analysis_type == "foundations":
        results = analyzer.find_foundations(top_k)
        return _json_response({"status": "success", "analysis": "foundations", "results": results})

    if analysis_type == "entry_points":
        results = analyzer.find_entry_points()
        return _json_response({"status": "success", "analysis": "entry_points", "results": results})

    if analysis_type == "modules":
        results = analyzer.detect_modules(resolution)
        return _json_response(
            {"status": "success", "analysis": "modules", "module_count": len(results), "results": results},
        )

    if analysis_type == "coupling":
        if not node_a or not node_b:
            return _json_response({"status": "error", "message": "node_a and node_b required for coupling"})
        results = analyzer.calculate_coupling(node_a, node_b)
        return _json_response({"status": "success", "analysis": "coupling", "results": results})

    if analysis_type == "similar":
        if not node_a:
            return _json_response({"status": "error", "message": "node_a required for similar analysis"})
        results = analyzer.get_similar_nodes(node_a, top_k)
        return _json_response({"status": "success", "analysis": "similar", "results": results})

    if analysis_type == "category":
        if not category:
            return _json_response({"status": "error", "message": "category required for category analysis"})
        results = analyzer.find_clusters_by_category(category)
        return _json_response({"status": "success", "analysis": "category", "category": category, "results": results})

    if analysis_type == "dependencies":
        if not node_a:
            return _json_response({"status": "error", "message": "node_a required for dependencies"})
        direction = "outgoing"  # What does this node depend on
        results = analyzer.get_dependency_chain(node_a, direction)
        return _json_response({"status": "success", "analysis": "dependencies", "results": results})

    if analysis_type == "trust":
        results = analyzer.find_trusted_foundations(top_k=top_k)
        return _json_response({"status": "success", "analysis": "trust", "results": results})

    if analysis_type == "triangles":
        results = analyzer.find_triangles(top_k=top_k)
        return _json_response({"status": "success", "analysis": "triangles", "results": results})

    if analysis_type == "unused_symbols":
        node_type_filter = category.split(",") if category else None
        results = analyzer.find_unused_symbols(node_types=node_type_filter)
        return _json_response(
            {
                "status": "success",
                "analysis": "unused_symbols",
                "results": results,
                "count": len(results),
            },
        )

    if analysis_type == "refactoring":
        results = analyzer.find_refactoring_candidates(top_k=top_k)
        return _json_response(
            {
                "status": "success",
                "analysis": "refactoring",
                "results": results,
                "count": len(results),
            },
        )

    return _json_response({"status": "error", "message": f"Unknown analysis_type: {analysis_type}"})

code_graph_create

code_graph_create(graph_id, description='')

Initialize an empty code graph for structural analysis of a codebase.

USE THIS TOOL: - At the start of analysis, BEFORE running LSP/AST-grep tools - When you need to unify results from multiple discovery tools - When you want to run graph algorithms (hotspots, modules, coupling)

DO NOT USE: - If a graph with this ID already exists (will overwrite it) - For simple single-file analysis (use LSP tools directly)

The graph is stored in memory for the session. Populate it using: - code_graph_ingest_lsp: Add symbols, references, definitions from LSP - code_graph_ingest_astgrep: Add business logic patterns - code_graph_ingest_tests: Add test coverage relationships

Parameters:

Name Type Description Default
graph_id str

Unique identifier for this graph. Use descriptive names: - "main": Primary analysis graph for the whole codebase - "feature_auth": Graph focused on authentication code - "module_api": Graph for API layer only

required
description str

Human-readable description of what this graph represents. Helps when managing multiple graphs.

''

Returns:

Name Type Description
JSON str

{"status": "success", "graph_id": "...", "message": "..."}

Output Size: ~100 bytes

Workflow
  1. code_graph_create("main") # Initialize
  2. lsp_start(...) + lsp_document_symbols(...) # Discover
  3. code_graph_ingest_lsp(...) # Populate
  4. code_graph_analyze("main", "hotspots") # Analyze
  5. code_graph_save("main", ".code-context/graph.json") # Persist
Example

code_graph_create("main", "Full codebase analysis")
code_graph_create("backend", "Backend services only")

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_create(
    graph_id: str,
    description: str = "",
) -> str:
    """Initialize an empty code graph for structural analysis of a codebase.

    USE THIS TOOL:
    - At the start of analysis, BEFORE running LSP/AST-grep tools
    - When you need to unify results from multiple discovery tools
    - When you want to run graph algorithms (hotspots, modules, coupling)

    DO NOT USE:
    - If a graph with this ID already exists (will overwrite it)
    - For simple single-file analysis (use LSP tools directly)

    The graph is stored in memory for the session. Populate it using:
    - code_graph_ingest_lsp: Add symbols, references, definitions from LSP
    - code_graph_ingest_astgrep: Add business logic patterns
    - code_graph_ingest_tests: Add test coverage relationships

    Args:
        graph_id: Unique identifier for this graph. Use descriptive names:
            - "main": Primary analysis graph for the whole codebase
            - "feature_auth": Graph focused on authentication code
            - "module_api": Graph for API layer only
        description: Human-readable description of what this graph represents.
            Helps when managing multiple graphs.

    Returns:
        JSON: {"status": "success", "graph_id": "...", "message": "..."}

    Output Size: ~100 bytes

    Workflow:
        1. code_graph_create("main")           # Initialize
        2. lsp_start(...) + lsp_document_symbols(...)  # Discover
        3. code_graph_ingest_lsp(...)          # Populate
        4. code_graph_analyze("main", "hotspots")  # Analyze
        5. code_graph_save("main", ".code-context/graph.json")  # Persist

    Example:
        code_graph_create("main", "Full codebase analysis")
        code_graph_create("backend", "Backend services only")
    """
    _graphs[graph_id] = CodeGraph()
    # Reset explorer for this graph
    _explorers.pop(graph_id, None)

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "description": description,
            "message": f"Created new code graph: {graph_id}",
        },
    )

code_graph_explore

code_graph_explore(
    graph_id,
    action,
    node_id="",
    module_id=-1,
    target_node="",
    depth=1,
    category="",
)

Progressively explore the code graph to build context incrementally.

USE THIS TOOL:
  • ALWAYS start with "overview" action first
  • When you need to understand the codebase step by step
  • To get suggestions on where to explore next
  • To track what you've already explored

DO NOT USE:
  • For running analysis algorithms (use code_graph_analyze instead)
  • On an empty graph (ingest data first)

Progressive disclosure pattern:
  1. "overview" → Get entry points, hotspots, modules, foundations
  2. Pick interesting nodes from overview
  3. "expand_node" → See neighbors and relationships
  4. Repeat until sufficient context is gathered

The explorer tracks visited nodes and suggests what to explore next.
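The overview-then-expand loop can be sketched as follows. `fake_explore` is a stub standing in for the real `code_graph_explore` tool (which returns JSON strings of the shapes documented below); the node IDs are hypothetical:

```python
import json

def fake_explore(graph_id: str, action: str, node_id: str = "", depth: int = 1) -> str:
    """Stub standing in for code_graph_explore; returns canned JSON."""
    if action == "overview":
        return json.dumps({
            "status": "success",
            "hotspots": [{"id": "src/api.py:handler", "score": 0.42}],
            "explored_count": 25,
        })
    return json.dumps({
        "status": "success",
        "center": node_id,
        "suggested_next": ["src/db.py:query"],
        "explored_count": 40,
    })

# 1. Orient with the overview, then pick the top hotspot.
overview = json.loads(fake_explore("main", "overview"))
top = overview["hotspots"][0]["id"]

# 2. Expand the hotspot and follow the explorer's suggestions.
expansion = json.loads(fake_explore("main", "expand_node", node_id=top, depth=2))
frontier = expansion["suggested_next"]
```

The same loop applies with the real tool: each expansion returns a `suggested_next` list that feeds the next call.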

Actions:

Starting point:
  • "overview": Returns high-level structure. Includes:
    • entry_points: Where execution starts
    • hotspots: Bottleneck code (top 5)
    • modules: Detected clusters with key nodes
    • foundations: Core infrastructure (top 5)
    Always start here to orient yourself.

Drill-down:
  • "expand_node": BFS expansion from a node. See immediate neighbors and their relationships. Good for understanding a specific area.
  • "expand_module": Deep-dive into a detected module. Shows internal structure and external connections.
  • "category": Explore all nodes in a business logic category. Groups results by file.

Navigation:
  • "path": Find shortest path between two nodes. Useful for understanding how components connect.
  • "status": Check exploration coverage (% of nodes visited).
  • "reset": Clear exploration state to start fresh.

Parameters:

Name Type Description Default
graph_id str

ID of the graph to explore (must have data from ingestion)

required
action str

Exploration action. One of:
  • "overview": No additional params needed
  • "expand_node": Requires node_id, optional depth
  • "expand_module": Requires module_id (from overview/modules analysis)
  • "path": Requires node_id (source) and target_node
  • "category": Requires category (e.g., "db", "auth")
  • "status": No additional params
  • "reset": No additional params

required
node_id str

For "expand_node": Node ID to expand from. For "path": Source node. Format: "file_path:symbol_name"

''
module_id int

For "expand_module": Module ID from detect_modules results. Typically 0, 1, 2, etc. from the overview.

-1
target_node str

For "path": Destination node ID.

''
depth int

For "expand_node": How many hops to expand. - depth=1: Direct neighbors only (fast, focused) - depth=2: Neighbors of neighbors (broader context) - depth=3+: Rarely needed, can be large

1
category str

For "category": Business logic category name. Values from AST-grep: "db", "auth", "http", "validation", etc.

''

Returns:

JSON with exploration results. Always includes "explored_count".

For "overview":

    {
        "entry_points": [...],
        "hotspots": [...],
        "modules": [{"module_id": 0, "size": 15, "key_nodes": [...]}],
        "foundations": [...],
        "explored_count": 25
    }

For "expand_node":

    {
        "center": "src/api.py:handler",
        "discovered_nodes": [...],
        "edges": [...],
        "suggested_next": [...],  # What to explore next
        "explored_count": 40
    }

Output Size: 2-20KB depending on action and graph size

Workflow Example:

    # 1. Start with overview
    overview = code_graph_explore("main", "overview")
    # Look at entry_points and hotspots

    # 2. Expand from interesting hotspot
    details = code_graph_explore("main", "expand_node",
                                 node_id=overview["hotspots"][0]["id"],
                                 depth=2)
    # See neighbors and suggested_next

    # 3. Explore a module
    module_details = code_graph_explore("main", "expand_module", module_id=0)
    # See internal structure and external connections

    # 4. Check coverage
    status = code_graph_explore("main", "status")
    # coverage_percent shows how much of graph was explored

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_explore(  # noqa: C901
    graph_id: str,
    action: str,
    node_id: str = "",
    module_id: int = -1,
    target_node: str = "",
    depth: int = 1,
    category: str = "",
) -> str:
    """Progressively explore the code graph to build context incrementally.

    USE THIS TOOL:
    - ALWAYS start with "overview" action first
    - When you need to understand the codebase step by step
    - To get suggestions on where to explore next
    - To track what you've already explored

    DO NOT USE:
    - For running analysis algorithms (use code_graph_analyze instead)
    - On an empty graph (ingest data first)

    Progressive disclosure pattern:
    1. "overview" → Get entry points, hotspots, modules, foundations
    2. Pick interesting nodes from overview
    3. "expand_node" → See neighbors and relationships
    4. Repeat until sufficient context is gathered

    The explorer tracks visited nodes and suggests what to explore next.

    Actions:

    **Starting point:**
    - "overview": Returns high-level structure. Includes:
      - entry_points: Where execution starts
      - hotspots: Bottleneck code (top 5)
      - modules: Detected clusters with key nodes
      - foundations: Core infrastructure (top 5)
      Always start here to orient yourself.

    **Drill-down:**
    - "expand_node": BFS expansion from a node. See immediate neighbors
      and their relationships. Good for understanding a specific area.
    - "expand_module": Deep-dive into a detected module. Shows internal
      structure and external connections.
    - "category": Explore all nodes in a business logic category.
      Groups results by file.

    **Navigation:**
    - "path": Find shortest path between two nodes. Useful for
      understanding how components connect.
    - "status": Check exploration coverage (% of nodes visited).
    - "reset": Clear exploration state to start fresh.

    Args:
        graph_id: ID of the graph to explore (must have data from ingestion)
        action: Exploration action. One of:
            - "overview": No additional params needed
            - "expand_node": Requires node_id, optional depth
            - "expand_module": Requires module_id (from overview/modules analysis)
            - "path": Requires node_id (source) and target_node
            - "category": Requires category (e.g., "db", "auth")
            - "status": No additional params
            - "reset": No additional params
        node_id: For "expand_node": Node ID to expand from.
            For "path": Source node. Format: "file_path:symbol_name"
        module_id: For "expand_module": Module ID from detect_modules results.
            Typically 0, 1, 2, etc. from the overview.
        target_node: For "path": Destination node ID.
        depth: For "expand_node": How many hops to expand.
            - depth=1: Direct neighbors only (fast, focused)
            - depth=2: Neighbors of neighbors (broader context)
            - depth=3+: Rarely needed, can be large
        category: For "category": Business logic category name.
            Values from AST-grep: "db", "auth", "http", "validation", etc.

    Returns:
        JSON with exploration results. Always includes "explored_count".

        overview:
        {
            "entry_points": [...],
            "hotspots": [...],
            "modules": [{"module_id": 0, "size": 15, "key_nodes": [...]}],
            "foundations": [...],
            "explored_count": 25
        }

        expand_node:
        {
            "center": "src/api.py:handler",
            "discovered_nodes": [...],
            "edges": [...],
            "suggested_next": [...],  # What to explore next
            "explored_count": 40
        }

    Output Size: 2-20KB depending on action and graph size

    Workflow Example:

    # 1. Start with overview
    overview = code_graph_explore("main", "overview")
    # Look at entry_points and hotspots

    # 2. Expand from interesting hotspot
    details = code_graph_explore("main", "expand_node",
                                  node_id=overview["hotspots"][0]["id"],
                                  depth=2)
    # See neighbors and suggested_next

    # 3. Explore a module
    module_details = code_graph_explore("main", "expand_module", module_id=0)
    # See internal structure and external connections

    # 4. Check coverage
    status = code_graph_explore("main", "status")
    # coverage_percent shows how much of graph was explored
    """
    explorer = _get_explorer(graph_id)
    if explorer is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    if action == "overview":
        results = explorer.get_overview()
        return _json_response({"status": "success", "action": "overview", **results})

    if action == "expand_node":
        if not node_id:
            return _json_response({"status": "error", "message": "node_id required for expand_node"})
        results = explorer.expand_node(node_id, depth)
        return _json_response({"status": "success", "action": "expand_node", **results})

    if action == "expand_module":
        if module_id < 0:
            return _json_response({"status": "error", "message": "module_id required for expand_module"})
        results = explorer.expand_module(module_id)
        return _json_response({"status": "success", "action": "expand_module", **results})

    if action == "path":
        if not node_id or not target_node:
            return _json_response({"status": "error", "message": "node_id and target_node required for path"})
        results = explorer.get_path_between(node_id, target_node)
        return _json_response({"status": "success", "action": "path", **results})

    if action == "category":
        if not category:
            return _json_response({"status": "error", "message": "category required for category exploration"})
        results = explorer.explore_category(category)
        return _json_response({"status": "success", "action": "category", **results})

    if action == "status":
        results = explorer.get_exploration_status()
        return _json_response({"status": "success", "action": "status", **results})

    if action == "reset":
        explorer.reset_exploration()
        return _json_response({"status": "success", "action": "reset", "message": "Exploration state reset"})

    return _json_response({"status": "error", "message": f"Unknown action: {action}"})

code_graph_export

code_graph_export(
    graph_id,
    format="json",
    include_metadata=True,
    max_nodes=100,
)

Export the code graph for visualization or external analysis.

USE THIS TOOL:
  • To generate Mermaid diagrams for CONTEXT.md architecture section
  • To save graph data for external visualization tools
  • After analysis, to capture the graph structure

DO NOT USE:
  • For persistence (use code_graph_save instead)
  • On empty graphs (ingest data first)

Export formats:

"mermaid" (recommended for documentation): Generates Mermaid diagram syntax that can be embedded in markdown. - Selects top nodes by degree (most connected = most important) - Uses shapes based on node type: - [name]: Classes (rectangles) - (name): Functions/methods (rounded) - [[name]]: Files (stadium shape) - Edge styles by relationship: - → : calls - -.-> : imports - ==> : inherits

"json" (for external tools): NetworkX node-link format. Can be loaded into other graph tools.

Parameters:

Name Type Description Default
graph_id str

ID of the graph to export (must exist)

required
format str

Export format:
  • "mermaid": Mermaid diagram syntax (for markdown embedding)
  • "json": NetworkX node-link JSON (for external tools)

'json'
include_metadata bool

For "json" format only. Whether to include node/edge metadata (file_path, line numbers, etc.). Set False for smaller output.

True
max_nodes int

For "mermaid" format only. Maximum nodes to include. Mermaid diagrams become unreadable with too many nodes. Recommended: 15 for CONTEXT.md, up to 50 for detailed diagrams. Nodes are selected by degree (most connected first).

100

Returns:

For "mermaid":

    {
        "status": "success",
        "format": "mermaid",
        "diagram": "graph TD\n    node1[Name] --> node2..."
    }

For "json":

    {
        "status": "success",
        "format": "json",
        "graph": {"nodes": [...], "links": [...]}
    }

Output Size
  • mermaid: 1-5KB (limited by max_nodes)
  • json: 10-500KB (depends on graph size)

Workflow Example:

    # Export for CONTEXT.md architecture diagram
    result = code_graph_export("main", format="mermaid", max_nodes=15)
    mermaid_code = result["diagram"]
    # Embed in markdown:
    # ```mermaid
    # {mermaid_code}
    # ```

    # Export for external visualization
    result = code_graph_export("main", format="json", include_metadata=True)
    # Use with Gephi, D3.js, etc.
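To make the edge-style legend above concrete, here is a small sketch of turning a few edges into Mermaid flowchart syntax — similar in spirit to what the "mermaid" format produces, though the node IDs and edge tuples here are hypothetical, not the tool's internal representation:

```python
# Hypothetical (source, target, relationship) edges.
edges = [
    ("handler", "query", "calls"),
    ("handler", "models", "imports"),
]

# Arrow styles matching the legend: calls, imports, inherits.
arrows = {"calls": "-->", "imports": "-.->", "inherits": "==>"}

lines = ["graph TD"]
for src, dst, kind in edges:
    lines.append(f"    {src} {arrows[kind]} {dst}")
diagram = "\n".join(lines)

print(diagram)
```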

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_export(
    graph_id: str,
    format: str = "json",
    include_metadata: bool = True,
    max_nodes: int = 100,
) -> str:
    """Export the code graph for visualization or external analysis.

    USE THIS TOOL:
    - To generate Mermaid diagrams for CONTEXT.md architecture section
    - To save graph data for external visualization tools
    - After analysis, to capture the graph structure

    DO NOT USE:
    - For persistence (use code_graph_save instead)
    - On empty graphs (ingest data first)

    Export formats:

    **"mermaid"** (recommended for documentation):
    Generates Mermaid diagram syntax that can be embedded in markdown.
    - Selects top nodes by degree (most connected = most important)
    - Uses shapes based on node type:
      - [name]: Classes (rectangles)
      - (name): Functions/methods (rounded)
      - [[name]]: Files (stadium shape)
    - Edge styles by relationship:
      - --> : calls
      - -.-> : imports
      - ==> : inherits

    **"json"** (for external tools):
    NetworkX node-link format. Can be loaded into other graph tools.

    Args:
        graph_id: ID of the graph to export (must exist)
        format: Export format:
            - "mermaid": Mermaid diagram syntax (for markdown embedding)
            - "json": NetworkX node-link JSON (for external tools)
        include_metadata: For "json" format only. Whether to include
            node/edge metadata (file_path, line numbers, etc.).
            Set False for smaller output.
        max_nodes: For "mermaid" format only. Maximum nodes to include.
            Mermaid diagrams become unreadable with too many nodes.
            Recommended: 15 for CONTEXT.md, up to 50 for detailed diagrams.
            Nodes are selected by degree (most connected first).

    Returns:
        For "mermaid":
        {
            "status": "success",
            "format": "mermaid",
            "diagram": "graph TD\\n    node1[Name] --> node2..."
        }

        For "json":
        {
            "status": "success",
            "format": "json",
            "graph": {"nodes": [...], "links": [...]}
        }

    Output Size:
        - mermaid: 1-5KB (limited by max_nodes)
        - json: 10-500KB (depends on graph size)

    Workflow Example:

    # Export for CONTEXT.md architecture diagram
    result = code_graph_export("main", format="mermaid", max_nodes=15)
    mermaid_code = result["diagram"]
    # Embed in markdown:
    # ```mermaid
    # {mermaid_code}
    # ```

    # Export for external visualization
    result = code_graph_export("main", format="json", include_metadata=True)
    # Use with Gephi, D3.js, etc.
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    if format == "json":
        data = graph.to_node_link_data()
        if not include_metadata:
            # Strip metadata
            for node in data.get("nodes", []):
                node.pop("metadata", None)
            for link in data.get("links", []):
                link.pop("metadata", None)
        return _json_response({"status": "success", "format": "json", "graph": data})

    if format == "mermaid":
        mermaid = _export_mermaid(graph, max_nodes)
        return _json_response({"status": "success", "format": "mermaid", "diagram": mermaid})

    return _json_response({"status": "error", "message": f"Unknown format: {format}"})

code_graph_ingest_astgrep

code_graph_ingest_astgrep(
    graph_id, astgrep_result, result_type="rule_pack"
)

Add AST-grep pattern matches to the graph as categorized business logic nodes.

USE THIS TOOL:
  • After running astgrep_scan_rule_pack to add business logic patterns
  • After running astgrep_scan for custom pattern matches
  • When you want graph analysis to consider business logic categories

DO NOT USE:
  • Before code_graph_create (graph must exist first)
  • With empty AST-grep results (check match count first)

AST-grep matches become nodes with rich metadata:
  • category: "db", "auth", "http", "validation", etc.
  • severity: "error" (writes), "warning" (reads), "hint" (definitions)
  • rule_id: The specific pattern that matched

This metadata enables category-based analysis:
  • code_graph_analyze("main", "category", category="db")
  • code_graph_explore("main", "category", category="auth")

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
astgrep_result str

The raw JSON string output from astgrep_scan or astgrep_scan_rule_pack. Pass the exact return value.

required
result_type str

Source of the AST-grep result:
  • "rule_pack" (default): From astgrep_scan_rule_pack. Results include category, severity, rule_id metadata. Use this for business logic detection.
  • "scan": From astgrep_scan ad-hoc patterns. Results have pattern info but no category metadata.

'rule_pack'

Returns:

JSON with ingestion results:

    {
        "status": "success",
        "nodes_added": 25,
        "categories": ["db", "auth", "validation"],
        "total_nodes": 175
    }

Output Size: ~300 bytes

Common Errors
  • "Graph not found": Call code_graph_create first
  • "Invalid JSON": AST-grep result is malformed
Workflow Example:

    # Run rule pack for Python business logic
    matches = astgrep_scan_rule_pack("py_business_logic", repo_path)

    # Ingest into graph
    code_graph_ingest_astgrep("main", matches, "rule_pack")

    # Now analyze by category
    db_ops = code_graph_analyze("main", "category", category="db")
    auth_ops = code_graph_analyze("main", "category", category="auth")

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_astgrep(
    graph_id: str,
    astgrep_result: str,
    result_type: str = "rule_pack",
) -> str:
    """Add AST-grep pattern matches to the graph as categorized business logic nodes.

    USE THIS TOOL:
    - After running astgrep_scan_rule_pack to add business logic patterns
    - After running astgrep_scan for custom pattern matches
    - When you want graph analysis to consider business logic categories

    DO NOT USE:
    - Before code_graph_create (graph must exist first)
    - With empty AST-grep results (check match count first)

    AST-grep matches become nodes with rich metadata:
    - category: "db", "auth", "http", "validation", etc.
    - severity: "error" (writes), "warning" (reads), "hint" (definitions)
    - rule_id: The specific pattern that matched

    This metadata enables category-based analysis:
    - code_graph_analyze("main", "category", category="db")
    - code_graph_explore("main", "category", category="auth")

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        astgrep_result: The raw JSON string output from astgrep_scan or
            astgrep_scan_rule_pack. Pass the exact return value.
        result_type: Source of the AST-grep result:
            - "rule_pack" (default): From astgrep_scan_rule_pack.
              Results include category, severity, rule_id metadata.
              Use this for business logic detection.
            - "scan": From astgrep_scan ad-hoc patterns.
              Results have pattern info but no category metadata.

    Returns:
        JSON with ingestion results:
        {
            "status": "success",
            "nodes_added": 25,
            "categories": ["db", "auth", "validation"],
            "total_nodes": 175
        }

    Output Size: ~300 bytes

    Common Errors:
        - "Graph not found": Call code_graph_create first
        - "Invalid JSON": AST-grep result is malformed

    Workflow Example:
        # Run rule pack for Python business logic
        matches = astgrep_scan_rule_pack("py_business_logic", repo_path)

        # Ingest into graph
        code_graph_ingest_astgrep("main", matches, "rule_pack")

        # Now analyze by category
        db_ops = code_graph_analyze("main", "category", category="db")
        auth_ops = code_graph_analyze("main", "category", category="auth")
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        result = json.loads(astgrep_result)
    except json.JSONDecodeError as e:
        return _json_response({"status": "error", "message": f"Invalid JSON: {e}"})

    nodes_added = 0
    categories: set[str] = set()

    if result_type == "rule_pack":
        nodes = ingest_astgrep_rule_pack(result)
    else:
        nodes = ingest_astgrep_matches(result)

    for node in nodes:
        if not graph.has_node(node.id):
            graph.add_node(node)
            nodes_added += 1
            if "category" in node.metadata:
                categories.add(node.metadata["category"])

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "result_type": result_type,
            "nodes_added": nodes_added,
            "categories": list(categories),
            "total_nodes": graph.node_count,
        },
    )

code_graph_ingest_clones

code_graph_ingest_clones(graph_id, clone_result)

Add clone detection results to the graph as SIMILAR_TO edges.

USE THIS TOOL:
  • After calling detect_clones to find duplicate code blocks
  • To enable refactoring candidate analysis in code_graph_analyze

DO NOT USE:
  • Before code_graph_create (graph must exist first)
  • With empty clone results

Creates SIMILAR_TO edges between files sharing duplicate code. These edges are used by:
  • code_graph_analyze("main", "refactoring") for refactoring candidates

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
clone_result str

The raw JSON string output from detect_clones tool.

required

Returns:

    JSON: {"status": "success", "edges_added": N, "total_edges": M}

Output Size: ~150 bytes
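To illustrate how clone groups become SIMILAR_TO edges, here is a sketch that pairs up every two files sharing a duplicated block. The JSON shape of `clone_result` is assumed for illustration (the real detect_clones output may differ), and this is not the actual ingest_clone_results implementation:

```python
import itertools
import json

# Hypothetical detect_clones output: one group of three files
# sharing an 18-line duplicated block.
clone_result = json.dumps({
    "clones": [
        {"files": ["src/a.py", "src/b.py", "src/c.py"], "lines": 18},
    ],
})

# One SIMILAR_TO edge per unordered pair of files in each clone group.
edges = []
for group in json.loads(clone_result)["clones"]:
    for src, dst in itertools.combinations(sorted(group["files"]), 2):
        edges.append((src, "SIMILAR_TO", dst))

print(len(edges))  # → 3
```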

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_clones(
    graph_id: str,
    clone_result: str,
) -> str:
    """Add clone detection results to the graph as SIMILAR_TO edges.

    USE THIS TOOL:
    - After calling detect_clones to find duplicate code blocks
    - To enable refactoring candidate analysis in code_graph_analyze

    DO NOT USE:
    - Before code_graph_create (graph must exist first)
    - With empty clone results

    Creates SIMILAR_TO edges between files sharing duplicate code.
    These edges are used by:
    - code_graph_analyze("main", "refactoring") for refactoring candidates

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        clone_result: The raw JSON string output from detect_clones tool.

    Returns:
        JSON: {"status": "success", "edges_added": N, "total_edges": M}

    Output Size: ~150 bytes
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        result = json.loads(clone_result)
    except json.JSONDecodeError as e:
        return _json_response({"status": "error", "message": f"Invalid JSON: {e}"})

    edges = ingest_clone_results(result)
    edges_added = 0

    for edge in edges:
        graph.add_edge(edge)
        edges_added += 1

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "edges_added": edges_added,
            "total_edges": graph.edge_count,
        },
    )

code_graph_ingest_git

code_graph_ingest_git(
    graph_id,
    git_result,
    result_type,
    source_file="",
    min_percentage=20.0,
)

Add git history data to the code graph as nodes, edges, or metadata.

USE THIS TOOL:
  • After calling git_hotspots to add churn metadata to FILE nodes
  • After calling git_files_changed_together to add COCHANGES edges
  • After calling git_contributors or git_blame_summary to add ownership metadata

DO NOT USE:
  • Before code_graph_create (graph must exist first)
  • With error-status git results

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
git_result str

The raw JSON string output from a git tool. Pass the exact return value from git_hotspots, git_files_changed_together, git_contributors, or git_blame_summary.

required
result_type str

Type of git result being ingested:
  • "hotspots": From git_hotspots. Creates/updates FILE nodes with churn metadata.
  • "cochanges": From git_files_changed_together. Creates COCHANGES edges. Uses min_percentage to filter low-coupling pairs.
  • "contributors": From git_contributors or git_blame_summary. Returns ownership metadata dict.

required
source_file str

For "contributors" type. If provided and the node exists, attaches contributor metadata to the FILE node at this path.

''
min_percentage float

For "cochanges" type. Minimum co-change percentage to create an edge (default 20.0). Lower = more edges.

20.0

Returns:

    JSON with ingestion results varying by type.

Output Size: ~200 bytes

Workflow Examples:

    # Ingesting hotspots (creates/updates FILE nodes)
    hotspots = git_hotspots(repo_path, limit=30)
    code_graph_ingest_git("main", hotspots, "hotspots")

    # Ingesting co-changes (creates COCHANGES edges)
    coupling = git_files_changed_together(repo_path, "src/auth.py")
    code_graph_ingest_git("main", coupling, "cochanges", min_percentage=15.0)

    # Ingesting contributors (returns metadata)
    blame = git_blame_summary(repo_path, "src/auth.py")
    code_graph_ingest_git("main", blame, "contributors", source_file="src/auth.py")
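The min_percentage filter can be sketched as follows. The field names in `cochange_result` are assumptions for illustration; the real git_files_changed_together output may name them differently:

```python
import json

# Hypothetical git_files_changed_together output.
cochange_result = json.dumps({
    "source": "src/auth.py",
    "cochanges": [
        {"file": "src/session.py", "percentage": 45.0},
        {"file": "README.md", "percentage": 8.0},
    ],
})

# Only pairs at or above the threshold become COCHANGES edges;
# lowering min_percentage keeps more (weaker) couplings.
min_percentage = 20.0
kept = [
    entry["file"]
    for entry in json.loads(cochange_result)["cochanges"]
    if entry["percentage"] >= min_percentage
]

print(kept)  # → ['src/session.py']
```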

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_git(
    graph_id: str,
    git_result: str,
    result_type: str,
    source_file: str = "",
    min_percentage: float = 20.0,
) -> str:
    """Add git history data to the code graph as nodes, edges, or metadata.

    USE THIS TOOL:
    - After calling git_hotspots to add churn metadata to FILE nodes
    - After calling git_files_changed_together to add COCHANGES edges
    - After calling git_contributors or git_blame_summary to add ownership metadata

    DO NOT USE:
    - Before code_graph_create (graph must exist first)
    - With error-status git results

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        git_result: The raw JSON string output from a git tool.
            Pass the exact return value from git_hotspots,
            git_files_changed_together, git_contributors, or git_blame_summary.
        result_type: Type of git result being ingested:
            - "hotspots": From git_hotspots. Creates/updates FILE nodes with churn metadata.
            - "cochanges": From git_files_changed_together. Creates COCHANGES edges.
              Uses min_percentage to filter low-coupling pairs.
            - "contributors": From git_contributors or git_blame_summary.
              Returns ownership metadata dict.
        source_file: For "contributors" type. If provided and the node exists,
            attaches contributor metadata to the FILE node at this path.
        min_percentage: For "cochanges" type. Minimum co-change percentage
            to create an edge (default 20.0). Lower = more edges.

    Returns:
        JSON with ingestion results varying by type.

    Output Size: ~200 bytes

    Workflow Examples:

    Ingesting hotspots (creates/updates FILE nodes):
        hotspots = git_hotspots(repo_path, limit=30)
        code_graph_ingest_git("main", hotspots, "hotspots")

    Ingesting co-changes (creates COCHANGES edges):
        coupling = git_files_changed_together(repo_path, "src/auth.py")
        code_graph_ingest_git("main", coupling, "cochanges", min_percentage=15.0)

    Ingesting contributors (returns metadata):
        blame = git_blame_summary(repo_path, "src/auth.py")
        code_graph_ingest_git("main", blame, "contributors", source_file="src/auth.py")
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        result = json.loads(git_result)
    except json.JSONDecodeError as e:
        return _json_response({"status": "error", "message": f"Invalid JSON: {e}"})

    if result_type == "hotspots":
        nodes = ingest_git_hotspots(result)
        nodes_added = 0
        nodes_updated = 0
        for node in nodes:
            if graph.has_node(node.id):
                # Merge churn metadata into existing node
                existing = graph._graph.nodes[node.id]
                existing.setdefault("metadata", {}).update(node.metadata)
                nodes_updated += 1
            else:
                graph.add_node(node)
                nodes_added += 1
        return _json_response(
            {
                "status": "success",
                "graph_id": graph_id,
                "result_type": "hotspots",
                "nodes_added": nodes_added,
                "nodes_updated": nodes_updated,
                "total_nodes": graph.node_count,
            },
        )

    if result_type == "cochanges":
        edges = ingest_git_cochanges(result, min_percentage=min_percentage)
        edges_added = 0
        for edge in edges:
            graph.add_edge(edge)
            edges_added += 1
        return _json_response(
            {
                "status": "success",
                "graph_id": graph_id,
                "result_type": "cochanges",
                "edges_added": edges_added,
                "total_edges": graph.edge_count,
            },
        )

    if result_type == "contributors":
        metadata = ingest_git_contributors(result)
        if source_file and graph.has_node(source_file):
            graph._graph.nodes[source_file].setdefault("metadata", {}).update(metadata)
        return _json_response(
            {
                "status": "success",
                "graph_id": graph_id,
                "result_type": "contributors",
                "contributor_count": metadata.get("contributor_count", 0),
            },
        )

    return _json_response({"status": "error", "message": f"Unknown result_type: {result_type}"})

code_graph_ingest_inheritance

code_graph_ingest_inheritance(
    graph_id, hover_content, class_node_id, file_path
)

Add class inheritance/implementation edges from LSP hover information.

USE THIS TOOL:
- After lsp_hover on a class to capture extends/implements relationships
- When building class hierarchy for OOP codebases
- In DEEP mode for comprehensive type analysis

DO NOT USE:
- On non-class symbols (functions, variables)
- Without first creating the class node via code_graph_ingest_lsp

Parses class signatures to create edges:
- "inherits" edges: class Foo extends Bar → Foo --inherits--> Bar
- "implements" edges: class Foo implements IBar → Foo --implements--> IBar

Works with common patterns:
- TypeScript/JavaScript: extends, implements
- Python: class Foo(Bar, Baz)
- Java: extends, implements
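The kind of signature parsing this relies on can be sketched with a few regexes. This is an illustrative sketch only, not the actual ingest_inheritance implementation; the helper name parse_inheritance is hypothetical:

```python
import re

def parse_inheritance(signature: str) -> dict[str, list[str]]:
    """Extract base classes from a class signature string (sketch)."""
    result: dict[str, list[str]] = {"inherits": [], "implements": []}
    # TypeScript/Java style: class Foo extends Bar implements IBar, IBaz
    m = re.search(r"\bextends\s+([\w.]+)", signature)
    if m:
        result["inherits"].append(m.group(1))
    m = re.search(r"\bimplements\s+([\w.,\s]+)", signature)
    if m:
        result["implements"] += [s.strip() for s in m.group(1).split(",") if s.strip()]
    # Python style: class Foo(Bar, Baz)
    m = re.search(r"\bclass\s+\w+\s*\(([^)]*)\)", signature)
    if m:
        result["inherits"] += [s.strip() for s in m.group(1).split(",") if s.strip()]
    return result

print(parse_inheritance("class UserService extends BaseService implements IUserService"))
```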

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
hover_content str

The markdown/text content from lsp_hover result. Extract the "value" field from the hover response. Example: "class UserService extends BaseService implements IUserService"

required
class_node_id str

The node ID of the class in the graph. Format: "file_path:ClassName" (e.g., "src/services/user.ts:UserService") Must match the ID created by code_graph_ingest_lsp.

required
file_path str

Path to the file containing this class. Used to resolve base class locations.

required

Returns:

Name Type Description
JSON str

{"status": "success", "edges_added": N, "edge_types": ["inherits", "implements"]}

Output Size: ~200 bytes

Workflow Example

Get class symbols

symbols = lsp_document_symbols(session_id, "src/user.ts")
code_graph_ingest_lsp("main", symbols, "symbols", source_file="src/user.ts")

For each class, get hover info and ingest inheritance

hover = lsp_hover(session_id, "src/user.ts", class_line, 0)
hover_content = hover["hover"]["contents"]["value"]
code_graph_ingest_inheritance("main", hover_content, "src/user.ts:UserService", "src/user.ts")

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_inheritance(
    graph_id: str,
    hover_content: str,
    class_node_id: str,
    file_path: str,
) -> str:
    """Add class inheritance/implementation edges from LSP hover information.

    USE THIS TOOL:
    - After lsp_hover on a class to capture extends/implements relationships
    - When building class hierarchy for OOP codebases
    - In DEEP mode for comprehensive type analysis

    DO NOT USE:
    - On non-class symbols (functions, variables)
    - Without first creating the class node via code_graph_ingest_lsp

    Parses class signatures to create edges:
    - "inherits" edges: class Foo extends Bar → Foo --inherits--> Bar
    - "implements" edges: class Foo implements IBar → Foo --implements--> IBar

    Works with common patterns:
    - TypeScript/JavaScript: extends, implements
    - Python: class Foo(Bar, Baz)
    - Java: extends, implements

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        hover_content: The markdown/text content from lsp_hover result.
            Extract the "value" field from the hover response.
            Example: "class UserService extends BaseService implements IUserService"
        class_node_id: The node ID of the class in the graph.
            Format: "file_path:ClassName" (e.g., "src/services/user.ts:UserService")
            Must match the ID created by code_graph_ingest_lsp.
        file_path: Path to the file containing this class.
            Used to resolve base class locations.

    Returns:
        JSON: {"status": "success", "edges_added": N, "edge_types": ["inherits", "implements"]}

    Output Size: ~200 bytes

    Workflow Example:
        # Get class symbols
        symbols = lsp_document_symbols(session_id, "src/user.ts")
        code_graph_ingest_lsp("main", symbols, "symbols", source_file="src/user.ts")

        # For each class, get hover info and ingest inheritance
        hover = lsp_hover(session_id, "src/user.ts", class_line, 0)
        hover_content = hover["hover"]["contents"]["value"]
        code_graph_ingest_inheritance("main", hover_content,
                                      "src/user.ts:UserService", "src/user.ts")
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    edges = ingest_inheritance(hover_content, class_node_id, file_path)
    edges_added = 0

    for edge in edges:
        graph.add_edge(edge)
        edges_added += 1

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "edges_added": edges_added,
            "edge_types": ["inherits", "implements"],
            "total_edges": graph.edge_count,
        },
    )

code_graph_ingest_lsp

code_graph_ingest_lsp(
    graph_id,
    lsp_result,
    result_type,
    source_file="",
    source_symbol="",
)

Add LSP tool results to the code graph as nodes and edges.

USE THIS TOOL:
- After calling lsp_document_symbols to add function/class nodes
- After calling lsp_references to add "references" edges (fan-in data)
- After calling lsp_definition to add "calls" edges (call relationships)

DO NOT USE:
- Before calling code_graph_create (graph must exist first)
- With invalid/empty LSP results (check LSP tool status first)

Converts raw LSP data into graph structure:
- "symbols" → Creates nodes for functions, classes, methods, variables
- "references" → Creates edges showing where a symbol is used
- "definition" → Creates edges showing what a symbol calls/uses

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
lsp_result str

The raw JSON string output from an LSP tool. Pass the exact return value from lsp_document_symbols, lsp_references, or lsp_definition.

required
result_type str

Type of LSP result being ingested: - "symbols": From lsp_document_symbols. Creates nodes. REQUIRES source_file parameter. - "references": From lsp_references. Creates reference edges. REQUIRES source_symbol parameter (format: "file:name"). - "definition": From lsp_definition. Creates call/import edges.

required
source_file str

Required for "symbols" type. The file path that was analyzed (e.g., "src/main.py"). Used to create node IDs.

''
source_symbol str

Required for "references" type. The symbol ID that references point TO (format: "src/main.py:my_function").

''

Returns:

Type Description
str

JSON with ingestion results:

{
    "status": "success",
    "nodes_added": 15,      # New nodes created
    "edges_added": 8,       # New edges created
    "total_nodes": 150,     # Graph totals
    "total_edges": 200
}

Output Size: ~200 bytes

Common Errors
  • "Graph not found": Call code_graph_create first
  • "source_file required": Must provide source_file for "symbols"
  • "source_symbol required": Must provide source_symbol for "references"
  • "Invalid JSON": LSP result is malformed

Workflow Examples:

Ingesting symbols (creates nodes):
    symbols = lsp_document_symbols(session_id, "src/api.py")
    code_graph_ingest_lsp("main", symbols, "symbols", source_file="src/api.py")

Ingesting references (creates edges showing fan-in):
    refs = lsp_references(session_id, "src/api.py", 10, 5)
    code_graph_ingest_lsp("main", refs, "references", source_symbol="src/api.py:handle_request")

Ingesting definitions (creates call edges):
    defn = lsp_definition(session_id, "src/api.py", 15, 20)
    code_graph_ingest_lsp("main", defn, "definition")

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_lsp(  # noqa: C901
    graph_id: str,
    lsp_result: str,
    result_type: str,
    source_file: str = "",
    source_symbol: str = "",
) -> str:
    """Add LSP tool results to the code graph as nodes and edges.

    USE THIS TOOL:
    - After calling lsp_document_symbols to add function/class nodes
    - After calling lsp_references to add "references" edges (fan-in data)
    - After calling lsp_definition to add "calls" edges (call relationships)

    DO NOT USE:
    - Before calling code_graph_create (graph must exist first)
    - With invalid/empty LSP results (check LSP tool status first)

    Converts raw LSP data into graph structure:
    - "symbols" → Creates nodes for functions, classes, methods, variables
    - "references" → Creates edges showing where a symbol is used
    - "definition" → Creates edges showing what a symbol calls/uses

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        lsp_result: The raw JSON string output from an LSP tool.
            Pass the exact return value from lsp_document_symbols,
            lsp_references, or lsp_definition.
        result_type: Type of LSP result being ingested:
            - "symbols": From lsp_document_symbols. Creates nodes.
              REQUIRES source_file parameter.
            - "references": From lsp_references. Creates reference edges.
              REQUIRES source_symbol parameter (format: "file:name").
            - "definition": From lsp_definition. Creates call/import edges.
        source_file: Required for "symbols" type. The file path that was
            analyzed (e.g., "src/main.py"). Used to create node IDs.
        source_symbol: Required for "references" type. The symbol ID that
            references point TO (format: "src/main.py:my_function").

    Returns:
        JSON with ingestion results:
        {
            "status": "success",
            "nodes_added": 15,      # New nodes created
            "edges_added": 8,       # New edges created
            "total_nodes": 150,     # Graph totals
            "total_edges": 200
        }

    Output Size: ~200 bytes

    Common Errors:
        - "Graph not found": Call code_graph_create first
        - "source_file required": Must provide source_file for "symbols"
        - "source_symbol required": Must provide source_symbol for "references"
        - "Invalid JSON": LSP result is malformed

    Workflow Examples:

    Ingesting symbols (creates nodes):
        symbols = lsp_document_symbols(session_id, "src/api.py")
        code_graph_ingest_lsp("main", symbols, "symbols", source_file="src/api.py")

    Ingesting references (creates edges showing fan-in):
        refs = lsp_references(session_id, "src/api.py", 10, 5)
        code_graph_ingest_lsp("main", refs, "references",
                              source_symbol="src/api.py:handle_request")

    Ingesting definitions (creates call edges):
        defn = lsp_definition(session_id, "src/api.py", 15, 20)
        code_graph_ingest_lsp("main", defn, "definition")
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        result = json.loads(lsp_result)
    except json.JSONDecodeError as e:
        return _json_response({"status": "error", "message": f"Invalid JSON: {e}"})

    nodes_added = 0
    edges_added = 0

    if result_type == "symbols":
        if not source_file:
            return _json_response({"status": "error", "message": "source_file required for symbols"})
        nodes, edges = ingest_lsp_symbols(result, source_file)
        for node in nodes:
            if not graph.has_node(node.id):
                graph.add_node(node)
                nodes_added += 1
        for edge in edges:
            graph.add_edge(edge)
            edges_added += 1

    elif result_type == "references":
        if not source_symbol:
            return _json_response({"status": "error", "message": "source_symbol required for references"})
        edges = ingest_lsp_references(result, source_symbol)
        for edge in edges:
            graph.add_edge(edge)
            edges_added += 1

    elif result_type == "definition":
        # Extract source location from result
        from_file = result.get("file", source_file)
        from_line = result.get("position", {}).get("line", 0)
        edges = ingest_lsp_definition(result, from_file, from_line)
        for edge in edges:
            graph.add_edge(edge)
            edges_added += 1

    else:
        return _json_response({"status": "error", "message": f"Unknown result_type: {result_type}"})

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "result_type": result_type,
            "nodes_added": nodes_added,
            "edges_added": edges_added,
            "total_nodes": graph.node_count,
            "total_edges": graph.edge_count,
        },
    )

code_graph_ingest_rg

code_graph_ingest_rg(graph_id, rg_result)

Add ripgrep search matches to the graph as preliminary nodes.

USE THIS TOOL:
- When LSP doesn't cover a language/pattern
- For text-based patterns (SQL keywords, config values, comments)
- As a fallback when semantic analysis isn't available

DO NOT USE:
- When LSP symbols are available (prefer code_graph_ingest_lsp)
- For structural patterns (prefer code_graph_ingest_astgrep)

Creates lightweight nodes from text matches. These nodes have:
- file_path and line number
- matched text content
- No semantic type information (unlike LSP nodes)

Ripgrep nodes are useful for:
- Finding TODO/FIXME comments
- Locating hardcoded values
- Identifying SQL queries in strings
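A minimal sketch of turning such matches into lightweight nodes might look like this. The {"matches": [...]} input shape and the helper name are assumptions for illustration, not the real rg_search output format:

```python
import json

def rg_matches_to_nodes(rg_json: str) -> list[dict]:
    """Turn ripgrep matches into lightweight node dicts (sketch).

    Assumes a shape like {"matches": [{"file": ..., "line": ..., "text": ...}]};
    the actual rg_search output format may differ.
    """
    result = json.loads(rg_json)
    nodes = []
    for m in result.get("matches", []):
        nodes.append({
            "id": f"{m['file']}:{m['line']}",  # file:line as a stable node ID
            "file_path": m["file"],
            "line": m["line"],
            "content": m["text"].strip(),
            "node_type": "pattern_match",      # no semantic type info
        })
    return nodes

sample = json.dumps({"matches": [{"file": "src/db.py", "line": 42, "text": "  SELECT * FROM users"}]})
print(rg_matches_to_nodes(sample))
```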

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
rg_result str

The raw JSON string output from rg_search tool. Pass the exact return value.

required

Returns:

Name Type Description
JSON str

{"status": "success", "nodes_added": N, "total_nodes": M}

Output Size: ~150 bytes

Workflow Example

Find all SQL queries

sql_matches = rg_search("SELECT|INSERT|UPDATE|DELETE", repo_path)
code_graph_ingest_rg("main", sql_matches)

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_rg(
    graph_id: str,
    rg_result: str,
) -> str:
    """Add ripgrep search matches to the graph as preliminary nodes.

    USE THIS TOOL:
    - When LSP doesn't cover a language/pattern
    - For text-based patterns (SQL keywords, config values, comments)
    - As a fallback when semantic analysis isn't available

    DO NOT USE:
    - When LSP symbols are available (prefer code_graph_ingest_lsp)
    - For structural patterns (prefer code_graph_ingest_astgrep)

    Creates lightweight nodes from text matches. These nodes have:
    - file_path and line number
    - matched text content
    - No semantic type information (unlike LSP nodes)

    Ripgrep nodes are useful for:
    - Finding TODO/FIXME comments
    - Locating hardcoded values
    - Identifying SQL queries in strings

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        rg_result: The raw JSON string output from rg_search tool.
            Pass the exact return value.

    Returns:
        JSON: {"status": "success", "nodes_added": N, "total_nodes": M}

    Output Size: ~150 bytes

    Workflow Example:
        # Find all SQL queries
        sql_matches = rg_search("SELECT|INSERT|UPDATE|DELETE", repo_path)
        code_graph_ingest_rg("main", sql_matches)
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        result = json.loads(rg_result)
    except json.JSONDecodeError as e:
        return _json_response({"status": "error", "message": f"Invalid JSON: {e}"})

    nodes = ingest_rg_matches(result)
    nodes_added = 0

    for node in nodes:
        if not graph.has_node(node.id):
            graph.add_node(node)
            nodes_added += 1

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "nodes_added": nodes_added,
            "total_nodes": graph.node_count,
        },
    )

code_graph_ingest_tests

code_graph_ingest_tests(
    graph_id, test_files, production_files
)

Add test-to-production file mappings as "tests" edges in the graph.

USE THIS TOOL:
- After identifying test files (via rg_search for test patterns)
- To enable test coverage analysis on business logic
- To find untested hotspots in the codebase

DO NOT USE:
- With unfiltered file lists (only include actual test files)
- Before adding production file nodes to the graph

Creates "tests" edges based on naming convention matching:
- test_foo.py → foo.py
- foo.test.ts → foo.ts
- FooTest.java → Foo.java
- __tests__/foo.test.js → src/foo.js

These edges enable:
- Finding untested business logic (nodes without incoming test edges)
- Understanding test coverage per module
- Prioritizing testing efforts on hotspots

Parameters:

Name Type Description Default
graph_id str

ID of the target graph (must exist from code_graph_create)

required
test_files str

JSON array of test file paths as a string. Example: '["tests/test_user.py", "tests/test_auth.py"]' Obtain from rg_search or file manifest filtering.

required
production_files str

JSON array of production file paths as a string. Example: '["src/user.py", "src/auth.py"]' Should include all files you want to map tests to.

required

Returns:

Name Type Description
JSON str

{"status": "success", "edges_added": N, "total_edges": M}

Output Size: ~150 bytes

Workflow Example

Find test files

test_matches = rg_search("def test_|it\\(|describe\\(", repo_path)
test_files = extract_unique_files(test_matches)

Get production files from manifest

prod_files = filter_non_test_files(manifest)

Create test mapping edges

code_graph_ingest_tests("main", json.dumps(test_files), json.dumps(prod_files))

Find untested hotspots

hotspots = code_graph_analyze("main", "hotspots", top_k=10)

Check which have no incoming "tests" edges

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_ingest_tests(
    graph_id: str,
    test_files: str,
    production_files: str,
) -> str:
    """Add test-to-production file mappings as "tests" edges in the graph.

    USE THIS TOOL:
    - After identifying test files (via rg_search for test patterns)
    - To enable test coverage analysis on business logic
    - To find untested hotspots in the codebase

    DO NOT USE:
    - With unfiltered file lists (only include actual test files)
    - Before adding production file nodes to the graph

    Creates "tests" edges based on naming convention matching:
    - test_foo.py → foo.py
    - foo.test.ts → foo.ts
    - FooTest.java → Foo.java
    - __tests__/foo.test.js → src/foo.js

    These edges enable:
    - Finding untested business logic (nodes without incoming test edges)
    - Understanding test coverage per module
    - Prioritizing testing efforts on hotspots

    Args:
        graph_id: ID of the target graph (must exist from code_graph_create)
        test_files: JSON array of test file paths as a string.
            Example: '["tests/test_user.py", "tests/test_auth.py"]'
            Obtain from rg_search or file manifest filtering.
        production_files: JSON array of production file paths as a string.
            Example: '["src/user.py", "src/auth.py"]'
            Should include all files you want to map tests to.

    Returns:
        JSON: {"status": "success", "edges_added": N, "total_edges": M}

    Output Size: ~150 bytes

    Workflow Example:
        # Find test files
        test_matches = rg_search("def test_|it\\(|describe\\(", repo_path)
        test_files = extract_unique_files(test_matches)

        # Get production files from manifest
        prod_files = filter_non_test_files(manifest)

        # Create test mapping edges
        code_graph_ingest_tests("main",
                                json.dumps(test_files),
                                json.dumps(prod_files))

        # Find untested hotspots
        hotspots = code_graph_analyze("main", "hotspots", top_k=10)
        # Check which have no incoming "tests" edges
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        tests = json.loads(test_files)
        prods = json.loads(production_files)
    except json.JSONDecodeError as e:
        return _json_response({"status": "error", "message": f"Invalid JSON: {e}"})

    edges = ingest_test_mapping(tests, prods)
    edges_added = 0

    for edge in edges:
        graph.add_edge(edge)
        edges_added += 1

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "edges_added": edges_added,
            "total_edges": graph.edge_count,
        },
    )

code_graph_load

code_graph_load(graph_id, file_path)

Load a previously saved code graph from disk.

USE THIS TOOL:
- At the start of a session if .code-context/code_graph.json exists
- To resume analysis from a previous session
- To skip re-running LSP/AST-grep data collection

DO NOT USE:
- If graph file doesn't exist (check with file system first)
- When you need fresh analysis (create new graph instead)

Loading a saved graph restores:
- All nodes with their metadata
- All edges with their types
- Ready for immediate analysis (code_graph_analyze, code_graph_explore)

Note: Loading replaces any existing graph with the same ID. The explorer state is reset (tracked exploration cleared).
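Conceptually, the save/load pair round-trips NetworkX node-link data. A standalone sketch using plain networkx (not the CodeGraph wrapper) looks roughly like:

```python
import json
import networkx as nx

# Build a tiny directed graph standing in for a code graph
g = nx.DiGraph()
g.add_node("src/user.py:UserService", node_type="class")
g.add_node("src/base.py:BaseService", node_type="class")
g.add_edge("src/user.py:UserService", "src/base.py:BaseService", edge_type="inherits")

# Serialize to node-link JSON (roughly what a saved graph file contains)
text = json.dumps(nx.node_link_data(g), indent=2)

# Restore and verify nothing was lost
restored = nx.node_link_graph(json.loads(text))
print(restored.number_of_nodes(), restored.number_of_edges())
```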

Parameters:

Name Type Description Default
graph_id str

ID to assign to the loaded graph. Use: - "main": For the primary codebase graph - Descriptive names for scoped graphs

required
file_path str

Path to the saved graph file. Standard location: ".code-context/code_graph.json"

required

Returns:

Name Type Description
JSON str

{
    "status": "success",
    "graph_id": "main",
    "path": ".code-context/code_graph.json",
    "nodes": 150,
    "edges": 200
}

Output Size: ~100 bytes

Common Errors
  • "Load failed": File not found or invalid JSON

Workflow Example:

Check if saved graph exists

If .code-context/code_graph.json exists:

code_graph_load("main", ".code-context/code_graph.json")

Graph is ready for analysis

hotspots = code_graph_analyze("main", "hotspots")
overview = code_graph_explore("main", "overview")

No need to re-run lsp_* or astgrep_* tools!

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_load(
    graph_id: str,
    file_path: str,
) -> str:
    """Load a previously saved code graph from disk.

    USE THIS TOOL:
    - At the start of a session if .code-context/code_graph.json exists
    - To resume analysis from a previous session
    - To skip re-running LSP/AST-grep data collection

    DO NOT USE:
    - If graph file doesn't exist (check with file system first)
    - When you need fresh analysis (create new graph instead)

    Loading a saved graph restores:
    - All nodes with their metadata
    - All edges with their types
    - Ready for immediate analysis (code_graph_analyze, code_graph_explore)

    Note: Loading replaces any existing graph with the same ID.
    The explorer state is reset (tracked exploration cleared).

    Args:
        graph_id: ID to assign to the loaded graph. Use:
            - "main": For the primary codebase graph
            - Descriptive names for scoped graphs
        file_path: Path to the saved graph file.
            Standard location: ".code-context/code_graph.json"

    Returns:
        JSON: {
            "status": "success",
            "graph_id": "main",
            "path": ".code-context/code_graph.json",
            "nodes": 150,
            "edges": 200
        }

    Output Size: ~100 bytes

    Common Errors:
        - "Load failed": File not found or invalid JSON

    Workflow Example:

    # Check if saved graph exists
    # If .code-context/code_graph.json exists:
    code_graph_load("main", ".code-context/code_graph.json")

    # Graph is ready for analysis
    hotspots = code_graph_analyze("main", "hotspots")
    overview = code_graph_explore("main", "overview")

    # No need to re-run lsp_* or astgrep_* tools!
    """
    try:
        from ..validation import ValidationError, validate_file_path

        try:
            path = validate_file_path(file_path)
        except ValidationError as e:
            return _json_response({"status": "error", "message": str(e)})
        data = json.loads(path.read_text())
        graph = CodeGraph.from_node_link_data(data)
        _graphs[graph_id] = graph
        # Reset explorer
        _explorers.pop(graph_id, None)
    except (OSError, ValueError, TypeError, KeyError) as e:
        return _json_response({"status": "error", "message": f"Load failed: {e}"})

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "path": str(path),
            "nodes": graph.node_count,
            "edges": graph.edge_count,
        },
    )

code_graph_save

code_graph_save(graph_id, file_path)

Persist the code graph to disk for reuse in future sessions.

USE THIS TOOL:
- After completing graph analysis (DEEP mode)
- When you want to preserve analysis results
- Before ending a session with valuable graph data

DO NOT USE:
- For exporting to visualization formats (use code_graph_export)
- On empty graphs (waste of disk space)

Saves the complete graph structure including:
- All nodes with metadata (file_path, line numbers, categories)
- All edges with types (calls, references, imports, inherits)
- All analysis-relevant data

Saved graphs can be reloaded with code_graph_load, avoiding the need to re-run LSP/AST-grep tools.

Parameters:

Name Type Description Default
graph_id str

ID of the graph to save (must exist)

required
file_path str

Destination file path. Recommended locations: - ".code-context/code_graph.json": Standard location for main graph - ".code-context/{name}_graph.json": For named/scoped graphs Parent directories are created automatically.

required

Returns:

Name Type Description
JSON str

{
    "status": "success",
    "graph_id": "main",
    "path": ".code-context/code_graph.json",
    "nodes": 150,
    "edges": 200
}

Output Size: ~100 bytes (file size varies: 10KB-1MB)

Common Errors
  • "Graph not found": Graph ID doesn't exist
  • "Save failed": File system error (permissions, disk full)

Workflow Example:

After comprehensive analysis in DEEP mode

code_graph_create("main")

... ingest LSP, AST-grep data ...

... run analysis ...

Save for future sessions

code_graph_save("main", ".code-context/code_graph.json")

In future session:

code_graph_load("main", ".code-context/code_graph.json")

Graph restored with all nodes/edges

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_save(
    graph_id: str,
    file_path: str,
) -> str:
    """Persist the code graph to disk for reuse in future sessions.

    USE THIS TOOL:
    - After completing graph analysis (DEEP mode)
    - When you want to preserve analysis results
    - Before ending a session with valuable graph data

    DO NOT USE:
    - For exporting to visualization formats (use code_graph_export)
    - On empty graphs (waste of disk space)

    Saves the complete graph structure including:
    - All nodes with metadata (file_path, line numbers, categories)
    - All edges with types (calls, references, imports, inherits)
    - All analysis-relevant data

    Saved graphs can be reloaded with code_graph_load, avoiding
    the need to re-run LSP/AST-grep tools.

    Args:
        graph_id: ID of the graph to save (must exist)
        file_path: Destination file path. Recommended locations:
            - ".code-context/code_graph.json": Standard location for main graph
            - ".code-context/{name}_graph.json": For named/scoped graphs
            Parent directories are created automatically.

    Returns:
        JSON: {
            "status": "success",
            "graph_id": "main",
            "path": ".code-context/code_graph.json",
            "nodes": 150,
            "edges": 200
        }

    Output Size: ~100 bytes (file size varies: 10KB-1MB)

    Common Errors:
        - "Graph not found": Graph ID doesn't exist
        - "Save failed": File system error (permissions, disk full)

    Workflow Example:

    # After comprehensive analysis in DEEP mode
    code_graph_create("main")
    # ... ingest LSP, AST-grep data ...
    # ... run analysis ...

    # Save for future sessions
    code_graph_save("main", ".code-context/code_graph.json")

    # In future session:
    code_graph_load("main", ".code-context/code_graph.json")
    # Graph restored with all nodes/edges
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    try:
        from ..validation import ValidationError, validate_file_path

        try:
            path = validate_file_path(file_path, must_exist=False)
        except ValidationError as e:
            return _json_response({"status": "error", "message": str(e)})
        data = graph.to_node_link_data()
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(data, indent=2, default=str))
    except (OSError, ValueError, TypeError) as e:
        return _json_response({"status": "error", "message": f"Save failed: {e}"})

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "path": str(path),
            "nodes": graph.node_count,
            "edges": graph.edge_count,
        },
    )

code_graph_stats

code_graph_stats(graph_id)

Get summary statistics about a code graph.

USE THIS TOOL:
- To verify graph was populated correctly after ingestion
- To understand graph composition before analysis
- For the completion signal (graph node/edge counts)

DO NOT USE:
- For detailed analysis (use code_graph_analyze)
- For exploration (use code_graph_explore)

Returns counts broken down by type:
- Nodes by type: function, class, method, variable, pattern_match
- Edges by type: calls, references, imports, inherits, tests

This helps verify:
- LSP ingestion worked (function/class nodes exist)
- AST-grep ingestion worked (pattern_match nodes exist)
- Reference tracking worked (references edges exist)
- Test mapping worked (tests edges exist)
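Because the tool returns a JSON string, these checks require parsing it first; for example, with a hypothetical stats payload in the documented shape:

```python
import json

# A hypothetical stats payload matching the documented return shape
stats_json = json.dumps({
    "status": "success",
    "nodes_by_type": {"function": 0, "class": 3},
    "edges_by_type": {"references": 12},
})

stats = json.loads(stats_json)
# Flag missing LSP symbol ingestion
if stats["nodes_by_type"].get("function", 0) == 0:
    print("warning: no function nodes - LSP symbols may not be ingested")
```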

Parameters:

Name Type Description Default
graph_id str

ID of the graph to get stats for (must exist)

required

Returns:

Name Type Description
JSON str

{
    "status": "success",
    "graph_id": "main",
    "total_nodes": 150,
    "total_edges": 200,
    "nodes_by_type": {
        "function": 80,
        "class": 20,
        "method": 40,
        "pattern_match": 10
    },
    "edges_by_type": {
        "calls": 100,
        "references": 60,
        "imports": 30,
        "tests": 10
    }
}

Output Size: ~300 bytes

Workflow Example:

After ingestion, verify graph state

stats = json.loads(code_graph_stats("main"))

Check ingestion worked

if stats["nodes_by_type"]["function"] == 0:
    # LSP symbols not ingested properly

if stats["edges_by_type"]["references"] == 0:
    # LSP references not ingested

Use in completion signal

Graph: {stats["total_nodes"]} nodes, {stats["total_edges"]} edges

Source code in src/code_context_agent/tools/graph/tools.py
@tool
def code_graph_stats(
    graph_id: str,
) -> str:
    """Get summary statistics about a code graph.

    USE THIS TOOL:
    - To verify graph was populated correctly after ingestion
    - To understand graph composition before analysis
    - For the completion signal (graph node/edge counts)

    DO NOT USE:
    - For detailed analysis (use code_graph_analyze)
    - For exploration (use code_graph_explore)

    Returns counts broken down by type:
    - Nodes by type: function, class, method, variable, pattern_match
    - Edges by type: calls, references, imports, inherits, tests

    This helps verify:
    - LSP ingestion worked (function/class nodes exist)
    - AST-grep ingestion worked (pattern_match nodes exist)
    - Reference tracking worked (references edges exist)
    - Test mapping worked (tests edges exist)

    Args:
        graph_id: ID of the graph to get stats for (must exist)

    Returns:
        JSON: {
            "status": "success",
            "graph_id": "main",
            "total_nodes": 150,
            "total_edges": 200,
            "nodes_by_type": {
                "function": 80,
                "class": 20,
                "method": 40,
                "pattern_match": 10
            },
            "edges_by_type": {
                "calls": 100,
                "references": 60,
                "imports": 30,
                "tests": 10
            }
        }

    Output Size: ~300 bytes

    Workflow Example:

    # After ingestion, verify graph state (the tool returns a JSON string)
    stats = json.loads(code_graph_stats("main"))

    # Check ingestion worked; absent types are omitted from the count dicts,
    # so use .get() rather than direct indexing
    if stats["nodes_by_type"].get("function", 0) == 0:
        ...  # LSP symbols not ingested properly

    if stats["edges_by_type"].get("references", 0) == 0:
        ...  # LSP references not ingested

    # Use in completion signal
    # Graph: {stats["total_nodes"]} nodes, {stats["total_edges"]} edges
    """
    graph = _get_graph(graph_id)
    if graph is None:
        return _json_response({"status": "error", "message": f"Graph not found: {graph_id}"})

    # Count nodes by type
    node_types: dict[str, int] = {}
    for _, data in graph.nodes(data=True):
        ntype = data.get("node_type", "unknown")
        node_types[ntype] = node_types.get(ntype, 0) + 1

    # Count edges by type
    edge_types: dict[str, int] = {}
    for _, _, data in graph.edges(data=True):
        etype = data.get("edge_type", "unknown")
        edge_types[etype] = edge_types.get(etype, 0) + 1

    return _json_response(
        {
            "status": "success",
            "graph_id": graph_id,
            "total_nodes": graph.node_count,
            "total_edges": graph.edge_count,
            "nodes_by_type": node_types,
            "edges_by_type": edge_types,
        },
    )