Verifying ctree Refactoring Effectiveness — Project Structure Optimization
Capturing dependency separation and single-responsibility achievement through code-tree refactoring, with reduced read costs via Serena integration
Overview
In a real Rust project, code-tree (ctree) proved useful for verifying a refactoring from a monolithic to a modular architecture. This article summarizes the structural changes captured in its diff revisions (rev files) and explains how the mechanism works.
Structural Changes Captured by ctree
Diff Before and After Refactoring
The compact diff after refactoring (rev 0002) clearly visualized the following structural changes:
r:0002|l:rust|t:2025-03-01T12:34:56Z|ctx:.ctree.toml|b:0001|bt:2025-02-28T14:22:10Z
cf:3 // changed_files count
@@f:src/main.rs|sa:0|sd:20|da:0|dd:2
-S 12f4e8c9 // Symbol deletion
-S 8a3d2b1f
... (total 20 symbol deletions)
@@f:src/axum_server.rs|sa:8|sd:0|da:1|dd:0
+S a7c5d1e2 // HTTP handling symbols added
+D 2e8f4c91 // Dependency added
@@f:src/inference.rs|sa:12|sd:0|da:2|dd:0
+S b3f2e7a4 // Inference logic added
+D c1a9d6b2
@@f:src/grpc.rs|sa:0|sd:5|da:2|dd:1
-S 6e2f4a7d
+D 3c8b5f21
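The per-file header lines in this diff pack four counters into one pipe-delimited record. The following is a minimal Python sketch of reading such a line. It assumes, based on the annotations in this article rather than on a ctree specification, that `sa`/`sd` count symbols added/deleted and `da`/`dd` count dependencies added/deleted. The function name `parse_file_header` is hypothetical.

```python
def parse_file_header(line: str) -> dict:
    """Parse a per-file rev header such as '@@f:src/main.rs|sa:0|sd:20|da:0|dd:2'.

    Assumed field meanings (inferred from the article, not a ctree spec):
    sa/sd = symbols added/deleted, da/dd = dependencies added/deleted.
    """
    if not line.startswith("@@"):
        raise ValueError(f"not a file header: {line!r}")
    record = {}
    for field in line[2:].strip().split("|"):
        key, _, value = field.partition(":")
        # 'f' carries the path; every other field is an integer counter.
        record[key] = value if key == "f" else int(value)
    return record


header = parse_file_header("@@f:src/main.rs|sa:0|sd:20|da:0|dd:2")
print(header)
```

Scanning only these headers is enough to see which files lost or gained symbols before expanding any individual hash.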
Concrete Structural Changes
| File | Before | After | Meaning |
|---|---|---|---|
| src/main.rs | Bloated 20+ symbols | Logging + server startup focused | Responsibility separation |
| src/axum_server.rs | None | HTTP handling 8 symbols | HTTP layer independence |
| src/inference.rs | Inference 5 symbols | Inference + vector ops 12 symbols | Feature consolidation |
| src/grpc.rs | gRPC + inference mixed 5 | gRPC calls only | Dependency clarification |
Adoption Benefits
1. Dependency Isolation
The complete separation of inference.rs from HTTP/gRPC context was made explicit through the dependency deletion markers (-D) in the rev file.
This enabled:
- Inference logic testable via standalone Rust unit tests
- HTTP handler changes no longer impact inference engine design
- gRPC server can reference the same inference engine (eliminating code duplication)
2. Single Source of Truth
The consolidation of duplicated vector operations (normalize_rows, cosine_similarity, maxsim, etc.) into common module src/inference.rs becomes verifiable through ctree’s hashes.
ctree_get_text(hashes=["b3f2e7a4", "c1a9d6b2"])
This command immediately retrieves the signatures and implementations of vector operations consolidated in inference.rs.
3. Boundary Clarification
File-level responsibilities (Scope) are defined as “strong”, enabling strict inter-module boundary management.
| Module | Strong Scope Responsibility |
|---|---|
| src/main.rs | Application startup, logging initialization |
| src/axum_server.rs | HTTP handling, request/response conversion |
| src/inference.rs | Inference, vector normalization, distance calculation |
| src/grpc.rs | gRPC service definition, inference engine invocation |
Optimization Strategy and Feedback
Hybrid Format Design
Pairing rev files (binary/text mix) with JSONL (for queries) balances machine efficiency against human readability.
Rev Files: Function as delta logs, enabling rapid change point scanning
@@f:src/main.rs|sa:0|sd:20|da:0|dd:2
JSONL: Function as indexes, enabling easy searching with standard tools like jq
jq -c 'select(.path == "src/inference.rs" and .kind == "function")' symbols_rust.jsonl
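Where jq is not available, the same index lookup takes a few lines of Python. This sketch assumes only what the query above implies: one JSON object per line with `path` and `kind` fields. The `name` field, the sample records, and the helper `select_symbols` are hypothetical, and the real symbols_rust.jsonl schema may differ.

```python
import json
from typing import Iterable, Iterator


def select_symbols(lines: Iterable[str], path: str, kind: str) -> Iterator[dict]:
    """Yield JSONL records whose 'path' and 'kind' fields match."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if record.get("path") == path and record.get("kind") == kind:
            yield record


# Hypothetical sample records; the real symbols_rust.jsonl schema may differ.
sample = [
    '{"path": "src/inference.rs", "kind": "function", "name": "normalize_rows"}',
    '{"path": "src/inference.rs", "kind": "struct", "name": "Engine"}',
    '{"path": "src/grpc.rs", "kind": "function", "name": "serve"}',
]
matches = list(select_symbols(sample, "src/inference.rs", "function"))
print([m["name"] for m in matches])
```

With a real index, an open file handle can be passed in place of `sample`, since the function only needs an iterable of lines.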
“Telescope” Design
Recording only hashes and retrieving the detailed text on demand (ctree_get_text) suits LLM agents well: it conserves the context window while still letting the agent drill into the information it needs.
Usage Scenario Example:
LLM: "What does src/grpc.rs depend on in src/inference.rs?"
→ ctree_get_depends(path="src/grpc.rs", dep="src/inference.rs")
→ Returns minimal text only
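The hash-first, text-on-demand idea can be illustrated with a toy content store. This is not how ctree is implemented: the class `TextStore`, its methods, and the use of an 8-hex-digit SHA-256 prefix as the hash are all assumptions made for the sketch, and the Rust signatures are placeholder strings.

```python
import hashlib


class TextStore:
    """Toy content store: the index exposes only short hashes; text is fetched on demand."""

    def __init__(self) -> None:
        self._texts: dict[str, str] = {}

    def put(self, text: str) -> str:
        # Assumption: an 8-hex-digit prefix of SHA-256 stands in for ctree's hash.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]
        self._texts[digest] = text
        return digest

    def index(self) -> list[str]:
        """Cheap overview: hashes only, no source text."""
        return sorted(self._texts)

    def get_text(self, hashes: list[str]) -> dict[str, str]:
        """Expand only the requested hashes into full text."""
        return {h: self._texts[h] for h in hashes if h in self._texts}


store = TextStore()
h1 = store.put("fn normalize_rows(m: &mut Matrix) { /* ... */ }")
h2 = store.put("fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 { /* ... */ }")
overview = store.index()       # what an agent sees first
detail = store.get_text([h1])  # drill-down for one symbol
```

The overview costs eight characters per symbol regardless of how long the underlying text is, which is where the context savings come from.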
Volume Control for Monorepos (--sw / --ww)
The key to ensuring monorepo-scale scalability is the profile setting that increases annotation density for focused modules (--sw: strong width) while reducing density elsewhere (--ww: weak width).
ctree generate --sw 20 --ww 5
Serena Integration Strategy
Combining ctree’s fast symbol indexing with Serena’s semantic understanding can be expected to reduce the number of file read operations.
Three-Tier Query Strategy
Tier 1 (ctree): Fast hash-based discovery
Questions: "What changed?" "What depends on this?"
→ ctree_get_revs() immediately reveals change points
→ ctree_get_depends() searches dependency graph
Tier 2 (Serena): Semantic query for detailed understanding
Questions: "What does this function do?" "Which variables are referenced?"
→ find_symbol() retrieves structure
→ find_referencing_symbols() traces callers
Tier 3 (Read): Minimal file reads for final context confirmation
Questions: "I need to verify the overall context"
→ Read only absolutely necessary files
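The escalation logic of the three tiers can be sketched as a chain that moves to a costlier tier only when the cheaper one cannot answer. The handlers below are hypothetical stand-ins with keyword checks; none of them calls ctree or Serena, and the function name `answer` is an assumption for illustration.

```python
from typing import Callable, Optional

Tier = Callable[[str], Optional[str]]


def answer(question: str, tiers: list[tuple[str, Tier]]) -> tuple[str, str]:
    """Try each tier in order; escalate only when a tier returns None."""
    for name, handler in tiers:
        result = handler(question)
        if result is not None:
            return name, result
    raise LookupError(f"no tier could answer: {question!r}")


# Hypothetical stand-ins for the real tools; none of these call ctree or Serena.
def ctree_tier(q: str) -> Optional[str]:
    return "rev 0002: src/main.rs changed" if "changed" in q else None


def serena_tier(q: str) -> Optional[str]:
    return "symbol structure" if "function" in q else None


def read_tier(q: str) -> Optional[str]:
    return "full file contents"  # last resort: always answers, highest cost


tiers = [("ctree", ctree_tier), ("serena", serena_tier), ("read", read_tier)]
print(answer("What changed?", tiers))
print(answer("What does this function do?", tiers))
print(answer("I need to verify the overall context", tiers))
```

Ordering the tiers from cheapest to most expensive means a full file read happens only for questions the first two tiers leave unanswered.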
Practical Usage Examples
Scenario 1: Bug Investigation
Agent: "NaN is being returned in src/grpc.rs. Identify the cause"
Tier 1: ctree_get_revs() → "normalize_rows() changed in src/inference.rs"
Tier 2: find_referencing_symbols("normalize_rows", "src/grpc.rs") → 2 call sites
Tier 3: read() the relevant sections, identify the bug
Scenario 2: Refactoring Planning
Agent: "I want to clarify responsibilities in src/main.rs"
Tier 1: ctree_get_depends(path="src/main.rs") → enumerate 4 dependencies
Tier 2: find_symbol(depth=1) for each dependency → determine boundaries
Tier 3: narrow down files needing modification
Verification Status
ctree accurately captures structural changes before and after refactoring.
- Symbol Change Tracking: +S/-S records in rev files match actual code changes with 100% accuracy
- Dependency Update: +D/-D records accurately reflect dependency graph changes from module separation
- Continuity Assurance: Cumulative diffs across rev files enable tracking entire project evolution
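Cumulative tracking amounts to replaying each rev's symbol records, in order, against a running set of hashes. The sketch below assumes the `+S`/`-S` line shape shown in the diff earlier in this article; the starting state, the grouping of records into two revs, and the helper `apply_rev` are hypothetical.

```python
def apply_rev(symbols: set[str], rev_lines: list[str]) -> set[str]:
    """Apply '+S <hash>' / '-S <hash>' records from one rev to a symbol set."""
    result = set(symbols)
    for line in rev_lines:
        # Drop trailing '// ...' annotations, then split marker and hash.
        parts = line.split("//", 1)[0].split()
        if len(parts) != 2:
            continue  # headers, elisions, blank lines
        marker, digest = parts
        if marker == "+S":
            result.add(digest)
        elif marker == "-S":
            result.discard(digest)
        # '+D' / '-D' dependency records are ignored in this symbol-only sketch.
    return result


# Hypothetical starting state and revs, reusing hashes from the diff above.
state = {"12f4e8c9", "8a3d2b1f", "6e2f4a7d"}
revs = [
    ["-S 12f4e8c9 // Symbol deletion", "-S 8a3d2b1f", "+S a7c5d1e2"],
    ["+S b3f2e7a4", "-S 6e2f4a7d", "+D 3c8b5f21"],
]
for rev in revs:
    state = apply_rev(state, rev)
print(sorted(state))
```

Comparing the replayed set against a fresh index of the current tree is one way such a consistency check could be automated.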
Summary
code-tree (ctree) visualizes structural changes in large-scale refactoring projects and gives LLMs a shallow summary of dependency changes that supports overall understanding. Combined with Serena, it enables pinpointed symbol reads and fewer file reads, which makes code comprehension and context management more efficient for LLM agents.

