shelpa-mcp: Design Record of a Scrapped Virtual Pipeline

Overview

shelpa-mcp was developed as a Model Context Protocol (MCP) compliant virtual pipeline server that gives LLM agents safe file operations and text processing. It was ultimately abandoned because model behavior could not be corrected to make agents use it (see Security Design and Lessons for details), but the architecture was interesting enough to preserve as a technical record.

This article documents the MCP server architecture, command routing implementation, pipeline stage management, and session CWD implementation.

Background: What Is a Virtual Pipeline

Design Philosophy

Traditional shell access allows LLM agents to execute arbitrary commands, creating significant security risks. Shelpa implements a “virtual pipeline” concept that achieves:

Whitelist Control: Only permitted commands can execute
Pipeline Chaining: UNIX pipe (|) command concatenation
Workspace Restriction: Access denial outside designated directories
Audit Trail: All write operations mirrored in .shelpa/

MCP Server Role

Shelpa-mcp operates as a stdio MCP server exposing a single shelpa_pipe tool:

LLM Agent (Claude, etc.)
    ↓ (JSON-RPC over stdio)
shelpa-mcp Server
    ↓
shelpa Library (parse → validate → execute)
    ↓
Workspace Files + .shelpa/ Audit Trail

Command Routing Design

Command Classification

Shelpa commands fall into two categories:

Pipeline Commands

Participate in pipe chains, producing and consuming byte streams:

let pipeline_cmds = [
    "tail", "rg", "awk", "sed", "tr", "jq", "wc", 
    "tee", "fd", "ls", "head", "sort",
    "ctree_check", "ctree_generate", "serena_find_symbol"
];

Usage examples:

tail -n 100 app.log | rg "ERROR" | awk '{print $3}' | sort -u
ls src | rg "\.rs$"
fd "\.toml$" | head -5

Navigation Commands

Execute only as sole stages; cannot participate in pipe chains:

  let nav_cmds = ["pwd", "cd"];

Reclassifying ls

Initially, ls was classified as a navigation command. However, pipeline usage patterns like ls src | rg fn were common, prompting reclassification:

Before:

  let nav_cmds = ["pwd", "cd", "ls"];  // ls restricted to standalone

After:

let pipeline_cmds = ["tail", "rg", ..., "ls"];  // ls joins pipelines
let nav_cmds = ["pwd", "cd"];  // Pure navigation only

When used standalone, ls still dispatches to the workspace-restricted builtin implementation (builtin_ls).
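
The two routing rules described above (navigation commands run only as a sole stage, and a standalone ls goes to the builtin) can be sketched as follows. This is a minimal illustration rather than shelpa's actual code: the Route enum and the route function are hypothetical names, and each stage is reduced to its bare command name.

```rust
/// Where a validated command line is dispatched (illustrative names).
#[derive(Debug, PartialEq)]
enum Route {
    Navigation, // pwd / cd, handled in-process
    BuiltinLs,  // standalone ls, workspace-restricted builtin
    Pipeline,   // one or more subprocess stages
}

const PIPELINE_CMDS: &[&str] = &[
    "tail", "rg", "awk", "sed", "tr", "jq", "wc", "tee", "fd", "ls", "head", "sort",
    "ctree_check", "ctree_generate", "serena_find_symbol",
];
const NAV_CMDS: &[&str] = &["pwd", "cd"];

/// `stages` holds the command name of each pipeline stage, in order.
fn route(stages: &[&str]) -> Result<Route, String> {
    match stages {
        [] => Err("empty command".to_string()),
        // A navigation command is only valid as the sole stage.
        [only] if NAV_CMDS.contains(only) => Ok(Route::Navigation),
        // Standalone ls keeps using the builtin implementation.
        [only] if *only == "ls" => Ok(Route::BuiltinLs),
        _ => {
            for cmd in stages {
                if NAV_CMDS.contains(cmd) {
                    return Err(format!("'{cmd}' must be the only stage"));
                }
                if !PIPELINE_CMDS.contains(cmd) {
                    return Err(format!("'{cmd}' is not allowed."));
                }
            }
            Ok(Route::Pipeline)
        }
    }
}
```

With this shape, ls alone and ls src | rg fn take different routes while both are checked against the same whitelist.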

Pipeline Stage Management

Parsing

pub fn parse_pipeline(command_str: &str) -> Result<Vec<PipelineStage>, GuardViolation> {
    // 1. Shell quote analysis
    let tokens = shell_words::split(command_str)?;
    
    // 2. Split by pipe (|)
    let stages: Vec<PipelineStage> = split_by_pipe(&tokens);
    
    // 3. Validate each stage's command
    for stage in &stages {
        validate_command(&stage.command, &stage.args)?;
    }
    
    // 4. Detect redirections (prohibited)
    check_no_redirects(&stages)?;
    
    Ok(stages)
}
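
The helpers split_by_pipe and check_no_redirects are not shown in the original. A minimal sketch of what they might look like, assuming tokens arrive already split by a shell-style tokenizer and using a plain String as the error type instead of GuardViolation:

```rust
#[derive(Debug, PartialEq)]
struct PipelineStage {
    command: String,
    args: Vec<String>,
}

/// Groups tokens into stages at every bare "|" token.
/// An empty group (leading, trailing, or doubled pipe) is rejected.
fn split_by_pipe(tokens: &[String]) -> Result<Vec<PipelineStage>, String> {
    let mut stages = Vec::new();
    for group in tokens.split(|t| t.as_str() == "|") {
        let Some((command, args)) = group.split_first() else {
            return Err("empty pipeline stage".to_string());
        };
        stages.push(PipelineStage {
            command: command.clone(),
            args: args.to_vec(),
        });
    }
    Ok(stages)
}

/// Conservative check: rejects any argument that begins with '>'.
/// Assumption: quoting is already gone at this point, so a quoted
/// pattern starting with '>' would be rejected as well.
fn check_no_redirects(stages: &[PipelineStage]) -> Result<(), String> {
    for stage in stages {
        if let Some(arg) = stage.args.iter().find(|a| a.starts_with('>')) {
            return Err(format!("redirect '{arg}' is not allowed; save via tee"));
        }
    }
    Ok(())
}
```

Working on tokens after quote removal keeps the sketch short, but it cannot tell a quoted "|" or ">" from an operator; a real parser would have to track quoting while splitting.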

Inter-Stage Data Flow

  Stage 1 (tail)     Stage 2 (rg)      Stage 3 (tee)
  stdout ──pipe──→  stdin              stdin
                    stdout ──pipe──→    stdin
                                       stdout → returned
                                       file   → workspace
                                       mirror → .shelpa/

Each stage executes as an independent subprocess, connected via OS-level pipes for stdin/stdout.
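
Connecting independent subprocesses with OS pipes can be done with the Rust standard library alone. The sketch below is an assumption about the general technique, not shelpa's executor: run_chain is a hypothetical name, the stages are plain argv vectors, and it assumes a Unix-like system where the named programs are on PATH.

```rust
use std::io;
use std::process::{Child, Command, Stdio};

/// Runs each argv in `stages` as its own subprocess, wiring every
/// stage's stdout to the next stage's stdin, and returns the bytes
/// written by the final stage.
fn run_chain(stages: &[Vec<&str>]) -> io::Result<Vec<u8>> {
    let empty = || io::Error::new(io::ErrorKind::InvalidInput, "empty stage");
    let mut next_stdin: Option<Stdio> = None;
    let mut children: Vec<Child> = Vec::new();

    for (i, argv) in stages.iter().enumerate() {
        let (program, args) = argv.split_first().ok_or_else(empty)?;
        let mut child = Command::new(program)
            .args(args)
            // The first stage reads nothing; later stages read the previous pipe.
            .stdin(next_stdin.take().unwrap_or_else(Stdio::null))
            .stdout(Stdio::piped())
            .spawn()?;
        if i + 1 < stages.len() {
            // Hand the read end of this stage's pipe to the next stage.
            next_stdin = child.stdout.take().map(Stdio::from);
        }
        children.push(child);
    }

    // Collect the last stage's output, then reap the earlier stages.
    let last = children.pop().ok_or_else(empty)?;
    let output = last.wait_with_output()?;
    for mut child in children {
        child.wait()?;
    }
    Ok(output.stdout)
}
```

For example, run_chain(&[vec!["printf", "b\\na\\n"], vec!["sort"]]) pipes two lines through sort. The sketch performs no whitelist check, so it would only ever be called on stages that have already been validated.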

Execution Metadata

Each stage’s execution results are recorded as metadata:

pub struct StepMeta {
    pub command: String,
    pub output_size: usize,
    pub truncated: bool,
    pub execution_time_ms: u64,
}
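
How these fields get filled is not shown in the original. One plausible sketch, assuming a per-stage byte limit and assuming that output_size reports the size before truncation (both are guesses; record_step is a hypothetical helper):

```rust
use std::time::Instant;

#[derive(Debug)]
pub struct StepMeta {
    pub command: String,
    pub output_size: usize,
    pub truncated: bool,
    pub execution_time_ms: u64,
}

/// Caps `output` at `limit` bytes and records what happened.
/// `started` is the instant at which the stage was launched.
fn record_step(
    command: &str,
    mut output: Vec<u8>,
    limit: usize,
    started: Instant,
) -> (Vec<u8>, StepMeta) {
    let output_size = output.len(); // size before truncation (assumption)
    let truncated = output_size > limit;
    output.truncate(limit); // no-op when the output is already short enough
    let meta = StepMeta {
        command: command.to_string(),
        output_size,
        truncated,
        execution_time_ms: started.elapsed().as_millis() as u64,
    };
    (output, meta)
}
```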

Session CWD Management

Problem

MCP servers communicate over a stateless JSON-RPC protocol. File operations, however, need a “current directory” that persists from one call to the next.

Solution: Session-Scoped CWD

static CWD_MUTEX: LazyLock<Mutex<Option<PathBuf>>> = LazyLock::new(|| Mutex::new(None));

fn current_cwd() -> PathBuf {
    CWD_MUTEX.lock().unwrap()
        .clone()
        .unwrap_or_else(|| workspace_root())
}

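fn tool_pipe(args: &Map<String, Value>) -> Result<String> {
    // `command` holds the required "command" argument; its extraction
    // is not shown in this excerpt.
    let root = workspace_root();
    // An explicit "cwd" argument takes precedence over the session CWD.
    let cwd = if let Some(cwd_str) = get_str(args, "cwd")? {
        // Canonicalize first so `..` segments and symlinks are resolved
        // before the containment check.
        let canonical = fs::canonicalize(root.join(cwd_str))?;
        // Panics if the resolved path leaves the workspace.
        assert!(canonical.starts_with(&root));
        canonical
    } else {
        current_cwd()
    };
    
    match shelpa::pipe(&root, &cwd, &command) {
        Ok(result) => {
            // Update session CWD on successful cd
            if let Some(new_cwd) = result.new_cwd.clone() {
                *CWD_MUTEX.lock().unwrap() = Some(new_cwd);
            }
            // ...
        }
        // Error handling is not shown in this excerpt.
    }
}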

Successful cd execution updates the internal Mutex-guarded CWD. Subsequent commands execute relative to the new CWD.
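
The same idea can be shown in a self-contained form. The sketch below is an illustration under stated assumptions, not the original implementation: Session is a hypothetical type that owns the CWD instead of a process-wide static, and paths are resolved lexically rather than with fs::canonicalize, so symlinks are not followed.

```rust
use std::path::{Component, Path, PathBuf};
use std::sync::Mutex;

/// Holds the workspace root and the CWD set by the last successful cd.
struct Session {
    root: PathBuf,
    cwd: Mutex<Option<PathBuf>>,
}

impl Session {
    fn new(root: impl Into<PathBuf>) -> Self {
        Session { root: root.into(), cwd: Mutex::new(None) }
    }

    /// Falls back to the workspace root until a cd has succeeded.
    fn current(&self) -> PathBuf {
        self.cwd.lock().unwrap().clone().unwrap_or_else(|| self.root.clone())
    }

    /// Resolves `target` against the current CWD and stores it only if
    /// the result stays inside the workspace root.
    fn cd(&self, target: &str) -> Result<PathBuf, String> {
        let mut resolved = self.current();
        for part in Path::new(target).components() {
            match part {
                Component::CurDir => {}
                Component::ParentDir => {
                    resolved.pop();
                }
                // Pushing a root component replaces the whole path,
                // so absolute targets are handled here as well.
                other => resolved.push(other.as_os_str()),
            }
        }
        if !resolved.starts_with(&self.root) {
            return Err(format!("'{target}' is outside the workspace"));
        }
        *self.cwd.lock().unwrap() = Some(resolved.clone());
        Ok(resolved)
    }
}
```

A failed cd leaves the stored CWD untouched, which matches the "update only on success" behavior described above. Keeping the state in a value rather than a static would also let a server hold one Session per client.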

Dual-Write Tee Implementation

Design

The tee command writes simultaneously to two locations:

Real File: Writes to specified workspace path (overwrite or append)
Audit Mirror: Always appends to .shelpa/{cwd_rel}/{target}

tee output.txt
  ↓
  ├── workspace/output.txt    (overwrite mode)
  └── .shelpa/output.txt      (append mode, with separator)

tee -a output.txt
  ↓
  ├── workspace/output.txt    (append mode)
  └── .shelpa/output.txt      (append mode)
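
A minimal sketch of the dual write, assuming the target is given relative to the workspace root and using a simplified separator line (the real one carries a timestamp and record ID). The function name tee_dual_write is hypothetical.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Writes `data` to the real file (overwrite or append) and always
/// appends the same bytes to the mirror under `.shelpa/`.
fn tee_dual_write(root: &Path, rel_target: &Path, data: &[u8], append: bool) -> io::Result<()> {
    // 1. Real file: honors the caller's overwrite/append choice.
    let mut real_opts = OpenOptions::new();
    real_opts.create(true);
    if append {
        real_opts.append(true);
    } else {
        real_opts.write(true).truncate(true);
    }
    real_opts.open(root.join(rel_target))?.write_all(data)?;

    // 2. Audit mirror: append-only, so earlier contents are never lost.
    let mirror = root.join(".shelpa").join(rel_target);
    if let Some(parent) = mirror.parent() {
        fs::create_dir_all(parent)?;
    }
    let mut mirror_file = OpenOptions::new().create(true).append(true).open(&mirror)?;
    if !append {
        // Simplified placeholder for the overwrite separator.
        mirror_file.write_all(b"--- shelpa:overwrite ---\n")?;
    }
    mirror_file.write_all(data)
}
```

After two overwrites and one append, the real file holds only the latest overwrite plus the appended data, while the mirror still holds all three writes in order.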

Overwrite Separators

For non-append writes, the .shelpa/ mirror inserts boundary separators:

--- shelpa:overwrite ts=2026-02-25T13:17:30Z record_id=1772025450395572000 ---
(new content appended here)

This enables complete tracking of “when was this overwritten” in audit logs.
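
For post-hoc auditing, the separator line can be both produced and parsed with a few lines of code. The sketch below follows the sample line shown above; the function names are hypothetical, and treating record_id as nanoseconds since the Unix epoch is an inference from the sample (its leading digits match the timestamp), not something the original states.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Formats a separator in the shape of the sample line.
fn overwrite_separator(ts_rfc3339: &str, record_id: u128) -> String {
    format!("--- shelpa:overwrite ts={ts_rfc3339} record_id={record_id} ---\n")
}

/// Assumed ID scheme: nanoseconds since the Unix epoch.
fn now_record_id() -> u128 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0)
}

/// Returns (timestamp, record_id) when `line` is an overwrite separator,
/// or None for ordinary mirrored content.
fn parse_separator(line: &str) -> Option<(&str, u128)> {
    let body = line
        .trim_end()
        .strip_prefix("--- shelpa:overwrite ")?
        .strip_suffix(" ---")?;
    let (ts, id) = body.split_once(' ')?;
    Some((ts.strip_prefix("ts=")?, id.strip_prefix("record_id=")?.parse().ok()?))
}
```

Scanning a mirror file with parse_separator yields the list of overwrite times, with the content between two separators being what the file held during that interval.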

MCP Interface Design

CLI Help Output

The shelpa-mcp help output is reproduced here. Tool names were disguised as shell commands in the hope that LLMs would reuse their pre-trained shell knowledge (this ultimately didn’t work; see Security Design and Lessons for details):

shelpa-mcp (MCP stdio server)
Usage:
  shelpa-mcp [--root <ROOT>] [--help]
Notes:
  - This binary speaks MCP over stdio. It does not serve HTTP.
  - Use your MCP client to call tools below.
  - --root sets the workspace root directory for all tool calls.
  - cwd defaults to the workspace root if not provided.
Tools:
  - shelpa_pipe    Execute a virtual safety pipeline
  (tail  rg  awk  sed  tr  jq  wc  tee  fd  ls  sort  head)
  - shelpa_write   Execute a virtual safety tee (auto guard, save override history)
Allowed pipeline commands: tail rg awk sed tr jq wc tee fd
Navigation commands (single-stage only): pwd  cd <path>  ls [path]
Pipes only. No redirects (> >>). No sed -i. No awk file output. Save via tee only.
CRITICAL: Never use standard file editing tools (such as write_file, replace, etc.)
  - always use the specified tool exclusively.

Tool Definition

{
  "name": "shelpa_pipe",
  "description": "Execute a virtual pipeline string or navigation command",
  "inputSchema": {
    "type": "object",
    "properties": {
      "command": {
        "type": "string",
        "description": "Pipeline command string"
      },
      "cwd": {
        "type": "string",
        "description": "Optional working directory within workspace"
      },
      "confirm_oversize": {
        "type": "boolean",
        "description": "Confirm large writes exceeding approval threshold"
      }
    },
    "required": ["command"]
  }
}

Response Format

Success:

{
  "stdout": "matched line 1\nmatched line 2\n",
  "meta": {
    "steps": [
      {"command": "tail -n 100 file", "output_size": 5000, "truncated": false, "execution_time_ms": 12},
      {"command": "rg pattern", "output_size": 48, "truncated": false, "execution_time_ms": 8}
    ],
    "tee": null
  }
}

Error:

{
  "error": {
    "code": "GUARD_VIOLATION",
    "reason": "DISALLOWED_CMD",
    "detail": "'rm' is not allowed.",
    "suggestion": "Use tee to write files instead."
  }
}

Key Learnings

1. MCP Server Design Patterns

Exposing complex functionality through a single tool suits LLM agents well: with fewer tools to choose from, the model has less work to do at tool-selection time.

2. Session State on Stateless Protocols

Session state such as the CWD can be kept on a stateless protocol with a server-side Mutex. This is simple, but because the value is process-wide it needs care in multi-client environments, where every client would see the same CWD.

3. Command Classification Flexibility

Commands like ls that serve multiple purposes benefit from dual-nature design: pipeline-capable yet falling back to builtin for standalone execution.

4. Audit Design Practicality

Dual-write design balances real file operability with audit completeness. Separator-based append enables straightforward post-hoc auditing.

Conclusion

The shelpa-mcp implementation achieved:

Secure MCP Server: Whitelist control and multi-layer guards to prevent unauthorized operations
Flexible Pipelines: UNIX pipe semantics maintained with security guarantees
Session CWD: Directory management on stateless protocol
Complete Audit Trail: Dual-write tee records all operations

While the design was technically robust, it proved impossible to get LLM agents to adopt the virtual pipeline, and the project was ultimately abandoned. See Security Design and Lessons for details.
