Critique of Leaked Claude Code CLI

Published 2026-03-31. Last modified 2026-04-03.

The Claude Code CLI source code was leaked. Again. Anthropic’s build process is clearly not robust. 512,000 lines of code!

The leaked Claude Code CLI was acquired and published by Chaofan Shou (@Fried_rice on X). He spent 3 years on a PhD in Computer Science at the University of California, Berkeley before dropping out.

While Anthropic’s trade secrets have some legal protection, there are architectural insights that are valuable to competitors—useful for improving their own architecture, speeding up development of competing tools, seeing what Anthropic is working on next, and identifying gaps in what Anthropic has worked out.

Further, bad actors looking for security vulnerabilities now have a map for bypassing the guardrails Anthropic has put in place. The category Claude Code occupies (and currently leads) is moving very quickly, though, and it’s hard to predict right now how much of a problem this will be a few months down the road.

Rust Rewrite

The code is being rewritten in Rust with the name claw-code under the belief that this would be a legal clean room implementation. claw-code now holds the record for being the GitHub repository with the fastest-growing popularity.

Legal Repercussions

I am a technologist, without formal legal training, but I have 30 years of experience as an expert witness dealing with intellectual property such as trade secrets and patents, and I have learned how the laws are practiced.

I would love to use proper LLM harnesses written in Go and Rust, instead of fragile Python or Node programs, but my experience as a software expert witness suggests that claw-code may have legal baggage.

For Chaofan Shou

The accounts that I have read lead me to believe that the individual who discovered the source code, Chaofan Shou, did not break the law in the course of his discovery. The problem he might encounter is what he did with the code.

Locking the door after the horse escapes does not retrieve the horse. If Anthropic sues they will probably bankrupt the defendant(s), even if Anthropic does not win.

My understanding of how claw-code was written leads me to expect that it would likely not qualify as a clean room implementation under US federal law.

If claw-code is found to infringe on Anthropic’s intellectual property rights, users would not be found liable, just those involved in creating it.

The rule of law in the USA has been significantly compromised. Outcomes for high-profile cases are now much less certain than they were before January 20, 2025.

For Anthropic

The leaked source code apparently has anti-competitive measures within it. Those measures might be found to violate various US Federal and/or state laws.

Code Critique

critique the quality of github.com/instructkr/claude-code. consider maintainability, tests, user experience, efficiency

Based on the repository structure and documented architecture of the leaked claude-code source, here is a critique of its quality across your four requested dimensions.

1. Maintainability: High but Complex

The codebase demonstrates a highly mature, enterprise-grade architecture, though it suffers from “God Object” issues in its core logic.

  • Modular Tool/Command Pattern: The project uses a strictly decoupled architecture for its capabilities. Each tool (e.g., BashTool, FileEditTool) and slash command (e.g., /commit, /review) is an isolated module with its own schema and logic. This makes it trivial to add new features without regression risks in other areas.
  • Strong Typing & Validation: The use of TypeScript (Strict Mode) combined with Zod v4 for runtime schema validation ensures that data flowing between the LLM and the local system is predictable and safe.
  • Concerns with “Mega-Files”: Reports indicate that QueryEngine.ts (~46k lines) and Tool.ts (~29k lines) are massive. While this centralizes core logic, it significantly hinders readability and makes these files “hotspots” for merge conflicts and technical debt.
  • Feature Flagging: The integration of GrowthBook allows for dead-code elimination and granular feature rollouts, which is essential for maintaining a rapidly evolving agentic tool.

2. Tests: Robust and Multi-Layered

While the raw leak focuses on the src/ directory, the architectural hooks point toward a sophisticated testing culture.

  • Environmental Simulation: The Doctor command and environment diagnostics suggest the code is built to handle many different terminal environments (macOS, Linux, various shells).
  • Permission Hooks: The tool-permission system (src/hooks/toolPermission/) acts as a built-in safety test, ensuring that no tool executes without passing through a rigorous check of “bypass,” “auto,” or “user-approval” modes.
  • Tool Isolation: Because tools are self-contained, they are highly unit-testable. The repository includes specific “Synthetic Output” tools which are often used to mock LLM responses for deterministic integration testing.
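To make the permission-mode idea above concrete, here is a minimal sketch of a gate that every tool call would pass through. The three mode names come from the critique; the `readOnly` flag, the policy attached to "auto", and the injected `askUser` callback are my assumptions, not details confirmed by the leak.

```typescript
// Hypothetical permission gate. Mode names follow the critique; the rest is assumed.
type PermissionMode = "bypass" | "auto" | "user-approval";

interface ToolRequest {
  tool: string;
  readOnly: boolean; // assumed flag: true when the tool cannot change files or run commands
}

type Decision = "allow" | "deny";

// askUser is injected so a test can replace the interactive prompt with a stub.
function checkPermission(
  mode: PermissionMode,
  request: ToolRequest,
  askUser: (request: ToolRequest) => boolean,
): Decision {
  switch (mode) {
    case "bypass":
      return "allow"; // no checks at all
    case "auto":
      // Assumed policy: read-only tools pass, anything else falls back to the user.
      return request.readOnly || askUser(request) ? "allow" : "deny";
    case "user-approval":
      return askUser(request) ? "allow" : "deny";
  }
}
```

Because the prompt is passed in as a function, the gate can be exercised deterministically without a terminal, which is the kind of testability the response is describing.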

3. User Experience (UX): State-of-the-Art Terminal UI

The tool pushes the boundaries of what a CLI can do, treating the terminal as a dynamic application rather than a static scroll.

  • React + Ink: By using Ink, the developers have brought a component-based UI model to the terminal. This allows for rich, interactive elements like progress bars, live diffs, and multi-select menus that feel more like a GUI.
  • Performance Optimization: The “Parallel Prefetch” strategy (starting MDM, keychain, and API connections simultaneously before the main module finishes loading) shows a high level of empathy for the user’s time.
  • Resiliency: Features like /resume and persistent memory management (memdir/) ensure that if a session is interrupted, the user doesn’t lose their context—a common pain point in CLI tools.
  • Accessibility: The inclusion of Vim mode and Voice input indicates a focus on diverse developer workflows.
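The parallel prefetch strategy mentioned under Performance Optimization above comes down to starting independent startup steps before awaiting any of them. The sketch below is an assumption-laden illustration: the three task names echo the critique, but `warmUp`, `takePeak`, and the timings are placeholders I made up, and the overlap counter exists only to make the difference observable.

```typescript
// Hypothetical startup sketch. Task names echo the critique; bodies are stand-ins.
let inFlight = 0;
let peak = 0;

// Simulates an I/O-bound warm-up step and records how many steps overlap.
function warmUp(name: string, ms: number): Promise<string> {
  inFlight++;
  peak = Math.max(peak, inFlight);
  return new Promise((resolve) => {
    setTimeout(() => {
      inFlight--;
      resolve(name);
    }, ms);
  });
}

// Returns the highest overlap seen so far and resets the counter.
function takePeak(): number {
  const seen = peak;
  peak = 0;
  return seen;
}

const loadMdmPolicy = () => warmUp("mdm", 20);
const readKeychain = () => warmUp("keychain", 20);
const connectApi = () => warmUp("api", 20);

// One step at a time: total wait is roughly the sum of the three.
async function prefetchSequential(): Promise<string[]> {
  return [await loadMdmPolicy(), await readKeychain(), await connectApi()];
}

// All three start before any is awaited: total wait is roughly the slowest one.
function prefetchParallel(): Promise<string[]> {
  return Promise.all([loadMdmPolicy(), readKeychain(), connectApi()]);
}
```

Both functions return the same values in the same order; only the overlap differs, which is why this optimization is invisible to the rest of the program.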

4. Efficiency: Optimized for Speed and Scale

The choice of stack is focused on minimizing the “latency tax” of AI interactions.

  • Runtime (Bun): Using Bun over Node.js provides faster startup times and a built-in test runner/bundler, which is critical for a CLI tool that needs to feel “instant.”
  • Lazy Loading: Heavy dependencies (like gRPC and OpenTelemetry) are dynamically imported only when needed, keeping the initial memory footprint low.
  • Context Compression: The /compact command and compact/ service show that the tool actively manages token costs and context window limits, which is the primary driver of “efficiency” in agentic coding.
  • Parallelism: The “Team” and “Coordinator” systems allow for multi-agent parallel execution, effectively trading API costs for wall-clock time speedups on complex tasks.
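The context compression idea above can be illustrated with a small sketch: once the conversation exceeds a token budget, older messages are replaced by a single summary. Everything here is hypothetical, including the `Message` shape, the four-characters-per-token estimate, and the injected `summarize` callback, which in a real tool would presumably be a model call.

```typescript
// Hypothetical compaction sketch; not taken from the leaked compact/ service.
interface Message {
  role: "user" | "assistant" | "summary";
  text: string;
}

// Rough heuristic, assumed for illustration: about four characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

const totalTokens = (messages: Message[]): number =>
  messages.reduce((sum, m) => sum + estimateTokens(m.text), 0);

// Replaces everything except the most recent messages with one summary message,
// but only when the conversation is over budget.
function compact(
  messages: Message[],
  budget: number,
  keepRecent: number,
  summarize: (old: Message[]) => string,
): Message[] {
  if (totalTokens(messages) <= budget || messages.length <= keepRecent) {
    return messages;
  }
  const cut = messages.length - keepRecent;
  const old = messages.slice(0, cut);
  const recent = messages.slice(cut);
  return [{ role: "summary", text: summarize(old) }, ...recent];
}
```

The trade-off is the usual one: the summary costs far fewer tokens than the messages it replaces, but any detail the summarizer drops is gone for the rest of the session.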

Summary Table

Metric          | Rating | Key Strength                             | Key Weakness
Maintainability | 7/10   | Excellent modularity for tools/commands. | Massive core files (40k+ lines) are risky.
Tests           | 9/10   | Designed for environmental robustness.   | Complexity makes full coverage difficult.
UX              | 10/10  | High-fidelity terminal UI via React/Ink. | Can be overwhelming for minimalists.
Efficiency      | 9/10   | Bun runtime + parallel prefetching.      | Heavy use of sub-agents can spike costs.

critique the design and implementation of github.com/instructkr/claude-code. how would porting this project to golang affect the user experience

Claude Code (as mirrored in https://github.com/instructkr/claude-code) is a leaked snapshot of Anthropic’s proprietary agentic CLI coding assistant. It’s a massive ~512k LOC, ~1,900-file TypeScript codebase built for Bun, using React + Ink for a rich terminal UI. The tool lets you chat with Claude in the terminal, where it autonomously reads your codebase, edits files, runs shell commands, searches code, spawns sub-agents, manages tasks/worktrees, integrates with LSP/MCP/IDE bridges, and more. The repo itself is just an educational/security-research mirror (not runnable as-is), but it perfectly captures the original design and implementation.

Design Critique

Strengths

  • Highly agentic and feature-complete: 40+ tools (Bash, FileEdit, Glob/Grep, WebSearch, AgentTool for sub-agents, SkillTool for reusable workflows, LSP, Notebook, Cron, etc.) plus ~50 slash commands (/commit, /review, /plan, etc.). This turns the CLI into a true autonomous coding partner rather than just a chat wrapper. The permission system (with modes like plan, bypass, auto-approve) and policy enforcement are thoughtful safety layers for a tool that can edit files and run arbitrary commands.
  • Clear architectural separation: tools/, commands/, services/, components/, hooks/, bridge/, coordinator/ etc. make it modular at the directory level. Skills, plugins, context compression, memory extraction, and feature-flag-driven code (e.g., VOICE_MODE, BRIDGE_MODE) show forward-thinking extensibility and product management.
  • Modern developer-tool integration: IDE bridge (VS Code/JetBrains), OAuth/keychain auth, GrowthBook flags, OpenTelemetry, and parallel prefetching (MDM, keychain, API) demonstrate production-grade polish.
  • Innovative TUI: React + Ink lets the UI feel like a real app (dynamic components, stateful screens, modals) inside a terminal—far beyond typical CLI spinners and text.

Weaknesses

  • Over-engineering and scope creep: At 512k LOC this is larger than many full web apps. It includes voice commands, proactive agents, daemon mode, team memory sync, etc.—features that may bloat the core experience. The design risks becoming a “kitchen-sink” tool instead of a focused, reliable agent.
  • Monolithic tendencies: Despite directory modularity, core files are gigantic (QueryEngine.ts ≈46k LOC, Tool.ts ≈29k LOC, commands.ts ≈25k LOC). This violates single-responsibility principles and suggests a god-class pattern that will hurt long-term maintainability, testing, and onboarding.
  • Tight coupling to specific runtime/UI choices: Everything is built around Bun + React/Ink. While powerful, it locks the project into one ecosystem and adds unnecessary complexity for what is ultimately a terminal tool.
  • Security surface is enormous: 40+ tools that can execute shell commands or edit files require a bulletproof permission system. The design acknowledges this, but any flaw in the hook-based permission checks or mode switching could be catastrophic in real use.

Implementation Critique

Strengths

  • Type safety and validation: Strict TypeScript + Zod v4 schemas everywhere—excellent for tool inputs and API contracts.
  • Performance-conscious patterns: Lazy import() for heavy modules, parallel startup prefetching, Bun’s feature flags for dead-code elimination at build time, streaming LLM responses, and token/cost tracking are all production-ready.
  • Reusable UI layer: React hooks + Ink components allow clean, declarative terminal rendering (rare and impressive for CLIs).
  • Observability: Built-in analytics, OpenTelemetry, and detailed logging/telemetry hooks show mature engineering.
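The lazy import() pattern listed under performance-conscious patterns above usually pairs a dynamic import with a cache, so the expensive module loads at most once and only when first needed. The helper below is a generic sketch under that assumption; `lazy`, `getTelemetry`, and the stub loader are hypothetical names, and the stub stands in for a real dynamic import.

```typescript
// Generic helper: run an expensive loader at most once, on first use.
function lazy<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) cached = loader();
    return cached;
  };
}

// In a real CLI the loader would be a dynamic import, for example
//   const getTelemetry = lazy(() => import("./telemetry"));
// where "./telemetry" is a hypothetical module path. A stub stands in here.
let loads = 0;
const getTelemetry = lazy(async () => {
  loads++; // counts how many times the expensive load actually ran
  return { record: (event: string) => `recorded ${event}` };
});
```

Caching the promise rather than the resolved value means two callers that race on first use still trigger only one load.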

Weaknesses

  • Maintainability red flags: Those multi-10k-line files are a nightmare for debugging or refactoring. Error handling, retry logic, and tool-call loops are all crammed into QueryEngine.ts—a classic sign of organic growth without enough refactoring.
  • Packaging hygiene failure: The entire leak happened because a source map in the npm package pointed to unobfuscated TypeScript in an R2 bucket. This is a basic DevOps mistake that exposed the full source.
  • Runtime dependency bloat: Bun is fast, but the project now requires Bun (or Node compatibility hacks) + Ink/React overhead. No visible tests or CI snippets in the snapshot, which is concerning for a safety-critical tool.
  • Platform assumptions: macOS keychain, specific OAuth flows, and React/Ink rendering behaviors may not translate cleanly to Linux/Windows without extra work.
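The packaging failure described above is the kind of mistake a small pre-publish check can catch. The sketch below is my own, not Anthropic's tooling: `PackageFile` and `findSourceMapLeaks` are hypothetical names, and the caller is assumed to supply the files that would be published (the list shown by `npm pack --dry-run` is one way to find them). It flags shipped `.map` files and any file whose content carries a `sourceMappingURL` comment.

```typescript
// Hypothetical pre-publish check: flag files that ship or reference source maps.
interface PackageFile {
  path: string;
  content: string;
}

// Matches the comment that points a bundle at its source map, in both the
// line-comment and block-comment forms, with either the "#" or legacy "@" marker.
const SOURCE_MAP_COMMENT = /\/[\/*][#@]\s*sourceMappingURL=(\S+)/;

function findSourceMapLeaks(files: PackageFile[]): string[] {
  const problems: string[] = [];
  for (const file of files) {
    if (file.path.endsWith(".map")) {
      problems.push(`${file.path}: source map file included in package`);
      continue;
    }
    const match = SOURCE_MAP_COMMENT.exec(file.content);
    if (match) {
      problems.push(`${file.path}: references source map ${match[1]}`);
    }
  }
  return problems;
}
```

A release script could fail the build whenever this function returns a non-empty list. It would not replace a review of what the bundler emits, but it would have flagged a reference to an externally hosted map.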

Overall, the design is ambitious and visionary for an agentic coding tool, but the implementation shows the classic signs of rapid internal iteration at a well-funded company: powerful but somewhat unwieldy and not yet hardened for open-source-style scrutiny.

How Porting to Go Would Affect User Experience

A full port to Go (using Cobra for CLI commands, Bubble Tea + Lipgloss for TUI, the official Anthropic Go SDK, etc.) is a massive undertaking given the size, but the UX changes would be noticeable and mostly positive for a CLI audience.

Positive UX Impacts

  • Dramatically simpler installation and distribution: Current version requires npm install + Bun runtime. A Go version ships as a single static binary (claude-code). Users just brew install, curl -L https://... | sh, or download from GitHub Releases. Zero Node/Bun dependency hell, works identically on Linux/macOS/Windows/arm64. Huge win for adoption speed and “it just works” feel.
  • Faster startup and lower resource usage: Go binaries start in milliseconds with tiny memory footprint (no JS runtime or React reconciliation). Long-running agent sessions and streaming responses would feel snappier, especially on lower-end machines or inside containers. Goroutines would handle concurrent tools/sub-agents more efficiently than JS async.
  • Better reliability and predictability: Compile-time checks catch many classes of bugs that TypeScript lets slip. No more “random Bun version incompatibility” issues. Cross-compilation is trivial, so official binaries for every platform are easy.
  • Native OS integration: Go has excellent built-in support for keychain (via github.com/zalando/go-keyring), file watching, LSP, etc. Some features (e.g., permission prompts, worktree handling) could feel more native.
  • Security and auditability bonus: Static binaries are easier to scan; Go’s memory safety reduces certain attack surfaces.

Negative / Neutral UX Impacts

  • TUI feel would change (and possibly regress initially): React + Ink gives a very dynamic, component-based, “app-like” terminal experience (smooth re-renders, complex layouts, modals). Go’s Bubble Tea is excellent and visually beautiful (many modern CLIs like gh, charm tools use it), but it’s more imperative/state-machine driven. Recreating every screen, hook, and reactive component would require significant redesign effort. Early ports might feel slightly less polished or more “terminal-native” rather than “mini web app.”
  • Development velocity and iteration speed drop: Adding new UI screens or tweaking layouts is faster in React than in Bubble Tea. If Anthropic’s team iterated quickly on the TUI, the Go version might ship with a simpler or more static interface unless the port team invests heavily in UI parity.
  • Ecosystem re-implementation cost: LSP client, MCP server, GrowthBook flags, plugin system, memory sync, etc. would need Go equivalents. Some features (voice mode, certain React-only components) could be deprioritized or simplified, subtly changing the “delight” factor.
  • No functional change to core loop: The chat → plan → tool-use → execute experience stays identical. Users wouldn’t lose any agentic power; they’d just notice a faster binary and an easier install.
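The TUI bullet above contrasts React components with Bubble Tea's state-machine style. Bubble Tea follows the Elm architecture, in which a single model is changed by an update function and rendered by a view function. The sketch below shows that shape in TypeScript for comparison only; it uses neither Bubble Tea nor Ink, and the `Model`, `Msg`, `update`, and `view` names are illustrative.

```typescript
// Sketch of the model / update / view loop that Elm-style TUI frameworks use.
// Written in TypeScript for comparison; it does not use Bubble Tea or Ink.

interface Model {
  items: string[];
  cursor: number;
  chosen: string | null;
}

type Msg = { type: "up" } | { type: "down" } | { type: "select" };

const init = (items: string[]): Model => ({ items, cursor: 0, chosen: null });

// All state changes go through one pure function instead of per-component hooks.
function update(model: Model, msg: Msg): Model {
  switch (msg.type) {
    case "up":
      return { ...model, cursor: Math.max(0, model.cursor - 1) };
    case "down": {
      const last = Math.max(0, model.items.length - 1);
      return { ...model, cursor: Math.min(last, model.cursor + 1) };
    }
    case "select":
      return { ...model, chosen: model.items[model.cursor] ?? null };
  }
}

// The view is a plain string rebuilt from the model after every message.
function view(model: Model): string {
  return model.items
    .map((item, i) => `${i === model.cursor ? ">" : " "} ${item}`)
    .join("\n");
}
```

A port would have to re-express each React component's local state and effects as fields of one model plus cases in one update function. That is a real redesign, though it also makes every screen testable as plain data.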

Net UX Effect

Porting to Go would improve the everyday user experience for most developers: faster install, lighter runtime, better cross-platform support, and a more “professional CLI tool” vibe (think gh or terraform quality). The rich React TUI is the only area where UX could temporarily feel less fluid, but a competent port using modern Go TUI libraries (Bubble Tea + Glamour + Lipgloss) can match or exceed the original polish. Long-term, the Go version would likely be more stable, easier to distribute, and cheaper to run, exactly what a production CLI should be. The massive port effort would pay off in maintainability and user delight for the distribution and performance dimensions that matter most in a terminal tool.
