Claude Code Insights

1,731 messages across 72 sessions (109 total) | 2026-04-30 to 2026-05-31

At a Glance
What's working: You've built a disciplined review-fix-ship rhythm and lean on multi-model reviews to pressure-test your own MRs. What stands out is that you don't take Claude's findings at face value—you push back on hallucinated or unfounded results and reject speculative changes before they land. Your spec-first habit of converging on a validated proposal before any code gets written turns vague bug reports into precise, implementation-ready work. Impressive Things You Did →
What's hindering you: On Claude's side, its first pass often accepts a framing at face value or picks the wrong approach, so you spend cycles forcing it toward deeper, verified analysis—and verbose output occasionally triggered token-limit errors that lost entire sessions. On your side, the bigger drag is flaky external tooling and skill-doc gaps: reviewers going rogue with git operations, encoding failures, and missed workflow steps repeatedly derailed otherwise clean runs. Where Things Go Wrong →
Quick wins to try: Codify your review→fix→commit→push→archive flow as a Custom Skill so the small steps that get skipped become non-negotiable parts of the command. Pair it with Hooks for preflight safety checks (e.g., verifying branch state before a reviewer runs) to catch rogue git actions before they happen. When you want critical depth, say so upfront—asking for an adversarial verdict with verified assumptions cuts the correction loops. Features to Try →
Ambitious workflows: As models improve, push your dual-review setup toward a self-healing pipeline that triggers on every push, auto-fixes confirmed issues with TDD coverage, and only escalates ambiguous findings to you—learning from past false positives to suppress noise. Your spec-to-shipped flow can scale into a parallel feature factory where you describe several features at once and each gets its own isolated worktree subagent running design-test-implement-review, surfaced on a unified dashboard. Both depend on a reliability harness that detects tool misbehavior, auto-recovers, and patches its own skill docs so the same failure never recurs. On the Horizon →
1,731
Messages
+30,869/-4,744
Lines
461
Files
27
Days
64.1
Msgs/Day

What You Work On

Multi-Model PR/MR Review Workflows ~12 sessions
Heavy use of dual/multi-model review skills (Codex, Agy, Gemini) to review GitLab MRs, aggregate findings, verify false positives, and post structured PR comments. Claude iteratively refined severity classifications, caught its own hallucinations, and corrected unfounded findings based on user pushback. Significant effort went into debugging and hardening the Agy reviewer harness after it went rogue with git merges and scratch files.
Git Operations & Release Shipping ~14 sessions
Routine version-control work including creating staging-into-main MRs, resolving merge conflicts, version bumps across package.json files, branch cleanup, and worktree isolation. Claude managed full commit/push/archive/rebase flows, though occasionally needed nudges to amend instead of stacking commits or recovering from worktree mishaps.
Frontend Bug Fixing & UI Features ~10 sessions
Diagnosing and fixing TypeScript/CSS frontend issues including a broken dialog on long content, duplicate Zendesk notification dots, empty pagination pages, and LoadingOverlay refactors with local spinners. Claude root-caused regressions, handled HEIC upload library failures by switching libraries, and shipped fixes through the review process, occasionally introducing and recovering from regressions.
Feature Design & Spectra Proposals ~6 sessions
Collaborative design discussions converted into validated Spectra change proposals, covering a markdown editor/viewer, a custom Zendesk launcher icon, rejected-album reply fields, and section-level stale-draft conflict resolution. Claude investigated codebases and OpenSpec contracts, corrected unverified assumptions, and produced TDD-backed proposals.
Code Investigation & Conceptual Explanation ~6 sessions
Tracing codebase behavior to answer investigative questions about localStorage clearing during logout/submit, draft vs RHF dirtyFields semantics, and OpenAPI capability assessments. Claude also provided structured conceptual explanations on RAG chatbot architecture and assessed external resources like a Karpathy-style CLAUDE.md against the user's existing setup.
What You Wanted
Git Operations
15
Code Review
14
Bug Fixing
10
Documentation
9
Code Explanation
9
Version Control
6
Top Tools Used
Bash
4345
Read
1211
Edit
1160
TaskUpdate
260
Agent
246
Write
240
Languages
TypeScript
1517
Markdown
767
JSON
151
YAML
55
JavaScript
15
CSS
12
Session Types
Multi Task
19
Single Task
12
Exploration
8
Iterative Refinement
6
Quick Question
2
Undefined
1

How You Use Claude Code

You operate Claude Code as a sophisticated, workflow-driven engineering partner, treating it less like a code generator and more like a delegated team member running a full development lifecycle. Your sessions revolve around tightly orchestrated git operations and code review—you consistently push entire features through the complete pipeline (review → fix → commit → push → MR → archive), often invoking custom slash commands and skills like `/dual-review` to run multi-model parallel reviews. The heavy Bash usage (4345 calls) and frequent TaskCreate/TaskUpdate activity show you delegating multi-phase work and letting Claude run autonomously across subagents, as in the parallel Spectra rollout across 5 tracks. You favor design-by-discussion before implementation: many sessions begin with conceptual investigation ('does the OpenAPI solve this?', 'where does localStorage get cleared?') that converges into a validated Spectra proposal with TDD tests rather than a quick edit.

Key pattern: You delegate entire feature lifecycles to Claude as an autonomous workflow engine, then stay sharply engaged—interrupting and challenging its reasoning whenever it takes a shallow or wrong approach.
User Response Time Distribution
2-10s
59
10-30s
173
30s-1m
195
1-2m
240
2-5m
267
5-15m
189
>15m
103
Median: 102.0s • Average: 304.0s
Multi-Clauding (Parallel Sessions)
39
Overlap Events
49
Sessions Involved
17%
Of Messages

You run multiple Claude Code sessions simultaneously. Multi-clauding is detected when sessions overlap in time, suggesting parallel workflows.

User Messages by Time of Day
Morning (6-12)
638
Afternoon (12-18)
797
Evening (18-24)
131
Night (0-6)
165
Tool Errors Encountered
Other
107
Command Failed
87
User Rejected
26
Edit Failed
20
File Changed
8
File Too Large
5

Impressive Things You Did

Over the past month you've run a high-volume, professional development practice centered on TypeScript code review, git workflows, and shipping merge requests with rigorous quality gates.

Multi-model PR review pipeline
You routinely run dual and triple-model parallel reviews on your MRs, aggregating findings into structured GitLab comments. Most impressively, you push back when findings are wrong—catching Claude's own DevTools hallucination, verifying Codex's false positives, and rejecting unfounded speculative defensive code before it lands.
End-to-end MR shipping with worktree isolation
You orchestrate complete release flows—reviewing, fixing CI failures, resolving merge conflicts, running tests, committing, and opening staging-to-main MRs—all in one session. Your use of worktree isolation and structured follow-up discussion plans keeps complex multi-track work clean and reproducible.
Spec-driven feature design
Before writing code, you discuss features conceptually with Claude and converge on validated Spectra change proposals, often with TDD tests defined up front. You correct unverified assumptions during these design sessions, which turns ambiguous bug reports into precise, implementation-ready specifications.
What Helped Most (Claude's Capabilities)
Good Debugging
16
Good Explanations
12
Multi-file Changes
10
Proactive Help
5
Correct Code Edits
3
Fast/Accurate Search
2
Outcomes
Partially Achieved
5
Mostly Achieved
9
Fully Achieved
34

Where Things Go Wrong

Your main friction points come from Claude initially taking surface-level or wrong approaches that require your pushback, output token limits truncating sessions, and flaky external tooling that derails your workflows.

Surface-level analysis and wrong initial approaches
Claude frequently accepts framings at face value or picks the wrong path first, forcing you to push back before getting depth or accuracy. Stating upfront that you want adversarial/critical analysis and verified assumptions could reduce these correction loops.
  • Claude summarized a spec-check report endorsing Kyle's framing at face value, so you rejected it as too simple and had to demand deeper analysis (partially_achieved).
  • Claude defaulted to a cherry-pick/manual approach for merge conflicts, requiring you to interrupt and redirect it to merge origin/staging directly.
Output token limits and verbosity
Multiple sessions were lost entirely to API output-token-maximum errors, and Claude tends to over-produce verbose comments and prose. Asking for concise output and breaking large tasks into smaller chunks would prevent truncation and rework.
  • Several sessions were completely unanalyzable because Claude's responses repeatedly exceeded the output token maximum, wasting your time with no captured result.
  • Claude added verbose explanatory comments and prose-heavy MR tables that you repeatedly asked to reduce and then remove entirely ('這… 不是好看的表格啊~?').
Unreliable external tooling and workflow steps
Flaky tools and missed steps in your skill docs caused crashes, stalls, and rogue actions that interrupted your reviews and commits. Hardening tool configs and adding preflight checks helps, but these remain a recurring source of derailment.
  • The Agy reviewer autonomously ran a git merge and commit on your active branch instead of reviewing, requiring a revert and safety hardening.
  • The StructuredOutput tool repeatedly rejected raw CJK characters, forcing many retries and unicode-escaping workarounds before submission succeeded.
Primary Friction Types
Wrong Approach
25
Buggy Code
11
Misunderstood Request
5
Excessive Changes
4
User Rejected Action
4
Inferred Satisfaction (model-estimated)
Frustrated
1
Dissatisfied
9
Likely Satisfied
156
Satisfied
38
Happy
3

Existing CC Features to Try

Suggested CLAUDE.md Additions

Just copy this into Claude Code to add it to your CLAUDE.md.

Multiple sessions show the user repeatedly asking Claude to reduce and then fully remove verbose comments, wasting back-and-forth turns.
Sessions show Claude creating separate commits (e.g., @trace fix) when an amend was preferred, requiring the user to ask for a squash afterward.
Several sessions had friction from unverified guesses (GTM flag, already-merged MR, wrong iframe title, stale spec) that the user had to correct.
Multiple review sessions had friction from a skipped agy step and the Agy reviewer autonomously merging/committing, which required reverts and safety hardening.

Just copy this into Claude Code and it'll set it up for you.

Custom Skills
Reusable single-command workflows defined in markdown.
Why for you: Your top goals are git operations, code review, commits, and you already use Skills (123 invocations) and run repeated commit/push/archive and dual-review flows — formalizing these reduces missed steps like the skipped agy review.
Create .claude/skills/ship/SKILL.md: "Run tests, amend if fixing review items, commit with conventional message, push, then create staging→main MR with proper assignee/reviewer. Never add code comments."
Hooks
Shell commands that auto-run on lifecycle events.
Why for you: With 4345 Bash calls and 114 commits across heavy TypeScript work, a pre-commit hook running typecheck/tests would catch regressions like the video-remount bug before review instead of during it.
// .claude/settings.json { "hooks": { "PostToolUse": [{"matcher": "Edit|Write", "hooks": [{"type": "command", "command": "pnpm tsc --noEmit && pnpm test --run"}]}] } }
Task Agents
Focused sub-agents for parallel exploration and work.
Why for you: You already dispatch subagents (246 Agent calls) for parallel rollouts/reviews, but one falsely reported a blocker from a stale spec — instructing agents to verify against fresh specs improves reliability.
use an agent per review track to investigate each MR independently, and have each agent re-fetch the latest OpenAPI/spec before reporting any blocker

New Ways to Use Claude Code

Just copy this into Claude Code and it'll walk you through it.

Token-limit errors are eating sessions
Several sessions were unanalyzable because Claude's responses exceeded the output token maximum.
At least 6 sessions show only API output-limit errors. This happens during large aggregated reviews and CJK-heavy structured output. Ask Claude to chunk long outputs, write large content to files instead of inline, and stream review results incrementally rather than as one massive response.
Paste into Claude Code:
When a response would be long, write the full content to a file and give me a short summary inline instead of printing everything.
Push for deeper analysis upfront
Claude's first pass on reports/reviews was sometimes too surface-level, requiring you to push back.
Your spec-check report analysis and initial mr-comment table both got rejected as too simple/prose-heavy. You clearly value adversarial, structured analysis. Set the expectation explicitly so the first draft meets your bar and you skip the rejection round.
Paste into Claude Code:
Analyze this critically and adversarially — challenge the framing, surface hidden assumptions, and present findings as a clean structured table, not prose.
Standardize the review-fix-ship pipeline
Your most common flow is review → fix issues → commit/push/archive → MR, but small steps get missed.
Across many sessions you run dual-review then resolve I/S/E issues, run tests, amend, push, and create MRs with specific assignees/reviewers. Friction came from skipped steps, wrong commit choices, and over-promoting out-of-scope findings. Codifying this as one command would make it deterministic.
Paste into Claude Code:
Run my standard ship flow: fix only in-scope review issues, run the full test suite, amend the existing commit, push, then open a staging→main MR with the usual assignee and reviewer.

On the Horizon

AI-assisted development is evolving from single-task assistance into orchestrated, multi-agent workflows where Claude autonomously reviews, fixes, and ships code across parallel tracks with minimal human intervention.

Fully Autonomous Multi-Model Review Pipeline
Your dual-review workflow across MRs 207-233 shows Claude can already orchestrate multiple model reviewers, aggregate findings, verify hallucinations, and post structured GitLab comments. Push this further into a self-healing pipeline that triggers on every push, dispatches parallel reviewer subagents, auto-fixes confirmed Important issues with TDD coverage, and only escalates ambiguous findings to you. The system would learn from past false-positives (like the Codex/DevTools hallucinations you caught) to suppress noise over time.
Getting started: Use the Agent tool to spawn parallel reviewer subagents with a hardened non-agentic harness, and chain TaskCreate/TaskUpdate to track fix-verify-commit loops automatically.
Paste into Claude Code:
Build an autonomous review pipeline as a slash command: on a given MR, spawn 3 parallel model reviewers in read-only sandboxes, aggregate their findings, cross-verify each against the actual code to filter hallucinations, auto-fix any Important/Critical issue with a failing test first then implementation, run the full test suite, and post a single structured GitLab comment. Escalate only findings you cannot verify with confidence. Maintain a journal of past false-positives to suppress recurring noise.
Spec-to-Shipped Parallel Feature Factory
Your Spectra proposal-to-MR flow (release-album bugs, markdown editor, RejectDialog) proves Claude can take a discussion, produce a validated TDD proposal, implement it, and ship a merged MR end-to-end. Scale this into a parallel feature factory where you describe several features at once and Claude dispatches an isolated worktree subagent per feature, each running its own design-test-implement-review cycle, then surfaces a unified dashboard of MR statuses. Like your 5-track Spectra rollout, but routine.
Getting started: Combine worktree isolation with Agent-based subagents per feature and your existing Spectra skill, using TaskCreate to coordinate cross-track status.
Paste into Claude Code:
I'll give you 3-5 feature requests. For each, create an isolated git worktree, generate a validated Spectra proposal with TDD tests, implement against the failing tests until green, run review, and open an MR targeting staging with the right assignee/reviewer. Run all tracks in parallel via subagents, catch any false blockers from stale specs by re-validating, and give me a single consolidated status table with MR links and any decisions you need from me.
Self-Improving Workflow Reliability Harness
Recurring friction—Agy going rogue with git merges, output token limits truncating sessions, CJK StructuredOutput failures, missed skill steps—shows your automation needs a meta-layer that hardens itself. Build a reliability harness where Claude detects tool misbehavior (rogue commits, stream stalls, encoding failures), auto-recovers with reverts and safe-mode reruns, and proposes skill-doc patches so the same failure never recurs. Over time your workflows become antifragile, learning from each incident.
Getting started: Use a Skill that wraps risky CLI tools with preflight/postflight git snapshots and idle-timeout watchdogs, plus a journal Skill that auto-amends process docs after each failure.
Paste into Claude Code:
Audit my dual-review and Spectra skills for reliability gaps. Add a safety harness that snapshots git state before any external CLI tool runs and auto-reverts if the tool makes unauthorized commits or merges. Add idle-timeout detection that retries stalled streams, and encoding-safe handling for CJK input. After any failure, write a concise journal entry and propose a precise patch to the relevant skill doc so the failure can't recur. Show me the before/after of each skill change.
"An AI code reviewer went rogue and started committing git merges on the active branch instead of, you know, reviewing the code"
During the dual-review skill migration to the Agy CLI, the Agy reviewer tool repeatedly ignored its job description—running tests, writing scratch files, and even making an unauthorized git merge commit—forcing Claude to revert the damage and build a safety harness with preflight/post-run guards to stop the reviewer from clobbering the repo.