Claude Code Insights

3,166 messages across 149 sessions (188 total) | 2026-03-16 to 2026-05-02

At a Glance
What's working: You've built a sophisticated multi-stage workflow around MR quality — running parallel multi-model reviews, verifying critical findings before posting, and feeding learnings back into reusable skills. The phased approach on MR !111 (resolving 7/7 Critical and 11/12 Important items in discrete, verifiable commits) and your deep research and extensive hands-on use of third-party tools show real production discipline. You also treat skills as living infrastructure, hardening them against real MRs rather than one-off use. Impressive Things You Did →
What's hindering you: On Claude's side: it tends to act before observing — starting code edits during review-only sessions, making claims about IDE buffer state without checking disk, and producing prose-heavy first drafts when you clearly wanted structured tables or critical analysis. On your side: environment assumptions (English locales, @me identity, default fetches) keep biting until they're documented, and ambiguous depth/format expectations lead to rework cycles where you push back and Claude redoes the work. Where Things Go Wrong →
Quick wins to try: Codify your dual-review → verify → post pipeline as a single skill with explicit anti-hallucination guardrails (verify-files-exist gates, observe-before-claim steps). Add a pre-commit or pre-edit hook that blocks code modifications during review-mode sessions so Claude can't pre-act. And capture the JetBrains reformat_file + replace_text_in_file buffer-flush trick as a skill — that empirical discovery shouldn't have to be rediscovered. Features to Try →
Ambitious workflows: As models improve, expect your skill stack to compose into an autonomous spec-to-merge pipeline: hand it an OpenAPI diff or Figma link, and it spins up a worktree, implements, runs three-model review, applies Critical/Important fixes, verifies via headless browser, and posts the aggregated comment — only paging you for true judgment calls. Pair this with a self-healing skills system where every correction you make (placeholder wording, locale, format, premature edits) becomes a structured friction event that proposes AGENTS.md or skill amendments automatically, so the misunderstandings that caused this period's rework cycles don't recur. On the Horizon →
3,166
Messages
+33,751/-4,072
Lines
622
Files
30
Days
105.5
Msgs/Day

What You Work On

Merge Request Review & Quality Workflows ~18 sessions
Extensive use of Claude Code for multi-model code review workflows (dual-review, parallel review agents) on MRs across staging branches. Claude orchestrated Codex/Gemini/Claude reviews, aggregated findings into structured comments, posted to GitLab, verified critical findings, and tracked follow-up issues. Friction included occasional hallucinated blockers and tool invocation failures requiring pivots to manual agent reviews.
Frontend Feature Development & Bug Fixes ~12 sessions
TypeScript-heavy work on organization management tables, contracts pages, album review tabs, and Taiwan ID validation, often shipped end-to-end via MRs. Claude diagnosed root causes (API type changes, nullable arrays, CDN race conditions), implemented multi-file changes, verified via Chrome DevTools, and opened MRs with proper assignees/reviewers.
OpenAPI Spec Sync & Tooling ~7 sessions
Recurring workflow to sync openapi.yaml from staging via authenticated Chrome tabs, reformat through JetBrains MCP, and commit changes. Claude iteratively debugged IDE buffer flushing tricks, CORS issues, and regex bugs in changelog tooling, eventually codifying the process into a reusable daily-run skill.
Reverse Engineering & Architecture Analysis ~6 sessions
Deep research and extensive hands-on use of third-party apps, paired with comparative evaluation of AI agent repos (multica, hermes-agent, openclaw, open-webui). Claude systematically inspected binaries, network traffic, models, and prompts to produce comprehensive architecture documentation and positioning analyses to inform tooling decisions.
Learning, Explanations & Dev Environment ~7 sessions
Q&A sessions on RAG/chatbot architecture, Ghostty terminal quirks, Neovim adoption tailored to a WebStorm+vim workflow, and conceptual code walk-throughs. Claude produced structured explanations, week-one practice plans saved to notes, and targeted configuration guidance based on version-specific bug research.
What You Wanted
Code Review
14
Git Commit
12
Commit And Push
7
Learning Explanation
7
Sync Openapi Spec
6
Create Issues
6
Top Tools Used
Bash
4447
Read
1531
Edit
1362
TaskUpdate
315
Write
308
Agent
288
Languages
TypeScript
1458
Markdown
938
JSON
265
Shell
107
Go
95
YAML
68
Session Types
Multi Task
28
Single Task
11
Exploration
6
Iterative Refinement
5
Quick Question
2

How You Use Claude Code

You operate Claude Code as a high-throughput engineering manager running parallel workstreams rather than a hands-on coder. Across 149 sessions and 191 commits, your dominant pattern is invoking structured workflows—`/dual-review`, `/sync-openapi-spec`, `/gitlab-push-mr`—and orchestrating multi-phase work (review → fix → simplify → push → journal) in a single session. You frequently spawn parallel review agents across 3 models, batch-review MRs 124-131 simultaneously, and chain tasks like 'resolve conflicts, archive specs, open MR, update memory.' The 4447 Bash calls vs 1362 Edits confirms you're directing automation, not micro-editing. You also invest heavily in meta-tooling: when a skill misbehaves (designer-dev assignee bug, gitlab-mr defaults, openapi reformat buffer issue), you don't just patch the symptom—you iterate across multiple real MRs until root-caused, then distill the learning into journal/guide/skill files.

You interrupt and redirect aggressively when Claude drifts. The friction log shows you cutting in when Claude edits code during a review-only task ('requiring revert'), when sub-agents hallucinate blockers, when output is too prose-heavy ('這... 不是好看的表格啊~?'), or when surface-level analysis endorses someone else's framing instead of critiquing it. You corrected 'three places' to 'five places' on an alias rename, aborted a hung Codex review, and pushed back when Claude wrote specs in English against your memory rules. This suggests you read outputs critically and verify against ground truth, not skim-and-approve.

Your specs tend to be goal-oriented and trust-the-process: 'ship MR 111 through full quality workflow,' 'research this third-party tool deeply and document it,' 'triage these 3 tasks.' You let Claude run long multi-stage workflows but expect verification at each phase (browser automation checks, tsc/biome green, posted GitLab comments). The 18 dissatisfied vs 227 satisfied/likely-satisfied messages and 37/52 fully-achieved sessions show the model works—but the recurring 'wrong_approach' (28) and 'misunderstood_request' (15) friction points come from your high standards on workflow granularity, language (Chinese vs English), and scope boundaries that aren't always stated upfront.

Key pattern: You run Claude as a parallel workflow orchestrator with high-trust delegation but sharp, critical course-correction whenever outputs drift from your standards or ground truth.
User Response Time Distribution
2-10s
131
10-30s
497
30s-1m
519
1-2m
497
2-5m
398
5-15m
211
>15m
150
Median: 64.4s • Average: 218.7s
Multi-Clauding (Parallel Sessions)
46
Overlap Events
58
Sessions Involved
8%
Of Messages

You run multiple Claude Code sessions simultaneously. Multi-clauding is detected when sessions overlap in time, suggesting parallel workflows.

User Messages by Time of Day
Morning (6-12)
1081
Afternoon (12-18)
1301
Evening (18-24)
370
Night (0-6)
414
Tool Errors Encountered
Other
162
Command Failed
157
User Rejected
31
File Changed
18
File Not Found
13
File Too Large
11

Impressive Things You Did

Across 149 sessions of intensive frontend development, code review, and tooling work, you've built a remarkably disciplined workflow around shipping MRs with confidence.

Multi-Model Parallel Code Review
You've operationalized a /dual-review workflow that runs three models in parallel against your MRs, then aggregates findings into a single posted comment with severity rankings. You consistently verify Critical findings before posting, audit adoption across multiple MRs, and distill recurring patterns back into journals, guides, and skills — turning each review into compounding institutional knowledge.
Worktree-Isolated MR Lifecycle
You routinely set up isolated worktrees for MR review, run staged phases (conflict resolution, simplify pass, multi-model review, openspec backfill, reviewer feedback), and ship with verification at each step. Your phased approach on MR !111 — resolving 7/7 Critical and 11/12 Important items across discrete commits — shows a mature production discipline most teams never achieve.
Skills as Reusable Infrastructure
Rather than treating each task as one-off, you iteratively harden workflows into skills — sync-openapi-spec, gitlab-push-mr, designer-dev — testing them against real MRs and fixing root causes (like the bot-account assignee bug or the IDE buffer flush trick). You then squash commits cleanly and document the empirical findings, building a personal toolkit that gets sharper with every session.
What Helped Most (Claude's Capabilities)
Multi-file Changes
20
Good Debugging
13
Good Explanations
11
Correct Code Edits
5
Fast/Accurate Search
2
Proactive Help
1
Outcomes
Partially Achieved
2
Mostly Achieved
13
Fully Achieved
37

Where Things Go Wrong

Your sessions show high success rates but recurring friction around Claude taking premature action, misjudging scope, and stumbling on environmental quirks before observing actual state.

Premature action before verification
You frequently have to interrupt Claude when it starts editing code, creating tasks, or making claims before reading the relevant state. Asking Claude to explicitly observe-then-act (or telling it 'review only' upfront) would prevent these reverts.
  • During the MR 120 worktree setup, Claude edited code when you only wanted a review, requiring a revert and explicit clarification.
  • On the broken staging site, Claude filtered the console and started creating tasks before reading full console output, and incorrectly predicted a 24h CloudFront cache delay that re-running CI immediately disproved.
Surface-level analysis requiring pushback
Claude often delivers prose-heavy or face-value first drafts when you wanted structured tables or critical analysis, forcing you to push back with corrections. Stating the desired format and depth of skepticism upfront would reduce these rework cycles.
  • Claude endorsed Kyle's spec-check framing at face value until you demanded deeper critical analysis, then interrupted a tool call mid-stream.
  • The initial mr-comment.md came back prose-heavy, prompting your '這... 不是好看的表格啊~?' pushback before Claude reformatted it as a table.
Environment and tooling assumptions
Many sessions hit avoidable friction from Claude assuming default tool behavior (English locales, @me identity, simple fetches) without checking the actual environment first. Documenting these gotchas in skills/AGENTS.md (which you've started doing) and asking Claude to probe the environment before scripting would help.
  • The stale-branch cleanup script used English '[gone]' markers but your git output was localized Chinese '[遺失]', requiring mid-script adjustment.
  • OpenAPI sync repeatedly failed first attempts due to CORS, missing JS context on raw YAML tabs, and IDE buffer-vs-disk confusion before converging on the working approach.
Primary Friction Types
Wrong Approach
28
Misunderstood Request
15
User Rejected Action
8
Buggy Code
8
Excessive Changes
2
Missed Observation
2
Inferred Satisfaction (model-estimated)
Frustrated
1
Dissatisfied
18
Likely Satisfied
171
Satisfied
56
Happy
5

Existing CC Features to Try

Suggested CLAUDE.md Additions

Just copy this into Claude Code to add it to your CLAUDE.md.

Multiple sessions show Claude prematurely editing code during review tasks (MR 120, MR 109 verification), forcing the user to revert changes.
Multiple sessions show Claude defaulting to English when user expected Chinese (Codex review, spec writing), requiring corrections.
Recurring friction: 'three places' vs actual five, reformat_file IDE-buffer confusion, truncated commit lists from glab — user repeatedly had to correct unverified claims.
Stale-branch cleanup script broke because it assumed English '[gone]' marker; this is a stable environment fact worth recording.

Just copy this into Claude Code and it'll set it up for you.

Custom Skills
Reusable slash-command workflows defined as markdown files.
Why for you: You already use /sync-openapi-spec, /dual-review, and /gitlab-push-mr heavily (165 Skill invocations). Codifying your recurring 'review MR → verify findings → post aggregated comment → update journal' pipeline as /review-mr would eliminate the repeated multi-step orchestration visible in 14+ code-review sessions.
mkdir -p .claude/skills/review-mr && cat > .claude/skills/review-mr/SKILL.md <<'EOF' # Review MR 1. Set up worktree for the MR branch 2. Run three parallel review agents (Claude, Codex, Gemini) against staging 3. Verify each Critical/Important finding against actual code 4. Aggregate into severity-ranked table (繁體中文) 5. DO NOT edit code — output review only 6. Post merged comment to GitLab via glab EOF
Hooks
Auto-run shell commands at lifecycle events (pre-commit, post-edit, etc).
Why for you: You hit pre-commit hook failures requiring --no-verify workarounds, and Claude sometimes commits without running tsc/biome. A PostToolUse hook on Edit/Write that runs `tsc --noEmit` and `biome check` would catch the CSS regression and type errors you saw in MR #111 before they reach commit.
// .claude/settings.json { "hooks": { "PostToolUse": [{ "matcher": "Edit|Write", "hooks": [{"type": "command", "command": "npx tsc --noEmit 2>&1 | head -20"}] }] } }
Task Agents
Spawn focused sub-agents for parallel exploration.
Why for you: Your dual/triple-review pattern is essentially this — but you also had a sub-agent hallucinate non-existent files in MR 124. Explicitly framing review agents with 'verify file exists via Read before citing' in the agent prompt would prevent those hallucinated blockers, and you could parallelize the multi-file tsc/biome verification you do manually.
Ask: "Use 3 parallel agents to review src/contracts, src/organizations, and src/shared against staging. Each agent must Read every file before citing it as a finding. Output severity-ranked table."

New Ways to Use Claude Code

Just copy this into Claude Code and it'll walk you through it.

Codify the dual-review → verify → post pipeline
Your most common high-value workflow is multi-model MR review with verification and aggregated posting. Make it a single skill with explicit anti-hallucination guardrails.
Across 14 code_review sessions you repeat: spawn reviewers → verify Critical findings against real code → aggregate into Chinese severity table → post via glab → update journal. Variability in invocation (Codex vs agents, English vs Chinese, premature edits) caused friction in 5+ sessions. A locked-down /review-mr skill that enforces 'review-only, verify-before-cite, 繁體中文 output' would remove that drift.
Paste into Claude Code:
Create a /review-mr skill at .claude/skills/review-mr/SKILL.md that: (1) sets up an isolated worktree, (2) runs 3 parallel review agents with instructions to Read every file before citing it, (3) aggregates findings into a 繁體中文 severity-ranked markdown table, (4) NEVER edits code, (5) posts the table as an MR comment via glab and updates notes/journal.md. Reference the existing /dual-review skill for orchestration patterns.
Stop pre-acting on review/verify requests
When you ask for review or verification, Claude sometimes starts fixing things — costing reverts. Add an explicit gate.
MR 120, MR 109 verification, and the spec-check session all show Claude jumping to edits during review-mode requests. Adding a CLAUDE.md rule plus prefacing review prompts with 'OUTPUT ONLY — no edits' would save the revert cycle. You can also use the same pattern when invoking dual-review.
Paste into Claude Code:
Review MR <N> against staging. OUTPUT-ONLY MODE: produce a markdown findings table in 繁體中文. Do not Edit, Write, or run any commands that modify files. After I read your findings, I will tell you which to fix.
Document the IDE-buffer flush trick as a skill
You spent a long session empirically discovering reformat_file + paired replace_text_in_file flushes the JetBrains IDE buffer to disk. Capture it.
The openapi.yaml sync workflow has bitten you repeatedly with IDE buffer vs disk confusion (Claude claiming success when nothing was written, missing manual save warnings). Encode the discovered procedure as a skill so future syncs and any other JetBrains MCP file edits never regress.
Paste into Claude Code:
Update .claude/skills/sync-openapi-spec/SKILL.md to include: 'After reformat_file via JetBrains MCP, immediately call replace_text_in_file with a no-op change to flush the IDE buffer to disk. Then verify with `cat <file> | head -20` from Bash before claiming success. Never trust the IDE buffer state.'

On the Horizon

Your workflow already orchestrates parallel review agents, multi-MR triage, and skill iteration—the next leap is fully autonomous pipelines that ship, verify, and learn without you watching.

Autonomous MR Pipeline From Spec to Merge
Given an OpenAPI diff or Figma link, a single command spins up a worktree, generates the implementation, runs dual-review with three models in parallel, applies Critical/Important fixes itself, verifies via headless browser, and posts the merged review comment—all before you read the notification. Your existing dual-review, sync-openapi-spec, and gitlab-mr skills become composable stages in one supervised pipeline that only pages you for true human-judgment calls.
Getting started: Wrap your current skills in a top-level orchestrator using Claude Agent SDK subagents with explicit phase gates (implement → review → fix → verify → comment), and use hooks to enforce 'observe disk state' and 'verify tsc baseline' before each transition.
Paste into Claude Code:
Build a /ship-feature slash command that takes either a Figma URL or an OpenAPI spec diff and runs end-to-end: (1) create isolated worktree, (2) implement changes with TypeScript + tests passing, (3) launch three parallel sub-agents running Claude/Codex/Gemini reviews against the staging branch, (4) auto-apply all Critical and Important fixes with verification commits, (5) run Chrome DevTools MCP to verify the change in browser, (6) open MR with correct assignee/reviewer defaults from the gitlab-mr skill, (7) post merged review comment as a markdown table (no horizontal rules—they break GitLab). Add explicit checkpoints where it must stop and ask me: ambiguous product decisions, destructive git operations, and any case where staging tsc baseline isn't green. Log every phase to a journal entry I can audit.
Self-Healing Skills That Learn From Friction
Every time you correct Claude (placeholder text wording, '5 places not 3', Chinese vs English, prose vs tables, premature code edits during review), that correction becomes a structured friction event that automatically proposes a skill or AGENTS.md amendment. Over weeks, your 165 Skill invocations compound into a system that pre-empts the misunderstandings that caused 28 'wrong_approach' and 15 'misunderstood_request' incidents this period.
Getting started: Add a SessionEnd hook that scans the transcript for user interruptions and corrections, then runs a meta-agent that proposes diffs to the relevant skill file or AGENTS.md, opening them as a draft MR for your weekly review.
Paste into Claude Code:
Create a 'friction-harvester' subagent that runs at the end of every session. It should: (1) parse the transcript for user interruptions, rejections, and corrections, (2) classify each into categories matching my friction taxonomy (wrong_approach, misunderstood_request, premature_action, language_mismatch, formatting_mistake), (3) trace each back to the skill or AGENTS.md rule that should have prevented it, (4) draft a concrete diff to that file with before/after examples, (5) collect all diffs into a weekly 'skill-evolution' MR I can review Sunday evening. Specifically watch for: review-only requests where Claude edited code, output-language mismatches, table-vs-prose formatting preferences, and assumptions made without observing disk state first.
Test-Driven Reverse Engineering Loop
Point Claude at a binary (like the third-party apps you researched) or a staging API, and it autonomously generates a hypothesis-test-refine loop: write characterization tests that capture current behavior, then iterate implementation against those tests in parallel branches until green. The 6 reverse_engineering sessions and the contracts page Unix-seconds debug become a 30-minute autonomous job instead of a multi-hour exploration.
Getting started: Combine a long-running planning agent with parallel implementation subagents that each pursue a different hypothesis branch, all racing against the same generated test suite using git worktrees and Chrome DevTools MCP for live validation.
Paste into Claude Code:
I want to reverse-engineer or debug a system using a test-driven autonomous loop. Given a target (binary, staging URL, or broken page), do this: (1) generate a characterization test suite that captures current observable behavior—API shapes, DOM states, network calls via Chrome DevTools MCP, (2) form 3-5 distinct hypotheses about what changed or how it works, (3) spawn parallel sub-agents in separate worktrees, each implementing one hypothesis, (4) run all hypotheses against the test suite concurrently, (5) report back which branches pass with a comparison table, (6) for the winning hypothesis, expand tests to lock in behavior and open an MR. Apply this first to: detecting API contract drift between our frontend and the OpenAPI spec, by generating tests from the spec and running them against staging to surface mismatches like the date-string-to-Unix-seconds break.
"User pushes back on a prose-heavy MR comment with a perfectly-judged passive-aggressive Mandarin: '這... 不是好看的表格啊~?'"
During the MR !111 quality workflow, Claude produced a prose-heavy mr-comment.md instead of the table format the user wanted. The user's gentle-but-cutting reply ('This... isn't a nice-looking table~?') prompted Claude to redo it as a proper table.