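Claude Code Token Savings and the Big-Plan Strategy, Part 2

Overview

This post is about how to manage Claude Code's token costs. Every LLM request includes the full conversation history, so even a plain commit/push after an implementation burns tokens needlessly. What follows is a record of a conversation in which I asked Claude Code how much the /clear pattern and context isolation via agent teams actually help.

It is something of a follow-up to the previous article ("Claude Code Token Savings and the Big-Plan Strategy"), with somewhat more concrete pattern comparisons.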

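Cutting context with `/clear`

In the previous article I described how token costs balloon as a session gets longer. That made me wonder how to handle work that has nothing to do with the implementation itself, such as the commit/push that follows it. I compared three patterns.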

Takazudo

I’ve heard that every LLM request contains the full history of the conversation:

Claude Code Token Savings and the Big-Plan Strategy

If so, is this something I should avoid?

=== Pattern A ===

me: implement something big
cc: ok working… done.
me: /commits push
cc: ok… done.

in this case, is this better?

=== Pattern B ===

me: implement something big, /commits push
cc: ok working… done.

=== Pattern C ===

me: implement something big
cc: ok working… done.
/clear
me: /commits push
cc: ok… done.

Or can Claude Code arrange its requests to the LLM so that I don’t need to worry about this kind of /commits push?

Claude Code

Yes, each LLM request includes the conversation history. Here’s how the three patterns compare:

Pattern A (separate turns) - Most expensive for the commit step

Turn 1: [system prompt] + your request → implementation (large output)
Turn 2: [system prompt] + your request + ALL of turn 1 + "/commits push"
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                          This entire implementation history is sent again
                          just to run git commands. Wasteful.

Pattern B (combined instruction) - Slightly better

Turn 1: [system prompt] + "implement X, /commits push"
        → implementation happens, then commit/push in same turn

Fewer round-trips, but the commit/push phase still carries the full turn context internally. Slightly cheaper than A because there’s one fewer request.

Pattern C (after /clear) - Most token-efficient

Turn 1: [system prompt] + your request → implementation
/clear  (wipes history)
Turn 2: [system prompt] + "/commits push" → only sees git diff, clean context

The commit step starts fresh. It reads git diff and git log to understand what to commit — it doesn’t need the implementation history at all.

Recommendation: Pattern C is the best practice for token savings. /commits only needs git status / git diff / git log — it never needs the implementation conversation.

Does Claude Code optimize this automatically?

No. Claude Code does auto-compress old messages when approaching context limits, but it doesn’t know that a follow-up like /commits is context-independent. The compression is a safety net, not an optimization — it still sends more than necessary.

So /clear before context-independent operations is a good habit:

me: implement something big
cc: done.
/clear
me: /commits push
cc: done.

This is especially impactful when the implementation conversation was long (many tool calls, file reads, edits, etc.), since all of that would otherwise be re-sent just for a few git commands.
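The difference between Pattern A and Pattern C can be sketched with a toy model. This is only an illustration, not how Claude Code is implemented: the `Session` class, its fields, and every token figure are hypothetical, and the whole implementation is collapsed into a single exchange that leaves about 100K tokens of history behind.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Toy chat session: each request re-sends the system prompt plus all history."""
    system_prompt: int = 10_000   # tokens (assumed figure)
    history: int = 0              # tokens accumulated in the conversation so far

    def request(self, new_input: int, output: int) -> int:
        """Send one request and return the number of input tokens it carried."""
        sent = self.system_prompt + self.history + new_input
        self.history += new_input + output   # both sides are appended to the history
        return sent

    def clear(self) -> None:
        """Model of /clear: the history is dropped, the system prompt remains."""
        self.history = 0


def commit_cost(clear_first: bool) -> int:
    """Input tokens for the first request of the commit step."""
    s = Session()
    s.request(new_input=50, output=100_000)   # "implement something big" (assumed sizes)
    if clear_first:
        s.clear()
    return s.request(new_input=10, output=500)  # "/commits push"


commit_a = commit_cost(clear_first=False)   # Pattern A
commit_c = commit_cost(clear_first=True)    # Pattern C
print(f"Pattern A: {commit_a:,} input tokens")
print(f"Pattern C: {commit_c:,} input tokens")
```

The model ignores the git output that the commit step reads in both patterns; it only shows how much of the first commit request is leftover implementation history.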

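The cost after `/clear`

Pattern C is the most efficient; that much is clear. But a session that has just been cleared knows nothing about the implementation that came before. It has to re-read files and infer the context for the commit message. I wanted to know whether it is still cheaper even so.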

Takazudo

Wow, that’s a big deal. If so, with Pattern C the refreshed session has no idea about the changes, so Claude Code needs to re-read the files and also guess the context for the commit message. Even with that extra work, is the token cost still low?

Claude Code

Yes. Even with the “fresh discovery” work, it’s still much cheaper. Here are the rough numbers:

Pattern A (no /clear) — commit step:

Content	Tokens (rough)
System prompt	~10K
Implementation history (reads, edits, tool outputs, errors, retries…)	50K - 500K+
“/commits push”	tiny
git status/diff/log	~5-20K
Total input for commit turn	~65K - 520K+

Pattern C (after /clear) — commit step:

Content	Tokens (rough)
System prompt	~10K
“/commits push”	tiny
git status/diff/log	~5-20K
Total input for commit turn	~15-30K

The “discovery work” (git diff, git log) is far smaller than the implementation history it replaces. Even a large diff is typically 5-20K tokens. A moderate implementation conversation is easily 100K+.

What about commit message quality?

git diff tells Claude what changed. git log tells it the project’s commit style. That’s usually enough for a good commit message.

What’s lost is the “why” — your original intent. But Claude Code’s /commits skill already relies heavily on git diff anyway. The implementation history is mostly noise for the commit step (file reads, failed attempts, retries, test outputs…).

If you want the best of both worlds:

(long implementation session)
/clear
me: /commits push. Context: added retry logic to the API client for resilience
cc: done.

One line of context is ~20 tokens. The implementation history it replaces is 100K+. That’s a ~5000x efficiency gain for the “why” information.

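Roughly 15-30K versus roughly 65-520K or more: an order of magnitude apart. On top of that, commit message quality apparently does not drop much as long as git diff and git log are available. A roughly 5000x efficiency gain from adding a single line of context is quite striking.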
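The arithmetic behind these figures is easy to reproduce. The sketch below simply adds up the rough estimates quoted in the tables above; none of the numbers are measurements, and the names `commit_turn_input`, `moderate_history`, and `context_note` are made up for this illustration.

```python
K = 1_000

# Rough estimates quoted in the conversation (not measured values).
SYSTEM_PROMPT = 10 * K
GIT_OUTPUT = (5 * K, 20 * K)   # git status/diff/log: low and high estimate


def commit_turn_input(history_tokens: int, git_tokens: int, note_tokens: int = 0) -> int:
    """Input tokens for the commit turn: system prompt + history + note + git output."""
    return SYSTEM_PROMPT + history_tokens + note_tokens + git_tokens


moderate_history = 100 * K     # "a moderate implementation conversation"
context_note = 20              # one line of "why" typed after /clear

pattern_a = commit_turn_input(moderate_history, GIT_OUTPUT[1])
pattern_c = commit_turn_input(0, GIT_OUTPUT[1], note_tokens=context_note)

print(f"Pattern A commit turn: {pattern_a:,} tokens")
print(f"Pattern C commit turn: {pattern_c:,} tokens")
print(f"History vs. one-line note: {moderate_history // context_note:,}x")
```

With a 100K-token history and a large diff, the commit turn carries about 130K input tokens in Pattern A and about 30K in Pattern C. The 5000x figure is the ratio between the 100K history and the 20-token note that stands in for it.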

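Prompts are not split automatically

With Pattern B, where "implement this, then commit/push" goes into a single prompt, I wondered whether Claude Code internally splits it into "implement first, then commit".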

Takazudo

Good. Next question. About Pattern B, you said:

me: implement something big, /commits push

This is better, but /commits still needs a follow-up operation, so it costs tokens. Does a long, multi-step prompt like this cost the same as a chain of requests? And is the split recognized automatically by the Claude Code software itself, not by the LLM? In other words, maybe Claude Code recognizes it like this:

OK, let’s separate this prompt into…
1. do implementation
2. /commits push

Claude Code

No, Claude Code does not automatically split or chain your prompt. The entire thing is sent to the LLM as one request, and the LLM decides what to do.

So with Pattern B:

me: implement something big, /commits push

What actually happens:

1. The LLM receives your full prompt
2. It starts implementing (reading files, editing, running tests…)
3. Each tool call is a round-trip: the growing history is sent back to the LLM each time
4. After implementation, the LLM then does the commit/push
5. By the time it reaches the commit step, the context is already full of implementation history

There’s no “smart separation” happening at the Claude Code software layer. Claude Code is essentially a loop:

while not done:
    send(system_prompt + full_history) → LLM
    LLM returns tool calls or text
    execute tools, append results to history

Every round-trip carries the full (or compressed) history. Claude Code doesn’t analyze your prompt upfront and split it into independent stages.

So Pattern B saves one user-turn round-trip compared to Pattern A, but the commit step still carries all the implementation context. The savings are minimal.

Pattern C (/clear) is the only way to actually drop the history. It’s a user-initiated action — Claude Code won’t do it for you.

Summary:

Pattern	Who optimizes?	Savings
A (separate turns)	nobody	worst
B (combined prompt)	nobody (LLM just does both sequentially)	marginal
C (/clear between)	you	best

The takeaway: token optimization is your responsibility as the user. Claude Code doesn’t optimize conversation structure for you.

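I was hoping Claude Code would split prompts up nicely for me, but apparently it does not. At its core Claude Code is a loop: send the full history to the LLM, call tools, append the results, and repeat. That leaves the optimization to the user.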
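To get a feel for why this loop gets expensive, here is a small simulation of it. It is a sketch under simplifying assumptions, not Claude Code's actual behavior: every round-trip is assumed to add the same number of tokens, there is no auto-compression, and the function name and default figures are invented for illustration.

```python
def cumulative_input_tokens(
    round_trips: int,
    system_prompt: int = 10_000,    # assumed size of the system prompt
    prompt: int = 50,               # assumed size of the user's instruction
    tokens_per_step: int = 2_000,   # assumed tool call + tool result per round-trip
) -> int:
    """Total input tokens sent over a run in which every request re-sends the full history."""
    history = prompt
    total = 0
    for _ in range(round_trips):
        total += system_prompt + history   # the whole history goes out again
        history += tokens_per_step         # tool call and result are appended
    return total


for n in (10, 50, 100):
    print(f"{n:>3} round-trips: {cumulative_input_tokens(n):,} input tokens in total")
```

Because each request carries everything before it, the total grows roughly with the square of the number of round-trips: in this model, 5x as many round-trips costs about 15x as many input tokens.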

Context management is the human's job

So /clear works. But clearing too often wipes out all of the memory. I checked whether, in the end, a human simply has to make the call.

Takazudo

Hmm, interesting. But with too many /clear calls, the memory gets wiped completely. So in the end only the human can decide when to clear, and that is the only way to reduce this token cost. Is that right, in short?

Claude Code

Yes, exactly. You’re the only one who knows which context is still needed and which can be dropped.

/clear is a blunt tool — it’s all or nothing. There’s no “clear the implementation details but keep the architectural decisions” option.

What you can do in practice:

/clear at natural breakpoints

implement feature A → done → /clear
implement feature B → done → /clear
/commits push

Each implementation starts fresh. Good.

Re-state what matters after /clear

(long implementation session)
/clear
me: We just added retry logic to src/api-client.ts. /commits push

Cheap “memory” in ~20 tokens.

Auto-compression exists, but it’s passive

Claude Code does compress old messages when context gets large, but:

You can’t control when or what gets compressed
It’s a safety net, not a strategy
It still carries more than necessary

The fundamental limitation:

Human brain:  knows what context matters → /clear at the right time
Claude Code:  has no concept of "this context is no longer useful"
LLM:          sees everything it's given, can't ask for less

So yes — in short, you are the context manager. /clear at natural task boundaries is the only real lever you have today for token cost reduction.

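"You are the context manager." I thought that was a nicely succinct way to put it. Claude Code has no notion of "this context is no longer needed". Auto-compression exists, but it is a safety net, not a strategy. In the end, only a human can judge where a task ends and cut the context there.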

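Context isolation with agent teams

Everything so far was about managing context by hand with /clear. With agent teams, though, each child agent has its own independent context. I asked whether that amounts to an automatic /clear boundary.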

Takazudo

Well, but we have agent teams. With agent teams, the manager holds the overall memory (although we can move the reports out to text files or something like that). That is effective for running long operations at lower cost, right? For example, if one session holds everything, it burns through tokens like crazy.

Claude Code

Exactly right. Agent teams are essentially architectural context isolation.

Single session — everything accumulates:

[system prompt] + [task1 reads/edits/tests] + [task2 reads/edits/tests] + [task3...]
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                   This grows and grows. Every round-trip re-sends it all.

Agent team — each agent has its own isolated context:

Manager context:
  [system prompt] + [plan] + [summary from agent1] + [summary from agent2] + ...
                              ~100 tokens            ~100 tokens

Agent 1 context (temporary):
  [system prompt] + [task1 instructions] + [task1 reads/edits/tests] → dies

Agent 2 context (temporary):
  [system prompt] + [task2 instructions] + [task2 reads/edits/tests] → dies

Each child agent is like an automatic /clear boundary. The heavy implementation context (file reads, diffs, test outputs, retries) lives and dies within that agent. The manager only receives a short summary.

Concrete token comparison for a 3-feature implementation:

Approach	Manager context at end	Total tokens consumed
Single session	~300-500K+ (everything)	Very high (re-sent every turn)
Agent team	~30-50K (plan + summaries)	Lower (heavy work is isolated)

The key insight:

Single session:  YOU must /clear to manage context
Agent team:      Agent boundaries ARE the /clear

Each agent spawn is a natural context boundary that you don’t have to manage manually. The manager stays lean.

Tradeoffs — it’s not free:

Each agent pays the system prompt cost (~10K) on spawn
If the manager micromanages (many back-and-forth messages), its context still grows
Agent spawning has latency overhead

So it’s most effective when tasks are independent and substantial — worth the spawn cost, and the isolation saves far more than the overhead.

This is exactly why the “big-plan” strategy (break into issues, assign to agents) works well both organizationally and economically.

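"Agent boundaries ARE the /clear." This ties in with the /big-plan strategy from the previous article: split the work into issues in a planning session, then execute them with an agent team. Each agent runs in its own context, and the manager receives only summaries. As a result, you no longer even need to /clear by hand.

So, where the previous article went as far as "a session is working memory, not a log" and "close it once planning is done", this one got more concrete: a comparison of /clear patterns, and how agent teams act as automatic context boundaries. In short, managing token cost is the human's job, but agent teams let you address that burden architecturally.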
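The trade-off between a single session and an agent team can be sketched with a toy cost model. Everything here is an assumption made for illustration: the function names, the 200-token task instructions, the 1K-token plan, the 100-token summaries, and the rule that the manager makes exactly one request per agent. Real agent teams will behave differently.

```python
SYSTEM_PROMPT = 10_000   # assumed tokens, paid on every request


def run(round_trips: int, start_history: int = 0, tokens_per_step: int = 2_000) -> int:
    """Input tokens for a run in which each round-trip re-sends the full history."""
    history, total = start_history, 0
    for _ in range(round_trips):
        total += SYSTEM_PROMPT + history
        history += tokens_per_step
    return total


def single_session(tasks: int, steps: int) -> int:
    """All tasks share one ever-growing history."""
    return run(tasks * steps)


def agent_team(tasks: int, steps: int, plan: int = 1_000, summary: int = 100) -> int:
    """Each task runs in a fresh agent; the manager only keeps the plan and summaries."""
    total = 0
    manager_history = plan
    for _ in range(tasks):
        total += run(steps, start_history=200)     # agent starts from its task instructions
        total += SYSTEM_PROMPT + manager_history   # one manager request per agent
        manager_history += summary                 # only the short summary is kept
    return total


for steps in (1, 20):
    print(f"3 tasks x {steps} round-trips: "
          f"single={single_session(3, steps):,} team={agent_team(3, steps):,}")
```

In this model, three substantial tasks of 20 round-trips each cost less than half as much with an agent team, while three trivial one-step tasks cost more, because every spawn and every manager request pays the system prompt again. That matches the point that isolation pays off when tasks are independent and substantial.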
