feat(codex): surface pre-turn projection accounting (#80765)#80778

Open
aiZKP wants to merge 1 commit into
openclaw:mainfrom
aiZKP:fix/codex-projection-accounting-80765

Conversation

@aiZKP aiZKP commented May 11, 2026

Summary

Closes #80765.

Codex's context-engine projection previously sized the rendered prompt with the
generic 4 chars/token heuristic and exposed nothing about that estimate
downstream. Status/LCM diagnostics could not separate frontier tokens
selected by the context engine, rendered Codex projection chars/tokens
before send, and provider-observed usage after the turn.

This PR adds a small pre-turn accounting snapshot to the projection and routes
it into agent telemetry:

  • projectContextEngineAssemblyForCodex now returns a stats block:
    • projectedPromptChars — length of the rendered Codex prompt
    • promptTokens — tokenizer-backed when supplied, heuristic otherwise
    • accounting: "estimated" | "exact" — explicit marker
    • capChars — active rendered-context cap (currently 24_000)
    • reserveTokens — surfaced when the caller routes the configured
      agents.defaults.compaction.reserveTokens /
      reserveTokensFloor through
  • An optional tokenize?: (text: string) => number | undefined parameter
    lets a future Codex app-server / provider tokenizer flip the marker to
    exact without changing call sites. Throwing or non-finite returns fall
    back to the heuristic.
  • run-attempt.ts resolves agents.defaults.compaction.reserveTokens
    (falling back to reserveTokensFloor) and emits a new
    codex_app_server.context_projection agent event before turn/start
    on both the context-engine and mirrored-history projection paths.
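
The stats block and tokenizer seam described above could look roughly like the sketch below. It is illustrative only and not the code in this PR: `computeProjectionStats`, the `ProjectionStats` type name, and the rounding with `Math.ceil` are assumptions; the field names, the 4 chars/token heuristic, the 24_000 cap, and the fallback rule for throwing or non-finite tokenizers come from the description.

```typescript
// Hypothetical sketch of the accounting seam; helper and type names are assumed.
type ProjectionAccounting = "estimated" | "exact";

interface ProjectionStats {
  projectedPromptChars: number;
  promptTokens: number;
  accounting: ProjectionAccounting;
  capChars: number;
  reserveTokens?: number;
}

const CHARS_PER_TOKEN = 4; // generic heuristic named in the PR description
const MAX_RENDERED_CONTEXT_CHARS = 24_000; // current rendered-context cap

function computeProjectionStats(
  prompt: string,
  options: {
    tokenize?: (text: string) => number | undefined;
    reserveTokens?: number;
  } = {},
): ProjectionStats {
  let exact: number | undefined;
  try {
    const counted = options.tokenize?.(prompt);
    // Only a finite number counts as an exact result.
    if (typeof counted === "number" && Number.isFinite(counted)) {
      exact = counted;
    }
  } catch {
    // A throwing tokenizer falls back to the heuristic below.
  }

  const stats: ProjectionStats = {
    projectedPromptChars: prompt.length,
    // Rounding up is an assumption of this sketch.
    promptTokens: exact ?? Math.ceil(prompt.length / CHARS_PER_TOKEN),
    accounting: exact === undefined ? "estimated" : "exact",
    capChars: MAX_RENDERED_CONTEXT_CHARS,
  };
  if (options.reserveTokens !== undefined) {
    stats.reserveTokens = options.reserveTokens;
  }
  return stats;
}
```

With no tokenizer the result is marked `estimated`; passing any tokenizer that returns a finite number flips it to `exact` without touching the call site.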

Existing behavior (24k char cap, prompt rendering, duplicate trailing-prompt
trim, developer-instruction addition, prePromptMessageCount) is unchanged.

Acceptance criteria

  • Native Codex projection reports pre-turn exact tokens when a tokenizer
    is supplied; otherwise marks accounting as estimated.
  • Diagnostics can distinguish:
    • LCM/frontier tokens selected by the context engine
      (frontierTokens on the emitted event, equal to contextTokenBudget)
    • rendered Codex projection chars/tokens before send
      (projectedPromptChars / promptTokens / accounting)
    • provider-observed prompt/input tokens after the turn
      (existing afterTurn runtimeContext.lastCallUsage / promptCache)
  • Tests cover the estimate-vs-exact marker and ensure configured reserve
    fields surface through projection stats.
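
To make the three-way distinction concrete, a diagnostic consumer might line the numbers up as in the sketch below. This is an assumption-laden illustration: `TurnAccounting`, `describeAccounting`, and `observedInputTokens` are hypothetical names, and the real post-turn value would come from the existing `lastCallUsage` data rather than this shape.

```typescript
// Hypothetical diagnostic helper; type and field names are illustrative.
interface TurnAccounting {
  frontierTokens: number; // selected by the context engine
  promptTokens: number; // rendered projection, before send
  accounting: "estimated" | "exact";
  observedInputTokens?: number; // provider-observed, after the turn (assumed field)
}

function describeAccounting(turn: TurnAccounting): string {
  const parts = [
    `frontier=${turn.frontierTokens}`,
    `projected=${turn.promptTokens} (${turn.accounting})`,
  ];
  if (turn.observedInputTokens !== undefined) {
    // Drift shows how far the pre-turn figure was from provider usage.
    const drift = turn.observedInputTokens - turn.promptTokens;
    parts.push(
      `observed=${turn.observedInputTokens}`,
      `drift=${drift >= 0 ? "+" : ""}${drift}`,
    );
  }
  return parts.join(" ");
}

console.log(
  describeAccounting({
    frontierTokens: 8000,
    promptTokens: 6000,
    accounting: "estimated",
    observedInputTokens: 6450,
  }),
);
// prints "frontier=8000 projected=6000 (estimated) observed=6450 drift=+450"
```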

Files touched

| File | Lines | Purpose |
| --- | --- | --- |
| extensions/codex/src/app-server/context-engine-projection.ts | +95 | Stats type, tokenizer seam, accounting marker |
| extensions/codex/src/app-server/run-attempt.ts | +43 | Reserve resolver, projection event emit |
| extensions/codex/src/app-server/context-engine-projection.test.ts | +79 / -1 | 5 new tests for stats / marker / reserve |

No SDK contract, no public manifest, no docs/changelog surface changed.

Test plan

  • pnpm test extensions/codex/src/app-server/context-engine-projection.test.ts: 10 passed (5 new + 5 existing)
  • pnpm test extensions/codex/src/app-server/run-attempt.context-engine.test.ts: 6 passed
  • pnpm check:changed (extension prod + extension test lanes) — typecheck, oxlint, format, runtime sidecar guard, import-cycle check all green
  • One unrelated test (run-attempt.test.ts > does not expose OpenClaw Tool Search controls through Codex dynamic tools) times out — verified to fail identically on main without these changes, so it is pre-existing and unrelated to this PR.

Notes for reviewers

  • The projection cap (MAX_RENDERED_CONTEXT_CHARS = 24_000) is intentionally
    unchanged here. Making it budget-aware via contextTokenBudget /
    reserveTokens is tracked by fix(codex): scale context engine projection #80761; this PR is the accounting
    follow-up only.
  • The new tokenize parameter is a no-op until a Codex/provider tokenizer
    is wired in. The acceptance criterion ("exact when the runtime/tokenizer
    surface supports it") is satisfied by the seam plus the explicit
    estimated marker; no behavior change for current callers.
  • The emitted event uses Record<string, unknown> — consumers that already
    subscribe to onAgentEvent see a new stream value but the existing
    envelope shape is preserved.

Refs: #80765

Adds a `stats` block to the Codex context-engine projection so callers can
distinguish LCM/frontier sizing from the rendered Codex prompt and from
post-turn provider-observed usage. The block carries `projectedPromptChars`,
`promptTokens`, an `accounting: "estimated" | "exact"` marker, the active
`capChars`, and (when routed through) the configured compaction
`reserveTokens` knob.

The projection accepts an optional `tokenize` callback so a provider/runtime
tokenizer can flip stats to `exact` when available; without one the existing
4-chars/token heuristic is used and accounting is explicitly marked
`estimated`. The Codex app-server run-attempt now resolves
`agents.defaults.compaction.reserveTokens` (falling back to
`reserveTokensFloor`) and emits a `codex_app_server.context_projection`
telemetry event alongside the existing post-turn usage signals.
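
The reserve fallback and event emission described above might be wired along these lines. This is a sketch under stated assumptions, not the PR's code: `resolveReserveTokens`, `buildProjectionEvent`, and the `{ stream, data }` envelope are hypothetical; only the config path, the fallback order, the event name, and the `Record<string, unknown>` payload type are taken from the description.

```typescript
// Hypothetical config and event shapes; only the config path and event name
// come from the PR description.
interface CompactionConfig {
  reserveTokens?: number;
  reserveTokensFloor?: number;
}

interface AgentConfig {
  agents?: { defaults?: { compaction?: CompactionConfig } };
}

// Prefer reserveTokens, then fall back to reserveTokensFloor.
function resolveReserveTokens(config: AgentConfig): number | undefined {
  const compaction = config.agents?.defaults?.compaction;
  const candidates = [compaction?.reserveTokens, compaction?.reserveTokensFloor];
  for (const candidate of candidates) {
    if (typeof candidate === "number" && Number.isFinite(candidate)) {
      return candidate;
    }
  }
  return undefined;
}

function buildProjectionEvent(input: {
  frontierTokens: number;
  projectedPromptChars: number;
  promptTokens: number;
  accounting: "estimated" | "exact";
  capChars: number;
  reserveTokens?: number;
}): { stream: string; data: Record<string, unknown> } {
  const data: Record<string, unknown> = {
    frontierTokens: input.frontierTokens,
    projectedPromptChars: input.projectedPromptChars,
    promptTokens: input.promptTokens,
    accounting: input.accounting,
    capChars: input.capChars,
  };
  // Omit the reserve field entirely when nothing is configured.
  if (input.reserveTokens !== undefined) {
    data.reserveTokens = input.reserveTokens;
  }
  return { stream: "codex_app_server.context_projection", data };
}
```

Keeping the payload as a plain record means existing subscribers only see a new stream value, which matches the reviewer note about the envelope shape being preserved.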

Closes openclaw#80765
@openclaw-barnacle openclaw-barnacle Bot added extensions: codex size: M triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 11, 2026
Contributor

clawsweeper Bot commented May 11, 2026

Codex review: needs real behavior proof before merge.

Summary
Adds Codex context projection stats, an exact-vs-estimated token accounting marker, reserve-token surfacing, and a codex_app_server.context_projection agent event with projection tests.

Reproducibility: not applicable as a bug reproduction; this is a feature/observability PR. Source inspection of current main confirms the projection path lacks stats and emits no pre-turn projection accounting event.

Real behavior proof
Needs real behavior proof before merge: only test/check claims are supplied. This needs redacted terminal output, logs, or a recording from a real Codex app-server run showing the new projection event. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, ask a maintainer to comment @clawsweeper re-review.

Next step before merge
External feature PR needs maintainer judgment on the telemetry/tokenizer contract and real behavior proof from the contributor; there is no narrow repair task for automation.

Security
Cleared: The diff adds local telemetry fields and tests without new dependencies, workflows, secret handling, package resolution, or code-execution surfaces.

Review details

Best possible solution:

Land a focused Codex-owned projection accounting seam only after maintainer review accepts the event shape and the contributor adds redacted real-run proof showing the new telemetry path.

Do we have a high-confidence way to reproduce the issue?

Not applicable as a bug reproduction: this is a feature/observability PR. Source inspection of current main confirms the projection path lacks stats and emits no pre-turn projection accounting event.

Is this the best way to solve the issue?

Unclear until the Codex telemetry/tokenizer contract is accepted by maintainers. The patch is narrow and no blocking code defect was found, but it should be backed by real behavior proof and should keep exact-token claims scoped to a future authoritative tokenizer.

What I checked:

  • Current main lacks projection stats: On current main, CodexContextProjection only returns developer instruction additions, prompt text, assembled messages, and prePromptMessageCount; there is no stats block, token count, accounting marker, or reserve field. (extensions/codex/src/app-server/context-engine-projection.ts:3, c6eefd9f4d59)
  • Current main has no projection event: The current Codex app-server path calls projectContextEngineAssemblyForCodex, uses the prompt text, and proceeds to prompt build without emitting a projection accounting event. (extensions/codex/src/app-server/run-attempt.ts:632, c6eefd9f4d59)
  • Patch adds the requested telemetry seam: The PR patch adds CodexContextProjectionStats, a tokenize callback, estimated fallback logic, reserve-token surfacing, and emits codex_app_server.context_projection from both context-engine and mirrored-history projection paths. (extensions/codex/src/app-server/context-engine-projection.ts:1, 304379c78965)
  • Linked issue asks for this accounting feature: The linked issue describes the remaining observability gap after the budget-aware projection work: pre-turn projection currently uses a 4 chars/token estimate and cannot distinguish frontier tokens, projected prompt sizing, and provider-observed usage before/after the turn.
  • Current app-server protocol lacks a tokenizer endpoint: The current Codex app-server request result map lists thread, turn, model, plugin, skill, and related methods, but no token-count or tokenizer preflight method, so the exact-token path remains a seam until an authoritative runtime API exists. (extensions/codex/src/app-server/protocol.ts:482, c6eefd9f4d59)
  • Real behavior proof is not supplied: The PR body reports unit tests and check:changed, and the PR has the triage: needs-real-behavior-proof label, but the body/comments do not include terminal output, logs, recording, or another real Codex app-server run showing the new event after the change.

Likely related people:

  • @jalehman: Commit history for extensions/codex/src/app-server/context-engine-projection.ts shows commit 51186d2725439646f7fea0e59bb466f957404c46 added the Codex app-server context-engine lifecycle, projection, and related coverage. (role: introduced current projection behavior; confidence: high; commits: 51186d272543; files: extensions/codex/src/app-server/context-engine-projection.ts, extensions/codex/src/app-server/run-attempt.ts, docs/plan/codex-context-engine-harness.md)
  • @steipete: Recent file history for run-attempt.ts and the projection helper includes Codex app-server changes and refactors such as 694f40fceeea164674f6a95a65ff607d88a0490c, d2f578cbb4ee47b8db1c1e79741b91c019d74eef, and 1c76065ccd29104901a861574bfc56a10e3d9d44. (role: recent area contributor; confidence: high; commits: 694f40fceeea, d2f578cbb4ee, 1c76065ccd29; files: extensions/codex/src/app-server/run-attempt.ts, extensions/codex/src/app-server/context-engine-projection.ts)
  • @obviyus: Recent history for run-attempt.ts includes c529ab29c2228fa37ddf1a8e76b0559805e16d59, which preserved current-turn context in the same prompt-build area that this PR instruments. (role: adjacent current-turn context contributor; confidence: medium; commits: c529ab29c222; files: extensions/codex/src/app-server/run-attempt.ts)

Remaining risk / open question:

  • No real after-fix behavior proof is present from a real Codex app-server run showing the new projection event or stats in operation.
  • The exact-token path is only an optional seam until an authoritative Codex tokenizer or app-server token-count contract is wired in.

Codex review notes: model gpt-5.5, reasoning high; reviewed against c6eefd9f4d59.

Development

Successfully merging this pull request may close these issues.

Codex context-engine projection lacks exact pre-turn token accounting
