Additional

Session Runtime

Synced from github.com/CoWork-OS/CoWork-OS/docs

SessionRuntime is the canonical owner of mutable task-session state for execution-oriented work. It sits between TaskExecutor and the lower-level turn executor, so the executor keeps task lifecycle responsibilities while the runtime owns the session mirror, permission state, persistence, and recovery behavior.

What It Owns

SessionRuntime groups all mutable session state into explicit buckets:

transcript: conversation history, latest user and assistant outputs, chat summary blocks, and step outcome summaries
tooling: tool failure tracking, tool usage counters, tool-result memory, web evidence memory, and visible-tool render caching
files: file-read tracking and file-operation tracking
loop: turn counts, continuation counters, compaction counters, loop fingerprints, and soft-deadline flags
recovery: retry and failure-signature state, recovery classification, and tool-disable scopes
queues: pending follow-ups and step-feedback signals
worker: mentioned-agent dispatch state and verification-agent state
verification: verification evidence and failed-step tracking
checklist: session-local execution checklist items, nudge state, and checklist timestamps
promptCache: stable system blocks, stable-prefix hash, tool-schema hash, provider-family mode, and invalidation reason
usage: cumulative token and cost tracking plus usage offsets
permissions: default mode, session-local rules, temporary grants, denial counters, and latest prompt context

This is the state that used to be mirrored across the executor and related helpers. It now lives in one place so task resume, retry, and completion logic read the same source of truth.

Managed Session Relationship

Managed Agents adds a durable API-facing run object, ManagedSession, above the existing runtime.

The ownership split is:

ManagedSession is the stable control-plane resource for reusable runs, event history, and resume/cancel semantics
Task remains the execution worker that owns SessionRuntime
SessionRuntime still owns mutable task-session state and session_runtime_v2 snapshots

In practice, a managed session points at one backing task and optionally one backing team run. Managed-session events mirror sanitized task and daemon events so UI/backend consumers can observe one durable session stream without bypassing the task runtime.

Tool Availability and Rendering

SessionRuntime also owns the runtime-facing available-tools path used by execution and follow-up turns.

The sequence is:

start from the tool catalog
apply restriction, allowlist, policy, mode, and intent filters
render prompt-aware descriptions only for the remaining visible tools
cache the rendered result for repeated turns under the same visible-tool and render-context state

This means tool-local guidance is attached only to tools the current task can actually see.

The render-context inputs include:

execution mode
task domain
web-search mode
shell availability
agent type
worker role
user-input allowance

The provider-facing tool shape remains unchanged after rendering; providers still receive only name, rendered description, and input_schema.

Prompt Cache State

SessionRuntime persists the stable prompt-cache context needed to reuse provider-side prompt prefixes across turns and restarts.

It tracks:

stable session-scoped systemBlocks
stablePrefixHash
toolSchemaHash
prompt-cache mode and provider family
the latest invalidation reason when the stable prefix changes

This lets CoWork reuse the same cacheable prefix after follow-ups, resume from session_runtime_v2 snapshots without rebuilding it blindly, and explain why a cache epoch changed when model, provider family, tool schema, or stable prompt contracts changed.

Public Surface

The runtime exposes a narrow API for the executor and task orchestration layers:

runStepLoop(...)
runFollowUpLoop(...)
runTextLoop(...)
queueFollowUp(...)
setStepFeedback(...)
drainAllPendingFollowUps(...)
maybeAutoContinueAfterTurnLimit(...)
continueAfterBudgetExhausted(...)
resetForRetry(...)
saveSnapshot(...)
restoreFromEvents(...)
projectTaskState(...)
createTaskList(...)
updateTaskList(...)
listTaskList(...)
getTaskListState(...)
clearTaskListVerificationNudge()
getOutputState(...)
getVerificationState(...)
getRecoveryState(...)
getPermissionState(...)
applyWorkspaceUpdate(...)

The executor still owns task bootstrap, plan construction, completion policy, UI/daemon updates, and final result projection. It now delegates session state changes instead of duplicating them.

Session Checklist Primitive

SessionRuntime now owns a lightweight session checklist that is separate from:

plan steps
shared team checklists
any database-backed durable checklist model

The checklist is session-scoped, agent-managed, and persisted only inside session_runtime_v2 snapshots and checklist events. It exists to keep execution disciplined during multi-step work without creating a second planning system.

Tool surface

Execution-capable paths expose three runtime tools:

task_list_create
task_list_update
task_list_list

They are intentionally unavailable in chat, plan, analyze, and other non-executing paths.

Runtime semantics

task_list_create creates the first ordered checklist and fails if one already exists
task_list_update replaces the full ordered state
existing items keep their ids when the caller supplies them
new items may omit id and the runtime generates one
kind defaults to implementation
create and update reject empty lists, duplicate ids, and more than one in_progress item

Verification nudge algorithm

The checklist can raise a non-blocking verification reminder when all of the following are true:

a session checklist exists
at least one checklist item has kind: "implementation"
every implementation item is completed
no checklist item has kind: "verification"
the current plan does not already contain a verification step
the task is not already on a verified path with explicit verification coverage

When that happens:

task_list_update returns verificationNudgeNeeded: true
task_list_create and task_list_update can append an immediate human-readable reminder in the tool result payload seen by the model
SessionRuntime persists that flag in checklist state
the runtime emits task_list_verification_nudged
next-turn preparation injects a short reminder to add or run a verification item before final completion

The reminder is advisory. It does not fail completion by itself, and the post-hoc verification runtime remains the final enforcement layer.

UI/event model

Checklist state is replayable from events alone. The runtime emits:

task_list_created
task_list_updated
task_list_verification_nudged

Each payload carries the full current checklist snapshot so the renderer can reconstruct the latest read-only checklist state without a separate query path.

Turn Execution Flow

TaskExecutor
  -> SessionRuntime
      -> TurnKernel for the active step / follow-up / text turn
      -> provider-aware output-budget policy
      -> adaptive truncation retry helpers
      -> session state updates
      -> task projection updates
      -> snapshot save / restore

The turn kernel is still responsible for a single turn of execution. SessionRuntime is responsible for choosing when to run it, which state to feed into it, and how to persist the results.

Adaptive Output Budgeting

When COWORK_LLM_OUTPUT_POLICY=adaptive is enabled, execution and follow-up turns resolve output limits through one shared provider-aware policy path instead of relying on provider defaults or a single tool-call floor.

Request kinds

The runtime resolves a budget for three request kinds:

agentic_main: the first execution turn for a step or follow-up
tool_followup: the next turn after tool results are present in the transcript
continuation: explicit continuation-only retries after a visible truncation

Policy shape

For the current internal rollout, the runtime uses provider-family defaults for the mainstream routes first:

Anthropic-family routes
Bedrock Claude routes
OpenAI routes
Azure OpenAI routes
Gemini routes
OpenRouter routes
generic fallback for everything else

Default behavior:

main execution turns start with 8000
tool-follow-up turns start with 16000
if a main/tool turn is truncated, the runtime retries the same request once with a higher budget before injecting any continuation prompt
the escalated target is typically 48000
Anthropic-family routes can escalate to 64000
generic fallback escalation is capped at 16000

The runtime always sends an explicit output limit for agentic turns in adaptive mode. It does not depend on provider-side defaults because gateways and compatible endpoints vary too much.

Budget selection

The runtime resolves the outbound limit in this order:

task-level agentConfig.maxTokens, when present
COWORK_LLM_MAX_OUTPUT_TOKENS
provider-family defaults for the current request kind
final clamping by known hard caps and context headroom

Known provider caps remain a safety rail even when task-level or env-level overrides request more.

Truncation handling

If a turn stops because it hit the output limit, the runtime classifies the result as one of:

visible_partial_output: the model produced usable answer text but was cut off
reasoning_exhausted: the model spent the budget on internal reasoning or otherwise produced no usable answer text

Recovery order:

same-request escalation once
if still truncated and visible partial output exists, fall back to the normal continuation prompt path
if still truncated and the result is reasoning-only, stop burning continuation retries and surface a targeted guidance message instead

This keeps continuation retries focused on recoverable truncation and avoids loops where the model repeatedly spends the entire budget without producing visible output.

Logging and rollout controls

The runtime logs output-budget decisions on each agentic call, including:

provider family
request kind
chosen budget
transport token field
cap source
escalation attempted
truncation classification
whether continuation fallback was allowed or skipped

Internal rollout is controlled by environment flags:

COWORK_LLM_OUTPUT_POLICY=legacy|adaptive
COWORK_LLM_MAX_OUTPUT_TOKENS
COWORK_LLM_AGENTIC_INITIAL_MAX_TOKENS
COWORK_LLM_AGENTIC_ESCALATED_MAX_TOKENS

COWORK_LLM_OUTPUT_POLICY defaults to legacy unless explicitly enabled. COWORK_LLM_TOOL_RESPONSE_MAX_TOKENS remains a legacy compatibility input and is no longer the primary path in adaptive mode.

Snapshot And Restore Algorithm

Persisted task state uses the legacy conversation_snapshot event name for compatibility, but the payload is now versioned as session_runtime_v2.

Write path

SessionRuntime captures the full runtime snapshot.
It writes a conversation_snapshot event with:
- schema: "session_runtime_v2"
- version: 2
- transcript, tooling, files, loop, recovery, queues, worker, verification, checklist, prompt-cache, usage, and permissions state
TaskExecutor no longer writes the payload directly.

Restore path

restoreFromEvents() follows a strict precedence order:

Load a V2 checkpoint payload, if present.
Otherwise load the latest V2 conversation_snapshot event payload.
Otherwise restore from a legacy checkpoint payload that still contains conversationHistory.
Otherwise restore from a legacy conversation_snapshot event payload with conversationHistory.
Otherwise rebuild a readable fallback conversation from task events.

If a legacy payload is restored, the runtime writes back a V2 snapshot on the next checkpoint so future resumes use the canonical schema.

Why the order matters

Checkpoints are the most recent durable session state.
Event payloads provide a replayable source of truth when a checkpoint is absent or stale.
Legacy payloads remain readable so older tasks can still resume.
Event-derived fallback keeps very old or partially migrated tasks usable even when no snapshot payload is available.

Task Projection

projectTaskState() exposes the runtime-owned task fields that still need to be reflected on the task row:

budget usage
continuation count and window
lifetime turns used
compaction count and last compaction markers
no-progress streak
last loop fingerprint

This keeps the task row synchronized with the runtime without copying the rest of the session state into the database row itself.

Turn-Budget Ownership

SessionRuntime and the executor now treat window turn caps as optional runtime state.

normal interactive tasks have no implicit strategy-derived window cap
explicit agentConfig.maxTurns or agentConfig.windowTurnCap opt a task into a bounded window
turnBudgetPolicy is only meaningful when one of those explicit caps exists

Runtime telemetry reflects that distinction:

turn_policy_selected records whether a task is effectively uncapped or using an explicit window cap
turn_window_soft_exhausted only appears when an explicit adaptive window is actually configured
Global turn limit exceeded: X/X is now reserved for tasks that intentionally opted into capped behavior

Uncapped tasks still project and enforce:

budget usage
continuation count and window
lifetime turns used
emergency fuse thresholds
loop and no-progress safety state

Terminal Status Synchronization

The runtime boundary is also responsible for keeping the task row and the event stream in the same terminal state.

Completion algorithm

The executor computes the final outcome and terminal metadata.
The daemon persists the task row first, including status, completedAt, terminalStatus, failureClass, and completion summaries.
Only after the row update succeeds does the daemon emit task_completed or the final task_status event.

This ordering ensures the event stream never advertises completion ahead of the durable task row update.

Resume safety algorithm

Late resume calls can happen after approvals, structured-input responses, or renderer-side event handling. To prevent a finished task from being reopened accidentally:

resume callers ask the daemon to resume instead of writing executing themselves
the daemon re-reads the persisted task row and derives the canonical lifecycle state
if the task is already terminal, resume is rejected and no state is rewritten
if the task is already executing, the daemon skips duplicate executing writes
only a genuinely resumable non-terminal task is moved back to executing

This protects terminal fields from stale post-resume writes and keeps task rows aligned with task_completed events even when approval or follow-up flows resolve very quickly.

Workspace Refresh

When the active workspace changes, SessionRuntime swaps to the new tool registry, invalidates tool availability caches, and preserves the live transcript and loop state. This lets the task continue without resetting the session mirror.

It also clears workspace-derived permission cache entries so the next evaluation uses the refreshed workspace rule set.

Security Guide

Showcase

Was this page helpful?Edit this page on GitHub