Project Status

Synced from github.com/CoWork-OS/CoWork-OS/docs

Production-Ready Implementation

CoWork OS is a security-first personal AI assistant platform with multi-channel messaging support, comprehensive guardrails, and extensive test coverage.

What CoWork OS Is

Personal AI Gateway: Connect your AI assistant to WhatsApp, Telegram, Discord, Slack, and iMessage
Everything Workbench: Create, open, review, lightly edit, and revise generated documents, spreadsheets, presentations, web pages, PDFs, and previews from the same local-first task workspace
Managed Devices: Operate local and remote CoWork machines from a dedicated Devices tab
Automations Surface: One settings group for queueing, scheduling, triggers, briefing, and Workflow Intelligence suggestions/reflection; the task view can also create cron-scheduled tasks from the current task menu
Renderer Performance: Sidebar and timeline virtualization in the CoWork-OS/CoWork-OS repo uses @chenglou/pretext for text measurement and keeps long task feeds responsive
Security-First Design: 4,932 automated tests across 390 test files, configurable guardrails, layered permission rules, workspace-local policy files, and approval workflows
Imported Capability Security: managed skill and pack imports are staged, scanned, reported, and quarantined when blocked instead of being activated directly
Multi-Provider Support: 30+ LLM providers including free local models via Ollama
Local-First Architecture: Your data stays on your machine, BYOK model

What's Built and Working

1. Core Architecture

Reliability Flywheel (Eval + Risk Gates)

Eval corpus extraction from failed/partial tasks (scripts/qa/build_eval_corpus.cjs)
Deterministic eval suite replay runner (scripts/qa/run_eval_suite.cjs)
Eval schema and task metadata (eval_cases, eval_suites, eval_runs, eval_case_runs, task risk/eval columns)
Eval service and IPC endpoints (eval:listSuites, eval:runSuite, eval:getRun, eval:getCase, eval:createCaseFromTask)
Risk scoring and policy-driven tiered review gate (off, balanced, strict)
Prompt reliability hardening (modular prompt sections, shared policy dedupe, token budgets)
Session- and turn-scoped prompt section memoization for execution and follow-up prompt assembly
Provider-aware prompt caching with stable-prefix hashing, Anthropic/OpenRouter/OpenAI-family routing, and cache telemetry
Prompt-aware tool rendering with compact planning text and provider-facing description reuse
Skill shortlist routing with low-confidence fallback and text budget caps
Earlier verification nudges through checklist tool-result reminders and pre-finalization prompt reminders
PR regression policy gate for production incident fixes
Nightly hardening workflow with machine-readable report artifact
Release hardening gate (date-based strictness window)
Local-only reliability data policy (no required telemetry upload path)
Reference: docs/reliability-flywheel.md
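
The tiered review gate (off, balanced, strict) can be pictured with a minimal sketch. The score range, thresholds, decision names, and function name below are assumptions for illustration only; the actual policy lives in the reliability flywheel code and docs.

```typescript
// Hypothetical sketch: thresholds and names are illustrative, not CoWork OS's real values.
type ReviewGateMode = "off" | "balanced" | "strict";
type ReviewDecision = "skip_review" | "light_review" | "full_review";

interface TaskRisk {
  score: number; // assumed range: 0 (safe) to 1 (high risk)
}

// Assumed thresholds per mode; "off" never triggers a review.
const THRESHOLDS: Record<"balanced" | "strict", { light: number; full: number }> = {
  balanced: { light: 0.4, full: 0.75 },
  strict: { light: 0.2, full: 0.5 },
};

function decideReview(mode: ReviewGateMode, risk: TaskRisk): ReviewDecision {
  if (mode === "off") return "skip_review";
  const t = THRESHOLDS[mode];
  if (risk.score >= t.full) return "full_review";
  if (risk.score >= t.light) return "light_review";
  return "skip_review";
}
```

Under these assumed thresholds, the same risk score can pass unreviewed in balanced mode and still be escalated in strict mode.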

Database Layer

SQLite schema with 6 tables (workspaces, tasks, events, artifacts, approvals, skills)
Repository pattern for data access
Type-safe database operations
Located: src/electron/database/

Agent System

AgentDaemon - Main orchestrator with worktree isolation and collaborative mode
TaskExecutor - Shared turn kernel, metadata-driven tool scheduler, delegated-work orchestration, and terminal-state-safe completion/resume handoff
SessionRuntime - Canonical owner for task-session state, session checklists, snapshots, recovery, and task projection
Prompt-cache runtime state - stable system blocks, stable-prefix hashing, provider-family mode tracking, and resume-safe cache invalidation
ExecutorEventEmitter - Typed event system for executor lifecycle
LifecycleMutex - Concurrency control for executor operations
Tool Registry - Manages all available tools and scheduler metadata
Tool prompt layer - Internal prompt metadata renders visible-tool guidance after filtering without changing provider schemas
Session checklist runtime tools - task_list_create, task_list_update, and task_list_list for execution-style tasks
Orchestration graph engine - Normalized delegation runs for spawn_agent, workflow phases, teams, and ACP tasks
Worker roles - researcher, implementer, verifier, and synthesizer with hard tool scopes
Structured delegation brief - child tasks inherit objective, scope, evidence, deliverable, and completion contracts
Permission system with layered rules, workspace policy files, and approval flow
Context Manager - Conversation context handling
Capability Matcher - Auto-select agents based on task requirements
Located: src/electron/agent/
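
To illustrate the kind of concurrency control a component like LifecycleMutex provides, here is a minimal promise-chain mutex. The class name SimpleMutex and its runExclusive method are hypothetical; this is a sketch of the general technique, not the project's implementation.

```typescript
// Hypothetical sketch of a promise-chain mutex; not the actual LifecycleMutex API.
class SimpleMutex {
  private tail: Promise<void> = Promise.resolve();

  // Runs `fn` only after every previously queued operation has settled.
  runExclusive<T>(fn: () => Promise<T>): Promise<T> {
    const result = this.tail.then(fn);
    // Swallow the outcome on the internal chain so a rejection
    // does not block operations queued later.
    this.tail = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  }
}
```

Operations such as pause, resume, and cancel could be wrapped in runExclusive so that they never interleave, even when they are triggered at nearly the same time.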

Multi-Provider LLM Support

Anthropic (Claude models)
Google Gemini
OpenRouter (multi-model access)
OpenAI (API Key: GPT-4o, o1 models)
OpenAI (ChatGPT OAuth: Use your ChatGPT subscription)
Prompt caching defaults for Anthropic, Azure Anthropic, OpenAI, Azure OpenAI, and OpenRouter GPT/Claude routes
AWS Bedrock
Ollama (local/free)
Provider Factory with dynamic selection
Located: src/electron/agent/llm/
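
A provider factory with dynamic selection is commonly built as a registry keyed by provider ID. The sketch below assumes a hypothetical LLMProvider interface and uses a stub provider in place of real SDK clients; none of these names are taken from the CoWork OS source.

```typescript
// Hypothetical interface and registry; names are illustrative only.
interface LLMProvider {
  readonly id: string;
  complete(prompt: string): Promise<string>;
}

type ProviderConstructor = (apiKey?: string) => LLMProvider;

class ProviderFactory {
  private readonly registry = new Map<string, ProviderConstructor>();

  register(id: string, create: ProviderConstructor): void {
    this.registry.set(id, create);
  }

  // Selects the provider at call time, so settings changes take effect immediately.
  create(id: string, apiKey?: string): LLMProvider {
    const ctor = this.registry.get(id);
    if (!ctor) throw new Error(`Unknown LLM provider: ${id}`);
    return ctor(apiKey);
  }
}

// Stub provider standing in for a real SDK client.
const echoProvider = (id: string): ProviderConstructor => () => ({
  id,
  complete: async (prompt: string) => `[${id}] ${prompt}`,
});

const factory = new ProviderFactory();
factory.register("ollama", echoProvider("ollama"));
factory.register("anthropic", echoProvider("anthropic"));
```

Adding a provider in this design means registering one more constructor; callers that only depend on the interface do not change.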

Web Search Integration

DuckDuckGo (free built-in, no API key — automatic last-resort fallback)
Tavily (AI-optimized)
Brave Search
SerpAPI (Google results)
Google Custom Search
Primary + fallback provider support
web_search tool always available (DuckDuckGo ensures zero-config search)
Located: src/electron/agent/search/
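
The primary + fallback behavior can be sketched as an ordered provider chain, with the key-free provider placed last. The SearchProvider interface and searchWithFallback function are assumptions for illustration, and treating an empty result set as a reason to fall through is a design choice of this sketch rather than documented behavior.

```typescript
// Hypothetical types; the real search layer may look different.
interface SearchResult {
  title: string;
  url: string;
}

interface SearchProvider {
  name: string;
  search(query: string): Promise<SearchResult[]>;
}

// Tries providers in order and returns the first non-empty result set.
async function searchWithFallback(
  query: string,
  providers: SearchProvider[],
): Promise<{ provider: string; results: SearchResult[] }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      const results = await p.search(query);
      if (results.length > 0) return { provider: p.name, results };
      errors.push(`${p.name}: no results`);
    } catch (err) {
      errors.push(`${p.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All search providers failed (${errors.join("; ")})`);
}
```

With a list such as [primary, fallback, duckDuckGo], a missing API key or outage on the first two providers still yields results from the last one.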

Browser Automation

Browser V2 session manager with visible workbench default
Electron Workbench CDP control through renderer-owned webview
Playwright local fallback for forced headless/background runs
External CDP attach path gated by explicit real-browser consent
Right-sidebar/fullscreen workbench routing with persistent workspace browser profile
Accessibility snapshots with short-lived refs and stale-ref validation
Visible cursor movement for agent browser actions
Screenshot capture and screenshot annotation
Console, network, downloads, storage, dialog, emulation, and trace diagnostics
Navigation, screenshots, PDF export
Ref-aware click, fill, type, read, hover, drag, upload, and press-key actions
Content extraction (text, links, forms)
Scroll, wait for elements
Located: src/electron/browser/, src/electron/agent/browser/, and src/electron/agent/tools/browser-tools.ts
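
One way to picture short-lived refs with stale-ref validation is to stamp every ref with the ID of the snapshot that issued it. The SnapshotRefs class below is a hypothetical sketch under that assumption, with plain strings standing in for accessibility nodes.

```typescript
// Hypothetical sketch; the real snapshot and ref format is not shown in this document.
interface ElementRef {
  snapshotId: number;
  index: number;
}

class SnapshotRefs {
  private currentSnapshotId = 0;
  private nodes: string[] = [];

  // Taking a new snapshot invalidates every ref issued by earlier snapshots.
  takeSnapshot(nodes: string[]): ElementRef[] {
    this.currentSnapshotId += 1;
    this.nodes = nodes.slice();
    const id = this.currentSnapshotId;
    return nodes.map((_node, index) => ({ snapshotId: id, index }));
  }

  // Resolves a ref to its node, rejecting refs from an outdated snapshot.
  resolve(ref: ElementRef): string {
    if (ref.snapshotId !== this.currentSnapshotId) {
      throw new Error("Stale ref: take a new snapshot before acting");
    }
    const node = this.nodes[ref.index];
    if (node === undefined) throw new Error("Unknown ref index");
    return node;
  }
}
```

An action such as a ref-aware click would call resolve first, so a ref captured before a navigation fails loudly instead of acting on the wrong element.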

Channel Integrations

WhatsApp bot with QR code pairing and self-chat mode
Telegram bot with commands
Discord bot with slash commands
Slack bot with Socket Mode
Session management
Security modes (pairing, allowlist, open)
Located: src/electron/gateway/
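
The three security modes can be read as three different answers to "may this sender talk to the assistant?". The sketch below assumes that pairing admits only senders who completed a pairing step, allowlist admits only preconfigured IDs, and open admits everyone; the type and function names are hypothetical.

```typescript
// Hypothetical sketch of per-channel sender authorization.
type SecurityMode = "pairing" | "allowlist" | "open";

interface ChannelPolicy {
  mode: SecurityMode;
  allowlist: Set<string>; // sender IDs permitted in allowlist mode
  pairedSenders: Set<string>; // senders that completed pairing
}

function isSenderAuthorized(policy: ChannelPolicy, senderId: string): boolean {
  switch (policy.mode) {
    case "open":
      return true;
    case "allowlist":
      return policy.allowlist.has(senderId);
    case "pairing":
      return policy.pairedSenders.has(senderId);
  }
}
```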

Composer Routing

Grouped @ autocomplete for Agents, Integrations, and Files
Configured integration mention resolver with Google Workspace split into Gmail, Google Drive, and Google Calendar
Rich inline integration chips in the composer, sent user bubbles, and restored task/session history
Soft integrationMentions runtime guidance without changing allowedTools
@Inbox routing from the main composer into Inbox Agent Ask Inbox
Ask Inbox right-sidebar chat with run-scoped live step events and matched evidence
Hybrid mailbox search architecture for Ask Inbox: local FTS, semantic mailbox index, provider-native search, attachment text, shortlist/read/rerank
Located: src/renderer/components/PromptComposerInput.tsx, src/electron/integrations/

2. Tools & Skills

File Operations (7 tools)

read_file - Read file contents
write_file - Create or overwrite files
list_directory - List folder contents
rename_file - Rename or move files
delete_file - Delete with approval
create_directory - Create folders
search_files - Search by name/content

Document Tools and Skills

Everything Workbench - shared task-output model for generated docs, sheets, decks, web pages, PDFs, and previews with compact cards, sidebar/fullscreen artifact workspaces, follow-up composer context, refresh-after-edit behavior, and external app actions for advanced native workflows
Spreadsheet - Excel .xlsx (exceljs) generation and structured preview extraction
Spreadsheet artifact workbench - compact task cards, resizable sidebar viewer, fullscreen editable grid, selection/copy/save/zoom, and follow-up composer controls
Document - Word .docx and PDF (docx, pdfkit)
Document artifact workbench - compact task cards, resizable sidebar/fullscreen viewer, direct DOCX editing, save/copy/external actions, follow-up composer controls, and best-effort preview/external handling for DOC/RTF/ODT/OTT/Pages outputs
LaTeX compilation - .tex source to PDF via system tectonic, latexmk, xelatex, lualatex, or pdflatex
Presentation - PowerPoint .pptx generation through the Codex presentation runtime with pptxgenjs fallback
Presentation artifact workbench - compact task cards, resizable sidebar/fullscreen viewer, fast text-first loading, cached slide images, navigation, zoom, speaker notes, and follow-up composer controls
Web page artifact workbench - compact task cards for generated HTML/HTM and built React output, resizable sidebar/fullscreen sandboxed iframe viewer, browser/folder/copy actions, and follow-up composer controls
Folder Organizer - By type/date
Kami - Editorial PDFs, resumes, one-pagers, diagrams, and slide decks with workspace-local scaffolding

Browser Tools (34 tools)

browser_navigate
browser_snapshot
browser_screenshot
browser_save_pdf
browser_click
browser_hover
browser_drag
browser_fill
browser_type
browser_press
browser_get_content
browser_get_text
browser_scroll
browser_wait
browser_select
browser_upload_file
browser_handle_dialog
browser_tabs
browser_switch_tab
browser_close_tab
browser_console
browser_network
browser_downloads
browser_storage
browser_emulate
browser_trace_start
browser_trace_stop
browser_evaluate
browser_back
browser_forward
browser_reload
browser_attach
browser_act_batch
browser_close

Search Tools

web_search - Multi-provider web search

Code Tools (3 tools)

glob - Fast pattern-based file search
grep - Regex content search across files
edit_file - Surgical file editing with find-and-replace

Git Tools (3 tools)

git_commit - Commit changes in workspace or worktree
git_diff - View staged/unstaged changes
git_branch - List, create, or switch branches

Web Fetch Tools (2 tools)

web_fetch - Fetch and parse web pages
http_request - Full HTTP client (curl-like)

Shell Tools

execute_command - Shell command execution (requires approval)

System Tools

take_screenshot - Full screen or specific windows
clipboard_read / clipboard_write - Clipboard access
open_application / open_url / open_path - Launch apps and URLs
show_in_finder - Reveal files in Finder
get_system_info - System information and environment

Custom Skills

User-defined reusable workflows
YAML-based skill definitions
Priority-based sorting
Parameter input modal for skill variables
Managed import scanning, persisted security reports, quarantine, and digest recheck for imported skill bundles
Located: ~/Library/Application Support/cowork-os/skills/

Research Vault Workflow

First-class bundled llm-wiki skill
Workspace-local markdown vault structure with SCHEMA.md, index.md, log.md, inbox.md, and durable raw/ captures
Obsidian-friendly note/link conventions
Deterministic vault analyzer for link health, bridge pages, surprising cross-section links, and suggested follow-up questions
Desktop + gateway slash-command support with inline chaining
Located: resources/skills/llm-wiki.json and resources/skills/llm-wiki/

Personality System

6 personality styles (professional, friendly, concise, creative, technical, casual)
9 persona overlays (jarvis, friday, hal, computer, alfred, intern, sensei, pirate, noir)
Response style options (emoji usage, response length, code comments, explanation depth)
Quirks (catchphrase, sign-off, analogy domain)
Prompt-based control via conversation
Relationship tracking (user name, interaction count)
Located: src/electron/settings/personality-manager.ts

MCP (Model Context Protocol)

MCP Client - Connect to external MCP servers
MCP Host - Expose CoWork's tools as MCP server
MCP Registry - One-click server installation
SSE and WebSocket transports
Located: src/electron/mcp/

3. User Interface

Main Components

Workspace selector with folder picker
Task list with status indicators and task pinning
Task detail view with timeline and scroll-to-bottom button
Right-panel checklist section showing the latest read-only session checklist and verification nudge state
Task learning progression surface with memory, playbook, and skill proposal visibility
Approval dialog system
Real-time event streaming
Quick Task FAB (floating action button)
Toast notifications for task completion
In-app file viewer for artifacts
Spreadsheet artifact viewer with sidebar/fullscreen modes, persisted sidebar width, editable grid controls, structured workbook/CSV/TSV preview data, and external artifact handling for Numbers/Google Sheets/ODS/XLSB outputs
Document artifact viewer with sidebar/fullscreen modes, persisted sidebar width, structured document preview data, direct DOCX editing, save/copy controls, and external artifact handling for legacy/native document formats
Paired LaTeX/PDF artifact workbench with Summary, .tex source, and PDF tabs
Rich PPTX artifact viewer with inline deck cards, sidebar/fullscreen modes, fast text-first preview, cached rendered slides, and follow-up refresh after completion
Web page artifact viewer with inline HTML cards, sidebar/fullscreen modes, sandboxed iframe preview, built React output handling, and follow-up refresh after completion
Parallel task queue panel
Collaborative Thoughts Panel - Real-time agent thinking display
Comparison View - Side-by-side agent/model output comparison
Multi-LLM Selection Panel - Configure multi-provider runs
Live router visibility - active provider, active model, and fallback state surfaced in the task UI
Unified recall search across tasks, messages, files, memory, and knowledge-graph context
Persistent shell session status and retained-state controls for long-running operator workflows
Worktree Settings - Git worktree configuration UI
Devices tab - saved remote devices, remote task feed, remote workspace browser, remote file picker
Companies tab - company shell setup, goals, projects, issues, linked operators
Workflow Intelligence settings - heartbeat-triggered reflection, target kinds, last winner visibility, namespaced backlog, suggestion output, and dispatch history

Settings UI

LLM provider configuration
Model selection
Search provider configuration
Telegram bot settings
Discord bot settings
Slack bot settings
Update settings
Guardrail settings (budgets, limits)
Queue settings (concurrency)
Automations settings group (queue, Workflow Intelligence, scheduled, hooks, triggers, briefing)
Task-sourced scheduled automations from task view overflow menu
Custom Skills management
Quarantined Imports sections for skills and plugin packs with report, retry scan, and removal actions
Personality settings (styles, personas, quirks)
MCP server configuration

4. Infrastructure

Security

Secure credential storage (safeStorage)
Path traversal protection
Content Security Policy
Input validation
Approval flow for destructive operations
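
Path traversal protection usually comes down to resolving a requested path against the workspace root and rejecting anything that lands outside it. The helper below is a minimal sketch of that check using Node's path module; the function name is hypothetical, and the sketch does not resolve symlinks, which a hardened implementation would also need to handle.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolves `requested` against the workspace root
// and rejects anything that escapes it.
function resolveInsideWorkspace(workspaceRoot: string, requested: string): string {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, requested);
  const relative = path.relative(root, target);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Path escapes workspace: ${requested}`);
  }
  return target;
}
```

Comparing the relative path rather than using a plain string prefix check avoids accepting sibling folders such as /workspace-other when the root is /workspace.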

Configurable Guardrails

Token budget per task (1K - 10M)
Cost budget per task ($0.01 - $100)
Iteration limit (5 - 500)
Dangerous command blocking
Auto-approve trusted commands
File size limits
Domain allowlist for browser

Goal Mode & Re-planning

Success criteria (shell commands or file checks)
Auto-retry up to N attempts
Dynamic re-planning mid-execution
revise_plan tool for agent adaptation
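
Goal Mode's retry behavior can be sketched as a loop that runs an attempt, evaluates the success criteria, and stops at the first pass or after N attempts. The function and parameter names below are hypothetical; in practice the criteria check would run shell commands or file checks, and re-planning would happen between attempts.

```typescript
// Hypothetical sketch of the attempt loop behind success criteria and auto-retry.
interface AttemptOutcome {
  attempts: number;
  passed: boolean;
}

async function runWithSuccessCriteria(
  runAttempt: (attempt: number) => Promise<void>,
  checkCriteria: () => Promise<boolean>,
  maxAttempts: number,
): Promise<AttemptOutcome> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await runAttempt(attempt);
    if (await checkCriteria()) return { attempts: attempt, passed: true };
  }
  return { attempts: maxAttempts, passed: false };
}
```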

Parallel Task Queue

Configurable concurrency (1-10)
FIFO queue management
Auto-start next task
Queue persistence across restarts
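
A FIFO queue with a concurrency cap and auto-start can be modeled as a small state machine. The TaskQueue class below is a hypothetical sketch, not the project's queue-manager.ts; its snapshot method hints at how the state could be serialized for persistence across restarts.

```typescript
// Hypothetical sketch of a FIFO task queue with a concurrency limit of 1-10.
class TaskQueue {
  private readonly maxConcurrent: number;
  private readonly pending: string[] = [];
  private readonly running = new Set<string>();

  constructor(maxConcurrent: number) {
    if (!Number.isInteger(maxConcurrent) || maxConcurrent < 1 || maxConcurrent > 10) {
      throw new RangeError("maxConcurrent must be an integer between 1 and 10");
    }
    this.maxConcurrent = maxConcurrent;
  }

  // Adds a task and returns the IDs that started as a result.
  enqueue(taskId: string): string[] {
    this.pending.push(taskId);
    return this.startNext();
  }

  // Marks a task finished and auto-starts the next pending task, if any.
  complete(taskId: string): string[] {
    this.running.delete(taskId);
    return this.startNext();
  }

  snapshot(): { pending: string[]; running: string[] } {
    return { pending: this.pending.slice(), running: Array.from(this.running) };
  }

  private startNext(): string[] {
    const started: string[] = [];
    while (this.running.size < this.maxConcurrent && this.pending.length > 0) {
      const next = this.pending.shift() as string;
      this.running.add(next);
      started.push(next);
    }
    return started;
  }
}
```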

Auto-Update System

Update checking
Download progress
One-click install
GitHub releases integration

Build System

Electron + React + TypeScript
Vite for development
electron-builder for packaging
macOS entitlements

File Structure

cowork-os/
├── src/
│   ├── electron/
│   │   ├── main.ts
│   │   ├── preload.ts
│   │   ├── database/
│   │   │   ├── schema.ts
│   │   │   └── repositories.ts
│   │   ├── agent/
│   │   │   ├── daemon.ts
│   │   │   ├── executor.ts
│   │   │   ├── queue-manager.ts    # Parallel task queue
│   │   │   ├── context-manager.ts
│   │   │   ├── custom-skill-loader.ts
│   │   │   ├── executor-*-utils.ts # Modular executor utilities
│   │   │   ├── executor-event-emitter.ts
│   │   │   ├── executor-lifecycle-mutex.ts
│   │   │   ├── llm/           # 30+ providers and compatible gateways
│   │   │   ├── search/        # 5 providers
│   │   │   ├── browser/       # Legacy Playwright fallback service
│   │   │   ├── tools/         # All tool implementations + git tools
│   │   │   ├── skills/        # Document skills
│   │   │   └── guardrails/    # Safety limits
│   │   ├── git/               # Git worktree & comparison service
│   │   ├── browser/           # Browser V2 session manager and workbench bridge
│   │   ├── agents/            # Agent teams, thoughts, capability matcher
│   │   ├── gateway/           # WhatsApp, Telegram, Discord & Slack
│   │   ├── settings/          # Personality manager
│   │   ├── mcp/               # Model Context Protocol
│   │   │   ├── client/        # Connect to servers
│   │   │   ├── host/          # Expose tools
│   │   │   └── registry/      # Server catalog
│   │   ├── updater/           # Auto-update
│   │   ├── ipc/
│   │   └── utils/
│   ├── renderer/
│   │   ├── App.tsx
│   │   ├── components/        # 20+ components
│   │   └── styles/
│   └── shared/
│       └── types.ts
├── build/
│   └── entitlements.mac.plist
└── package.json

How It Works

Execution Flow

1. User selects workspace folder
   |
2. User creates task with description
   |
3. AgentDaemon starts TaskExecutor
   |
4. TaskExecutor builds or refreshes SessionRuntime and delegates the next turn request to it
   |
5. SessionRuntime prepares the message set, owns the turn-loop mirror state, and constructs the active TurnKernel for the step, follow-up, or text turn
   |
6. For each plan step:
   - LLM decides which tools to use
   - TaskExecutor routes the batch through ToolScheduler and ToolRegistry
   - Tools perform operations (with permission checks)
   - Results sent back to LLM
   - Events logged and streamed to UI
   |
7. If approval needed:
   - TaskExecutor pauses
   - ApprovalDialog shown to user
   - User approves/denies
   - Execution continues or fails
   |
8. Task completes
   - Status updated to "completed"
   - All events and semantic completion summaries logged in database
   - Artifacts tracked

Permission Model

Workspace Permissions:
├── Read: Enabled by default
├── Write: Enabled by default
├── Delete: Enabled, requires approval
├── Network: Enabled (for web search)
└── Shell: Requires approval

Operations Requiring Approval:
├── Delete file
├── Delete multiple files
├── Bulk rename (>10 files)
├── Shell command execution
└── External service calls
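
The approval rules above can be expressed as a single predicate over the requested operation. The Operation type and requiresApproval function are hypothetical names for illustration; the sketch only encodes the rules listed in this section.

```typescript
// Hypothetical operation model; only the rules listed above are encoded.
type Operation =
  | { kind: "read_file" }
  | { kind: "write_file" }
  | { kind: "delete_file"; count: number }
  | { kind: "rename_file"; count: number }
  | { kind: "execute_command" }
  | { kind: "external_service_call" };

const BULK_RENAME_THRESHOLD = 10;

function requiresApproval(op: Operation): boolean {
  switch (op.kind) {
    case "delete_file": // single or multiple deletes
    case "execute_command":
    case "external_service_call":
      return true;
    case "rename_file":
      return op.count > BULK_RENAME_THRESHOLD; // only bulk renames need approval
    case "read_file":
    case "write_file":
      return false; // enabled by default
  }
}
```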

What's NOT Implemented (Planned)

Agent Integrity and Trap Defense

Status: Planned
Spec: docs/agent-integrity-and-trap-defense-spec.md
Why it matters:
- hardens CoWork OS against hidden-content prompt injection, semantic manipulation, poisoned memory, malicious delegation, and approval-fatigue attacks
- turns current non-blocking prompt-injection detection into a durable runtime integrity model spanning ingestion, memory, permissions, delegation, and operator review
Planned phases:
- Phase 1: content integrity records and task-level risk classification for web, browser, scraping, email, and imported documents
- Phase 2: trusted vs untrusted memory lanes and promotion gates for KG, playbooks, and skill proposals
- Phase 3: provenance-aware approvals and permission decisions for sensitive actions
- Phase 4: taint propagation and restrictions across agent teams, child tasks, and remote delegation
- Phase 5: integrity dashboard plus eval and red-team coverage for agent-trap scenarios

VM Sandbox

Status: Stub implementation
File: src/electron/agent/sandbox/runner.ts
What's needed:
- macOS Virtualization.framework integration
- Linux VM image
- Workspace mount
- Network egress controls

Sub-Agents / Multi-Agent Collaboration

Status: Implemented (Collaborative Mode, Multi-LLM Mode, Agent Comparison)
What's built:
- Collaborative Mode: ephemeral multi-agent teams with real-time thought sharing
- Multi-LLM Mode: same task dispatched to multiple providers with judge synthesis
- Agent Comparison Mode: side-by-side output comparison across agents/models
- Capability Matcher: auto-select agents based on task requirements
- Git Worktree Isolation: per-task isolated branches with auto-commit/merge/cleanup

Ready to Use

You Can:

Select workspaces and create tasks
Use any configured LLM provider, including local Ollama and 30+ supported provider/gateway options
Execute multi-step file operations
Create real Office documents (.xlsx, .docx, .pdf, .pptx)
Search the web with multiple providers
Automate browser interactions
Run tasks remotely via WhatsApp, Telegram, Discord, or Slack
Track all agent activity in real-time
Approve/deny destructive operations
Receive automatic updates
Use Goal Mode with success criteria and auto-retry
Create custom skills with reusable workflows
Connect to MCP servers for extended tool access
Run multiple tasks in parallel (1-10 concurrent)
Configure safety guardrails (budgets, blocked commands)
Use system tools (screenshots, clipboard, open apps)
View artifacts with the in-app file viewer, including spreadsheet workbench views, document artifact editing, and rich .pptx deck previews
Customize agent personality via Settings or conversation prompts
Run tasks in isolated git worktrees with auto-commit and merge
Use collaborative mode for multi-agent team reasoning
Use multi-LLM mode to compare outputs across providers
Compare agent outputs side by side
Pin tasks for quick access
Gracefully wrap up running tasks
Use git tools (commit, diff, branch) within tasks

You Cannot (Yet):

Execute arbitrary code in a VM sandbox
Apply network egress controls

Dependencies

Production

react & react-dom - UI framework
better-sqlite3 - Local database
@anthropic-ai/sdk - Anthropic API
@google/generative-ai - Gemini API
@aws-sdk/client-bedrock-runtime - AWS Bedrock
playwright - Browser automation
discord.js - Discord bot
grammy - Telegram bot
@slack/bolt - Slack bot
exceljs - Excel creation, preview extraction, and save/update support
docx - Word document creation
pdfkit - PDF creation
@oai/artifact-tool / pptxgenjs - PowerPoint creation and rendering fallback
electron-updater - Auto-updates

Development

electron - Desktop framework
vite - Build tool
typescript - Type safety
electron-builder - App packaging

Quick Test Checklist

Before first run, verify:

Node.js 24+ installed
npm install completed successfully
On macOS or Windows (required for Electron desktop features)

Then run:

npm run dev

Expected behavior:

Vite dev server starts (port 5173)
Electron window opens
DevTools open automatically
Workspace selector appears
API credentials can then be configured in Settings (gear icon)

Performance Characteristics

Token Usage (varies by provider)

Plan creation: ~500-1000 tokens
Step execution: ~1000-3000 tokens per step
Average task: 5000-10000 tokens total

Timing

Plan creation: 2-5 seconds
Simple file operation: 3-6 seconds per step
Document creation: 5-10 seconds
Browser automation: 2-10 seconds per action
Web search: 1-3 seconds

Resource Usage

Memory: ~200-400MB (Electron + Playwright when active)
Database: <1MB per task
CPU: Minimal (except during API calls)

Summary

CoWork OS is a production-ready, security-first personal AI assistant platform:

Core Strengths

Security: 4,932 automated tests across 390 test files, configurable guardrails, layered permission rules, approval workflows, and brute-force protection
Multi-Channel: WhatsApp, Telegram, Discord, Slack, iMessage integration
Multi-Provider: 30+ LLM providers and compatible gateways, including Claude, GPT, Gemini, Bedrock, OpenRouter, and Ollama
Local-First: Your data stays on your machine, BYOK model
Extensible: MCP support (Client, Host, Registry), 147 built-in skills, and plugin packs

Feature Highlights

Real Office document creation (Excel, Word, PDF, PowerPoint)
Web search and browser automation
Code tools (glob, grep, edit_file) and git tools (commit, diff, branch)
Collaborative Mode with real-time thought sharing
Multi-LLM Mode with judge-based synthesis
Agent Comparison Mode for side-by-side output comparison
Git Worktree Isolation for per-task branch isolation
Task pinning and graceful wrap-up
Personality customization (6 styles, 9 personas)
Goal Mode with auto-retry
Parallel task queue (1-10 concurrent)
Remote access (Tailscale, SSH, WebSocket API)

Planned

Agent Integrity and Trap Defense runtime across ingestion, memory, approvals, and delegation
VM sandbox using macOS Virtualization.framework
Network egress controls with proxy
Linux desktop support
Web Browser Mode (--serve) — full app accessible from any browser via HTTP/WebSocket

The architecture is designed to be extensible, so the planned features above are expected to be added without refactoring core systems.

Ready to run with: npm install && npm run dev
