Project Status
Synced from github.com/CoWork-OS/CoWork-OS/docs
Production-Ready Implementation
CoWork OS is a security-first personal AI assistant platform with multi-channel messaging support, comprehensive guardrails, and extensive test coverage.
What CoWork OS Is
- Personal AI Gateway: Connect your AI assistant to WhatsApp, Telegram, Discord, Slack, and iMessage
- Everything Workbench: Create, open, review, lightly edit, and revise generated documents, spreadsheets, presentations, web pages, PDFs, and previews from the same local-first task workspace
- Managed Devices: Operate local and remote CoWork machines from a dedicated Devices tab
- Automations Surface: One settings group for queueing, scheduling, triggers, briefing, and Workflow Intelligence suggestions/reflection; task view can also create cron scheduled tasks from the current task menu
- Renderer Performance: Sidebar and timeline virtualization in the `CoWork-OS/CoWork-OS` repo use `@chenglou/pretext` for text measurement and keep long task feeds responsive
- Security-First Design: 4,932 automated tests across 390 test files, configurable guardrails, layered permission rules, workspace-local policy files, and approval workflows
- Imported Capability Security: managed skill and pack imports are staged, scanned, reported, and quarantined when blocked instead of being activated directly
- Multi-Provider Support: 30+ LLM providers including free local models via Ollama
- Local-First Architecture: Your data stays on your machine, with a bring-your-own-key (BYOK) model
What's Built and Working
1. Core Architecture
Reliability Flywheel (Eval + Risk Gates)
- Eval corpus extraction from failed/partial tasks (`scripts/qa/build_eval_corpus.cjs`)
- Deterministic eval suite replay runner (`scripts/qa/run_eval_suite.cjs`)
- Eval schema and task metadata (`eval_cases`, `eval_suites`, `eval_runs`, `eval_case_runs`, task risk/eval columns)
- Eval service and IPC endpoints (`eval:listSuites`, `eval:runSuite`, `eval:getRun`, `eval:getCase`, `eval:createCaseFromTask`)
- Risk scoring and policy-driven tiered review gate (`off`, `balanced`, `strict`)
- Prompt reliability hardening (modular prompt sections, shared policy dedupe, token budgets)
- Session- and turn-scoped prompt section memoization for execution and follow-up prompt assembly
- Provider-aware prompt caching with stable-prefix hashing, Anthropic/OpenRouter/OpenAI-family routing, and cache telemetry
- Prompt-aware tool rendering with compact planning text and provider-facing description reuse
- Skill shortlist routing with low-confidence fallback and text budget caps
- Earlier verification nudges through checklist tool-result reminders and pre-finalization prompt reminders
- PR regression policy gate for production incident fixes
- Nightly hardening workflow with machine-readable report artifact
- Release hardening gate (date-based strictness window)
- Local-only reliability data policy (no required telemetry upload path)
- Reference: `docs/reliability-flywheel.md`
Database Layer
- SQLite schema with 6 core tables (workspaces, tasks, events, artifacts, approvals, skills)
- Repository pattern for data access
- Type-safe database operations
- Located: `src/electron/database/`
Agent System
- AgentDaemon - Main orchestrator with worktree isolation and collaborative mode
- TaskExecutor - Shared turn kernel, metadata-driven tool scheduler, delegated-work orchestration, and terminal-state-safe completion/resume handoff
- SessionRuntime - Canonical owner for task-session state, session checklists, snapshots, recovery, and task projection
- Prompt-cache runtime state - stable system blocks, stable-prefix hashing, provider-family mode tracking, and resume-safe cache invalidation
- ExecutorEventEmitter - Typed event system for executor lifecycle
- LifecycleMutex - Concurrency control for executor operations
- Tool Registry - Manages all available tools and scheduler metadata
- Tool prompt layer - Internal prompt metadata renders visible-tool guidance after filtering without changing provider schemas
- Session checklist runtime tools - `task_list_create`, `task_list_update`, and `task_list_list` for execution-style tasks
- Orchestration graph engine - Normalized delegation runs for spawn_agent, workflow phases, teams, and ACP tasks
- Worker roles - researcher, implementer, verifier, and synthesizer with hard tool scopes
- Structured delegation brief - child tasks inherit objective, scope, evidence, deliverable, and completion contracts
- Permission system with layered rules, workspace policy files, and approval flow
- Context Manager - Conversation context handling
- Capability Matcher - Auto-select agents based on task requirements
- Located: `src/electron/agent/`
Multi-Provider LLM Support
- Anthropic (Claude models)
- Google Gemini
- OpenRouter (multi-model access)
- OpenAI (API Key: GPT-4o, o1 models)
- OpenAI (ChatGPT OAuth: Use your ChatGPT subscription)
- Prompt caching defaults for Anthropic, Azure Anthropic, OpenAI, Azure OpenAI, and OpenRouter GPT/Claude routes
- AWS Bedrock
- Ollama (local/free)
- Provider Factory with dynamic selection
- Located: `src/electron/agent/llm/`
Web Search Integration
- DuckDuckGo (free built-in, no API key — automatic last-resort fallback)
- Tavily (AI-optimized)
- Brave Search
- SerpAPI (Google results)
- Google Custom Search
- Primary + fallback provider support
- web_search tool always available (DuckDuckGo ensures zero-config search)
- Located: `src/electron/agent/search/`
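The primary + fallback behaviour listed above can be pictured as an ordered provider chain. The following is a minimal sketch, not the project's implementation: `SearchProvider`, `searchWithFallback`, and both stub providers are hypothetical names introduced for illustration.

```typescript
// Hypothetical types for illustration; not the actual CoWork OS API.
interface SearchResult {
  title: string;
  url: string;
}

interface SearchProvider {
  name: string;
  search(query: string): Promise<SearchResult[]>;
}

// Try each provider in order (primary, fallback, then the zero-config
// last resort) and return the first successful response.
async function searchWithFallback(
  providers: SearchProvider[],
  query: string,
): Promise<{ provider: string; results: SearchResult[] }> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      const results = await provider.search(query);
      return { provider: provider.name, results };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.name}: ${message}`);
    }
  }
  throw new Error(`All search providers failed: ${errors.join("; ")}`);
}

// Stub standing in for a configured primary provider that is misconfigured.
const failingPrimary: SearchProvider = {
  name: "tavily",
  search: async () => {
    throw new Error("missing API key");
  },
};

// Stub standing in for the built-in provider that needs no API key.
const builtInFallback: SearchProvider = {
  name: "duckduckgo",
  search: async (query) => [
    { title: `Results for ${query}`, url: "https://example.com" },
  ],
};
```

Calling `searchWithFallback([failingPrimary, builtInFallback], "query")` resolves with the fallback's results, which mirrors how a keyless last-resort provider keeps `web_search` usable without configuration.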
Browser Automation
- Browser V2 session manager with visible workbench default
- Electron Workbench CDP control through renderer-owned webview
- Playwright local fallback for forced headless/background runs
- External CDP attach path gated by explicit real-browser consent
- Right-sidebar/fullscreen workbench routing with persistent workspace browser profile
- Accessibility snapshots with short-lived refs and stale-ref validation
- Visible cursor movement for agent browser actions
- Screenshot capture and screenshot annotation
- Console, network, downloads, storage, dialog, emulation, and trace diagnostics
- Navigation, screenshots, PDF export
- Ref-aware click, fill, type, read, hover, drag, upload, and press-key actions
- Content extraction (text, links, forms)
- Scroll, wait for elements
- Located: `src/electron/browser/`, `src/electron/agent/browser/`, and `src/electron/agent/tools/browser-tools.ts`
Channel Integrations
- WhatsApp bot with QR code pairing and self-chat mode
- Telegram bot with commands
- Discord bot with slash commands
- Slack bot with Socket Mode
- Session management
- Security modes (pairing, allowlist, open)
- Located: `src/electron/gateway/`
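As a rough illustration of the three security modes, the check below decides whether an inbound sender may reach the agent. This is a sketch under assumptions: the config shape, the field names, and the reading of "pairing" as "sender has completed a pairing step" are hypothetical and may not match the gateway code.

```typescript
// Hypothetical shapes for illustration; the real gateway types may differ.
type SecurityMode = "pairing" | "allowlist" | "open";

interface ChannelSecurityConfig {
  mode: SecurityMode;
  allowlist: string[]; // sender IDs accepted in "allowlist" mode
  pairedSenders: string[]; // sender IDs that completed pairing
}

// Decide whether a message from `senderId` should be handed to the agent.
function isSenderAuthorized(
  config: ChannelSecurityConfig,
  senderId: string,
): boolean {
  if (config.mode === "open") {
    return true; // anyone may talk to the bot
  }
  if (config.mode === "allowlist") {
    return config.allowlist.includes(senderId);
  }
  // Remaining mode is "pairing": only senders that paired earlier pass.
  return config.pairedSenders.includes(senderId);
}
```

Keeping the decision in one pure function makes it easy to unit-test each mode independently of any particular channel SDK.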
Composer Routing
- Grouped `@` autocomplete for Agents, Integrations, and Files
- Configured integration mention resolver with Google Workspace split into Gmail, Google Drive, and Google Calendar
- Rich inline integration chips in the composer, sent user bubbles, and restored task/session history
- Soft `integrationMentions` runtime guidance without changing `allowedTools`
- `@Inbox` routing from the main composer into Inbox Agent Ask Inbox
- Ask Inbox right-sidebar chat with run-scoped live step events and matched evidence
- Hybrid mailbox search architecture for Ask Inbox: local FTS, semantic mailbox index, provider-native search, attachment text, shortlist/read/rerank
- Located: `src/renderer/components/PromptComposerInput.tsx`, `src/electron/integrations/`
2. Tools & Skills
File Operations (7 tools)
- read_file - Read file contents
- write_file - Create or overwrite files
- list_directory - List folder contents
- rename_file - Rename or move files
- delete_file - Delete with approval
- create_directory - Create folders
- search_files - Search by name/content
Document Tools and Skills
- Everything Workbench - shared task-output model for generated docs, sheets, decks, web pages, PDFs, and previews with compact cards, sidebar/fullscreen artifact workspaces, follow-up composer context, refresh-after-edit behavior, and external app actions for advanced native workflows
- Spreadsheet - Excel .xlsx (exceljs) generation and structured preview extraction
- Spreadsheet artifact workbench - compact task cards, resizable sidebar viewer, fullscreen editable grid, selection/copy/save/zoom, and follow-up composer controls
- Document - Word .docx and PDF (docx, pdfkit)
- Document artifact workbench - compact task cards, resizable sidebar/fullscreen viewer, direct DOCX editing, save/copy/external actions, follow-up composer controls, and best-effort preview/external handling for DOC/RTF/ODT/OTT/Pages outputs
- LaTeX compilation - `.tex` source to PDF via system `tectonic`, `latexmk`, `xelatex`, `lualatex`, or `pdflatex`
- Presentation - PowerPoint .pptx generation through the Codex presentation runtime with `pptxgenjs` fallback
- Presentation artifact workbench - compact task cards, resizable sidebar/fullscreen viewer, fast text-first loading, cached slide images, navigation, zoom, speaker notes, and follow-up composer controls
- Web page artifact workbench - compact task cards for generated HTML/HTM and built React output, resizable sidebar/fullscreen sandboxed iframe viewer, browser/folder/copy actions, and follow-up composer controls
- Folder Organizer - By type/date
- Kami - Editorial PDFs, resumes, one-pagers, diagrams, and slide decks with workspace-local scaffolding
Browser Tools (34 tools)
- browser_navigate
- browser_snapshot
- browser_screenshot
- browser_save_pdf
- browser_click
- browser_hover
- browser_drag
- browser_fill
- browser_type
- browser_press
- browser_get_content
- browser_get_text
- browser_scroll
- browser_wait
- browser_select
- browser_upload_file
- browser_handle_dialog
- browser_tabs
- browser_switch_tab
- browser_close_tab
- browser_console
- browser_network
- browser_downloads
- browser_storage
- browser_emulate
- browser_trace_start
- browser_trace_stop
- browser_evaluate
- browser_back
- browser_forward
- browser_reload
- browser_attach
- browser_act_batch
- browser_close
Search Tools
- web_search - Multi-provider web search
Code Tools (3 tools)
- glob - Fast pattern-based file search
- grep - Regex content search across files
- edit_file - Surgical file editing with find-and-replace
Git Tools (3 tools)
- git_commit - Commit changes in workspace or worktree
- git_diff - View staged/unstaged changes
- git_branch - List, create, or switch branches
Web Fetch Tools (2 tools)
- web_fetch - Fetch and parse web pages
- http_request - Full HTTP client (curl-like)
Shell Tools
- execute_command - Shell command execution (requires approval)
System Tools
- take_screenshot - Full screen or specific windows
- clipboard_read / clipboard_write - Clipboard access
- open_application / open_url / open_path - Launch apps and URLs
- show_in_finder - Reveal files in Finder
- get_system_info - System information and environment
Custom Skills
- User-defined reusable workflows
- YAML-based skill definitions
- Priority-based sorting
- Parameter input modal for skill variables
- Managed import scanning, persisted security reports, quarantine, and digest recheck for imported skill bundles
- Located: `~/Library/Application Support/cowork-os/skills/`
Research Vault Workflow
- First-class bundled `llm-wiki` skill
- Workspace-local markdown vault structure with `SCHEMA.md`, `index.md`, `log.md`, `inbox.md`, and durable `raw/` captures
- Obsidian-friendly note/link conventions
- Deterministic vault analyzer for link health, bridge pages, surprising cross-section links, and suggested follow-up questions
- Desktop + gateway slash-command support with inline chaining
- Located: `resources/skills/llm-wiki.json` and `resources/skills/llm-wiki/`
Personality System
- 6 personality styles (professional, friendly, concise, creative, technical, casual)
- 9 persona overlays (jarvis, friday, hal, computer, alfred, intern, sensei, pirate, noir)
- Response style options (emoji usage, response length, code comments, explanation depth)
- Quirks (catchphrase, sign-off, analogy domain)
- Prompt-based control via conversation
- Relationship tracking (user name, interaction count)
- Located: `src/electron/settings/personality-manager.ts`
MCP (Model Context Protocol)
- MCP Client - Connect to external MCP servers
- MCP Host - Expose CoWork's tools as an MCP server
- MCP Registry - One-click server installation
- SSE and WebSocket transports
- Located: `src/electron/mcp/`
3. User Interface
Main Components
- Workspace selector with folder picker
- Task list with status indicators and task pinning
- Task detail view with timeline and scroll-to-bottom button
- Right-panel checklist section showing the latest read-only session checklist and verification nudge state
- Task learning progression surface with memory, playbook, and skill proposal visibility
- Approval dialog system
- Real-time event streaming
- Quick Task FAB (floating action button)
- Toast notifications for task completion
- In-app file viewer for artifacts
- Spreadsheet artifact viewer with sidebar/fullscreen modes, persisted sidebar width, editable grid controls, structured workbook/CSV/TSV preview data, and external artifact handling for Numbers/Google Sheets/ODS/XLSB outputs
- Document artifact viewer with sidebar/fullscreen modes, persisted sidebar width, structured document preview data, direct DOCX editing, save/copy controls, and external artifact handling for legacy/native document formats
- Paired LaTeX/PDF artifact workbench with Summary, `.tex` source, and PDF tabs
- Rich PPTX artifact viewer with inline deck cards, sidebar/fullscreen modes, fast text-first preview, cached rendered slides, and follow-up refresh after completion
- Web page artifact viewer with inline HTML cards, sidebar/fullscreen modes, sandboxed iframe preview, built React output handling, and follow-up refresh after completion
- Parallel task queue panel
- Collaborative Thoughts Panel - Real-time agent thinking display
- Comparison View - Side-by-side agent/model output comparison
- Multi-LLM Selection Panel - Configure multi-provider runs
- Live router visibility - active provider, active model, and fallback state surfaced in the task UI
- Unified recall search across tasks, messages, files, memory, and knowledge-graph context
- Persistent shell session status and retained-state controls for long-running operator workflows
- Worktree Settings - Git worktree configuration UI
- Devices tab - saved remote devices, remote task feed, remote workspace browser, remote file picker
- Companies tab - company shell setup, goals, projects, issues, linked operators
- Workflow Intelligence settings - heartbeat-triggered reflection, target kinds, last winner visibility, namespaced backlog, suggestion output, and dispatch history
Settings UI
- LLM provider configuration
- Model selection
- Search provider configuration
- Telegram bot settings
- Discord bot settings
- Slack bot settings
- Update settings
- Guardrail settings (budgets, limits)
- Queue settings (concurrency)
- Automations settings group (queue, Workflow Intelligence, scheduled, hooks, triggers, briefing)
- Task-sourced scheduled automations from task view overflow menu
- Custom Skills management
- Quarantined Imports sections for skills and plugin packs with report, retry scan, and removal actions
- Personality settings (styles, personas, quirks)
- MCP server configuration
4. Infrastructure
Security
- Secure credential storage (safeStorage)
- Path traversal protection
- Content Security Policy
- Input validation
- Approval flow for destructive operations
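To make the path traversal item concrete, here is a minimal sketch of a workspace containment check. It is illustrative only: `resolveInsideWorkspace` is a hypothetical helper, it assumes POSIX-style paths and an already-normalized absolute workspace root, and it does not account for symlinks, which a production check would also need to handle.

```typescript
// Split a POSIX-style path into segments, dropping empty and "." parts.
function splitSegments(path: string): string[] {
  return path.split("/").filter((segment) => segment !== "" && segment !== ".");
}

// Resolve `requested` against `workspaceRoot` and return the absolute path,
// or null when the result would land outside the workspace.
function resolveInsideWorkspace(
  workspaceRoot: string,
  requested: string,
): string | null {
  const root = splitSegments(workspaceRoot);
  // Absolute requests start from "/", relative ones from the workspace root.
  const stack = requested.startsWith("/") ? [] : [...root];
  for (const segment of splitSegments(requested)) {
    if (segment === "..") {
      stack.pop(); // walking above "/" simply leaves the stack empty
    } else {
      stack.push(segment);
    }
  }
  // The resolved path must still begin with every root segment.
  const inside = root.every((segment, index) => stack[index] === segment);
  return inside ? "/" + stack.join("/") : null;
}
```

With a root of `/home/u/ws`, a request for `notes/a.txt` resolves to `/home/u/ws/notes/a.txt`, while `../secrets` and `/etc/passwd` both yield `null`.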
Configurable Guardrails
- Token budget per task (1K - 10M)
- Cost budget per task ($0.01 - $100)
- Iteration limit (5 - 500)
- Dangerous command blocking
- Auto-approve trusted commands
- File size limits
- Domain allowlist for browser
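The numeric ranges above lend themselves to a simple normalization step. The sketch below clamps user-supplied values into the documented bounds and checks whether a task has exhausted a budget; the `GuardrailSettings` and `TaskUsage` shapes and all field names are assumptions for illustration, not the project's actual settings schema.

```typescript
// Hypothetical settings shape; field names are assumptions for illustration.
interface GuardrailSettings {
  maxTokensPerTask: number;
  maxCostPerTaskUsd: number;
  maxIterations: number;
}

// Hypothetical running totals for a single task.
interface TaskUsage {
  tokens: number;
  costUsd: number;
  iterations: number;
}

// Bounds taken from the ranges listed above.
const LIMITS = {
  maxTokensPerTask: { min: 1_000, max: 10_000_000 },
  maxCostPerTaskUsd: { min: 0.01, max: 100 },
  maxIterations: { min: 5, max: 500 },
} as const;

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Force user-supplied settings into the documented ranges.
function normalizeGuardrails(input: GuardrailSettings): GuardrailSettings {
  return {
    maxTokensPerTask: clamp(
      input.maxTokensPerTask,
      LIMITS.maxTokensPerTask.min,
      LIMITS.maxTokensPerTask.max,
    ),
    maxCostPerTaskUsd: clamp(
      input.maxCostPerTaskUsd,
      LIMITS.maxCostPerTaskUsd.min,
      LIMITS.maxCostPerTaskUsd.max,
    ),
    maxIterations: clamp(
      input.maxIterations,
      LIMITS.maxIterations.min,
      LIMITS.maxIterations.max,
    ),
  };
}

// True once any single budget has been reached.
function isBudgetExceeded(settings: GuardrailSettings, usage: TaskUsage): boolean {
  return (
    usage.tokens >= settings.maxTokensPerTask ||
    usage.costUsd >= settings.maxCostPerTaskUsd ||
    usage.iterations >= settings.maxIterations
  );
}
```

Clamping at load time means the rest of the runtime can trust the values without re-validating them on every turn.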
Goal Mode & Re-planning
- Success criteria (shell commands or file checks)
- Auto-retry up to N attempts
- Dynamic re-planning mid-execution
- `revise_plan` tool for agent adaptation
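The success-criteria and auto-retry behaviour can be sketched as a bounded loop. This is an illustrative outline only: `runWithGoal` and its callbacks are hypothetical, with `checkSuccess` standing in for a shell command or file check and the re-planning step reduced to a comment.

```typescript
interface GoalResult {
  succeeded: boolean;
  attempts: number;
}

// Run `attempt` until `checkSuccess` passes or `maxAttempts` is used up.
// Both callbacks are hypothetical stand-ins for the agent's real work.
function runWithGoal(
  attempt: (attemptNumber: number) => void,
  checkSuccess: () => boolean,
  maxAttempts: number,
): GoalResult {
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts += 1;
    attempt(attempts);
    if (checkSuccess()) {
      return { succeeded: true, attempts };
    }
    // A real agent would re-plan here (for example via revise_plan)
    // before the next attempt.
  }
  return { succeeded: false, attempts };
}
```

The explicit attempt cap is what keeps a goal that can never be satisfied from consuming the task's token and cost budgets indefinitely.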
Parallel Task Queue
- Configurable concurrency (1-10)
- FIFO queue management
- Auto-start next task
- Queue persistence across restarts
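A FIFO queue with a concurrency cap and auto-start can be modelled in a few lines. The class below is a simplified in-memory sketch, not the project's queue manager: `TaskQueue` and its methods are hypothetical, and persistence across restarts is deliberately left out.

```typescript
// Hypothetical in-memory model of a FIFO task queue with a concurrency cap.
class TaskQueue {
  private readonly pending: string[] = [];
  private readonly running = new Set<string>();
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    // Keep concurrency inside the documented 1-10 range.
    this.maxConcurrent = Math.min(10, Math.max(1, Math.floor(maxConcurrent)));
  }

  // Add a task; it starts immediately if a slot is free.
  enqueue(taskId: string): void {
    this.pending.push(taskId);
    this.startNext();
  }

  // Mark a task finished and auto-start the next queued task.
  complete(taskId: string): void {
    if (this.running.delete(taskId)) {
      this.startNext();
    }
  }

  getRunning(): string[] {
    return Array.from(this.running);
  }

  getPending(): string[] {
    return [...this.pending];
  }

  // Promote queued tasks in arrival order while slots remain.
  private startNext(): void {
    while (this.running.size < this.maxConcurrent && this.pending.length > 0) {
      const next = this.pending.shift();
      if (next !== undefined) {
        this.running.add(next);
      }
    }
  }
}
```

With a limit of 2, enqueueing `a`, `b`, and `c` leaves `c` pending until either `a` or `b` completes, at which point it starts automatically.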
Auto-Update System
- Update checking
- Download progress
- One-click install
- GitHub releases integration
Build System
- Electron + React + TypeScript
- Vite for development
- electron-builder for packaging
- macOS entitlements
File Structure
cowork-os/
├── src/
│   ├── electron/
│   │   ├── main.ts
│   │   ├── preload.ts
│   │   ├── database/
│   │   │   ├── schema.ts
│   │   │   └── repositories.ts
│   │   ├── agent/
│   │   │   ├── daemon.ts
│   │   │   ├── executor.ts
│   │   │   ├── queue-manager.ts        # Parallel task queue
│   │   │   ├── context-manager.ts
│   │   │   ├── custom-skill-loader.ts
│   │   │   ├── executor-*-utils.ts     # Modular executor utilities
│   │   │   ├── executor-event-emitter.ts
│   │   │   ├── executor-lifecycle-mutex.ts
│   │   │   ├── llm/                    # 30+ providers and compatible gateways
│   │   │   ├── search/                 # 4 providers
│   │   │   ├── browser/                # Legacy Playwright fallback service
│   │   │   ├── tools/                  # All tool implementations + git tools
│   │   │   ├── skills/                 # Document skills
│   │   │   └── guardrails/             # Safety limits
│   │   ├── git/                        # Git worktree & comparison service
│   │   ├── browser/                    # Browser V2 session manager and workbench bridge
│   │   ├── agents/                     # Agent teams, thoughts, capability matcher
│   │   ├── gateway/                    # WhatsApp, Telegram, Discord & Slack
│   │   ├── settings/                   # Personality manager
│   │   ├── mcp/                        # Model Context Protocol
│   │   │   ├── client/                 # Connect to servers
│   │   │   ├── host/                   # Expose tools
│   │   │   └── registry/               # Server catalog
│   │   ├── updater/                    # Auto-update
│   │   ├── ipc/
│   │   └── utils/
│   ├── renderer/
│   │   ├── App.tsx
│   │   ├── components/                 # 20+ components
│   │   └── styles/
│   └── shared/
│       └── types.ts
├── build/
│   └── entitlements.mac.plist
└── package.json
How It Works
Execution Flow
1. User selects workspace folder
|
2. User creates task with description
|
3. AgentDaemon starts TaskExecutor
|
4. TaskExecutor builds or refreshes SessionRuntime and delegates the next turn request to it
|
5. SessionRuntime prepares the message set, owns the turn-loop mirror state, and constructs the active TurnKernel for the step, follow-up, or text turn
|
6. For each plan step:
- LLM decides which tools to use
- TaskExecutor routes the batch through ToolScheduler and ToolRegistry
- Tools perform operations (with permission checks)
- Results sent back to LLM
- Events logged and streamed to UI
|
7. If approval needed:
- TaskExecutor pauses
- ApprovalDialog shown to user
- User approves/denies
- Execution continues or fails
|
8. Task completes
- Status updated to "completed"
- All events and semantic completion summaries logged in database
- Artifacts tracked
Permission Model
Workspace Permissions:
├── Read: Enabled by default
├── Write: Enabled by default
├── Delete: Enabled, requires approval
├── Network: Enabled (for web search)
└── Shell: Requires approval
Operations Requiring Approval:
├── Delete file
├── Delete multiple files
├── Bulk rename (>10 files)
├── Shell command execution
└── External service calls
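The approval rules above can be read as a small decision function over operation types. The sketch below encodes that reading; the `Operation` union, `requiresApproval`, and the threshold constant name are hypothetical, and the real permission system (layered rules plus workspace policy files) is richer than this.

```typescript
// Hypothetical operation descriptor; names are assumptions for illustration.
type Operation =
  | { kind: "read"; path: string }
  | { kind: "write"; path: string }
  | { kind: "delete"; paths: string[] }
  | { kind: "rename"; paths: string[] }
  | { kind: "shell"; command: string }
  | { kind: "external_service"; service: string };

// Renames touching more than this many files count as "bulk".
const BULK_RENAME_THRESHOLD = 10;

// Return true when the user must confirm the operation before it runs.
function requiresApproval(op: Operation): boolean {
  switch (op.kind) {
    case "delete": // single or multiple files
    case "shell":
    case "external_service":
      return true;
    case "rename":
      return op.paths.length > BULK_RENAME_THRESHOLD;
    default:
      return false; // read and write are enabled by default
  }
}
```

Note the strict inequality: renaming exactly 10 files passes without a prompt, while 11 or more triggers the approval dialog.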
What's NOT Implemented (Planned)
Agent Integrity and Trap Defense
- Status: Planned
- Spec: `docs/agent-integrity-and-trap-defense-spec.md`
- Why it matters:
  - hardens CoWork OS against hidden-content prompt injection, semantic manipulation, poisoned memory, malicious delegation, and approval-fatigue attacks
  - turns current non-blocking prompt-injection detection into a durable runtime integrity model spanning ingestion, memory, permissions, delegation, and operator review
- Planned phases:
  - Phase 1: content integrity records and task-level risk classification for web, browser, scraping, email, and imported documents
  - Phase 2: trusted vs untrusted memory lanes and promotion gates for KG, playbooks, and skill proposals
  - Phase 3: provenance-aware approvals and permission decisions for sensitive actions
  - Phase 4: taint propagation and restrictions across agent teams, child tasks, and remote delegation
  - Phase 5: integrity dashboard plus eval and red-team coverage for agent-trap scenarios
VM Sandbox
- Status: Stub implementation
- File: `src/electron/agent/sandbox/runner.ts`
- What's needed:
  - macOS Virtualization.framework integration
  - Linux VM image
  - Workspace mount
  - Network egress controls
Sub-Agents / Multi-Agent Collaboration
- Status: Implemented, despite this section's heading (Collaborative Mode, Multi-LLM Mode, Agent Comparison)
- What's built:
  - Collaborative Mode: ephemeral multi-agent teams with real-time thought sharing
  - Multi-LLM Mode: same task dispatched to multiple providers with judge synthesis
  - Agent Comparison Mode: side-by-side output comparison across agents/models
  - Capability Matcher: auto-select agents based on task requirements
  - Git Worktree Isolation: per-task isolated branches with auto-commit/merge/cleanup
Ready to Use
You Can:
- Select workspaces and create tasks
- Use any configured LLM provider, including local Ollama and 30+ supported provider/gateway options
- Execute multi-step file operations
- Create real Office documents (.xlsx, .docx, .pdf, .pptx)
- Search the web with multiple providers
- Automate browser interactions
- Run tasks remotely via WhatsApp, Telegram, Discord, or Slack
- Track all agent activity in real-time
- Approve/deny destructive operations
- Receive automatic updates
- Use Goal Mode with success criteria and auto-retry
- Create custom skills with reusable workflows
- Connect to MCP servers for extended tool access
- Run multiple tasks in parallel (1-10 concurrent)
- Configure safety guardrails (budgets, blocked commands)
- Use system tools (screenshots, clipboard, open apps)
- View artifacts with the in-app file viewer, including spreadsheet workbench views, document artifact editing, and rich `.pptx` deck previews
- Customize agent personality via Settings or conversation prompts
- Run tasks in isolated git worktrees with auto-commit and merge
- Use collaborative mode for multi-agent team reasoning
- Use multi-LLM mode to compare outputs across providers
- Compare agent outputs side by side
- Pin tasks for quick access
- Gracefully wrap up running tasks
- Use git tools (commit, diff, branch) within tasks
You Cannot (Yet):
- Execute arbitrary code in a VM sandbox
- Apply network egress controls
Dependencies
Production
- `react` & `react-dom` - UI framework
- `better-sqlite3` - Local database
- `@anthropic-ai/sdk` - Anthropic API
- `@google/generative-ai` - Gemini API
- `@aws-sdk/client-bedrock-runtime` - AWS Bedrock
- `playwright` - Browser automation
- `discord.js` - Discord bot
- `grammy` - Telegram bot
- `@slack/bolt` - Slack bot
- `exceljs` - Excel creation, preview extraction, and save/update support
- `docx` - Word document creation
- `pdfkit` - PDF creation
- `@oai/artifact-tool` / `pptxgenjs` - PowerPoint creation and rendering fallback
- `electron-updater` - Auto-updates
Development
- `electron` - Desktop framework
- `vite` - Build tool
- `typescript` - Type safety
- `electron-builder` - App packaging
Quick Test Checklist
Before first run, verify:
- Node.js 24+ installed
- `npm install` completed successfully
- On macOS or Windows (required for Electron desktop features)
Then run:
npm run dev
Expected behavior:
- Vite dev server starts (port 5173)
- Electron window opens
- DevTools open automatically
- Workspace selector appears
- Settings (gear icon) is available for configuring API credentials
Performance Characteristics
Token Usage (varies by provider)
- Plan creation: ~500-1000 tokens
- Step execution: ~1000-3000 tokens per step
- Average task: 5000-10000 tokens total
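The per-phase figures above combine into a simple range estimate: one plan plus a number of steps. The helper below is a back-of-the-envelope sketch; `estimateTaskTokens` and `TokenRange` are hypothetical names, and the constants are just the approximate ranges quoted in this list.

```typescript
interface TokenRange {
  min: number;
  max: number;
}

// Approximate ranges quoted above; actual usage varies by provider.
const PLAN_TOKENS: TokenRange = { min: 500, max: 1000 };
const STEP_TOKENS: TokenRange = { min: 1000, max: 3000 };

// Estimate total tokens for one plan followed by `stepCount` steps.
function estimateTaskTokens(stepCount: number): TokenRange {
  return {
    min: PLAN_TOKENS.min + stepCount * STEP_TOKENS.min,
    max: PLAN_TOKENS.max + stepCount * STEP_TOKENS.max,
  };
}
```

For a three-step task this gives 500 + 3 × 1000 = 3,500 tokens at the low end and 1000 + 3 × 3000 = 10,000 at the high end, which overlaps the quoted 5000-10000 average.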
Timing
- Plan creation: 2-5 seconds
- Simple file operation: 3-6 seconds per step
- Document creation: 5-10 seconds
- Browser automation: 2-10 seconds per action
- Web search: 1-3 seconds
Resource Usage
- Memory: ~200-400MB (Electron + Playwright when active)
- Database: <1MB per task
- CPU: Minimal (except during API calls)
Summary
CoWork OS is a production-ready, security-first personal AI assistant platform:
Core Strengths
- Security: 4,932 automated tests across 390 test files, configurable guardrails, layered permission rules, approval workflows, and brute-force protection
- Multi-Channel: WhatsApp, Telegram, Discord, Slack, iMessage integration
- Multi-Provider: 30+ LLM providers and compatible gateways, including Claude, GPT, Gemini, Bedrock, OpenRouter, and Ollama
- Local-First: Your data stays on your machine, BYOK model
- Extensible: MCP support (Client, Host, Registry), 147 built-in skills, and plugin packs
Feature Highlights
- Real Office document creation (Excel, Word, PDF, PowerPoint)
- Web search and browser automation
- Code tools (glob, grep, edit_file) and git tools (commit, diff, branch)
- Collaborative Mode with real-time thought sharing
- Multi-LLM Mode with judge-based synthesis
- Agent Comparison Mode for side-by-side output comparison
- Git Worktree Isolation for per-task branch isolation
- Task pinning and graceful wrap-up
- Personality customization (6 styles, 9 personas)
- Goal Mode with auto-retry
- Parallel task queue (1-10 concurrent)
- Remote access (Tailscale, SSH, WebSocket API)
Planned
- Agent Integrity and Trap Defense runtime across ingestion, memory, approvals, and delegation
- VM sandbox using macOS Virtualization.framework
- Network egress controls with proxy
- Linux desktop support
- Web Browser Mode (`--serve`): full app accessible from any browser via HTTP/WebSocket
The architecture is designed to be extensible, so the planned features above are expected to be added without refactoring core systems.
Ready to run with: npm install && npm run dev