Relationship-Agent UAT Checklist
Synced from github.com/CoWork-OS/CoWork-OS/docs
Use this checklist to validate that CoWork OS behaves as a personal relationship agent, not only a task executor.
Preconditions
- App is running with a configured LLM provider.
- Workspace is selected.
- Build health already passes:
npm run type-check
npm run build
Test 1: Strategy Prompt (Answer First)
Input
I built you, how should I position you, who should I target, and what should I aim to achieve?
Expected
- A direct answer appears early.
- The agent may continue deeper execution, but should not withhold the main answer.
- Task final state is completed.
Test 2: Mixed Mode (Talk + Act)
Input
Give me a quick positioning recommendation first, then produce a deeper GTM strategy document.
Expected
- Initial concise recommendation.
- Follow-on execution produces the deeper artifact.
- Final user-facing completion response.
Test 3: Timeout Recovery (No Silent End)
Setup
Use a prompt likely to create a long synthesis step.
Expected
- If timeout occurs, logs show recovery behavior.
- Task still returns a best-effort final response.
- Task should not terminate without user-facing output.
Diagnostic Logs
- Good signal: "Task cancelled - not logging as error (reason: timeout)" with recovery/finalization events.
- Bad signal: cancellation log without any final user response.
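The good and bad signals above can also be checked mechanically. The following is a minimal sketch, assuming the task's activity can be exported as an ordered list of events. The LogEvent shape and the "final_response" kind are hypothetical and are not taken from the CoWork OS log schema; only the cancellation message text comes from this checklist.

```typescript
// Hypothetical event shape for illustration; not the actual CoWork OS log schema.
type LogEvent =
  | { kind: "log"; message: string }
  | { kind: "final_response"; text: string };

// Message text as quoted in the Diagnostic Logs list.
const TIMEOUT_CANCEL_MARKER = "Task cancelled - not logging as error (reason: timeout)";

type TimeoutVerdict = "no-timeout" | "recovered" | "silent-end";

// Classifies one task's event stream:
// - "no-timeout": the cancellation marker never appears
// - "recovered":  a non-empty final response follows the last cancellation marker
// - "silent-end": the marker appears with no final response after it
function classifyTimeoutRun(events: LogEvent[]): TimeoutVerdict {
  let lastCancelIndex = -1;
  for (let i = 0; i < events.length; i++) {
    const event = events[i];
    if (event.kind === "log" && event.message.includes(TIMEOUT_CANCEL_MARKER)) {
      lastCancelIndex = i;
    }
  }
  if (lastCancelIndex === -1) return "no-timeout";

  const hasFinalAfterCancel = events
    .slice(lastCancelIndex + 1)
    .some((event) => event.kind === "final_response" && event.text.trim().length > 0);
  return hasFinalAfterCancel ? "recovered" : "silent-end";
}
```

In this sketch, a "silent-end" verdict corresponds to the bad signal and should fail Test 3.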
Test 4: Relationship Memory Capture
Input
Call me almarion. I prefer concise responses. Remind me to send investor update tomorrow.
Expected
- Relationship memory includes identity and preference items.
- Commitment item exists and is open.
- Due-soon reminder is returned within the configured window.
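The due-soon expectation can be read as a simple filter over open commitments. This is a sketch under stated assumptions: the Commitment record, its field names, and the windowMs parameter are hypothetical stand-ins for whatever the Memory System actually stores and configures. The statuses "open" and "done" mirror the wording used in this checklist.

```typescript
// Hypothetical commitment record; field names are assumptions for illustration.
interface Commitment {
  text: string;
  status: "open" | "done";
  dueAt: Date;
}

// Returns open commitments due between `now` and `now + windowMs`, soonest first.
// Overdue items (dueAt before `now`) are excluded in this sketch.
function dueSoon(items: Commitment[], now: Date, windowMs: number): Commitment[] {
  const start = now.getTime();
  const end = start + windowMs;
  return items
    .filter((c) => c.status === "open")
    .filter((c) => c.dueAt.getTime() >= start && c.dueAt.getTime() <= end)
    .sort((a, b) => a.dueAt.getTime() - b.dueAt.getTime());
}
```

With a 48-hour window, for example, the "send investor update" commitment due tomorrow would be returned while it is open and would drop out of the list once marked done.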
Test 5: Relationship Memory Controls
Actions (Settings > Memory System)
- Edit a relationship item.
- Mark a commitment done.
- Reopen the same commitment.
- Forget one item.
Expected
- UI updates immediately.
- Changes persist across restart.
- Due-soon list reflects status changes.
Test 6: Image Attachment in Task Creation
Input
Create a new task with a JPEG or PNG image attached and the prompt:
Describe what you see in the attached image.
Expected
- The LLM response references specific visual content from the image (not generic text).
- No "Image skipped" warnings appear in the activity log.
- Follow-up messages with additional image attachments are also processed correctly.
Test 7: Shared Context Safety
Setup
Run equivalent prompts from private and shared channel contexts (if configured).
Expected
- Private context can use relationship memory normally.
- Shared context follows memory isolation policy unless explicitly trusted (allowSharedContextMemory).
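The expected behavior for private and shared contexts amounts to a small gating rule. The sketch below assumes a policy object carrying a boolean allowSharedContextMemory flag; only that flag name appears in this checklist. The ChannelContext type, the MemoryPolicy interface, and the function name are hypothetical, and the real setting may live elsewhere or take a different form.

```typescript
type ChannelContext = "private" | "shared";

// Only `allowSharedContextMemory` is named in the checklist; this object shape is assumed.
interface MemoryPolicy {
  allowSharedContextMemory: boolean;
}

// Private contexts may always use relationship memory.
// Shared contexts may use it only when explicitly trusted via the flag.
function canUseRelationshipMemory(context: ChannelContext, policy: MemoryPolicy): boolean {
  if (context === "private") return true;
  return policy.allowSharedContextMemory === true;
}
```

Under this reading, Test 7 passes when a shared-channel prompt shows no relationship memory while the flag is off and the equivalent private prompt still uses it.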
Pass Criteria
Release is accepted when all seven tests above pass with no silent task termination and no regression in completion behavior.