docs: Update AI documentation to accurately reflect current codebase (#708)

* docs: Update AI documentation for accurate codebase reflection

- Replace obsolete POLLING_ARCHITECTURE.md with DATA_FETCHING_ARCHITECTURE.md
- Rewrite API_NAMING_CONVENTIONS.md with file references instead of code examples
- Condense ARCHITECTURE.md from 482 to 195 lines for clarity
- Update ETAG_IMPLEMENTATION.md to reflect actual implementation
- Update QUERY_PATTERNS.md to reflect completed Phase 5 (nanoid optimistic updates)
- Add PRPs/stories/ to .gitignore

All documentation now references actual files in codebase rather than
embedding potentially stale code examples.


* docs: Update CLAUDE.md and AGENTS.md with current patterns

- Update CLAUDE.md to reference documentation files instead of embedding code
- Replace Service Layer and Error Handling code examples with file references
- Add proper distinction between DATA_FETCHING_ARCHITECTURE and QUERY_PATTERNS docs
- Include ETag implementation reference
- Update environment variables section with .env.example reference


* docs: apply PR review improvements to AI documentation

- Fix punctuation, hyphenation, and grammar issues across all docs
- Add language tags to directory tree code blocks for proper markdown linting
- Clarify TanStack Query integration (not replacing polling, but integrating it)
- Add Cache-Control header documentation and browser vs non-browser fetch behavior
- Reference actual implementation files for polling intervals instead of hardcoding values
- Improve type-safety phrasing and remove line numbers from file references
- Clarify Phase 1 removed manual frontend ETag cache (backend ETags remain)
Wirasm 2025-09-19 13:29:46 +03:00 committed by GitHub
parent 0502d378f0
commit 1b272ed2af
9 changed files with 807 additions and 1197 deletions

1
.gitignore vendored

@ -4,6 +4,7 @@ __pycache__
.claude/settings.local.json
PRPs/local
PRPs/completed/
PRPs/stories/
/logs/
.zed
tmp/

268
AGENTS.md

@ -8,9 +8,13 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
### Core Principles
- **No backwards compatibility** - remove deprecated code immediately
- **No backwards compatibility; we follow a fix-forward approach** - remove deprecated code immediately
- **Detailed errors over graceful failures** - we want to identify and fix issues fast
- **Break things to improve them** - beta is for rapid iteration
- **Continuous improvement** - embrace change and learn from mistakes
- **KISS** - keep it simple
- **DRY** when appropriate
- **YAGNI** — don't implement features that are not needed
### Error Handling
@ -40,51 +44,7 @@ These operations should continue but track and report failures clearly:
#### Critical Nuance: Never Accept Corrupted Data
When a process should continue despite failures, it must **skip the failed item entirely** rather than storing corrupted data:
**❌ WRONG - Silent Corruption:**
```python
try:
embedding = create_embedding(text)
except Exception as e:
embedding = [0.0] * 1536 # NEVER DO THIS - corrupts database
store_document(doc, embedding)
```
**✅ CORRECT - Skip Failed Items:**
```python
try:
embedding = create_embedding(text)
store_document(doc, embedding) # Only store on success
except Exception as e:
failed_items.append({'doc': doc, 'error': str(e)})
logger.error(f"Skipping document {doc.id}: {e}")
# Continue with next document, don't store anything
```
**✅ CORRECT - Batch Processing with Failure Tracking:**
```python
def process_batch(items):
results = {'succeeded': [], 'failed': []}
for item in items:
try:
result = process_item(item)
results['succeeded'].append(result)
except Exception as e:
results['failed'].append({
'item': item,
'error': str(e),
'traceback': traceback.format_exc()
})
logger.error(f"Failed to process {item.id}: {e}")
# Always return both successes and failures
return results
```
When a process should continue despite failures, it must **skip the failed item entirely** rather than storing corrupted data
#### Error Message Guidelines
@ -98,9 +58,10 @@ def process_batch(items):
### Code Quality
- Remove dead code immediately rather than maintaining it - no backward compatibility or legacy functions
- Prioritize functionality over production-ready patterns
- Avoid backward compatibility mappings or legacy function wrappers
- Fix forward
- Focus on user experience and feature completeness
- When updating code, don't reference what is changing (avoid keywords like LEGACY, CHANGED, REMOVED), instead focus on comments that document just the functionality of the code
- When updating code, don't reference what is changing (avoid keywords like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED), instead focus on comments that document just the functionality of the code
- When commenting on code in the codebase, only comment on the functionality and reasoning behind the code. Refrain from speaking to Archon being in "beta" or referencing anything else that comes from these global rules.
## Development Commands
@ -175,139 +136,35 @@ make test-be # Backend tests only
## Architecture Overview
Archon Beta is a microservices-based knowledge management system with MCP (Model Context Protocol) integration:
@PRPs/ai_docs/ARCHITECTURE.md
### Service Architecture
#### TanStack Query Implementation
- **Frontend (port 3737)**: React + TypeScript + Vite + TailwindCSS
- **Dual UI Strategy**:
- `/features` - Modern vertical slice with Radix UI primitives + TanStack Query
- `/components` - Legacy custom components (being migrated)
- **State Management**: TanStack Query for all data fetching (no prop drilling)
- **Styling**: Tron-inspired glassmorphism with Tailwind CSS
- **Linting**: Biome for `/features`, ESLint for legacy code
For architecture and file references:
@PRPs/ai_docs/DATA_FETCHING_ARCHITECTURE.md
- **Main Server (port 8181)**: FastAPI with HTTP polling for updates
- Handles all business logic, database operations, and external API calls
- WebSocket support removed in favor of HTTP polling with ETag caching
- **MCP Server (port 8051)**: Lightweight HTTP-based MCP protocol server
- Provides tools for AI assistants (Claude, Cursor, Windsurf)
- Exposes knowledge search, task management, and project operations
- **Agents Service (port 8052)**: PydanticAI agents for AI/ML operations
- Handles complex AI workflows and document processing
- **Database**: Supabase (PostgreSQL + pgvector for embeddings)
- Cloud or local Supabase both supported
- pgvector for semantic search capabilities
### Frontend Architecture Details
#### Vertical Slice Architecture (/features)
Features are organized by domain hierarchy with self-contained modules:
```
src/features/
├── ui/
│ ├── primitives/ # Radix UI base components
│ ├── hooks/ # Shared UI hooks (useSmartPolling, etc)
│ └── types/ # UI type definitions
├── projects/
│ ├── components/ # Project UI components
│ ├── hooks/ # Project hooks (useProjectQueries, etc)
│ ├── services/ # Project API services
│ ├── types/ # Project type definitions
│ ├── tasks/ # Tasks sub-feature (nested under projects)
│ │ ├── components/
│ │ ├── hooks/ # Task-specific hooks
│ │ ├── services/ # Task API services
│ │ └── types/
│ └── documents/ # Documents sub-feature
│ ├── components/
│ ├── services/
│ └── types/
```
#### TanStack Query Patterns
All data fetching uses TanStack Query with consistent patterns:
```typescript
// Query keys factory pattern
export const projectKeys = {
all: ["projects"] as const,
lists: () => [...projectKeys.all, "list"] as const,
detail: (id: string) => [...projectKeys.all, "detail", id] as const,
};
// Smart polling with visibility awareness
const { refetchInterval } = useSmartPolling(10000); // Pauses when tab inactive
// Optimistic updates with rollback
useMutation({
onMutate: async (data) => {
await queryClient.cancelQueries(key);
const previous = queryClient.getQueryData(key);
queryClient.setQueryData(key, optimisticData);
return { previous };
},
onError: (err, vars, context) => {
if (context?.previous) {
queryClient.setQueryData(key, context.previous);
}
},
});
```
### Backend Architecture Details
For code patterns and examples:
@PRPs/ai_docs/QUERY_PATTERNS.md
#### Service Layer Pattern
```python
# API Route -> Service -> Database
# src/server/api_routes/projects.py
@router.get("/{project_id}")
async def get_project(project_id: str):
return await project_service.get_project(project_id)
See implementation examples:
# src/server/services/project_service.py
async def get_project(project_id: str):
# Business logic here
return await db.fetch_project(project_id)
```
- API routes: `python/src/server/api_routes/projects_api.py`
- Service layer: `python/src/server/services/project_service.py`
- Pattern: API Route → Service → Database
#### Error Handling Patterns
```python
# Use specific exceptions
class ProjectNotFoundError(Exception): pass
class ValidationError(Exception): pass
See implementation examples:
# Rich error responses
@app.exception_handler(ProjectNotFoundError)
async def handle_not_found(request, exc):
return JSONResponse(
status_code=404,
content={"detail": str(exc), "type": "not_found"}
)
```
- Custom exceptions: `python/src/server/exceptions.py`
- Exception handlers: `python/src/server/main.py` (search for @app.exception_handler)
- Service error handling: `python/src/server/services/` (various services)
## Polling Architecture
## ETag Implementation
### HTTP Polling (replaced Socket.IO)
- **Polling intervals**: 1-2s for active operations, 5-10s for background data
- **ETag caching**: Reduces bandwidth by ~70% via 304 Not Modified responses
- **Smart pausing**: Stops polling when browser tab is inactive
- **Progress endpoints**: `/api/progress/{id}` for operation tracking
### Key Polling Hooks
- `useSmartPolling` - Adjusts interval based on page visibility/focus
- `useCrawlProgressPolling` - Specialized for crawl progress with auto-cleanup
- `useProjectTasks` - Smart polling for task lists
@PRPs/ai_docs/ETAG_IMPLEMENTATION.md
## Database Schema
@ -327,25 +184,9 @@ Key tables in Supabase:
## API Naming Conventions
### Task Status Values
@PRPs/ai_docs/API_NAMING_CONVENTIONS.md
Use database values directly (no UI mapping):
- `todo`, `doing`, `review`, `done`
### Service Method Patterns
- `get[Resource]sByProject(projectId)` - Scoped queries
- `get[Resource](id)` - Single resource
- `create[Resource](data)` - Create operations
- `update[Resource](id, updates)` - Updates
- `delete[Resource](id)` - Soft deletes
### State Naming
- `is[Action]ing` - Loading states (e.g., `isSwitchingProject`)
- `[resource]Error` - Error messages
- `selected[Resource]` - Current selection
Use database values directly (no mapping in the FE; type-safe from the BE up):
## Environment Variables
@ -356,15 +197,8 @@ SUPABASE_URL=https://your-project.supabase.co # Or http://host.docker.internal:
SUPABASE_SERVICE_KEY=your-service-key-here # Use legacy key format for cloud Supabase
```
Optional:
```bash
LOGFIRE_TOKEN=your-logfire-token # For observability
LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR
ARCHON_SERVER_PORT=8181 # Server port
ARCHON_MCP_PORT=8051 # MCP server port
ARCHON_UI_PORT=3737 # Frontend port
```
Optional variables and full configuration:
See `python/.env.example` for complete list
## Common Development Tasks
@ -383,6 +217,14 @@ ARCHON_UI_PORT=3737 # Frontend port
4. Use TanStack Query hook from `src/features/[feature]/hooks/`
5. Apply Tron-inspired glassmorphism styling with Tailwind
### Add or modify MCP tools
1. MCP tools are in `python/src/mcp_server/features/[feature]/[feature]_tools.py`
2. Follow the pattern:
- `find_[resource]` - Handles list, search, and get single item operations
- `manage_[resource]` - Handles create, update, delete with an "action" parameter
3. Register tools in the feature's `__init__.py` file
### Debug MCP connection issues
1. Check MCP health: `curl http://localhost:8051/health`
@ -421,21 +263,37 @@ npm run lint:files src/components/SomeComponent.tsx
## MCP Tools Available
When connected to Client/Cursor/Windsurf:
When connected to Claude/Cursor/Windsurf, the following tools are available:
- `archon:perform_rag_query` - Search knowledge base
- `archon:search_code_examples` - Find code snippets
- `archon:create_project` - Create new project
- `archon:list_projects` - List all projects
- `archon:create_task` - Create task in project
- `archon:list_tasks` - List and filter tasks
- `archon:update_task` - Update task status/details
- `archon:get_available_sources` - List knowledge sources
### Knowledge Base Tools
- `archon:rag_search_knowledge_base` - Search knowledge base for relevant content
- `archon:rag_search_code_examples` - Find code snippets in the knowledge base
- `archon:rag_get_available_sources` - List available knowledge sources
### Project Management
- `archon:find_projects` - Find all projects, search, or get specific project (by project_id)
- `archon:manage_project` - Manage projects with actions: "create", "update", "delete"
### Task Management
- `archon:find_tasks` - Find tasks with search, filters, or get specific task (by task_id)
- `archon:manage_task` - Manage tasks with actions: "create", "update", "delete"
### Document Management
- `archon:find_documents` - Find documents, search, or get specific document (by document_id)
- `archon:manage_document` - Manage documents with actions: "create", "update", "delete"
### Version Control
- `archon:find_versions` - Find version history or get specific version
- `archon:manage_version` - Manage versions with actions: "create", "restore"
## Important Notes
- Projects feature is optional - toggle in Settings UI
- All services communicate via HTTP, not gRPC
- HTTP polling handles all updates
- Frontend uses Vite proxy for API calls in development
- Python backend uses `uv` for dependency management

236
CLAUDE.md

@ -8,9 +8,13 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
### Core Principles
- **No backwards compatibility** - remove deprecated code immediately
- **No backwards compatibility; we follow a fix-forward approach** - remove deprecated code immediately
- **Detailed errors over graceful failures** - we want to identify and fix issues fast
- **Break things to improve them** - beta is for rapid iteration
- **Continuous improvement** - embrace change and learn from mistakes
- **KISS** - keep it simple
- **DRY** when appropriate
- **YAGNI** — don't implement features that are not needed
### Error Handling
@ -40,51 +44,7 @@ These operations should continue but track and report failures clearly:
#### Critical Nuance: Never Accept Corrupted Data
When a process should continue despite failures, it must **skip the failed item entirely** rather than storing corrupted data:
**❌ WRONG - Silent Corruption:**
```python
try:
embedding = create_embedding(text)
except Exception as e:
embedding = [0.0] * 1536 # NEVER DO THIS - corrupts database
store_document(doc, embedding)
```
**✅ CORRECT - Skip Failed Items:**
```python
try:
embedding = create_embedding(text)
store_document(doc, embedding) # Only store on success
except Exception as e:
failed_items.append({'doc': doc, 'error': str(e)})
logger.error(f"Skipping document {doc.id}: {e}")
# Continue with next document, don't store anything
```
**✅ CORRECT - Batch Processing with Failure Tracking:**
```python
def process_batch(items):
results = {'succeeded': [], 'failed': []}
for item in items:
try:
result = process_item(item)
results['succeeded'].append(result)
except Exception as e:
results['failed'].append({
'item': item,
'error': str(e),
'traceback': traceback.format_exc()
})
logger.error(f"Failed to process {item.id}: {e}")
# Always return both successes and failures
return results
```
When a process should continue despite failures, it must **skip the failed item entirely** rather than storing corrupted data
#### Error Message Guidelines
@ -99,9 +59,9 @@ def process_batch(items):
- Remove dead code immediately rather than maintaining it - no backward compatibility or legacy functions
- Avoid backward compatibility mappings or legacy function wrappers
- Prioritize functionality over production-ready patterns
- Fix forward
- Focus on user experience and feature completeness
- When updating code, don't reference what is changing (avoid keywords like LEGACY, CHANGED, REMOVED), instead focus on comments that document just the functionality of the code
- When updating code, don't reference what is changing (avoid keywords like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED), instead focus on comments that document just the functionality of the code
- When commenting on code in the codebase, only comment on the functionality and reasoning behind the code. Refrain from speaking to Archon being in "beta" or referencing anything else that comes from these global rules.
## Development Commands
@ -176,139 +136,33 @@ make test-be # Backend tests only
## Architecture Overview
Archon Beta is a microservices-based knowledge management system with MCP (Model Context Protocol) integration:
@PRPs/ai_docs/ARCHITECTURE.md
### Service Architecture
#### TanStack Query Implementation
- **Frontend (port 3737)**: React + TypeScript + Vite + TailwindCSS
- **Dual UI Strategy**:
- `/features` - Modern vertical slice with Radix UI primitives + TanStack Query
- `/components` - Legacy custom components (being migrated)
- **State Management**: TanStack Query for all data fetching (no prop drilling)
- **Styling**: Tron-inspired glassmorphism with Tailwind CSS
- **Linting**: Biome for `/features`, ESLint for legacy code
For architecture and file references:
@PRPs/ai_docs/DATA_FETCHING_ARCHITECTURE.md
- **Main Server (port 8181)**: FastAPI with HTTP polling for updates
- Handles all business logic, database operations, and external API calls
- WebSocket support removed in favor of HTTP polling with ETag caching
- **MCP Server (port 8051)**: Lightweight HTTP-based MCP protocol server
- Provides tools for AI assistants (Claude, Cursor, Windsurf)
- Exposes knowledge search, task management, and project operations
- **Agents Service (port 8052)**: PydanticAI agents for AI/ML operations
- Handles complex AI workflows and document processing
- **Database**: Supabase (PostgreSQL + pgvector for embeddings)
- Cloud or local Supabase both supported
- pgvector for semantic search capabilities
### Frontend Architecture Details
#### Vertical Slice Architecture (/features)
Features are organized by domain hierarchy with self-contained modules:
```
src/features/
├── ui/
│ ├── primitives/ # Radix UI base components
│ ├── hooks/ # Shared UI hooks (useSmartPolling, etc)
│ └── types/ # UI type definitions
├── projects/
│ ├── components/ # Project UI components
│ ├── hooks/ # Project hooks (useProjectQueries, etc)
│ ├── services/ # Project API services
│ ├── types/ # Project type definitions
│ ├── tasks/ # Tasks sub-feature (nested under projects)
│ │ ├── components/
│ │ ├── hooks/ # Task-specific hooks
│ │ ├── services/ # Task API services
│ │ └── types/
│ └── documents/ # Documents sub-feature
│ ├── components/
│ ├── services/
│ └── types/
```
#### TanStack Query Patterns
All data fetching uses TanStack Query with consistent patterns:
```typescript
// Query keys factory pattern
export const projectKeys = {
all: ["projects"] as const,
lists: () => [...projectKeys.all, "list"] as const,
detail: (id: string) => [...projectKeys.all, "detail", id] as const,
};
// Smart polling with visibility awareness
const { refetchInterval } = useSmartPolling(10000); // Pauses when tab inactive
// Optimistic updates with rollback
useMutation({
onMutate: async (data) => {
await queryClient.cancelQueries(key);
const previous = queryClient.getQueryData(key);
queryClient.setQueryData(key, optimisticData);
return { previous };
},
onError: (err, vars, context) => {
if (context?.previous) {
queryClient.setQueryData(key, context.previous);
}
},
});
```
### Backend Architecture Details
For code patterns and examples:
@PRPs/ai_docs/QUERY_PATTERNS.md
#### Service Layer Pattern
```python
# API Route -> Service -> Database
# src/server/api_routes/projects.py
@router.get("/{project_id}")
async def get_project(project_id: str):
return await project_service.get_project(project_id)
# src/server/services/project_service.py
async def get_project(project_id: str):
# Business logic here
return await db.fetch_project(project_id)
```
See implementation examples:
- API routes: `python/src/server/api_routes/projects_api.py`
- Service layer: `python/src/server/services/project_service.py`
- Pattern: API Route → Service → Database
#### Error Handling Patterns
```python
# Use specific exceptions
class ProjectNotFoundError(Exception): pass
class ValidationError(Exception): pass
See implementation examples:
- Custom exceptions: `python/src/server/exceptions.py`
- Exception handlers: `python/src/server/main.py` (search for @app.exception_handler)
- Service error handling: `python/src/server/services/` (various services)
# Rich error responses
@app.exception_handler(ProjectNotFoundError)
async def handle_not_found(request, exc):
return JSONResponse(
status_code=404,
content={"detail": str(exc), "type": "not_found"}
)
```
## ETag Implementation
## Polling Architecture
### HTTP Polling (replaced Socket.IO)
- **Polling intervals**: 1-2s for active operations, 5-10s for background data
- **ETag caching**: Reduces bandwidth by ~70% via 304 Not Modified responses
- **Smart pausing**: Stops polling when browser tab is inactive
- **Progress endpoints**: `/api/progress/{id}` for operation tracking
### Key Polling Hooks
- `useSmartPolling` - Adjusts interval based on page visibility/focus
- `useCrawlProgressPolling` - Specialized for crawl progress with auto-cleanup
- `useProjectTasks` - Smart polling for task lists
@PRPs/ai_docs/ETAG_IMPLEMENTATION.md
## Database Schema
@ -328,25 +182,9 @@ Key tables in Supabase:
## API Naming Conventions
### Task Status Values
@PRPs/ai_docs/API_NAMING_CONVENTIONS.md
Use database values directly (no UI mapping):
- `todo`, `doing`, `review`, `done`
### Service Method Patterns
- `get[Resource]sByProject(projectId)` - Scoped queries
- `get[Resource](id)` - Single resource
- `create[Resource](data)` - Create operations
- `update[Resource](id, updates)` - Updates
- `delete[Resource](id)` - Soft deletes
### State Naming
- `is[Action]ing` - Loading states (e.g., `isSwitchingProject`)
- `[resource]Error` - Error messages
- `selected[Resource]` - Current selection
Use database values directly (no FE mapping; type-safe end-to-end from BE upward):
## Environment Variables
@ -357,15 +195,8 @@ SUPABASE_URL=https://your-project.supabase.co # Or http://host.docker.internal:
SUPABASE_SERVICE_KEY=your-service-key-here # Use legacy key format for cloud Supabase
```
Optional:
```bash
LOGFIRE_TOKEN=your-logfire-token # For observability
LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR
ARCHON_SERVER_PORT=8181 # Server port
ARCHON_MCP_PORT=8051 # MCP server port
ARCHON_UI_PORT=3737 # Frontend port
```
Optional variables and full configuration:
See `python/.env.example` for complete list
## Common Development Tasks
@ -390,8 +221,7 @@ ARCHON_UI_PORT=3737 # Frontend port
2. Follow the pattern:
- `find_[resource]` - Handles list, search, and get single item operations
- `manage_[resource]` - Handles create, update, delete with an "action" parameter
3. Optimize responses by truncating/filtering fields in list operations
4. Register tools in the feature's `__init__.py` file
3. Register tools in the feature's `__init__.py` file
### Debug MCP connection issues
@ -434,31 +264,35 @@ npm run lint:files src/components/SomeComponent.tsx
When connected to Claude/Cursor/Windsurf, the following tools are available:
### Knowledge Base Tools
- `archon:rag_search_knowledge_base` - Search knowledge base for relevant content
- `archon:rag_search_code_examples` - Find code snippets in the knowledge base
- `archon:rag_get_available_sources` - List available knowledge sources
### Project Management
- `archon:find_projects` - Find all projects, search, or get specific project (by project_id)
- `archon:manage_project` - Manage projects with actions: "create", "update", "delete"
### Task Management
- `archon:find_tasks` - Find tasks with search, filters, or get specific task (by task_id)
- `archon:manage_task` - Manage tasks with actions: "create", "update", "delete"
### Document Management
- `archon:find_documents` - Find documents, search, or get specific document (by document_id)
- `archon:manage_document` - Manage documents with actions: "create", "update", "delete"
### Version Control
- `archon:find_versions` - Find version history or get specific version
- `archon:manage_version` - Manage versions with actions: "create", "restore"
## Important Notes
- Projects feature is optional - toggle in Settings UI
- All services communicate via HTTP, not gRPC
- HTTP polling handles all updates
- TanStack Query handles all data fetching; smart HTTP polling is used where appropriate (no WebSockets)
- Frontend uses Vite proxy for API calls in development
- Python backend uses `uv` for dependency management
- Docker Compose handles service orchestration


@ -1,164 +1,249 @@
# API Naming Conventions
## Overview
This document defines the naming conventions used throughout the Archon V2 codebase for consistency and clarity.
## Task Status Values
**Database values only - no UI mapping:**
- `todo` - Task is in backlog/todo state
- `doing` - Task is actively being worked on
- `review` - Task is pending review
- `done` - Task is completed
This document describes the actual naming conventions used throughout Archon's codebase based on current implementation patterns. All examples reference real files where these patterns are implemented.
## Service Method Naming
## Backend API Endpoints
### Project Service (`projectService.ts`)
### RESTful Route Patterns
**Reference**: `python/src/server/api_routes/projects_api.py`
#### Projects
Standard REST patterns used:
- `GET /api/{resource}` - List all resources
- `POST /api/{resource}` - Create new resource
- `GET /api/{resource}/{id}` - Get single resource
- `PUT /api/{resource}/{id}` - Update resource
- `DELETE /api/{resource}/{id}` - Delete resource
Nested resource patterns:
- `GET /api/projects/{project_id}/tasks` - Tasks scoped to project
- `GET /api/projects/{project_id}/docs` - Documents scoped to project
- `POST /api/projects/{project_id}/versions` - Create version for project
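The route shapes listed above can be summarized with a small path builder. This is a minimal sketch for illustration, not code from the Archon repository: `apiPaths` and its method names are hypothetical, and only the `/api/...` path shapes are taken from the patterns above.

```typescript
// Hypothetical helper illustrating the route conventions; `apiPaths` is an
// assumption made for this sketch and is not part of the Archon codebase.
const apiPaths = {
  // GET (list) and POST (create) share the collection path.
  collection: (resource: string): string => `/api/${resource}`,
  // GET, PUT, and DELETE for a single resource share the item path.
  item: (resource: string, id: string): string => `/api/${resource}/${id}`,
  // Child resources (tasks, docs, versions) are scoped under their project.
  nested: (projectId: string, child: string): string =>
    `/api/projects/${projectId}/${child}`,
};

console.log(apiPaths.collection("projects"));
console.log(apiPaths.item("projects", "p1"));
console.log(apiPaths.nested("p1", "tasks"));
```

Keeping path construction in one place like this is only one possible design; the referenced route and service files remain the source of truth for the real paths.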
### Actual Endpoint Examples
From `python/src/server/api_routes/`:
**Projects** (`projects_api.py`):
- `/api/projects` - Project CRUD
- `/api/projects/{project_id}/features` - Get project features
- `/api/projects/{project_id}/tasks` - Project-scoped tasks
- `/api/projects/{project_id}/docs` - Project documents
- `/api/projects/{project_id}/versions` - Version history
**Knowledge** (`knowledge_api.py`):
- `/api/knowledge/sources` - Knowledge sources
- `/api/knowledge/crawl` - Start web crawl
- `/api/knowledge/upload` - Upload document
- `/api/knowledge/search` - RAG search
- `/api/knowledge/code-search` - Code-specific search
**Progress** (`progress_api.py`):
- `/api/progress/active` - Active operations
- `/api/progress/{operation_id}` - Specific operation status
**MCP** (`mcp_api.py`):
- `/api/mcp/status` - MCP server status
- `/api/mcp/execute` - Execute MCP tool
## Frontend Service Methods
### Service Object Pattern
**Reference**: `archon-ui-main/src/features/projects/services/projectService.ts`
Services are exported as objects with async methods:
```typescript
export const serviceNameService = {
async methodName(): Promise<ReturnType> { ... }
}
```
### Standard Service Method Names
Actual patterns from service files:
**List Operations**:
- `listProjects()` - Get all projects
- `getProject(projectId)` - Get single project by ID
- `createProject(projectData)` - Create new project
- `updateProject(projectId, updates)` - Update project
- `deleteProject(projectId)` - Delete project
- `getTasksByProject(projectId)` - Get filtered list
- `getTasksByStatus(status)` - Get by specific criteria
#### Tasks
- `getTasksByProject(projectId)` - Get all tasks for a specific project
- `getTask(taskId)` - Get single task by ID
- `createTask(taskData)` - Create new task
- `updateTask(taskId, updates)` - Update task with partial data
- `updateTaskStatus(taskId, status)` - Update only task status
- `updateTaskOrder(taskId, newOrder, newStatus?)` - Update task position/order
- `deleteTask(taskId)` - Delete task (soft delete/archive)
- `getTasksByStatus(status)` - Get all tasks with specific status
**Single Item Operations**:
- `getProject(projectId)` - Get single item
- `getTask(taskId)` - Direct ID access
#### Documents
- `getDocuments(projectId)` - Get all documents for project
- `getDocument(projectId, docId)` - Get single document
- `createDocument(projectId, documentData)` - Create document
- `updateDocument(projectId, docId, updates)` - Update document
- `deleteDocument(projectId, docId)` - Delete document
**Create Operations**:
- `createProject(data)` - Returns created entity
- `createTask(data)` - Includes server-generated fields
#### Versions
- `createVersion(projectId, field, content)` - Create version snapshot
- `listVersions(projectId, fieldName?)` - List version history
- `getVersion(projectId, fieldName, versionNumber)` - Get specific version
- `restoreVersion(projectId, fieldName, versionNumber)` - Restore version
**Update Operations**:
- `updateProject(id, updates)` - Partial updates
- `updateTaskStatus(id, status)` - Specific field update
- `updateTaskOrder(id, order, status?)` - Complex updates
## API Endpoint Patterns
**Delete Operations**:
- `deleteProject(id)` - Returns void
- `deleteTask(id)` - Soft delete pattern
### RESTful Endpoints
```
GET /api/projects - List all projects
POST /api/projects - Create project
GET /api/projects/{project_id} - Get project
PUT /api/projects/{project_id} - Update project
DELETE /api/projects/{project_id} - Delete project
### Service File Locations
- **Projects**: `archon-ui-main/src/features/projects/services/projectService.ts`
- **Tasks**: `archon-ui-main/src/features/projects/tasks/services/taskService.ts`
- **Knowledge**: `archon-ui-main/src/features/knowledge/services/knowledgeService.ts`
- **Progress**: `archon-ui-main/src/features/progress/services/progressService.ts`
GET /api/projects/{project_id}/tasks - Get project tasks
POST /api/tasks - Create task (project_id in body)
GET /api/tasks/{task_id} - Get task
PUT /api/tasks/{task_id} - Update task
DELETE /api/tasks/{task_id} - Delete task
## React Hook Naming
GET /api/projects/{project_id}/docs - Get project documents
POST /api/projects/{project_id}/docs - Create document
GET /api/projects/{project_id}/docs/{doc_id} - Get document
PUT /api/projects/{project_id}/docs/{doc_id} - Update document
DELETE /api/projects/{project_id}/docs/{doc_id} - Delete document
```
### Query Hooks
**Reference**: `archon-ui-main/src/features/projects/tasks/hooks/useTaskQueries.ts`
### Progress/Polling Endpoints
```
GET /api/progress/{operation_id} - Generic operation progress
GET /api/knowledge/crawl-progress/{id} - Crawling progress
GET /api/agent-chat/sessions/{id}/messages - Chat messages
```
Standard patterns:
- `use[Resource]()` - List query (e.g., `useProjects`)
- `use[Resource]Detail(id)` - Single item query
- `use[Parent][Resource](parentId)` - Scoped query (e.g., `useProjectTasks`)
### Mutation Hooks
- `useCreate[Resource]()` - Creation mutation
- `useUpdate[Resource]()` - Update mutation
- `useDelete[Resource]()` - Deletion mutation
### Utility Hooks
**Reference**: `archon-ui-main/src/features/ui/hooks/`
- `useSmartPolling()` - Visibility-aware polling
- `useToast()` - Toast notifications
- `useDebounce()` - Debounced values
## Type Naming Conventions
### Type Definition Patterns
**Reference**: `archon-ui-main/src/features/projects/types/`
**Entity Types**:
- `Project` - Core entity type
- `Task` - Business object
- `Document` - Data model
**Request/Response Types**:
- `Create[Entity]Request` - Creation payload
- `Update[Entity]Request` - Update payload
- `[Entity]Response` - API response wrapper
**Database Types**:
- `DatabaseTaskStatus` - Exact database values
**Location**: `archon-ui-main/src/features/projects/tasks/types/task.ts`
Values: `"todo" | "doing" | "review" | "done"`
### Type File Organization
Following vertical slice architecture:
- Core types in `{feature}/types/`
- Sub-feature types in `{feature}/{subfeature}/types/`
- Shared types in `shared/types/`
## Query Key Factories
**Reference**: Each feature's `hooks/use{Feature}Queries.ts` file
Standard factory pattern:
- `{resource}Keys.all` - Base key for invalidation
- `{resource}Keys.lists()` - List queries
- `{resource}Keys.detail(id)` - Single item queries
- `{resource}Keys.byProject(projectId)` - Scoped queries
Examples:
- `projectKeys` - Projects domain
- `taskKeys` - Tasks (dual nature: global and project-scoped)
- `knowledgeKeys` - Knowledge base
- `progressKeys` - Progress tracking
- `documentKeys` - Document management
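A minimal sketch of a key factory following the pattern above. The key shapes (for example `["tasks", "detail", id]`) and the `startsWithKey` helper are assumptions for illustration; the actual arrays in each `use{Feature}Queries.ts` file may differ.

```typescript
// Hypothetical key factory; the real key shapes live in useTaskQueries.ts and may differ.
const taskKeys = {
  all: ["tasks"] as const,
  lists: () => [...taskKeys.all, "list"] as const,
  detail: (id: string) => [...taskKeys.all, "detail", id] as const,
  byProject: (projectId: string) => ["projects", projectId, "tasks"] as const,
};

// Simplified prefix check: true when `key` begins with every element of `prefix`.
// This is the idea behind invalidating a base key such as `taskKeys.all`.
function startsWithKey(key: readonly unknown[], prefix: readonly unknown[]): boolean {
  return prefix.every((part, index) => key[index] === part);
}
```

In this sketch the project-scoped key does not share the `taskKeys.all` prefix, which is one way the "dual nature" of task keys could show up: a global invalidation and a project-scoped one would then be separate operations.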
## Component Naming
### Hooks
- `use[Feature]` - Custom hooks (e.g., `usePolling`, `useProjectMutation`)
- Returns object with: `{ data, isLoading, error, refetch }`
### Page Components
**Location**: `archon-ui-main/src/pages/`
- `[Feature]Page.tsx` - Top-level pages
- `[Feature]View.tsx` - Main view components
### Services
- `[feature]Service` - Service modules (e.g., `projectService`, `crawlProgressService`)
- Methods return Promises with typed responses
### Feature Components
**Location**: `archon-ui-main/src/features/{feature}/components/`
- `[Entity]Card.tsx` - Card displays
- `[Entity]List.tsx` - List containers
- `[Entity]Form.tsx` - Form components
- `New[Entity]Modal.tsx` - Creation modals
- `Edit[Entity]Modal.tsx` - Edit modals
### Components
- `[Feature][Type]` - UI components (e.g., `TaskBoardView`, `EditTaskModal`)
- Props interfaces: `[Component]Props`
### Shared Components
**Location**: `archon-ui-main/src/features/ui/primitives/`
- Radix UI-based primitives
- Generic, reusable components
## State Variable Naming
### Loading States
- `isLoading[Feature]` - Boolean loading indicators
- `isSwitchingProject` - Specific operation states
- `movingTaskIds` - Set/Array of items being processed
**Examples from**: `archon-ui-main/src/features/projects/views/ProjectsView.tsx`
- `isLoading` - Generic loading
- `is[Action]ing` - Specific operations (e.g., `isSwitchingProject`)
- `[action]ingIds` - Sets of items being processed
### Error States
- `[feature]Error` - Error message strings
- `taskOperationError` - Specific operation errors
- `error` - Query errors
- `[operation]Error` - Specific operation errors
### Data States
- `[feature]s` - Plural for collections (e.g., `tasks`, `projects`)
- `selected[Feature]` - Currently selected item
- `[feature]Data` - Raw data from API
### Selection States
- `selected[Entity]` - Currently selected item
- `active[Entity]Id` - Active item ID
## Type Definitions
## Constants and Enums
### Database Types (from backend)
```typescript
type DatabaseTaskStatus = 'todo' | 'doing' | 'review' | 'done';
type Assignee = string; // Flexible string to support any agent name
// Common values: 'User', 'Archon', 'Coding Agent'
```
### Status Values
**Location**: `archon-ui-main/src/features/projects/tasks/types/task.ts`
Database values used directly - no mapping layers:
- Task statuses: `"todo"`, `"doing"`, `"review"`, `"done"`
- Operation statuses: `"pending"`, `"processing"`, `"completed"`, `"failed"`
### Request/Response Types
```typescript
Create[Feature]Request // e.g., CreateTaskRequest
Update[Feature]Request // e.g., UpdateTaskRequest
[Feature]Response // e.g., TaskResponse
```
### Time Constants
**Location**: `archon-ui-main/src/features/shared/queryPatterns.ts`
- `STALE_TIMES.instant` - 0ms
- `STALE_TIMES.realtime` - 3 seconds
- `STALE_TIMES.frequent` - 5 seconds
- `STALE_TIMES.normal` - 30 seconds
- `STALE_TIMES.rare` - 5 minutes
- `STALE_TIMES.static` - Infinity
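As a rough sketch, the constants listed above could be declared like this. The values come from the list; the declaration style and the `staleTimeFor` helper are assumptions, not the contents of `queryPatterns.ts`.

```typescript
// Values mirror the list above; the exact declaration in queryPatterns.ts may differ.
const STALE_TIMES = {
  instant: 0,
  realtime: 3_000,
  frequent: 5_000,
  normal: 30_000,
  rare: 300_000,
  static: Infinity,
} as const;

type StaleTimeName = keyof typeof STALE_TIMES;

// Hypothetical helper: look up a stale time by name instead of hardcoding milliseconds.
function staleTimeFor(name: StaleTimeName): number {
  return STALE_TIMES[name];
}
```

Referencing a named constant keeps cache durations consistent across features and avoids scattering raw millisecond values through query hooks.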
## Function Naming Patterns
## File Naming Patterns
### Event Handlers
- `handle[Event]` - Generic handlers (e.g., `handleProjectSelect`)
- `on[Event]` - Props callbacks (e.g., `onTaskMove`, `onRefresh`)
### Service Layer
- `{feature}Service.ts` - Service modules
- Use lower camelCase with "Service" suffix (e.g., `projectService.ts`)
### Operations
- `load[Feature]` - Fetch data (e.g., `loadTasksForProject`)
- `save[Feature]` - Persist changes (e.g., `saveTask`)
- `delete[Feature]` - Remove items (e.g., `deleteTask`)
- `refresh[Feature]` - Reload data (e.g., `refreshTasks`)
### Hook Files
- `use{Feature}Queries.ts` - Query hooks and keys
- `use{Feature}.ts` - Feature-specific hooks
### Formatting/Transformation
- `format[Feature]` - Format for display (e.g., `formatTask`)
- `validate[Feature]` - Validate data (e.g., `validateUpdateTask`)
### Type Files
- `index.ts` - Barrel exports
- `{entity}.ts` - Specific entity types
### Test Files
- `{filename}.test.ts` - Unit tests
- Located in `tests/` subdirectories
## Best Practices
### ✅ Do Use
- `getTasksByProject(projectId)` - Clear scope with context
- `status` - Single source of truth from database
- Direct database values everywhere (no mapping)
- Polling with `usePolling` hook for data fetching
- Async/await with proper error handling
- ETag headers for efficient polling
- Loading indicators during operations
### Do Follow
- Use exact database values (no translation layers)
- Keep consistent patterns within features
- Use query key factories for all cache operations
- Follow vertical slice architecture
- Reference shared constants
## Current Architecture Patterns
### Don't Do
- Don't create mapping layers for database values
- Don't hardcode time values
- Don't mix query keys between features
- Don't use inconsistent naming within a feature
- Don't embed business logic in components
### Polling & Data Fetching
- HTTP polling with `usePolling` and `useCrawlProgressPolling` hooks
- ETag-based caching for bandwidth efficiency
- Loading state indicators (`isLoading`, `isSwitchingProject`)
- Error toast notifications for user feedback
- Manual refresh triggers via `refetch()`
- Immediate UI updates followed by API calls
## Common Patterns Reference
### Service Architecture
- Specialized services for different domains (`projectService`, `crawlProgressService`)
- Direct database value usage (no UI/DB mapping)
- Promise-based async operations
- Typed request/response interfaces
For implementation examples, see:
- Query patterns: Any `use{Feature}Queries.ts` file
- Service patterns: Any `{feature}Service.ts` file
- Type patterns: Any `{feature}/types/` directory
- Component patterns: Any `{feature}/components/` directory


@ -2,480 +2,194 @@
## Overview
Archon follows a **Vertical Slice Architecture** pattern where features are organized by business capability rather than technical layers. Each module is self-contained with its own API, business logic, and data access, making the system modular, maintainable, and ready for future microservice extraction if needed.
Archon is a knowledge management system with AI capabilities, built as a monolithic application with vertical slice organization. The frontend uses React with TanStack Query, while the backend runs FastAPI with multiple service components.
## Core Principles
## Tech Stack
1. **Feature Cohesion**: All code for a feature lives together
2. **Module Independence**: Modules communicate through well-defined interfaces
3. **Vertical Slices**: Each feature contains its complete stack (API → Service → Repository)
4. **Shared Minimal**: Only truly cross-cutting concerns go in shared
5. **Migration Ready**: Structure supports easy extraction to microservices
**Frontend**: React 18, TypeScript 5, TanStack Query v5, Tailwind CSS, Vite
**Backend**: Python 3.12, FastAPI, Supabase, PydanticAI
**Infrastructure**: Docker, PostgreSQL + pgvector
## Directory Structure
```
archon/
├── python/
│ ├── src/
│ │ ├── knowledge/ # Knowledge Management Module
│ │ │ ├── __init__.py
│ │ │ ├── main.py # Knowledge module entry point
│ │ │ ├── shared/ # Shared within knowledge context
│ │ │ │ ├── models.py
│ │ │ │ ├── exceptions.py
│ │ │ │ └── utils.py
│ │ │ └── features/ # Knowledge feature slices
│ │ │ ├── crawling/ # Web crawling feature
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Crawl endpoints
│ │ │ │ ├── service.py # Crawling orchestration
│ │ │ │ ├── models.py # Crawl-specific models
│ │ │ │ ├── repository.py # Crawl data storage
│ │ │ │ └── tests/
│ │ │ ├── document_processing/ # Document upload & processing
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Upload endpoints
│ │ │ │ ├── service.py # PDF/DOCX processing
│ │ │ │ ├── extractors.py # Text extraction
│ │ │ │ └── tests/
│ │ │ ├── embeddings/ # Vector embeddings
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Embedding endpoints
│ │ │ │ ├── service.py # OpenAI/local embeddings
│ │ │ │ ├── models.py
│ │ │ │ └── repository.py # Vector storage
│ │ │ ├── search/ # RAG search
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Search endpoints
│ │ │ │ ├── service.py # Search algorithms
│ │ │ │ ├── reranker.py # Result reranking
│ │ │ │ └── tests/
│ │ │ ├── code_extraction/ # Code snippet extraction
│ │ │ │ ├── __init__.py
│ │ │ │ ├── service.py # Code parsing
│ │ │ │ ├── analyzers.py # Language detection
│ │ │ │ └── repository.py
│ │ │ └── source_management/ # Knowledge source CRUD
│ │ │ ├── __init__.py
│ │ │ ├── api.py
│ │ │ ├── service.py
│ │ │ └── repository.py
│ │ │
│ │ ├── projects/ # Project Management Module
│ │ │ ├── __init__.py
│ │ │ ├── main.py # Projects module entry point
│ │ │ ├── shared/ # Shared within projects context
│ │ │ │ ├── database.py # Project DB utilities
│ │ │ │ ├── models.py # Shared project models
│ │ │ │ └── exceptions.py # Project-specific exceptions
│ │ │ └── features/ # Project feature slices
│ │ │ ├── project_management/ # Project CRUD
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Project endpoints
│ │ │ │ ├── service.py # Project business logic
│ │ │ │ ├── models.py # Project models
│ │ │ │ ├── repository.py # Project DB operations
│ │ │ │ └── tests/
│ │ │ ├── task_management/ # Task CRUD
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Task endpoints
│ │ │ │ ├── service.py # Task business logic
│ │ │ │ ├── models.py # Task models
│ │ │ │ ├── repository.py # Task DB operations
│ │ │ │ └── tests/
│ │ │ ├── task_ordering/ # Drag-and-drop reordering
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Reorder endpoints
│ │ │ │ ├── service.py # Reordering algorithm
│ │ │ │ └── tests/
│ │ │ ├── document_management/ # Project documents
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Document endpoints
│ │ │ │ ├── service.py # Document logic
│ │ │ │ ├── models.py
│ │ │ │ └── repository.py
│ │ │ ├── document_versioning/ # Version control
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Version endpoints
│ │ │ │ ├── service.py # Versioning logic
│ │ │ │ ├── models.py # Version models
│ │ │ │ └── repository.py # Version storage
│ │ │ ├── ai_generation/ # AI project creation
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Generate endpoints
│ │ │ │ ├── service.py # AI orchestration
│ │ │ │ ├── agents.py # Agent interactions
│ │ │ │ ├── progress.py # Progress tracking
│ │ │ │ └── prompts.py # Generation prompts
│ │ │ ├── source_linking/ # Link to knowledge base
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Link endpoints
│ │ │ │ ├── service.py # Linking logic
│ │ │ │ └── repository.py # Junction table ops
│ │ │ └── bulk_operations/ # Batch updates
│ │ │ ├── __init__.py
│ │ │ ├── api.py # Bulk endpoints
│ │ │ ├── service.py # Batch processing
│ │ │ └── tests/
│ │ │
│ │ ├── mcp_server/ # MCP Protocol Server (IDE Integration)
│ │ │ ├── __init__.py
│ │ │ ├── main.py # MCP server entry point
│ │ │ ├── server.py # FastMCP server setup
│ │ │ ├── features/ # MCP tool implementations
│ │ │ │ ├── projects/ # Project tools for IDEs
│ │ │ │ │ ├── __init__.py
│ │ │ │ │ ├── project_tools.py
│ │ │ │ │ └── tests/
│ │ │ │ ├── tasks/ # Task tools for IDEs
│ │ │ │ │ ├── __init__.py
│ │ │ │ │ ├── task_tools.py
│ │ │ │ │ └── tests/
│ │ │ │ ├── documents/ # Document tools for IDEs
│ │ │ │ │ ├── __init__.py
│ │ │ │ │ ├── document_tools.py
│ │ │ │ │ ├── version_tools.py
│ │ │ │ │ └── tests/
│ │ │ │ └── feature_tools.py # Feature management
│ │ │ ├── modules/ # MCP modules
│ │ │ │ └── archon.py # Main Archon MCP module
│ │ │ └── utils/ # MCP utilities
│ │ │ └── tool_utils.py
│ │ │
│ │ ├── agents/ # AI Agents Module
│ │ │ ├── __init__.py
│ │ │ ├── main.py # Agents module entry point
│ │ │ ├── config.py # Agent configurations
│ │ │ ├── features/ # Agent capabilities
│ │ │ │ ├── document_agent/ # Document processing agent
│ │ │ │ │ ├── __init__.py
│ │ │ │ │ ├── agent.py # PydanticAI agent
│ │ │ │ │ ├── prompts.py # Agent prompts
│ │ │ │ │ └── tools.py # Agent tools
│ │ │ │ ├── code_agent/ # Code analysis agent
│ │ │ │ │ ├── __init__.py
│ │ │ │ │ ├── agent.py
│ │ │ │ │ └── analyzers.py
│ │ │ │ └── project_agent/ # Project creation agent
│ │ │ │ ├── __init__.py
│ │ │ │ ├── agent.py
│ │ │ │ ├── prp_generator.py
│ │ │ │ └── task_generator.py
│ │ │ └── shared/ # Shared agent utilities
│ │ │ ├── base_agent.py
│ │ │ ├── llm_client.py
│ │ │ └── response_models.py
│ │ │
│ │ ├── shared/ # Shared Across All Modules
│ │ │ ├── database/ # Database utilities
│ │ │ │ ├── __init__.py
│ │ │ │ ├── supabase.py # Supabase client
│ │ │ │ ├── migrations.py # DB migrations
│ │ │ │ └── connection_pool.py
│ │ │ ├── auth/ # Authentication
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api_keys.py
│ │ │ │ └── permissions.py
│ │ │ ├── config/ # Configuration
│ │ │ │ ├── __init__.py
│ │ │ │ ├── settings.py # Environment settings
│ │ │ │ └── logfire_config.py # Logging config
│ │ │ ├── middleware/ # HTTP middleware
│ │ │ │ ├── __init__.py
│ │ │ │ ├── cors.py
│ │ │ │ └── error_handler.py
│ │ │ └── utils/ # General utilities
│ │ │ ├── __init__.py
│ │ │ ├── datetime_utils.py
│ │ │ └── json_utils.py
│ │ │
│ │ └── main.py # Application orchestrator
│ │
│ └── tests/ # Integration tests
│ ├── test_api_essentials.py
│ ├── test_service_integration.py
│ └── fixtures/
├── archon-ui-main/ # Frontend Application
│ ├── src/
│ │ ├── pages/ # Page components
│ │ │ ├── KnowledgeBasePage.tsx
│ │ │ ├── ProjectPage.tsx
│ │ │ ├── SettingsPage.tsx
│ │ │ └── MCPPage.tsx
│ │ ├── components/ # Reusable components
│ │ │ ├── knowledge-base/ # Knowledge features
│ │ │ ├── project-tasks/ # Project features
│ │ │ └── ui/ # Shared UI components
│ │ ├── services/ # API services
│ │ │ ├── api.ts # Base API client
│ │ │ ├── knowledgeBaseService.ts
│ │ │ ├── projectService.ts
│ │ │ └── pollingService.ts # New polling utilities
│ │ ├── hooks/ # React hooks
│ │ │ ├── usePolling.ts # Polling hook
│ │ │ ├── useDatabaseMutation.ts # DB-first mutations
│ │ │ └── useAsyncAction.ts
│ │ └── contexts/ # React contexts
│ │ ├── ToastContext.tsx
│ │ └── ThemeContext.tsx
│ │
│ └── tests/ # Frontend tests
├── PRPs/ # Product Requirement Prompts
│ ├── templates/ # PRP templates
│ ├── ai_docs/ # AI context documentation
│ └── *.md # Feature PRPs
├── docs/ # Documentation
│ └── architecture/ # Architecture decisions
└── docker/ # Docker configurations
├── Dockerfile
└── docker-compose.yml
### Backend (`python/src/`)
```text
server/ # Main FastAPI application
├── api_routes/ # HTTP endpoints
├── services/ # Business logic
├── models/ # Data models
├── config/ # Configuration
├── middleware/ # Request processing
└── utils/ # Shared utilities
mcp_server/ # MCP server for IDE integration
└── features/ # MCP tool implementations
agents/ # AI agents (PydanticAI)
└── features/ # Agent capabilities
```
## Module Descriptions
### Frontend (`archon-ui-main/src/`)
```text
features/ # Vertical slice architecture
├── knowledge/ # Knowledge base feature
├── projects/ # Project management
│ ├── tasks/ # Task sub-feature
│ └── documents/ # Document sub-feature
├── progress/ # Operation tracking
├── mcp/ # MCP integration
├── shared/ # Cross-feature utilities
└── ui/ # UI components & hooks
### Knowledge Module (`src/knowledge/`)
Core knowledge management functionality including web crawling, document processing, embeddings, and RAG search. This is the heart of Archon's knowledge engine.
**Key Features:**
- Web crawling with JavaScript rendering
- Document upload and text extraction
- Vector embeddings and similarity search
- Code snippet extraction and indexing
- Source management and organization
### Projects Module (`src/projects/`)
Project and task management system with AI-powered project generation. Currently optional via feature flag.
**Key Features:**
- Project CRUD operations
- Task management with drag-and-drop ordering
- Document management with versioning
- AI-powered project generation
- Integration with knowledge base sources
### MCP Server Module (`src/mcp_server/`)
Model Context Protocol server that exposes Archon functionality to IDEs like Cursor and Windsurf.
**Key Features:**
- Tool-based API for IDE integration
- Project and task management tools
- Document operations
- Async operation support
### Agents Module (`src/agents/`)
AI agents powered by PydanticAI for intelligent document processing and project generation.
**Key Features:**
- Document analysis and summarization
- Code understanding and extraction
- Project requirement generation
- Task breakdown and planning
### Shared Module (`src/shared/`)
Cross-cutting concerns shared across all modules. Kept minimal to maintain module independence.
**Key Components:**
- Database connections and utilities
- Authentication and authorization
- Configuration management
- Logging and observability
- Common middleware
## Communication Patterns
### Inter-Module Communication
Modules communicate through:
1. **Direct HTTP API Calls** (current)
- Projects module calls Knowledge module APIs
- Simple and straightforward
- Works well for current scale
2. **Event Bus** (future consideration)
```python
# Example event-driven communication
await event_bus.publish("project.created", {
"project_id": "123",
"created_by": "user"
})
```
3. **Shared Database** (current reality)
- All modules use same Supabase instance
- Direct foreign keys between contexts
- Will need refactoring for true microservices
## Feature Flags
Features can be toggled via environment variables:
```python
# settings.py
PROJECTS_ENABLED = env.bool("PROJECTS_ENABLED", default=False)
TASK_ORDERING_ENABLED = env.bool("TASK_ORDERING_ENABLED", default=True)
AI_GENERATION_ENABLED = env.bool("AI_GENERATION_ENABLED", default=True)
pages/ # Route components
components/ # Legacy components (migrating)
```
## Database Architecture
## Core Modules
Currently using a shared Supabase (PostgreSQL) database:
### Knowledge Management
**Backend**: `python/src/server/services/knowledge_service.py`
**Frontend**: `archon-ui-main/src/features/knowledge/`
**Features**: Web crawling, document upload, embeddings, RAG search
```sql
-- Knowledge context tables
sources
documents
code_examples
### Project Management
**Backend**: `python/src/server/services/project_*_service.py`
**Frontend**: `archon-ui-main/src/features/projects/`
**Features**: Projects, tasks, documents, version history
-- Projects context tables
archon_projects
archon_tasks
archon_document_versions
### MCP Server
**Location**: `python/src/mcp_server/`
**Purpose**: Exposes tools to AI IDEs (Cursor, Windsurf)
**Port**: 8051
-- Cross-context junction tables
archon_project_sources -- Links projects to knowledge
```
### AI Agents
**Location**: `python/src/agents/`
**Purpose**: Document processing, code analysis, project generation
**Port**: 8052
## API Structure
Each feature exposes its own API routes:
### RESTful Endpoints
Pattern: `{METHOD} /api/{resource}/{id?}/{sub-resource?}`
```
/api/knowledge/
/crawl # Web crawling
/upload # Document upload
/search # RAG search
/sources # Source management
**Examples from** `python/src/server/api_routes/`:
- `/api/projects` - CRUD operations
- `/api/projects/{id}/tasks` - Nested resources
- `/api/knowledge/search` - RAG search
- `/api/progress/{id}` - Operation status
/api/projects/
/projects # Project CRUD
/tasks # Task management
/tasks/reorder # Task ordering
/documents # Document management
/generate # AI generation
### Service Layer
**Pattern**: `python/src/server/services/{feature}_service.py`
- Handles business logic
- Database operations via Supabase client
- Returns typed responses
## Frontend Architecture
### Data Fetching
**Core**: TanStack Query v5
**Configuration**: `archon-ui-main/src/features/shared/queryClient.ts`
**Patterns**: `archon-ui-main/src/features/shared/queryPatterns.ts`
### State Management
- **Server State**: TanStack Query
- **UI State**: React hooks & context
- **No Redux/Zustand**: Query cache handles all data
### Feature Organization
Each feature follows vertical slice pattern:
```text
features/{feature}/
├── components/ # UI components
├── hooks/ # Query hooks & keys
├── services/ # API calls
└── types/ # TypeScript types
```
## Deployment Architecture
### Smart Polling
**Implementation**: `archon-ui-main/src/features/ui/hooks/useSmartPolling.ts`
- Visibility-aware (pauses when tab hidden)
- Variable intervals based on focus state
### Current mixed
## Database
### Future (service modules)
**Provider**: Supabase (PostgreSQL + pgvector)
**Client**: `python/src/server/config/database.py`
Each module can become its own service:
### Main Tables
- `sources` - Knowledge sources
- `documents` - Document chunks with embeddings
- `code_examples` - Extracted code
- `archon_projects` - Projects
- `archon_tasks` - Tasks
- `archon_document_versions` - Version history
```yaml
# docker-compose.yml (future)
services:
knowledge:
image: archon-knowledge
ports: ["8001:8000"]
## Key Architectural Decisions
projects:
image: archon-projects
ports: ["8002:8000"]
### Vertical Slices
Features own their entire stack (UI → API → DB). See any `features/{feature}/` directory.
mcp-server:
image: archon-mcp
ports: ["8051:8051"]
### No WebSockets
HTTP polling with smart intervals. ETag caching reduces bandwidth by ~70%.
agents:
image: archon-agents
ports: ["8052:8052"]
### Query-First State
TanStack Query is the single source of truth. No separate state management needed.
### Direct Database Values
No translation layers. Database values (e.g., `"todo"`, `"doing"`) used directly in UI.
### Browser-Native Caching
ETags handled by browser, not JavaScript. See `archon-ui-main/src/features/shared/apiWithEtag.ts`.
## Deployment
### Development
```bash
# Backend
docker compose up -d
# or
cd python && uv run python -m src.server.main
# Frontend
cd archon-ui-main && npm run dev
```
## Migration Path
### Production
Single Docker Compose deployment with all services.
### Phase 1: Current State (Modules/service)
## Configuration
- All code in one repository
- Shared database
- Single deployment
### Environment Variables
**Required**: `SUPABASE_URL`, `SUPABASE_SERVICE_KEY`
**Optional**: See `.env.example`
### Phase 2: Vertical Slices
### Feature Flags
Controlled via Settings UI. Projects feature can be disabled.
- Reorganize by feature
- Clear module boundaries
- Feature flags for control
## Recent Refactors (Phases 1-5)
## Development Guidelines
1. **Removed ETag cache layer** - Browser handles HTTP caching
2. **Standardized query keys** - Each feature owns its keys
3. **Fixed optimistic updates** - Unique temporary IDs generated with nanoid
4. **Configured deduplication** - Centralized QueryClient
5. **Removed manual invalidations** - Trust backend consistency
### Adding a New Feature
## Performance Optimizations
1. **Identify the Module**: Which bounded context does it belong to?
2. **Create Feature Slice**: New folder under `module/features/`
3. **Implement Vertical Slice**:
- `api.py` - HTTP endpoints
- `service.py` - Business logic
- `models.py` - Data models
- `repository.py` - Data access
- `tests/` - Feature tests
- **Request Deduplication**: Same query key = one request
- **Smart Polling**: Adapts to tab visibility
- **ETag Caching**: 70% bandwidth reduction
- **Optimistic Updates**: Instant UI feedback
### Testing Strategy
## Testing
- **Unit Tests**: Each feature has its own tests
- **Integration Tests**: Test module boundaries
- **E2E Tests**: Test complete user flows
### Code Organization Rules
1. **Features are Self-Contained**: All code for a feature lives together
2. **No Cross-Feature Imports**: Use module's shared or API calls
3. **Shared is Minimal**: Only truly cross-cutting concerns
4. **Dependencies Point Inward**: Features → Module Shared → Global Shared
## Technology Stack
### Backend
- **FastAPI**: Web framework
- **Supabase**: Database and auth
- **PydanticAI**: AI agents
- **OpenAI**: Embeddings and LLM
- **Crawl4AI**: Web crawling
### Frontend
- **React**: UI framework
- **TypeScript**: Type safety
- **TailwindCSS**: Styling
- **React Query**: Data fetching
- **Vite**: Build tool
### Infrastructure
- **Docker**: Containerization
- **PostgreSQL**: Database (via Supabase; support for any PostgreSQL instance is desired)
- **pgvector**: Vector storage (support for ChromaDB, Pinecone, Weaviate, etc. is desired)
**Frontend Tests**: `archon-ui-main/src/features/*/tests/`
**Backend Tests**: `python/tests/`
**Patterns**: Mock services and query patterns, not implementation
## Future Considerations
### Planned Improvements
1. **Remove Socket.IO**: Replace with polling (in progress)
2. **API Gateway**: Central entry point for all services
3. **Separate Databases**: One per bounded context
### Scalability Path
1. **Vertical Scaling**: Current approach, works for single-user
2. **Horizontal Scaling**: Add load balancer and multiple instances
---
This architecture provides a clear path from the current monolithic application to a more modular, vertically sliced design, which keeps later separation into services feasible if needed.
- Server-Sent Events for real-time updates
- GraphQL for selective field queries
- Separate databases per bounded context
- Multi-tenant support


@ -0,0 +1,192 @@
# Data Fetching Architecture
## Overview
Archon uses **TanStack Query v5** for all data fetching, caching, and synchronization. This replaces the former custom polling layer with a query-centric design that handles caching, deduplication, and smart refetching (including visibility-aware polling) automatically.
## Core Components
### 1. Query Client Configuration
**Location**: `archon-ui-main/src/features/shared/queryClient.ts`
Centralized QueryClient with:
- 30-second default stale time
- 10-minute garbage collection
- Smart retry logic (skips 4xx errors)
- Request deduplication enabled
- Structural sharing for optimized re-renders
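The "skips 4xx errors" retry rule can be sketched as a plain function with the `(failureCount, error) => boolean` shape that TanStack Query accepts for its `retry` option. The error shape (`status` field) and the retry limit are assumptions for illustration; the real values are in `queryClient.ts`.

```typescript
// Hypothetical error shape: assumes thrown API errors expose the HTTP status code.
interface ApiErrorLike {
  status?: number;
}

const MAX_RETRIES = 2; // assumed limit, not taken from queryClient.ts

// Decide whether a failed query should be attempted again.
function shouldRetry(failureCount: number, error: unknown): boolean {
  const status = (error as ApiErrorLike | null)?.status;
  // Client errors (4xx) will not succeed on a retry, so give up immediately.
  if (typeof status === "number" && status >= 400 && status < 500) {
    return false;
  }
  // Anything else (5xx, network failures) is retried up to the assumed limit.
  return failureCount < MAX_RETRIES;
}
```

A function like this would be passed as the default `retry` option for queries, so every feature inherits the same behavior without configuring it per hook.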
### 2. Smart Polling Hook
**Location**: `archon-ui-main/src/features/ui/hooks/useSmartPolling.ts`
Visibility-aware polling that:
- Pauses when browser tab is hidden
- Slows down (1.5x interval) when tab is unfocused
- Returns `refetchInterval` for use with TanStack Query
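The interval rules above can be sketched as a pure function. This is not the hook itself: the `TabState` type and `resolveRefetchInterval` name are hypothetical, and the real `useSmartPolling` reads visibility and focus from the browser and may be structured differently.

```typescript
// Hypothetical snapshot of the tab's state; the real hook tracks this via browser events.
interface TabState {
  isVisible: boolean;
  hasFocus: boolean;
}

// Map a base polling interval to the value handed to TanStack Query's `refetchInterval`.
function resolveRefetchInterval(baseIntervalMs: number, tab: TabState): number | false {
  if (!tab.isVisible) {
    return false; // `false` disables interval refetching while the tab is hidden
  }
  if (!tab.hasFocus) {
    return baseIntervalMs * 1.5; // slow down while the window is unfocused
  }
  return baseIntervalMs;
}
```

With a 2-second base interval, this sketch yields 2 seconds when focused, 3 seconds when visible but unfocused, and no polling when hidden.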
### 3. Query Patterns
**Location**: `archon-ui-main/src/features/shared/queryPatterns.ts`
Shared constants:
- `DISABLED_QUERY_KEY` - For disabled queries
- `STALE_TIMES` - Standardized cache durations (instant, realtime, frequent, normal, rare, static)
## Feature Implementation Patterns
### Query Key Factories
Each feature maintains its own query keys:
- **Projects**: `archon-ui-main/src/features/projects/hooks/useProjectQueries.ts` (projectKeys)
- **Tasks**: `archon-ui-main/src/features/projects/tasks/hooks/useTaskQueries.ts` (taskKeys)
- **Knowledge**: `archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts` (knowledgeKeys)
- **Progress**: `archon-ui-main/src/features/progress/hooks/useProgressQueries.ts` (progressKeys)
- **MCP**: `archon-ui-main/src/features/mcp/hooks/useMcpQueries.ts` (mcpKeys)
- **Documents**: `archon-ui-main/src/features/projects/documents/hooks/useDocumentQueries.ts` (documentKeys)
### Data Fetching Hooks
Standard pattern across all features:
- `use[Feature]()` - List queries
- `use[Feature]Detail(id)` - Single item queries
- `useCreate[Feature]()` - Creation mutations
- `useUpdate[Feature]()` - Update mutations
- `useDelete[Feature]()` - Deletion mutations
## Backend Integration
### ETag Support
**Location**: `archon-ui-main/src/features/shared/apiWithEtag.ts`
ETag implementation:
- Browser handles ETag headers automatically
- 304 responses reduce bandwidth
- TanStack Query manages cache state
### API Structure
Backend endpoints follow RESTful patterns:
- **Knowledge**: `python/src/server/api_routes/knowledge_api.py`
- **Projects**: `python/src/server/api_routes/projects_api.py`
- **Progress**: `python/src/server/api_routes/progress_api.py`
- **MCP**: `python/src/server/api_routes/mcp_api.py`
## Optimistic Updates
**Utilities**: `archon-ui-main/src/features/shared/optimistic.ts`
All mutations use nanoid-based optimistic updates:
- Creates temporary entities with `_optimistic` flag
- Replaces with server data on success
- Rolls back on error
- Visual indicators for pending state
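A minimal sketch of the create, replace, and rollback steps above. Only the `_optimistic` flag comes from the text; the `TaskLike` type, the `_localId` field, and the function names are hypothetical, and the ID generator is injected here (the codebase reportedly uses nanoid) so the example stays dependency-free.

```typescript
// Hypothetical entity shape for illustration only.
interface TaskLike {
  id: string;
  title: string;
  _optimistic?: boolean; // flag named in the text above
  _localId?: string; // hypothetical field used to find the temporary entry later
}

// Build a temporary entity that can be shown before the server responds.
function createOptimisticTask(title: string, makeId: () => string): TaskLike {
  const localId = makeId();
  return { id: localId, title, _optimistic: true, _localId: localId };
}

// On success: swap the temporary entity for the server's version.
function replaceOptimisticTask(tasks: TaskLike[], localId: string, serverTask: TaskLike): TaskLike[] {
  return tasks.map((task) => (task._localId === localId ? serverTask : task));
}

// On error: roll back by dropping the temporary entity.
function removeOptimisticTask(tasks: TaskLike[], localId: string): TaskLike[] {
  return tasks.filter((task) => task._localId !== localId);
}
```

In a mutation hook, the create step would typically run in `onMutate`, the replace step in `onSuccess`, and the rollback in `onError`; a component can check `_optimistic` to render a pending indicator.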
## Refetch Strategies
### Smart Polling Usage
**Implementation**: `archon-ui-main/src/features/ui/hooks/useSmartPolling.ts`
Polling intervals are defined in each feature's query hooks. See actual implementations:
- **Projects**: `archon-ui-main/src/features/projects/hooks/useProjectQueries.ts`
- **Tasks**: `archon-ui-main/src/features/projects/tasks/hooks/useTaskQueries.ts`
- **Knowledge**: `archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts`
- **Progress**: `archon-ui-main/src/features/progress/hooks/useProgressQueries.ts`
- **MCP**: `archon-ui-main/src/features/mcp/hooks/useMcpQueries.ts`
Standard intervals from `archon-ui-main/src/features/shared/queryPatterns.ts`:
- `STALE_TIMES.instant`: 0ms (always fresh)
- `STALE_TIMES.frequent`: 5 seconds (frequently changing data)
- `STALE_TIMES.normal`: 30 seconds (standard cache)
### Manual Refetch
All queries expose `refetch()` for manual updates.
## Performance Optimizations
### Request Deduplication
Handled automatically by TanStack Query when the same query key is used.
### Stale Time Configuration
Defined in `STALE_TIMES` and used consistently:
- Auth/Settings: `Infinity` (never stale)
- Active operations: `0` (always fresh)
- Normal data: `30_000` (30 seconds)
- Rare updates: `300_000` (5 minutes)
### Garbage Collection
Unused data removed after 10 minutes (configurable in queryClient).
## Migration from Polling
### What Changed (Phases 1-5)
1. **Phase 1**: Removed ETag cache layer
2. **Phase 2**: Standardized query keys
3. **Phase 3**: Fixed optimistic updates with nanoid-generated IDs
4. **Phase 4**: Configured request deduplication
5. **Phase 5**: Removed manual invalidations
### Deprecated Patterns
- `usePolling` hook (removed)
- `useCrawlProgressPolling` (removed)
- Manual cache invalidation with setTimeout
- Socket.IO connections
- Double-layer caching
## Testing Patterns
### Hook Testing
**Example**: `archon-ui-main/src/features/projects/hooks/tests/useProjectQueries.test.ts`
Standard mocking approach for:
- Service methods
- Query patterns (STALE_TIMES, DISABLED_QUERY_KEY)
- Smart polling behavior
### Integration Testing
Use React Testing Library with QueryClientProvider wrapper.
## Developer Guidelines
### Adding New Data Fetching
1. Create query key factory in `{feature}/hooks/use{Feature}Queries.ts`
2. Use `useQuery` with appropriate stale time from `STALE_TIMES`
3. Add smart polling if real-time updates are needed
4. Implement optimistic updates for mutations
5. Follow existing patterns in similar features
### Common Patterns to Follow
- Always use query key factories
- Never hardcode stale times
- Use `DISABLED_QUERY_KEY` for conditional queries
- Implement optimistic updates for better UX
- Add loading and error states
## Future Considerations
- Server-Sent Events for true real-time (post-Phase 5)
- WebSocket fallback for critical updates
- GraphQL migration for selective field updates


@ -1,39 +1,149 @@
# ETag Implementation
## Current Implementation
## Overview
Our ETag implementation provides efficient HTTP caching for polling endpoints to reduce bandwidth usage.
Archon implements HTTP ETag caching to optimize bandwidth usage by reducing redundant data transfers. The implementation leverages browser-native HTTP caching combined with backend ETag generation for efficient cache validation.
### What It Does
- **Generates ETags**: Creates MD5 hashes of JSON response data
- **Checks ETags**: Simple string equality comparison between client's `If-None-Match` header and current data's ETag
- **Returns 304**: When ETags match, returns `304 Not Modified` with no body (saves bandwidth)
## How It Works
### How It Works
1. Server generates ETag from response data using MD5 hash
2. Client sends previous ETag in `If-None-Match` header
3. Server compares ETags:
   - **Match**: Returns 304 (no body)
   - **No match**: Returns 200 with new data and new ETag
### Backend ETag Generation
**Location**: `python/src/server/utils/etag_utils.py`
### Example
```python
# Server generates: ETag: "a3c2f1e4b5d6789"
# Client sends: If-None-Match: "a3c2f1e4b5d6789"
# Server returns: 304 Not Modified (no body)
```
The backend generates ETags for API responses:
- Creates MD5 hash of JSON-serialized response data
- Returns quoted ETag string (RFC 7232 format)
- Sets `Cache-Control: no-cache, must-revalidate` headers
- Compares client's `If-None-Match` header with current data's ETag
- Returns `304 Not Modified` when ETags match
## Limitations
### Frontend Handling
**Location**: `archon-ui-main/src/features/shared/apiWithEtag.ts`
Our implementation is simplified and doesn't support full RFC 7232 features:
- ❌ Wildcard (`*`) matching
- ❌ Multiple ETags (`"etag1", "etag2"`)
- ❌ Weak validators (`W/"etag"`)
- ✅ Single ETag comparison only
The frontend relies on browser-native HTTP caching:
- Browser automatically sends `If-None-Match` headers with cached ETags
- Browser handles 304 responses by returning cached data from HTTP cache
- No manual ETag tracking or cache management needed
- TanStack Query manages data freshness through `staleTime` configuration
This works perfectly for our browser-to-API polling use case but may need enhancement for CDN/proxy support.
#### Browser vs Non-Browser Behavior
- **Standard Browsers**: Per the Fetch spec, a 304 response freshens the HTTP cache and returns the cached body to JavaScript
- **Non-Browser Runtimes** (React Native, custom fetch): May surface 304 with empty body to JavaScript
- **Client Fallback**: The `apiWithEtag.ts` implementation handles both scenarios, ensuring consistent behavior across environments
## Files
- Implementation: `python/src/server/utils/etag_utils.py`
- Tests: `python/tests/server/utils/test_etag_utils.py`
- Used in: Progress API, Projects API polling endpoints
## Implementation Details
### Backend API Integration
ETags are used in these API routes:
- **Projects**: `python/src/server/api_routes/projects_api.py`
  - Project lists
  - Task lists
  - Task counts
- **Progress**: `python/src/server/api_routes/progress_api.py`
  - Active operations tracking
### ETag Generation Process
1. **Data Serialization**: Response data is JSON-serialized with sorted keys for consistency
2. **Hash Creation**: MD5 hash generated from JSON string
3. **Format**: Returns quoted string per RFC 7232 (e.g., `"a3c2f1e4b5d6789"`)
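The three generation steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the contents of `etag_utils.py`: the function name `generate_etag_sketch` and the `default=str` fallback for non-JSON types are assumptions made for the example.

```python
import hashlib
import json


def generate_etag_sketch(data) -> str:
    """Return a quoted ETag derived from the JSON form of `data` (hypothetical helper)."""
    # Step 1: sorted keys make the serialization independent of dict insertion order.
    # `default=str` is an assumed fallback for values such as datetimes.
    serialized = json.dumps(data, sort_keys=True, default=str)
    # Step 2: MD5 serves only as a fast content fingerprint, not as a security measure.
    digest = hashlib.md5(serialized.encode("utf-8")).hexdigest()
    # Step 3: RFC 7232 entity tags are wrapped in double quotes.
    return f'"{digest}"'


first = generate_etag_sketch({"id": 1, "title": "Docs"})
second = generate_etag_sketch({"title": "Docs", "id": 1})
print(first == second)  # True: key order does not affect the ETag
```

Sorting keys matters because two responses with identical content but different key order would otherwise produce different ETags and defeat the 304 path.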
### Cache Validation Flow
1. **Initial Request**: Server generates ETag and sends with response
2. **Subsequent Requests**: Browser sends `If-None-Match` header with cached ETag
3. **Server Validation**:
   - ETags match → Returns `304 Not Modified` (no body)
   - ETags differ → Returns `200 OK` with new data and new ETag
4. **Browser Behavior**: On 304, browser serves cached response to JavaScript
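The validation flow above can be simulated without a web framework. The sketch below is a simplified model under stated assumptions: `make_etag` and `respond` are hypothetical helpers, and the returned tuple stands in for a real HTTP response.

```python
import hashlib
import json
from typing import Optional, Tuple


def make_etag(data) -> str:
    """Hypothetical stand-in for the backend ETag helper."""
    payload = json.dumps(data, sort_keys=True)
    return '"' + hashlib.md5(payload.encode("utf-8")).hexdigest() + '"'


def respond(data, if_none_match: Optional[str]) -> Tuple[int, dict, Optional[str]]:
    """Return (status, headers, body) for a conditional GET."""
    etag = make_etag(data)
    headers = {"ETag": etag, "Cache-Control": "no-cache, must-revalidate"}
    # Single-ETag string equality, as described in the flow above.
    if if_none_match == etag:
        return 304, headers, None  # no body on a match
    return 200, headers, json.dumps(data)


tasks = [{"id": 1, "status": "todo"}]
status, headers, body = respond(tasks, None)                    # initial request
status_again, _, body_again = respond(tasks, headers["ETag"])   # revalidation, unchanged
tasks[0]["status"] = "done"
status_changed, new_headers, _ = respond(tasks, headers["ETag"])  # data changed
print(status, status_again, status_changed)  # 200 304 200
```

In a browser, the cached ETag is sent automatically; the explicit `if_none_match` argument here only plays the role of the `If-None-Match` header.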
## Key Design Decisions
### Browser-Native Caching
The implementation leverages browser HTTP caching instead of manual cache management:
- Reduces code complexity
- Eliminates cache synchronization issues
- Works seamlessly with TanStack Query
- Maintains bandwidth optimization
### No Manual ETag Tracking
Unlike previous implementations, the current approach:
- Does NOT maintain ETag maps in JavaScript
- Does NOT manually handle 304 responses
- Lets browser and TanStack Query handle caching layers
## Integration with TanStack Query
### Cache Coordination
- **Browser Cache**: Handles HTTP-level caching (ETags/304s)
- **TanStack Query Cache**: Manages application-level data freshness
- **Separation of Concerns**: HTTP caching for bandwidth, TanStack for state
### Configuration
Cache behavior is controlled through TanStack Query's `staleTime`:
- See `archon-ui-main/src/features/shared/queryPatterns.ts` for standard times
- See `archon-ui-main/src/features/shared/queryClient.ts` for global configuration
## Performance Benefits
### Bandwidth Reduction
- ~70% reduction in data transfer for unchanged responses (based on internal measurements)
- Especially effective for polling patterns
- Significant improvement for mobile/slow connections
### Server Load
- No response body to send for 304 responses (the server still serializes the data once to compute the ETag)
- Lower network I/O
- Faster response times for cached data
## Files and References
### Core Implementation
- **Backend Utilities**: `python/src/server/utils/etag_utils.py`
- **Frontend Client**: `archon-ui-main/src/features/shared/apiWithEtag.ts`
- **Tests**: `python/tests/server/utils/test_etag_utils.py`
### Usage Examples
- **Projects API**: `python/src/server/api_routes/projects_api.py` (lines with `generate_etag`, `check_etag`)
- **Progress API**: `python/src/server/api_routes/progress_api.py` (active operations tracking)
## Testing
### Backend Testing
Tests in `python/tests/server/utils/test_etag_utils.py` verify:
- Correct ETag generation format
- Consistent hashing for same data
- Different hashes for different data
- Proper quote formatting
### Frontend Testing
Browser DevTools verification:
1. Network tab shows `If-None-Match` headers on requests
2. 304 responses have no body
3. Response served from cache on 304
4. New ETag values when data changes
## Monitoring
### How to Verify ETags are Working
1. Open Chrome DevTools → Network tab
2. Make a request to a supported endpoint
3. Note the `ETag` response header
4. Refresh or re-request the same data
5. Observe:
   - Request includes `If-None-Match` header
   - Server returns `304 Not Modified` if unchanged
   - Response body is empty on 304
   - Browser serves cached data
### Metrics to Track
- Ratio of 304 vs 200 responses
- Bandwidth saved through 304 responses
- Cache hit rate in production
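As a rough illustration of how these metrics could be derived from access logs, the sketch below summarizes a list of `(status_code, body_bytes)` pairs for one endpoint. The `summarize` helper and the sample values are hypothetical, and the savings figure assumes each 304 avoided a body the size of the most recent 200 response.

```python
from typing import Iterable, Tuple


def summarize(responses: Iterable[Tuple[int, int]]) -> dict:
    """Summarize (status_code, body_bytes) pairs for a single endpoint."""
    hits = 0             # 304 responses
    misses = 0           # 200 responses
    bytes_sent = 0
    bytes_saved = 0
    last_full_size = 0   # size of the most recent 200 body
    for status, size in responses:
        if status == 304:
            hits += 1
            # Assumption: the 304 replaced a body of the last known size.
            bytes_saved += last_full_size
        elif status == 200:
            misses += 1
            bytes_sent += size
            last_full_size = size
    total = hits + misses
    return {
        "hit_rate": hits / total if total else 0.0,
        "bytes_sent": bytes_sent,
        "bytes_saved": bytes_saved,
    }


# Illustrative values only, not measurements.
sample = [(200, 1000), (304, 0), (304, 0), (200, 1200), (304, 0)]
result = summarize(sample)
print(result["hit_rate"], result["bytes_sent"], result["bytes_saved"])  # 0.6 2200 3200
```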
## Future Considerations
- Consider implementing strong vs weak ETags for more granular control
- Evaluate adding ETag support to more endpoints
- Monitor cache effectiveness in production
- Consider Last-Modified headers as supplementary validation


@ -1,194 +0,0 @@
# Polling Architecture Documentation
## Overview
Archon V2 uses HTTP polling instead of WebSockets for real-time updates. This simplifies the architecture, reduces complexity, and improves maintainability while providing adequate responsiveness for project management tasks.
## Core Components
### 1. usePolling Hook (`archon-ui-main/src/hooks/usePolling.ts`)
Generic polling hook that manages periodic data fetching with smart optimizations.
**Key Features:**
- Configurable polling intervals (default: 3 seconds)
- Automatic pause during browser tab inactivity
- ETag-based caching to reduce bandwidth
- Manual refresh capability
**Usage:**
```typescript
const { data, isLoading, error, refetch } = usePolling('/api/projects', {
  interval: 5000,
  enabled: true,
  onSuccess: (data) => console.log('Projects updated:', data)
});
```
### 2. Specialized Progress Services
Individual services handle specific progress tracking needs:
**CrawlProgressService (`archon-ui-main/src/services/crawlProgressService.ts`)**
- Tracks website crawling operations
- Maps backend status to UI-friendly format
- Includes in-flight request guard to prevent overlapping fetches
- 1-second polling interval during active crawls
**Polling Endpoints:**
- `/api/projects` - Project list updates
- `/api/projects/{project_id}/tasks` - Task list for active project
- `/api/crawl-progress/{progress_id}` - Website crawling progress
- `/api/agent-chat/sessions/{session_id}/messages` - Chat messages
## Backend Support
### ETag Implementation (`python/src/server/utils/etag_utils.py`)
Server-side optimization to reduce unnecessary data transfer.
**How it works:**
1. Server generates ETag hash from response data
2. Client sends `If-None-Match` header with cached ETag
3. Server returns 304 Not Modified if data unchanged
4. Client uses cached data, reducing bandwidth by ~70%
### Progress API (`python/src/server/api_routes/progress_api.py`)
Dedicated endpoints for progress tracking:
- `GET /api/crawl-progress/{progress_id}` - Returns crawling status with ETag support
- Includes completion percentage, current step, and error details
## State Management
### Loading States
Visual feedback during operations:
- `movingTaskIds: Set<string>` - Tracks tasks being moved
- `isSwitchingProject: boolean` - Project transition state
- Loading overlays prevent concurrent operations
## Error Handling
### Retry Strategy
```typescript
retryCount: 3
retryDelay: attempt => Math.min(1000 * 2 ** attempt, 30000)
```
- Exponential backoff: 1s, 2s, 4s...
- Maximum retry delay: 30 seconds
- Automatic recovery after network issues
### User Feedback
- Toast notifications for errors
- Loading spinners during operations
- Clear error messages with recovery actions
## Performance Optimizations
### 1. Request Deduplication
Prevents multiple components from making identical requests:
```typescript
const cacheKey = `${endpoint}-${JSON.stringify(params)}`;
if (pendingRequests.has(cacheKey)) {
  return pendingRequests.get(cacheKey);
}
```
### 2. Smart Polling Intervals
- Active operations: 1-2 second intervals
- Background data: 5-10 second intervals
- Paused when tab inactive (visibility API)
### 3. Selective Updates
Only polls active/relevant data:
- Tasks poll only for selected project
- Progress polls only during active operations
- Chat polls only for open sessions
## Architecture Benefits
### What We Have
- **Simple HTTP polling** - Standard request/response pattern
- **Automatic error recovery** - Built-in retry with exponential backoff
- **ETag caching** - 70% bandwidth reduction via 304 responses
- **Easy debugging** - Standard HTTP requests visible in DevTools
- **No connection limits** - Scales with standard HTTP infrastructure
- **Consolidated polling hooks** - Single pattern for all data fetching
### Trade-offs
- **Latency:** 1-5 second delay vs instant updates
- **Bandwidth:** More requests, but mitigated by ETags
- **Battery:** Slightly higher mobile battery usage
## Developer Guidelines
### Adding New Polling Endpoint
1. **Frontend - Use the usePolling hook:**
```typescript
// In your component or custom hook
const { data, isLoading, error, refetch } = usePolling('/api/new-endpoint', {
  interval: 5000,
  enabled: true,
  staleTime: 2000
});
```
2. **Backend - Add ETag support:**
```python
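from fastapi import APIRouter, Request, Response
from fastapi.responses import JSONResponse

from ..utils.etag_utils import generate_etag, check_etag

router = APIRouter()


@router.get("/api/new-endpoint")
async def get_data(request: Request):
    data = fetch_data()  # placeholder for the endpoint's data source
    etag = generate_etag(data)

    # Client already holds the current version: reply without a body
    if check_etag(request, etag):
        return Response(status_code=304)

    return JSONResponse(content=data, headers={"ETag": etag})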
```
3. **For progress tracking, use useCrawlProgressPolling:**
```typescript
const { data, isLoading } = useCrawlProgressPolling(operationId, {
  onSuccess: (data) => {
    if (data.status === 'completed') {
      // Handle completion
    }
  }
});
```
### Best Practices
1. **Always provide loading states** - Users should know when data is updating
2. **Handle errors gracefully** - Show toast notifications with clear messages
3. **Respect polling intervals** - Don't poll faster than necessary
4. **Clean up on unmount** - Cancel pending requests when components unmount
5. **Use ETag caching** - Reduce bandwidth with 304 responses
## Testing Polling Behavior
### Manual Testing
1. Open Network tab in DevTools
2. Look for requests with 304 status (cache hits)
3. Verify polling stops when switching tabs
4. Test error recovery by stopping backend
### Debugging Tips
- Check `localStorage` for cached ETags
- Monitor `console.log` for polling lifecycle events
- Use React DevTools to inspect hook states
- Watch for memory leaks in long-running sessions
## Future Improvements
### Planned Enhancements
- WebSocket fallback for critical updates
- Configurable per-user polling rates
- Smart polling based on user activity patterns
- GraphQL subscriptions for selective field updates
### Considered Alternatives
- Server-Sent Events (SSE) - One-way real-time updates
- Long polling - Reduced request frequency
- WebRTC data channels - P2P updates between clients


@ -106,6 +106,8 @@ export function useFeatureDetail(id: string | undefined) {
## Mutations with Optimistic Updates
```typescript
import { createOptimisticEntity, replaceOptimisticEntity } from "@/features/shared/optimistic";
export function useCreateFeature() {
const queryClient = useQueryClient();
@ -119,13 +121,13 @@ export function useCreateFeature() {
// Snapshot for rollback
const previous = queryClient.getQueryData(featureKeys.lists());
// Optimistic update (use timestamp IDs for now - Phase 3 will use UUIDs)
const tempId = `temp-${Date.now()}`;
// Optimistic update with nanoid for stable IDs
const optimisticEntity = createOptimisticEntity(newData);
queryClient.setQueryData(featureKeys.lists(), (old: Feature[] = []) =>
[...old, { ...newData, id: tempId }]
[...old, optimisticEntity]
);
return { previous, tempId };
return { previous, localId: optimisticEntity._localId };
},
onError: (err, variables, context) => {
@ -138,7 +140,7 @@ export function useCreateFeature() {
onSuccess: (data, variables, context) => {
// Replace optimistic with real data
queryClient.setQueryData(featureKeys.lists(), (old: Feature[] = []) =>
old.map(item => item.id === context?.tempId ? data : item)
replaceOptimisticEntity(old, context?.localId, data)
);
},
});
@ -176,7 +178,7 @@ vi.mock("../../../shared/queryPatterns", () => ({
Each feature is self-contained:
```
```text
src/features/projects/
├── components/ # UI components
├── hooks/
@ -189,7 +191,7 @@ src/features/projects/
Sub-features (like tasks under projects) follow the same structure:
```
```text
src/features/projects/tasks/
├── components/
├── hooks/
@ -220,8 +222,16 @@ When refactoring to these patterns:
4. **Don't skip mocking in tests** - Mock both services and patterns
5. **Don't use inconsistent patterns** - Follow the established conventions
## Future Improvements (Phase 3+)
## Completed Improvements (Phases 1-5)
- ✅ Phase 1: Removed manual frontend ETag cache layer (backend ETags remain; browser-managed)
- ✅ Phase 2: Standardized query keys with factories
- ✅ Phase 3: Implemented stable-ID optimistic updates using nanoid
- ✅ Phase 4: Configured request deduplication
- ✅ Phase 5: Removed manual cache invalidations
## Future Considerations
- Replace timestamp IDs (`temp-${Date.now()}`) with UUIDs
- Add Server-Sent Events for real-time updates
- Consider Zustand for complex client state
- Consider WebSocket fallback for critical updates
- Evaluate Zustand for complex client state management