docs: Update AI documentation to accurately reflect current codebase (#708)

* docs: Update AI documentation for accurate codebase reflection

- Replace obsolete POLLING_ARCHITECTURE.md with DATA_FETCHING_ARCHITECTURE.md
- Rewrite API_NAMING_CONVENTIONS.md with file references instead of code examples
- Condense ARCHITECTURE.md from 482 to 195 lines for clarity
- Update ETAG_IMPLEMENTATION.md to reflect actual implementation
- Update QUERY_PATTERNS.md to reflect completed Phase 5 (nanoid optimistic updates)
- Add PRPs/stories/ to .gitignore

All documentation now references actual files in codebase rather than
embedding potentially stale code examples.


* docs: Update CLAUDE.md and AGENTS.md with current patterns

- Update CLAUDE.md to reference documentation files instead of embedding code
- Replace Service Layer and Error Handling code examples with file references
- Add proper distinction between DATA_FETCHING_ARCHITECTURE and QUERY_PATTERNS docs
- Include ETag implementation reference
- Update environment variables section with .env.example reference


* docs: apply PR review improvements to AI documentation

- Fix punctuation, hyphenation, and grammar issues across all docs
- Add language tags to directory tree code blocks for proper markdown linting
- Clarify TanStack Query integration (not replacing polling, but integrating it)
- Add Cache-Control header documentation and browser vs non-browser fetch behavior
- Reference actual implementation files for polling intervals instead of hardcoding values
- Improve type-safety phrasing and remove line numbers from file references
- Clarify Phase 1 removed manual frontend ETag cache (backend ETags remain)
Wirasm 2025-09-19 13:29:46 +03:00 committed by GitHub
parent 0502d378f0
commit 1b272ed2af
9 changed files with 807 additions and 1197 deletions

1
.gitignore vendored

@ -4,6 +4,7 @@ __pycache__
.claude/settings.local.json .claude/settings.local.json
PRPs/local PRPs/local
PRPs/completed/ PRPs/completed/
PRPs/stories/
/logs/ /logs/
.zed .zed
tmp/ tmp/

268
AGENTS.md

@ -8,9 +8,13 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
### Core Principles ### Core Principles
- **No backwards compatibility** - remove deprecated code immediately - **No backwards compatibility; we follow a fix-forward approach** - remove deprecated code immediately
- **Detailed errors over graceful failures** - we want to identify and fix issues fast - **Detailed errors over graceful failures** - we want to identify and fix issues fast
- **Break things to improve them** - beta is for rapid iteration - **Break things to improve them** - beta is for rapid iteration
- **Continuous improvement** - embrace change and learn from mistakes
- **KISS** - keep it simple
- **DRY** when appropriate
- **YAGNI** — don't implement features that are not needed
### Error Handling ### Error Handling
@ -40,51 +44,7 @@ These operations should continue but track and report failures clearly:
#### Critical Nuance: Never Accept Corrupted Data #### Critical Nuance: Never Accept Corrupted Data
When a process should continue despite failures, it must **skip the failed item entirely** rather than storing corrupted data: When a process should continue despite failures, it must **skip the failed item entirely** rather than storing corrupted data
**❌ WRONG - Silent Corruption:**
```python
try:
embedding = create_embedding(text)
except Exception as e:
embedding = [0.0] * 1536 # NEVER DO THIS - corrupts database
store_document(doc, embedding)
```
**✅ CORRECT - Skip Failed Items:**
```python
try:
embedding = create_embedding(text)
store_document(doc, embedding) # Only store on success
except Exception as e:
failed_items.append({'doc': doc, 'error': str(e)})
logger.error(f"Skipping document {doc.id}: {e}")
# Continue with next document, don't store anything
```
**✅ CORRECT - Batch Processing with Failure Tracking:**
```python
def process_batch(items):
results = {'succeeded': [], 'failed': []}
for item in items:
try:
result = process_item(item)
results['succeeded'].append(result)
except Exception as e:
results['failed'].append({
'item': item,
'error': str(e),
'traceback': traceback.format_exc()
})
logger.error(f"Failed to process {item.id}: {e}")
# Always return both successes and failures
return results
```
#### Error Message Guidelines #### Error Message Guidelines
@ -98,9 +58,10 @@ def process_batch(items):
### Code Quality ### Code Quality
- Remove dead code immediately rather than maintaining it - no backward compatibility or legacy functions - Remove dead code immediately rather than maintaining it - no backward compatibility or legacy functions
- Prioritize functionality over production-ready patterns - Avoid backward compatibility mappings or legacy function wrappers
- Fix forward
- Focus on user experience and feature completeness - Focus on user experience and feature completeness
- When updating code, don't reference what is changing (avoid keywords like LEGACY, CHANGED, REMOVED), instead focus on comments that document just the functionality of the code - When updating code, don't reference what is changing (avoid keywords like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED), instead focus on comments that document just the functionality of the code
- When commenting on code in the codebase, only comment on the functionality and reasoning behind the code. Refrain from speaking to Archon being in "beta" or referencing anything else that comes from these global rules. - When commenting on code in the codebase, only comment on the functionality and reasoning behind the code. Refrain from speaking to Archon being in "beta" or referencing anything else that comes from these global rules.
## Development Commands ## Development Commands
@ -175,139 +136,35 @@ make test-be # Backend tests only
## Architecture Overview ## Architecture Overview
Archon Beta is a microservices-based knowledge management system with MCP (Model Context Protocol) integration: @PRPs/ai_docs/ARCHITECTURE.md
### Service Architecture #### TanStack Query Implementation
- **Frontend (port 3737)**: React + TypeScript + Vite + TailwindCSS For architecture and file references:
- **Dual UI Strategy**: @PRPs/ai_docs/DATA_FETCHING_ARCHITECTURE.md
- `/features` - Modern vertical slice with Radix UI primitives + TanStack Query
- `/components` - Legacy custom components (being migrated)
- **State Management**: TanStack Query for all data fetching (no prop drilling)
- **Styling**: Tron-inspired glassmorphism with Tailwind CSS
- **Linting**: Biome for `/features`, ESLint for legacy code
- **Main Server (port 8181)**: FastAPI with HTTP polling for updates For code patterns and examples:
- Handles all business logic, database operations, and external API calls @PRPs/ai_docs/QUERY_PATTERNS.md
- WebSocket support removed in favor of HTTP polling with ETag caching
- **MCP Server (port 8051)**: Lightweight HTTP-based MCP protocol server
- Provides tools for AI assistants (Claude, Cursor, Windsurf)
- Exposes knowledge search, task management, and project operations
- **Agents Service (port 8052)**: PydanticAI agents for AI/ML operations
- Handles complex AI workflows and document processing
- **Database**: Supabase (PostgreSQL + pgvector for embeddings)
- Cloud or local Supabase both supported
- pgvector for semantic search capabilities
### Frontend Architecture Details
#### Vertical Slice Architecture (/features)
Features are organized by domain hierarchy with self-contained modules:
```
src/features/
├── ui/
│ ├── primitives/ # Radix UI base components
│ ├── hooks/ # Shared UI hooks (useSmartPolling, etc)
│ └── types/ # UI type definitions
├── projects/
│ ├── components/ # Project UI components
│ ├── hooks/ # Project hooks (useProjectQueries, etc)
│ ├── services/ # Project API services
│ ├── types/ # Project type definitions
│ ├── tasks/ # Tasks sub-feature (nested under projects)
│ │ ├── components/
│ │ ├── hooks/ # Task-specific hooks
│ │ ├── services/ # Task API services
│ │ └── types/
│ └── documents/ # Documents sub-feature
│ ├── components/
│ ├── services/
│ └── types/
```
#### TanStack Query Patterns
All data fetching uses TanStack Query with consistent patterns:
```typescript
// Query keys factory pattern
export const projectKeys = {
all: ["projects"] as const,
lists: () => [...projectKeys.all, "list"] as const,
detail: (id: string) => [...projectKeys.all, "detail", id] as const,
};
// Smart polling with visibility awareness
const { refetchInterval } = useSmartPolling(10000); // Pauses when tab inactive
// Optimistic updates with rollback
useMutation({
onMutate: async (data) => {
await queryClient.cancelQueries(key);
const previous = queryClient.getQueryData(key);
queryClient.setQueryData(key, optimisticData);
return { previous };
},
onError: (err, vars, context) => {
if (context?.previous) {
queryClient.setQueryData(key, context.previous);
}
},
});
```
### Backend Architecture Details
#### Service Layer Pattern #### Service Layer Pattern
```python See implementation examples:
# API Route -> Service -> Database
# src/server/api_routes/projects.py
@router.get("/{project_id}")
async def get_project(project_id: str):
return await project_service.get_project(project_id)
# src/server/services/project_service.py - API routes: `python/src/server/api_routes/projects_api.py`
async def get_project(project_id: str): - Service layer: `python/src/server/services/project_service.py`
# Business logic here - Pattern: API Route → Service → Database
return await db.fetch_project(project_id)
```
#### Error Handling Patterns #### Error Handling Patterns
```python See implementation examples:
# Use specific exceptions
class ProjectNotFoundError(Exception): pass
class ValidationError(Exception): pass
# Rich error responses - Custom exceptions: `python/src/server/exceptions.py`
@app.exception_handler(ProjectNotFoundError) - Exception handlers: `python/src/server/main.py` (search for @app.exception_handler)
async def handle_not_found(request, exc): - Service error handling: `python/src/server/services/` (various services)
return JSONResponse(
status_code=404,
content={"detail": str(exc), "type": "not_found"}
)
```
## Polling Architecture ## ETag Implementation
### HTTP Polling (replaced Socket.IO) @PRPs/ai_docs/ETAG_IMPLEMENTATION.md
- **Polling intervals**: 1-2s for active operations, 5-10s for background data
- **ETag caching**: Reduces bandwidth by ~70% via 304 Not Modified responses
- **Smart pausing**: Stops polling when browser tab is inactive
- **Progress endpoints**: `/api/progress/{id}` for operation tracking
### Key Polling Hooks
- `useSmartPolling` - Adjusts interval based on page visibility/focus
- `useCrawlProgressPolling` - Specialized for crawl progress with auto-cleanup
- `useProjectTasks` - Smart polling for task lists
## Database Schema ## Database Schema
@ -327,25 +184,9 @@ Key tables in Supabase:
## API Naming Conventions ## API Naming Conventions
### Task Status Values @PRPs/ai_docs/API_NAMING_CONVENTIONS.md
Use database values directly (no UI mapping): Use database values directly (no mapping in the FE; type-safe from BE and up):
- `todo`, `doing`, `review`, `done`
### Service Method Patterns
- `get[Resource]sByProject(projectId)` - Scoped queries
- `get[Resource](id)` - Single resource
- `create[Resource](data)` - Create operations
- `update[Resource](id, updates)` - Updates
- `delete[Resource](id)` - Soft deletes
### State Naming
- `is[Action]ing` - Loading states (e.g., `isSwitchingProject`)
- `[resource]Error` - Error messages
- `selected[Resource]` - Current selection
## Environment Variables ## Environment Variables
@ -356,15 +197,8 @@ SUPABASE_URL=https://your-project.supabase.co # Or http://host.docker.internal:
SUPABASE_SERVICE_KEY=your-service-key-here # Use legacy key format for cloud Supabase SUPABASE_SERVICE_KEY=your-service-key-here # Use legacy key format for cloud Supabase
``` ```
Optional: Optional variables and full configuration:
See `python/.env.example` for complete list
```bash
LOGFIRE_TOKEN=your-logfire-token # For observability
LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR
ARCHON_SERVER_PORT=8181 # Server port
ARCHON_MCP_PORT=8051 # MCP server port
ARCHON_UI_PORT=3737 # Frontend port
```
## Common Development Tasks ## Common Development Tasks
@ -383,6 +217,14 @@ ARCHON_UI_PORT=3737 # Frontend port
4. Use TanStack Query hook from `src/features/[feature]/hooks/` 4. Use TanStack Query hook from `src/features/[feature]/hooks/`
5. Apply Tron-inspired glassmorphism styling with Tailwind 5. Apply Tron-inspired glassmorphism styling with Tailwind
### Add or modify MCP tools
1. MCP tools are in `python/src/mcp_server/features/[feature]/[feature]_tools.py`
2. Follow the pattern:
- `find_[resource]` - Handles list, search, and get single item operations
- `manage_[resource]` - Handles create, update, delete with an "action" parameter
3. Register tools in the feature's `__init__.py` file
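
The `find_[resource]` / `manage_[resource]` split described in the steps above can be sketched roughly as follows. This is a minimal illustration only: the in-memory `_TASKS` store, the response shape, and the parameter names are assumptions made for the example, not the actual contents of any `[feature]_tools.py` file, and MCP registration is deliberately left out.

```python
import uuid
from typing import Any, Optional

# Hypothetical in-memory store standing in for the real persistence layer.
_TASKS: dict[str, dict[str, Any]] = {}


def find_tasks(task_id: Optional[str] = None, query: Optional[str] = None) -> dict[str, Any]:
    """One entry point for get-by-id, search, and list."""
    if task_id is not None:
        task = _TASKS.get(task_id)
        if task is None:
            return {"success": False, "error": f"Task {task_id} not found"}
        return {"success": True, "task": task}

    tasks = list(_TASKS.values())
    if query:
        # Case-insensitive substring match on the title.
        tasks = [t for t in tasks if query.lower() in t["title"].lower()]
    return {"success": True, "tasks": tasks, "count": len(tasks)}


def manage_task(
    action: str,
    task_id: Optional[str] = None,
    title: Optional[str] = None,
) -> dict[str, Any]:
    """One entry point for create, update, and delete, selected by `action`."""
    if action == "create":
        if not title:
            return {"success": False, "error": "title is required for create"}
        new_id = uuid.uuid4().hex
        _TASKS[new_id] = {"id": new_id, "title": title}
        return {"success": True, "task": _TASKS[new_id]}

    if action in ("update", "delete"):
        if task_id is None or task_id not in _TASKS:
            return {"success": False, "error": f"Task {task_id} not found"}
        if action == "delete":
            del _TASKS[task_id]
            return {"success": True, "deleted": task_id}
        if title:
            _TASKS[task_id]["title"] = title
        return {"success": True, "task": _TASKS[task_id]}

    # Unknown actions fail loudly with a specific message.
    return {"success": False, "error": f"Unknown action '{action}'; expected create, update, or delete"}
```

Keeping reads in one function and writes in another keeps the tool list short while still covering every operation through parameters.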
### Debug MCP connection issues ### Debug MCP connection issues
1. Check MCP health: `curl http://localhost:8051/health` 1. Check MCP health: `curl http://localhost:8051/health`
@ -421,21 +263,37 @@ npm run lint:files src/components/SomeComponent.tsx
## MCP Tools Available ## MCP Tools Available
When connected to Client/Cursor/Windsurf: When connected to Claude/Cursor/Windsurf, the following tools are available:
- `archon:perform_rag_query` - Search knowledge base ### Knowledge Base Tools
- `archon:search_code_examples` - Find code snippets
- `archon:create_project` - Create new project - `archon:rag_search_knowledge_base` - Search knowledge base for relevant content
- `archon:list_projects` - List all projects - `archon:rag_search_code_examples` - Find code snippets in the knowledge base
- `archon:create_task` - Create task in project - `archon:rag_get_available_sources` - List available knowledge sources
- `archon:list_tasks` - List and filter tasks
- `archon:update_task` - Update task status/details ### Project Management
- `archon:get_available_sources` - List knowledge sources
- `archon:find_projects` - Find all projects, search, or get specific project (by project_id)
- `archon:manage_project` - Manage projects with actions: "create", "update", "delete"
### Task Management
- `archon:find_tasks` - Find tasks with search, filters, or get specific task (by task_id)
- `archon:manage_task` - Manage tasks with actions: "create", "update", "delete"
### Document Management
- `archon:find_documents` - Find documents, search, or get specific document (by document_id)
- `archon:manage_document` - Manage documents with actions: "create", "update", "delete"
### Version Control
- `archon:find_versions` - Find version history or get specific version
- `archon:manage_version` - Manage versions with actions: "create", "restore"
## Important Notes ## Important Notes
- Projects feature is optional - toggle in Settings UI - Projects feature is optional - toggle in Settings UI
- All services communicate via HTTP, not gRPC
- HTTP polling handles all updates - HTTP polling handles all updates
- Frontend uses Vite proxy for API calls in development - Frontend uses Vite proxy for API calls in development
- Python backend uses `uv` for dependency management - Python backend uses `uv` for dependency management

236
CLAUDE.md

@ -8,9 +8,13 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
### Core Principles ### Core Principles
- **No backwards compatibility** - remove deprecated code immediately - **No backwards compatibility; we follow a fix-forward approach** - remove deprecated code immediately
- **Detailed errors over graceful failures** - we want to identify and fix issues fast - **Detailed errors over graceful failures** - we want to identify and fix issues fast
- **Break things to improve them** - beta is for rapid iteration - **Break things to improve them** - beta is for rapid iteration
- **Continuous improvement** - embrace change and learn from mistakes
- **KISS** - keep it simple
- **DRY** when appropriate
- **YAGNI** — don't implement features that are not needed
### Error Handling ### Error Handling
@ -40,51 +44,7 @@ These operations should continue but track and report failures clearly:
#### Critical Nuance: Never Accept Corrupted Data #### Critical Nuance: Never Accept Corrupted Data
When a process should continue despite failures, it must **skip the failed item entirely** rather than storing corrupted data: When a process should continue despite failures, it must **skip the failed item entirely** rather than storing corrupted data
**❌ WRONG - Silent Corruption:**
```python
try:
embedding = create_embedding(text)
except Exception as e:
embedding = [0.0] * 1536 # NEVER DO THIS - corrupts database
store_document(doc, embedding)
```
**✅ CORRECT - Skip Failed Items:**
```python
try:
embedding = create_embedding(text)
store_document(doc, embedding) # Only store on success
except Exception as e:
failed_items.append({'doc': doc, 'error': str(e)})
logger.error(f"Skipping document {doc.id}: {e}")
# Continue with next document, don't store anything
```
**✅ CORRECT - Batch Processing with Failure Tracking:**
```python
def process_batch(items):
results = {'succeeded': [], 'failed': []}
for item in items:
try:
result = process_item(item)
results['succeeded'].append(result)
except Exception as e:
results['failed'].append({
'item': item,
'error': str(e),
'traceback': traceback.format_exc()
})
logger.error(f"Failed to process {item.id}: {e}")
# Always return both successes and failures
return results
```
#### Error Message Guidelines #### Error Message Guidelines
@ -99,9 +59,9 @@ def process_batch(items):
- Remove dead code immediately rather than maintaining it - no backward compatibility or legacy functions - Remove dead code immediately rather than maintaining it - no backward compatibility or legacy functions
- Avoid backward compatibility mappings or legacy function wrappers - Avoid backward compatibility mappings or legacy function wrappers
- Prioritize functionality over production-ready patterns - Fix forward
- Focus on user experience and feature completeness - Focus on user experience and feature completeness
- When updating code, don't reference what is changing (avoid keywords like LEGACY, CHANGED, REMOVED), instead focus on comments that document just the functionality of the code - When updating code, don't reference what is changing (avoid keywords like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED), instead focus on comments that document just the functionality of the code
- When commenting on code in the codebase, only comment on the functionality and reasoning behind the code. Refrain from speaking to Archon being in "beta" or referencing anything else that comes from these global rules. - When commenting on code in the codebase, only comment on the functionality and reasoning behind the code. Refrain from speaking to Archon being in "beta" or referencing anything else that comes from these global rules.
## Development Commands ## Development Commands
@ -176,139 +136,33 @@ make test-be # Backend tests only
## Architecture Overview ## Architecture Overview
Archon Beta is a microservices-based knowledge management system with MCP (Model Context Protocol) integration: @PRPs/ai_docs/ARCHITECTURE.md
### Service Architecture #### TanStack Query Implementation
- **Frontend (port 3737)**: React + TypeScript + Vite + TailwindCSS For architecture and file references:
- **Dual UI Strategy**: @PRPs/ai_docs/DATA_FETCHING_ARCHITECTURE.md
- `/features` - Modern vertical slice with Radix UI primitives + TanStack Query
- `/components` - Legacy custom components (being migrated)
- **State Management**: TanStack Query for all data fetching (no prop drilling)
- **Styling**: Tron-inspired glassmorphism with Tailwind CSS
- **Linting**: Biome for `/features`, ESLint for legacy code
- **Main Server (port 8181)**: FastAPI with HTTP polling for updates For code patterns and examples:
- Handles all business logic, database operations, and external API calls @PRPs/ai_docs/QUERY_PATTERNS.md
- WebSocket support removed in favor of HTTP polling with ETag caching
- **MCP Server (port 8051)**: Lightweight HTTP-based MCP protocol server
- Provides tools for AI assistants (Claude, Cursor, Windsurf)
- Exposes knowledge search, task management, and project operations
- **Agents Service (port 8052)**: PydanticAI agents for AI/ML operations
- Handles complex AI workflows and document processing
- **Database**: Supabase (PostgreSQL + pgvector for embeddings)
- Cloud or local Supabase both supported
- pgvector for semantic search capabilities
### Frontend Architecture Details
#### Vertical Slice Architecture (/features)
Features are organized by domain hierarchy with self-contained modules:
```
src/features/
├── ui/
│ ├── primitives/ # Radix UI base components
│ ├── hooks/ # Shared UI hooks (useSmartPolling, etc)
│ └── types/ # UI type definitions
├── projects/
│ ├── components/ # Project UI components
│ ├── hooks/ # Project hooks (useProjectQueries, etc)
│ ├── services/ # Project API services
│ ├── types/ # Project type definitions
│ ├── tasks/ # Tasks sub-feature (nested under projects)
│ │ ├── components/
│ │ ├── hooks/ # Task-specific hooks
│ │ ├── services/ # Task API services
│ │ └── types/
│ └── documents/ # Documents sub-feature
│ ├── components/
│ ├── services/
│ └── types/
```
#### TanStack Query Patterns
All data fetching uses TanStack Query with consistent patterns:
```typescript
// Query keys factory pattern
export const projectKeys = {
all: ["projects"] as const,
lists: () => [...projectKeys.all, "list"] as const,
detail: (id: string) => [...projectKeys.all, "detail", id] as const,
};
// Smart polling with visibility awareness
const { refetchInterval } = useSmartPolling(10000); // Pauses when tab inactive
// Optimistic updates with rollback
useMutation({
onMutate: async (data) => {
await queryClient.cancelQueries(key);
const previous = queryClient.getQueryData(key);
queryClient.setQueryData(key, optimisticData);
return { previous };
},
onError: (err, vars, context) => {
if (context?.previous) {
queryClient.setQueryData(key, context.previous);
}
},
});
```
### Backend Architecture Details
#### Service Layer Pattern #### Service Layer Pattern
```python See implementation examples:
# API Route -> Service -> Database - API routes: `python/src/server/api_routes/projects_api.py`
# src/server/api_routes/projects.py - Service layer: `python/src/server/services/project_service.py`
@router.get("/{project_id}") - Pattern: API Route → Service → Database
async def get_project(project_id: str):
return await project_service.get_project(project_id)
# src/server/services/project_service.py
async def get_project(project_id: str):
# Business logic here
return await db.fetch_project(project_id)
```
#### Error Handling Patterns #### Error Handling Patterns
```python See implementation examples:
# Use specific exceptions - Custom exceptions: `python/src/server/exceptions.py`
class ProjectNotFoundError(Exception): pass - Exception handlers: `python/src/server/main.py` (search for @app.exception_handler)
class ValidationError(Exception): pass - Service error handling: `python/src/server/services/` (various services)
# Rich error responses ## ETag Implementation
@app.exception_handler(ProjectNotFoundError)
async def handle_not_found(request, exc):
return JSONResponse(
status_code=404,
content={"detail": str(exc), "type": "not_found"}
)
```
## Polling Architecture @PRPs/ai_docs/ETAG_IMPLEMENTATION.md
### HTTP Polling (replaced Socket.IO)
- **Polling intervals**: 1-2s for active operations, 5-10s for background data
- **ETag caching**: Reduces bandwidth by ~70% via 304 Not Modified responses
- **Smart pausing**: Stops polling when browser tab is inactive
- **Progress endpoints**: `/api/progress/{id}` for operation tracking
### Key Polling Hooks
- `useSmartPolling` - Adjusts interval based on page visibility/focus
- `useCrawlProgressPolling` - Specialized for crawl progress with auto-cleanup
- `useProjectTasks` - Smart polling for task lists
## Database Schema ## Database Schema
@ -328,25 +182,9 @@ Key tables in Supabase:
## API Naming Conventions ## API Naming Conventions
### Task Status Values @PRPs/ai_docs/API_NAMING_CONVENTIONS.md
Use database values directly (no UI mapping): Use database values directly (no FE mapping; type-safe end-to-end from BE upward):
- `todo`, `doing`, `review`, `done`
### Service Method Patterns
- `get[Resource]sByProject(projectId)` - Scoped queries
- `get[Resource](id)` - Single resource
- `create[Resource](data)` - Create operations
- `update[Resource](id, updates)` - Updates
- `delete[Resource](id)` - Soft deletes
### State Naming
- `is[Action]ing` - Loading states (e.g., `isSwitchingProject`)
- `[resource]Error` - Error messages
- `selected[Resource]` - Current selection
## Environment Variables ## Environment Variables
@ -357,15 +195,8 @@ SUPABASE_URL=https://your-project.supabase.co # Or http://host.docker.internal:
SUPABASE_SERVICE_KEY=your-service-key-here # Use legacy key format for cloud Supabase SUPABASE_SERVICE_KEY=your-service-key-here # Use legacy key format for cloud Supabase
``` ```
Optional: Optional variables and full configuration:
See `python/.env.example` for complete list
```bash
LOGFIRE_TOKEN=your-logfire-token # For observability
LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR
ARCHON_SERVER_PORT=8181 # Server port
ARCHON_MCP_PORT=8051 # MCP server port
ARCHON_UI_PORT=3737 # Frontend port
```
## Common Development Tasks ## Common Development Tasks
@ -390,8 +221,7 @@ ARCHON_UI_PORT=3737 # Frontend port
2. Follow the pattern: 2. Follow the pattern:
- `find_[resource]` - Handles list, search, and get single item operations - `find_[resource]` - Handles list, search, and get single item operations
- `manage_[resource]` - Handles create, update, delete with an "action" parameter - `manage_[resource]` - Handles create, update, delete with an "action" parameter
3. Optimize responses by truncating/filtering fields in list operations 3. Register tools in the feature's `__init__.py` file
4. Register tools in the feature's `__init__.py` file
### Debug MCP connection issues ### Debug MCP connection issues
@ -434,31 +264,35 @@ npm run lint:files src/components/SomeComponent.tsx
When connected to Claude/Cursor/Windsurf, the following tools are available: When connected to Claude/Cursor/Windsurf, the following tools are available:
### Knowledge Base Tools ### Knowledge Base Tools
- `archon:rag_search_knowledge_base` - Search knowledge base for relevant content - `archon:rag_search_knowledge_base` - Search knowledge base for relevant content
- `archon:rag_search_code_examples` - Find code snippets in the knowledge base - `archon:rag_search_code_examples` - Find code snippets in the knowledge base
- `archon:rag_get_available_sources` - List available knowledge sources - `archon:rag_get_available_sources` - List available knowledge sources
### Project Management ### Project Management
- `archon:find_projects` - Find all projects, search, or get specific project (by project_id) - `archon:find_projects` - Find all projects, search, or get specific project (by project_id)
- `archon:manage_project` - Manage projects with actions: "create", "update", "delete" - `archon:manage_project` - Manage projects with actions: "create", "update", "delete"
### Task Management ### Task Management
- `archon:find_tasks` - Find tasks with search, filters, or get specific task (by task_id) - `archon:find_tasks` - Find tasks with search, filters, or get specific task (by task_id)
- `archon:manage_task` - Manage tasks with actions: "create", "update", "delete" - `archon:manage_task` - Manage tasks with actions: "create", "update", "delete"
### Document Management ### Document Management
- `archon:find_documents` - Find documents, search, or get specific document (by document_id) - `archon:find_documents` - Find documents, search, or get specific document (by document_id)
- `archon:manage_document` - Manage documents with actions: "create", "update", "delete" - `archon:manage_document` - Manage documents with actions: "create", "update", "delete"
### Version Control ### Version Control
- `archon:find_versions` - Find version history or get specific version - `archon:find_versions` - Find version history or get specific version
- `archon:manage_version` - Manage versions with actions: "create", "restore" - `archon:manage_version` - Manage versions with actions: "create", "restore"
## Important Notes ## Important Notes
- Projects feature is optional - toggle in Settings UI - Projects feature is optional - toggle in Settings UI
- All services communicate via HTTP, not gRPC - TanStack Query handles all data fetching; smart HTTP polling is used where appropriate (no WebSockets)
- HTTP polling handles all updates
- Frontend uses Vite proxy for API calls in development - Frontend uses Vite proxy for API calls in development
- Python backend uses `uv` for dependency management - Python backend uses `uv` for dependency management
- Docker Compose handles service orchestration - Docker Compose handles service orchestration


@ -1,164 +1,249 @@
# API Naming Conventions # API Naming Conventions
## Overview ## Overview
This document defines the naming conventions used throughout the Archon V2 codebase for consistency and clarity.
## Task Status Values This document describes the actual naming conventions used throughout Archon's codebase based on current implementation patterns. All examples reference real files where these patterns are implemented.
**Database values only - no UI mapping:**
- `todo` - Task is in backlog/todo state
- `doing` - Task is actively being worked on
- `review` - Task is pending review
- `done` - Task is completed
## Service Method Naming ## Backend API Endpoints
### Project Service (`projectService.ts`) ### RESTful Route Patterns
**Reference**: `python/src/server/api_routes/projects_api.py`
#### Projects Standard REST patterns used:
- `GET /api/{resource}` - List all resources
- `POST /api/{resource}` - Create new resource
- `GET /api/{resource}/{id}` - Get single resource
- `PUT /api/{resource}/{id}` - Update resource
- `DELETE /api/{resource}/{id}` - Delete resource
Nested resource patterns:
- `GET /api/projects/{project_id}/tasks` - Tasks scoped to project
- `GET /api/projects/{project_id}/docs` - Documents scoped to project
- `POST /api/projects/{project_id}/versions` - Create version for project
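
As a rough illustration of how the flat and nested route shapes listed above compose, the sketch below builds the paths as plain strings. The helper names `resource_path` and `nested_path` and the `API_PREFIX` constant are hypothetical and exist only for this example; the real routes are declared in `projects_api.py`.

```python
from typing import Optional

# Assumed common prefix, taken from the route patterns listed above.
API_PREFIX = "/api"


def resource_path(resource: str, resource_id: Optional[str] = None) -> str:
    """Collection path (/api/{resource}) or item path (/api/{resource}/{id})."""
    base = f"{API_PREFIX}/{resource}"
    return base if resource_id is None else f"{base}/{resource_id}"


def nested_path(parent: str, parent_id: str, child: str) -> str:
    """Child collection scoped to one parent, e.g. tasks under a project."""
    return f"{resource_path(parent, parent_id)}/{child}"
```

For example, `nested_path("projects", "abc", "tasks")` yields `/api/projects/abc/tasks`, matching the project-scoped task route.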
### Actual Endpoint Examples
From `python/src/server/api_routes/`:
**Projects** (`projects_api.py`):
- `/api/projects` - Project CRUD
- `/api/projects/{project_id}/features` - Get project features
- `/api/projects/{project_id}/tasks` - Project-scoped tasks
- `/api/projects/{project_id}/docs` - Project documents
- `/api/projects/{project_id}/versions` - Version history
**Knowledge** (`knowledge_api.py`):
- `/api/knowledge/sources` - Knowledge sources
- `/api/knowledge/crawl` - Start web crawl
- `/api/knowledge/upload` - Upload document
- `/api/knowledge/search` - RAG search
- `/api/knowledge/code-search` - Code-specific search
**Progress** (`progress_api.py`):
- `/api/progress/active` - Active operations
- `/api/progress/{operation_id}` - Specific operation status
**MCP** (`mcp_api.py`):
- `/api/mcp/status` - MCP server status
- `/api/mcp/execute` - Execute MCP tool
## Frontend Service Methods
### Service Object Pattern
**Reference**: `archon-ui-main/src/features/projects/services/projectService.ts`
Services are exported as objects with async methods:
```typescript
export const serviceNameService = {
async methodName(): Promise<ReturnType> { ... }
}
```
### Standard Service Method Names
Actual patterns from service files:
**List Operations**:
- `listProjects()` - Get all projects - `listProjects()` - Get all projects
- `getProject(projectId)` - Get single project by ID - `getTasksByProject(projectId)` - Get filtered list
- `createProject(projectData)` - Create new project - `getTasksByStatus(status)` - Get by specific criteria
- `updateProject(projectId, updates)` - Update project
- `deleteProject(projectId)` - Delete project
#### Tasks **Single Item Operations**:
- `getTasksByProject(projectId)` - Get all tasks for a specific project - `getProject(projectId)` - Get single item
- `getTask(taskId)` - Get single task by ID - `getTask(taskId)` - Direct ID access
- `createTask(taskData)` - Create new task
- `updateTask(taskId, updates)` - Update task with partial data
- `updateTaskStatus(taskId, status)` - Update only task status
- `updateTaskOrder(taskId, newOrder, newStatus?)` - Update task position/order
- `deleteTask(taskId)` - Delete task (soft delete/archive)
- `getTasksByStatus(status)` - Get all tasks with specific status
#### Documents **Create Operations**:
- `getDocuments(projectId)` - Get all documents for project - `createProject(data)` - Returns created entity
- `getDocument(projectId, docId)` - Get single document - `createTask(data)` - Includes server-generated fields
- `createDocument(projectId, documentData)` - Create document
- `updateDocument(projectId, docId, updates)` - Update document
- `deleteDocument(projectId, docId)` - Delete document
#### Versions **Update Operations**:
- `createVersion(projectId, field, content)` - Create version snapshot - `updateProject(id, updates)` - Partial updates
- `listVersions(projectId, fieldName?)` - List version history - `updateTaskStatus(id, status)` - Specific field update
- `getVersion(projectId, fieldName, versionNumber)` - Get specific version - `updateTaskOrder(id, order, status?)` - Complex updates
- `restoreVersion(projectId, fieldName, versionNumber)` - Restore version
## API Endpoint Patterns **Delete Operations**:
- `deleteProject(id)` - Returns void
- `deleteTask(id)` - Soft delete pattern
### RESTful Endpoints ### Service File Locations
``` - **Projects**: `archon-ui-main/src/features/projects/services/projectService.ts`
GET /api/projects - List all projects - **Tasks**: `archon-ui-main/src/features/projects/tasks/services/taskService.ts`
POST /api/projects - Create project - **Knowledge**: `archon-ui-main/src/features/knowledge/services/knowledgeService.ts`
GET /api/projects/{project_id} - Get project - **Progress**: `archon-ui-main/src/features/progress/services/progressService.ts`
PUT /api/projects/{project_id} - Update project
DELETE /api/projects/{project_id} - Delete project
GET /api/projects/{project_id}/tasks - Get project tasks ## React Hook Naming
POST /api/tasks - Create task (project_id in body)
GET /api/tasks/{task_id} - Get task
PUT /api/tasks/{task_id} - Update task
DELETE /api/tasks/{task_id} - Delete task
GET /api/projects/{project_id}/docs - Get project documents ### Query Hooks
POST /api/projects/{project_id}/docs - Create document **Reference**: `archon-ui-main/src/features/projects/tasks/hooks/useTaskQueries.ts`
GET /api/projects/{project_id}/docs/{doc_id} - Get document
PUT /api/projects/{project_id}/docs/{doc_id} - Update document
DELETE /api/projects/{project_id}/docs/{doc_id} - Delete document
```
### Progress/Polling Endpoints Standard patterns:
``` - `use[Resource]()` - List query (e.g., `useProjects`)
GET /api/progress/{operation_id} - Generic operation progress - `use[Resource]Detail(id)` - Single item query
GET /api/knowledge/crawl-progress/{id} - Crawling progress - `use[Parent][Resource](parentId)` - Scoped query (e.g., `useProjectTasks`)
GET /api/agent-chat/sessions/{id}/messages - Chat messages
``` ### Mutation Hooks
- `useCreate[Resource]()` - Creation mutation
- `useUpdate[Resource]()` - Update mutation
- `useDelete[Resource]()` - Deletion mutation
### Utility Hooks
**Reference**: `archon-ui-main/src/features/ui/hooks/`
- `useSmartPolling()` - Visibility-aware polling
- `useToast()` - Toast notifications
- `useDebounce()` - Debounced values
## Type Naming Conventions
### Type Definition Patterns
**Reference**: `archon-ui-main/src/features/projects/types/`
**Entity Types**:
- `Project` - Core entity type
- `Task` - Business object
- `Document` - Data model
**Request/Response Types**:
- `Create[Entity]Request` - Creation payload
- `Update[Entity]Request` - Update payload
- `[Entity]Response` - API response wrapper
**Database Types**:
- `DatabaseTaskStatus` - Exact database values
**Location**: `archon-ui-main/src/features/projects/tasks/types/task.ts`
Values: `"todo" | "doing" | "review" | "done"`
### Type File Organization
Following vertical slice architecture:
- Core types in `{feature}/types/`
- Sub-feature types in `{feature}/{subfeature}/types/`
- Shared types in `shared/types/`
## Query Key Factories
**Reference**: Each feature's `hooks/use{Feature}Queries.ts` file
Standard factory pattern:
- `{resource}Keys.all` - Base key for invalidation
- `{resource}Keys.lists()` - List queries
- `{resource}Keys.detail(id)` - Single item queries
- `{resource}Keys.byProject(projectId)` - Scoped queries
Examples:
- `projectKeys` - Projects domain
- `taskKeys` - Tasks (dual nature: global and project-scoped)
- `knowledgeKeys` - Knowledge base
- `progressKeys` - Progress tracking
- `documentKeys` - Document management
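The factory pattern listed above can be sketched as follows. The key segments (`"tasks"`, `"list"`, `"detail"`, `"projects"`) are assumptions chosen for illustration; the actual arrays are defined in each feature's `use{Feature}Queries.ts` file.

```typescript
// Hypothetical key factory for a task resource; segment names are illustrative.
const taskKeys = {
  // Base key: invalidating this prefix matches every task query below it.
  all: ["tasks"] as const,
  // List queries
  lists: () => [...taskKeys.all, "list"] as const,
  // Single item queries
  detail: (id: string) => [...taskKeys.all, "detail", id] as const,
  // Project-scoped queries
  byProject: (projectId: string) => ["projects", projectId, "tasks"] as const,
};
```

Building keys through one object keeps the arrays consistent between queries and cache invalidation, which is why hand-written key arrays are discouraged.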
## Component Naming ## Component Naming
### Hooks ### Page Components
- `use[Feature]` - Custom hooks (e.g., `usePolling`, `useProjectMutation`) **Location**: `archon-ui-main/src/pages/`
- Returns object with: `{ data, isLoading, error, refetch }` - `[Feature]Page.tsx` - Top-level pages
- `[Feature]View.tsx` - Main view components
### Services ### Feature Components
- `[feature]Service` - Service modules (e.g., `projectService`, `crawlProgressService`) **Location**: `archon-ui-main/src/features/{feature}/components/`
- Methods return Promises with typed responses - `[Entity]Card.tsx` - Card displays
- `[Entity]List.tsx` - List containers
- `[Entity]Form.tsx` - Form components
- `New[Entity]Modal.tsx` - Creation modals
- `Edit[Entity]Modal.tsx` - Edit modals
### Components ### Shared Components
- `[Feature][Type]` - UI components (e.g., `TaskBoardView`, `EditTaskModal`) **Location**: `archon-ui-main/src/features/ui/primitives/`
- Props interfaces: `[Component]Props` - Radix UI-based primitives
- Generic, reusable components
## State Variable Naming ## State Variable Naming
### Loading States ### Loading States
- `isLoading[Feature]` - Boolean loading indicators **Examples from**: `archon-ui-main/src/features/projects/views/ProjectsView.tsx`
- `isSwitchingProject` - Specific operation states - `isLoading` - Generic loading
- `movingTaskIds` - Set/Array of items being processed - `is[Action]ing` - Specific operations (e.g., `isSwitchingProject`)
- `[action]ingIds` - Sets of items being processed
### Error States ### Error States
- `[feature]Error` - Error message strings - `error` - Query errors
- `taskOperationError` - Specific operation errors - `[operation]Error` - Specific operation errors
### Data States ### Selection States
- `[feature]s` - Plural for collections (e.g., `tasks`, `projects`) - `selected[Entity]` - Currently selected item
- `selected[Feature]` - Currently selected item - `active[Entity]Id` - Active item ID
- `[feature]Data` - Raw data from API
## Type Definitions ## Constants and Enums
### Database Types (from backend) ### Status Values
```typescript **Location**: `archon-ui-main/src/features/projects/tasks/types/task.ts`
type DatabaseTaskStatus = 'todo' | 'doing' | 'review' | 'done'; Database values used directly - no mapping layers:
type Assignee = string; // Flexible string to support any agent name - Task statuses: `"todo"`, `"doing"`, `"review"`, `"done"`
// Common values: 'User', 'Archon', 'Coding Agent' - Operation statuses: `"pending"`, `"processing"`, `"completed"`, `"failed"`
```
### Request/Response Types ### Time Constants
```typescript **Location**: `archon-ui-main/src/features/shared/queryPatterns.ts`
Create[Feature]Request // e.g., CreateTaskRequest - `STALE_TIMES.instant` - 0ms
Update[Feature]Request // e.g., UpdateTaskRequest - `STALE_TIMES.realtime` - 3 seconds
[Feature]Response // e.g., TaskResponse - `STALE_TIMES.frequent` - 5 seconds
``` - `STALE_TIMES.normal` - 30 seconds
- `STALE_TIMES.rare` - 5 minutes
- `STALE_TIMES.static` - Infinity
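Expressed in milliseconds, the values above would look roughly like this. It is a sketch of the constant's shape based only on the durations listed here, not a copy of `queryPatterns.ts`; the `isStale` helper is hypothetical and only illustrates how a stale time is interpreted.

```typescript
// Stale times in milliseconds, mirroring the durations listed above.
const STALE_TIMES = {
  instant: 0,
  realtime: 3000,
  frequent: 5000,
  normal: 30000,
  rare: 300000,
  static: Infinity,
} as const;

// Hypothetical helper: data is stale once its age reaches the stale time.
function isStale(fetchedAtMs: number, staleTimeMs: number, nowMs: number): boolean {
  return nowMs - fetchedAtMs >= staleTimeMs;
}
```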
## Function Naming Patterns ## File Naming Patterns
### Event Handlers ### Service Layer
- `handle[Event]` - Generic handlers (e.g., `handleProjectSelect`) - `{feature}Service.ts` - Service modules
- `on[Event]` - Props callbacks (e.g., `onTaskMove`, `onRefresh`) - Use lower camelCase with "Service" suffix (e.g., `projectService.ts`)
### Operations ### Hook Files
- `load[Feature]` - Fetch data (e.g., `loadTasksForProject`) - `use{Feature}Queries.ts` - Query hooks and keys
- `save[Feature]` - Persist changes (e.g., `saveTask`) - `use{Feature}.ts` - Feature-specific hooks
- `delete[Feature]` - Remove items (e.g., `deleteTask`)
- `refresh[Feature]` - Reload data (e.g., `refreshTasks`)
### Formatting/Transformation ### Type Files
- `format[Feature]` - Format for display (e.g., `formatTask`) - `index.ts` - Barrel exports
- `validate[Feature]` - Validate data (e.g., `validateUpdateTask`) - `{entity}.ts` - Specific entity types
### Test Files
- `{filename}.test.ts` - Unit tests
- Located in `tests/` subdirectories
## Best Practices ## Best Practices
### ✅ Do Use ### Do Follow
- `getTasksByProject(projectId)` - Clear scope with context - Use exact database values (no translation layers)
- `status` - Single source of truth from database - Keep consistent patterns within features
- Direct database values everywhere (no mapping) - Use query key factories for all cache operations
- Polling with `usePolling` hook for data fetching - Follow vertical slice architecture
- Async/await with proper error handling - Reference shared constants
- ETag headers for efficient polling
- Loading indicators during operations
## Current Architecture Patterns ### Don't Do
- Don't create mapping layers for database values
- Don't hardcode time values
- Don't mix query keys between features
- Don't use inconsistent naming within a feature
- Don't embed business logic in components
### Polling & Data Fetching ## Common Patterns Reference
- HTTP polling with `usePolling` and `useCrawlProgressPolling` hooks
- ETag-based caching for bandwidth efficiency
- Loading state indicators (`isLoading`, `isSwitchingProject`)
- Error toast notifications for user feedback
- Manual refresh triggers via `refetch()`
- Immediate UI updates followed by API calls
### Service Architecture For implementation examples, see:
- Specialized services for different domains (`projectService`, `crawlProgressService`) - Query patterns: Any `use{Feature}Queries.ts` file
- Direct database value usage (no UI/DB mapping) - Service patterns: Any `{feature}Service.ts` file
- Promise-based async operations - Type patterns: Any `{feature}/types/` directory
- Typed request/response interfaces - Component patterns: Any `{feature}/components/` directory


@@ -2,480 +2,194 @@
## Overview ## Overview
Archon follows a **Vertical Slice Architecture** pattern where features are organized by business capability rather than technical layers. Each module is self-contained with its own API, business logic, and data access, making the system modular, maintainable, and ready for future microservice extraction if needed. Archon is a knowledge management system with AI capabilities, built as a monolithic application with vertical slice organization. The frontend uses React with TanStack Query, while the backend runs FastAPI with multiple service components.
## Core Principles ## Tech Stack
1. **Feature Cohesion**: All code for a feature lives together **Frontend**: React 18, TypeScript 5, TanStack Query v5, Tailwind CSS, Vite
2. **Module Independence**: Modules communicate through well-defined interfaces **Backend**: Python 3.12, FastAPI, Supabase, PydanticAI
3. **Vertical Slices**: Each feature contains its complete stack (API → Service → Repository) **Infrastructure**: Docker, PostgreSQL + pgvector
4. **Shared Minimal**: Only truly cross-cutting concerns go in shared
5. **Migration Ready**: Structure supports easy extraction to microservices
## Directory Structure ## Directory Structure
``` ### Backend (`python/src/`)
archon/ ```text
├── python/ server/ # Main FastAPI application
│ ├── src/ ├── api_routes/ # HTTP endpoints
│ │ ├── knowledge/ # Knowledge Management Module ├── services/ # Business logic
│ │ │ ├── __init__.py ├── models/ # Data models
│ │ │ ├── main.py # Knowledge module entry point ├── config/ # Configuration
│ │ │ ├── shared/ # Shared within knowledge context ├── middleware/ # Request processing
│ │ │ │ ├── models.py └── utils/ # Shared utilities
│ │ │ │ ├── exceptions.py
│ │ │ │ └── utils.py mcp_server/ # MCP server for IDE integration
│ │ │ └── features/ # Knowledge feature slices └── features/ # MCP tool implementations
│ │ │ ├── crawling/ # Web crawling feature
│ │ │ │ ├── __init__.py agents/ # AI agents (PydanticAI)
│ │ │ │ ├── api.py # Crawl endpoints └── features/ # Agent capabilities
│ │ │ │ ├── service.py # Crawling orchestration
│ │ │ │ ├── models.py # Crawl-specific models
│ │ │ │ ├── repository.py # Crawl data storage
│ │ │ │ └── tests/
│ │ │ ├── document_processing/ # Document upload & processing
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Upload endpoints
│ │ │ │ ├── service.py # PDF/DOCX processing
│ │ │ │ ├── extractors.py # Text extraction
│ │ │ │ └── tests/
│ │ │ ├── embeddings/ # Vector embeddings
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Embedding endpoints
│ │ │ │ ├── service.py # OpenAI/local embeddings
│ │ │ │ ├── models.py
│ │ │ │ └── repository.py # Vector storage
│ │ │ ├── search/ # RAG search
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Search endpoints
│ │ │ │ ├── service.py # Search algorithms
│ │ │ │ ├── reranker.py # Result reranking
│ │ │ │ └── tests/
│ │ │ ├── code_extraction/ # Code snippet extraction
│ │ │ │ ├── __init__.py
│ │ │ │ ├── service.py # Code parsing
│ │ │ │ ├── analyzers.py # Language detection
│ │ │ │ └── repository.py
│ │ │ └── source_management/ # Knowledge source CRUD
│ │ │ ├── __init__.py
│ │ │ ├── api.py
│ │ │ ├── service.py
│ │ │ └── repository.py
│ │ │
│ │ ├── projects/ # Project Management Module
│ │ │ ├── __init__.py
│ │ │ ├── main.py # Projects module entry point
│ │ │ ├── shared/ # Shared within projects context
│ │ │ │ ├── database.py # Project DB utilities
│ │ │ │ ├── models.py # Shared project models
│ │ │ │ └── exceptions.py # Project-specific exceptions
│ │ │ └── features/ # Project feature slices
│ │ │ ├── project_management/ # Project CRUD
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Project endpoints
│ │ │ │ ├── service.py # Project business logic
│ │ │ │ ├── models.py # Project models
│ │ │ │ ├── repository.py # Project DB operations
│ │ │ │ └── tests/
│ │ │ ├── task_management/ # Task CRUD
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Task endpoints
│ │ │ │ ├── service.py # Task business logic
│ │ │ │ ├── models.py # Task models
│ │ │ │ ├── repository.py # Task DB operations
│ │ │ │ └── tests/
│ │ │ ├── task_ordering/ # Drag-and-drop reordering
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Reorder endpoints
│ │ │ │ ├── service.py # Reordering algorithm
│ │ │ │ └── tests/
│ │ │ ├── document_management/ # Project documents
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Document endpoints
│ │ │ │ ├── service.py # Document logic
│ │ │ │ ├── models.py
│ │ │ │ └── repository.py
│ │ │ ├── document_versioning/ # Version control
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Version endpoints
│ │ │ │ ├── service.py # Versioning logic
│ │ │ │ ├── models.py # Version models
│ │ │ │ └── repository.py # Version storage
│ │ │ ├── ai_generation/ # AI project creation
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Generate endpoints
│ │ │ │ ├── service.py # AI orchestration
│ │ │ │ ├── agents.py # Agent interactions
│ │ │ │ ├── progress.py # Progress tracking
│ │ │ │ └── prompts.py # Generation prompts
│ │ │ ├── source_linking/ # Link to knowledge base
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api.py # Link endpoints
│ │ │ │ ├── service.py # Linking logic
│ │ │ │ └── repository.py # Junction table ops
│ │ │ └── bulk_operations/ # Batch updates
│ │ │ ├── __init__.py
│ │ │ ├── api.py # Bulk endpoints
│ │ │ ├── service.py # Batch processing
│ │ │ └── tests/
│ │ │
│ │ ├── mcp_server/ # MCP Protocol Server (IDE Integration)
│ │ │ ├── __init__.py
│ │ │ ├── main.py # MCP server entry point
│ │ │ ├── server.py # FastMCP server setup
│ │ │ ├── features/ # MCP tool implementations
│ │ │ │ ├── projects/ # Project tools for IDEs
│ │ │ │ │ ├── __init__.py
│ │ │ │ │ ├── project_tools.py
│ │ │ │ │ └── tests/
│ │ │ │ ├── tasks/ # Task tools for IDEs
│ │ │ │ │ ├── __init__.py
│ │ │ │ │ ├── task_tools.py
│ │ │ │ │ └── tests/
│ │ │ │ ├── documents/ # Document tools for IDEs
│ │ │ │ │ ├── __init__.py
│ │ │ │ │ ├── document_tools.py
│ │ │ │ │ ├── version_tools.py
│ │ │ │ │ └── tests/
│ │ │ │ └── feature_tools.py # Feature management
│ │ │ ├── modules/ # MCP modules
│ │ │ │ └── archon.py # Main Archon MCP module
│ │ │ └── utils/ # MCP utilities
│ │ │ └── tool_utils.py
│ │ │
│ │ ├── agents/ # AI Agents Module
│ │ │ ├── __init__.py
│ │ │ ├── main.py # Agents module entry point
│ │ │ ├── config.py # Agent configurations
│ │ │ ├── features/ # Agent capabilities
│ │ │ │ ├── document_agent/ # Document processing agent
│ │ │ │ │ ├── __init__.py
│ │ │ │ │ ├── agent.py # PydanticAI agent
│ │ │ │ │ ├── prompts.py # Agent prompts
│ │ │ │ │ └── tools.py # Agent tools
│ │ │ │ ├── code_agent/ # Code analysis agent
│ │ │ │ │ ├── __init__.py
│ │ │ │ │ ├── agent.py
│ │ │ │ │ └── analyzers.py
│ │ │ │ └── project_agent/ # Project creation agent
│ │ │ │ ├── __init__.py
│ │ │ │ ├── agent.py
│ │ │ │ ├── prp_generator.py
│ │ │ │ └── task_generator.py
│ │ │ └── shared/ # Shared agent utilities
│ │ │ ├── base_agent.py
│ │ │ ├── llm_client.py
│ │ │ └── response_models.py
│ │ │
│ │ ├── shared/ # Shared Across All Modules
│ │ │ ├── database/ # Database utilities
│ │ │ │ ├── __init__.py
│ │ │ │ ├── supabase.py # Supabase client
│ │ │ │ ├── migrations.py # DB migrations
│ │ │ │ └── connection_pool.py
│ │ │ ├── auth/ # Authentication
│ │ │ │ ├── __init__.py
│ │ │ │ ├── api_keys.py
│ │ │ │ └── permissions.py
│ │ │ ├── config/ # Configuration
│ │ │ │ ├── __init__.py
│ │ │ │ ├── settings.py # Environment settings
│ │ │ │ └── logfire_config.py # Logging config
│ │ │ ├── middleware/ # HTTP middleware
│ │ │ │ ├── __init__.py
│ │ │ │ ├── cors.py
│ │ │ │ └── error_handler.py
│ │ │ └── utils/ # General utilities
│ │ │ ├── __init__.py
│ │ │ ├── datetime_utils.py
│ │ │ └── json_utils.py
│ │ │
│ │ └── main.py # Application orchestrator
│ │
│ └── tests/ # Integration tests
│ ├── test_api_essentials.py
│ ├── test_service_integration.py
│ └── fixtures/
├── archon-ui-main/ # Frontend Application
│ ├── src/
│ │ ├── pages/ # Page components
│ │ │ ├── KnowledgeBasePage.tsx
│ │ │ ├── ProjectPage.tsx
│ │ │ ├── SettingsPage.tsx
│ │ │ └── MCPPage.tsx
│ │ ├── components/ # Reusable components
│ │ │ ├── knowledge-base/ # Knowledge features
│ │ │ ├── project-tasks/ # Project features
│ │ │ └── ui/ # Shared UI components
│ │ ├── services/ # API services
│ │ │ ├── api.ts # Base API client
│ │ │ ├── knowledgeBaseService.ts
│ │ │ ├── projectService.ts
│ │ │ └── pollingService.ts # New polling utilities
│ │ ├── hooks/ # React hooks
│ │ │ ├── usePolling.ts # Polling hook
│ │ │ ├── useDatabaseMutation.ts # DB-first mutations
│ │ │ └── useAsyncAction.ts
│ │ └── contexts/ # React contexts
│ │ ├── ToastContext.tsx
│ │ └── ThemeContext.tsx
│ │
│ └── tests/ # Frontend tests
├── PRPs/ # Product Requirement Prompts
│ ├── templates/ # PRP templates
│ ├── ai_docs/ # AI context documentation
│ └── *.md # Feature PRPs
├── docs/ # Documentation
│ └── architecture/ # Architecture decisions
└── docker/ # Docker configurations
├── Dockerfile
└── docker-compose.yml
``` ```
## Module Descriptions ### Frontend (`archon-ui-main/src/`)
```text
features/ # Vertical slice architecture
├── knowledge/ # Knowledge base feature
├── projects/ # Project management
│ ├── tasks/ # Task sub-feature
│ └── documents/ # Document sub-feature
├── progress/ # Operation tracking
├── mcp/ # MCP integration
├── shared/ # Cross-feature utilities
└── ui/ # UI components & hooks
### Knowledge Module (`src/knowledge/`) pages/ # Route components
components/ # Legacy components (migrating)
Core knowledge management functionality including web crawling, document processing, embeddings, and RAG search. This is the heart of Archon's knowledge engine.
**Key Features:**
- Web crawling with JavaScript rendering
- Document upload and text extraction
- Vector embeddings and similarity search
- Code snippet extraction and indexing
- Source management and organization
### Projects Module (`src/projects/`)
Project and task management system with AI-powered project generation. Currently optional via feature flag.
**Key Features:**
- Project CRUD operations
- Task management with drag-and-drop ordering
- Document management with versioning
- AI-powered project generation
- Integration with knowledge base sources
### MCP Server Module (`src/mcp_server/`)
Model Context Protocol server that exposes Archon functionality to IDEs like Cursor and Windsurf.
**Key Features:**
- Tool-based API for IDE integration
- Project and task management tools
- Document operations
- Async operation support
### Agents Module (`src/agents/`)
AI agents powered by PydanticAI for intelligent document processing and project generation.
**Key Features:**
- Document analysis and summarization
- Code understanding and extraction
- Project requirement generation
- Task breakdown and planning
### Shared Module (`src/shared/`)
Cross-cutting concerns shared across all modules. Kept minimal to maintain module independence.
**Key Components:**
- Database connections and utilities
- Authentication and authorization
- Configuration management
- Logging and observability
- Common middleware
## Communication Patterns
### Inter-Module Communication
Modules communicate through:
1. **Direct HTTP API Calls** (current)
- Projects module calls Knowledge module APIs
- Simple and straightforward
- Works well for current scale
2. **Event Bus** (future consideration)
```python
# Example event-driven communication
await event_bus.publish("project.created", {
"project_id": "123",
"created_by": "user"
})
```
3. **Shared Database** (current reality)
- All modules use same Supabase instance
- Direct foreign keys between contexts
- Will need refactoring for true microservices
## Feature Flags
Features can be toggled via environment variables:
```python
# settings.py
PROJECTS_ENABLED = env.bool("PROJECTS_ENABLED", default=False)
TASK_ORDERING_ENABLED = env.bool("TASK_ORDERING_ENABLED", default=True)
AI_GENERATION_ENABLED = env.bool("AI_GENERATION_ENABLED", default=True)
``` ```
## Database Architecture ## Core Modules
Currently using a shared Supabase (PostgreSQL) database: ### Knowledge Management
**Backend**: `python/src/server/services/knowledge_service.py`
**Frontend**: `archon-ui-main/src/features/knowledge/`
**Features**: Web crawling, document upload, embeddings, RAG search
```sql ### Project Management
-- Knowledge context tables **Backend**: `python/src/server/services/project_*_service.py`
sources **Frontend**: `archon-ui-main/src/features/projects/`
documents **Features**: Projects, tasks, documents, version history
code_examples
-- Projects context tables ### MCP Server
archon_projects **Location**: `python/src/mcp_server/`
archon_tasks **Purpose**: Exposes tools to AI IDEs (Cursor, Windsurf)
archon_document_versions **Port**: 8051
-- Cross-context junction tables ### AI Agents
archon_project_sources -- Links projects to knowledge **Location**: `python/src/agents/`
``` **Purpose**: Document processing, code analysis, project generation
**Port**: 8052
## API Structure ## API Structure
Each feature exposes its own API routes: ### RESTful Endpoints
Pattern: `{METHOD} /api/{resource}/{id?}/{sub-resource?}`
``` **Examples from** `python/src/server/api_routes/`:
/api/knowledge/ - `/api/projects` - CRUD operations
/crawl # Web crawling - `/api/projects/{id}/tasks` - Nested resources
/upload # Document upload - `/api/knowledge/search` - RAG search
/search # RAG search - `/api/progress/{id}` - Operation status
/sources # Source management
/api/projects/ ### Service Layer
/projects # Project CRUD **Pattern**: `python/src/server/services/{feature}_service.py`
/tasks # Task management - Handles business logic
/tasks/reorder # Task ordering - Database operations via Supabase client
/documents # Document management - Returns typed responses
/generate # AI generation
## Frontend Architecture
### Data Fetching
**Core**: TanStack Query v5
**Configuration**: `archon-ui-main/src/features/shared/queryClient.ts`
**Patterns**: `archon-ui-main/src/features/shared/queryPatterns.ts`
### State Management
- **Server State**: TanStack Query
- **UI State**: React hooks & context
- **No Redux/Zustand**: Query cache handles all data
### Feature Organization
Each feature follows vertical slice pattern:
```text
features/{feature}/
├── components/ # UI components
├── hooks/ # Query hooks & keys
├── services/ # API calls
└── types/ # TypeScript types
``` ```
## Deployment Architecture ### Smart Polling
**Implementation**: `archon-ui-main/src/features/ui/hooks/useSmartPolling.ts`
- Visibility-aware (pauses when tab hidden)
- Variable intervals based on focus state
### Current mixed ## Database
### Future (service modules) **Provider**: Supabase (PostgreSQL + pgvector)
**Client**: `python/src/server/config/database.py`
Each module can become its own service: ### Main Tables
- `sources` - Knowledge sources
- `documents` - Document chunks with embeddings
- `code_examples` - Extracted code
- `archon_projects` - Projects
- `archon_tasks` - Tasks
- `archon_document_versions` - Version history
```yaml ## Key Architectural Decisions
# docker-compose.yml (future)
services:
knowledge:
image: archon-knowledge
ports: ["8001:8000"]
projects: ### Vertical Slices
image: archon-projects Features own their entire stack (UI → API → DB). See any `features/{feature}/` directory.
ports: ["8002:8000"]
mcp-server: ### No WebSockets
image: archon-mcp HTTP polling with smart intervals. ETag caching reduces bandwidth by ~70%.
ports: ["8051:8051"]
agents: ### Query-First State
image: archon-agents TanStack Query is the single source of truth. No separate state management needed.
ports: ["8052:8052"]
### Direct Database Values
No translation layers. Database values (e.g., `"todo"`, `"doing"`) used directly in UI.
### Browser-Native Caching
ETags handled by browser, not JavaScript. See `archon-ui-main/src/features/shared/apiWithEtag.ts`.
## Deployment
### Development
```bash
# Backend
docker compose up -d
# or
cd python && uv run python -m src.server.main
# Frontend
cd archon-ui-main && npm run dev
``` ```
## Migration Path ### Production
Single Docker Compose deployment with all services.
### Phase 1: Current State (Modules/service) ## Configuration
- All code in one repository ### Environment Variables
- Shared database **Required**: `SUPABASE_URL`, `SUPABASE_SERVICE_KEY`
- Single deployment **Optional**: See `.env.example`
### Phase 2: Vertical Slices ### Feature Flags
Controlled via Settings UI. Projects feature can be disabled.
- Reorganize by feature ## Recent Refactors (Phases 1-5)
- Clear module boundaries
- Feature flags for control
## Development Guidelines 1. **Removed ETag cache layer** - Browser handles HTTP caching
2. **Standardized query keys** - Each feature owns its keys
3. **Fixed optimistic updates** - Stable unique IDs generated with nanoid
4. **Configured deduplication** - Centralized QueryClient
5. **Removed manual invalidations** - Trust backend consistency
### Adding a New Feature ## Performance Optimizations
1. **Identify the Module**: Which bounded context does it belong to? - **Request Deduplication**: Same query key = one request
2. **Create Feature Slice**: New folder under `module/features/` - **Smart Polling**: Adapts to tab visibility
3. **Implement Vertical Slice**: - **ETag Caching**: ~70% bandwidth reduction
- `api.py` - HTTP endpoints - **Optimistic Updates**: Instant UI feedback
- `service.py` - Business logic
- `models.py` - Data models
- `repository.py` - Data access
- `tests/` - Feature tests
### Testing Strategy ## Testing
- **Unit Tests**: Each feature has its own tests **Frontend Tests**: `archon-ui-main/src/features/*/tests/`
- **Integration Tests**: Test module boundaries **Backend Tests**: `python/tests/`
- **E2E Tests**: Test complete user flows **Patterns**: Mock services and query patterns, not implementation
### Code Organization Rules
1. **Features are Self-Contained**: All code for a feature lives together
2. **No Cross-Feature Imports**: Use module's shared or API calls
3. **Shared is Minimal**: Only truly cross-cutting concerns
4. **Dependencies Point Inward**: Features → Module Shared → Global Shared
## Technology Stack
### Backend
- **FastAPI**: Web framework
- **Supabase**: Database and auth
- **PydanticAI**: AI agents
- **OpenAI**: Embeddings and LLM
- **Crawl4AI**: Web crawling
### Frontend
- **React**: UI framework
- **TypeScript**: Type safety
- **TailwindCSS**: Styling
- **React Query**: Data fetching
- **Vite**: Build tool
### Infrastructure
- **Docker**: Containerization
- **PostgreSQL**: Database (via Supabase, desire to support any PostgreSQL)
- **pgvector**: Vector storage, Desire to support ChromaDB, Pinecone, Weaviate, etc.
## Future Considerations ## Future Considerations
### Planned Improvements - Server-Sent Events for real-time updates
- GraphQL for selective field queries
1. **Remove Socket.IO**: Replace with polling (in progress) - Separate databases per bounded context
2. **API Gateway**: Central entry point for all services - Multi-tenant support
3. **Separate Databases**: One per bounded context
### Scalability Path
1. **Vertical Scaling**: Current approach, works for single-user
2. **Horizontal Scaling**: Add load balancer and multiple instances
---
This architecture provides a clear path from the current monolithic application to a more modular approach with vertical slicing, for easy potential to service separation if needed.


@@ -0,0 +1,192 @@
# Data Fetching Architecture
## Overview
Archon uses **TanStack Query v5** for all data fetching, caching, and synchronization. This replaces the former custom polling layer with a query-centric design that handles caching, deduplication, and smart refetching (including visibility-aware polling) automatically.
## Core Components
### 1. Query Client Configuration
**Location**: `archon-ui-main/src/features/shared/queryClient.ts`
Centralized QueryClient with:
- 30-second default stale time
- 10-minute garbage collection
- Smart retry logic (skips 4xx errors)
- Request deduplication enabled
- Structural sharing for optimized re-renders
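The "skips 4xx errors" retry rule can be sketched as a predicate with the `(failureCount, error)` shape that TanStack Query accepts for its `retry` option. The error shape (`status` field) and the retry limit of 2 are assumptions for this example, not values read from `queryClient.ts`.

```typescript
// Assumed error shape: HTTP failures expose a numeric status code.
interface HttpLikeError {
  status?: number;
}

// Assumed limit for illustration; the real value is set in queryClient.ts.
const MAX_RETRIES = 2;

// Retry predicate: never retry client errors, otherwise retry up to the limit.
function shouldRetry(failureCount: number, error: HttpLikeError): boolean {
  const status = error.status;
  if (status !== undefined && status >= 400 && status < 500) {
    return false; // 4xx: the request itself is wrong, so retrying will not help
  }
  return failureCount < MAX_RETRIES;
}
```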
### 2. Smart Polling Hook
**Location**: `archon-ui-main/src/features/ui/hooks/useSmartPolling.ts`
Visibility-aware polling that:
- Pauses when browser tab is hidden
- Slows down (1.5x interval) when tab is unfocused
- Returns `refetchInterval` for use with TanStack Query
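The three behaviors above reduce to a small pure function. This is a minimal sketch of the decision logic only; the function name and parameters are hypothetical, and the real hook also has to subscribe to visibility and focus events.

```typescript
// Hypothetical helper: derive a refetch interval from tab state.
// Returns false to pause polling, matching the number | false shape
// that TanStack Query accepts for refetchInterval.
function smartInterval(
  baseIntervalMs: number,
  isVisible: boolean,
  hasFocus: boolean,
): number | false {
  if (!isVisible) return false; // hidden tab: pause polling
  if (!hasFocus) return baseIntervalMs * 1.5; // visible but unfocused: slow down
  return baseIntervalMs; // visible and focused: poll at the base rate
}
```

For example, a 2000 ms base interval becomes 3000 ms while the window is unfocused and stops entirely once the tab is hidden.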
### 3. Query Patterns
**Location**: `archon-ui-main/src/features/shared/queryPatterns.ts`
Shared constants:
- `DISABLED_QUERY_KEY` - For disabled queries
- `STALE_TIMES` - Standardized cache durations (instant, realtime, frequent, normal, rare, static)
## Feature Implementation Patterns
### Query Key Factories
Each feature maintains its own query keys:
- **Projects**: `archon-ui-main/src/features/projects/hooks/useProjectQueries.ts` (projectKeys)
- **Tasks**: `archon-ui-main/src/features/projects/tasks/hooks/useTaskQueries.ts` (taskKeys)
- **Knowledge**: `archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts` (knowledgeKeys)
- **Progress**: `archon-ui-main/src/features/progress/hooks/useProgressQueries.ts` (progressKeys)
- **MCP**: `archon-ui-main/src/features/mcp/hooks/useMcpQueries.ts` (mcpKeys)
- **Documents**: `archon-ui-main/src/features/projects/documents/hooks/useDocumentQueries.ts` (documentKeys)
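A query key factory is typically a small object of functions that build hierarchical keys. The following is a minimal sketch of the pattern; the members shown (`all`, `lists`, `detail`) are assumptions, and the actual `projectKeys` may expose different ones.

```typescript
// Hypothetical key factory; the real `projectKeys` may expose different members.
const projectKeys = {
  all: ["projects"] as const,
  lists: () => [...projectKeys.all, "list"] as const,
  detail: (id: string) => [...projectKeys.all, "detail", id] as const,
};
```

Because every key starts with the same root element, invalidating `projectKeys.all` would also match the list and detail keys under TanStack Query's prefix matching.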
### Data Fetching Hooks
Standard pattern across all features:
- `use[Feature]()` - List queries
- `use[Feature]Detail(id)` - Single item queries
- `useCreate[Feature]()` - Creation mutations
- `useUpdate[Feature]()` - Update mutations
- `useDelete[Feature]()` - Deletion mutations
## Backend Integration
### ETag Support
**Location**: `archon-ui-main/src/features/shared/apiWithEtag.ts`
ETag implementation:
- Browser handles ETag headers automatically
- 304 responses reduce bandwidth
- TanStack Query manages cache state
### API Structure
Backend endpoints follow RESTful patterns:
- **Knowledge**: `python/src/server/api_routes/knowledge_api.py`
- **Projects**: `python/src/server/api_routes/projects_api.py`
- **Progress**: `python/src/server/api_routes/progress_api.py`
- **MCP**: `python/src/server/api_routes/mcp_api.py`
## Optimistic Updates
**Utilities**: `archon-ui-main/src/features/shared/optimistic.ts`
All mutations use nanoid-based optimistic updates:
- Creates temporary entities with `_optimistic` flag
- Replaces with server data on success
- Rolls back on error
- Visual indicators for pending state
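The create-then-replace cycle can be sketched as follows. The function names `createOptimisticEntity` and `replaceOptimisticEntity` and the `_optimistic` and `_localId` fields appear in this changeset, but the bodies here are assumptions, and a counter-based ID generator stands in for nanoid so the sketch has no dependencies.

```typescript
interface OptimisticFields {
  _optimistic?: boolean;
  _localId?: string;
}

// Stand-in for nanoid (assumption): any collision-resistant ID generator works.
let counter = 0;
function localId(): string {
  counter += 1;
  return `local-${counter}-${Math.random().toString(36).slice(2, 8)}`;
}

// Tag client-only data so the UI can show a pending state.
function createOptimisticEntity<T extends object>(
  data: T,
): T & { _optimistic: true; _localId: string } {
  return { ...data, _optimistic: true as const, _localId: localId() };
}

// Swap the temporary entity for the server's version, matching on `_localId`.
function replaceOptimisticEntity<T extends OptimisticFields>(
  items: T[],
  id: string | undefined,
  serverEntity: T,
): T[] {
  return items.map((item) =>
    item._localId !== undefined && item._localId === id ? serverEntity : item,
  );
}

// Hypothetical entity type used only for this example.
interface Task extends OptimisticFields {
  id: string;
  title: string;
}

const pending = createOptimisticEntity<Task>({ id: "", title: "Draft" });
const cache: Task[] = [{ id: "1", title: "Existing" }, pending];
const settled = replaceOptimisticEntity(cache, pending._localId, { id: "2", title: "Draft" });
```

Matching on a separate `_localId` rather than on `id` means the replacement still works when the server assigns a different ID than the temporary one.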
## Refetch Strategies
### Smart Polling Usage
**Implementation**: `archon-ui-main/src/features/ui/hooks/useSmartPolling.ts`
Polling intervals are defined in each feature's query hooks. See actual implementations:
- **Projects**: `archon-ui-main/src/features/projects/hooks/useProjectQueries.ts`
- **Tasks**: `archon-ui-main/src/features/projects/tasks/hooks/useTaskQueries.ts`
- **Knowledge**: `archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts`
- **Progress**: `archon-ui-main/src/features/progress/hooks/useProgressQueries.ts`
- **MCP**: `archon-ui-main/src/features/mcp/hooks/useMcpQueries.ts`
Standard `STALE_TIMES` values from `archon-ui-main/src/features/shared/queryPatterns.ts`:
- `STALE_TIMES.instant`: 0ms (always fresh)
- `STALE_TIMES.frequent`: 5 seconds (frequently changing data)
- `STALE_TIMES.normal`: 30 seconds (standard cache)
### Manual Refetch
All queries expose `refetch()` for manual updates.
## Performance Optimizations
### Request Deduplication
Handled automatically by TanStack Query when the same query key is used.
### Stale Time Configuration
Defined in `STALE_TIMES` and used consistently:
- Auth/Settings: `Infinity` (never stale)
- Active operations: `0` (always fresh)
- Normal data: `30_000` (30 seconds)
- Rare updates: `300_000` (5 minutes)
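As a rough illustration, the durations above could be collected into a single constant. The key names come from the `STALE_TIMES` description earlier in this document; the mapping of `Infinity` to `static` is an inference, the `realtime` value is an assumed placeholder, and `isStale` is a hypothetical helper that models the freshness check in simplified form.

```typescript
// `realtime` is an assumed placeholder; the other values follow the
// durations documented above.
const STALE_TIMES = {
  instant: 0, // active operations: always fresh
  realtime: 3_000, // assumption: not stated in this document
  frequent: 5_000,
  normal: 30_000,
  rare: 300_000, // 5 minutes
  static: Infinity, // auth/settings: never stale
} as const;

// Simplified model: data is stale once its age reaches the stale time.
function isStale(updatedAtMs: number, staleTimeMs: number, nowMs: number): boolean {
  return nowMs - updatedAtMs >= staleTimeMs;
}
```

With this model, a stale time of `0` makes every read stale immediately, while `Infinity` never triggers a refetch on its own.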
### Garbage Collection
Unused data is removed after 10 minutes (configurable in queryClient).
## Migration from Polling
### What Changed (Phases 1-5)
1. **Phase 1**: Removed manual frontend ETag cache layer (backend ETags remain; browser-managed)
2. **Phase 2**: Standardized query keys
3. **Phase 3**: Fixed optimistic updates with nanoid-generated IDs
4. **Phase 4**: Configured request deduplication
5. **Phase 5**: Removed manual invalidations
### Deprecated Patterns
- `usePolling` hook (removed)
- `useCrawlProgressPolling` (removed)
- Manual cache invalidation with setTimeout
- Socket.IO connections
- Double-layer caching
## Testing Patterns
### Hook Testing
**Example**: `archon-ui-main/src/features/projects/hooks/tests/useProjectQueries.test.ts`
Standard mocking approach for:
- Service methods
- Query patterns (STALE_TIMES, DISABLED_QUERY_KEY)
- Smart polling behavior
### Integration Testing
Use React Testing Library with QueryClientProvider wrapper.
## Developer Guidelines
### Adding New Data Fetching
1. Create query key factory in `{feature}/hooks/use{Feature}Queries.ts`
2. Use `useQuery` with appropriate stale time from `STALE_TIMES`
3. Add smart polling if real-time updates are needed
4. Implement optimistic updates for mutations
5. Follow existing patterns in similar features
### Common Patterns to Follow
- Always use query key factories
- Never hardcode stale times
- Use `DISABLED_QUERY_KEY` for conditional queries
- Implement optimistic updates for better UX
- Add loading and error states
## Future Considerations
- Server-Sent Events for true real-time (post-Phase 5)
- WebSocket fallback for critical updates
- GraphQL migration for selective field updates


@@ -1,39 +1,149 @@
 # ETag Implementation
-## Current Implementation
-Our ETag implementation provides efficient HTTP caching for polling endpoints to reduce bandwidth usage.
-### What It Does
-- **Generates ETags**: Creates MD5 hashes of JSON response data
-- **Checks ETags**: Simple string equality comparison between client's `If-None-Match` header and current data's ETag
-- **Returns 304**: When ETags match, returns `304 Not Modified` with no body (saves bandwidth)
-### How It Works
-1. Server generates ETag from response data using MD5 hash
-2. Client sends previous ETag in `If-None-Match` header
-3. Server compares ETags:
-- **Match**: Returns 304 (no body)
-- **No match**: Returns 200 with new data and new ETag
-### Example
-```python
-# Server generates: ETag: "a3c2f1e4b5d6789"
-# Client sends: If-None-Match: "a3c2f1e4b5d6789"
-# Server returns: 304 Not Modified (no body)
-```
-## Limitations
-Our implementation is simplified and doesn't support full RFC 7232 features:
-- ❌ Wildcard (`*`) matching
-- ❌ Multiple ETags (`"etag1", "etag2"`)
-- ❌ Weak validators (`W/"etag"`)
-- ✅ Single ETag comparison only
-This works perfectly for our browser-to-API polling use case but may need enhancement for CDN/proxy support.
-## Files
-- Implementation: `python/src/server/utils/etag_utils.py`
-- Tests: `python/tests/server/utils/test_etag_utils.py`
-- Used in: Progress API, Projects API polling endpoints
+## Overview
+Archon implements HTTP ETag caching to optimize bandwidth usage by reducing redundant data transfers. The implementation leverages browser-native HTTP caching combined with backend ETag generation for efficient cache validation.
+## How It Works
+### Backend ETag Generation
+**Location**: `python/src/server/utils/etag_utils.py`
+The backend generates ETags for API responses:
+- Creates MD5 hash of JSON-serialized response data
+- Returns quoted ETag string (RFC 7232 format)
+- Sets `Cache-Control: no-cache, must-revalidate` headers
+- Compares client's `If-None-Match` header with current data's ETag
+- Returns `304 Not Modified` when ETags match
+### Frontend Handling
+**Location**: `archon-ui-main/src/features/shared/apiWithEtag.ts`
+The frontend relies on browser-native HTTP caching:
+- Browser automatically sends `If-None-Match` headers with cached ETags
+- Browser handles 304 responses by returning cached data from HTTP cache
+- No manual ETag tracking or cache management needed
+- TanStack Query manages data freshness through `staleTime` configuration
+#### Browser vs Non-Browser Behavior
+- **Standard Browsers**: Per the Fetch spec, a 304 response freshens the HTTP cache and returns the cached body to JavaScript
+- **Non-Browser Runtimes** (React Native, custom fetch): May surface 304 with empty body to JavaScript
+- **Client Fallback**: The `apiWithEtag.ts` implementation handles both scenarios, ensuring consistent behavior across environments
+## Implementation Details
+### Backend API Integration
ETags are used in these API routes:
- **Projects**: `python/src/server/api_routes/projects_api.py`
- Project lists
- Task lists
- Task counts
- **Progress**: `python/src/server/api_routes/progress_api.py`
- Active operations tracking
### ETag Generation Process
1. **Data Serialization**: Response data is JSON-serialized with sorted keys for consistency
2. **Hash Creation**: MD5 hash generated from JSON string
3. **Format**: Returns quoted string per RFC 7232 (e.g., `"a3c2f1e4b5d6789"`)
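The three steps above can be sketched in TypeScript for illustration. The backend is written in Python, so this is not the code in `etag_utils.py`; `sortKeys` and `generateEtag` are hypothetical names, and because JSON serialization details differ between languages, the digests produced here would not necessarily equal the backend's.

```typescript
import { createHash } from "node:crypto";

// Recursively sort object keys so equal data always serializes identically.
function sortKeys(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(sortKeys);
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(record).sort()) {
      sorted[key] = sortKeys(record[key]);
    }
    return sorted;
  }
  return value;
}

// Serialize with sorted keys, hash with MD5, and wrap the hex digest in quotes.
function generateEtag(data: unknown): string {
  const json = JSON.stringify(sortKeys(data));
  const digest = createHash("md5").update(json).digest("hex");
  return `"${digest}"`;
}
```

Sorting keys before hashing is what makes the ETag depend on the data rather than on the order in which fields happened to be assembled.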
### Cache Validation Flow
1. **Initial Request**: Server generates ETag and sends with response
2. **Subsequent Requests**: Browser sends `If-None-Match` header with cached ETag
3. **Server Validation**:
- ETags match → Returns `304 Not Modified` (no body)
- ETags differ → Returns `200 OK` with new data and new ETag
4. **Browser Behavior**: On 304, browser serves cached response to JavaScript
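The server-side decision in step 3 amounts to a single comparison. The sketch below models it as a pure function under the assumption of exact string equality between the two ETag values; `validate` and `CachedResponse` are hypothetical names rather than part of the backend.

```typescript
interface CachedResponse {
  status: 200 | 304;
  etag: string;
  body?: unknown;
}

// Compare the client's If-None-Match value with the ETag of the current data.
function validate(
  ifNoneMatch: string | undefined,
  currentEtag: string,
  data: unknown,
): CachedResponse {
  if (ifNoneMatch === currentEtag) {
    return { status: 304, etag: currentEtag }; // unchanged: no body is sent
  }
  return { status: 200, etag: currentEtag, body: data }; // changed or first request
}
```

A first request carries no `If-None-Match` header, so it always falls through to the `200` branch and receives the initial ETag.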
## Key Design Decisions
### Browser-Native Caching
The implementation leverages browser HTTP caching instead of manual cache management:
- Reduces code complexity
- Eliminates cache synchronization issues
- Works seamlessly with TanStack Query
- Maintains bandwidth optimization
### No Manual ETag Tracking
Unlike previous implementations, the current approach:
- Does NOT maintain ETag maps in JavaScript
- Does NOT manually handle 304 responses
- Lets browser and TanStack Query handle caching layers
## Integration with TanStack Query
### Cache Coordination
- **Browser Cache**: Handles HTTP-level caching (ETags/304s)
- **TanStack Query Cache**: Manages application-level data freshness
- **Separation of Concerns**: HTTP caching for bandwidth, TanStack for state
### Configuration
Cache behavior is controlled through TanStack Query's `staleTime`:
- See `archon-ui-main/src/features/shared/queryPatterns.ts` for standard times
- See `archon-ui-main/src/features/shared/queryClient.ts` for global configuration
## Performance Benefits
### Bandwidth Reduction
- ~70% reduction in data transfer for unchanged responses (based on internal measurements)
- Especially effective for polling patterns
- Significant improvement for mobile/slow connections
### Server Load
- Reduced JSON serialization for 304 responses
- Lower network I/O
- Faster response times for cached data
## Files and References
### Core Implementation
- **Backend Utilities**: `python/src/server/utils/etag_utils.py`
- **Frontend Client**: `archon-ui-main/src/features/shared/apiWithEtag.ts`
- **Tests**: `python/tests/server/utils/test_etag_utils.py`
### Usage Examples
- **Projects API**: `python/src/server/api_routes/projects_api.py` (lines with `generate_etag`, `check_etag`)
- **Progress API**: `python/src/server/api_routes/progress_api.py` (active operations tracking)
## Testing
### Backend Testing
Tests in `python/tests/server/utils/test_etag_utils.py` verify:
- Correct ETag generation format
- Consistent hashing for same data
- Different hashes for different data
- Proper quote formatting
### Frontend Testing
Browser DevTools verification:
1. Network tab shows `If-None-Match` headers on requests
2. 304 responses have no body
3. Response served from cache on 304
4. New ETag values when data changes
## Monitoring
### How to Verify ETags are Working
1. Open Chrome DevTools → Network tab
2. Make a request to a supported endpoint
3. Note the `ETag` response header
4. Refresh or re-request the same data
5. Observe:
- Request includes `If-None-Match` header
- Server returns `304 Not Modified` if unchanged
- Response body is empty on 304
- Browser serves cached data
### Metrics to Track
- Ratio of 304 vs 200 responses
- Bandwidth saved through 304 responses
- Cache hit rate in production
## Future Considerations
- Consider implementing strong vs weak ETags for more granular control
- Evaluate adding ETag support to more endpoints
- Monitor cache effectiveness in production
- Consider Last-Modified headers as supplementary validation


@@ -1,194 +0,0 @@
# Polling Architecture Documentation
## Overview
Archon V2 uses HTTP polling instead of WebSockets for real-time updates. This simplifies the architecture, reduces complexity, and improves maintainability while providing adequate responsiveness for project management tasks.
## Core Components
### 1. usePolling Hook (`archon-ui-main/src/hooks/usePolling.ts`)
Generic polling hook that manages periodic data fetching with smart optimizations.
**Key Features:**
- Configurable polling intervals (default: 3 seconds)
- Automatic pause during browser tab inactivity
- ETag-based caching to reduce bandwidth
- Manual refresh capability
**Usage:**
```typescript
const { data, isLoading, error, refetch } = usePolling('/api/projects', {
  interval: 5000,
  enabled: true,
  onSuccess: (data) => console.log('Projects updated:', data)
});
```
### 2. Specialized Progress Services
Individual services handle specific progress tracking needs:
**CrawlProgressService (`archon-ui-main/src/services/crawlProgressService.ts`)**
- Tracks website crawling operations
- Maps backend status to UI-friendly format
- Includes in-flight request guard to prevent overlapping fetches
- 1-second polling interval during active crawls
**Polling Endpoints:**
- `/api/projects` - Project list updates
- `/api/projects/{project_id}/tasks` - Task list for active project
- `/api/crawl-progress/{progress_id}` - Website crawling progress
- `/api/agent-chat/sessions/{session_id}/messages` - Chat messages
## Backend Support
### ETag Implementation (`python/src/server/utils/etag_utils.py`)
Server-side optimization to reduce unnecessary data transfer.
**How it works:**
1. Server generates ETag hash from response data
2. Client sends `If-None-Match` header with cached ETag
3. Server returns 304 Not Modified if data unchanged
4. Client uses cached data, reducing bandwidth by ~70%
### Progress API (`python/src/server/api_routes/progress_api.py`)
Dedicated endpoints for progress tracking:
- `GET /api/crawl-progress/{progress_id}` - Returns crawling status with ETag support
- Includes completion percentage, current step, and error details
## State Management
### Loading States
Visual feedback during operations:
- `movingTaskIds: Set<string>` - Tracks tasks being moved
- `isSwitchingProject: boolean` - Project transition state
- Loading overlays prevent concurrent operations
## Error Handling
### Retry Strategy
```typescript
retryCount: 3
retryDelay: attempt => Math.min(1000 * 2 ** attempt, 30000)
```
- Exponential backoff: 1s, 2s, 4s...
- Maximum retry delay: 30 seconds
- Automatic recovery after network issues
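To make the backoff schedule concrete, the sketch below evaluates the same `retryDelay` formula for the first six attempts. It assumes attempts are numbered from 0; the `delays` array is only for illustration.

```typescript
// Same formula as the retryDelay option above: exponential growth capped at 30 s.
function retryDelay(attempt: number): number {
  return Math.min(1000 * 2 ** attempt, 30000);
}

const delays = [0, 1, 2, 3, 4, 5].map(retryDelay);
// delays is [1000, 2000, 4000, 8000, 16000, 30000]
```

The cap takes effect at the sixth attempt, where the uncapped value would be 32 seconds.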
### User Feedback
- Toast notifications for errors
- Loading spinners during operations
- Clear error messages with recovery actions
## Performance Optimizations
### 1. Request Deduplication
Prevents multiple components from making identical requests:
```typescript
const cacheKey = `${endpoint}-${JSON.stringify(params)}`;
if (pendingRequests.has(cacheKey)) {
  return pendingRequests.get(cacheKey);
}
```
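The fragment above omits how entries enter and leave `pendingRequests`. A fuller sketch, assuming a module-level `Map` and a hypothetical `dedupedRequest` wrapper, might look like this:

```typescript
// In-flight requests keyed by endpoint + params (hypothetical helper names).
const pendingRequests = new Map<string, Promise<unknown>>();

function dedupedRequest<T>(
  endpoint: string,
  params: Record<string, unknown>,
  fetcher: () => Promise<T>,
): Promise<T> {
  const cacheKey = `${endpoint}-${JSON.stringify(params)}`;
  const inFlight = pendingRequests.get(cacheKey);
  if (inFlight) return inFlight as Promise<T>; // reuse the request already running

  // Remove the entry once the request settles so later calls fetch fresh data.
  const request = fetcher().then(
    (value) => {
      pendingRequests.delete(cacheKey);
      return value;
    },
    (error) => {
      pendingRequests.delete(cacheKey);
      throw error;
    },
  );
  pendingRequests.set(cacheKey, request);
  return request;
}
```

Two components calling `dedupedRequest` with the same endpoint and params while a request is in flight receive the same promise, so only one network call is made.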
### 2. Smart Polling Intervals
- Active operations: 1-2 second intervals
- Background data: 5-10 second intervals
- Paused when tab inactive (visibility API)
### 3. Selective Updates
Only polls active/relevant data:
- Tasks poll only for selected project
- Progress polls only during active operations
- Chat polls only for open sessions
## Architecture Benefits
### What We Have
- **Simple HTTP polling** - Standard request/response pattern
- **Automatic error recovery** - Built-in retry with exponential backoff
- **ETag caching** - 70% bandwidth reduction via 304 responses
- **Easy debugging** - Standard HTTP requests visible in DevTools
- **No connection limits** - Scales with standard HTTP infrastructure
- **Consolidated polling hooks** - Single pattern for all data fetching
### Trade-offs
- **Latency:** 1-5 second delay vs instant updates
- **Bandwidth:** More requests, but mitigated by ETags
- **Battery:** Slightly higher mobile battery usage
## Developer Guidelines
### Adding New Polling Endpoint
1. **Frontend - Use the usePolling hook:**
```typescript
// In your component or custom hook
const { data, isLoading, error, refetch } = usePolling('/api/new-endpoint', {
  interval: 5000,
  enabled: true,
  staleTime: 2000
});
```
2. **Backend - Add ETag support:**
```python
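from fastapi import APIRouter, Request, Response
from fastapi.responses import JSONResponse

from ..utils.etag_utils import generate_etag, check_etag

router = APIRouter()


@router.get("/api/new-endpoint")
async def get_data(request: Request):
    data = fetch_data()  # placeholder for the endpoint's data source
    etag = generate_etag(data)

    # Return 304 with no body when the client's cached ETag still matches
    if check_etag(request, etag):
        return Response(status_code=304)

    return JSONResponse(
        content=data,
        headers={"ETag": etag},
    )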
```
3. **For progress tracking, use useCrawlProgressPolling:**
```typescript
const { data, isLoading } = useCrawlProgressPolling(operationId, {
  onSuccess: (data) => {
    if (data.status === 'completed') {
      // Handle completion
    }
  }
});
```
### Best Practices
1. **Always provide loading states** - Users should know when data is updating
2. **Handle errors gracefully** - Show toast notifications with clear messages
3. **Respect polling intervals** - Don't poll faster than necessary
4. **Clean up on unmount** - Cancel pending requests when components unmount
5. **Use ETag caching** - Reduce bandwidth with 304 responses
## Testing Polling Behavior
### Manual Testing
1. Open Network tab in DevTools
2. Look for requests with 304 status (cache hits)
3. Verify polling stops when switching tabs
4. Test error recovery by stopping backend
### Debugging Tips
- Check `localStorage` for cached ETags
- Monitor `console.log` for polling lifecycle events
- Use React DevTools to inspect hook states
- Watch for memory leaks in long-running sessions
## Future Improvements
### Planned Enhancements
- WebSocket fallback for critical updates
- Configurable per-user polling rates
- Smart polling based on user activity patterns
- GraphQL subscriptions for selective field updates
### Considered Alternatives
- Server-Sent Events (SSE) - One-way real-time updates
- Long polling - Reduced request frequency
- WebRTC data channels - P2P updates between clients


@@ -106,6 +106,8 @@ export function useFeatureDetail(id: string | undefined) {
 ## Mutations with Optimistic Updates
 ```typescript
+import { createOptimisticEntity, replaceOptimisticEntity } from "@/features/shared/optimistic";
 export function useCreateFeature() {
   const queryClient = useQueryClient();
@@ -119,13 +121,13 @@ export function useCreateFeature() {
       // Snapshot for rollback
       const previous = queryClient.getQueryData(featureKeys.lists());
-      // Optimistic update (use timestamp IDs for now - Phase 3 will use UUIDs)
-      const tempId = `temp-${Date.now()}`;
+      // Optimistic update with nanoid for stable IDs
+      const optimisticEntity = createOptimisticEntity(newData);
       queryClient.setQueryData(featureKeys.lists(), (old: Feature[] = []) =>
-        [...old, { ...newData, id: tempId }]
+        [...old, optimisticEntity]
       );
-      return { previous, tempId };
+      return { previous, localId: optimisticEntity._localId };
     },
     onError: (err, variables, context) => {
@@ -138,7 +140,7 @@ export function useCreateFeature() {
     onSuccess: (data, variables, context) => {
       // Replace optimistic with real data
       queryClient.setQueryData(featureKeys.lists(), (old: Feature[] = []) =>
-        old.map(item => item.id === context?.tempId ? data : item)
+        replaceOptimisticEntity(old, context?.localId, data)
       );
     },
   });
@@ -176,7 +178,7 @@ vi.mock("../../../shared/queryPatterns", () => ({
 Each feature is self-contained:
-```
+```text
 src/features/projects/
 ├── components/ # UI components
 ├── hooks/
@@ -189,7 +191,7 @@ src/features/projects/
 Sub-features (like tasks under projects) follow the same structure:
-```
+```text
 src/features/projects/tasks/
 ├── components/
 ├── hooks/
@@ -220,8 +222,16 @@ When refactoring to these patterns:
 4. **Don't skip mocking in tests** - Mock both services and patterns
 5. **Don't use inconsistent patterns** - Follow the established conventions
-## Future Improvements (Phase 3+)
+## Completed Improvements (Phases 1-5)
+- ✅ Phase 1: Removed manual frontend ETag cache layer (backend ETags remain; browser-managed)
+- ✅ Phase 2: Standardized query keys with factories
+- ✅ Phase 3: Implemented UUID-based optimistic updates using nanoid
+- ✅ Phase 4: Configured request deduplication
+- ✅ Phase 5: Removed manual cache invalidations
+## Future Considerations
-- Replace timestamp IDs (`temp-${Date.now()}`) with UUIDs
 - Add Server-Sent Events for real-time updates
-- Consider Zustand for complex client state
+- Consider WebSocket fallback for critical updates
+- Evaluate Zustand for complex client state management