- Updated docker-compose.yml to include PLAYWRIGHT_BROWSERS_PATH and MCP_PUBLIC_URL environment variables. - Modified k8s-manifests-complete.yaml to add Playwright and MCP configurations in the ConfigMap and deployment spec. - Adjusted resource limits in k8s manifests for improved performance during crawling. - Updated Dockerfiles to install Playwright browsers in accessible locations for appuser. - Added HTTP health check endpoint in mcp_server.py for better monitoring. - Enhanced MCP API to utilize MCP_PUBLIC_URL for generating client configuration. - Created MCP_PUBLIC_URL_GUIDE.md for detailed configuration instructions. - Documented changes and recommendations in K8S_COMPLETE_ADJUSTMENTS.md.
827 lines
20 KiB
Markdown
827 lines
20 KiB
Markdown
# Kubernetes Complete Adjustments Guide
|
|
|
|
## Executive Summary
|
|
|
|
Este documento descreve **todas as mudanças necessárias** para executar o Archon em produção no Kubernetes, não apenas o Playwright. As mudanças cobrem:
|
|
|
|
- ✅ Playwright browser binaries (JÁ CORRIGIDO)
|
|
- ⚠️ Variáveis de ambiente em K8s manifests
|
|
- ⚠️ Resource limits para crawling
|
|
- ⚠️ Nginx permissions e configuration
|
|
- ⚠️ Security contexts avançados
|
|
- ⚠️ Health checks otimizados
|
|
- ⚠️ Init containers para warm-up
|
|
|
|
---
|
|
|
|
## 1. Playwright Browser Binaries (✅ JÁ CORRIGIDO)
|
|
|
|
### Problema Identificado
|
|
Playwright instalava binários em `/root/.cache/ms-playwright` (root), mas container roda como `appuser` (UID 1001) e não tinha acesso.
|
|
|
|
### Solução Aplicada
|
|
|
|
**Dockerfile.k8s.server:**
|
|
```dockerfile
|
|
# Install Playwright browsers in a location accessible to appuser
|
|
ENV PATH=/venv/bin:$PATH
|
|
ENV PLAYWRIGHT_BROWSERS_PATH=/app/ms-playwright
|
|
RUN mkdir -p /app/ms-playwright && \
|
|
playwright install chromium && \
|
|
chown -R appuser:appuser /app/ms-playwright
|
|
|
|
# Runtime environment
|
|
ENV PLAYWRIGHT_BROWSERS_PATH=/app/ms-playwright
|
|
```
|
|
|
|
**Dockerfile.server (Docker Compose):**
|
|
```dockerfile
|
|
ENV PLAYWRIGHT_BROWSERS_PATH=/tmp/ms-playwright
|
|
RUN mkdir -p /tmp/ms-playwright && \
|
|
playwright install chromium && \
|
|
chmod -R 777 /tmp/ms-playwright
|
|
|
|
ENV PLAYWRIGHT_BROWSERS_PATH=/tmp/ms-playwright
|
|
```
|
|
|
|
### ⚠️ AÇÃO NECESSÁRIA: Adicionar em K8s Manifests
|
|
|
|
**Adicionar em `k8s-manifests-complete.yaml` - archon-server deployment:**
|
|
|
|
```yaml
|
|
spec:
|
|
template:
|
|
spec:
|
|
containers:
|
|
- name: server
|
|
env:
|
|
# ... outras variáveis ...
|
|
|
|
# ADICIONAR ESTA LINHA:
|
|
- name: PLAYWRIGHT_BROWSERS_PATH
|
|
value: "/app/ms-playwright"
|
|
```
|
|
|
|
---
|
|
|
|
## 2. Resource Limits para Crawling com Chromium
|
|
|
|
### Problema
|
|
Chromium consome significativa memória e CPU durante crawling. Os limites atuais podem ser insuficientes:
|
|
|
|
**Atual:**
|
|
```yaml
|
|
resources:
|
|
requests:
|
|
memory: "512Mi"
|
|
cpu: "500m"
|
|
limits:
|
|
memory: "1Gi"
|
|
cpu: "1000m"
|
|
```
|
|
|
|
### Solução Recomendada
|
|
|
|
**Atualizar em `k8s-manifests-complete.yaml` - archon-server:**
|
|
|
|
```yaml
|
|
resources:
|
|
requests:
|
|
memory: "768Mi" # Aumentado de 512Mi
|
|
cpu: "500m"
|
|
limits:
|
|
memory: "2Gi" # Aumentado de 1Gi (Chromium pode usar 1.5Gi em picos)
|
|
cpu: "2000m" # Aumentado de 1000m (crawling paralelo)
|
|
|
|
# ADICIONAR: Limitar uso de ephemeral storage
|
|
ephemeral-storage: "5Gi"
|
|
```
|
|
|
|
### Justificativa
|
|
- Chromium headless consome ~300-600MB por instância
|
|
- Crawling paralelo pode executar múltiplas instâncias
|
|
- Processamento de documentos grandes precisa de memória
|
|
- Margem de segurança para evitar OOMKilled
|
|
|
|
---
|
|
|
|
## 3. Nginx Configuration e Permissions
|
|
|
|
### Status Atual
|
|
✅ Nginx já configurado para rodar como non-root (user `nginx`, UID 101)
|
|
|
|
**Dockerfile.k8s.production:**
|
|
```dockerfile
|
|
RUN chown -R nginx:nginx /usr/share/nginx/html /var/cache/nginx /var/log/nginx /etc/nginx/conf.d && \
|
|
touch /var/run/nginx.pid && \
|
|
chown -R nginx:nginx /var/run/nginx.pid
|
|
|
|
USER nginx
|
|
```
|
|
|
|
### ⚠️ Melhorias Recomendadas
|
|
|
|
**Adicionar em `k8s-manifests-complete.yaml` - archon-frontend:**
|
|
|
|
```yaml
|
|
spec:
|
|
template:
|
|
spec:
|
|
securityContext:
|
|
runAsNonRoot: true
|
|
runAsUser: 101 # nginx user
|
|
runAsGroup: 101
|
|
fsGroup: 101
|
|
# ADICIONAR:
|
|
seccompProfile:
|
|
type: RuntimeDefault
|
|
|
|
containers:
|
|
- name: frontend
|
|
securityContext:
|
|
allowPrivilegeEscalation: false
|
|
capabilities:
|
|
drop:
|
|
- ALL
|
|
# Nginx não precisa de capabilities especiais na porta 3737
|
|
readOnlyRootFilesystem: true # MUDAR para true
|
|
|
|
# ADICIONAR volumes para diretórios que nginx precisa escrever:
|
|
volumeMounts:
|
|
- name: nginx-cache
|
|
mountPath: /var/cache/nginx
|
|
- name: nginx-run
|
|
mountPath: /var/run
|
|
- name: nginx-logs
|
|
mountPath: /var/log/nginx
|
|
|
|
volumes:
|
|
- name: nginx-cache
|
|
emptyDir: {}
|
|
- name: nginx-run
|
|
emptyDir: {}
|
|
- name: nginx-logs
|
|
emptyDir: {}
|
|
```
|
|
|
|
---
|
|
|
|
## 4. Advanced Security Contexts
|
|
|
|
### Problema
|
|
Security contexts estão básicos. Podem ser fortalecidos para melhor segurança.
|
|
|
|
### Solução: Pod Security Standards
|
|
|
|
**Adicionar em TODOS os deployments:**
|
|
|
|
```yaml
|
|
spec:
|
|
template:
|
|
metadata:
|
|
labels:
|
|
app: archon-server # ou mcp, frontend, etc
|
|
# ADICIONAR:
|
|
pod-security.kubernetes.io/enforce: baseline
|
|
pod-security.kubernetes.io/audit: restricted
|
|
pod-security.kubernetes.io/warn: restricted
|
|
|
|
spec:
|
|
# Security context do pod
|
|
securityContext:
|
|
runAsNonRoot: true
|
|
runAsUser: 1001
|
|
runAsGroup: 1001
|
|
fsGroup: 1001
|
|
# ADICIONAR:
|
|
seccompProfile:
|
|
type: RuntimeDefault
|
|
supplementalGroups: []
|
|
|
|
# Security context do container
|
|
containers:
|
|
- name: server
|
|
securityContext:
|
|
allowPrivilegeEscalation: false
|
|
runAsNonRoot: true
|
|
runAsUser: 1001
|
|
capabilities:
|
|
drop:
|
|
- ALL
|
|
# ADICIONAR (se possível - testar primeiro):
|
|
readOnlyRootFilesystem: false # true após configurar volumes
|
|
seccompProfile:
|
|
type: RuntimeDefault
|
|
```
|
|
|
|
### Arquivos que Precisam Escrever
|
|
|
|
**archon-server:**
|
|
- `/app/ms-playwright` - Playwright browser cache (já configurado com ownership correto)
|
|
- `/tmp` - Temporary files (já acessível para appuser)
|
|
- Nenhum volume persistente necessário (tudo vai para Supabase)
|
|
|
|
**archon-mcp e archon-agents:**
|
|
- Nenhum arquivo local necessário
|
|
- Podem usar `readOnlyRootFilesystem: true`
|
|
|
|
---
|
|
|
|
## 5. Health Checks Otimizados
|
|
|
|
### Problema Atual
|
|
Health checks podem ser muito agressivos durante operações pesadas (crawling).
|
|
|
|
### Solução
|
|
|
|
**Atualizar em `k8s-manifests-complete.yaml` - archon-server:**
|
|
|
|
```yaml
|
|
livenessProbe:
|
|
httpGet:
|
|
path: /health
|
|
port: 8181
|
|
initialDelaySeconds: 60 # Aumentado de 40 (tempo para Playwright inicializar)
|
|
periodSeconds: 30 # OK
|
|
timeoutSeconds: 15 # Aumentado de 10 (crawling pode deixar servidor lento)
|
|
failureThreshold: 5 # Aumentado de 3 (mais tolerante)
|
|
successThreshold: 1
|
|
|
|
readinessProbe:
|
|
httpGet:
|
|
path: /health
|
|
port: 8181
|
|
initialDelaySeconds: 15 # Aumentado de 10
|
|
periodSeconds: 10
|
|
timeoutSeconds: 5
|
|
failureThreshold: 3
|
|
successThreshold: 1
|
|
|
|
# ADICIONAR startup probe para não matar pod durante startup lento:
|
|
startupProbe:
|
|
httpGet:
|
|
path: /health
|
|
port: 8181
|
|
initialDelaySeconds: 10
|
|
periodSeconds: 10
|
|
timeoutSeconds: 5
|
|
failureThreshold: 12 # 12 x 10s = 2 minutos para startup
|
|
successThreshold: 1
|
|
```
|
|
|
|
---
|
|
|
|
## 6. Init Container para Playwright Warm-up (Opcional mas Recomendado)
|
|
|
|
### Problema
|
|
Primeira requisição de crawling é lenta porque Playwright precisa inicializar.
|
|
|
|
### Solução
|
|
|
|
**Adicionar em `k8s-manifests-complete.yaml` - archon-server:**
|
|
|
|
```yaml
|
|
spec:
|
|
template:
|
|
spec:
|
|
# ADICIONAR antes de containers:
|
|
initContainers:
|
|
- name: playwright-warmup
|
|
image: git.automatizase.com.br/luis.erlacher/archon/server:k8s-latest
|
|
imagePullPolicy: Always
|
|
command:
|
|
- sh
|
|
- -c
|
|
- |
|
|
echo "Verificando instalação do Playwright..."
|
|
python -c "from playwright.sync_api import sync_playwright; print('Playwright OK')" || exit 1
|
|
echo "Playwright inicializado com sucesso"
|
|
env:
|
|
- name: PLAYWRIGHT_BROWSERS_PATH
|
|
value: "/app/ms-playwright"
|
|
resources:
|
|
requests:
|
|
memory: "256Mi"
|
|
cpu: "200m"
|
|
limits:
|
|
memory: "512Mi"
|
|
cpu: "500m"
|
|
securityContext:
|
|
runAsNonRoot: true
|
|
runAsUser: 1001
|
|
allowPrivilegeEscalation: false
|
|
capabilities:
|
|
drop:
|
|
- ALL
|
|
readOnlyRootFilesystem: false
|
|
|
|
containers:
|
|
- name: server
|
|
# ... resto da configuração ...
|
|
```
|
|
|
|
---
|
|
|
|
## 7. ConfigMap Updates
|
|
|
|
### Adicionar Playwright e outras configurações
|
|
|
|
**Atualizar em `k8s-manifests-complete.yaml` - ConfigMap:**
|
|
|
|
```yaml
|
|
apiVersion: v1
|
|
kind: ConfigMap
|
|
metadata:
|
|
name: archon-config
|
|
namespace: archon
|
|
data:
|
|
# Existing configs...
|
|
SERVICE_DISCOVERY_MODE: "kubernetes"
|
|
LOG_LEVEL: "INFO"
|
|
ARCHON_SERVER_PORT: "8181"
|
|
ARCHON_MCP_PORT: "8051"
|
|
ARCHON_UI_PORT: "3737"
|
|
ARCHON_HOST: "localhost"
|
|
TRANSPORT: "sse"
|
|
AGENTS_ENABLED: "false"
|
|
|
|
# ADICIONAR:
|
|
PLAYWRIGHT_BROWSERS_PATH: "/app/ms-playwright"
|
|
|
|
# MCP Public URL - IMPORTANTE: Configure com seu domínio!
|
|
# Format: "domain.com:8051" or "localhost:8051"
|
|
# Examples:
|
|
# - Development: localhost:8051
|
|
# - Production: archon.automatizase.com.br:8051
|
|
# - Custom: mcp.mycompany.com:8051
|
|
# This is used to generate MCP client configuration JSON
|
|
MCP_PUBLIC_URL: "archon.automatizase.com.br:8051" # ← CHANGE THIS!
|
|
|
|
# Chromium optimization flags (já configurados no código, mas podem ser sobrescritos):
|
|
CHROMIUM_DISABLE_DEV_SHM: "true"
|
|
CHROMIUM_HEADLESS: "true"
|
|
```
|
|
|
|
---
|
|
|
|
## 8. Network Policies (Segurança Adicional)
|
|
|
|
### Criar Network Policy para isolar pods
|
|
|
|
**Criar arquivo `k8s-network-policies.yaml`:**
|
|
|
|
```yaml
|
|
---
|
|
# Network Policy - Archon Server
|
|
apiVersion: networking.k8s.io/v1
|
|
kind: NetworkPolicy
|
|
metadata:
|
|
name: archon-server-netpol
|
|
namespace: archon
|
|
spec:
|
|
podSelector:
|
|
matchLabels:
|
|
app: archon-server
|
|
policyTypes:
|
|
- Ingress
|
|
- Egress
|
|
ingress:
|
|
# Permite tráfego do frontend
|
|
- from:
|
|
- podSelector:
|
|
matchLabels:
|
|
app: archon-frontend
|
|
ports:
|
|
- protocol: TCP
|
|
port: 8181
|
|
# Permite tráfego do MCP
|
|
- from:
|
|
- podSelector:
|
|
matchLabels:
|
|
app: archon-mcp
|
|
ports:
|
|
- protocol: TCP
|
|
port: 8181
|
|
egress:
|
|
# Permite DNS
|
|
- to:
|
|
- namespaceSelector:
|
|
matchLabels:
|
|
name: kube-system
|
|
ports:
|
|
- protocol: UDP
|
|
port: 53
|
|
# Permite Supabase (internet)
|
|
- to:
|
|
- namespaceSelector: {}
|
|
ports:
|
|
- protocol: TCP
|
|
port: 443
|
|
# Permite comunicação com MCP
|
|
- to:
|
|
- podSelector:
|
|
matchLabels:
|
|
app: archon-mcp
|
|
ports:
|
|
- protocol: TCP
|
|
port: 8051
|
|
|
|
---
|
|
# Network Policy - Archon MCP
|
|
apiVersion: networking.k8s.io/v1
|
|
kind: NetworkPolicy
|
|
metadata:
|
|
name: archon-mcp-netpol
|
|
namespace: archon
|
|
spec:
|
|
podSelector:
|
|
matchLabels:
|
|
app: archon-mcp
|
|
policyTypes:
|
|
- Ingress
|
|
- Egress
|
|
ingress:
|
|
# Permite tráfego do server
|
|
- from:
|
|
- podSelector:
|
|
matchLabels:
|
|
app: archon-server
|
|
ports:
|
|
- protocol: TCP
|
|
port: 8051
|
|
egress:
|
|
# Permite DNS
|
|
- to:
|
|
- namespaceSelector:
|
|
matchLabels:
|
|
name: kube-system
|
|
ports:
|
|
- protocol: UDP
|
|
port: 53
|
|
# Permite Supabase
|
|
- to:
|
|
- namespaceSelector: {}
|
|
ports:
|
|
- protocol: TCP
|
|
port: 443
|
|
# Permite comunicação com server
|
|
- to:
|
|
- podSelector:
|
|
matchLabels:
|
|
app: archon-server
|
|
ports:
|
|
- protocol: TCP
|
|
port: 8181
|
|
|
|
---
|
|
# Network Policy - Archon Frontend
|
|
apiVersion: networking.k8s.io/v1
|
|
kind: NetworkPolicy
|
|
metadata:
|
|
name: archon-frontend-netpol
|
|
namespace: archon
|
|
spec:
|
|
podSelector:
|
|
matchLabels:
|
|
app: archon-frontend
|
|
policyTypes:
|
|
- Ingress
|
|
- Egress
|
|
ingress:
|
|
# Permite tráfego de qualquer lugar (public-facing)
|
|
- {}
|
|
egress:
|
|
# Permite DNS
|
|
- to:
|
|
- namespaceSelector:
|
|
matchLabels:
|
|
name: kube-system
|
|
ports:
|
|
- protocol: UDP
|
|
port: 53
|
|
# Permite comunicação com server (para API calls)
|
|
- to:
|
|
- podSelector:
|
|
matchLabels:
|
|
app: archon-server
|
|
ports:
|
|
- protocol: TCP
|
|
port: 8181
|
|
```
|
|
|
|
---
|
|
|
|
## 9. Horizontal Pod Autoscaling (HPA)
|
|
|
|
### Configurar autoscaling para server
|
|
|
|
**Criar arquivo `k8s-hpa.yaml`:**
|
|
|
|
```yaml
|
|
---
|
|
# HPA - Archon Server (crawling pode ter spikes de carga)
|
|
apiVersion: autoscaling/v2
|
|
kind: HorizontalPodAutoscaler
|
|
metadata:
|
|
name: archon-server-hpa
|
|
namespace: archon
|
|
spec:
|
|
scaleTargetRef:
|
|
apiVersion: apps/v1
|
|
kind: Deployment
|
|
name: archon-server
|
|
minReplicas: 2
|
|
maxReplicas: 5
|
|
metrics:
|
|
- type: Resource
|
|
resource:
|
|
name: cpu
|
|
target:
|
|
type: Utilization
|
|
averageUtilization: 70
|
|
- type: Resource
|
|
resource:
|
|
name: memory
|
|
target:
|
|
type: Utilization
|
|
averageUtilization: 80
|
|
behavior:
|
|
scaleDown:
|
|
stabilizationWindowSeconds: 300 # Espera 5min antes de scale down
|
|
policies:
|
|
- type: Percent
|
|
value: 50
|
|
periodSeconds: 60
|
|
scaleUp:
|
|
stabilizationWindowSeconds: 30 # Scale up rápido
|
|
policies:
|
|
- type: Percent
|
|
value: 100
|
|
periodSeconds: 30
|
|
|
|
---
|
|
# HPA - Frontend (menos crítico, pode ser fixo em 2 réplicas)
|
|
# Opcional se houver muito tráfego
|
|
apiVersion: autoscaling/v2
|
|
kind: HorizontalPodAutoscaler
|
|
metadata:
|
|
name: archon-frontend-hpa
|
|
namespace: archon
|
|
spec:
|
|
scaleTargetRef:
|
|
apiVersion: apps/v1
|
|
kind: Deployment
|
|
name: archon-frontend
|
|
minReplicas: 2
|
|
maxReplicas: 4
|
|
metrics:
|
|
- type: Resource
|
|
resource:
|
|
name: cpu
|
|
target:
|
|
type: Utilization
|
|
averageUtilization: 75
|
|
```
|
|
|
|
---
|
|
|
|
## 10. PodDisruptionBudget (Alta Disponibilidade)
|
|
|
|
### Garantir disponibilidade durante rolling updates
|
|
|
|
**Criar arquivo `k8s-pdb.yaml`:**
|
|
|
|
```yaml
|
|
---
|
|
# PDB - Archon Server
|
|
apiVersion: policy/v1
|
|
kind: PodDisruptionBudget
|
|
metadata:
|
|
name: archon-server-pdb
|
|
namespace: archon
|
|
spec:
|
|
minAvailable: 1
|
|
selector:
|
|
matchLabels:
|
|
app: archon-server
|
|
unhealthyPodEvictionPolicy: AlwaysAllow
|
|
|
|
---
|
|
# PDB - Archon Frontend
|
|
apiVersion: policy/v1
|
|
kind: PodDisruptionBudget
|
|
metadata:
|
|
name: archon-frontend-pdb
|
|
namespace: archon
|
|
spec:
|
|
minAvailable: 1
|
|
selector:
|
|
matchLabels:
|
|
app: archon-frontend
|
|
unhealthyPodEvictionPolicy: AlwaysAllow
|
|
```
|
|
|
|
---
|
|
|
|
## 11. Persistent Volumes - NÃO NECESSÁRIO
|
|
|
|
### Análise de Necessidade
|
|
|
|
**✅ Arquon NÃO precisa de volumes persistentes porque:**
|
|
|
|
1. **Uploads de documentos**: Processados em memória e salvos no Supabase
|
|
2. **Crawling results**: Salvos diretamente no Supabase
|
|
3. **Playwright cache**: Reinstalado na inicialização do pod (stateless)
|
|
4. **Logs**: Enviados para stdout/stderr (capturados pelo K8s)
|
|
5. **Credenciais**: Armazenadas no Supabase (encrypted)
|
|
6. **Session data**: Gerenciado por Socket.IO em memória
|
|
|
|
**📊 Arquitetura Stateless:**
|
|
```
|
|
Pod → Processa dados → Salva no Supabase → Pod morre → Novo pod funciona igual
|
|
```
|
|
|
|
**⚠️ Exceção:** Se precisar de cache local para performance:
|
|
```yaml
|
|
# Opcional: Volume efêmero para cache de embeddings (não persiste entre restarts)
|
|
volumes:
|
|
- name: embedding-cache
|
|
emptyDir:
|
|
sizeLimit: 1Gi
|
|
```
|
|
|
|
---
|
|
|
|
## 12. Monitoring e Observability
|
|
|
|
### Prometheus Metrics (Recomendado)
|
|
|
|
**Adicionar annotations nos deployments:**
|
|
|
|
```yaml
|
|
spec:
|
|
template:
|
|
metadata:
|
|
annotations:
|
|
# ADICIONAR:
|
|
prometheus.io/scrape: "true"
|
|
prometheus.io/port: "8181" # ou 8051 para MCP
|
|
prometheus.io/path: "/metrics" # Se implementar endpoint
|
|
```
|
|
|
|
### Logfire Integration
|
|
|
|
**Verificar em `k8s-manifests-complete.yaml` - Secrets:**
|
|
|
|
```yaml
|
|
apiVersion: v1
|
|
kind: Secret
|
|
metadata:
|
|
name: archon-secrets
|
|
namespace: archon
|
|
type: Opaque
|
|
stringData:
|
|
SUPABASE_URL: "https://seu-projeto.supabase.co"
|
|
SUPABASE_SERVICE_KEY: "sua-service-role-key-aqui"
|
|
OPENAI_API_KEY: "sua-openai-key-aqui"
|
|
LOGFIRE_TOKEN: "seu-logfire-token-aqui" # CONFIGURAR se usar Logfire
|
|
```
|
|
|
|
---
|
|
|
|
## Checklist de Implementação
|
|
|
|
### 🔴 PRIORIDADE CRÍTICA (Impede funcionamento)
|
|
- [x] ✅ Corrigir Playwright browser path nos Dockerfiles
|
|
- [x] ✅ Adicionar `PLAYWRIGHT_BROWSERS_PATH` env var no deployment K8s
|
|
- [x] ✅ Adicionar `MCP_PUBLIC_URL` no ConfigMap e deployment K8s
|
|
- [x] ✅ Aumentar resource limits (memory: 2Gi, cpu: 2000m)
|
|
- [ ] ⚠️ Configurar `MCP_PUBLIC_URL` com o domínio correto no ConfigMap
|
|
- [ ] ⚠️ Rebuild e push das imagens K8s
|
|
|
|
### 🟡 PRIORIDADE ALTA (Segurança e estabilidade)
|
|
- [x] ✅ Atualizar health checks (startup probe, failureThreshold)
|
|
- [ ] ⚠️ Adicionar security contexts avançados (seccompProfile, readOnlyRootFilesystem)
|
|
- [ ] ⚠️ Configurar volumes para nginx (cache, run, logs)
|
|
- [ ] ⚠️ Implementar Network Policies
|
|
|
|
### 🟢 PRIORIDADE MÉDIA (Performance e observabilidade)
|
|
- [ ] 🔄 Adicionar init container para Playwright warm-up
|
|
- [ ] 🔄 Configurar HPA para server
|
|
- [ ] 🔄 Configurar PodDisruptionBudget
|
|
- [ ] 🔄 Adicionar Prometheus annotations
|
|
|
|
### 🔵 PRIORIDADE BAIXA (Melhoria contínua)
|
|
- [ ] 📝 Implementar /metrics endpoint para Prometheus
|
|
- [ ] 📝 Configurar Logfire token
|
|
- [ ] 📝 Testar readOnlyRootFilesystem: true no server
|
|
- [ ] 📝 Considerar resource quotas por namespace
|
|
|
|
---
|
|
|
|
## Comandos para Deploy
|
|
|
|
### 1. Rebuild e Push das Imagens
|
|
|
|
```bash
|
|
# Server
|
|
cd /home/lperl/Archon
|
|
docker build -f python/Dockerfile.k8s.server -t git.automatizase.com.br/luis.erlacher/archon/server:k8s-latest python/
|
|
docker push git.automatizase.com.br/luis.erlacher/archon/server:k8s-latest
|
|
|
|
# MCP (não mudou, mas rebuild para garantir)
|
|
docker build -f python/Dockerfile.k8s.mcp -t git.automatizase.com.br/luis.erlacher/archon/mcp:k8s-latest python/
|
|
docker push git.automatizase.com.br/luis.erlacher/archon/mcp:k8s-latest
|
|
|
|
# Frontend (não mudou, mas rebuild para garantir)
|
|
docker build -f archon-ui-main/Dockerfile.k8s.production -t git.automatizase.com.br/luis.erlacher/archon/frontend:k8s-latest archon-ui-main/
|
|
docker push git.automatizase.com.br/luis.erlacher/archon/frontend:k8s-latest
|
|
|
|
# Agents (se usado)
|
|
docker build -f python/Dockerfile.k8s.agents -t git.automatizase.com.br/luis.erlacher/archon/agents:k8s-latest python/
|
|
docker push git.automatizase.com.br/luis.erlacher/archon/agents:k8s-latest
|
|
```
|
|
|
|
### 2. Aplicar K8s Manifests
|
|
|
|
```bash
|
|
# Namespace e secrets (se ainda não existir)
|
|
kubectl apply -f k8s-manifests-complete.yaml
|
|
|
|
# Network policies (criar arquivo primeiro)
|
|
kubectl apply -f k8s-network-policies.yaml
|
|
|
|
# HPA (criar arquivo primeiro)
|
|
kubectl apply -f k8s-hpa.yaml
|
|
|
|
# PDB (criar arquivo primeiro)
|
|
kubectl apply -f k8s-pdb.yaml
|
|
```
|
|
|
|
### 3. Rolling Restart
|
|
|
|
```bash
|
|
# Restart server (vai pegar nova imagem)
|
|
kubectl rollout restart deployment/archon-server -n archon
|
|
kubectl rollout status deployment/archon-server -n archon
|
|
|
|
# Restart MCP
|
|
kubectl rollout restart deployment/archon-mcp -n archon
|
|
kubectl rollout status deployment/archon-mcp -n archon
|
|
|
|
# Restart frontend
|
|
kubectl rollout restart deployment/archon-frontend -n archon
|
|
kubectl rollout status deployment/archon-frontend -n archon
|
|
```
|
|
|
|
### 4. Verificar Status
|
|
|
|
```bash
|
|
# Ver pods
|
|
kubectl get pods -n archon -w
|
|
|
|
# Ver logs do server
|
|
kubectl logs -f deployment/archon-server -n archon
|
|
|
|
# Ver eventos
|
|
kubectl get events -n archon --sort-by='.lastTimestamp'
|
|
|
|
# Testar crawling
|
|
kubectl port-forward -n archon svc/archon-server-service 8181:8181
|
|
# Em outro terminal:
|
|
curl -X POST http://localhost:8181/api/knowledge/crawl \
|
|
-H "Content-Type: application/json" \
|
|
-d '{"url": "https://example.com"}'
|
|
```
|
|
|
|
---
|
|
|
|
## Troubleshooting
|
|
|
|
### Problema: Pod crashando com OOMKilled
|
|
**Solução:** Aumentar memory limits para 2Gi ou mais
|
|
|
|
### Problema: Playwright ainda não encontra browser
|
|
**Verificar:**
|
|
```bash
|
|
kubectl exec -it deployment/archon-server -n archon -- bash
|
|
echo $PLAYWRIGHT_BROWSERS_PATH
|
|
ls -la /app/ms-playwright
|
|
```
|
|
|
|
### Problema: Health check falhando
|
|
**Solução:** Aumentar `initialDelaySeconds` e `failureThreshold`
|
|
|
|
### Problema: Rolling update com downtime
|
|
**Solução:** Verificar PodDisruptionBudget e garantir minAvailable: 1
|
|
|
|
---
|
|
|
|
## Referências
|
|
|
|
- [Kubernetes Best Practices](https://kubernetes.io/docs/concepts/configuration/overview/)
|
|
- [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/)
|
|
- [Playwright in Docker](https://playwright.dev/docs/docker)
|
|
- [Nginx Non-Root](https://hub.docker.com/_/nginx)
|
|
- [FastAPI Deployment](https://fastapi.tiangolo.com/deployment/docker/)
|