Vega on AWS
This page explains how Vega works when deployed to AWS — what each AWS service does, how the pieces talk to each other, and the exact flow of a scan from start to finish.
Architecture diagram
flowchart TD
User[Browser user]
subgraph edge["Edge"]
CF[CloudFront CDN]
S3FE[S3: frontend bucket\nstatic React files]
end
subgraph vpc["VPC — private network"]
ALB[Application Load Balancer\npublic subnet]
subgraph app["Application containers — ECS Fargate"]
API[vega-api\nprivate subnet]
Worker[vega-worker\nprivate subnet]
Proxy[vega-llm-proxy\nprivate subnet]
end
subgraph runner["Scan execution — ECS RunTask"]
Runner[vega-v16-runner\nprivate subnet, one per scan]
end
RDS[(RDS Postgres\nprivate subnet)]
end
subgraph aws_services["AWS managed services"]
SQS[SQS scan queue]
S3SRC[S3: source bucket]
S3ART[S3: artifacts bucket]
Cognito[Cognito user pool]
SM[Secrets Manager]
CW[CloudWatch logs]
ECR[ECR image registry]
end
Provider[AI provider API]
User --> CF
CF --> S3FE
CF --> ALB
ALB --> API
API --> Cognito
API --> RDS
API --> S3SRC
API --> SQS
SQS --> Worker
Worker --> RDS
Worker -->|RunTask| Runner
Runner --> RDS
Runner --> S3SRC
Runner --> S3ART
Runner --> Proxy
Proxy --> Provider
API --> CW
Worker --> CW
Runner --> CW
Proxy --> CW
API --> SM
Worker --> SM
ECS -->|pull images| ECR
Edge and frontend
CloudFront is AWS's CDN (Content Delivery Network). When a user opens the Vega dashboard, their browser connects to the nearest CloudFront edge location rather than directly to the S3 bucket or load balancer behind it. CloudFront serves two things:
- Static frontend files: the built React app stored in an S3 bucket. S3 is AWS's object storage; it holds files but isn't a web server. CloudFront makes S3 act like one, adding caching and HTTPS.
- API requests: requests to /v1/* are forwarded by CloudFront to the Application Load Balancer, which routes them to the vega-api ECS service.
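The routing split above can be modeled in a few lines. This is a minimal sketch of first-match path routing, not the actual distribution config: the origin names `alb` and `s3-frontend` are hypothetical labels, and the only pattern taken from this page is /v1/*.

```python
from fnmatch import fnmatchcase

# Ordered path patterns, checked top to bottom. Origin names are
# hypothetical labels; the real behaviors are defined in Terraform.
BEHAVIORS = [
    ("/v1/*", "alb"),  # API requests go to the load balancer
]
DEFAULT_ORIGIN = "s3-frontend"  # everything else is a static file


def pick_origin(path: str) -> str:
    """Return the origin for a request path; the first matching pattern wins."""
    for pattern, origin in BEHAVIORS:
        if fnmatchcase(path, pattern):
            return origin
    return DEFAULT_ORIGIN
```

For example, `pick_origin("/v1/scans/123")` selects the load balancer, while `pick_origin("/index.html")` falls through to the frontend bucket.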
Authentication with Cognito
AWS Cognito is a managed user authentication service. Vega uses it so the team doesn't have to build user management from scratch. Here's how it works:
- The user enters their credentials on the dashboard login page.
- The frontend calls Cognito directly (SRP authentication protocol) and receives JWT tokens.
- For every subsequent API request, the frontend sends Authorization: Bearer <token>.
- vega-api validates the JWT signature using Cognito's public JWKS (JSON Web Key Set) endpoint.
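The first half of the server-side check can be sketched with the standard library alone. This is an illustration under stated assumptions, not the contents of app/auth/cognito.py: the function names are hypothetical, and the sketch stops at selecting the signing key. Verifying the signature and the claims (expiry, issuer, token use) would be done with a JWT library.

```python
import base64
import json


def jwks_url(region: str, user_pool_id: str) -> str:
    """Cognito publishes each user pool's signing keys at this well-known URL."""
    return (
        f"https://cognito-idp.{region}.amazonaws.com/"
        f"{user_pool_id}/.well-known/jwks.json"
    )


def bearer_token(authorization_header: str) -> str:
    """Extract the token from an 'Authorization: Bearer <token>' header value."""
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        raise ValueError("expected 'Bearer <token>'")
    return token.strip()


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def select_signing_key(token: str, jwks: dict) -> dict:
    """Find the JWKS entry whose 'kid' matches the (unverified) token header."""
    header = json.loads(_b64url_decode(token.split(".")[0]))
    for key in jwks.get("keys", []):
        if key.get("kid") == header.get("kid"):
            return key
    raise KeyError(f"no JWKS key with kid={header.get('kid')!r}")
```

The header is read before the signature is verified only to learn which key to use; nothing else in the token should be trusted until verification succeeds.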
Relevant code: app/auth/cognito.py, app/auth/service.py.
Terraform module: infra/terraform/modules/cognito/main.tf.
Network isolation with VPC
A VPC (Virtual Private Cloud) is a private network inside AWS. All Vega services run inside the VPC. The database, worker, runner, and LLM proxy are in private subnets — they have no direct internet access. The API's load balancer is in a public subnet so it can receive traffic from CloudFront.
Security groups act as firewalls. For example:
- The database security group only accepts connections from the API and runner task security groups.
- The LLM proxy security group only accepts connections from runner tasks.
- The worker security group only needs outbound access to SQS, RDS, and ECS.
Relevant Terraform modules: infra/terraform/modules/network/main.tf, infra/terraform/modules/security/main.tf.
Compute with ECS Fargate
ECS (Elastic Container Service) runs Docker containers. Fargate is the serverless mode — you don't manage EC2 instances. You define how much CPU and memory a container needs, and AWS handles the rest.
Vega has three long-running ECS services (the API, worker, and LLM proxy) and two ECS task definitions used for one-off runs (the runner and maintenance):
| Type | Service/Task | How it runs |
|---|---|---|
| Long-running service | vega-api | Always running, ECS restarts on failure |
| Long-running service | vega-worker | Always running, polls SQS in a loop |
| Long-running service | vega-llm-proxy | Always running, handles AI proxy requests |
| One-off task | vega-v16-runner | Launched per scan via ECS RunTask, exits when done |
| One-off task | vega-maintenance | Launched manually for migrations and cleanup |
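To make the one-off task row concrete, here is a sketch of the request a caller might pass to boto3's `ecs_client.run_task(**request)` to launch a runner on Fargate in private subnets. The parameter keys are the real RunTask ones; the container name `runner`, the `SCAN_ID` environment variable, and the use of `vega-v16-runner` as the task definition family are assumptions for illustration.

```python
def build_run_task_request(
    scan_id: str,
    cluster: str,
    subnets: list[str],
    security_groups: list[str],
) -> dict:
    """Build keyword arguments for an ECS RunTask call (one task per scan)."""
    return {
        "cluster": cluster,
        "taskDefinition": "vega-v16-runner",  # assumed family name
        "launchType": "FARGATE",
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
                # Private subnets: the task gets no public IP.
                "assignPublicIp": "DISABLED",
            }
        },
        "overrides": {
            "containerOverrides": [
                {
                    "name": "runner",  # hypothetical container name
                    "environment": [{"name": "SCAN_ID", "value": scan_id}],
                }
            ]
        },
    }
```

Passing the scan ID as a container override lets a single task definition serve every scan.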
Scan flow in AWS (step by step)
- User creates a scan in the dashboard.
- vega-api writes a queued scan row to RDS Postgres.
- vega-api sends a message to the SQS queue containing the scan ID.
- vega-worker (which is always running) receives the SQS message.
- vega-worker claims the scan with a Postgres row lock (prevents double-execution).
- vega-worker calls AWS ECS RunTask to start a vega-v16-runner container.
- vega-v16-runner downloads the source snapshot from S3.
- vega-v16-runner runs the v16 scan engine: planning plus per-component audits.
- All AI calls from v16/Codex go through vega-llm-proxy, which holds the provider API key.
- Findings and events are written to Postgres. Artifacts are uploaded to S3.
- Scan status is updated to completed.
- The vega-v16-runner ECS task exits. You're only billed for the time it ran.
- The dashboard reads the updated scan status and findings via the API.
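The worker's part of this flow (receive, claim, launch) might look roughly like the sketch below. It is not the actual vega-worker code: the JSON message shape `{"scan_id": ...}` is assumed, and `claim_scan` and `launch_runner` are hypothetical callables standing in for the Postgres row-lock claim and the ECS RunTask call. The `sqs` argument is expected to behave like a boto3 SQS client.

```python
import json
from typing import Callable, Optional


def process_one_message(
    sqs,
    queue_url: str,
    claim_scan: Callable[[str], bool],
    launch_runner: Callable[[str], None],
) -> Optional[str]:
    """Handle at most one queued scan; return the scan ID launched, if any."""
    # Long-poll so an idle worker does not spin on an empty queue.
    response = sqs.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=20
    )
    messages = response.get("Messages", [])
    if not messages:
        return None

    message = messages[0]
    scan_id = json.loads(message["Body"])["scan_id"]  # assumed message shape

    launched = None
    # claim_scan is expected to move the row from queued to running under a
    # row lock and return False if another worker already claimed it.
    if claim_scan(scan_id):
        launch_runner(scan_id)
        launched = scan_id

    # Reached only if nothing above raised; on an exception the message
    # stays in the queue and SQS redelivers it after the visibility timeout.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
    return launched
```

Because SQS can deliver a message more than once, the database claim rather than the queue is what prevents a scan from running twice.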
Logs and observability
CloudWatch is AWS's monitoring service, and CloudWatch Logs is the part of it that stores log output. Every ECS container is configured to send its stdout/stderr to a CloudWatch log group. When debugging an AWS issue, CloudWatch logs are almost always the first place to look.
Log groups follow this naming pattern: /vega/<env>/<service-name>
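As a small illustration of the naming pattern, a helper like the following builds the log group name for a service. The function name and the `prod` environment value in the example are hypothetical, and it assumes `<service-name>` matches the service and task names used elsewhere on this page.

```python
def log_group_name(env: str, service: str) -> str:
    """Build a log group name following the /vega/<env>/<service-name> pattern."""
    return f"/vega/{env}/{service}"


# Example: the runner's log group in a hypothetical "prod" environment.
runner_logs = log_group_name("prod", "vega-v16-runner")
```

Here `runner_logs` is `/vega/prod/vega-v16-runner`.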
Relevant Terraform: infra/terraform/modules/observability/main.tf.