Scan Lifecycle
A scan is the central operation of Vega. This page traces the complete journey from the moment a user clicks "Start scan" to when findings appear in the dashboard. The exact path differs between local mode and AWS production mode.
Complete scan flow
flowchart TD
A["User clicks 'Start scan' in dashboard"]
B["POST /v1/repositories/:id/scans"]
C["API writes scan row to DB\nstatus = queued"]
A --> B --> C
C --> D{Execution mode?}
D -->|thread| E["API runs scan in a\nbackground thread"]
D -->|external| F["Worker polling loop\nclaims the scan"]
D -->|sqs| G["API sends message to SQS"]
G --> F
F --> H["Worker locks scan row in DB\nstatus = claimed"]
H --> I{Worker execution mode?}
I -->|local| E
I -->|ecs| J["Worker calls ECS RunTask\nlaunches vega-v16-runner"]
E --> K
J --> K
K["Runner loads source snapshot from S3\nor local directory"]
K --> L["Runner calls ProjectService\n.run_claimed_scan_by_id()"]
L --> M["Calls V16ServiceAdapter\n.scan_source()"]
M --> N["v16 runs planning step\nCodex identifies components to audit"]
N --> O["v16 runs audit per component\nCodex analyzes each one"]
O --> P["v16 emits V16Events:\nscan_progress, finding_updated,\nscan_completed, etc."]
P --> Q["Backend adapter maps events\nto Vega event records"]
Q --> R["ProjectService upserts findings\nin Postgres or JSON"]
Q --> S["Events appended to event store"]
R --> T["Runner uploads artifacts to S3:\nv16-events.jsonl, runner-summary.json,\nv16-report.json, v16-debug-bundle.zip"]
T --> U["Scan status → completed\n(or failed / cancelled)"]
U --> V["Dashboard polls /v1/repositories/:id/scans\nand reads findings"]
Step-by-step explanation
1. Scan is created
The user initiates a scan from the dashboard or via API. The backend creates a scan record in the database with status queued. If the execution mode is sqs, a message is also sent to the SQS queue. The API returns immediately — it doesn't wait for the scan to finish.
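The create step can be sketched as follows. This is a minimal illustration, not the backend's actual code: the `create_scan` function, the dict standing in for the database, the list standing in for the SQS queue, and the message shape are all hypothetical.

```python
import json
import uuid


def create_scan(repository_id: str, execution_mode: str, db: dict, queue: list) -> dict:
    """Record a queued scan and, in sqs mode, enqueue a message for the worker."""
    scan = {"id": uuid.uuid4().hex, "repository_id": repository_id, "status": "queued"}
    db[scan["id"]] = scan  # stand-in for inserting the scan row
    if execution_mode == "sqs":
        # Stand-in for sending an SQS message; the payload shape is an assumption.
        queue.append(json.dumps({"scan_id": scan["id"]}))
    return scan  # returned right away; nothing here waits for the scan to run


db: dict = {}
queue: list = []
scan = create_scan("repo-1", "sqs", db, queue)
```

The function returns as soon as the record exists, which mirrors the point above that the API does not block on the scan itself.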
2. Worker claims the scan
A worker process (running either locally or as an ECS service) picks up queued scans. In sqs mode it receives scan messages from the SQS queue; in external mode its polling loop picks up the scan without a queue message. In both cases it claims the scan using a database row lock. This is crucial: claiming prevents two workers from picking up the same scan simultaneously. The scan status moves from queued to claimed.
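To show why claiming prevents double pickup, here is a small runnable sketch. It does not reproduce Vega's row lock: it swaps in a conditional UPDATE (a compare-and-set) against an in-memory SQLite database, and the table name, column names, and `claim_scan` helper are hypothetical. On Postgres, a row lock is typically taken with `SELECT ... FOR UPDATE` instead.

```python
import sqlite3


def claim_scan(conn: sqlite3.Connection, scan_id: str, worker_id: str) -> bool:
    """Try to move a scan from queued to claimed; return True if this worker won."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE scans SET status = 'claimed', claimed_by = ? "
            "WHERE id = ? AND status = 'queued'",
            (worker_id, scan_id),
        )
    # Only one UPDATE can match the queued row, so only one caller sees rowcount 1.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (id TEXT PRIMARY KEY, status TEXT, claimed_by TEXT)")
conn.execute("INSERT INTO scans VALUES ('scan-1', 'queued', NULL)")
conn.commit()

first = claim_scan(conn, "scan-1", "worker-a")   # succeeds
second = claim_scan(conn, "scan-1", "worker-b")  # finds no queued row to claim
```

Whichever mechanism is used, the property that matters is the same: the transition from queued to claimed happens at most once per scan.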
3. Runner is launched
If VEGA_SCAN_WORKER_EXECUTION_MODE=ecs, the worker calls AWS ECS RunTask to start a vega-v16-runner container. The scan ID is passed as an environment variable. In local mode, the scan runs in the worker process itself.
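As an illustration of passing the scan ID to the runner, the sketch below builds the keyword arguments for an ECS RunTask call. The cluster name, container name, and the `VEGA_SCAN_ID` variable name are assumptions made for the example; the source only says that the scan ID is passed as an environment variable. A real call usually also needs a launch type and network configuration, which are omitted here.

```python
def build_run_task_request(scan_id: str) -> dict:
    """Build keyword arguments for an ECS RunTask call that starts one runner."""
    return {
        "cluster": "vega-cluster",            # hypothetical cluster name
        "taskDefinition": "vega-v16-runner",  # assumes the task definition is named after the runner
        "count": 1,
        "overrides": {
            "containerOverrides": [
                {
                    "name": "vega-v16-runner",  # hypothetical container name
                    "environment": [
                        # Hypothetical variable name for the scan ID.
                        {"name": "VEGA_SCAN_ID", "value": scan_id},
                    ],
                }
            ]
        },
    }


request = build_run_task_request("scan-123")
# With boto3 installed and AWS credentials configured, this could be sent as:
#   boto3.client("ecs").run_task(**request)
```

Using a per-task override keeps a single task definition reusable across scans, since only the environment changes from one launch to the next.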
4. Source is loaded
The runner loads the source snapshot. In production, this means downloading a zip from S3. Locally, it reads from a directory under data/snapshots/.
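A rough sketch of the production path, assuming the snapshot arrives as zip bytes: the `unpack_snapshot` helper is hypothetical, and the in-memory archive below stands in for the object downloaded from S3.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def unpack_snapshot(zip_bytes: bytes, dest: Path) -> list[str]:
    """Extract a zipped source snapshot into dest and list the files it contained."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(dest)
    return sorted(p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file())


# Build a tiny snapshot in memory to stand in for the S3 download.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("app/main.py", "print('hello')\n")
    archive.writestr("README.md", "# demo\n")

with tempfile.TemporaryDirectory() as tmp:
    files = unpack_snapshot(buffer.getvalue(), Path(tmp))
```

In local mode no unpacking of this kind would be needed, since the snapshot is already a directory on disk.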
5. v16 planning
The v16 scan engine first runs a planning step. It uses Codex (an AI CLI tool) to analyze the codebase structure and the threat profile, then produces a plan: which components of the code to audit and in what order.
6. Per-component auditing
For each component in the plan, v16 runs an audit step. Codex reads the source files for that component and generates findings based on the threat profile. All Codex calls go through the LLM proxy, which enforces per-scan usage limits.
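The idea of a per-scan usage limit can be sketched as a simple budget keyed by scan ID. This is an assumption-heavy illustration: the source does not say what unit the proxy limits, so tokens are used here as a placeholder, and `UsageLimiter` and `ScanBudgetExceeded` are hypothetical names.

```python
class ScanBudgetExceeded(Exception):
    """Raised when a scan would go over its usage budget."""


class UsageLimiter:
    def __init__(self, max_tokens_per_scan: int) -> None:
        self.max_tokens_per_scan = max_tokens_per_scan
        self._used: dict[str, int] = {}  # tokens consumed so far, per scan ID

    def charge(self, scan_id: str, tokens: int) -> int:
        """Record usage for a scan and return the remaining budget."""
        used = self._used.get(scan_id, 0) + tokens
        if used > self.max_tokens_per_scan:
            raise ScanBudgetExceeded(f"scan {scan_id} would use {used} tokens")
        self._used[scan_id] = used
        return self.max_tokens_per_scan - used


limiter = UsageLimiter(max_tokens_per_scan=1000)
remaining = limiter.charge("scan-1", 600)  # 400 tokens left for scan-1

try:
    limiter.charge("scan-1", 500)  # would bring scan-1 to 1100, over the limit
    blocked = False
except ScanBudgetExceeded:
    blocked = True
```

Because the budget is tracked per scan ID, one expensive scan cannot consume the allowance of another.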
7. Events and findings stream back
As v16 works, it emits events:
| Event kind | Meaning |
|---|---|
| scan_started | v16 has begun |
| scan_progress | Progress update (e.g., "auditing component 3 of 8") |
| scan_log | Debug log line from v16 or Codex |
| finding_updated | A security issue was found or updated |
| scan_completed | v16 finished successfully |
| scan_failed | v16 encountered an unrecoverable error |
| scan_cancelled | The scan was cancelled |
The backend adapter (app/projects/v16_adapter.py) maps these into Vega event records and calls back into ProjectService to upsert findings.
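The mapping and upsert behaviour can be pictured with a small in-memory model. This is not the code in v16_adapter.py: the event payload shape, the `ScanState` class, and the `apply_event` function are hypothetical, and only the event kinds and final statuses come from this page.

```python
from dataclasses import dataclass, field

# Event kinds that end a scan, mapped to the resulting scan status.
TERMINAL_STATUS = {
    "scan_completed": "completed",
    "scan_failed": "failed",
    "scan_cancelled": "cancelled",
}


@dataclass
class ScanState:
    status: str = "claimed"
    findings: dict[str, dict] = field(default_factory=dict)
    events: list[dict] = field(default_factory=list)


def apply_event(state: ScanState, event: dict) -> None:
    """Append the event to the event log and update findings or status."""
    state.events.append(event)
    kind = event["kind"]
    if kind == "finding_updated":
        finding = event["finding"]  # hypothetical payload shape
        state.findings[finding["id"]] = finding  # upsert: insert or overwrite by ID
    elif kind in TERMINAL_STATUS:
        state.status = TERMINAL_STATUS[kind]


state = ScanState()
for event in [
    {"kind": "scan_started"},
    {"kind": "finding_updated", "finding": {"id": "F-1", "severity": "medium"}},
    {"kind": "finding_updated", "finding": {"id": "F-1", "severity": "high"}},
    {"kind": "scan_completed"},
]:
    apply_event(state, event)
```

Keying findings by ID is what makes repeated finding_updated events safe: the second update replaces the first instead of creating a duplicate.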
8. Artifacts are written
When the scan finishes, the runner uploads artifacts to S3:
- v16-events.jsonl: all raw v16 events
- runner-summary.json: scan metadata and timing
- v16-report.json: structured findings report
- v16-debug-bundle.zip: full debug bundle including Codex state (for diagnosing scan failures)
9. Dashboard updates
The frontend polls the scan status endpoint and the findings endpoint. Once the scan is completed, all findings are available for review.
Execution mode quick reference
Local (simplest):
VEGA_SCAN_EXECUTION_MODE=thread
→ API process runs the scan directly
Local (production-like):
VEGA_SCAN_EXECUTION_MODE=external
VEGA_SCAN_WORKER_EXECUTION_MODE=local
→ Separate worker process, but no ECS
Production:
VEGA_SCAN_EXECUTION_MODE=sqs
VEGA_SCAN_WORKER_EXECUTION_MODE=ecs
→ SQS queue, ECS worker, isolated ECS runner per scan
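The three combinations above can also be read as a small decision function. The sketch below is illustrative only: the `describe_execution_path` helper and the step labels it returns are hypothetical, and it raises on missing or unknown values because this page does not state any defaults.

```python
def describe_execution_path(env: dict[str, str]) -> list[str]:
    """Return the components a scan passes through for the given settings."""
    mode = env.get("VEGA_SCAN_EXECUTION_MODE")
    if mode == "thread":
        return ["api-thread"]  # the API process runs the scan itself
    if mode not in ("external", "sqs"):
        raise ValueError(f"unknown VEGA_SCAN_EXECUTION_MODE: {mode!r}")

    worker_mode = env.get("VEGA_SCAN_WORKER_EXECUTION_MODE")
    if worker_mode not in ("local", "ecs"):
        raise ValueError(f"unknown VEGA_SCAN_WORKER_EXECUTION_MODE: {worker_mode!r}")

    steps = ["sqs-queue"] if mode == "sqs" else []
    steps.append("worker")
    # In ecs mode the worker launches an isolated runner task per scan.
    steps.append("ecs-runner" if worker_mode == "ecs" else "worker-process")
    return steps


production = describe_execution_path(
    {"VEGA_SCAN_EXECUTION_MODE": "sqs", "VEGA_SCAN_WORKER_EXECUTION_MODE": "ecs"}
)
local_like = describe_execution_path(
    {"VEGA_SCAN_EXECUTION_MODE": "external", "VEGA_SCAN_WORKER_EXECUTION_MODE": "local"}
)
```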
Cancellation
A user can cancel a running scan. The cancellation flow:
- DELETE /v1/repositories/:id/scans/:scanId is called
- API sets a cancellation flag on the scan record
- If an ECS runner task is running, the worker stops it via ECS StopTask
- v16 and Codex respect the cancellation signal cooperatively
- Partial findings and events up to the cancellation point are preserved
- Scan status moves to cancelled
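Cooperative cancellation with preserved partial results can be sketched like this. It is a simplified model, not v16's implementation: the `run_audit` loop, the `fake_audit` callback, and the use of `threading.Event` as the cancellation flag are assumptions for illustration.

```python
import threading
from typing import Callable


def run_audit(
    components: list[str],
    audit_component: Callable[[str], list[str]],
    cancel: threading.Event,
) -> tuple[str, list[str]]:
    """Audit components in order, checking the cancellation flag between them."""
    findings: list[str] = []
    for component in components:
        if cancel.is_set():
            # Stop at a safe point and keep everything found so far.
            return "cancelled", findings
        findings.extend(audit_component(component))
    return "completed", findings


cancel = threading.Event()


def fake_audit(component: str) -> list[str]:
    if component == "db":
        cancel.set()  # simulate the user cancelling while "db" is being audited
    return [f"{component}: example finding"]


status, findings = run_audit(["api", "db", "ui"], fake_audit, cancel)
```

The flag is only checked between components, so the component in progress finishes and its findings are kept; this is what "cooperative" means in contrast to killing the process outright.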
Debugging stale scans
A scan that stays running for longer than VEGA_SCAN_RUNNING_STALE_SECONDS (default: 6 hours) is treated as stale. The worker's recovery loop will detect it and mark it failed with an appropriate error event. This handles cases where the runner task crashed silently without writing a failure event.
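The staleness check itself amounts to a timestamp comparison. In the sketch below, the `find_stale_scans` helper and the `started_at` field are hypothetical; the source does not say which timestamp the recovery loop compares against, so a start time is assumed here.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_STALE_SECONDS = 6 * 60 * 60  # the documented default of 6 hours


def find_stale_scans(
    scans: list[dict], now: datetime, stale_seconds: int = DEFAULT_STALE_SECONDS
) -> list[str]:
    """Return IDs of scans that have been running longer than the threshold."""
    cutoff = now - timedelta(seconds=stale_seconds)
    return [
        scan["id"]
        for scan in scans
        if scan["status"] == "running" and scan["started_at"] < cutoff
    ]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
scans = [
    {"id": "a", "status": "running", "started_at": now - timedelta(hours=7)},
    {"id": "b", "status": "running", "started_at": now - timedelta(hours=3)},
    {"id": "c", "status": "completed", "started_at": now - timedelta(hours=11)},
]
stale = find_stale_scans(scans, now)
```

A recovery loop built on this would then mark each returned scan as failed and record an error event, as described above.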