S3 and Artifacts
S3 stores large objects that don't belong in Postgres rows: source code, scan reports, and debug bundles. Think of S3 as a durable key-value store where the key is a path-like string and the value is a file.
Vega's S3 buckets
| Bucket (configured via) | Contents |
|---|---|
| VEGA_S3_SOURCE_BUCKET | Source snapshots: zip archives of ingested repositories |
| VEGA_S3_ARTIFACTS_BUCKET | Scan artifacts: reports, event logs, debug bundles |
| VEGA_S3_EXPORTS_BUCKET | User-downloadable exports (findings exports, etc.) |
| (frontend_hosting module) | Built frontend static files (React app) |
Each environment (dev, prod) has its own separate set of buckets. The Terraform storage module creates them with versioning, encryption, and appropriate access controls.
Scan artifacts
After a scan completes, the runner uploads artifacts to the S3 artifacts bucket. The object keys follow this pattern:
scans/<scan_id>/<artifact_name>
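As a minimal sketch of the key pattern above, the helpers below build and parse artifact keys. The function names `artifact_key` and `parse_artifact_key` are hypothetical and are not part of Vega's codebase; the only thing taken from this page is the `scans/<scan_id>/<artifact_name>` layout.

```python
def artifact_key(scan_id: str, artifact_name: str) -> str:
    """Build the object key for one artifact of one scan.

    Hypothetical helper: follows the scans/<scan_id>/<artifact_name> pattern.
    """
    for label, value in (("scan_id", scan_id), ("artifact_name", artifact_name)):
        # A slash inside either part would change the key's structure.
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return f"scans/{scan_id}/{artifact_name}"


def parse_artifact_key(key: str) -> tuple[str, str]:
    """Split an artifact key back into (scan_id, artifact_name)."""
    parts = key.split("/")
    if len(parts) != 3 or parts[0] != "scans" or not parts[1] or not parts[2]:
        raise ValueError(f"not an artifact key: {key!r}")
    return parts[1], parts[2]
```

For example, `artifact_key("abc123", "v16-report.json")` yields `scans/abc123/v16-report.json`, and `parse_artifact_key` reverses that.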
| Artifact | Description |
|---|---|
| v16-events.jsonl | Every raw event emitted by v16 during the scan, one JSON object per line |
| runner-summary.json | Scan metadata: timing, component counts, status |
| v16-report.json | Structured findings report from v16 |
| v16-debug-bundle.zip | Full debug bundle including Codex prompts, responses, and internal state |
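Because `v16-events.jsonl` holds one JSON object per line, it can be streamed without loading the whole file. The sketch below shows one way to do that. The event schema is not described on this page, so the field name passed to `count_by_field` (for instance `"type"`) is purely a placeholder assumption, and both function names are hypothetical.

```python
import json
from collections import Counter
from typing import Any, Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield one dict per non-blank line of a JSONL stream."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        try:
            event = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number} is not valid JSON: {exc}") from exc
        if not isinstance(event, dict):
            raise ValueError(f"line {line_number} is not a JSON object")
        yield event


def count_by_field(events: Iterable[dict[str, Any]], field: str) -> Counter:
    """Count events by the value of one field; the field name is caller-supplied."""
    return Counter(str(event.get(field, "<missing>")) for event in events)


# Usage, assuming the file was downloaded locally and events carry a "type" field:
# with open("v16-events.jsonl", encoding="utf-8") as handle:
#     print(count_by_field(iter_events(handle), "type"))
```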
The v16-debug-bundle.zip is the most useful artifact for diagnosing scan failures. Download it from S3 and inspect its contents to see exactly what Codex was asked and what it answered.
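Once the bundle has been downloaded, a quick listing is often enough to see what it contains before opening individual files. The sketch below uses only Python's standard `zipfile` module. It makes no assumption about the file names inside the bundle, since the internal layout is not documented here, and `list_bundle` is a hypothetical helper name.

```python
import zipfile
from typing import BinaryIO, Union


def list_bundle(bundle: Union[str, BinaryIO]) -> list[tuple[str, int]]:
    """Return (member name, uncompressed size in bytes) for each file in a zip.

    `bundle` is a local path or an open binary file object.
    """
    with zipfile.ZipFile(bundle) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        corrupt = archive.testzip()
        if corrupt is not None:
            raise ValueError(f"corrupt member in bundle: {corrupt}")
        return sorted(
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()
        )


# Usage, assuming the bundle was saved to the current directory:
# for name, size in list_bundle("v16-debug-bundle.zip"):
#     print(f"{size:>10}  {name}")
```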
Object keys in the database
Postgres records store S3 object keys (not full URLs). To construct a presigned URL for downloading:
from app.storage.s3 import get_s3_client

url = get_s3_client(settings).generate_presigned_url(
    bucket=settings.s3_artifacts_bucket,
    key="scans/abc123/v16-debug-bundle.zip",
    expires_in=3600,
)
The API exposes presigned URLs for scan artifacts through the scan detail endpoint.
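For ad hoc debugging outside the application, a stored key can also be presigned with boto3 directly. This is a sketch, not Vega's implementation: whether `app.storage.s3` wraps boto3 is an assumption, and `presign_download` is a hypothetical helper. The boto3 call itself, `generate_presigned_url("get_object", Params=..., ExpiresIn=...)`, is the library's documented client method, with `ExpiresIn` given in seconds.

```python
from typing import Any


def presign_download(client: Any, bucket: str, key: str, expires_in: int = 3600) -> str:
    """Return a time-limited GET URL for an object.

    `client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    # The database stores object keys, not URLs; catch the mix-up early.
    if "://" in key:
        raise ValueError(f"expected an object key, got a URL: {key!r}")
    return client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


# Usage, assuming boto3 is installed and AWS credentials are configured:
# import boto3
# url = presign_download(
#     boto3.client("s3"),
#     "vega-dev-artifacts-abc123",
#     "scans/abc123/v16-debug-bundle.zip",
# )
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub, without needing AWS access.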
Configuration
VEGA_FILE_STORAGE_BACKEND=s3
VEGA_S3_SOURCE_BUCKET=vega-dev-source-abc123
VEGA_S3_ARTIFACTS_BUCKET=vega-dev-artifacts-abc123
VEGA_S3_EXPORTS_BUCKET=vega-dev-exports-abc123
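A small startup check can catch a missing or empty bucket variable before the first upload fails. The sketch below reads the four variables shown above. The `S3Config` class and `load_s3_config` function are hypothetical and do not describe how Vega's own settings loader works; treating any backend other than `s3` as an error is also an assumption made only for this check.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class S3Config:
    source_bucket: str
    artifacts_bucket: str
    exports_bucket: str


# Maps S3Config field names to the environment variables listed above.
_BUCKET_VARS = {
    "source_bucket": "VEGA_S3_SOURCE_BUCKET",
    "artifacts_bucket": "VEGA_S3_ARTIFACTS_BUCKET",
    "exports_bucket": "VEGA_S3_EXPORTS_BUCKET",
}


def load_s3_config(env: Mapping[str, str] = os.environ) -> S3Config:
    """Validate the S3-related variables and return them as one object."""
    backend = env.get("VEGA_FILE_STORAGE_BACKEND", "")
    if backend != "s3":
        raise ValueError(f"VEGA_FILE_STORAGE_BACKEND is {backend!r}, expected 's3'")
    missing = [name for name in _BUCKET_VARS.values() if not env.get(name)]
    if missing:
        raise ValueError("missing or empty variables: " + ", ".join(missing))
    return S3Config(**{field: env[name] for field, name in _BUCKET_VARS.items()})
```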
Debugging
Artifact missing after a completed scan:
1. Confirm VEGA_FILE_STORAGE_BACKEND=s3 in the runner task definition.
2. Check VEGA_S3_ARTIFACTS_BUCKET — is it the right bucket for this environment?
3. Did the scan get far enough to produce the artifact? A very early failure won't create v16-report.json.
4. Check the runner CloudWatch logs around the upload step.
5. Verify the runner task role has s3:PutObject permission on the artifacts bucket.
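To see at a glance which artifacts a scan actually produced, the checklist above can be complemented with a listing of the scan's key prefix. The sketch below assumes a boto3 S3 client whose credentials allow `s3:ListBucket` on the artifacts bucket, and it assumes a scan has far fewer than 1,000 objects, so a single `list_objects_v2` page is enough. `missing_artifacts` is a hypothetical helper; the artifact names come from the table earlier on this page.

```python
from typing import Any

EXPECTED_ARTIFACTS = (
    "v16-events.jsonl",
    "runner-summary.json",
    "v16-report.json",
    "v16-debug-bundle.zip",
)


def missing_artifacts(client: Any, bucket: str, scan_id: str) -> list[str]:
    """Return the expected artifact names that are absent for this scan.

    `client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    prefix = f"scans/{scan_id}/"
    response = client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # "Contents" is omitted from the response when nothing matches the prefix.
    present = {obj["Key"][len(prefix):] for obj in response.get("Contents", [])}
    return [name for name in EXPECTED_ARTIFACTS if name not in present]


# Usage, assuming boto3 is installed and AWS credentials are configured:
# import boto3
# print(missing_artifacts(boto3.client("s3"), "vega-dev-artifacts-abc123", "abc123"))
```

An empty result means all four artifacts exist. If only `v16-report.json` is missing, an early scan failure (step 3) is a likely explanation. If everything is missing, configuration or permissions (steps 1, 2, and 5) are more likely.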
Source snapshot not found during scan:
1. Check VEGA_S3_SOURCE_BUCKET in the runner task definition.
2. Confirm the source was successfully uploaded during ingest — check API logs.
3. Verify the runner task role has s3:GetObject permission on the source bucket.
4. Check that the bucket name and object key in the scan/repository record match what's actually in S3.
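Step 4 of the source snapshot checklist can be partly automated. The sketch below takes the key exactly as stored in the database record, flags a few common malformations, and then checks whether an object with that exact key exists. It assumes a boto3 S3 client with `s3:ListBucket` on the source bucket. `check_stored_key` and the specific malformation checks are hypothetical illustrations, not Vega behavior.

```python
from typing import Any


def check_stored_key(client: Any, bucket: str, stored_key: str) -> list[str]:
    """Return a list of human-readable problems; empty means the key looks fine."""
    if not stored_key:
        return ["stored key is empty"]
    problems = []
    if "://" in stored_key:
        problems.append("value looks like a URL, not an object key")
    if stored_key.startswith("/"):
        problems.append("key starts with '/', which S3 treats as part of the key")
    if stored_key != stored_key.strip():
        problems.append("key has leading or trailing whitespace")
    # S3 lists keys in lexicographic order, so an exact match for the prefix,
    # if it exists, is the first key on the first page of results.
    response = client.list_objects_v2(Bucket=bucket, Prefix=stored_key)
    keys = {obj["Key"] for obj in response.get("Contents", [])}
    if stored_key not in keys:
        problems.append(f"no object with this exact key in bucket {bucket!r}")
    return problems
```

Note that a listing succeeding here only proves the object exists. The runner task role still needs `s3:GetObject` on the source bucket, as in step 3.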