Backend Components

The backend is a FastAPI application organized into three layers: HTTP routes, domain services, and storage adapters. This page maps the app/ directory and explains how the layers relate.

Layer overview

app/api/        ← HTTP layer: receive requests, call services, return responses
app/projects/   ← Core domain: projects, repos, scans, findings, v16 bridge
app/auth/       ← Auth domain: login, tokens, Cognito, current-user dependency
app/sessions/   ← Legacy domain: old upload/analyze flow (secondary path)
app/events/     ← Event persistence: append-only scan event log
app/api_keys/   ← API key management for programmatic access
app/billing/    ← Usage summaries (placeholder)
app/hardening/  ← Quotas, worker heartbeats, stale-scan cleanup
app/llm_proxy/  ← AI provider proxy with per-scan usage enforcement
app/queues/     ← SQS producer and consumer
app/storage/    ← Postgres, S3, archive safety, migrations
app/core/       ← Settings, structured errors, JSON logging

The HTTP layer (app/api/)

Route handlers in app/api/ are intentionally thin. A route handler should:

  1. Extract and validate request data (FastAPI/Pydantic does most of this)
  2. Call a domain service
  3. Return a response model or raise an HTTP error

Business logic does not belong in route handlers. If you find yourself writing conditional logic or database queries directly in a route, that logic belongs in a service instead.
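The split described above can be sketched without any framework, so the example runs on its own. Every name here is hypothetical (the create_project method, DuplicateProjectError, the handler name, and the (status, body) return shape); in the real app the handler would be a FastAPI route and the failure would be raised as an HTTP error rather than returned as a tuple.

```python
from dataclasses import dataclass, field


class DuplicateProjectError(Exception):
    """Domain error: raised by the service, never decided by the route."""


@dataclass
class ProjectService:
    """Hypothetical stand-in for a domain service; it owns the business rule."""

    _names: set[str] = field(default_factory=set)

    def create_project(self, name: str) -> dict:
        # Normalization and the uniqueness rule live here, not in the route.
        normalized = name.strip().lower()
        if normalized in self._names:
            raise DuplicateProjectError(normalized)
        self._names.add(normalized)
        return {"name": normalized}


def create_project_route(payload: dict, service: ProjectService) -> tuple[int, dict]:
    """Thin handler: read input, call the service, map the outcome to a response."""
    try:
        project = service.create_project(payload["name"])
    except DuplicateProjectError:
        return 409, {"error": "project_exists"}
    return 201, project
```

The handler contains no rule about what makes a project valid; it only translates between the transport and the service, which keeps the service testable without an HTTP client.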

app/api/routes.py assembles all route modules and mounts them under /v1.

The domain layer (app/projects/, app/auth/, etc.)

Domain services own state transitions and business rules. The most important one is ProjectService in app/projects/service.py, which is the single source of truth for projects, repositories, scans, and findings.

Some key responsibilities of ProjectService:

Method area            What it does
Project CRUD           Create, list, update, delete workspace projects
Repository management  Add repositories, trigger ingest, track ingest status
Source ingest          Fetch git repos, extract archives, store snapshots
Threat profile         Create and update threat profiles for repositories
Scan creation          Create scan records, dispatch to queue or thread
Scan claiming          Lock a scan row in Postgres to prevent double-execution
Scan execution         Call V16ServiceAdapter.scan_source(), handle events
Finding persistence    Upsert findings from scan engine events
Artifact upload        Write scan reports and debug bundles to S3
Stale scan recovery    Detect and recover stuck or crashed scans
ECS management         Launch, stop, and describe runner tasks

The storage layer (app/storage/)

Storage abstractions let the application code remain the same regardless of whether you're running locally with JSON files or in production with Postgres and S3.
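As an illustration of that idea, the sketch below writes one piece of domain code against a small interface and runs it against two backends. The ProjectStore protocol, both store classes, and rename_project are hypothetical names made up for this example; the in-memory class merely stands in for a production backend, and the real storage modules may be shaped differently.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional, Protocol


class ProjectStore(Protocol):
    """Hypothetical interface the domain layer depends on."""

    def save(self, project_id: str, data: dict) -> None: ...
    def load(self, project_id: str) -> Optional[dict]: ...


class JsonFileStore:
    """Local backend: one JSON file per project."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, project_id: str, data: dict) -> None:
        (self.root / f"{project_id}.json").write_text(json.dumps(data))

    def load(self, project_id: str) -> Optional[dict]:
        path = self.root / f"{project_id}.json"
        return json.loads(path.read_text()) if path.exists() else None


class InMemoryStore:
    """Stand-in for a production backend such as Postgres."""

    def __init__(self) -> None:
        self._rows: dict[str, dict] = {}

    def save(self, project_id: str, data: dict) -> None:
        self._rows[project_id] = dict(data)

    def load(self, project_id: str) -> Optional[dict]:
        return self._rows.get(project_id)


def rename_project(store: ProjectStore, project_id: str, name: str) -> dict:
    """Domain code: written once, unaware of which backend it is given."""
    project = store.load(project_id) or {"id": project_id}
    project["name"] = name
    store.save(project_id, project)
    return project


results = [
    rename_project(InMemoryStore(), "p1", "demo"),
    rename_project(JsonFileStore(Path(tempfile.mkdtemp())), "p1", "demo"),
]
```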

Module                   What it abstracts
app/storage/postgres.py  All Postgres operations (a single PostgresStore class)
app/storage/s3.py        S3 upload, download, and presigned URL generation
app/storage/archive.py   Safe archive extraction with path traversal and size checks
app/storage/migrations/  SQL files that evolve the Postgres schema

Application code calls storage methods; it never issues raw SQL or makes boto3 calls directly.
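The path traversal and size checks listed for app/storage/archive.py could look roughly like the following. This is a sketch under assumptions, not the module's code: safe_extract, UnsafeArchiveError, the 50 MiB limit, and the make_zip helper are all invented for illustration, and it handles zip files only.

```python
import io
import tempfile
import zipfile
from pathlib import Path

MAX_TOTAL_BYTES = 50 * 1024 * 1024  # hypothetical limit


class UnsafeArchiveError(Exception):
    """Raised when an archive fails a safety check; nothing is written."""


def safe_extract(archive: zipfile.ZipFile, dest: Path,
                 max_total: int = MAX_TOTAL_BYTES) -> list[Path]:
    dest = dest.resolve()
    members = archive.infolist()

    # Size check on declared uncompressed sizes. A stricter version would
    # also count bytes while streaming, since headers can be forged.
    if sum(m.file_size for m in members) > max_total:
        raise UnsafeArchiveError("archive exceeds size limit")

    # Validate every path before writing anything, so a bad member
    # cannot leave a half-extracted tree behind.
    targets = []
    for member in members:
        target = (dest / member.filename).resolve()
        if not target.is_relative_to(dest):
            raise UnsafeArchiveError(f"path escapes destination: {member.filename}")
        targets.append((member, target))

    written = []
    for member, target in targets:
        if member.is_dir():
            target.mkdir(parents=True, exist_ok=True)
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(archive.read(member))
        written.append(target)
    return written


def make_zip(entries: dict[str, bytes]) -> zipfile.ZipFile:
    """Build an in-memory zip for demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    buf.seek(0)
    return zipfile.ZipFile(buf)


dest = Path(tempfile.mkdtemp())
extracted = safe_extract(make_zip({"src/main.py": b"print('hi')"}), dest)
```

Resolving each target and comparing it with the resolved destination rejects both `../` sequences and absolute member names.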

Full package map

app/
├── main.py                    FastAPI app creation, CORS, middleware, router mounting
├── api/
│   ├── routes.py              Assembles all route modules under /v1
│   ├── health.py              GET /v1/healthz and /v1/readyz
│   ├── auth.py                POST /v1/auth/login, refresh, logout, me
│   ├── projects.py            Project CRUD and rollup views
│   ├── repositories.py        Repository workflow: ingest, threat profile, scans, findings
│   ├── sessions.py            Legacy upload/analyze endpoints
│   ├── api_keys.py            API key management
│   ├── billing.py             Usage summary
│   ├── git_upload.py          Temporary git upload metadata
│   └── hardening.py           /v1/ops/* limits, workers, cleanup
├── auth/
│   ├── service.py             AuthService — login/logout, custom vs Cognito
│   ├── cognito.py             Cognito JWT validation via JWKS
│   ├── dependencies.py        require_current_user FastAPI dependency
│   ├── tokens.py              HMAC token creation and verification (local auth)
│   └── models.py              Pydantic auth models (LoginRequest, etc.)
├── projects/
│   ├── service.py             ProjectService — core domain (~4500 lines)
│   ├── models.py              Pydantic request/response models
│   ├── fetcher.py             Git clone, zip extract, source tree
│   ├── v16_adapter.py         Bridge between ProjectService and v16 engine
│   ├── runner.py              ECS RunTask launcher/stopper for scan runners
│   └── codex_container_manager.py  Local Docker Codex runs
├── storage/
│   ├── postgres.py            PostgresStore — all DB operations
│   ├── s3.py                  S3 client wrapper
│   ├── archive.py             Safe archive extraction
│   └── migrations/            SQL migration files (001–005)
├── queues/
│   └── sqs.py                 SQS enqueue/receive/delete
├── llm_proxy/
│   ├── main.py                Proxy FastAPI app
│   ├── tokens.py              Scan-scoped token issuance and validation
│   └── usage.py               Per-scan usage tracking
├── events/
│   ├── service.py             Append-only event log
│   └── models.py              Event model
├── sessions/         (legacy — see note below)
├── api_keys/
├── billing/
├── hardening/
│   ├── limits.py              Quota checks
│   ├── workers.py             Worker heartbeat registry
│   └── cleanup.py             Stale artifact removal
└── core/
    ├── settings.py            All VEGA_* configuration
    ├── errors.py              api_error() helper — structured error envelope
    └── logging.py             JSON logging and request middleware
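For readers unfamiliar with the HMAC tokens mentioned for auth/tokens.py in the map above, here is a minimal sketch of signing and verifying such a token with the standard library. The token layout (base64 payload, a dot, base64 signature), the claim names, and both function names are assumptions made for this example; the actual module may use a different format.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    # Restore the padding that _b64 stripped.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def create_token(secret: bytes, user_id: str, ttl_seconds: int,
                 now: Optional[float] = None) -> str:
    issued = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "exp": issued + ttl_seconds},
                         sort_keys=True).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return f"{_b64(payload)}.{_b64(signature)}"


def verify_token(secret: bytes, token: str,
                 now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the token is authentic and unexpired, else None."""
    try:
        payload_part, signature_part = token.split(".")
        payload = _unb64(payload_part)
        signature = _unb64(signature_part)
    except ValueError:  # wrong shape or invalid base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(signature, expected):
        return None
    claims = json.loads(payload)
    current = time.time() if now is None else now
    return claims["sub"] if claims["exp"] > current else None
```

The payload is parsed only after the signature check passes, so unauthenticated input never reaches the JSON decoder.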

Legacy sessions

The app/sessions/ package and /v1/sessions/ routes implement an older upload/analyze flow. New features should use the project/repository model. The legacy path is still supported but is not the current primary workflow.