Projects and Repositories

Projects and repositories are the foundation of the Vega domain model. Almost every other concept — scans, findings, events — exists in the context of a repository inside a project.

Concepts

A project is a workspace. Users create projects to group related security work. A project contains one or more repositories.

A repository is a source code target. It can be:
- A Git repository specified by URL, which the backend clones
- A zip/tar archive uploaded by the user, which the backend extracts

After a repository is added, it goes through an ingest process: the source code is fetched, an immutable snapshot is stored, and the repository moves to ready status. Only ready repositories can be scanned.
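The rule that only ready repositories can be scanned can be sketched as a small guard. This is an illustrative sketch, not the backend's code: the names IngestStatus, RepositoryNotReadyError, and ensure_scannable are hypothetical, and only the ingesting and ready status values appear in this document.

```python
from enum import Enum


class IngestStatus(Enum):
    # Only "ingesting" and "ready" are named in this document;
    # the real model may define more states.
    INGESTING = "ingesting"
    READY = "ready"


class RepositoryNotReadyError(Exception):
    """Raised when a scan is requested before ingest has finished."""


def ensure_scannable(status: IngestStatus) -> None:
    # A scan needs the immutable snapshot, which exists only once
    # the repository has reached the ready status.
    if status is not IngestStatus.READY:
        raise RepositoryNotReadyError(
            f"repository is {status.value!r}; scans require 'ready'"
        )
```

A caller would run this check before creating a scan record, so that a scan never starts against a missing snapshot.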

See the Data Model for the full entity hierarchy.

The ingest flow

Add repository
    ↓
Fetch source
  ├── Git URL → clone with app/projects/fetcher.py
  └── Zip upload → extract with app/uploads/service.py + app/storage/archive.py
    ↓
Store snapshot
  ├── Local mode → save to data/snapshots/
  └── S3 mode → upload zip to S3 source bucket
    ↓
Repository status = ready
    ↓
User edits threat profile
    ↓
User creates scan

Archive safety — when users upload zip archives, the backend enforces limits to prevent zip bomb attacks and path traversal exploits. See Source Ingest for details.
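To make the two threats concrete, here is a minimal sketch of the kind of checks such validation might perform, using only the Python standard library. It is not the code in app/storage/archive.py: the limits, the UnsafeArchiveError class, and the validate_zip function are assumptions chosen for illustration.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical limits for illustration; the backend's real values may differ.
MAX_ENTRIES = 10_000
MAX_TOTAL_UNCOMPRESSED = 500 * 1024 * 1024  # 500 MiB
MAX_COMPRESSION_RATIO = 100


class UnsafeArchiveError(ValueError):
    """The archive failed a safety check and must not be extracted."""


def validate_zip(data: bytes) -> list[str]:
    """Return the member names if the archive passes every check."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if len(infos) > MAX_ENTRIES:
            raise UnsafeArchiveError("too many entries")

        total = 0
        for info in infos:
            # Path traversal: reject absolute paths and any ".." component.
            path = PurePosixPath(info.filename.replace("\\", "/"))
            if path.is_absolute() or ".." in path.parts:
                raise UnsafeArchiveError(f"unsafe path: {info.filename}")

            # Zip bomb: cap the declared total size and the per-entry ratio.
            total += info.file_size
            if total > MAX_TOTAL_UNCOMPRESSED:
                raise UnsafeArchiveError("uncompressed size limit exceeded")
            if info.file_size > MAX_COMPRESSION_RATIO * info.compress_size:
                raise UnsafeArchiveError(f"suspicious ratio: {info.filename}")

        return [info.filename for info in infos]
```

The sizes checked here are the ones declared in the archive's own headers, so a stricter implementation would also count bytes while extracting and abort once a limit is crossed.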

ProjectService

app/projects/service.py is the central service for all project and repository operations. It's a large file (~4500 lines) but well-organized by concern.

Key method areas:

# Project operations
create_project(user, request) → Project
list_projects(user) → List[Project]
update_project(user, project_id, request) → Project
delete_project(user, project_id) → None

# Repository operations
create_repository(user, project_id, request) → Repository
get_repository(user, repo_id) → Repository
trigger_ingest(user, repo_id) → None

# Scan operations
create_scan(user, repo_id, request) → Scan
cancel_scan(user, scan_id) → None
run_claimed_scan_by_id(scan_id, settings) → None  ← called by runner

# Finding operations
list_findings(user, repo_id, filters) → List[Finding]
get_finding(user, finding_id) → Finding

Route structure

Route module	What it handles
`app/api/projects.py`	Project CRUD, project-level scan views, project-level findings views
`app/api/repositories.py`	Repository add/edit, ingest status, source tree, threat profile, scans, findings

Example workflow

Here's the full sequence of requests a new user makes to run their first scan:

POST /v1/projects                    ← create a project
POST /v1/projects/:id/repositories   ← add a repository
GET  /v1/repositories/:id            ← wait for ingest_status = ready
PUT  /v1/repositories/:id/threat-profile  ← describe what to look for
POST /v1/repositories/:id/scans      ← start a scan
GET  /v1/repositories/:id/scans      ← poll scan status
GET  /v1/repositories/:id/findings   ← read findings when complete
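Two steps in this flow are waits: one for ingest to finish and one for the scan to complete. A client might wrap both in a small polling helper like the sketch below. The helper name wait_for_status, the "failed" status value, and the default interval and timeout are assumptions; the status callback stands in for whatever HTTP client issues the GET request.

```python
import time
from typing import Callable


def wait_for_status(
    fetch_status: Callable[[], str],
    target: str = "ready",
    failed: frozenset = frozenset({"failed"}),  # hypothetical failure status
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until it returns target, or raise on failure/timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status == target:
            return status
        if status in failed:
            raise RuntimeError(f"reached terminal status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(interval)
```

For the ingest step, fetch_status would call GET /v1/repositories/:id and return the ingest_status field of the response. Injecting the sleep function keeps the helper easy to test without real delays.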

Debugging

Repository stuck in ingesting

- Check API logs for ingest errors.
- If using S3, confirm the API task role has s3:PutObject on the source bucket.
- Check app/projects/fetcher.py for git clone errors (bad URL, missing credentials, etc.).
- For zip uploads, check archive validation errors in app/storage/archive.py.

Scan not progressing after creation

- Check VEGA_SCAN_EXECUTION_MODE. If external or sqs, a worker must be running.
- Check the scan record status in the database.
- If status is queued, check the worker (see Scan Lifecycle).
- If status is claimed, check the runner task.
- If status is running for longer than expected, the runner may have crashed silently; check CloudWatch logs.
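The checklist above amounts to a lookup from scan status to the next place to look. As a rough sketch, it could be written as the helper below; next_check is a hypothetical function for illustration, not part of the codebase, and it covers only the statuses and execution modes named in this section.

```python
# Execution modes that, per the checklist, need a separate worker process.
WORKER_MODES = {"external", "sqs"}


def next_check(scan_status: str, execution_mode: str) -> str:
    """Suggest where to look next for a scan that is not progressing."""
    if scan_status == "queued":
        if execution_mode in WORKER_MODES:
            return "confirm a worker is running, then check the worker"
        return "check the worker"
    if scan_status == "claimed":
        return "check the runner task"
    if scan_status == "running":
        return "the runner may have crashed silently; check CloudWatch logs"
    # Statuses not covered by the checklist fall through to the database.
    return "inspect the scan record in the database"
```

For example, next_check("queued", "sqs") points at the worker first, because in that mode nothing picks up the scan unless a worker is running.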