Cambria VCS Implementation Guide

Last Updated: 2025-12-19

This document provides a comprehensive guide to the implementation of the Cambria version control system in Go. It is intended for developers working on the Cambria project and for AI agents tasked with its maintenance and extension.

1. Design Philosophy & Key Principles

Cambria is a re-implementation of the core concepts of Fossil SCM in pure Go. It adheres to a set of guiding principles that emphasize simplicity, correctness, and robustness.

  1. Minimal and Correct: Prioritize the simplest correct implementation.
  2. SQLite-Backed: A single *.db file serves as the atomic source of truth for the entire repository. This includes all versioned files, historical metadata, and project configuration.
  3. Content-Addressable Storage: All artifacts (files, manifests) are identified by a unique SHA-256 hash of their content. This ensures data integrity and provides a canonical identifier (UUID) for every piece of data.
  4. Immutable Artifacts: Once an artifact (a "blob") is written to the repository, it is never changed. New versions are created as new blobs.
  5. Transactional Integrity: All database operations that modify state are executed within a transaction. The store.DBTX interface abstracts *sql.DB and *sql.Tx to enforce this at the data access layer.
  6. Repository Pattern: High-level VCS operations are exposed via the vcs.Repository struct, which encapsulates the database connection and provides a clean API for all version control logic.
  7. Go Idioms: The project favors the Go standard library and minimizes external dependencies.
  8. Standard Testing: All functionality is tested using the standard testing package.
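
Principles 3 and 4 can be illustrated with a minimal sketch using only the standard library. ContentID, BlobStore, and Put are hypothetical names chosen for this example and are not taken from pkg/hash or pkg/store; the in-memory map stands in for the blob table.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ContentID returns the lowercase hex SHA-256 digest of content.
// (Hypothetical helper; the real pkg/hash API may differ.)
func ContentID(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// BlobStore is an in-memory stand-in for the blob table.
type BlobStore struct {
	blobs map[string][]byte
}

// NewBlobStore returns an empty store.
func NewBlobStore() *BlobStore {
	return &BlobStore{blobs: make(map[string][]byte)}
}

// Put stores content under its hash and returns that hash.
// Existing entries are never overwritten, so blobs stay immutable
// and identical content is stored only once.
func (s *BlobStore) Put(content []byte) string {
	id := ContentID(content)
	if _, exists := s.blobs[id]; !exists {
		s.blobs[id] = append([]byte(nil), content...) // defensive copy
	}
	return id
}

func main() {
	store := NewBlobStore()
	v1 := store.Put([]byte("hello\n"))
	dup := store.Put([]byte("hello\n"))  // same content, same ID
	v2 := store.Put([]byte("hello!\n")) // new version, new blob

	fmt.Println(v1 == dup, v1 == v2, len(store.blobs))
}
```

Because the identifier is derived from the content, a "change" to a file can only ever produce a new blob with a new identifier; the old one remains addressable.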

2. Core Architecture

vcs.Repository

The vcs.Repository struct is the primary entry point for all version control logic. It encapsulates the database connection and provides methods for all high-level operations.

// pkg/vcs/repo.go
type Repository struct {
    db *store.DB
}

// Usage:
repo, err := vcs.InitRepository("path/to/my-repo.db")
// or
repo, err := vcs.OpenRepository("path/to/my-repo.db")

// Perform operations:
uuid, err := repo.Checkin(...)
err = repo.Checkout(...)

store.DBTX Interface

To guarantee transactional integrity, all functions in the pkg/store package that modify the database accept a DBTX interface rather than a concrete handle. The same function can therefore run on its own (with a *store.DB) or as one step of a larger atomic write (with a *sql.Tx).

// pkg/store/dbtx.go
type DBTX interface {
    Exec(query string, args ...interface{}) (sql.Result, error)
    Query(query string, args ...interface{}) (*sql.Rows, error)
    QueryRow(query string, args ...interface{}) *sql.Row
}

// Transactional Write Pattern:
tx, err := repo.DB().Begin()
if err != nil {
    return err
}
defer tx.Rollback() // Rolls back on any early return; has no effect after a successful Commit

// ... call store functions with the transaction object ...
if err := store.CreateManifest(tx, rid, false); err != nil {
    return err
}
// ...

return tx.Commit() // Commits all changes atomically

3. Package Structure

cambria/
├── pkg/
│   ├── hash/           # Content hashing (SHA-256)
│   ├── store/          # SQLite data access layer (uses DBTX)
│   │   ├── db.go
│   │   ├── dbtx.go     # The transactional interface
│   │   ├── schema.go   # Includes vfile and vmerge tables
│   │   ├── blob.go
│   │   ├── manifest.go
│   │   ├── mlink.go
│   │   ├── plink.go
│   │   └── label.go
│   ├── artifact/       # Manifest parsing/generation
│   │   └── manifest.go
│   └── vcs/            # High-level version control operations
│       ├── repo.go                     # Repository struct and lifecycle
│       ├── checkin.go                  # Commit operation (uses vfile)
│       ├── checkout.go                 # Checkout operation (populates vfile)
│       ├── add.go                      # File addition (updates vfile)
│       ├── diff.go                     # Diff computation between versions
│       ├── log.go                      # Timeline/history operations
│       ├── label.go                    # Branch and tag management
│       ├── workdir.go                  # Working directory operations (Remove, Rename)
│       ├── merge.go                    # Three-way merge implementation
│       ├── vfile.go                    # VFILE system implementation
│       └── ... (test files)
├── internal/
│   └── testutil/       # Test helpers
├── cmd/
│   └── cambria/        # CLI application
│       ├── main.go
│       ├── common.go   # Shared utilities (ResolveVersion, etc.)
│       ├── init.go
│       ├── open.go
│       ├── close.go
│       ├── add.go
│       ├── rm.go
│       ├── mv.go
│       ├── commit.go
│       ├── checkout.go
│       ├── status.go
│       ├── diff.go
│       ├── log.go
│       ├── branch.go
│       ├── tag.go
│       └── merge.go
└── doc_cambria/
    └── CAMBRIA_VCS_IMPL.md # This file

4. Core Data Model (Schema)

Schema creation is idempotent: every table is declared with CREATE TABLE IF NOT EXISTS, so initialization can safely be re-run against an existing repository.

-- Immutable artifact storage
CREATE TABLE IF NOT EXISTS blob(
    rid INTEGER PRIMARY KEY AUTOINCREMENT,
    uuid TEXT UNIQUE NOT NULL,
    size INTEGER NOT NULL,
    content BLOB NOT NULL
);

-- Manifest identification (a manifest is a special type of blob)
CREATE TABLE IF NOT EXISTS manifest(
    rid INTEGER PRIMARY KEY REFERENCES blob(rid),
    is_merge BOOLEAN DEFAULT 0
);

-- Manifest-file linkage (which files are in which manifest)
CREATE TABLE IF NOT EXISTS mlink(
    manifest INTEGER NOT NULL REFERENCES manifest(rid),
    fn TEXT NOT NULL,
    fid INTEGER NOT NULL REFERENCES blob(rid),
    PRIMARY KEY(manifest, fn)
);

-- Parent-child DAG (the commit graph)
CREATE TABLE IF NOT EXISTS plink(
    parent INTEGER NOT NULL REFERENCES manifest(rid),
    child INTEGER NOT NULL REFERENCES manifest(rid),
    PRIMARY KEY(parent, child)
);

-- Labels (branches and tags)
CREATE TABLE IF NOT EXISTS label(
    manifest INTEGER NOT NULL REFERENCES manifest(rid),
    name TEXT NOT NULL,
    PRIMARY KEY(manifest, name)
);

-- Working directory file state tracking (VFILE)
CREATE TABLE IF NOT EXISTS vfile(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    vid INTEGER NOT NULL REFERENCES manifest(rid), -- Baseline manifest
    rid INTEGER REFERENCES blob(rid),             -- Baseline file content
    mrid INTEGER REFERENCES blob(rid),            -- Merged file content
    pathname TEXT NOT NULL COLLATE NOCASE,
    origname TEXT COLLATE NOCASE,                 -- Original name if renamed
    is_exe BOOLEAN NOT NULL DEFAULT 0,
    is_link BOOLEAN NOT NULL DEFAULT 0,
    chnged INTEGER NOT NULL DEFAULT 0,            -- Change status
    deleted BOOLEAN NOT NULL DEFAULT 0,
    mhash TEXT,
    mtime INTEGER,
    size INTEGER,
    UNIQUE(vid, pathname)
);

-- Merge state tracking (VMERGE)
CREATE TABLE IF NOT EXISTS vmerge(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    merge INTEGER NOT NULL REFERENCES manifest(rid), -- Merge-in manifest
    mhash TEXT NOT NULL,
    merge_type INTEGER NOT NULL DEFAULT 0,
    is_baseline BOOLEAN NOT NULL DEFAULT 0,
    UNIQUE(merge)
);

-- Temporary snapshot for merge abort
CREATE TABLE IF NOT EXISTS vfile_snapshot(...);

5. Implemented Features

Core Version Control

VFILE System for Working Directory Management

Inspired by Fossil's vfile.c, Cambria uses a set of SQLite tables (vfile, vmerge) to efficiently track the state of the working directory. This avoids expensive full-directory scans for operations like status or commit.

Command-Line Interface (CLI)

A full-featured CLI is implemented using the urfave/cli/v3 framework.

Advanced Features

7. Fossil Module to Cambria Package Mapping

Fossil Module     Cambria Package            Responsibility
----------------  -------------------------  -------------------------------------------
src/content.c     pkg/store/blob.go          Content storage
src/manifest.c    pkg/artifact/manifest.go   Manifest parsing and generation
src/db.c          pkg/store/                 Database operations (via DBTX)
src/checkin.c     pkg/vcs/checkin.go         High-level commit creation
src/checkout.c    pkg/vcs/checkout.go        High-level checkout to filesystem
src/add.c         pkg/vcs/add.go             File addition to version control
src/diff.c        pkg/vcs/diff.go            Diff computation
src/vfile.c       pkg/vcs/vfile.go           VFILE system for working directory tracking
src/rm.c, mv.c    pkg/vcs/workdir.go         File removal and rename operations
src/timeline.c    pkg/vcs/log.go             History and log generation
src/branch.c      pkg/vcs/label.go           Branch and tag management
src/merge.c       pkg/vcs/merge.go           Three-way merge logic

8. Critical Reminders for AI Agents

  1. Use the Repository API: Do not call pkg/store functions directly for VCS operations. Use the methods on the vcs.Repository struct.
  2. Embrace Transactions: When adding new multi-step database logic, use the tx, err := repo.DB().Begin() pattern.
  3. CGo is Required: The SQLite driver requires CGo. Ensure CGO_ENABLED=1.
  4. Test Everything: All new functionality must be accompanied by tests in the same package. Use the setupTestRepo helper in pkg/vcs for integration-style tests.
  5. Path Security: Always validate user-provided file paths to prevent directory traversal attacks.
  6. VFILE origname Field: When implementing file operations, use the origname field in the vfile table to track original filenames for renamed files.
  7. Label Prefixes: Branch and tag names are stored internally with prefixes ("branch:" and "tag:"). Always use the VCS API functions (CreateBranch, CreateTag) to manage labels, not raw SQL.
  8. Map Key Selection: When building maps from database queries, be cautious. Using non-unique values (like a content RID, which can be shared by multiple files) as map keys will lead to silently overwritten entries. Prefer map[filename]rid over map[rid]filename.
  9. Test with VCS APIs: When writing tests, always use the proper VCS API functions (e.g., repo.CreateBranch()) rather than raw SQL inserts. The VCS layer applies important transformations (like label prefixes) that raw SQL bypasses.

9. Development Commands

# Build all packages
go build ./...

# Run all tests
go test ./...

# Run tests with coverage
go test -cover ./...

# Run tests with race detector (CRITICAL before submitting)
go test -race ./...

# Format code
go fmt ./...

# Static analysis
go vet ./...