ThetaNil

Cambria Phase 0: Project Setup and Goals

Project Overview

Cambria is a version control system implemented in Go that aims to achieve feature parity with Fossil SCM's core version control and data model capabilities. Cambria focuses on the version control library and data persistence layer, deferring web features (wiki, ticketing, forums) to future phases.

Design Philosophy

Technology Choices

Fossil SCM        Cambria           Rationale
C                 Go                Memory safety, concurrency, modern tooling
TH1/TCL           Starlark          Sandboxed, deterministic scripting
Custom HTTP       Chi router        Idiomatic HTTP routing (future phase)
Custom templates  HTMX + templates  Progressive enhancement (future phase)
SQLite C API      mattn/go-sqlite3  database/sql driver that binds SQLite via CGo
Custom CLI        urfave/cli        Standard Go CLI framework

Phase 0 Goals

  1. Project Structure: Establish Go module layout and package organization
  2. Dependencies: Set up required external dependencies
  3. Data Model: Implement minimal SQLite schema
  4. Build System: Configure build, test, and development workflow
  5. Documentation: Foundation for future development phases

Non-Goals (Deferred to Later Phases)

Project Structure

cambria/
├── go.mod                      # Go module definition
├── go.sum                      # Dependency checksums
├── README.md                   # Project documentation
├── LICENSE                     # License file
├── Makefile                    # Build automation (optional)
│
├── cmd/
│   └── cambria/                # CLI application entry point
│       └── main.go             # (Phase 1+)
│
├── pkg/
│   ├── hash/                   # Content hashing utilities
│   │   ├── hash.go             # Hash computation and verification
│   │   └── hash_test.go
│   │
│   ├── store/                  # SQLite data access layer
│   │   ├── db.go               # Database connection management
│   │   ├── db_test.go
│   │   ├── schema.go           # Schema creation and migrations
│   │   ├── blob.go             # Blob storage operations
│   │   ├── blob_test.go
│   │   ├── manifest.go         # Manifest storage operations
│   │   ├── manifest_test.go
│   │   ├── mlink.go            # Manifest-file linkage
│   │   ├── mlink_test.go
│   │   ├── plink.go            # Parent-child linkage
│   │   ├── plink_test.go
│   │   └── label.go            # Tag/branch labels
│   │
│   ├── artifact/               # Artifact parsing and generation
│   │   ├── manifest.go         # Manifest format parsing/serialization
│   │   ├── manifest_test.go
│   │   └── canonicalize.go     # Manifest canonicalization
│   │
│   ├── vcs/                    # Version control operations (Phase 1)
│   │   ├── repo.go             # Repository abstraction
│   │   ├── checkin.go          # Commit operations
│   │   ├── checkout.go         # Checkout operations
│   │   ├── diff.go             # Diff operations
│   │   └── merge.go            # Merge operations
│   │
│   ├── delta/                  # Delta compression (Phase 2)
│   │   ├── delta.go            # Delta algorithm
│   │   └── delta_test.go
│   │
│   └── script/                 # Starlark integration (Phase 2)
│       ├── runtime.go          # Starlark execution environment
│       └── builtins.go         # Custom Starlark functions
│
├── internal/                   # Private packages
│   ├── testutil/               # Test utilities and helpers
│   │   ├── fixtures.go         # Test fixtures
│   │   └── tempdir.go          # Temporary directory management
│   │
│   └── fileutil/               # File system utilities
│       ├── walk.go             # Directory traversal
│       └── hash.go             # File hashing
│
└── testdata/                   # Test fixtures and golden files
    ├── manifests/              # Sample manifest files
    └── repos/                  # Sample repository states

Dependencies

Core Dependencies

module github.com/yourusername/cambria

go 1.21

require (
 github.com/mattn/go-sqlite3 v1.14.18
 go.starlark.net v0.0.0-20231121155337-90ade8b19d09
)

Future Dependencies (Phase 1+)

require (
 github.com/urfave/cli/v2 v2.27.1      // CLI framework
 github.com/go-chi/chi/v5 v5.0.11      // HTTP router (web phase)
)

Dependency Justification

  1. mattn/go-sqlite3 (v1.14.18+)

    • Go database/sql driver for SQLite, built with CGo (not pure Go)
    • Stable, widely-used, well-maintained
    • Full SQLite feature support including transactions, indexes, foreign keys
  2. go.starlark.net (latest)

    • Google's Go implementation of Starlark (Python dialect)
    • Deterministic, sandboxed execution
    • No file I/O or network access by default
    • Replaces Fossil's TH1/TCL for hooks and scripting
  3. urfave/cli/v2 (Phase 1+)

    • Ergonomic CLI framework
    • Subcommand support
    • Flag parsing
    • Help generation
  4. go-chi/chi/v5 (Future)

    • Lightweight, idiomatic HTTP router
    • Middleware support
    • Compatible with net/http

Minimal Schema (Phase 0)

Based on CAMBRIA_DATA_MODEL_DESIGN.md, implement the core tables:

-- Immutable artifact storage
CREATE TABLE blob(
    rid INTEGER PRIMARY KEY AUTOINCREMENT,
    uuid TEXT UNIQUE NOT NULL,
    size INTEGER NOT NULL,
    content BLOB NOT NULL
);
CREATE INDEX blob_uuid ON blob(uuid);

-- Manifest identification
CREATE TABLE manifest(
    rid INTEGER PRIMARY KEY REFERENCES blob(rid),
    is_merge BOOLEAN DEFAULT 0
);

-- Manifest file linkage (path -> file content)
CREATE TABLE mlink(
    manifest INTEGER NOT NULL REFERENCES manifest(rid),
    fn TEXT NOT NULL,
    fid INTEGER NOT NULL REFERENCES blob(rid),
    PRIMARY KEY(manifest, fn)
);
CREATE INDEX mlink_manifest ON mlink(manifest);
CREATE INDEX mlink_fn ON mlink(fn);
CREATE INDEX mlink_fid ON mlink(fid);

-- Parent-child DAG
CREATE TABLE plink(
    parent INTEGER NOT NULL REFERENCES manifest(rid),
    child INTEGER NOT NULL REFERENCES manifest(rid),
    PRIMARY KEY(parent, child)
);
CREATE INDEX plink_child ON plink(child);
CREATE INDEX plink_parent ON plink(parent);

-- Branch and tag labels
CREATE TABLE label(
    manifest INTEGER NOT NULL REFERENCES manifest(rid),
    name TEXT NOT NULL,
    PRIMARY KEY(manifest, name)
);
CREATE INDEX label_name ON label(name);

-- Repository metadata
CREATE TABLE config(
    key TEXT PRIMARY KEY,
    value TEXT NOT NULL
);
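One way schema.go could apply these statements is sketched below. This is a minimal illustration, not the project's actual code: the names schemaStatements and ApplySchema are hypothetical, the statement list is abbreviated to two tables, and the IF NOT EXISTS clauses are an assumption added so that the function can be re-run safely. The function takes a callback instead of a database handle so that the sketch depends only on the standard library.

```go
package main

import (
	"errors"
	"fmt"
)

// schemaStatements is an abbreviated copy of the Phase 0 schema.
// IF NOT EXISTS is an assumption of this sketch, added for idempotence.
var schemaStatements = []string{
	`CREATE TABLE IF NOT EXISTS blob(
    rid INTEGER PRIMARY KEY AUTOINCREMENT,
    uuid TEXT UNIQUE NOT NULL,
    size INTEGER NOT NULL,
    content BLOB NOT NULL
)`,
	`CREATE TABLE IF NOT EXISTS config(
    key TEXT PRIMARY KEY,
    value TEXT NOT NULL
)`,
}

// ApplySchema (hypothetical name) runs each statement in order through exec
// and stops at the first failure, reporting which statement failed.
func ApplySchema(exec func(stmt string) error) error {
	if exec == nil {
		return errors.New("ApplySchema: exec callback is nil")
	}
	for i, stmt := range schemaStatements {
		if err := exec(stmt); err != nil {
			return fmt.Errorf("schema statement %d: %w", i+1, err)
		}
	}
	return nil
}
```

With a *sql.DB or *sql.Tx in hand, the callback would typically wrap Exec, for example func(s string) error { _, err := tx.Exec(s); return err }. Running the statements inside one transaction keeps a partially created schema from being committed.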

Go Module Setup

Step 1: Initialize Module

cd cambria
go mod init github.com/yourusername/cambria

Step 2: Add Dependencies

go get github.com/mattn/go-sqlite3@v1.14.18
go get go.starlark.net@latest

Step 3: Verify CGo Setup

mattn/go-sqlite3 requires CGo, which in turn needs a C compiler (such as gcc or clang) on the PATH. Verify:

go env CGO_ENABLED  # Should be "1"

If not enabled:

export CGO_ENABLED=1

Step 4: Test Build

go build ./...
go test ./...

Testing Strategy

Principles

  1. Standard Library Only: Use only the standard testing package; no external test frameworks
  2. Table-Driven Tests: Prefer table-driven tests for multiple scenarios
  3. Test Fixtures: Use testdata/ for complex test inputs
  4. Temporary Directories: Create isolated temp directories per test
  5. No Mocks: Prefer real SQLite databases (in-memory or temp files)
  6. Parallel Tests: Enable t.Parallel() where safe

Example Test Structure

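package store_test

import (
    "bytes"
    "testing"

    "github.com/yourusername/cambria/internal/testutil"
    "github.com/yourusername/cambria/pkg/store"
)

func TestBlobStorage(t *testing.T) {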
    // Create temp database
    db, cleanup := testutil.NewTempDB(t)
    defer cleanup()
    
    tests := []struct {
        name    string
        content []byte
        wantErr bool
    }{
        {"empty", []byte{}, false},
        {"small", []byte("hello"), false},
        {"large", make([]byte, 1<<20), false},
    }
    
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            rid, uuid, err := store.WriteBlob(db, tt.content)
            if (err != nil) != tt.wantErr {
                t.Errorf("WriteBlob() error = %v, wantErr %v", err, tt.wantErr)
            }
            if err != nil {
                return
            }
            
            // Verify content retrieval
            got, err := store.ReadBlob(db, rid)
            if err != nil {
                t.Fatalf("ReadBlob() error = %v", err)
            }
            if !bytes.Equal(got, tt.content) {
                // Report lengths rather than dumping up to 1 MiB of raw bytes.
                t.Errorf("ReadBlob() returned %d bytes, want %d", len(got), len(tt.content))
            }
            
            // Verify UUID lookup
            rid2, err := store.FindBlobByUUID(db, uuid)
            if err != nil {
                t.Fatalf("FindBlobByUUID() error = %v", err)
            }
            if rid2 != rid {
                t.Errorf("FindBlobByUUID() = %v, want %v", rid2, rid)
            }
        })
    }
}

Development Workflow

Build Commands

# Build all packages
go build ./...

# Build CLI (Phase 1+)
go build -o bin/cambria ./cmd/cambria

# Run tests
go test ./...

# Run tests with coverage
go test -cover ./...

# Run tests with race detector
go test -race ./...

# Run specific package tests
go test ./pkg/store

# Verbose test output
go test -v ./...

# Run benchmarks
go test -bench=. ./...

Code Quality

# Format code
go fmt ./...

# Static analysis
go vet ./...

# Install golangci-lint
go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest

# Run linters
golangci-lint run

Useful Make targets (optional)

# Recipe lines must be indented with a tab, not spaces.
.PHONY: all build test test-short bench clean lint fmt deps

all: test build

build:
	go build -o bin/cambria ./cmd/cambria

test:
	go test -v -race -cover ./...

test-short:
	go test -short ./...

bench:
	go test -bench=. -benchmem ./...

clean:
	rm -rf bin/
	go clean -cache -testcache

lint:
	golangci-lint run

# gofumpt is a separate tool and must be installed first.
fmt:
	go fmt ./...
	gofumpt -w .

deps:
	go mod download
	go mod verify

Package Responsibilities (Phase 0)

pkg/hash

Purpose: Content-addressable hashing utilities.
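
A minimal sketch of what such utilities might look like, assuming SHA-256 (the recommendation under Open Questions) and lowercase hex digests as UUIDs. The function names HashContent and VerifyContent are hypothetical; this document does not specify the real pkg/hash API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// HashContent (hypothetical name) returns the lowercase hex SHA-256 digest
// of content, which this sketch treats as the artifact's UUID.
func HashContent(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// VerifyContent (hypothetical name) returns an error when content does not
// hash to wantUUID, for example after reading a blob back from storage.
func VerifyContent(content []byte, wantUUID string) error {
	if got := HashContent(content); got != wantUUID {
		return fmt.Errorf("hash mismatch: got %s, want %s", got, wantUUID)
	}
	return nil
}
```

Because the UUID is derived purely from the bytes, writing the same content twice yields the same UUID, which is what lets the blob table enforce uniqueness on that column.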

Key Functions:

Test Coverage:

pkg/store

Purpose: SQLite data access layer for all repository data.

Key Operations:

Test Coverage:

pkg/artifact

Purpose: Parse and generate manifest artifacts.

Key Functions:

Manifest Structure:

type Manifest struct {
    Files   []FileEntry  // F-lines
    Parents []string     // P-lines (UUIDs)
    Comment string       // C-line
    Labels  []string     // T-lines
}

type FileEntry struct {
    Path string
    UUID string  // File content hash
}
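
To illustrate how these structs could map to text, here is a hedged sketch of serialization and parsing. The line format is an assumption made for this example (one card per line, a single-letter type followed by space-separated fields, no whitespace inside paths, UUIDs, or labels, and a single-line comment); it is not taken from Fossil's manifest specification or from CAMBRIA_DATA_MODEL_DESIGN.md. Serialize and Parse are hypothetical names. Canonicalization here means sorting F-lines by path and T-lines by name while keeping parent order, so that equal manifests produce identical bytes and therefore identical hashes.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type FileEntry struct {
	Path string
	UUID string // File content hash
}

type Manifest struct {
	Files   []FileEntry // F-lines
	Parents []string    // P-lines (UUIDs)
	Comment string      // C-line
	Labels  []string    // T-lines
}

// Serialize (hypothetical) emits the C-line, F-lines sorted by path,
// P-lines in their given order, and T-lines sorted by name.
func Serialize(m Manifest) string {
	files := append([]FileEntry(nil), m.Files...) // copy so the caller's slice is untouched
	sort.Slice(files, func(i, j int) bool { return files[i].Path < files[j].Path })
	labels := append([]string(nil), m.Labels...)
	sort.Strings(labels)

	var b strings.Builder
	fmt.Fprintf(&b, "C %s\n", m.Comment)
	for _, f := range files {
		fmt.Fprintf(&b, "F %s %s\n", f.Path, f.UUID)
	}
	for _, p := range m.Parents {
		fmt.Fprintf(&b, "P %s\n", p)
	}
	for _, l := range labels {
		fmt.Fprintf(&b, "T %s\n", l)
	}
	return b.String()
}

// Parse (hypothetical) reads the text form produced by Serialize.
func Parse(text string) (Manifest, error) {
	var m Manifest
	for i, line := range strings.Split(strings.TrimSuffix(text, "\n"), "\n") {
		kind, rest, ok := strings.Cut(line, " ")
		if !ok {
			return Manifest{}, fmt.Errorf("line %d: malformed card %q", i+1, line)
		}
		switch kind {
		case "C":
			m.Comment = rest
		case "F":
			path, uuid, ok := strings.Cut(rest, " ")
			if !ok {
				return Manifest{}, fmt.Errorf("line %d: F-line needs a path and a UUID", i+1)
			}
			m.Files = append(m.Files, FileEntry{Path: path, UUID: uuid})
		case "P":
			m.Parents = append(m.Parents, rest)
		case "T":
			m.Labels = append(m.Labels, rest)
		default:
			return Manifest{}, fmt.Errorf("line %d: unknown card type %q", i+1, kind)
		}
	}
	return m, nil
}
```

A useful property to test is the round trip: Serialize(Parse(Serialize(m))) should equal Serialize(m) for any manifest that satisfies the stated assumptions.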

Test Coverage:

Configuration and Conventions

File Naming

Code Style

Error Handling

Logging

Success Criteria (Phase 0)

Phase 0 is complete when:

  1. ✅ Go module initialized with correct dependencies
  2. ✅ SQLite schema implemented and tested
  3. pkg/hash package implemented with tests
  4. pkg/store package implemented with comprehensive tests
  5. pkg/artifact package can parse and generate manifests
  6. ✅ All tests pass: go test ./...
  7. ✅ No race conditions detected: go test -race ./...
  8. ✅ Test coverage >80% for implemented packages
  9. ✅ Code passes go vet and golangci-lint
  10. ✅ Documentation comments on all exported functions

Next Phase Preview

Phase 1 will implement:

See CAMBRIA_PHASE_1.md for detailed Phase 1 planning.

References

Timeline Estimate

Assuming one experienced Go developer:

Total: ~10 days for Phase 0 completion.

Open Questions

  1. Hash Algorithm: SHA-256 (standard) vs BLAKE3 (faster)?

    • Recommendation: SHA-256 for compatibility, add BLAKE3 option later
  2. SQLite Pragmas: Which pragmas to enable by default?

    • Recommendation: WAL mode, foreign keys, synchronous=NORMAL
  3. Concurrency Model: Single-writer or multiple writers?

    • Recommendation: Single-writer for Phase 0 (SQLite allows only one writer at a time, even in WAL mode, though WAL lets readers proceed concurrently)
  4. Repository Location: Single file or directory?

    • Recommendation: Single .cambria file (like .fossil)
  5. Platform Support: Windows, Linux, macOS?

    • Recommendation: All three, CI for all platforms
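
For question 2, the recommended pragmas could be passed through the driver's connection string. The sketch below builds such a string with the standard library only. RepoDSN is a hypothetical helper name, and the parameter names _journal_mode, _foreign_keys, and _synchronous are the ones mattn/go-sqlite3 documents for its DSN as far as I am aware; they should be checked against the driver README for the pinned version.

```go
package main

import "net/url"

// RepoDSN (hypothetical name) returns a data source name that asks the
// SQLite driver for WAL journaling, foreign key enforcement, and
// synchronous=NORMAL. The path is used as given and is not escaped,
// so this sketch assumes it contains no '?' or '#' characters.
func RepoDSN(path string) string {
	q := url.Values{}
	q.Set("_journal_mode", "WAL")
	q.Set("_foreign_keys", "on")
	q.Set("_synchronous", "NORMAL")
	// Encode sorts parameters by key, so the result is deterministic.
	return "file:" + path + "?" + q.Encode()
}
```

The result would be handed to sql.Open("sqlite3", RepoDSN("repo.cambria")). For the single-writer recommendation in question 3, calling SetMaxOpenConns(1) on the write handle is one simple way to serialize writes within a process.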