Project Overview
Cambria is a version control system implemented in Go that aims to achieve feature parity with Fossil SCM's core version control and data model capabilities. Cambria focuses on the version control library and data persistence layer, deferring web features (wiki, ticketing, forums) to future phases.
Design Philosophy
- Minimal and Correct: Start with the simplest correct implementation
- SQLite-Backed: Use SQLite as the single source of truth for all versioned data
- Content-Addressable: All artifacts identified by cryptographic hash (SHA-256)
- Immutable Artifacts: Once written, artifacts never change
- Go Idioms: Embrace Go's simplicity, composition, and standard library
- Standard Testing: Use only Go's standard library testing package
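To make the content-addressable and immutable principles concrete, here is a minimal in-memory sketch. MemStore, NewMemStore, Put, and Get are hypothetical names used only for illustration; the actual persistence layer is the SQLite-backed pkg/store described later in this document.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// MemStore is a hypothetical in-memory stand-in for the blob table.
// It is an illustration only, not Cambria's storage layer.
type MemStore struct {
	ridByUUID map[string]int64 // SHA-256 hex digest -> record ID
	content   map[int64][]byte // record ID -> stored bytes
	nextRID   int64
}

// NewMemStore returns an empty store whose record IDs start at 1.
func NewMemStore() *MemStore {
	return &MemStore{
		ridByUUID: make(map[string]int64),
		content:   make(map[int64][]byte),
		nextRID:   1,
	}
}

// Put stores data under its SHA-256 digest. Storing identical bytes again
// returns the existing record instead of creating or modifying anything.
func (s *MemStore) Put(data []byte) (int64, string) {
	sum := sha256.Sum256(data)
	uuid := hex.EncodeToString(sum[:])
	if existing, ok := s.ridByUUID[uuid]; ok {
		return existing, uuid
	}
	rid := s.nextRID
	s.nextRID++
	s.content[rid] = append([]byte(nil), data...) // private copy: callers cannot mutate it
	s.ridByUUID[uuid] = rid
	return rid, uuid
}

// Get returns a copy of the stored bytes so the artifact itself stays immutable.
func (s *MemStore) Get(rid int64) ([]byte, bool) {
	data, ok := s.content[rid]
	if !ok {
		return nil, false
	}
	return append([]byte(nil), data...), true
}

func main() {
	s := NewMemStore()
	rid1, uuid := s.Put([]byte("hello"))
	rid2, _ := s.Put([]byte("hello")) // same content, same record
	fmt.Println(rid1 == rid2, uuid)
}
```

Because the identifier is derived from the bytes, deduplication and integrity checking fall out of the same lookup, which is the main reason for keying artifacts by hash.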
Technology Choices
| Fossil SCM | Cambria | Rationale |
|---|---|---|
| C | Go | Memory safety, concurrency, modern tooling |
| TH1/TCL | Starlark | Sandboxed, deterministic scripting |
| Custom HTTP | Chi router | Idiomatic HTTP routing (future phase) |
| Custom templates | HTMX + templates | Progressive enhancement (future phase) |
| SQLite C API | mattn/go-sqlite3 | database/sql driver that binds the SQLite C library via CGo (not pure Go) |
| Custom CLI | urfave/cli | Standard Go CLI framework |
Phase 0 Goals
- Project Structure: Establish Go module layout and package organization
- Dependencies: Set up required external dependencies
- Data Model: Implement minimal SQLite schema
- Build System: Configure build, test, and development workflow
- Documentation: Foundation for future development phases
Non-Goals (Deferred to Later Phases)
- Web UI implementation (wiki, tickets, forums, timeline visualization)
- CLI command implementation (only library layer in Phase 0)
- Network synchronization protocol
- Advanced merge strategies
- Delta compression optimization
- User authentication and permissions
Project Structure
cambria/
├── go.mod # Go module definition
├── go.sum # Dependency checksums
├── README.md # Project documentation
├── LICENSE # License file
├── Makefile # Build automation (optional)
│
├── cmd/
│   └── cambria/ # CLI application entry point
│       └── main.go # (Phase 1+)
│
├── pkg/
│   ├── hash/ # Content hashing utilities
│   │   ├── hash.go # Hash computation and verification
│   │   └── hash_test.go
│   │
│   ├── store/ # SQLite data access layer
│   │   ├── db.go # Database connection management
│   │   ├── db_test.go
│   │   ├── schema.go # Schema creation and migrations
│   │   ├── blob.go # Blob storage operations
│   │   ├── blob_test.go
│   │   ├── manifest.go # Manifest storage operations
│   │   ├── manifest_test.go
│   │   ├── mlink.go # Manifest-file linkage
│   │   ├── mlink_test.go
│   │   ├── plink.go # Parent-child linkage
│   │   ├── plink_test.go
│   │   └── label.go # Tag/branch labels
│   │
│   ├── artifact/ # Artifact parsing and generation
│   │   ├── manifest.go # Manifest format parsing/serialization
│   │   ├── manifest_test.go
│   │   └── canonicalize.go # Manifest canonicalization
│   │
│   ├── vcs/ # Version control operations (Phase 1)
│   │   ├── repo.go # Repository abstraction
│   │   ├── checkin.go # Commit operations
│   │   ├── checkout.go # Checkout operations
│   │   ├── diff.go # Diff operations
│   │   └── merge.go # Merge operations
│   │
│   ├── delta/ # Delta compression (Phase 2)
│   │   ├── delta.go # Delta algorithm
│   │   └── delta_test.go
│   │
│   └── script/ # Starlark integration (Phase 2)
│       ├── runtime.go # Starlark execution environment
│       └── builtins.go # Custom Starlark functions
│
├── internal/ # Private packages
│   ├── testutil/ # Test utilities and helpers
│   │   ├── fixtures.go # Test fixtures
│   │   └── tempdir.go # Temporary directory management
│   │
│   └── fileutil/ # File system utilities
│       ├── walk.go # Directory traversal
│       └── hash.go # File hashing
│
└── testdata/ # Test fixtures and golden files
    ├── manifests/ # Sample manifest files
    └── repos/ # Sample repository states
Dependencies
Core Dependencies
module github.com/yourusername/cambria
go 1.21
require (
github.com/mattn/go-sqlite3 v1.14.18
go.starlark.net v0.0.0-20231121155337-90ade8b19d09
)
Future Dependencies (Phase 1+)
require (
github.com/urfave/cli/v2 v2.27.1 // CLI framework
github.com/go-chi/chi/v5 v5.0.11 // HTTP router (web phase)
)
Dependency Justification
mattn/go-sqlite3 (v1.14.18+)
- database/sql driver that wraps the SQLite C library via CGo (not a pure Go implementation)
- Stable, widely-used, well-maintained
- Full SQLite feature support including transactions, indexes, foreign keys
go.starlark.net (latest)
- Google's Go implementation of Starlark (Python dialect)
- Deterministic, sandboxed execution
- No file I/O or network access by default
- Replaces Fossil's TH1/TCL for hooks and scripting
urfave/cli/v2 (Phase 1+)
- Ergonomic CLI framework
- Subcommand support
- Flag parsing
- Help generation
go-chi/chi/v5 (Future)
- Lightweight, idiomatic HTTP router
- Middleware support
- Compatible with net/http
Minimal Schema (Phase 0)
Based on CAMBRIA_DATA_MODEL_DESIGN.md, implement the core tables:
-- Note: SQLite enforces the REFERENCES clauses below only when
-- PRAGMA foreign_keys = ON has been set on the connection.

-- Immutable artifact storage
CREATE TABLE blob(
    rid INTEGER PRIMARY KEY AUTOINCREMENT,
    uuid TEXT UNIQUE NOT NULL,
    size INTEGER NOT NULL,
    content BLOB NOT NULL
);
CREATE INDEX blob_uuid ON blob(uuid);

-- Manifest identification
CREATE TABLE manifest(
    rid INTEGER PRIMARY KEY REFERENCES blob(rid),
    is_merge BOOLEAN DEFAULT 0
);

-- Manifest file linkage (path -> file content)
CREATE TABLE mlink(
    manifest INTEGER NOT NULL REFERENCES manifest(rid),
    fn TEXT NOT NULL,
    fid INTEGER NOT NULL REFERENCES blob(rid),
    PRIMARY KEY(manifest, fn)
);
CREATE INDEX mlink_manifest ON mlink(manifest);
CREATE INDEX mlink_fn ON mlink(fn);
CREATE INDEX mlink_fid ON mlink(fid);

-- Parent-child DAG
CREATE TABLE plink(
    parent INTEGER NOT NULL REFERENCES manifest(rid),
    child INTEGER NOT NULL REFERENCES manifest(rid),
    PRIMARY KEY(parent, child)
);
CREATE INDEX plink_child ON plink(child);
CREATE INDEX plink_parent ON plink(parent);

-- Branch and tag labels
CREATE TABLE label(
    manifest INTEGER NOT NULL REFERENCES manifest(rid),
    name TEXT NOT NULL,
    PRIMARY KEY(manifest, name)
);
CREATE INDEX label_name ON label(name);

-- Repository metadata
CREATE TABLE config(
    key TEXT PRIMARY KEY,
    value TEXT NOT NULL
);
Go Module Setup
Step 1: Initialize Module
cd cambria
go mod init github.com/yourusername/cambria
Step 2: Add Dependencies
go get github.com/mattn/go-sqlite3@v1.14.18
go get go.starlark.net@latest
Step 3: Verify CGo Setup
mattn/go-sqlite3 requires CGo, and therefore a C compiler such as gcc or clang on the PATH. Verify:
go env CGO_ENABLED # Should be "1"
If not enabled:
export CGO_ENABLED=1
Step 4: Test Build
go build ./...
go test ./...
Testing Strategy
Principles
- Standard Library Only: Use only the testing package, no external frameworks
- Table-Driven Tests: Prefer table-driven tests for multiple scenarios
- Test Fixtures: Use testdata/ for complex test inputs
- Temporary Directories: Create isolated temp directories per test
- No Mocks: Prefer real SQLite databases (in-memory or temp files)
- Parallel Tests: Enable t.Parallel() where safe
Example Test Structure
package store_test

import (
	"bytes"
	"testing"

	"github.com/yourusername/cambria/internal/testutil"
	"github.com/yourusername/cambria/pkg/store"
)

func TestBlobStorage(t *testing.T) {
	// Create temp database shared by all subtests
	db, cleanup := testutil.NewTempDB(t)
	defer cleanup()

	tests := []struct {
		name    string
		content []byte
		wantErr bool
	}{
		{"empty", []byte{}, false},
		{"small", []byte("hello"), false},
		{"large", make([]byte, 1<<20), false},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			rid, uuid, err := store.WriteBlob(db, tt.content)
			if (err != nil) != tt.wantErr {
				// Stop here: the checks below assume the expected outcome.
				t.Fatalf("WriteBlob() error = %v, wantErr %v", err, tt.wantErr)
			}
			if err != nil {
				return
			}

			// Verify content retrieval
			got, err := store.ReadBlob(db, rid)
			if err != nil {
				t.Fatalf("ReadBlob() error = %v", err)
			}
			if !bytes.Equal(got, tt.content) {
				// Report lengths only; dumping a 1 MiB slice is unreadable.
				t.Errorf("ReadBlob() returned %d bytes that differ from the %d-byte input", len(got), len(tt.content))
			}

			// Verify UUID lookup
			rid2, err := store.FindBlobByUUID(db, uuid)
			if err != nil {
				t.Fatalf("FindBlobByUUID() error = %v", err)
			}
			if rid2 != rid {
				t.Errorf("FindBlobByUUID() = %v, want %v", rid2, rid)
			}
		})
	}
}
Development Workflow
Build Commands
# Build all packages
go build ./...
# Build CLI (Phase 1+)
go build -o bin/cambria ./cmd/cambria
# Run tests
go test ./...
# Run tests with coverage
go test -cover ./...
# Run tests with race detector
go test -race ./...
# Run specific package tests
go test ./pkg/store
# Verbose test output
go test -v ./...
# Run benchmarks
go test -bench=. ./...
Code Quality
# Format code
go fmt ./...
# Static analysis
go vet ./...
# Install golangci-lint
go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest
# Run linters
golangci-lint run
Useful Make targets (optional)
.PHONY: all build test test-short bench clean lint fmt deps

all: test build

build:
	go build -o bin/cambria ./cmd/cambria

test:
	go test -v -race -cover ./...

test-short:
	go test -short ./...

bench:
	go test -bench=. -benchmem ./...

clean:
	rm -rf bin/
	go clean -cache -testcache

lint:
	golangci-lint run

fmt:
	go fmt ./...
	gofumpt -w .

deps:
	go mod download
	go mod verify
Package Responsibilities (Phase 0)
pkg/hash
Purpose: Content-addressable hashing utilities.
Key Functions:
- ComputeSHA256(data []byte) string - Compute hex-encoded SHA-256
- ComputeFileSHA256(path string) (string, error) - Hash file contents
- Verify(data []byte, expectedHash string) bool - Verify hash matches
Test Coverage:
- Hash consistency (same input → same output)
- Empty input handling
- Large input handling
- File hashing
- Invalid hash detection
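A minimal sketch of the three functions listed above, using only the standard library. The signatures come from this document; the bodies are one plausible implementation, and the choice to reject malformed hex in Verify rather than return an error is an assumption.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// ComputeSHA256 returns the lowercase hex-encoded SHA-256 digest of data.
func ComputeSHA256(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// ComputeFileSHA256 streams the file at path through SHA-256 so that large
// files are never held in memory all at once.
func ComputeFileSHA256(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", fmt.Errorf("open %s: %w", path, err)
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", fmt.Errorf("hash %s: %w", path, err)
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// Verify reports whether data hashes to expectedHash. Hex that is malformed
// or has the wrong length is treated as a mismatch (an assumption of this sketch).
func Verify(data []byte, expectedHash string) bool {
	want, err := hex.DecodeString(expectedHash)
	if err != nil || len(want) != sha256.Size {
		return false
	}
	got := sha256.Sum256(data)
	return subtle.ConstantTimeCompare(got[:], want) == 1
}

func main() {
	h := ComputeSHA256([]byte("hello"))
	fmt.Println(h, Verify([]byte("hello"), h))
}
```

Streaming through io.Copy keeps the file variant's memory use constant, which matters for the large-input test case listed above.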
pkg/store
Purpose: SQLite data access layer for all repository data.
Key Operations:
- Database lifecycle (open, close, schema setup)
- Blob CRUD (create, read, find by UUID)
- Manifest CRUD
- Mlink operations (add, list, query)
- Plink operations (add, query ancestry)
- Label operations (add, list, query)
- Transaction management
Test Coverage:
- Schema creation
- Blob storage and retrieval
- UUID uniqueness enforcement
- Foreign key constraints
- Index usage verification
- Transaction rollback
- Concurrent access (race detector)
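The "query ancestry" operation on plink amounts to a graph walk from a child toward its parents. The sketch below shows that walk over an in-memory slice of rows rather than over SQLite, so the logic can be read and tested in isolation. Plink and Ancestors are hypothetical names; a real implementation in pkg/store would more likely issue a query against the plink table.

```go
package main

import "fmt"

// Plink mirrors one row of the plink table: an edge from parent to child.
// The type is hypothetical and exists only for this illustration.
type Plink struct {
	Parent int64
	Child  int64
}

// Ancestors returns every manifest reachable from start by repeatedly
// following child -> parent edges, in breadth-first order. Each ancestor
// appears once, even when a merge makes it reachable along several paths.
func Ancestors(links []Plink, start int64) []int64 {
	parents := make(map[int64][]int64)
	for _, l := range links {
		parents[l.Child] = append(parents[l.Child], l.Parent)
	}

	seen := map[int64]bool{start: true}
	queue := []int64{start}
	var out []int64
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, p := range parents[cur] {
			if seen[p] {
				continue
			}
			seen[p] = true
			out = append(out, p)
			queue = append(queue, p)
		}
	}
	return out
}

func main() {
	// 1 is the root; 2 and 3 branch from it; 4 merges 2 and 3.
	links := []Plink{{1, 2}, {1, 3}, {2, 4}, {3, 4}}
	fmt.Println(Ancestors(links, 4))
}
```

The seen set is what keeps merge commits from reporting a shared ancestor twice.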
pkg/artifact
Purpose: Parse and generate manifest artifacts.
Key Functions:
- ParseManifest(content []byte) (*Manifest, error) - Parse manifest text
- GenerateManifest(m *Manifest) ([]byte, error) - Generate canonical text
- Canonicalize(m *Manifest) []byte - Ensure canonical form
Manifest Structure:
type Manifest struct {
	Files   []FileEntry // F-lines
	Parents []string    // P-lines (UUIDs)
	Comment string      // C-line
	Labels  []string    // T-lines
}

type FileEntry struct {
	Path string
	UUID string // File content hash
}
Test Coverage:
- Parsing valid manifests
- Parsing malformed manifests (errors)
- Round-trip (parse → generate → parse)
- Canonical ordering (paths sorted)
- UTF-8 handling
- Line ending normalization
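The round-trip and canonical-ordering requirements can be sketched as follows. The on-disk line format is not specified in this document, so the sketch assumes a simplified one: a single letter, a space, then the payload ("F path uuid", "P uuid", "C comment", "T label"), with no whitespace allowed in paths. This is not Fossil's manifest format, and the validation rules are assumptions.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"sort"
	"strings"
)

type FileEntry struct {
	Path string
	UUID string
}

type Manifest struct {
	Files   []FileEntry
	Parents []string
	Comment string
	Labels  []string
}

// GenerateManifest emits the assumed line format with files sorted by path
// and labels sorted by name. Parent order is kept as given.
func GenerateManifest(m *Manifest) ([]byte, error) {
	if strings.ContainsAny(m.Comment, "\r\n") {
		return nil, fmt.Errorf("comment must be a single line")
	}
	files := append([]FileEntry(nil), m.Files...)
	sort.Slice(files, func(i, j int) bool { return files[i].Path < files[j].Path })
	labels := append([]string(nil), m.Labels...)
	sort.Strings(labels)

	var buf bytes.Buffer
	if m.Comment != "" {
		fmt.Fprintf(&buf, "C %s\n", m.Comment)
	}
	for _, f := range files {
		if f.Path == "" || strings.ContainsAny(f.Path, " \t\r\n") {
			return nil, fmt.Errorf("invalid path %q", f.Path)
		}
		fmt.Fprintf(&buf, "F %s %s\n", f.Path, f.UUID)
	}
	for _, p := range m.Parents {
		fmt.Fprintf(&buf, "P %s\n", p)
	}
	for _, l := range labels {
		fmt.Fprintf(&buf, "T %s\n", l)
	}
	return buf.Bytes(), nil
}

// ParseManifest reads the assumed line format. bufio.Scanner strips a
// trailing "\r", so CRLF input parses the same as LF input.
func ParseManifest(content []byte) (*Manifest, error) {
	m := &Manifest{}
	sc := bufio.NewScanner(bytes.NewReader(content))
	for n := 1; sc.Scan(); n++ {
		line := sc.Text()
		if line == "" {
			continue
		}
		kind, rest, ok := strings.Cut(line, " ")
		if !ok || rest == "" {
			return nil, fmt.Errorf("line %d: malformed line %q", n, line)
		}
		switch kind {
		case "C":
			m.Comment = rest
		case "F":
			path, uuid, ok := strings.Cut(rest, " ")
			if !ok || path == "" || uuid == "" {
				return nil, fmt.Errorf("line %d: malformed F-line %q", n, line)
			}
			m.Files = append(m.Files, FileEntry{Path: path, UUID: uuid})
		case "P":
			m.Parents = append(m.Parents, rest)
		case "T":
			m.Labels = append(m.Labels, rest)
		default:
			return nil, fmt.Errorf("line %d: unknown line type %q", n, kind)
		}
	}
	if err := sc.Err(); err != nil {
		return nil, fmt.Errorf("scan manifest: %w", err)
	}
	return m, nil
}

func main() {
	m := &Manifest{
		Files:   []FileEntry{{Path: "b.txt", UUID: "22"}, {Path: "a.txt", UUID: "11"}},
		Parents: []string{"p1"},
		Comment: "initial import",
	}
	out, err := GenerateManifest(m)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(out))
}
```

Sorting inside the generator means two manifests with the same logical content always serialize to the same bytes, and therefore to the same hash.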
Configuration and Conventions
File Naming
- Source files: lowercase.go
- Test files: lowercase_test.go
- Package names: single lowercase word (no underscores)
Code Style
- Follow Go standard style (enforced by gofmt)
- Use gofumpt for stricter formatting
- Error messages: lowercase, no trailing punctuation
- Exported functions: document with godoc comments
Error Handling
- Return errors, don't panic (except for programmer errors)
- Wrap errors with context: fmt.Errorf("operation failed: %w", err)
- Use errors.Is() and errors.As() for error checking
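A short sketch of these conventions working together. ErrNotFound, InvalidHashError, findBlob, loadManifest, and the two helper predicates are hypothetical names invented for the example, not part of any Cambria package.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel error for missing artifacts.
var ErrNotFound = errors.New("blob not found")

// InvalidHashError is a hypothetical typed error carrying the bad input.
type InvalidHashError struct {
	Hash string
}

func (e *InvalidHashError) Error() string {
	return fmt.Sprintf("invalid hash %q", e.Hash)
}

// findBlob simulates a lookup that fails in one of two ways.
func findBlob(uuid string) error {
	if len(uuid) != 64 {
		return &InvalidHashError{Hash: uuid}
	}
	return fmt.Errorf("find blob %s: %w", uuid[:8], ErrNotFound)
}

// loadManifest adds its own context while keeping the cause inspectable via %w.
func loadManifest(uuid string) error {
	if err := findBlob(uuid); err != nil {
		return fmt.Errorf("load manifest: %w", err)
	}
	return nil
}

// IsNotFound reports whether err wraps ErrNotFound at any depth.
func IsNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

// BadHash extracts the offending hash if err wraps an *InvalidHashError.
func BadHash(err error) (string, bool) {
	var ihe *InvalidHashError
	if errors.As(err, &ihe) {
		return ihe.Hash, true
	}
	return "", false
}

func main() {
	err := loadManifest("abc")
	hash, ok := BadHash(err)
	fmt.Println(err, IsNotFound(err), hash, ok)
}
```

Callers test for the cause with errors.Is or errors.As instead of comparing message strings, so the wrapping context can change without breaking them.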
Logging
- Phase 0: No logging framework, use log package sparingly
- Phase 1+: Consider structured logging (log/slog in the standard library)
Success Criteria (Phase 0)
Phase 0 is complete when:
- ✅ Go module initialized with correct dependencies
- ✅ SQLite schema implemented and tested
- ✅ pkg/hash package implemented with tests
- ✅ pkg/store package implemented with comprehensive tests
- ✅ pkg/artifact package can parse and generate manifests
- ✅ All tests pass: go test ./...
- ✅ No race conditions detected: go test -race ./...
- ✅ Test coverage >80% for implemented packages
- ✅ Code passes go vet and golangci-lint
- ✅ Documentation comments on all exported functions
Next Phase Preview
Phase 1 will implement:
- Repository operations (init, add, commit, checkout, status)
- Working directory management
- Basic diff functionality
- CLI interface using urfave/cli
- Integration tests for complete workflows
See CAMBRIA_PHASE_1.md for detailed Phase 1 planning.
References
- FOSSIL_VERSION_CONTROL.md: Schema and module mapping
- FOSSIL_VERSION_CONTROL_TEST.md: Test suite organization
- CAMBRIA_DATA_MODEL_DESIGN.md: Minimal data model specification
- Fossil Source: /workspace/src/ for C implementation reference
Timeline Estimate
Assuming one experienced Go developer:
- Project setup and dependencies: 1 day
- pkg/hash implementation and tests: 1 day
- pkg/store schema and core CRUD: 3 days
- pkg/store advanced operations and tests: 2 days
- pkg/artifact manifest parsing: 2 days
- Integration testing and documentation: 1 day
Total: ~10 days for Phase 0 completion.
Open Questions
Hash Algorithm: SHA-256 (standard) vs BLAKE3 (faster)?
- Recommendation: SHA-256 for compatibility, add BLAKE3 option later
SQLite Pragmas: Which pragmas to enable by default?
- Recommendation: WAL mode, foreign keys, synchronous=NORMAL
Concurrency Model: Single-writer or multiple writers?
- Recommendation: Single-writer for Phase 0 (SQLite allows only one writer at a time, even in WAL mode; WAL adds concurrent readers)
Repository Location: Single file or directory?
- Recommendation: Single .cambria file (like .fossil)
Platform Support: Windows, Linux, macOS?
- Recommendation: All three, CI for all platforms
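The pragma recommendation above (WAL mode, foreign keys, synchronous=NORMAL) could be applied through the connection string. The sketch below only builds that string, so it runs without the driver. BuildDSN is a hypothetical helper, and the parameter names (_journal_mode, _foreign_keys, _synchronous) are the ones documented in the mattn/go-sqlite3 README as far as I recall; they should be checked against the pinned driver version.

```go
package main

import (
	"fmt"
	"net/url"
)

// BuildDSN returns a data source name that asks the driver to apply the
// recommended pragmas on every new connection. Hypothetical helper: the
// parameter names are assumed to match mattn/go-sqlite3's DSN options.
// The path is used as-is, so it must not contain '?' or '#'.
func BuildDSN(path string) string {
	params := url.Values{}
	params.Set("_journal_mode", "WAL")   // write-ahead logging
	params.Set("_foreign_keys", "on")    // enforce REFERENCES clauses
	params.Set("_synchronous", "NORMAL") // fewer fsyncs than FULL
	return "file:" + path + "?" + params.Encode()
}

func main() {
	fmt.Println(BuildDSN("repo.cambria"))
}
```

The result would be passed to sql.Open with the "sqlite3" driver name once the driver package is imported for its side effects. Setting the pragmas in the DSN rather than with a one-off PRAGMA statement matters because database/sql pools connections, and foreign key enforcement is a per-connection setting.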