Cambria VFILE and VMERGE Tables Implementation

Created: 2025-12-14
Status: ✅ Completed (Phase 1.5)
Related: CAMBRIA_CLI_PLAN.md
Reference: Fossil src/vfile.c

Overview

This document describes the completed implementation of Cambria's working directory state tracking system, centered on the vfile and vmerge tables. These tables provide critical infrastructure for commands like status, add, commit, and checkout.

The VFILE system was implemented as Phase 1.5, between the completion of core VCS operations (Phase 1) and the upcoming CLI implementation (Phase 2).

Background: Fossil's VFILE System

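In Fossil SCM, the vfile table serves as the "manifest of the working directory": it holds one row per file in the current checkout and records how that file differs from the checked-out version.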

Cambria implements this same model using SQLite tables with full transactional support.

Core Concepts

1. Working Directory Baseline

The baseline is the manifest (commit) that the working directory is currently based on. This represents "the last clean checkout."

2. File States

Files in a working directory can be in several states:

State     | Description                 | VFILE Representation
CLEAN     | Unchanged since baseline    | chnged=0, file exists on disk
MODIFIED  | Content changed             | chnged>0, file exists on disk
ADDED     | New file, staged for commit | rid IS NULL, chnged=1
DELETED   | File removed from disk      | deleted=1 OR file missing when rid exists
UNTRACKED | Not in version control      | No row in vfile
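
The table can be read as a small decision procedure. The following is a minimal sketch, assuming a reduced row type that carries only the columns the table mentions; the names FileState, row, and classify are illustrative and are not taken from pkg/vcs. UNTRACKED has no branch because it means there is no row to classify.

```go
package main

import "fmt"

// FileState mirrors the states in the table (illustrative names).
type FileState string

const (
	StateClean    FileState = "CLEAN"
	StateModified FileState = "MODIFIED"
	StateAdded    FileState = "ADDED"
	StateDeleted  FileState = "DELETED"
)

// row holds only the vfile columns needed for classification.
type row struct {
	Rid     *int // nil models rid IS NULL
	Chnged  int
	Deleted bool
}

// classify maps a vfile row plus an on-disk existence flag to a state.
func classify(r row, existsOnDisk bool) FileState {
	switch {
	case r.Deleted:
		return StateDeleted
	case r.Rid == nil:
		return StateAdded
	case !existsOnDisk:
		return StateDeleted // file missing while rid exists
	case r.Chnged > 0:
		return StateModified
	default:
		return StateClean
	}
}

func main() {
	rid := 42
	fmt.Println(classify(row{Rid: &rid}, true))            // CLEAN
	fmt.Println(classify(row{Rid: &rid, Chnged: 1}, true)) // MODIFIED
	fmt.Println(classify(row{Chnged: 1}, true))            // ADDED
	fmt.Println(classify(row{Rid: &rid}, false))           // DELETED
}
```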

3. Change Detection

Cambria detects file changes by comparing content hashes:

  1. Size Check: If file size differs from baseline, mark as changed
  2. Hash Comparison: Compute SHA-256 and compare to baseline hash
  3. Update VFILE: Set chnged status, mtime, and size columns

The implementation currently always hashes files for accuracy. Future optimizations may use mtime as a hint.
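
As a rough illustration of the size check and hash comparison, here is a sketch that is not the code in pkg/vcs/vfile.go. It assumes the baseline hash is a lowercase hex SHA-256 string; fileChanged, hashFile, and demo are hypothetical names. Unlike the current implementation, which always hashes, the sketch returns early when the size differs.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// hashFile returns the hex-encoded SHA-256 of the file's content.
func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// fileChanged reports whether path differs from the baseline size or hash.
func fileChanged(path string, baseSize int64, baseHash string) (bool, error) {
	info, err := os.Stat(path)
	if err != nil {
		return false, err
	}
	if info.Size() != baseSize {
		return true, nil // size differs, so the content must differ
	}
	got, err := hashFile(path)
	if err != nil {
		return false, err
	}
	return got != baseHash, nil
}

// demo checks a temp file before and after a same-size edit.
func demo() (before, after bool, err error) {
	dir, err := os.MkdirTemp("", "vfile-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "a.txt")
	content := []byte("hello\n")
	if err = os.WriteFile(path, content, 0o644); err != nil {
		return false, false, err
	}
	sum := sha256.Sum256(content)
	baseHash := hex.EncodeToString(sum[:])

	if before, err = fileChanged(path, int64(len(content)), baseHash); err != nil {
		return false, false, err
	}
	if err = os.WriteFile(path, []byte("jello\n"), 0o644); err != nil {
		return false, false, err
	}
	after, err = fileChanged(path, int64(len(content)), baseHash)
	return before, after, err
}

func main() {
	before, after, err := demo()
	fmt.Println(before, after, err)
}
```

The same-size edit in demo shows why the size check alone is not enough: only the hash comparison catches it.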

Database Schema

VFILE Table

CREATE TABLE IF NOT EXISTS vfile(
    id INTEGER PRIMARY KEY AUTOINCREMENT,

    -- Version tracking
    vid INTEGER NOT NULL REFERENCES manifest(rid),  -- Checked-out manifest (baseline)
    rid INTEGER REFERENCES blob(rid),               -- File's blob in baseline manifest
    mrid INTEGER REFERENCES blob(rid),              -- File's blob if merged from another version

    -- File identification
    pathname TEXT NOT NULL COLLATE NOCASE,          -- Relative path from working directory root
    origname TEXT COLLATE NOCASE,                   -- Original name if renamed (for merge tracking)

    -- File metadata
    is_exe BOOLEAN NOT NULL DEFAULT 0,              -- Executable permission
    is_link BOOLEAN NOT NULL DEFAULT 0,             -- Symbolic link

    -- State tracking
    chnged INTEGER NOT NULL DEFAULT 0,              -- Change state (0-9)
    deleted BOOLEAN NOT NULL DEFAULT 0,             -- Marked for deletion

    -- Hash tracking (for change detection)
    mhash TEXT,                                     -- Hash from merged version

    -- Performance optimization
    mtime INTEGER,                                  -- Last modification time (for quick checks)
    size INTEGER,                                   -- File size (for quick checks)

    UNIQUE(vid, pathname)
);

CREATE INDEX IF NOT EXISTS vfile_vid_idx ON vfile(vid);
CREATE INDEX IF NOT EXISTS vfile_pathname_idx ON vfile(pathname);

CHNGED Values (from Fossil's vfile_check_signature):

Current Implementation: Supports states 0 (clean) and 1 (edited). Merge states (2-5) and permission-only changes (6-9) are reserved for future implementation.
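
For orientation, the ten states could be written out as Go constants along the following lines. This is a sketch under assumptions: only VFileClean, VFileEdited, and VFileMerged appear in this document; the meanings of 3-9 follow the comment on Fossil's vfile_check_signature as best recalled and should be verified against src/vfile.c; the remaining constant names and the helper methods are hypothetical.

```go
package main

import "fmt"

// VFileStatus is the value stored in vfile.chnged.
type VFileStatus int

const (
	VFileClean            VFileStatus = 0 // No change
	VFileEdited           VFileStatus = 1 // Edited
	VFileMerged           VFileStatus = 2 // Changed due to merge
	VFileAddedByMerge     VFileStatus = 3 // Added by merge (assumed from Fossil)
	VFileIntegrateMerged  VFileStatus = 4 // Changed due to integrate merge (assumed)
	VFileAddedByIntegrate VFileStatus = 5 // Added by integrate merge (assumed)
	VFileBecameExe        VFileStatus = 6 // Became executable, content unchanged (assumed)
	VFileBecameLink       VFileStatus = 7 // Became a symlink, content unchanged (assumed)
	VFileLostExe          VFileStatus = 8 // Lost executable bit, content unchanged (assumed)
	VFileLostLink         VFileStatus = 9 // Lost symlink status, content unchanged (assumed)
)

// IsMergeState reports whether s is one of the merge states (2-5).
func (s VFileStatus) IsMergeState() bool {
	return s >= VFileMerged && s <= VFileAddedByIntegrate
}

// IsPermOnly reports whether s is a permission-only change (6-9).
func (s VFileStatus) IsPermOnly() bool {
	return s >= VFileBecameExe && s <= VFileLostLink
}

// Supported reports whether Phase 1.5 produces this state (0 or 1).
func (s VFileStatus) Supported() bool {
	return s == VFileClean || s == VFileEdited
}

func main() {
	for _, s := range []VFileStatus{VFileClean, VFileEdited, VFileMerged, VFileBecameExe} {
		fmt.Printf("%d supported=%t merge=%t permOnly=%t\n",
			s, s.Supported(), s.IsMergeState(), s.IsPermOnly())
	}
}
```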

VMERGE Table

CREATE TABLE IF NOT EXISTS vmerge(
    id INTEGER PRIMARY KEY AUTOINCREMENT,

    -- Merge identification
    merge INTEGER NOT NULL REFERENCES manifest(rid),  -- Manifest being merged in
    mhash TEXT NOT NULL,                             -- Hash of merge manifest (denormalized)

    -- Merge type and status
    merge_type INTEGER NOT NULL DEFAULT 0,           -- 0=normal, 1=integrate, 2=cherrypick
    is_baseline BOOLEAN NOT NULL DEFAULT 0,          -- True if this is the baseline merge

    UNIQUE(merge)
);

CREATE INDEX IF NOT EXISTS vmerge_merge_idx ON vmerge(merge);

Current Status: Schema defined but not yet used. Merge operations will be implemented in a future phase.

Implementation Details

Core Data Structures

File: pkg/vcs/vfile.go

// VFileEntry represents a file in the working directory
type VFileEntry struct {
    ID       int
    Vid      int     // Checked-out manifest RID
    Rid      *int    // File blob RID in baseline (NULL for added files)
    Mrid     *int    // File blob RID if merged
    Pathname string
    Origname *string
    IsExe    bool
    IsLink   bool
    Chnged   int     // Change status (0-9)
    Deleted  bool
    Mhash    *string
    Mtime    *int64
    Size     *int64
}

// VFileStatus represents the change state
type VFileStatus int

const (
    VFileClean VFileStatus = 0  // No change
    VFileEdited VFileStatus = 1 // Edited
    VFileMerged VFileStatus = 2 // Changed due to merge
    // ... (all 10 states defined)
)

// VMergeEntry represents a merge in progress
type VMergeEntry struct {
    ID          int
    Merge       int    // Manifest RID being merged
    Mhash       string // Hash of merge manifest
    MergeType   int    // 0=normal, 1=integrate, 2=cherrypick
    IsBaseline  bool   // True if this is the baseline
}

Core Functions

LoadVFileFromManifest

Function: LoadVFileFromManifest(tx store.DBTX, vid int) error

Populates the vfile table from a manifest. Called during checkout to establish the baseline.

Implementation:

  1. Clears existing vfile entries for this vid
  2. Retrieves manifest content from blob table
  3. Parses manifest to get file list
  4. For each file:
    • Looks up file's blob RID by UUID
    • Inserts row into vfile table
    • Sets pathname, rid, is_exe, is_link
    • Sets chnged=0, deleted=0 (clean checkout)

Location: pkg/vcs/vfile.go:48-111
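
Step 4 can be pictured without a database. The sketch below assumes the manifest has already been parsed into a list of files, and that a permission marker of "x" means executable and "l" means symlink (the Fossil convention; whether Cambria's parser exposes it this way is an assumption). The names manifestFile, vfileRow, and buildRows are hypothetical.

```go
package main

import "fmt"

// manifestFile is one parsed file entry from a manifest (hypothetical shape).
type manifestFile struct {
	Name string
	UUID string
	Perm string // "x" = executable, "l" = symlink, "" = regular (assumed)
}

// vfileRow holds the columns set during a clean checkout.
type vfileRow struct {
	Vid, Rid      int
	Pathname      string
	IsExe, IsLink bool
	Chnged        int  // zero value: clean
	Deleted       bool // zero value: not deleted
}

// buildRows resolves each file's blob RID by UUID and produces clean rows.
func buildRows(vid int, files []manifestFile, ridByUUID map[string]int) ([]vfileRow, error) {
	rows := make([]vfileRow, 0, len(files))
	for _, f := range files {
		rid, ok := ridByUUID[f.UUID]
		if !ok {
			return nil, fmt.Errorf("no blob for %s (uuid %s)", f.Name, f.UUID)
		}
		rows = append(rows, vfileRow{
			Vid:      vid,
			Rid:      rid,
			Pathname: f.Name,
			IsExe:    f.Perm == "x",
			IsLink:   f.Perm == "l",
		})
	}
	return rows, nil
}

func main() {
	files := []manifestFile{
		{Name: "README.md", UUID: "aaa"},
		{Name: "run.sh", UUID: "bbb", Perm: "x"},
	}
	rows, err := buildRows(5, files, map[string]int{"aaa": 10, "bbb": 11})
	fmt.Println(len(rows), err)
}
```

Failing on an unknown UUID, rather than skipping the file, keeps the vfile table from silently diverging from the manifest.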

CheckVFileSignatures

Function: CheckVFileSignatures(tx store.DBTX, vid int, workDir string, opts *CheckSignatureOpts) error

Detects file changes by comparing disk state to baseline. Updates vfile table with change status.

Implementation:

  1. Queries all vfile entries for vid
  2. For each entry:
    • Checks if file exists on disk (using os.Lstat)
    • If missing, marks as changed (chnged=1)
    • If exists, reads file content and computes hash
    • Compares hash to baseline (from blob table)
    • Updates chnged status, mtime, and size

Options:

type CheckSignatureOpts struct {
    Hash bool // Always hash file content, even once an mtime shortcut exists (not yet implemented)
}

Location: pkg/vcs/vfile.go:114-188

WriteVFileToDisk

Function: WriteVFileToDisk(tx store.DBTX, vid int, workDir string, opts *WriteToDiskOpts) error

Writes files from vfile table to disk. Used during checkout operations.

Implementation:

  1. Queries vfile entries for vid where rid > 0
  2. For each entry:
    • Loads blob content from database
    • Validates file path (security check)
    • Checks if file exists (honors Force option)
    • Writes content to disk
    • Sets the executable bit where needed (symlinks are not yet supported)

Options:

type WriteToDiskOpts struct {
    Force bool // Overwrite existing files
}

Location: pkg/vcs/vfile.go:191-259
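
The per-file part of this procedure might look roughly like the sketch below. It is not the project's code: writeEntry, errExists, and demo are hypothetical names, the blob content is passed in directly instead of being loaded from the database, and the path is assumed to have been validated already.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// errExists is returned when a file is present and force is false.
var errExists = errors.New("file exists (use Force to overwrite)")

// writeEntry writes one file under workDir, honoring force and the exe flag.
func writeEntry(workDir, rel string, content []byte, isExe, force bool) error {
	full := filepath.Join(workDir, filepath.FromSlash(rel))
	if _, err := os.Lstat(full); err == nil {
		if !force {
			return fmt.Errorf("%s: %w", rel, errExists)
		}
	} else if !errors.Is(err, os.ErrNotExist) {
		return err
	}
	if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
		return err
	}
	mode := os.FileMode(0o644)
	if isExe {
		mode = 0o755
	}
	if err := os.WriteFile(full, content, mode); err != nil {
		return err
	}
	// WriteFile applies mode only when creating; Chmod covers overwrites.
	return os.Chmod(full, mode)
}

// demo writes a file, checks that a second write needs force, then overwrites.
func demo() error {
	dir, err := os.MkdirTemp("", "vfile-write")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	if err := writeEntry(dir, "bin/run.sh", []byte("v1"), true, false); err != nil {
		return err
	}
	if err := writeEntry(dir, "bin/run.sh", []byte("v2"), true, false); !errors.Is(err, errExists) {
		return fmt.Errorf("expected errExists, got %v", err)
	}
	if err := writeEntry(dir, "bin/run.sh", []byte("v2"), true, true); err != nil {
		return err
	}
	got, err := os.ReadFile(filepath.Join(dir, "bin", "run.sh"))
	if err != nil {
		return err
	}
	if string(got) != "v2" {
		return fmt.Errorf("content = %q, want v2", got)
	}
	return nil
}

func main() {
	fmt.Println(demo())
}
```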

Helper Functions

GetVFileEntry - Retrieves a single vfile entry by pathname
Location: pkg/vcs/vfile.go:262-312

GetCurrentCheckout - Gets the current checked-out manifest RID and UUID
Location: pkg/vcs/vfile.go:315-337

SetCurrentCheckout - Sets the current checkout configuration
Location: pkg/vcs/vfile.go:340-355

validatePath - Validates file paths for security (prevents directory traversal)
Location: pkg/vcs/vfile.go:358-376

VCS Integration

Checkout Integration

File: pkg/vcs/checkout.go

Changes:

  1. Wrapped all operations in a transaction
  2. After loading manifest, calls LoadVFileFromManifest()
  3. Calls WriteVFileToDisk() with Force option support
  4. Sets current checkout via SetCurrentCheckout()

Key Code:

// Load vfile table from manifest
err = LoadVFileFromManifest(tx, manifest.RID)

// Write files to disk from vfile
writeOpts := &WriteToDiskOpts{Force: opts.Force}
err = WriteVFileToDisk(tx, manifest.RID, root, writeOpts)

// Set current checkout configuration
err = SetCurrentCheckout(tx, manifest.RID, manifestUUID)

Location: pkg/vcs/checkout.go:35-132

Add Integration

File: pkg/vcs/add.go

Changes:

  1. Gets current checkout (vid) from config
  2. For each file to add:
    • Validates path (security)
    • Checks if already in vfile
    • If exists and deleted, resurrects file
    • If exists and clean, reports already tracked
    • If new, inserts vfile entry with rid=NULL, chnged=1
  3. Handles no-checkout case by writing blobs for later use

Key Code:

// Insert new vfile entry for added file
_, err = tx.Exec(`
    INSERT INTO vfile(vid, pathname, chnged, deleted, is_exe, is_link, rid, mrid)
    VALUES (?, ?, ?, 0, ?, ?, NULL, NULL)
`, vid, path, int(VFileEdited), isExe, isLink)

Location: pkg/vcs/add.go:46-176
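
The branching in step 2 can be summarized as a small pure function. This is only a sketch: addAction, existing, and decideAdd are hypothetical names, and treating an existing, modified (non-deleted) entry the same as a clean one is an assumption, since the list above only covers the clean case.

```go
package main

import "fmt"

// addAction is the outcome of trying to add one path.
type addAction int

const (
	actionInsert         addAction = iota // new row with rid=NULL, chnged=1
	actionResurrect                       // clear the deleted flag on an existing row
	actionAlreadyTracked                  // nothing to do, report to the user
)

// existing is the subset of a vfile row consulted by add (nil = no row).
type existing struct {
	Deleted bool
	Chnged  int
}

// decideAdd picks the action for a path given its current vfile row, if any.
func decideAdd(e *existing) addAction {
	switch {
	case e == nil:
		return actionInsert
	case e.Deleted:
		return actionResurrect
	default:
		return actionAlreadyTracked // assumption: also covers modified entries
	}
}

func main() {
	fmt.Println(decideAdd(nil) == actionInsert)
	fmt.Println(decideAdd(&existing{Deleted: true}) == actionResurrect)
	fmt.Println(decideAdd(&existing{}) == actionAlreadyTracked)
}
```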

Scan Integration

File: pkg/vcs/workdir.go

Changes:

  1. Gets current checkout (vid)
  2. Calls CheckVFileSignatures() to update change detection
  3. Queries vfile table instead of manifest directly
  4. Determines file status based on vfile state:
    • deleted=true → StatusDeleted
    • rid=NULL → StatusAdded
    • chnged>0 → StatusModified, or StatusDeleted if the file is missing from disk
    • chnged=0 → StatusClean
  5. Scans filesystem for untracked files not in vfile

Key Code:

// Check file signatures to update vfile table
err = CheckVFileSignatures(tx, vid, w.root, nil)

// Query vfile entries
rows, err := tx.Query(`
    SELECT v.pathname, v.chnged, v.deleted, v.rid, b.uuid, b.size
    FROM vfile v
    LEFT JOIN blob b ON v.rid = b.rid
    WHERE v.vid = ?
`, vid)

// Determine status based on vfile state
if deleted {
    wf.Status = StatusDeleted
} else if chnged > 0 {
    if _, err := os.Lstat(fullPath); os.IsNotExist(err) {
        wf.Status = StatusDeleted
    } else {
        wf.Status = StatusModified
    }
}

Location: pkg/vcs/workdir.go:48-198
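
Step 5, the untracked-file scan, is not shown in the excerpt above. A minimal sketch follows, assuming the tracked pathnames have already been read from vfile into a set keyed by slash-separated relative path; findUntracked and demo are hypothetical names. It compares names case-sensitively and does not skip any repository metadata directory, both of which a real implementation would need to decide on, since the pathname column uses COLLATE NOCASE.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// findUntracked walks root and returns relative paths absent from tracked.
func findUntracked(root string, tracked map[string]bool) ([]string, error) {
	var out []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil // descend into directories, report only files
		}
		rel, err := filepath.Rel(root, p)
		if err != nil {
			return err
		}
		rel = filepath.ToSlash(rel)
		if !tracked[rel] {
			out = append(out, rel)
		}
		return nil
	})
	sort.Strings(out)
	return out, err
}

// demo builds a small tree with one tracked and two untracked files.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "vfile-scan")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	if err := os.MkdirAll(filepath.Join(dir, "sub"), 0o755); err != nil {
		return nil, err
	}
	for _, name := range []string{"a.txt", "b.txt", "sub/c.txt"} {
		full := filepath.Join(dir, filepath.FromSlash(name))
		if err := os.WriteFile(full, []byte("x"), 0o644); err != nil {
			return nil, err
		}
	}
	return findUntracked(dir, map[string]bool{"a.txt": true})
}

func main() {
	got, err := demo()
	fmt.Println(got, err)
}
```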

Commit Integration

File: pkg/vcs/checkin.go

Changes:

  1. Gets current checkout (vid, currentUUID)
  2. Calls CheckVFileSignatures() to detect changes
  3. Queries vfile for changed and tracked files
  4. For each file:
    • Skips deleted files
    • Reads content from disk (treats missing files as deleted)
    • Writes blob and adds to commit file list
  5. Calls Checkin() to create new manifest
  6. Updates vfile to new baseline:
    • Calls LoadVFileFromManifest() with new manifest RID
    • Updates current checkout config

Key Code:

// Query vfile for changed files
rows, err := tx.Query(`
    SELECT pathname, chnged, deleted, rid, is_exe, is_link
    FROM vfile
    WHERE vid = ?
`, vid)

// Skip deleted files
if deleted {
    hasChanges = true
    continue
}

// After creating new manifest, update vfile baseline
newManifest, _ := store.GetManifestByUUID(tx2, newManifestUUID)
LoadVFileFromManifest(tx2, newManifest.RID)
SetCurrentCheckout(tx2, newManifest.RID, newManifestUUID)

Location: pkg/vcs/checkin.go:218-375

Testing

Unit Tests

File: pkg/vcs/vfile_test.go (481 lines, 8 tests)

Comprehensive tests for all vfile core functions:

  1. TestLoadVFileFromManifest - Verifies vfile population from manifest
  2. TestCheckVFileSignatures_Clean - Tests clean file detection
  3. TestCheckVFileSignatures_Modified - Tests modified file detection
  4. TestCheckVFileSignatures_Deleted - Tests deleted file detection
  5. TestWriteVFileToDisk - Tests file writing from vfile
  6. TestGetVFileEntry - Tests single entry retrieval
  7. TestGetSetCurrentCheckout - Tests checkout config operations
  8. TestValidatePath - Tests path security validation

Coverage: All core vfile functions tested with positive and negative cases.

Integration Tests

File: pkg/vcs/vfile_integration_test.go (400 lines, 3 tests)

End-to-end workflow tests:

  1. TestVFileIntegration_CompleteWorkflow - Complete checkout → modify → add → commit → checkout cycle
  2. TestVFileIntegration_DeletedFiles - Handling of file deletion and commit
  3. TestVFileIntegration_ModifyAndRevert - Modify file and force checkout to revert

Coverage: All VCS operations tested with vfile integration.

Test Results

All 31 VCS tests pass, and go test -race ./... detects no race conditions.

Security Features

Path Traversal Protection

All vfile operations validate file paths using validatePath():

func validatePath(path string) error {
    // No absolute paths
    if filepath.IsAbs(path) {
        return fmt.Errorf("absolute paths not allowed: %s", path)
    }

    // No ".." components
    if strings.Contains(path, "..") {
        return fmt.Errorf("path traversal not allowed: %s", path)
    }

    // Clean path
    clean := filepath.Clean(path)
    if strings.HasPrefix(clean, "..") || strings.HasPrefix(clean, "/") {
        return fmt.Errorf("invalid path: %s", path)
    }

    return nil
}

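This prevents directory traversal: a crafted pathname cannot make a vfile operation read or write outside the working directory.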
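
One property of the strings.Contains(path, "..") check is worth knowing: it is deliberately conservative, so it also rejects ordinary filenames such as notes..txt. If that ever becomes a problem, a component-wise check is one possible alternative. The sketch below is not the project's code; validateRelPath is a hypothetical name, and it treats both / and \ as separators on every platform, which is itself a conservative assumption.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// validateRelPath rejects absolute paths and any ".." path component.
func validateRelPath(p string) error {
	if p == "" {
		return errors.New("empty path")
	}
	if filepath.IsAbs(p) || strings.HasPrefix(p, "/") || strings.HasPrefix(p, `\`) {
		return fmt.Errorf("absolute paths not allowed: %s", p)
	}
	if filepath.VolumeName(p) != "" {
		return fmt.Errorf("volume-qualified paths not allowed: %s", p)
	}
	isSep := func(r rune) bool { return r == '/' || r == '\\' }
	for _, part := range strings.FieldsFunc(p, isSep) {
		if part == ".." {
			return fmt.Errorf("path traversal not allowed: %s", p)
		}
	}
	return nil
}

func main() {
	for _, p := range []string{"src/main.go", "notes..txt", "a/../b", "../x", "/etc/passwd"} {
		fmt.Printf("%-12s ok=%t\n", p, validateRelPath(p) == nil)
	}
}
```

Neither version resolves symlinks, so a symlinked directory inside the working directory can still point elsewhere; that has to be handled when the path is opened.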

Symlink Handling

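Current implementation: symlinks are tracked through the is_link column but are not yet fully supported; WriteVFileToDisk does not handle them when writing files.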

Performance Characteristics

Current Implementation

Future Optimizations

Planned for future phases:

  1. Mtime-based Change Detection: Skip hash computation if mtime unchanged
  2. Batch Signature Checks: Process multiple files in parallel
  3. Incremental Updates: Only check files that might have changed
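
To make the first item concrete, here is one way the mtime hint could be expressed. This is a sketch of a planned optimization, not existing behavior: canSkipHash is a hypothetical name, the pointer parameters mirror the nullable Mtime and Size fields of VFileEntry, and alwaysHash stands in for CheckSignatureOpts.Hash. Skipping means "nothing has changed since the last signature check, so keep the stored chnged value".

```go
package main

import "fmt"

// canSkipHash reports whether hashing can be skipped for a file whose
// current size and mtime match the values recorded at the last check.
func canSkipHash(size, mtime int64, recSize, recMtime *int64, alwaysHash bool) bool {
	if alwaysHash || recSize == nil || recMtime == nil {
		return false // no usable hint, or the caller insists on hashing
	}
	return size == *recSize && mtime == *recMtime
}

func main() {
	recSize, recMtime := int64(120), int64(1700000000)
	fmt.Println(canSkipHash(120, 1700000000, &recSize, &recMtime, false)) // true
	fmt.Println(canSkipHash(120, 1700000005, &recSize, &recMtime, false)) // false
	fmt.Println(canSkipHash(120, 1700000000, &recSize, &recMtime, true))  // false
	fmt.Println(canSkipHash(120, 1700000000, nil, nil, false))            // false
}
```

An mtime hint can miss an edit that keeps the size and lands within the same timestamp granularity, which is one reason to keep an always-hash option.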

Scalability Targets

Expected performance (to be measured):

Known Limitations

Current Phase 1.5 Scope

The following features are not yet implemented:

  1. Merge Operations: vmerge table defined but not used
  2. Permission-only Changes: Change states 6-9 (executable/symlink permission changes)
  3. Merge States: Change states 2-5 (merge-related changes)
  4. Symlink Content: Symlinks tracked but not fully supported
  5. Mtime Optimization: All files always hashed

Future Work

These limitations will be addressed in future phases:

Fossil Vfile Functions → Cambria Mapping

Fossil Function                    | Cambria Equivalent      | Status
load_vfile_from_rid()              | LoadVFileFromManifest() | ✅ Implemented
vfile_check_signature()            | CheckVFileSignatures()  | ✅ Implemented
vfile_to_disk()                    | WriteVFileToDisk()      | ✅ Implemented
vfile_scan()                       | Part of Scan()          | ✅ Integrated
vfile_aggregate_checksum_disk()    | -                       | Future (validation)
vfile_compare_repository_to_disk() | -                       | Future (fsck)

Success Criteria

Phase 1.5 Complete ✅

All success criteria met:

Related Documentation

Summary

The VFILE system implementation is complete and ready for use. All core operations (checkout, add, commit, scan) now use the vfile table to track working directory state.

The next phase (Phase 2) will implement the CLI using these vfile-backed operations.


Implementation Complete: 2025-12-14
Next Phase: CLI Implementation (CAMBRIA_CLI_PLAN.md)