Created: 2025-12-14
Status: ✅ Completed (Phase 1.5)
Related: CAMBRIA_CLI_PLAN.md
Reference: Fossil src/vfile.c
Overview
This document describes the completed implementation of Cambria's working directory state tracking system, centered on the vfile and vmerge tables. These tables provide critical infrastructure for commands like status, add, commit, and checkout.
The VFILE system was implemented as Phase 1.5, between the completion of core VCS operations (Phase 1) and the upcoming CLI implementation (Phase 2).
Background: Fossil's VFILE System
In Fossil SCM, the vfile table serves as the "manifest of the working directory." It tracks:
- Which manifest (version) is currently checked out
- The state of every file in the working directory
- Changes made since checkout (added, deleted, modified)
- File metadata (permissions, symlink status)
Cambria implements this same model using SQLite tables with full transactional support.
Core Concepts
1. Working Directory Baseline
The baseline is the manifest (commit) that the working directory is currently based on. This represents "the last clean checkout."
- Set by repo.Checkout(root, manifestUUID, opts)
- Updated by repo.Commit(workDir, opts) to the newly created manifest
- Stored in both the vfile.vid column and the config table
2. File States
Files in a working directory can be in several states:
| State | Description | VFILE Representation |
|---|---|---|
| CLEAN | Unchanged since baseline | chnged=0, file exists on disk |
| MODIFIED | Content changed | chnged>0, file exists on disk |
| ADDED | New file, staged for commit | rid IS NULL, chnged=1 |
| DELETED | File removed from disk | deleted=1 OR file missing when rid exists |
| UNTRACKED | Not in version control | No row in vfile |
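The table above can be read as a decision procedure. The following is a minimal sketch of that procedure, not Cambria's actual code: the `row` type, the `classify` function, and the string state names are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// row holds the vfile columns that matter for classification.
// Hypothetical type for illustration; not Cambria's VFileEntry.
type row struct {
	Rid     *int // nil models "rid IS NULL"
	Chnged  int
	Deleted bool
}

// classify maps a vfile row plus an on-disk existence flag to a state name.
// Pass a nil row for a path that has no vfile entry.
func classify(r *row, onDisk bool) string {
	switch {
	case r == nil:
		return "UNTRACKED"
	case r.Deleted:
		return "DELETED"
	case r.Rid == nil:
		return "ADDED"
	case !onDisk:
		return "DELETED" // tracked in the baseline but missing from disk
	case r.Chnged > 0:
		return "MODIFIED"
	default:
		return "CLEAN"
	}
}

func main() {
	rid := 7
	fmt.Println(classify(&row{Rid: &rid}, true))
	fmt.Println(classify(&row{Rid: &rid, Chnged: 1}, true))
	fmt.Println(classify(&row{Chnged: 1}, true))
	fmt.Println(classify(&row{Rid: &rid}, false))
	fmt.Println(classify(nil, true))
}
```

The ADDED check runs before the MODIFIED check because added files also carry chnged=1.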
3. Change Detection
Cambria detects file changes by comparing content hashes:
- Size Check: If file size differs from baseline, mark as changed
- Hash Comparison: Compute SHA-256 and compare to baseline hash
- Update VFILE: Set the chnged status, mtime, and size columns
The implementation currently always hashes files for accuracy. Future optimizations may use mtime as a hint.
Database Schema
VFILE Table
CREATE TABLE IF NOT EXISTS vfile(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    -- Version tracking
    vid INTEGER NOT NULL REFERENCES manifest(rid), -- Checked-out manifest (baseline)
    rid INTEGER REFERENCES blob(rid), -- File's blob in baseline manifest
    mrid INTEGER REFERENCES blob(rid), -- File's blob if merged from another version
    -- File identification
    pathname TEXT NOT NULL COLLATE NOCASE, -- Relative path from working directory root
    origname TEXT COLLATE NOCASE, -- Original name if renamed (for merge tracking)
    -- File metadata
    is_exe BOOLEAN NOT NULL DEFAULT 0, -- Executable permission
    is_link BOOLEAN NOT NULL DEFAULT 0, -- Symbolic link
    -- State tracking
    chnged INTEGER NOT NULL DEFAULT 0, -- Change state (0-9)
    deleted BOOLEAN NOT NULL DEFAULT 0, -- Marked for deletion
    -- Hash tracking (for change detection)
    mhash TEXT, -- Hash from merged version
    -- Performance optimization
    mtime INTEGER, -- Last modification time (for quick checks)
    size INTEGER, -- File size (for quick checks)
    UNIQUE(vid, pathname)
);
CREATE INDEX IF NOT EXISTS vfile_vid_idx ON vfile(vid);
CREATE INDEX IF NOT EXISTS vfile_pathname_idx ON vfile(pathname);
CHNGED Values (from Fossil's vfile_check_signature):
- 0 = No change (clean)
- 1 = Edited
- 2 = Changed due to merge
- 3 = Added by merge
- 4 = Changed due to integrate merge
- 5 = Added by integrate merge
- 6 = Became executable (content unchanged)
- 7 = Became symlink (content unchanged)
- 8 = Lost executable status (content unchanged)
- 9 = Lost symlink status (content unchanged)
Current Implementation: Supports states 0 (clean) and 1 (edited). Merge states (2-5) and permission-only changes (6-9) are reserved for future implementation.
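For quick reference when debugging, the ten values can be turned into a small lookup. This is a sketch only: `chngedText`, `describeChnged`, and `supportedInPhase15` are hypothetical names, and the descriptions are copied from the list above.

```go
package main

import "fmt"

// chngedText lists the ten chnged values in numeric order.
// Hypothetical table for illustration.
var chngedText = [...]string{
	"no change (clean)",
	"edited",
	"changed due to merge",
	"added by merge",
	"changed due to integrate merge",
	"added by integrate merge",
	"became executable (content unchanged)",
	"became symlink (content unchanged)",
	"lost executable status (content unchanged)",
	"lost symlink status (content unchanged)",
}

// describeChnged returns the description for a chnged code,
// or an error when the code is outside 0-9.
func describeChnged(code int) (string, error) {
	if code < 0 || code >= len(chngedText) {
		return "", fmt.Errorf("unknown chnged value %d", code)
	}
	return chngedText[code], nil
}

// supportedInPhase15 reports whether the code is one of the two states
// the document says are currently supported (0 and 1).
func supportedInPhase15(code int) bool {
	return code == 0 || code == 1
}

func main() {
	for code := 0; code <= 9; code++ {
		text, _ := describeChnged(code)
		fmt.Printf("%d: %s (supported: %v)\n", code, text, supportedInPhase15(code))
	}
}
```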
VMERGE Table
CREATE TABLE IF NOT EXISTS vmerge(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    -- Merge identification
    merge INTEGER NOT NULL REFERENCES manifest(rid), -- Manifest being merged in
    mhash TEXT NOT NULL, -- Hash of merge manifest (denormalized)
    -- Merge type and status
    merge_type INTEGER NOT NULL DEFAULT 0, -- 0=normal, 1=integrate, 2=cherrypick
    is_baseline BOOLEAN NOT NULL DEFAULT 0, -- True if this is the baseline merge
    UNIQUE(merge)
);
CREATE INDEX IF NOT EXISTS vmerge_merge_idx ON vmerge(merge);
Current Status: Schema defined but not yet used. Merge operations will be implemented in a future phase.
Implementation Details
Core Data Structures
File: pkg/vcs/vfile.go
// VFileEntry represents a file in the working directory
type VFileEntry struct {
	ID       int
	Vid      int  // Checked-out manifest RID
	Rid      *int // File blob RID in baseline (NULL for added files)
	Mrid     *int // File blob RID if merged
	Pathname string
	Origname *string
	IsExe    bool
	IsLink   bool
	Chnged   int // Change status (0-9)
	Deleted  bool
	Mhash    *string
	Mtime    *int64
	Size     *int64
}
// VFileStatus represents the change state
type VFileStatus int
const (
	VFileClean  VFileStatus = 0 // No change
	VFileEdited VFileStatus = 1 // Edited
	VFileMerged VFileStatus = 2 // Changed due to merge
	// ... (all 10 states defined)
)
// VMergeEntry represents a merge in progress
type VMergeEntry struct {
	ID         int
	Merge      int    // Manifest RID being merged
	Mhash      string // Hash of merge manifest
	MergeType  int    // 0=normal, 1=integrate, 2=cherrypick
	IsBaseline bool   // True if this is the baseline
}
Core Functions
LoadVFileFromManifest
Function: LoadVFileFromManifest(tx store.DBTX, vid int) error
Populates the vfile table from a manifest. Called during checkout to establish the baseline.
Implementation:
- Clears existing vfile entries for this vid
- Retrieves manifest content from blob table
- Parses manifest to get file list
- For each file:
- Looks up file's blob RID by UUID
- Inserts row into vfile table
- Sets pathname, rid, is_exe, is_link
- Sets chnged=0, deleted=0 (clean checkout)
Location: pkg/vcs/vfile.go:48-111
CheckVFileSignatures
Function: CheckVFileSignatures(tx store.DBTX, vid int, workDir string, opts *CheckSignatureOpts) error
Detects file changes by comparing disk state to baseline. Updates vfile table with change status.
Implementation:
- Queries all vfile entries for vid
- For each entry:
- Checks if file exists on disk (using os.Lstat)
- If missing, marks as changed (chnged=1)
- If exists, reads file content and computes hash
- Compares hash to baseline (from blob table)
- Updates chnged status, mtime, and size
Options:
type CheckSignatureOpts struct {
	Hash bool // Always hash (for future mtime optimization)
}
Location: pkg/vcs/vfile.go:114-188
WriteVFileToDisk
Function: WriteVFileToDisk(tx store.DBTX, vid int, workDir string, opts *WriteToDiskOpts) error
Writes files from vfile table to disk. Used during checkout operations.
Implementation:
- Queries vfile entries for vid where rid > 0
- For each entry:
- Loads blob content from database
- Validates file path (security check)
- Checks if file exists (honors Force option)
- Writes content to disk
- Sets file permissions (executable bit only; symlinks are not yet supported)
Options:
type WriteToDiskOpts struct {
	Force bool // Overwrite existing files
}
Location: pkg/vcs/vfile.go:191-259
Helper Functions
GetVFileEntry - Retrieves a single vfile entry by pathname
Location: pkg/vcs/vfile.go:262-312
GetCurrentCheckout - Gets the current checked-out manifest RID and UUID
Location: pkg/vcs/vfile.go:315-337
SetCurrentCheckout - Sets the current checkout configuration
Location: pkg/vcs/vfile.go:340-355
validatePath - Validates file paths for security (prevents directory traversal)
Location: pkg/vcs/vfile.go:358-376
VCS Integration
Checkout Integration
File: pkg/vcs/checkout.go
Changes:
- Wrapped all operations in a transaction
- After loading manifest, calls LoadVFileFromManifest()
- Calls WriteVFileToDisk() with Force option support
- Sets current checkout via SetCurrentCheckout()
Key Code:
// Load vfile table from manifest
err = LoadVFileFromManifest(tx, manifest.RID)
// Write files to disk from vfile
writeOpts := &WriteToDiskOpts{Force: opts.Force}
err = WriteVFileToDisk(tx, manifest.RID, root, writeOpts)
// Set current checkout configuration
err = SetCurrentCheckout(tx, manifest.RID, manifestUUID)
Location: pkg/vcs/checkout.go:35-132
Add Integration
File: pkg/vcs/add.go
Changes:
- Gets current checkout (vid) from config
- For each file to add:
- Validates path (security)
- Checks if already in vfile
- If exists and deleted, resurrects file
- If exists and clean, reports already tracked
- If new, inserts vfile entry with rid=NULL, chnged=1
- Handles no-checkout case by writing blobs for later use
Key Code:
// Insert new vfile entry for added file
_, err = tx.Exec(`
	INSERT INTO vfile(vid, pathname, chnged, deleted, is_exe, is_link, rid, mrid)
	VALUES (?, ?, ?, 0, ?, ?, NULL, NULL)
`, vid, path, int(VFileEdited), isExe, isLink)
Location: pkg/vcs/add.go:46-176
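The three outcomes listed for each file passed to add (insert, resurrect, already tracked) can be summarized as a small pure function. This is a sketch with hypothetical names (`entry`, `addAction`, `decideAdd`). Treating every existing, non-deleted entry as "already tracked" is an assumption, since the text only describes the clean case.

```go
package main

import "fmt"

// addAction is the outcome of adding one path. Hypothetical type.
type addAction int

const (
	actionInsert         addAction = iota // no vfile row yet: insert with rid=NULL, chnged=1
	actionResurrect                       // row exists but is marked deleted
	actionAlreadyTracked                  // row exists and is not deleted
)

func (a addAction) String() string {
	switch a {
	case actionInsert:
		return "insert"
	case actionResurrect:
		return "resurrect"
	default:
		return "already tracked"
	}
}

// entry carries the only vfile column the decision needs.
type entry struct {
	Deleted bool
}

// decideAdd picks the action for a path given its existing vfile row,
// or nil when the path has no row.
func decideAdd(existing *entry) addAction {
	switch {
	case existing == nil:
		return actionInsert
	case existing.Deleted:
		return actionResurrect
	default:
		return actionAlreadyTracked
	}
}

func main() {
	fmt.Println(decideAdd(nil))
	fmt.Println(decideAdd(&entry{Deleted: true}))
	fmt.Println(decideAdd(&entry{}))
}
```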
Scan Integration
File: pkg/vcs/workdir.go
Changes:
- Gets current checkout (vid)
- Calls CheckVFileSignatures() to update change detection
- Queries vfile table instead of manifest directly
- Determines file status based on vfile state:
- deleted=true → StatusDeleted
- rid=NULL → StatusAdded
- chnged>0 → StatusModified, or StatusDeleted if the file is missing on disk
- chnged=0 → StatusClean
- Scans filesystem for untracked files not in vfile
Key Code:
// Check file signatures to update vfile table
err = CheckVFileSignatures(tx, vid, w.root, nil)
// Query vfile entries
rows, err := tx.Query(`
	SELECT v.pathname, v.chnged, v.deleted, v.rid, b.uuid, b.size
	FROM vfile v
	LEFT JOIN blob b ON v.rid = b.rid
	WHERE v.vid = ?
`, vid)
// Determine status based on vfile state
if deleted {
	wf.Status = StatusDeleted
} else if chnged > 0 {
	if _, err := os.Lstat(fullPath); os.IsNotExist(err) {
		wf.Status = StatusDeleted
	} else {
		wf.Status = StatusModified
	}
}
Location: pkg/vcs/workdir.go:48-198
Commit Integration
File: pkg/vcs/checkin.go
Changes:
- Gets current checkout (vid, currentUUID)
- Calls CheckVFileSignatures() to detect changes
- Queries vfile for changed and tracked files
- For each file:
- Skips deleted files
- Reads content from disk (treats missing files as deleted)
- Writes blob and adds to commit file list
- Calls Checkin() to create new manifest
- Updates vfile to new baseline:
- Calls LoadVFileFromManifest() with new manifest RID
- Updates current checkout config
Key Code:
// Query vfile for changed files
rows, err := tx.Query(`
	SELECT pathname, chnged, deleted, rid, is_exe, is_link
	FROM vfile
	WHERE vid = ?
`, vid)
// Skip deleted files
if deleted {
	hasChanges = true
	continue
}
// After creating new manifest, update vfile baseline
newManifest, _ := store.GetManifestByUUID(tx2, newManifestUUID)
LoadVFileFromManifest(tx2, newManifest.RID)
SetCurrentCheckout(tx2, newManifest.RID, newManifestUUID)
Location: pkg/vcs/checkin.go:218-375
Testing
Unit Tests
File: pkg/vcs/vfile_test.go (481 lines, 8 tests)
Comprehensive tests for all vfile core functions:
- TestLoadVFileFromManifest - Verifies vfile population from manifest
- TestCheckVFileSignatures_Clean - Tests clean file detection
- TestCheckVFileSignatures_Modified - Tests modified file detection
- TestCheckVFileSignatures_Deleted - Tests deleted file detection
- TestWriteVFileToDisk - Tests file writing from vfile
- TestGetVFileEntry - Tests single entry retrieval
- TestGetSetCurrentCheckout - Tests checkout config operations
- TestValidatePath - Tests path security validation
Coverage: All core vfile functions tested with positive and negative cases.
Integration Tests
File: pkg/vcs/vfile_integration_test.go (400 lines, 3 tests)
End-to-end workflow tests:
- TestVFileIntegration_CompleteWorkflow - Complete checkout → modify → add → commit → checkout cycle
- TestVFileIntegration_DeletedFiles - Handling of file deletion and commit
- TestVFileIntegration_ModifyAndRevert - Modify file and force checkout to revert
Coverage: All VCS operations tested with vfile integration.
Test Results
All 31 VCS tests passing:
- 8 vfile unit tests ✅
- 3 vfile integration tests ✅
- 13 existing VCS tests ✅
- 7 other VCS operation tests ✅
No race conditions detected: go test -race ./... passes
Security Features
Path Traversal Protection
All vfile operations validate file paths using validatePath():
func validatePath(path string) error {
	// No absolute paths
	if filepath.IsAbs(path) {
		return fmt.Errorf("absolute paths not allowed: %s", path)
	}
	// No ".." anywhere in the path. This is a substring check, so it is
	// deliberately strict: a name such as "a..b" is rejected as well.
	if strings.Contains(path, "..") {
		return fmt.Errorf("path traversal not allowed: %s", path)
	}
	// Re-check the cleaned path as a second line of defense
	clean := filepath.Clean(path)
	if strings.HasPrefix(clean, "..") || strings.HasPrefix(clean, "/") {
		return fmt.Errorf("invalid path: %s", path)
	}
	return nil
}
This prevents:
- Directory traversal attacks using ".."
- Absolute path manipulation
- Path escape attempts
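Because the check for ".." is a substring match, it also rejects harmless names such as notes..txt. If that ever proves too strict, a component-wise variant is one option. The sketch below is not the current implementation: `validatePathByComponent` is a hypothetical name, and it rejects only path components that are exactly "..".

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// validatePathByComponent rejects empty paths, absolute paths, and any
// path with a component equal to "..". Hypothetical variant for comparison.
func validatePathByComponent(path string) error {
	if path == "" {
		return fmt.Errorf("empty path")
	}
	slashed := filepath.ToSlash(path)
	if filepath.IsAbs(path) || strings.HasPrefix(slashed, "/") {
		return fmt.Errorf("absolute paths not allowed: %s", path)
	}
	for _, part := range strings.Split(slashed, "/") {
		if part == ".." {
			return fmt.Errorf("path traversal not allowed: %s", path)
		}
	}
	return nil
}

func main() {
	for _, p := range []string{"src/main.go", "notes..txt", "a/../b", "../x", "/etc/passwd"} {
		fmt.Printf("%-14q accepted=%v\n", p, validatePathByComponent(p) == nil)
	}
}
```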
Symlink Handling
Current implementation:
- Uses os.Lstat() instead of os.Stat() to avoid following symlinks
- Tracks is_link flag in vfile table
- Full symlink support deferred to future phase
Performance Characteristics
Current Implementation
- Change Detection: Always hashes files for accuracy
- Transaction Isolation: All vfile operations wrapped in SQLite transactions
- Index Usage: vfile(vid, pathname) unique index for fast lookups
Future Optimizations
Planned for future phases:
- Mtime-based Change Detection: Skip hash computation if mtime unchanged
- Batch Signature Checks: Process multiple files in parallel
- Incremental Updates: Only check files that might have changed
Scalability Targets
Expected performance (to be measured):
- 1,000 files: Sub-second status check
- 10,000 files: < 5 second status check
- 100,000 files: < 30 second status check (with mtime optimization)
Known Limitations
Current Phase 1.5 Scope
The following features are not yet implemented:
- Merge Operations: vmerge table defined but not used
- Permission-only Changes: Change states 6-9 (executable/symlink permission changes)
- Merge States: Change states 2-5 (merge-related changes)
- Symlink Content: Symlinks tracked but not fully supported
- Mtime Optimization: All files always hashed
Future Work
These limitations will be addressed in future phases:
- Phase 2+: Merge support using vmerge table
- Phase 3+: Performance optimizations (mtime, parallel hashing)
- Phase 4+: Full symlink support
Fossil Vfile Functions → Cambria Mapping
| Fossil Function | Cambria Equivalent | Status |
|---|---|---|
| load_vfile_from_rid() | LoadVFileFromManifest() | ✅ Implemented |
| vfile_check_signature() | CheckVFileSignatures() | ✅ Implemented |
| vfile_to_disk() | WriteVFileToDisk() | ✅ Implemented |
| vfile_scan() | Part of Scan() | ✅ Integrated |
| vfile_aggregate_checksum_disk() | - | Future (validation) |
| vfile_compare_repository_to_disk() | - | Future (fsck) |
Success Criteria
Phase 1.5 Complete ✅
All success criteria met:
- ✅ vfile and vmerge tables defined in schema
- ✅ LoadVFileFromManifest() implemented and tested
- ✅ CheckVFileSignatures() implemented and tested
- ✅ WriteVFileToDisk() implemented and tested
- ✅ Checkout() populates vfile
- ✅ Add() updates vfile
- ✅ Scan() uses vfile
- ✅ Commit() reads from vfile and updates baseline
- ✅ All unit tests passing (8/8)
- ✅ All integration tests passing (3/3)
- ✅ No race conditions detected
- ✅ Path security validated
- ✅ All 31 VCS tests passing
Related Documentation
- CAMBRIA_CLI_PLAN.md - CLI commands (Phase 2, next)
- AGENTS.md - Overall architecture and agent guide
- Fossil vfile.c - Reference implementation
Summary
The VFILE system implementation is complete and ready for use. All core operations (checkout, add, commit, scan) now use the vfile table for efficient working directory state tracking. The implementation provides:
- Reliable change detection via content hashing
- Transactional safety via SQLite transactions
- Path security via traversal protection
- Clean integration with existing VCS operations
- Comprehensive test coverage with unit and integration tests
The next phase (Phase 2) will implement the CLI using these vfile-backed operations.
Implementation Complete: 2025-12-14
Next Phase: CLI Implementation (CAMBRIA_CLI_PLAN.md)