perf: use CoW (copy-on-write) cloning for directory copies #122

Open
arwtyxouymz wants to merge 2 commits into coderabbitai:main from arwtyxouymz:feat/faster-copy

@arwtyxouymz arwtyxouymz commented Feb 13, 2026

Summary

  • Add _fast_copy_dir() to lib/copy.sh that uses CoW file cloning when supported by the filesystem
    • macOS APFS: cp -cRP (clone) with fallback to cp -RP
    • Linux Btrfs/XFS: cp --reflink=auto -RP (auto-fallback built in)
    • Other filesystems: standard cp -RP (no behavior change)
  • Replace cp -RP in copy_directories() with _fast_copy_dir
  • Add 3 BATS tests for _fast_copy_dir (contents, symlinks, error handling)
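
The platform dispatch listed above could look roughly like the sketch below. It is not the code from lib/copy.sh, which this conversation does not show in full. `fast_copy_dir_sketch` is a hypothetical name, the OS check via `uname -s` is an assumption (the PR relies on the project's own `detect_os` helper), and the Linux branch assumes GNU cp.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the dispatch; not the PR's actual _fast_copy_dir.
fast_copy_dir_sketch() {
  local src="$1" dest="$2"
  case "$(uname -s)" in
    Darwin)
      # -c asks macOS cp to clone files via clonefile(2);
      # retry as a plain recursive copy if that fails.
      cp -cRP "$src" "$dest" 2>/dev/null || cp -RP "$src" "$dest"
      ;;
    Linux)
      # GNU cp: --reflink=auto clones where the filesystem supports it
      # and silently falls back to a normal copy otherwise.
      cp --reflink=auto -RP "$src" "$dest"
      ;;
    *)
      # Anything else keeps the existing behavior.
      cp -RP "$src" "$dest"
      ;;
  esac
}
```

Called as `fast_copy_dir_sketch "$src/mydir" "$dst/"`, this places the copy at `$dst/mydir`, the same layout `cp -RP` produces when the destination directory already exists. The `-P` flag keeps symlinks as symlinks in every branch.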

Motivation

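Copying large directories like node_modules or .venv via gtr.copy.includeDirs can be slow. On CoW-capable filesystems (APFS, Btrfs, XFS), cloning a file is near-instant regardless of its size, because only metadata is written and the data blocks are shared until one copy is modified. The total time therefore scales with the number of files rather than the volume of data.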

Risk

  • No behavior change on non-CoW filesystems (ext4, NTFS, etc.): every path falls back to the current cp -RP behavior
  • Individual file copies (copy_patterns) are intentionally unchanged, as CoW overhead outweighs benefit for small config files

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Chores

    • Optimized directory copying to use copy-on-write where supported, with safe fallbacks and cleanup for partial copies.
    • Improved symlink preservation and logging during copy operations.
  • Tests

    • Added safety tests verifying directory copy behavior, symlink preservation, and handling of missing sources.

Add _fast_copy_dir() that leverages filesystem-level cloning
(macOS APFS cp -c, Linux cp --reflink=auto) for near-instant
directory copies on supported filesystems, with automatic
fallback to standard cp -RP on ext4/NTFS/others.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot commented Feb 13, 2026

Walkthrough

Adds _fast_copy_dir() to perform Copy-on-Write-capable directory copies with OS-specific commands and fallbacks, updates copy_directories() to use it, and adds tests validating content copying, symlink preservation, and error handling.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Copy-on-Write Implementation: lib/copy.sh | Adds _fast_copy_dir() which detects OS once (cached), attempts CoW copies (macOS: cp -cRP then fallback; Linux: cp --reflink=auto -RP; others: cp -RP), cleans partial macOS targets on failure, and preserves symlinks. Replaces direct cp -RP call in copy_directories() with _fast_copy_dir. |
| Testing: tests/copy_safety.bats | Adds tests sourcing platform script, teardown cleanup, and three _fast_copy_dir() tests: directory/structure copying, symlink target preservation, and failure when source missing. |

Sequence Diagram(s)

sequenceDiagram
  participant Caller as Script (copy_directories)
  participant Helper as _fast_copy_dir()
  participant OS as OS `cp` command
  participant FS as Filesystem

  Caller->>Helper: request copy(src, dest)
  Helper->>Helper: detect/cached OS
  Helper->>OS: run CoW-aware cp (platform-specific)
  OS->>FS: perform copy (reflink or full copy)
  alt macOS CoW failed
    Helper->>FS: remove partial dst
    Helper->>OS: run standard cp fallback
    OS->>FS: perform full copy
  end
  Helper->>Caller: return success/failure, log outcome

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 I hopped and nudged a copy bright,

CoW magic in the pale moonlight.
Symlinks safe and folders neat,
Fast and tidy—what a treat! 🥕

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: introducing Copy-on-Write cloning for directory copies as a performance optimization. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into main |

No actionable comments were generated in the recent review. 🎉


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@lib/copy.sh`:
- Around line 88-93: The darwin branch can leave a partially-populated
destination when cp -cRP "$src" "$dest" fails, so modify the fallback to clean
the destination before retrying: after the failed cp -cRP "$src" "$dest"
attempt, detect and remove the partially-created destination (e.g., rm -rf
"$dest" only when it was created by the failed attempt) or perform the initial
clone into a temporary directory and then atomically rename/move it into
"$dest", then run cp -RP "$src" "$dest" as the fallback; update the darwin case
where cp -cRP and cp -RP are invoked to implement this cleanup/temporal-staging
logic.
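
One way to read the cleanup variant of this suggestion is sketched below. It is an illustration under assumptions, not the change that was committed: `_try_clone` and `copy_with_cleanup` are hypothetical names, the destination is assumed to be an existing parent directory (as in the PR's tests), and the non-macOS branch assumes GNU cp so the sketch can be exercised elsewhere.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of "clean up, then fall back"; names are illustrative.

# Attempt a strict CoW clone of directory $1 into directory $2.
# Assumes macOS cp (-c) or GNU cp (--reflink=always).
_try_clone() {
  case "$(uname -s)" in
    Darwin) cp -cRP "$1" "$2" 2>/dev/null ;;
    *)      cp --reflink=always -RP "$1" "$2" 2>/dev/null ;;
  esac
}

# Copy directory $1 into existing directory $2, cloning when possible.
copy_with_cleanup() {
  local src="${1%/}" dest="${2%/}"
  local target
  target="$dest/$(basename "$src")"

  # Remember whether the target existed before we touched it.
  local preexisting=0
  [ -e "$target" ] && preexisting=1

  if _try_clone "$src" "$dest/"; then
    return 0
  fi

  # Only delete output that the failed clone attempt itself created.
  if [ "$preexisting" -eq 0 ]; then
    rm -rf "$target"
  fi

  # Fall back to a regular recursive copy, preserving symlinks.
  cp -RP "$src" "$dest/"
}
```

The `preexisting` check is what keeps the `rm -rf` from deleting a directory the user already had at the target path. The temporary-staging alternative mentioned in the comment avoids that question entirely, at the cost of an extra rename.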
🧹 Nitpick comments (2)
lib/copy.sh (1)

84-85: detect_os is invoked on every call; consider caching for loop-heavy callers.

copy_directories calls _fast_copy_dir once per matched directory. Each invocation spawns a subshell for detect_os. You could cache the result in a module-level variable on first call.

Proposed optimization
+# Cached OS value for _fast_copy_dir; set on first call.
+_fast_copy_os=""
+
 _fast_copy_dir() {
   local src="$1" dest="$2"
-  local os
-  os=$(detect_os)
+  if [ -z "$_fast_copy_os" ]; then
+    _fast_copy_os=$(detect_os)
+  fi
+  local os="$_fast_copy_os"
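
The effect of the cache can be seen in isolation with a small stand-alone sketch. The `detect_os` below is a hypothetical stand-in (the project's real helper may return different values), and `ensure_os_cached` and `_cached_os` are illustrative names, not identifiers from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the project's detect_os; the real one may differ.
detect_os() {
  case "$(uname -s)" in
    Darwin) echo "darwin" ;;
    Linux)  echo "linux" ;;
    *)      echo "unknown" ;;
  esac
}

# The cache is a global so it survives across calls in the same shell.
_cached_os=""

# Fill _cached_os on the first call; later calls skip the subshell entirely.
ensure_os_cached() {
  if [ -z "$_cached_os" ]; then
    _cached_os=$(detect_os)
  fi
}

ensure_os_cached   # first call: runs detect_os in a command substitution
ensure_os_cached   # second call: reuses the cached value
echo "os=$_cached_os"
```

The cache only helps when the caching function runs in the current shell. If the caller wraps it in `$(...)`, the assignment happens in a subshell and is lost when that subshell exits.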
tests/copy_safety.bats (1)

89-101: Temp directory cleanup won't run if an assertion fails.

If [ -f "$dst/mydir/sub/file.txt" ] fails on line 98, line 100 (rm -rf) is never reached. The same pattern appears in the other two tests. Consider using a BATS teardown function or the $BATS_TEST_TMPDIR built-in (BATS 1.4+) for automatic cleanup.

Example using teardown
+teardown() {
+  # Clean up any temp dirs created during tests
+  if [ -n "${_test_tmpdir:-}" ]; then rm -rf "$_test_tmpdir"; fi
+}
+
 @test "_fast_copy_dir copies directory contents" {
-  local src dst
-  src=$(mktemp -d)
-  dst=$(mktemp -d)
+  _test_tmpdir=$(mktemp -d)
+  local src="$_test_tmpdir/src" dst="$_test_tmpdir/dst"
+  mkdir -p "$src" "$dst"
   mkdir -p "$src/mydir/sub"
   echo "hello" > "$src/mydir/sub/file.txt"
 
   _fast_copy_dir "$src/mydir" "$dst/"
 
   [ -f "$dst/mydir/sub/file.txt" ]
   [ "$(cat "$dst/mydir/sub/file.txt")" = "hello" ]
-  rm -rf "$src" "$dst"
 }
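
The same guarantee can be illustrated outside Bats with an EXIT trap. This is a plain-shell analogue of the teardown idea, not Bats code; `run_check` and its `path_file` argument are hypothetical and exist only so the caller can confirm that the temp directory is gone afterwards.

```shell
#!/usr/bin/env bash
# Plain-shell analogue (not Bats): cleanup registered up front still runs
# when a later check fails. run_check is a hypothetical helper.
run_check() {
  local expected="$1" path_file="$2"
  (
    tmp=$(mktemp -d) || exit 1
    # Fires whenever this subshell exits, whether the check passed or failed.
    trap 'rm -rf "$tmp"' EXIT
    # Record the temp path so the caller can verify it was removed.
    printf '%s\n' "$tmp" > "$path_file"
    echo "hello" > "$tmp/file.txt"
    # A failed check exits the subshell immediately; the trap still cleans up.
    [ "$(cat "$tmp/file.txt")" = "$expected" ] || exit 1
  )
}
```

With cleanup placed at the end of the body instead, a failing check would exit before the `rm -rf` line and leave the directory behind, which is the situation the review comment describes.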

- Clean up partial clone output before fallback in Darwin _fast_copy_dir
- Cache detect_os result to avoid repeated subshell calls
- Use teardown function in copy_safety tests for reliable tmpdir cleanup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

1 participant