Complete Guide: Setting Up ZFS on a USB Drive for Backups

A walkthrough for partitioning a 10TB USB drive and creating and tuning a ZFS pool on it for daily backups, with a focus on performance, reliability, and efficiency.


1. Pre-Installation Checks

Before proceeding, ensure:
- USB drive is connected and detected
- ZFS utilities are installed

1.1 Install ZFS (if not already installed)

sudo apt update
sudo apt install -y zfsutils-linux

1.2 Identify the USB Drive

Find the correct device name:

lsblk -o NAME,SIZE,MODEL,MOUNTPOINT

The 10TB drive should look like /dev/sdX (where X is the correct letter).
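
Because the wipe step later on is destructive, it can help to narrow the list down to USB-attached disks first. The sketch below assumes a util-linux lsblk that supports the TRAN (transport) column; usb_disks is a helper name made up for this example.

```shell
#!/bin/bash
# Filter "NAME SIZE TRAN" lines (as printed by lsblk) down to USB disks.
# usb_disks is a hypothetical helper name, not part of any tool.
usb_disks() {
    awk '$3 == "usb" { print "/dev/" $1 " (" $2 ")" }'
}

# Usage: list whole disks only (-d), without a header line (-n).
# lsblk -dn -o NAME,SIZE,TRAN | usb_disks
```

Anything the filter prints is a candidate for /dev/sdX; compare its size and model against the drive before going further.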


2. Prepare the Drive (Partitioning & Formatting)

ZFS can use raw disks, but partitioning the drive first is recommended for flexibility.

2.1 Wipe Existing Data (Optional)

To ensure a clean setup, erase any existing partitions:

sudo wipefs -a /dev/sdX
sudo dd if=/dev/zero of=/dev/sdX bs=1M count=100 status=progress
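
Before running the two commands above, a quick guard against wiping a disk that is still in use may be worthwhile. This is a minimal sketch: has_mounts is a hypothetical helper that only inspects the MOUNTPOINT column of lsblk, so it will not notice other users of the disk (for example an imported ZFS pool without a mountpoint).

```shell
#!/bin/bash
# Succeeds if the lsblk MOUNTPOINT output on stdin contains at least one
# non-empty line, i.e. something on the device is currently mounted.
# has_mounts is a hypothetical helper name.
has_mounts() {
    grep -q '[^[:space:]]'
}

# Usage (replace /dev/sdX with the real device before running):
# if lsblk -n -o MOUNTPOINT /dev/sdX | has_mounts; then
#     echo "Refusing to wipe: /dev/sdX has mounted partitions" >&2
#     exit 1
# fi
```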

2.2 Create a Single GPT Partition

Use parted to create a GPT partition spanning the whole drive:

sudo parted /dev/sdX --script mklabel gpt
sudo parted /dev/sdX --script mkpart primary 0% 100%

Get the partition name (likely /dev/sdX1):

lsblk -o NAME,SIZE,MODEL,MOUNTPOINT

3. Create & Optimize the ZFS Pool

Now, we create a ZFS pool with the best options for a single USB backup drive.

3.1 Create the ZFS Pool with Optimized Settings

sudo zpool create -o ashift=12 \
      -o autotrim=on \
      -O compression=lz4 \
      -O atime=off \
      -O sync=disabled \
      -O recordsize=1M \
      -O xattr=sa \
      -O acltype=posix \
      -m /backup \
      backup_pool /dev/sdX1

Explanation of Options:

Option               Purpose
-o ashift=12         Optimizes for 4K sector drives (prevents misaligned writes).
-o autotrim=on       Enables automatic TRIM for SSDs (harmless for HDDs).
-O compression=lz4   Fast compression to reduce space usage and speed up writes.
-O atime=off         Disables access time updates (reduces unnecessary writes).
-O sync=disabled     Treats synchronous writes as asynchronous, which improves write performance for backups only (⚠️ use with care: the most recent writes can be lost on power loss or a sudden unplug).
-O recordsize=1M     Optimizes for large files (suitable for backups).
-O xattr=sa          Stores extended attributes more efficiently.
-O acltype=posix     Enables POSIX ACLs for better file permissions.
-m /backup           Mounts the pool at /backup.
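
USB disks do not always keep the same /dev/sdX letter after a replug or reboot, so a common alternative is to give zpool create a stable /dev/disk/by-id path instead of /dev/sdX1. The sketch below looks that path up; by_id_for is a hypothetical helper, and its optional second argument exists only so the lookup directory can be overridden.

```shell
#!/bin/bash
# Print the first /dev/disk/by-id symlink that resolves to the given device.
# by_id_for is a hypothetical helper; $2 overrides the lookup directory.
by_id_for() {
    local target dir link
    target="$(readlink -f "$1")"
    dir="${2:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        [ -e "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$target" ]; then
            printf '%s\n' "$link"
            return 0
        fi
    done
    return 1
}

# Usage: pass the result to zpool create in place of /dev/sdX1.
# by_id_for /dev/sdX1
```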

4. Verify & Check Pool Status

Check that ZFS is properly set up:

zpool status
zfs list

Expected output:

NAME         STATE     READ WRITE CKSUM
backup_pool  ONLINE       0     0     0
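
For unattended checks, the same information is available in script-friendly form from zpool list -H -o name,health. The following sketch prints only the pools that are not ONLINE; unhealthy_pools is a hypothetical helper name.

```shell
#!/bin/bash
# Print every pool whose health is not ONLINE. Input: tab-separated
# "name<TAB>health" lines, as produced by `zpool list -H -o name,health`.
# unhealthy_pools is a hypothetical helper name.
unhealthy_pools() {
    awk -F '\t' '$2 != "ONLINE" { print $1 ": " $2 }'
}

# Usage, e.g. from a cron job that mails any output:
# zpool list -H -o name,health | unhealthy_pools
```

No output means every pool reported ONLINE.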

5. Automate Snapshots & Incremental Backups

5.1 Daily Snapshots

ZFS snapshots are still useful for local protection before each sync with the server. In this setup, backups use rsync instead of zfs send, because the drives on the server are Btrfs and ext4.

Automatically create daily snapshots:

sudo crontab -e

Add the following line:

55 1 * * * /usr/sbin/zfs snapshot backup_pool@pre-rsync-$(date +\%F)

Scheduling the snapshot at 1:55 AM, five minutes ahead of the 2 AM rsync job in 5.2, ensures the snapshot exists before rsync runs. If a backup run goes wrong, the pool can then be rolled back to its pre-rsync state.
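
Two independent cron entries order the snapshot and the rsync run by clock time only. One alternative, sketched here, is a single wrapper that takes the snapshot and starts the backup script only if the snapshot succeeded. snapshot_then_backup and the ZFS_BIN override are assumptions made for this example; the pool name and script path are the ones used in this guide.

```shell
#!/bin/bash
# Take a pre-rsync snapshot of the pool, then run the backup command.
# The backup is skipped if the snapshot command fails (for example because
# a snapshot with today's name already exists).
# ZFS_BIN can be overridden; it defaults to the path used in the cron
# entries of this guide.
snapshot_then_backup() {
    local pool="$1" script="$2"
    local zfs_bin="${ZFS_BIN:-/usr/sbin/zfs}"
    "$zfs_bin" snapshot "${pool}@pre-rsync-$(date +%F)" || return 1
    "$script"
}

# Usage from a root-owned wrapper script:
# snapshot_then_backup backup_pool /usr/local/bin/backup-to-server.sh
```

With a wrapper like this, one cron entry replaces the separate snapshot and backup entries.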


5.2 Incremental Backup Script

Since the server uses Btrfs and Ext4, the ZFS incremental backup method (zfs send/recv) won't work, because it requires both the source and destination to be ZFS.

Alternative Incremental Backup Method (Compatible with Any Filesystem)

Use rsync to efficiently copy only changed files to the ZFS pool on the USB drive.

1. Create an Incremental Backup Script

Save this as /usr/local/bin/backup-to-server.sh:

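#!/bin/bash
set -euo pipefail       # Stop on errors so a failed rsync never updates "latest"

SRC="/data_to_backup/"  # Replace with the source directory you want to back up
DEST="/backup"          # Replace with your ZFS pool mount point on the USB drive
TODAY="$(date +%F)"     # Evaluate the date once so every step uses the same name

# Use rsync with hard links to save space and make incremental backups
rsync -aAXHv --delete --progress \
      --link-dest="$DEST/latest" \
      "$SRC" "$DEST/backup-$TODAY"

# Update the "latest" symlink to point to the newest backup
ln -sfn "$DEST/backup-$TODAY" "$DEST/latest"

# Keep the 7 newest backups and delete the rest. The date-stamped names sort
# chronologically; -mtime is unreliable here because rsync -a copies the
# source directory's modification time onto each backup directory.
find "$DEST" -maxdepth 1 -type d -name "backup-*" | sort | head -n -7 | xargs -r rm -rf --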

Make it executable:

sudo chmod +x /usr/local/bin/backup-to-server.sh

2. Automate Daily Incremental Backups

Edit cron:

sudo crontab -e

Add:

0 2 * * * /usr/local/bin/backup-to-server.sh

This runs at 2 AM daily, copying only changed files to the backup pool on the USB drive.

How This Works

- Uses rsync to copy files while preserving permissions, timestamps, and symlinks.
- Deletes removed files (--delete) to match the source.
- Uses --link-dest to create hard-linked snapshots, saving disk space.
- Maintains a "latest" symlink, making it easy to restore files.
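
To see what --link-dest achieves, compare the same unchanged file in two backup directories: both paths should refer to one inode, so the file occupies space only once. The check below is a small sketch using the shell's -ef test; same_file is a hypothetical helper and the paths in the usage lines are examples only.

```shell
#!/bin/bash
# Succeeds if both paths refer to the same file on disk (same device and
# inode), which is what a hard link created via --link-dest looks like.
# same_file is a hypothetical helper name.
same_file() {
    [ "$1" -ef "$2" ]
}

# Usage (example paths; pick two backup dates and a file that did not change):
# same_file /backup/backup-2025-06-01/notes.txt /backup/backup-2025-06-02/notes.txt \
#     && echo "stored once, linked twice"
```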


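6. Auto-Cleanup Old Snapshots

Keep only the 7 most recent pre-rsync ZFS snapshots and delete the older ones, while leaving rsync backups untouched.

sudo crontab -e

Cron job for cleanup:

0 3 * * * /usr/sbin/zfs list -H -t snapshot -o name -s creation | grep '^backup_pool@pre-rsync-' | head -n -7 | xargs -r -n1 /usr/sbin/zfs destroy

This keeps the 7 most recent snapshots (one week of daily snapshots) while avoiding unnecessary storage bloat. The -H flag drops the header line, -s creation sorts oldest first, and zfs is called by its full path because cron's PATH often omits /usr/sbin.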
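
Since zfs destroy is irreversible, it may be worth previewing which snapshots a cleanup would remove before wiring it into cron. The sketch below isolates the selection step so it can be run on its own; snapshots_to_prune is a hypothetical helper, and it relies on GNU head for the negative line count.

```shell
#!/bin/bash
# Read snapshot names (oldest first) on stdin and print the ones that would
# be destroyed, keeping the newest $2 snapshots whose names start with $1.
# snapshots_to_prune is a hypothetical helper; head -n -N needs GNU coreutils.
snapshots_to_prune() {
    local prefix="$1" keep="$2"
    awk -v p="$prefix" 'index($0, p) == 1' | head -n "-$keep"
}

# Preview only (nothing is destroyed):
# zfs list -H -t snapshot -o name -s creation | snapshots_to_prune backup_pool@pre-rsync- 7
```

Once the preview looks right, piping the same output into xargs -r -n1 zfs destroy performs the cleanup.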


7. Prevent USB Sleep & Power Issues

Since rsync runs incremental backups instead of a full zfs send, the USB drive may be idle for longer periods. If the drive enters sleep mode too aggressively, consider using udev rules for persistent power settings.

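7.1 Disable Drive Sleep

Some USB drives aggressively spin down. Prevent this:

# Set Advanced Power Management to 254 (maximum performance, no spin-down)
sudo hdparm -B 254 /dev/sdX
# Disable the standby (spin-down) timer
sudo hdparm -S 0 /dev/sdX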

7.2 Reduce Write Caching Risks

Reduce the risk of data loss if the USB drive is unplugged suddenly:

echo "write through" | sudo tee /sys/block/sdX/device/scsi_disk/*/cache_type

The cache_type attribute takes a name rather than a number; "write through" selects Write Through mode.

  • Every write goes directly to the disk instead of being cached.
  • Safer for USB drives: If unplugged suddenly, less risk of data loss.
  • Downside: Slightly slower performance.
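
The kernel's sd driver exposes cache_type as a text value such as "write back" or "write through". The sketch below writes "write through" to every cache_type file of one disk and reports failure if none was found; set_write_through is a hypothetical helper, and its optional second argument (the sysfs root) exists only so the function can be tried against a scratch directory.

```shell
#!/bin/bash
# Switch every SCSI disk entry of block device $1 to write-through caching.
# Returns non-zero if no cache_type file could be written.
# set_write_through is a hypothetical helper; $2 overrides /sys/block.
set_write_through() {
    local disk="$1" root="${2:-/sys/block}" f rc=1
    for f in "$root/$disk"/device/scsi_disk/*/cache_type; do
        [ -e "$f" ] || continue
        printf 'write through\n' > "$f" && rc=0
    done
    return "$rc"
}

# Usage (as root; replace sdX with the real device name):
# set_write_through sdX && cat /sys/block/sdX/device/scsi_disk/*/cache_type
```

Some USB bridges reject the change, so reading the value back afterwards is a useful check.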

To Make This Persistent Across Reboots

Settings under /sys/block/sdX reset after a reboot, so apply them from a startup script. Edit /etc/rc.local (or create it if missing):

#!/bin/bash
# rc.local runs as root, so sudo is not needed
echo "write through" | tee /sys/block/sdX/device/scsi_disk/*/cache_type
exit 0

Make it executable:

sudo chmod +x /etc/rc.local

This ensures the setting is applied every time the system starts.

7.2b (alternative) Udev Rule for Write Caching

Instead of writing to /sys/block/sdX from a startup script, use a udev rule:

echo 'ACTION=="add", SUBSYSTEM=="block", SUBSYSTEMS=="usb", KERNEL=="sd?", ATTR{queue/write_cache}="write through"' | sudo tee /etc/udev/rules.d/99-usb-write-cache.rules

This method marks the device as write-through at the block device level. Differences from the sysfs method in 7.2:

  • It applies to all USB drives (not just one specific drive).
  • Works persistently without needing /etc/rc.local.
  • Can be less effective: it changes how the kernel treats the device's cache, not the cache setting in the drive firmware, and the kernel stops issuing cache flushes for a device marked write-through.

8. Mount the ZFS Pool at Boot

ZFS should automatically mount the pool, but ensure it imports on reboot:

sudo systemctl enable zfs-import-cache
sudo systemctl enable zfs-mount
sudo systemctl enable zfs.target

These services are responsible for:

  • zfs-import-cache: Imports the ZFS pools at boot.
  • zfs-mount: Mounts the datasets.
  • zfs.target: Ensures ZFS is correctly integrated into the boot process.

These steps should guarantee that the drive is mounted at /backup on reboot.

If the pool does not mount automatically:

sudo zpool import backup_pool
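
One possible reason for a missed import is that the USB drive had not yet appeared when the import services ran. Under that assumption, a late import can wait for the device node first, as in this sketch; wait_for_path is a hypothetical helper and the by-id name in the usage line is a placeholder.

```shell
#!/bin/bash
# Wait up to $2 seconds (default 30) for path $1 to exist.
# wait_for_path is a hypothetical helper name.
wait_for_path() {
    local path="$1" timeout="${2:-30}" waited=0
    while [ ! -e "$path" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage as root (the by-id name is a placeholder; use the one for your drive):
# wait_for_path /dev/disk/by-id/usb-EXAMPLE-part1 60 && zpool import backup_pool
```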

9. Verify the Setup

Check status:

zpool status backup_pool
zfs list

Test snapshot & backup:

zfs snapshot backup_pool@test
zfs list -t snapshot
zfs destroy backup_pool@test

Conclusion

The result is a ZFS pool tuned for a 10TB USB backup drive:
- Performance: Optimized writes (sync=disabled), large record size (1M), and compression (lz4).
- Reliability: Checksums detect corruption, snapshots provide rollback, and backups are incremental.
- Efficiency: Automated snapshots, incremental transfers, and old snapshot cleanup.


ZFS with one pool and one filesystem

Using ZFS on a server with only one pool and one filesystem (without redundancy) can still offer several advantages, even though you're not leveraging its redundancy features like mirroring (RAID1) or parity (RAIDZ). Here are some benefits:

1. **Data Integrity with Checksumming**

  1. ZFS detects silent data corruption (bit rot) using checksums on all data and metadata.
  2. Even without redundancy, ZFS will alert you if corruption occurs, which most traditional filesystems (like ext4 or XFS) do not do.

2. **Snapshots & Rollbacks**

  1. You can take snapshots of your filesystem, allowing you to revert to a previous state if needed.
  2. Useful for backups, testing, or recovering from accidental deletions.

3. **Compression (Transparent)**

  1. ZFS supports transparent compression (e.g., LZ4, ZSTD), reducing disk space usage and potentially improving performance.

4. **Automatic Space Management with Copy-on-Write (CoW)**

  1. ZFS avoids overwriting existing data until the new data is fully written.
  2. This minimizes data corruption risks from unexpected crashes.

5. **Efficient Storage Management with Datasets**

  1. Even with only one pool (zpool create tank), you can create multiple datasets (zfs create tank/home, zfs create -p tank/var/log, etc.).
  2. Each dataset can have independent quotas, snapshots, and settings.

6. **Native Encryption**

  1. If you need encryption, ZFS provides built-in, per-dataset encryption, so a separate LUKS layer is not required.

7. **Flexible Volume Management**

  1. You can later add redundancy by attaching additional drives (e.g., convert to a mirror).
  2. No need for LVM, as ZFS natively handles volume management.

8. **Adaptive Read Caching (ARC)**

  1. ZFS intelligently caches data in RAM, improving read performance.
  2. If you later add an SSD, you can use it as an L2ARC (secondary cache).

9. **No Need for fsck**

  1. Unlike traditional filesystems, ZFS does not require periodic fsck checks after crashes.
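
Points 5 and 7 above (datasets and adding redundancy later) each map onto a command or two. The sketch below prints those commands instead of running them unless DRY_RUN=0 is set; the run wrapper, the dataset names documents and media, and the second device /dev/sdY1 are all assumptions made for illustration.

```shell
#!/bin/bash
# Print a command (DRY_RUN=1, the default here) or execute it (DRY_RUN=0).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Create separate datasets in one pool, each with its own settings.
# The dataset names are examples only.
make_datasets() {
    local pool="$1"
    run zfs create -o compression=lz4 "$pool/documents"
    run zfs create -o quota=2T "$pool/media"
}

# Turn a single-disk pool into a two-way mirror by attaching a second device
# ($3) to the existing one ($2).
add_mirror() {
    run zpool attach "$1" "$2" "$3"
}

# Usage (dry run):
# make_datasets backup_pool
# add_mirror backup_pool /dev/sdX1 /dev/sdY1
```

Keeping the dry run as the default makes it easy to review the exact commands before anything touches the pool.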

Downsides of Using ZFS Without Redundancy

- Single Point of Failure: If the disk dies, you lose everything.
- Higher RAM Usage: ZFS works best with at least 4GB+ of RAM.
- Slightly Higher Write Overhead: Due to CoW, small writes might be slower compared to ext4 on HDDs.


When the ZFS pool is on a single USB drive used for daily backups, the Single Point of Failure (SPOF) risk is reduced because:

1. Primary Data is on the Server

  1. Even if the USB drive fails, your original data is still available on the server.

2. ZFS Features Benefit Backups

  1. Snapshots: You can use ZFS snapshots to create point-in-time backups.
  2. Incremental Sends: ZFS allows efficient incremental backups using zfs send | zfs recv, reducing backup time and bandwidth.
  3. Compression: You can save space with lz4 or zstd compression.
  4. Data Integrity: If bit rot or corruption occurs, ZFS will detect it.

3. Potential Optimization for USB Backup Use Case

  1. ashift=12: If using an advanced format drive (4K sectors), set ashift=12 at pool creation.
  2. zfs set atime=off: Disables access time updates, improving performance.
  3. zfs set compression=lz4: Saves space without slowing down writes.
  4. zfs set sync=disabled (optional): Improves write performance for backups (use only if acceptable).

Final Thoughts

Using ZFS on a single USB drive for backups makes sense if you want:
- Data integrity (checksums)
- Snapshots and efficient incremental backups
- Compression to save space
However, USB drives can fail, so you may want an additional backup strategy, such as:
- A second backup drive rotated periodically
- Offsite/cloud backup for disaster recovery


Other notes

Enable Compression

Activate LZ4 compression to improve read/write performance (Rdiffweb):

zfs set compression=lz4 rpool/backups

To prevent rdiff-backup from compressing data, use (Rdiffweb):

rdiff-backup --no-compression ...

Optimize Cache Usage

Configure ZFS to store only metadata in its cache (Rdiffweb):

sudo zfs set primarycache=metadata rpool/backups
sudo zfs set secondarycache=metadata rpool/backups

Consider Deduplication

Enable deduplication to save space when identical data blocks are present (Rdiffweb):

sudo zfs set dedup=on rpool/backups

Note: Deduplication increases RAM usage; assess its impact based on your data (Rdiffweb).

Support Non-UTF-8 Characters

Allow non-UTF-8 characters for compatibility with various filesystems (Rdiffweb):

sudo zfs create -o utf8only=off rpool/backups

This setting cannot be changed after dataset creation (Rdiffweb).


Implementing these configurations can optimize ZFS for rdiff-backup, enhancing backup performance and compatibility.


by:
▖     ▘▖▖
▌ ▌▌▛▘▌▙▌
▙▖▙▌▙▖▌ ▌

edited: June 2025
linux/zfs/usb_backup.txt · Last modified: by luci4