
I had to check data integrity due to a recent system switch and was surprised not to find any bitrot after 4+ years.

It took ages to compute and verify those hashes between different disks. Certainly an inconvenience.

I am not sure a NAS is really the right solution for smaller data sets. An SSD for quick hashing and a set of N hashed cold storage HDDs - N depends on your appetite for risk - will do.
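For the curious, here is a minimal sketch of that kind of manual hashing in Python (the manifest name and command-line usage are illustrative placeholders, not a specific tool):

    # hashcheck.py -- minimal sketch of manual file hashing for cold storage
    # (manifest name and CLI are placeholders for illustration)
    import hashlib, os, sys

    MANIFEST = "checksums.sha256"

    def sha256(path):
        # hash the file in 1 MiB chunks to keep memory use flat
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def compute(root):
        # record one "<hash>  <relative path>" line per file
        with open(MANIFEST, "w") as out:
            for dirpath, _, files in os.walk(root):
                for name in files:
                    p = os.path.join(dirpath, name)
                    out.write(f"{sha256(p)}  {os.path.relpath(p, root)}\n")

    def verify(root):
        # re-hash every file listed in the manifest and report mismatches
        bad = 0
        with open(MANIFEST) as f:
            for line in f:
                digest, rel = line.rstrip("\n").split("  ", 1)
                if sha256(os.path.join(root, rel)) != digest:
                    print("MISMATCH:", rel)
                    bad += 1
        print("done,", bad, "mismatches")

    if __name__ == "__main__":
        # usage: python hashcheck.py compute|verify /path/to/data
        {"compute": compute, "verify": verify}[sys.argv[1]](sys.argv[2])

Run "compute" once when the disk is written, keep the manifest alongside the data (and ideally a copy elsewhere), and run "verify" whenever the cold disk is spun up.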



I've hosted my own data for twenty-something years, and bitrot does occur, but it is basically caused by two things:

1) Randomness <- this is rare
2) HW failures <- much more common

So if you catch HW failures early, you can live a long life with very little bitrot... Little != none, so ZFS is really great.


Don't get me wrong: IMHO a ZFS mirror setup sounds very tempting, but its strengths lie in active data storage. Given the rarity of bitrot, I would argue it can be replaced with manual file hashing (and replacing files, if needed), with the disks used in cold-storage mode for months.

What worries me more than bitrot is that consumer disks (with enclosure, SWR) do not give access to SMART values over USB via smartctl. Disk failures are real and have a strong impact on available data redundancy.

Data storage activities are an exercise in paranoia management: What is truly critical data, what can be replaced, what are the failure points in my strategy?


There's no worse backup system than one that is so tedious and complex that it never gets used, except maybe the one that is so poorly documented that it cannot be used.

With ZFS, the hashing happens at every write and the checking happens at every read. It's a built-in. (Sure, it's possible to re-implement the features of ZFS, but why bother? It exists, it works, and it's documented.)

Paranoia? Absolutely. If the disk can't be trusted (as it clearly cannot be -- the only certainty with a hard drive is that it must fail), then how can it be trusted to self-report that it has issues? ZFS catches problems that the disks (themselves inscrutable black boxes) may or may not ever make mention of.

But even then: Anecdotally, I've got a couple of permanently-USB-connected drives attached to the system I'm writing this on. One is a WD Elements drive that I bought a few years ago, and the other is a rather old, small Intel SSD that I use as scratch space with a boring literally-off-the-shelf-at-best-buy USB-SATA adapter.

And they each report a bevy of stats with smartctl, if a person's paranoia steers them to look that way. SMART seems to work just fine with them.
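For anyone whose enclosure doesn't show SMART data out of the box, it is sometimes just a matter of telling smartctl which passthrough to use; a rough sketch (the device path is a placeholder, and some bridges genuinely won't pass SMART through at all):

    # smart_report.py -- sketch: read SMART data through a USB-SATA bridge
    # Assumes smartmontools is installed; "-d sat" selects the SCSI/ATA
    # Translation passthrough that many USB enclosures support.
    import subprocess, sys

    def smart_report(device):
        # smartctl uses non-zero exit codes as warning bitmasks,
        # so don't treat them as hard failures here
        result = subprocess.run(
            ["smartctl", "-a", "-d", "sat", device],
            capture_output=True, text=True
        )
        return result.stdout

    if __name__ == "__main__":
        print(smart_report(sys.argv[1]))  # e.g. /dev/sdb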

(Perhaps-amusingly, according to SMART-reported stats, I've stuffed many, many terabytes through those devices. The Intel SSD in particular is at ~95TBW. There's a popular notion that using USB like this is sure to bring forth Ghostbusters-level mass hysteria, especially in conjunction with such filesystems as ZFS. But because of ZFS, I can say with reasonable certainty that neither drive has ever produced a single data error. The whole contrivance is therefore verified to work just fine [for now, of course]. I would have a lot less certainty of that status if I were using a more-common filesystem.)


I agree about manual file hashing. For data that rarely changes it also has some benefits.

Some time ago, I ended up writing a couple of scripts for managing that kind of checksum files: https://github.com/kalaksi/checksumfile-tools



