
Maybe two different things here: SBCs that run Linux versus microcontrollers (MCUs).

MCUs are lower power, have less overhead, and can perform hard real-time tasks. Most of what Arduino focuses on is MCUs; the Raspberry Pi equivalent is the Pico.

In my experience, the key thing is the library ecosystem for the C++ runtime environment. There are a large number of Arduino and third-party high-level libraries, provided through their package management system, that make it really easy to use sensors and other hardware without needing to write intermediate-level code that speaks SPI or I2C. And it all integrates and works together. The Pico C/C++ SDK is lower level and doesn't have a good library / package management story, so you have to read vendor data sheets to figure out how to communicate with hardware and then write your own libraries.

It’s much more common for less experienced users to use MicroPython. It has a package management and library ecosystem. But it’s also harder to write anything of any complexity that fits within the small available RAM without calling gc.collect() on every other line.
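
For a feel of what that looks like, here's a minimal MicroPython-style sketch (read_sensor() is a placeholder, and gc.mem_free() is MicroPython-specific):

  import gc

  def sample_loop(read_sensor, n=100):
      # Collect readings in a loop, manually invoking the garbage
      # collector to keep the small heap from fragmenting.
      readings = []
      for i in range(n):
          readings.append(read_sensor())
          if i % 10 == 9:
              gc.collect()
      print("mean:", sum(readings) / n)
      print("free bytes:", gc.mem_free())  # MicroPython-only helper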


Yes. One looming concern here is that if the new Arduino is happy locking stuff down, the Arduino IDE story could end up murkier, like the PlatformIO story.

I'm not sure if I'm understanding correctly, but it reminds me of the kernel trick. The distances between the training samples and a target sample are computed, the distances are scaled through a kernel function, and the scaled distances are used as features.

https://en.wikipedia.org/wiki/Kernel_method
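
If I'm reading it right, a minimal numpy sketch of that idea (using the RBF kernel as one common choice) looks like:

  import numpy as np

  def kernel_features(X_train, X_query, gamma=1.0):
      # Map each query sample to a feature vector of kernel-scaled
      # distances to every training sample (RBF kernel as an example).
      # Pairwise squared Euclidean distances, shape (n_query, n_train):
      d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
      # Scale the distances through the kernel function
      return np.exp(-gamma * d2)

  X_train = np.random.rand(50, 3)
  X_query = np.random.rand(5, 3)
  features = kernel_features(X_train, X_query)  # shape (5, 50)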


I really wish you guys would change the name since the product has moved so far away from the goals and concepts in the original publication. :). I love the product and what you are doing -- it's definitely needed and valuable.

We’ve considered it, but at this point we’re kind of stuck. We’d have to rebuild our brand from scratch. But also, a name almost never makes or breaks a company. :)

Also we have a backronym now: Durable Backends, Observable and Simple.


What is the original publication?


This isn’t BTRFS


This might not be directly about btrfs, but bcachefs, zfs, and btrfs are the only filesystems for Linux that provide modern features like transparent compression, snapshots, and CoW.

zfs is out of tree, leaving it an unviable option for many people. This news means that bcachefs is going to be in a very weird state in-kernel, which leaves btrfs as the only other in-tree ‘modern’ filesystem.

This news about bcachefs has ramifications for the state of ‘modern’ FSes in Linux, and I’d say the news about the btrfs maintainer taking a step back is related.


Meh. This war was stale like nine years ago. At this point the originally-beaten horse has decomposed into soil. My general reply to this is:

1. The dm layer gives you cow/snapshots for any filesystem you want already and has for more than a decade. Some implementations actually use it for clever trickery like updates, even. Anyone who has software requirements in this space (as distinct from "wants to yell on the internet about it") is very well served.

2. Compression seems silly in the modern world. Virtually everything is already compressed. To a first approximation, every byte in persistent storage anywhere in the world is in a lossy media format. And the ones that aren't are in some other cooked format. The only workloads where you see significant use of losslessly-compressible data are situations (databases) where you have app-managed storage performance (and which see little value from filesystem choice) or ones (software building, data science, ML training) where lots of ephemeral intermediate files are being produced. And again, those are usages where fancy filesystems are poorly deployed; you're going to throw it all away within hours to days anyway.
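
To see the first point concretely, a few lines of Python: bytes that are already compressed (random bytes look the same to a compressor) don't shrink, while cooked text does:

  import os, zlib

  media_like = os.urandom(1 << 20)          # stands in for JPEG/MP4 bytes
  text_like = b"the quick brown fox " * 50000

  print(len(zlib.compress(media_like)) / len(media_like))  # ~1.0: no gain
  print(len(zlib.compress(text_like)) / len(text_like))    # tiny fraction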

Filesystems are a solved problem. If ZFS disappeared from the world today... really who would even care? Only those of us still around trying to shout on the internet.


For me bcachefs provides a feature no other filesystem on Linux has: automated tiered storage. I've wanted this ever since I got an SSD more than 10 years ago, but filesystems move slow.

A block-level cache like bcache (the cache, not the fs) or dm-cache handles it less than ideally, and doesn't leave the SSD space as usable space. As a home user, 2TB of SSDs is 2TB of space I'd rather have. ZFS's ZIL is similar, not leaving it as usable space. Btrfs has some recent work on differentiating drives so metadata is stored on the faster ones (allocator hints), but that only covers metadata; there is no handling of moving data to HDDs over time. Even Microsoft's ReFS does tiered storage, I believe.

I just want to have 1 or 2 SSDs, with 1 or 2 HDDs in a single filesystem that gets the advantages of SSDs with recently used files and new writes, and moves all the LRU files to the HDDs. And probably keep all the metadata on the SSDs too.


> automated tiered storage. I've wanted this ever since I got an SSD more than 10 years ago, but filesystems move slow.

You were not alone. However, things changed, namely that SSDs continued to become cheaper and grew in capacity. I'd think most active data these days is on SSDs (certainly in most desktops, in most servers that aren't dedicated file or DB servers, and in all mobile and embedded devices), the role of spinning rust being more and more archival (if found in a system at all).


Tiering didn't go away with the migration to all-SSD storage. It just got somewhat hidden. All consumer SSDs are doing tiered storage within the drive, using drive-specific heuristics that are completely undocumented, and host software rarely if ever makes use of features that exist to provide hints to the SSD to allow its tiering/caching to be more intelligent. In the server space, most SSDs aren't doing this kind of caching, but it's definitely not unheard-of.


Yeah, for enterprise where you can have dedicated machines for single use (and $) there probably isn't much appeal. That's why I emphasized as a home user, where all my machines are running various applications.

Also for video games, where performance matters, game sizes are huge, and it's nice to have a bunch of games installed.


Until $/GB drops to comparable to HDDs, large-scale storage will continue to use HDDs.


> Compression seems silly in the modern world. Virtually everything is already compressed.

IIRC my laptop's zpool has a 1.2x compression ratio; it's worth doing. At a previous job, we had over a petabyte of postgres on ZFS and saved real money with compression. Hilariously, on some servers we also improved performance because ZFS could decompress reads faster than the disk could read.


> we also improved performance because ZFS could decompress reads faster than the disk could read

This is my favorite side effect of compression in the right scenarios. I remember getting a huge speed up in a proprietary in-memory data structure by using LZO (or one of those fast algorithms) which outperformed memcpy, and this was already in memory so no disk io involved! And used less than a third of the memory.


The performance gain from compression (replacing IO with compute) is not ironic; it was seen as a feature of the various NAS products that Sun (and after them Oracle) developed around ZFS.


How do you get a PostgreSQL database to grow to one petabyte? The maximum table size is 32 TB o_O


Cumulative; dozens of machines with a combined database size over a PB even though each box only had like 20 TB.


Probably by using partitioning.


I know my own personal anecdote isn’t much, but I’ve noticed pretty good space savings on the order of like 100 GB from zstd compression and CoW on my personal disks with btrfs

As for the snapshots, things like LVM snapshots are pretty coarse, especially for someone like me where I run dm-crypt on top of LVM

I’d say zfs would be pretty well missed with its data integrity features. I’ve heard that btrfs is worse in that aspect, so given that btrfs saved my bacon with a dying ssd, I can only imagine what zfs does.


> Filesystems are a solved problem. If ZFS disappeared from the world today... really who would even care? Only those of us still around trying to shout on the internet.

Yeah nah, have you tried processing terabytes of data every day and storing them? It gets better now with DDR5 but bit flips do actually happen.


Bit flips can happen, and if it’s a problem you should have additional verification above the filesystem layer, even if using ZFS.

And maybe below it.

And backups.

Backups make a lot of this minor.


Backups are great, but don't help much if you back up corrupted data.

You can certainly add verification above and below your filesystem, but the filesystem seems like a good layer to have verification. Capturing a checksum while writing and verifying it while reading seems appropriate; zfs scrub is a convenient way to check everything on a regular basis. Personally, my data feels important enough to make that level of effort, but not important enough to do anything else.
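
As a sketch of what that level of effort can look like above the filesystem, here's a sidecar SHA-256 manifest approach (the manifest layout is made up for illustration, not any standard tool's):

  import hashlib, json, os

  def sha256_of(path, bufsize=1 << 20):
      h = hashlib.sha256()
      with open(path, "rb") as f:
          while chunk := f.read(bufsize):
              h.update(chunk)
      return h.hexdigest()

  def write_manifest(root, manifest="checksums.json"):
      # Capture a checksum for every file at write/backup time
      sums = {}
      for dirpath, _, names in os.walk(root):
          for name in names:
              p = os.path.join(dirpath, name)
              sums[os.path.relpath(p, root)] = sha256_of(p)
      with open(manifest, "w") as f:
          json.dump(sums, f, indent=2)

  def verify_manifest(root, manifest="checksums.json"):
      # Re-verify at read/restore time; a poor man's zfs scrub
      with open(manifest) as f:
          sums = json.load(f)
      return [rel for rel, digest in sums.items()
              if sha256_of(os.path.join(root, rel)) != digest]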


FWIW, framed the way you do, I'd say the block device layer would be an *even better* place for that validation, no?

> Personally, my data feels important enough to make that level of effort, but not important enough to do anything else.

OMG. Backups! You need backups! Worry about polishing your geek cred once your data is on physically separate storage. Seriously, this is not a technology choice problem. Go to Amazon and buy an exfat stick, whatever. By far the most important thing you're ever going to do for your data is Back. It. Up.

Filesystem choice is, and I repeat, very much a yell-on-the-internet kind of thing. It makes you feel smart on HN. Backups to junky Chinese flash sticks are what are going to save you from losing data.


I appreciate the argument. I do have backups. ZFS makes it easy to send snapshots, and so I do.

But I don't usually verify the backups, so there's that. And everything is in the same zip code for the most part, so one big disaster and I'll lose everything. C'est la vie.


What good is a backup if you can't restore it?


Well, I expect that I can restore it, and that expectation has been good enough thus far. :p


Ok I think you're making a well-considered and interesting argument about devicemapper vs. feature-ful filesystems but you're also kind of personalizing this a bit. I want to read more technical stuff on this thread and less about geek cred and yelling. :)

I wouldn't comment but I feel like I'm naturally on your side of the argument and want to see it articulated well.


I didn't really think it was that bad? But sure, point taken.

My goal was actually the same though: to try to short-circuit the inevitable platform flame by calling it out explicitly and pointing out that the technical details are sort of a solved problem.

ZFS argumentation gets exhausting, and has ever since it was released. It ends up as a proxy for Sun vs. Linux, GNU vs. BSD, Apple vs. Google, hippy free software vs. corporate open source, pick your side. Everyone has an opinion, everyone thinks it's crucially important, and as a result of that hyperbole everyone ends up thinking that ZFS (dtrace gets a lot of the same treatment) is some kind of magically irreplaceable technology.

And... it's really not. Like I said above if it disappeared from the universe and everyone had to use dm/lvm for the actual problems they need to solve with storage management[1], no one would really care.

[1] Itself an increasingly vanishing problem area! I mean, at scale and at the performance limit, virtually everything lives behind a cloud-adjacent API barrier these days, and the backends there worry much more about driver and hardware complexity than they do about mere "filesystems". Dithering about individual files on individual systems in the professional world is mostly limited to optimizing boot and update time on client OSes. And outside the professional world it's a bunch of us nerds trying to optimize our movie collections on local networks; realistically we could be doing that on something as awful as NTFS if we had to.


How can I, with dm/lvm:

* For some detected corruption, be told directly which files are affected?

* Get filesystem level snapshots that are guaranteed to be consistent in the way ZFS and CephFS snapshots guarantee?


On urging from tptacek I'll take that seriously and not as flame:

1. This is misunderstanding how device corruption works. It's not and can't ever be limited to "files". (Among other things: you can lose whole trees if a directory gets clobbered, you'd never even be able to enumerate the "corrupted files" at all!). All you know (all you can know) is that you got a success and that means the relevant data and metadata matched the checksums computed at write time. And that property is no different with dm. But if you want to know a subset of the damage just read the stderr from tar, or your kernel logs, etc...

2. Metadata robustness in the face of inconsistent updates (e.g. power loss!) is a feature provided by all modern filesystems, and ZFS is no more or less robust than ext4 et al. But all such filesystems (ZFS included) will "lose data" that hadn't been fully flushed. Applications that are sensitive to that sort of thing must (!) handle this by having some level of "transaction" checkpointing (i.e. an fsync call). ZFS does absolutely nothing to fix this for you. What is true is that an unsynchronized snapshot looks like "power loss" at the dm level where it doesn't in ZFS. But... that's not useful for anyone who actually cares about data integrity, because you still have to solve the power loss problem. And solving the power loss problem obviates the need for ZFS.
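
For concreteness, the checkpointing pattern in question is the classic write-temp/fsync/rename dance, which applications have to do anyway to survive power loss, ZFS or not (a sketch, not any particular database's code):

  import os

  def checkpoint(path, data):
      # Atomically replace `path` with `data`. After a crash, a reader
      # sees either the old contents or the new, never a mix.
      tmp = path + ".tmp"
      with open(tmp, "wb") as f:
          f.write(data)
          f.flush()
          os.fsync(f.fileno())      # data durable before the rename
      os.rename(tmp, path)          # atomic on POSIX filesystems
      dirfd = os.open(os.path.dirname(path) or ".", os.O_DIRECTORY)
      try:
          os.fsync(dirfd)           # make the rename itself durable
      finally:
          os.close(dirfd)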


1 - you absolutely can and should walk reverse mappings in the filesystem so that from a corrupt block you can tell the user which file was corrupted.

In the future bcachefs will be rolling out auxiliary dirent indices for a variety of purposes, and one of those will be to give you a list of files that have had errors detected by e.g. scrub (we already generally tell you the affected filename in error messages)

2 - No, metadata robustness absolutely varies across filesystems.

From what I've seen, ext4 and bcachefs are the gold standard here; both can recover from basically arbitrary corruption and have no single points of failure.

Other filesystems do have single points of failure (notably btree roots), and btrfs and I believe ZFS are painfully vulnerable to devices with broken flush handling. You can (and should) blame the device and the shitty manufacturers, but from the perspective of a filesystem developer, we should be able to cope with that without losing the entire filesystem.

XFS is quite a bit better than btrfs, and I believe ZFS, because it has a ton of ways to reconstruct from redundant metadata if it loses a btree root, but it's still possible to lose the entire filesystem if you're very, very unlucky.

On a modern filesystem that uses b-trees, you really need a way of repairing from lost b-tree roots if you want your filesystem to be bulletproof. btrfs has 'dup' mode, but that doesn't mean much on SSDs given that you have no control over whether your replicas get written to the same erase unit.

Reiserfs actually had the right idea - btree node scan, and reconstruct your interior nodes if necessary. But they gave that approach a bad name; for a long time it was a crutch for a buggy b-tree implementation, and they didn't seed a filesystem specific UUID into the btree node magic number like bcachefs does, so it could famously merge a filesystem from a disk image with the host filesystem.

bcachefs got that part right, and also has per-device bitmaps in the superblock for 'this range of the device has btree nodes' so it's actually practical even if you've got a massive filesystem on spinning rust - and it was introduced long after the b-tree implementation was widely deployed and bulletproof.


> XFS is quite a bit better than btrfs, and I believe ZFS, because they have a ton of ways to reconstruct from redundant metadata if they lose a btree root

As I understand it, ZFS also has a lot of redundant metadata (copies=3 on anything important), and also previous uberblocks[1].

In what way is XFS better? Genuine question, not really familiar with XFS.

[1]: https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSMetadata...


I can't speak with any authority on ZFS, I know its structure the least out of all the major filesystems.

I do a ton of reading through forums gathering user input, and lots of people chime in with stories of lost filesystems. I've seen reports of lost filesystems with ZFS and I want to say I've seen them at around the same frequency of XFS; both are very rare.

My concern with ZFS is that they seem to have taken the same "no traditional fsck" approach as btrfs, favoring entirely online repair. That's obviously where we all want to be, but that's very hard to get right, and it's been my experience that if you prioritize that too much you miss the "disaster recovery" scenarios, and that seems to be what's happened with ZFS; I've read that if your ZFS filesystem is toast you need to send it to a data recovery service.

That's not something I would consider acceptable, fsck ought to be able to do anything a data recovery service would do, and for bcachefs it does.

I know the XFS folks have put a ton of outright paranoia into repair, including full on disaster recovery scenarios. It can't repair in scenarios where bcachefs can - but on the other hand, XFS has tricks that bcachefs doesn't, so I can't call bcachefs unequivocally better; we'd need to wait for more widespread usage and a lot more data.


The lack of a traditional 'fsck' is because its operation would be the exact same as normal driver operation. The most extreme case involves a very obscure option that lets you explicitly rewind transactions to one you specify, which I've seen used to recover from a broken driver upgrade that led to filesystem corruption in ways that most fscks just barf on, including XFS's.

For low-level meddling and recovery, there's a filesystem debugger that understands all parts of ZFS and can help for example identifying previous uberblock that is uncorrupted, or recovering specific data, etc.


Rewinding transactions is cool. Bcachefs has that too :)

What happens on ZFS if you lose all your alloc info? Or are there other single points of failure besides the uberblock in the on-disk format?


> What happens on ZFS if you lose all your alloc info?

According to this[1] old issue, it hasn't happened frequently enough to prioritize implementing a rebuild option, however one should be able to import the pool read-only and zfs send it to a different pool.

As far as I can tell that's status quo. I agree it is something that should be implemented at some point.

That said, certain other spacemap errors might be recoverable[2].

[1]: https://github.com/openzfs/zfs/issues/3210

[2]: https://github.com/openzfs/zfs/issues/13483#issuecomment-120...


I take a harder line on repair than the ZFS devs, then :)

If I see an issue that causes a filesystem to become unavailable _once_, I'll write the repair code.

Experience has taught me that there's a good chance I'll be glad I did, and I like the peace of mind that I get from that.

And it hasn't been that bad to keep up on, thanks to lucky design decisions. Since bcachefs started out as bcache, with no persistent alloc info, we've always had the ability to fully rebuild alloc info, and that's probably the biggest and hardest one to get right.

You can metaphorically light your filesystem on fire with bcachefs, and it'll repair. It'll work with whatever is still there and get you a working filesystem again with the minimum possible data loss.


As I said I do think ZFS is great, but there are aspects where it's quite noticeable it was born in an enterprise setting. That sending, recreating and restoring the pool is a sufficient disaster recovery plan to not warrant significant development is one of those aspects.

As I mentioned in the other subthread, I do think your commitment to help your users is very commendable.


Oh, I'm not trying to diss ZFS at all. You and I are in complete agreement, and ZFS makes complete sense in multi device setups with real redundancy and non garbage hardware - which is what it was designed for, after all.

Just trying to give honest assessments and comparisons.


> 2 - No, metadata robustness absolutely varies across filesystems.

That's misunderstanding the subthread. The upthread point was about metadata atomicity in snapshots, not hardware corruption recovery. A filesystem like ZFS can make sure the journal is checkpointed atomically with the CoW snapshot moment, where dm obviously can't. And I pointed out this wasn't actually helpful because this is a problem that has to be solved above the filesystem, in databases and apps, because it's isomorphic to power loss (something that the filesystem can't prevent).


I believe it is helpful because you can stop an app (such as a DB), FS-snapshot, and then e.g. rsync the snapshot or use any other file based backup tool, and this snapshot is fast and will be correct.

Doing the same with a block device snapshot is not so easy.


Again, if your system is "incorrect" having been stopped and snapshotted like that, it is also unsafe vs. power loss, something ZFS cannot save you from. Power loss events are vastly more common than poorly checkpointed database[1] events.

[1] FWIW: every database worth being called a "database" has some level of robust journaling with checkpoints internally. I honestly don't know what software you're talking about specifically except to say that you're likely using it wrong.


You are conflating Consistency and Durability in a way that is not necessary.

FS snapshotting can be useful to work on files on which fsync() is never called, but for which you need a consistent cross-file view nonetheless while the system is online.

Another example is the case of sqlite with default settings as discussed recently (https://news.ycombinator.com/item?id=45005866), where the presence of a file determines whether or not transactions will be missing on next access. Because SQLite does not fsync the file's parent directory by default, transactions would be correctly captured by an FS snapshot but lost in a block device snapshot. Your argument is correct that they would also be lost in a power loss, but that does not matter for the fact that there exists a valid way to use that software in some situations where you care about consistency and not about power losses.

This is why having such a feature in file systems is useful.


And once more, you're positing the lack of a feature that is available and very robust (cf. "yell on the internet" vs. "discuss solutions to a problem"). You don't need your filesystem to integrate checksumming when dm/lvm already do it for you.


> You don't need your filesystem to integrate checksumming when dm/lvm already do it for you.

https://wiki.archlinux.org/title/Dm-integrity

> It uses journaling for guaranteeing write atomicity by default, which effectively halves the write speed

I'd really rather not do that, thanks.


So... there's a reason you had to cite a throwaway comment on a distro wiki and not documentation. Needless to say journaling metadata (something done in some form by every filesystem you will ever use!) does not, in fact, "halve the write speed".


> So... there's a reason you had to cite a throwaway comment on a distro wiki and not documentation.

No, I read the official kernel docs too; the Arch wiki just happened to be a quicker way to describe it.

From https://docs.kernel.org/admin-guide/device-mapper/dm-integri... -

> The dm-integrity target can also be used as a standalone target, in this mode it calculates and verifies the integrity tag internally. In this mode, the dm-integrity target can be used to detect silent data corruption on the disk or in the I/O path.

> There’s an alternate mode of operation where dm-integrity uses a bitmap instead of a journal. If a bit in the bitmap is 1, the corresponding region’s data and integrity tags are not synchronized - if the machine crashes, the unsynchronized regions will be recalculated. The bitmap mode is faster than the journal mode, because we don’t have to write the data twice, but it is also less reliable, because if data corruption happens when the machine crashes, it may not be detected.

This is more clearly presented lower down in the list of modes, in which most options describe how they don't actually protect against crashes, except for journal mode:

> J - journaled writes

> data and integrity tags are written to the journal and atomicity is guaranteed. In case of crash, either both data and tag or none of them are written. The journaled mode degrades write throughput twice because the data have to be written twice.

On further reflection, I grant that that might only be talking about the integrity metadata, in which case we just don't know about the impact to data writes and it would be useful to go benchmark to see what the hit is in practice.

EDIT: So I went looking to see if anyone had done that benchmarking and found https://github.com/t13a/dm-integrity-benchmarks which seems to show that actually yes dm-integrity is that bad on data writes. Of course, its possible saving grace is that everything else with the same features also had a performance hit. I also found https://www.reddit.com/r/linuxadmin/comments/1crtggd/why_dmi... talking about it.


FWIW, the github link you posted clearly shows the ext4-on-dm stack to be FASTER than ZFS!

It only falls behind, and very significantly so, on the 1M sequential write test, exactly the situation where you'd expect there to be the least delta between systems! I'm going to bet anything that's a misconfigured RAID.

Frankly looking at that from a "will this work best for my general purpose filesystem used mostly to handle giant software builds and Zephyr test suites" it seems like a no brainer to pick dm, especially so given the simplicity argument.


i'm not one for internet arguments and really just want solutions. maybe you could point me at the details for a setup that worked for you?

based on my own testing, dm has a lot of footguns and, with some kernels, as little as 100 bytes of corruption to the underlying disk could render a dm-integrity volume completely unusable (requiring a full rebuild) https://github.com/khimaros/raid-explorations


Well, the intention of the integrity targets is to preserve integrity where that is an explicit choice, in particular for encrypted data. You definitely need a backup strategy.


One feature I like about ZFS and have not seen elsewhere is that you can have each filesystem within the pool use its own encryption keys but more importantly all of the pool's data integrity and maintenance protection (scrubs, migrations, etc) work with filesystems in their encrypted state. So you can boot up the full system and then unlock and access projects only as needed.

The dm stuff is one key for the entire partition and you can't check it for bitrot or repair it without the key.


> And the ones that aren't are in some other cooked format.

Maybe, if you never create anything. I make a lot of game art source, and much of that is in uncompressed formats. Blend files, obj files, even DDS can compress, depending on the format and data, due to the mipmaps inside them. Without FS compression it would be using GBs more space.

I'm not going to individually go through and micromanage file compression even with a tool. What a waste of time, let the FS do it.


> The dm layer gives you cow/snapshots for any filesystem you want already and has for more than a decade. Some implementations actually use it for clever trickery like updates, even.

O_o

Apparently I've been living under a rock, can you please show us a link about this? I was just recently (casually) looking into bolting on ZFS/BTRFS-like partial snapshot features to simulate my own atomic distro where I am able to freely roll back if an update goes bad. Think Linux's Timeshift with a little something extra.


There are downsides to adding features in layers, as opposed to integrating them with the FS, but dm can do quite a lot:

https://docs.kernel.org/admin-guide/device-mapper/snapshot.h...


DM has targets that facilitate block-level snapshots, lazy cloning of filesystems, compression, &c. Most people interact with those features through LVM2. COW snapshots are basically the marquee feature of LVM2.


The other thing dm/lvm gives you is dogshit performance


Btrfs is the closest in-tree bcachefs alternative.


Does btrfs still eat your data if you try to use its included RAID featureset? Does it still break in a major way if you're close to running out of disk space? What I'm seeing is that most major Linux distributions still default to non-btrfs options for their default install, generally ext4.


Anecdotal but btrfs is the only filesystem I've lost data with (and it wasn't in a RAID configuration). That combined with the btrfs tools being the most aggressively bad management utilities out there* ensure that I'm staying with ext4/xfs/zfs for now.

*Coming from the extremely well thought out and documented zfs utilities to btrfs will have you wondering wtf fairly frequently while you learn your way around.


One of the goals of containers is to unify the development and deployment environments. I hate developing and testing code in containers, so I develop and test code outside them and then package and test it again in a container.

Containerized apps need a lot of special boilerplate to determine how much CPU and memory they are allowed to use. It’s a lot easier to control resource limits with virtual machines because the system’s resources are all dedicated to the application.
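
The boilerplate in question looks something like this under cgroup v2 (the standard /sys/fs/cgroup paths; cgroup v1 uses different files, and this is only a sketch):

  import os

  def container_limits():
      # Best-effort detection of cgroup v2 CPU/memory limits.
      # Returns (cpus, memory_bytes); host total / None when unlimited.
      cpus, mem = os.cpu_count(), None
      try:
          with open("/sys/fs/cgroup/cpu.max") as f:
              quota, period = f.read().split()
              if quota != "max":
                  cpus = int(quota) / int(period)
          with open("/sys/fs/cgroup/memory.max") as f:
              raw = f.read().strip()
              if raw != "max":
                  mem = int(raw)
      except FileNotFoundError:
          pass  # not cgroup v2; v1 exposes different filenames
      return cpus, mem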

Orchestration of multiple containers for dev environments is just short of feature complete. With Compose, it’s hard to bring down specific services and their dependencies so you can then rebuild and rerun. I end up writing Ansible playbooks to start and stop components that are designed to be executed in particular sequences. Ansible makes it hard to detach a container, wait a specified time, and see if it’s running. Compose just needs to be updated to support management of shutting down and restarting containers, so I can move away from Ansible.

Services like Kafka that query the host name and broadcast it are difficult to containerize since the host name inside the container doesn’t match the external host name. Requires manual overrides which are hard to specify at run time because the orchestrators don’t make it easy to pass in the host name to the container. (This is more of a Kafka issue, though.)


Systemd, k8s, Helm, and Terraform model service dependencies.

Quadlet is the podman recommended way to do podman with systemd instead of k8s.

Podman supports kubes of containers and pods of containers;

  man podman-container
  man podman-generate-kube
  man podman-kube
  man podman-pod
`podman generate kube` generates YAML for `podman kube play` and for k8s `kubectl`.

Podman Desktop can create a local k8s (kubernetes) cluster with any of kind, minikube, or openshift local. k3d and rancher also support creating one-node k8s clusters with minimal RAM requirements for cluster services.

kubectl is the utility for interacting with k8s clusters.

k8s Ingress API configures DNS and Load Balancing (and SSL certs) for the configured pods of containers.

E.g. Traefik and Caddy can also configure the load balancer web server(s) and request or generate certs given access to a docker socket to read the labels on the running containers to determine which DNS domains point to which containers.

Container labels can be specified in the Dockerfile/Containerfile, and/or a docker-compose.yml/compose.yml, and/or in k8s yaml.

Compose supports scaling a service to a number of replicas: `docker compose up --scale web=3`.

Terraform makes infrastructure consistent with its declared state.

Compose does not support rolling or red/green deployment strategies. Does compose support HA high-availability deployments? If not, justify investing in a compose yaml based setup instead of k8s yaml.

Quadlet is the way to do podman containers without k8s; with just systemd for now.


Thanks! I’ll take a look at quadlet.

I find that I tend to package one-off tasks as containers as well. For example, creating database tables and users. Compose supports these sorts of things. Ansible actually makes it easy to use and block on container tasks that you don’t detach.

I’m not interested in running kubernetes, even locally.


Podman kube has support for k8s Jobs now: https://github.com/containers/podman/pull/23722

k8s docs > concepts > workloads > controllers > Jobs: https://kubernetes.io/docs/concepts/workloads/controllers/jo...

Ingress, Deployment, StatefulSets,: https://news.ycombinator.com/item?id=37763931


Ok, one more to add that is kind of an abuse of containers: some compute cluster solutions (like those used for HPC) use containers to manage software installations on the clusters. They try to unify containers with the standard Unix environment, however, so that users still see their home directory (mounted in the container) and other paths, and running applications in the container is the same experience as running them directly on the host OS. This is just a TERRIBLE solution. I much prefer Environment Modules or something like Python's virtual environments (if it worked for arbitrary software installs) as a solution.

https://en.wikipedia.org/wiki/Environment_Modules_(software)


Good write up. The only real bias I can detect is that the author seems to conflate their (lack of) familiarity with ease of use. I bet if they spent a few months using DuckDB and Polars on a daily basis, they might find some of the tasks just as easy or easier to implement.


I think the title is misleading. This isn't really about either language in production environments. As other commenters mentioned, a post about production would cover topics like whether there were any tooling / dependency updates that broke a build, whether they encountered any noticeable bugs in production caused by libraries / run time, and how efficiently the run times handle high load (e.g., with GC).

This is more about syntax differences. Even then, I'd be curious how well both languages accommodate themselves to teams and long term projects. In both cases, you will have multiple people working on parts of the code base. Are people able to read and modify code they haven't written -- for example, when fixing bugs? When incorporating new sub components, how well did the type systems prevent errors due to refactoring? It would be interesting to know if Haskell prevents a number of practical problems that occurred with OCaml or if, in practice, there was no difference for the types of bugs they encountered.

This blog post feels more like someone is comparing basic language features found in reviews for new users rather than sharing deep experience and gotchas that only come from long-term use.



The Meta post is particularly interesting. Thanks for sharing!


It's not clear from the article whether actors offer significant benefits (or disadvantages) for data modeling versus the traditional OO paradigm. The article reads more like an introduction that describes the problem and teases a solution rather than a complete article that offers a solution and an evaluation of it.


That's fair feedback. I wanted to post it in 3 parts, but I see now I probably should have just made one large post.


You could always post the 3 parts simultaneously, so anyone who wants to dig deeper can continue into the series.


The article seems to be smashing together two (seemingly) unrelated topics and doesn't offer much in the way of a solution. What alternative design does the author propose? Is it possible to solve the problem with traditional object-oriented design techniques? It's not clear that the issues presented require or substantially benefit from the actor model without seeing a best-in-class OO example.


As I mentioned there's nothing novel in this post, especially for a senior. This is more about getting some context out of the way so that I can show some techniques in a future post.


This is very cool!


Thank you! Still very early days, would love to hear any feedback.

