
Zig certainly has a lot of interesting features and good ideas, but I honestly don't see the point of starting a major project with it. With alternatives like Rust and Swift, memory safety is simply table stakes these days.

Yes, I know Zig does a lot of things to help the programmer avoid mistakes. But the last time I looked, it was still possible to make mistakes.

The only time I would pick something like C, C++, or Rust is if I were planning to build a multi-million-line, performance-sensitive project, in which case I want total memory safety. For most "good enough" use cases, garbage collectors work fine and I wouldn't bother with a systems programming language at all.

That leaves me a little bit confused about the value proposition of Zig. I suppose it's a "better C". But like I said, for serious industry projects starting in 2025, memory safety is just table stakes these days.

This isn't meant to be a criticism of Zig or all of the hard work put into the language. I'm all for interesting projects. And certainly there are a lot of interesting ideas in Zig. I'm just not going to use them until they're present in a memory safe language.

I am actually a bit surprised by the popularity of Zig on this website, given the strong dislike towards Go. From my perspective, the two languages are very similar in that they both decided to "unsolve already solved problems". Meaning, we know how to guarantee memory safety; multiple programming languages have implemented it in a variety of ways. Why would I use a new language that takes a problem a language like Rust, Java, or Swift already solved for me, and takes away a feature (memory safety) that I already have?


> memory safety is simply table stakes

Why?

And also, this is black-and-white thinking, implying that Swift and Rust are completely memory "safe" and Zig is completely "unsafe". It's a spectrum.

The real underlying comparison statement here is far more subjective. It's along the lines of: "I find it easier to write solid code in Rust than in Zig". This is a more accurate and fair way to state the semantics of what you are saying.

Saying things like "rust is memory safe. Zig is not memory safe" is reductionist and too absolutist.


> Why?

Memory bugs are hard to debug, potentially catastrophic (particularly where security is concerned), and in large systems software they tend to constitute the majority of issues.[1]

It is true that Rust is not absolutely memory safe, and Zig provides some more features than C, but directionally it is correct that Rust (or languages with a similar design philosophy) eliminates billion-dollar mistakes. And you can take that literally rather than metaphorically: we live in a world where vulnerable software can take a country's infrastructure out.

[1] https://www.zdnet.com/article/microsoft-70-percent-of-all-se...


If decades of experience show us anything, it is that discipline and skill are not enough to achieve memory safety.

Developers simply aren’t as good at dealing with these problems as they think they are. And even if a few infallible individuals would be truly flawless, their co-workers just aren’t.


I'm not convinced that, on average, Zig is any less safe, or produces software that is any less stable, than Rust.

Zig embraces reality in its design. Allocations exist, hardware exists, and our entire modern infrastructure is built on C. When you work directly with those things, there are going to be safety issues; that's just how it is. Zig tries to give you as many tools as possible to make good decisions at every turn and to help you catch mistakes, like its testing allocator detecting memory leaks.
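
A minimal sketch of what that looks like (a hypothetical test, assuming a recentish Zig; std.testing.allocator fails the test if anything is left unfreed):

  const std = @import("std");

  test "testing allocator catches leaks" {
      // std.testing.allocator tracks every allocation and fails the
      // test at the end if any of them was never freed.
      const allocator = std.testing.allocator;

      const buf = try allocator.alloc(u8, 64);
      defer allocator.free(buf); // delete this line and `zig test` reports a leak
  }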

Rust puts you in a box, where the outside world doesn't exist. As long as you play by its rules everything will be fine. But it eventually has to deal with this stuff, so it has unsafe. I suspect if Rust programmers went digging through all their dependencies, especially when they are working on low level stuff, they would be surprised by how much of it actually exists.

Zig tries to be safer on average and to make developers aware of pitfalls. Rust tries to be 100% safe where it can, and then not safe at all where it can't. Obviously Rust's approach has worked for it, but I don't think that invalidates Zig's. Especially when you start to get into projects where a lot of unsafe operations are needed.

Zig also has an advantage in that it simplifies memory management through its use of allocators. If you read Richard Feldman's write-up on the Roc compiler's rewrite in Zig, he talks about how he realized their memory allocation patterns were simple enough in Zig that they just didn't need the complexity of Rust.


To be clear, Rust encourages the development of safe abstractions around unsafe code, so that the concern goes from proportion of unsafe to encapsulation of unsafe. Whether you trust some library author to encapsulate their unsafe is, I think, reducible to whether you trust a library author to write a good library. Unsafe is not all-or-nothing. Thus, as with all languages, good general programming practices come before language features.


That's kind of my point. Because it's isolated and abstracted I wouldn't be surprised if most Rust devs have no idea how much unsafe code is actually out there.

Rust does not want you to think about memory management. You play by its rules and let it worry about allocation/deallocation. Frankly, in that regard Rust has more in common with GC languages than it does with Zig or C. Zig chooses to give the developer full control and provides tools to make writing correct/safe code easier.


Although it's not a comprehensive measure, people tend to count the source lines of unsafe in a Rust codebase as a metric. Moreover, reputable libraries worth using typically take care to reduce unsafe, and where it is used, encapsulate it well. I don't think you have a substantive point on the matter. Unsafe certainly can be abused, but it's not a bogeyman that people scarcely catch glimpses of. Unsafe doesn't demote the safety of Rust to that of C, or something like that.

Your comments on Rust's philosophy towards memory management are off base. Rust is unlike GC languages, even Swift, in that it makes allocations and deallocations explicit. For example, I know that one approach to implementing async functions in trait objects was rejected because it would've made implicit heap allocations. Granted, Rust is far behind on reified and custom allocators. Rust has functionality to avoid the default alloc crate, which is the part of libstd that does heap allocations, and a library ecosystem for alternate data structures. Rust doesn't immediately give you total access, but it's only a few steps away. Could it be easier to work with? Absolutely. The same goes for unsafe.


Thank you for the thoughtful reply, but I think you missed my point.

I'm not saying Rust isn't substantially safer than C. When people like Greg Kroah-Hartman say that Rust by its design eliminates a lot of the memory bugs he's been fighting for 40 years, I believe him.

My point is that people tend to talk about it as an all or nothing proposition. That Rust is memory safe. Period. And any language that can't put that on the tin is immediately disqualified, that somehow their approach to solving similar problems is invalid.

By the very nature of the system no language that wants to interact with the hardware can be entirely memory safe, not even Rust. It has chosen a specific solution, and a pretty damn interesting one as far as that goes, but it still has to deal with unsafe. And the more directly your program has to deal with the hardware the more unsafe code it's going to have to deal with.

Zig has also chosen an approach to deal with the problem. Theirs is one that gives far more direct control to the programmer. Every single memory allocation is explicit: you have to directly interact with an allocator and you have to free that memory. It's not hidden behind constructors/destructors and controlled via RAII patterns. (Side note: there are managed data structures that you give an allocator via an init and free via a deinit, but you still have to pass in the allocator, and those are being largely replaced.)

If you are only dealing with problems where you can stay within Rust's abstractions, I'm sure it is safer than Zig, but I don't think the difference is as big as people think. And when you start digging down into systems-level programming, where the amount of unsafe code you have to write grows, Rust's advantage starts to diminish significantly.

To my point about Rust not wanting you to think about memory, take Vec as an example. You and I know it's doing heap allocations, but I guarantee you a not insignificant number of Rust devs just don't even think about it. And they certainly don't think about all the allocations/deallocations that have to happen to grow and shrink it dynamically.

Compare that to Zig's ArrayList. When you create it you have to explicitly hand it an allocator you created. It could be a general-purpose allocator, but it could just as easily be an arena allocator, or even an allocator backed by a buffer you pre-allocated specifically for it. As the programmer you have to deal directly with the fact that the thing allocates and deallocates.
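
A quick sketch of that (using the older "managed" ArrayList API mentioned above; the allocator is an explicit value you construct and hand over):

  const std = @import("std");

  pub fn main() !void {
      // The backing allocator is a deliberate choice: a general-purpose
      // allocator here, but an arena or a fixed pre-allocated buffer
      // would plug in the same way.
      var gpa = std.heap.GeneralPurposeAllocator(.{}){};
      defer _ = gpa.deinit();

      var list = std.ArrayList(u32).init(gpa.allocator());
      defer list.deinit(); // freeing is on you; nothing is hidden behind RAII

      try list.append(42); // can fail: growing the list allocates
  }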

That's what I mean when I say Rust has more in common with GC languages in some ways. When I type "new" in Java I know I'm heap-allocating a new object; Java just doesn't want me to think about it because the GC will deal with it. When you create a vector in Rust, it doesn't want you to think about the memory; it just wants you to follow its borrow checker rules. Which is very different than thinking about allocation/deallocation patterns.


Thank you for your insights. I think this comes down to a culture problem. Some people such as myself obsess over the low-level details, so every usage of Vec stands out immediately. At the same time, many people like to use Rust in areas where heap allocations and the like aren't as significant, where GC languages would typically be used too. Discussions of memory safe languages are flawed, as you point out, though I feel like Rust is just the poster child and not a unique instance. What I take away from Rust's philosophy usually comes from the Rust team who develop the language, and I find them to generally be levelheaded when it comes to discussing Rust's merits and drawbacks. I do think Rust is a highly promising language, for which its team tirelessly addresses its major flaws[0], but it doesn't excuse hype-induced bias.

[0] I can't say it'll ever compile as fast as, say, Go, but all the work on compile times is impressive


I think that's honestly been a success for Rust nobody really talks about. It's enabled people who would never write in a non-garbage collected language to do just that.

Between the borrow checker and Cargo, it's brought systems engineering to a whole group of people who never would have touched it before. People who don't want to deal with memory management can write code that doesn't require a GC. And they can easily pull in dependencies for things they don't want to write.


I can attest to that. I had tried learning C++ previously, but getting started as a beginner is a bit of a nightmare (maybe that's because I was using Linux, so I couldn't use Visual Studio, but there's still a ton of complexity).

I was able to work through the Rust book though, and that was a great way to ease into the more difficult parts of systems programming.

I'm not contributing to a C project, but I feel a lot more comfortable since Rust caught all my early dumb mistakes, so I've learned to avoid the things that would shoot me in the foot.


>> memory safety is simply table stakes

> Why?

Because it's a stepping stone to other kinds of safety. Memory safety isn't the be-all and end-all, but it gets us to where we can focus on other important things.

And it turns out that in this particular case we don't even have to pay much for it in terms of performance.

> The real underlying comparison statement here is far more subjective. It's along the lines of: "I find it easier to write solid code in Rust than in Zig".

Agreed! But also how about "We can get pretty close to memory safety with the tools we provide! Mostly at runtime! If you opt-in!" ~~ signed, people (Zig compiler itself, Bun, Ghostty, etc) who ship binaries built with -Doptimize=ReleaseFast


Zig has a pretty great type system, and languages like Rust and C++ are sometimes not great at preventing accidental heap allocations. Zig and C make allocation very explicit, and it's great to be able to handle allocation failures in robust software.
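
As a sketch of that last point (a hypothetical readConfig; in Zig, every allocating call returns a possible error.OutOfMemory that the caller must confront):

  const std = @import("std");

  fn readConfig(allocator: std.mem.Allocator) error{OutOfMemory}![]u8 {
      // The failure mode is part of the signature; callers can't ignore it.
      return allocator.alloc(u8, 4096);
  }

  fn readConfigOrFallback(allocator: std.mem.Allocator, fallback: []u8) []u8 {
      return readConfig(allocator) catch |err| switch (err) {
          // Robust software can degrade gracefully instead of aborting.
          error.OutOfMemory => fallback,
      };
  }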


What's great about its type system? I find it severely limited and not actually useful for conveying and checking invariants.


That is the usual fallacy, because it assumes everyone has full access to the whole source code and is tracking down all the places where the heap is being used.

It also assumes that the OS doesn't lie to the application when allocations fail.


Zig makes allocations extremely explicit (even more so than C) by having you pass the allocator to every function that allocates on the heap. Even third-party libraries will only use the allocator you provide them. It's not a fallacy; you're in total control.
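
The convention looks roughly like this (a hypothetical joinPath; the signature itself tells you the function allocates, and with whose memory):

  const std = @import("std");

  // Anything that needs heap memory takes the allocator as a parameter,
  // so allocation is visible at every call site.
  fn joinPath(allocator: std.mem.Allocator, dir: []const u8, name: []const u8) ![]u8 {
      return std.fmt.allocPrint(allocator, "{s}/{s}", .{ dir, name });
  }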


> pass around the allocator to every function that allocates to the heap.

What prevents a library from taking an allocator, saving it somewhere hidden, and using it silently?


Nothing, but it would be bad design (unless there is a legitimate, documented reason for it). Then it's up to you as the developer to exercise your judgment about which third-party libraries you choose to depend on.
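
For illustration, such a (hypothetical, badly behaved) library is trivial to write; nothing in the language stops it:

  const std = @import("std");

  pub const Cache = struct {
      // The caller's allocator is stashed at init time...
      allocator: std.mem.Allocator,
      entries: std.ArrayListUnmanaged([]u8) = .{},

      pub fn init(allocator: std.mem.Allocator) Cache {
          return .{ .allocator = allocator };
      }

      // ...and used here, with no allocator visible at the call site.
      pub fn put(self: *Cache, value: []const u8) !void {
          const copy = try self.allocator.dupe(u8, value);
          try self.entries.append(self.allocator, copy);
      }

      pub fn deinit(self: *Cache) void {
          for (self.entries.items) |entry| self.allocator.free(entry);
          self.entries.deinit(self.allocator);
      }
  };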


authors of the library


Why, are you going to abort if too many calls to the allocator take place?


You can if you want. You can write your own allocator that never actually touches the heap and just hands out memory from a big chunk on the stack. The point is that you have fine-grained (per-function) control over the allocation strategy, not only in your codebase but also in your dependencies.
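
The standard library even ships an allocator like that; a minimal sketch:

  const std = @import("std");

  pub fn main() !void {
      // Every "allocation" is carved out of this stack buffer;
      // the heap is never touched.
      var buffer: [1024]u8 = undefined;
      var fba = std.heap.FixedBufferAllocator.init(&buffer);
      const allocator = fba.allocator();

      const slice = try allocator.alloc(u8, 128); // OutOfMemory past 1024 bytes
      defer allocator.free(slice);
  }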


Allocation strategy isn't the same as knowing exactly when allocations take place.

You missed the point that libraries can have their own allocators and not expose customisation points.


Sure they can. But why would they choose to?


Because the language doesn't prevent them, and they own their library.


That's not really an argument. What prevents the author of a library in any language from acting in bad faith and using antipatterns? That's not a problem that would only happen in the Zig language.


The question remains – why would they choose to?


Because they decided to do so, regardless of people like yourself thinking they are wrong.


They can, and they wouldn't necessarily be wrong.

However, if the library is trying to be as idiomatic / general-purpose / good-citizen as possible, then they should strongly consider not doing that and only using the user-provided allocator (unless there is a clear and documented reason why that is not the case).

I don't think it would make sense to restrict this at the language level. As a developer it's up to you to exercise your judgement when you examine what libraries you choose to depend on.

I appreciate the fact it's a common design pattern in Zig libraries and I also appreciate the fact I'm not forced to do it if I don't want to. If it matters to me then I'll consider libraries that are designed that way, if it does not matter to me then I can consider libraries that do not support this.


And so why should he be forced not to do so? He can cook his own thing outside of what most people using this language will do, and they will just not use it, and nothing's wrong with that.


> It also assumes that the OS doesn't lie to the application when allocations fail.

Gotta do the good ol'

  echo 2 >/proc/sys/vm/overcommit_memory
and maybe adjust overcommit_ratio as well to make sure the memory you allocated is actually available.


OS specific hack and unrelated to C.


Your comment was also OS-specific because Windows doesn't lie to applications about failed allocations.


Not at all; rather, there is no guarantee that the C abstract machine described in ISO C actually returns NULL on memory allocation failures, as some C advocates without ISO C legalese expertise seem to claim.


>> Why would I use a new language...

If you are asking that question you should not use a new language. Stick with what works for you. You need to feel that something is unsatisfactory with what you are using now in order to consider changing.


To me the argument is that memory errors are just one type of logic error that can lead to serious bugs. You want a language that reduces logic errors generally, not just memory safety ones, and Zig's focus on simplicity and being explicit might be the way to accomplish that.

For large performant systems, what makes sense to me is memory safety by default, with robust, fine-grained levers available to opt in to performance over safety (or to achieve both at once, where that's possible).

Zig isn't that, but it's at least an open question to me. It has some strong safe-by-default constructs yet also has wide open safety holes. It does have those fine-grained levers, plus simplicity and explicitness, so not that far away. Perhaps they'll get there by 1.0?


Logic errors and memory errors aren't even close to being in the same ballpark.

Memory errors are deterministic errors with non-deterministic consequences. Logic errors are mostly non-deterministic (subjective and domain-dependent) but with deterministic consequences.


> For most "good enough" use cases, garbage collectors work fine and I wouldn't bother with a system's programming language at all.

It's not just about performance, it's about reusability. There is a huge amount of code written in languages like Java, JS, Go, and Python that cannot be reused in other contexts because they depend on heavy runtimes. A library written in Zig or Rust can be used almost anywhere, including on the web by compiling to wasm.


> ...memory safety is simply table stakes these days.

Is there like a mailing list Rust folks are on where they send out talking points every few months? I have never seen a community so in sync on how to talk about a language or project. Every few months there's some new phrase or talking point I see all over the place, often repeated verbatim. This is just the most recent one.


Conspiracy nonsense. GP is advocating for GC languages, not Rust.


It was a joke, dude. Relax.

Also, literally the first language he mentioned was Rust, and it's the only one he mentioned that would be in the same class as Zig.


> I am actually a bit surprised by the popularity of Zig on this website

Maybe this just indicates that memory safety is table stakes for you, but not for every programmer on Earth?


> I am actually a bit surprised by the popularity of Zig on this website, given the strong dislike towards Go.

I'm not sure that's true, that there is a strong dislike towards Go on here. Maybe that's just competitors or their zealots creating such a perception. Even if that was the case (it being disliked on here), it would only hold true for this site, and not in general. Go is ranked as a top 10 language by TIOBE (as of October 2025). Rust is ranked 16, Swift is ranked 22, Vlang is ranked 42, and Zig behind it at 43.

> for serious industry projects starting in 2025...

In regards to starting projects with Zig, as long as people are clear that it's still in beta, then they should be fine with accepting the risks that go along with that. It's also fine not to want to use Zig and stick with the language that you are comfortable with. If a person wants to experiment or to see what's available, there are many languages in the "better C" or "C alternative" category.


Yes, we know how to offer memory safety; we just don't know how to offer it without exacting a price that, in some situations, may not be worth it. Memory safety always has cost.

Rust exists because the cost of safety, offered in other languages, is sometimes too high to pay, and likewise, the cost Rust exacts for its memory safety is sometimes too high to pay (and may even adversely affect correctness).

I completely agree with you that we reach for low-level languages like C, C++, Rust, or Zig in "special" circumstances - those that, for various reasons, require precise control over hardware resources, and/or focus more on the worst case rather than the average case - and most software has increasingly been written in high-level languages (and there's no reversal of this trend). But just as different factors may affect your decision on whether to use a low-level language, different factors may affect your decision on which low-level language to choose (many of those factors may be subjective, and some are extrinsic to the language's design). Of course, if, like the vast majority of programmers, you don't do low-level programming, then none of these languages are for you.

As a long-time low-level programmer, I can tell you that all of these low-level languages suffer from very serious problems, but they suffer from different problems and require making different tradeoffs. Different projects and different people may reasonably want different tradeoffs, and just as we don't have one high-level language that all programmers like, we also don't have one low-level language that all programmers like. However, preferences are not necessarily evenly distributed, and so some languages, or language-design approaches, end up more popular than others. Which languages or approaches end up more popular in the low-level space remains to be seen.

Memory safety is clearly not "table stakes" for new software written in a low-level language, for the simple reason that most new software written in low-level languages uses languages with significantly less memory safety than Zig offers (Zig offers spatial memory safety, but not temporal memory safety; C and C++ offer neither, and most new low-level software written in 2025 is written in C or C++).

I can't see a strong similarity between Go, a high-level language, and Zig, a low-level language, other than that both - each in its own separate domain - values language simplicity. Also, I don't see Zig as being "a better C", because Zig is as different (or as similar) from C as it is from C++ or Rust, albeit on different axes. I find Zig so different from any existing language that it's hard to compare it to anything. As far as I know, it is the first "industry language" that's designed almost entirely around the concept of partial evaluation.


Would you say writing something like a database (storage and query engine) from scratch is better done in Rust or Zig?


It depends on the purpose. If the objective is maximum scale and performance, then Zig. The low-level mechanics of userspace I/O and execution scheduling in top-end database architectures strongly recommend a language comfortable expressing complex relationships in contexts where ownership and lifetimes are unavoidably ambiguous. Zig is designed to enable precise and concise control in these contexts with minimal overhead.

If performance/scale-maxxing isn't on the agenda and you are just trying to crank out features then Rust probably brings more to the table.

The best choice is quite arguably C++20 or later. It has a deep set of somewhat unique safety features among systems languages that are well-suited to this specific use case.


First, I would avoid using any low-level language if at all possible, because no matter what you pick, the maintenance and evolution costs are going to be significantly higher than for a high-level language. It's a very costly commitment, so I'd want to be sure it's worth it. But let's suppose I decided that I must use a low-level language (perhaps because worst-case behaviour is really important, or I may want to run on a low-memory device, or the DB is "pure overhead" software that aims to minimise the memory consumption it adds to a co-located, resource-heavy application).

Then, if this were an actual product that people would depend on for a long time, the obvious choice would be C++, because of its maturity and good prospects. But say this is hypothetical or something more adventurous that allows for more risk, then I would say it entirely depends on the aesthetic preferences of the team, as neither language has some clear intrinsic advantage over the other. Personally, I would prefer Zig, because it more closely aligns with my subjective aesthetic preferences, but others may like different things and prefer Rust. It's just a matter of taste.


> the DB was a "pure overhead" software that aims to minimise memory consumption that's needed for a co-located resource heavy application)

Thanks pron for the reply. This describes it best: minimizing resource consumption in "pure overhead" software. It's currently written in Java and we are planning a rewrite in a systems PL.


I would first spend a good amount of time figuring out if you can't just keep it in Java, because it's not just the rewrite in C++ that's expensive, but a low-level language would make maintenance and evolution costlier forever.

If memory footprint is the issue, I would recommend watching this: https://youtu.be/mLNFVNXbw7I. A lot of people misunderstand memory consumption, and even in Java there are many options to try. There could definitely still be situations where a low-level language is the only choice, but it's not a decision to be taken lightly.


Since most system-level APIs provide a C interface, and the C interoperability of Zig is top-notch, you don't need a marshaling/interop layer.
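
A minimal sketch (assuming libc is linked, e.g. zig build-exe main.zig -lc):

  // @cImport translates the C header at compile time; there is no
  // separate binding layer to write.
  const c = @cImport({
      @cInclude("stdio.h");
  });

  pub fn main() void {
      _ = c.printf("calling C directly from Zig\n");
  }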


From what I've seen, it also has much less of an impedance mismatch. You can sling pointers around to your heart's desire in Zig, whereas in Rust you have to do a lot of sanitization and restructuring when crossing the barrier.


That's true of Rust as well, so it's not really an advantage unique to Zig.


Is it? Most of the time I read you have to create a wrapper, like here: https://docs.rust-embedded.org/book/interoperability/c-with-...

Please provide some documentation on how to use C libraries without such an interop layer in Rust. And while bindgen does most of the work, it can be pretty tedious to get running.


Rust has a really high learning curve.


Perhaps worse is the fatigue curve that some people claim sets in after a few years of using it.


Do you have links on people’s experience with the fatigue curve?

I’ve heard of “hard to learn, easy to forget” but I haven’t seen people document it for career reasons.


I have no hard data. I have seen comments to this effect on HN. Somewhat famously, Primagen threw in the towel on it. I would love to hear from others with 4+ years of Rust experience, though.


I think that's mostly async fatigue.

Avoid it and you're good, you just have to accept that a big part of the language is not worth its weight. I guess at that point a lot of people get disillusioned and abandon it whole, when in reality you can just choose to ignore that part of the language.

(I'm rewriting my codex-rs fork to remove all traces of async as we speak.)


That does seem like a lot to give up, however, if you're doing any amount of I/O. No?


If "any amount" means millions of concurrent connections, maybe. But in reality we built thread-based concurrency, event loops, and state machines for decades before automatic state-machine creation from async code came along.

Async doesn't have access to anything that sync Rust doesn't have access to; it just provides syntactic sugar for an opinionated way of working with it.

On the contrary, async is very incompatible with mmap, for example, because a page fault can pause the thread, which blocks the entire executor or executor thread.

I'd even argue that once you hit that scale you want more control than async offers, so it's only good at that middle ground where you have a lot of stuff, but you also don't really care enough to architect the thing properly.


Thanks for the perspective. Very much appreciated.


I guess lack of job positions could be one kind of fatigue curve.


I tried to use Swift outside Xcode and it's a pain. Especially when writing CLI apps: the Swift compiler choked and said there was an error, mentioning no line number. Good luck with that. Also, the Swift tooling outside Xcode is miserable. Even Rust's tooling is better than that, and Swift has a multi-billion-dollar company behind it. What a shame…


None of those memory-safe languages allows you to work without a heap. And I don't mean "avoid allocations in that little critical loop"; I mean "no dynamic allocation, never ever". A lot of tasks don't actually require dynamic allocation; for some it's highly undesirable (e.g. embedded with limited memory and long uptimes), and for some it's not even an option (like when you are writing an allocator). Rust has some support for zero-runtime use, but a lot of its features are either useless or outright in the way when you are not using a heap. Swift and the others don't even bother.

Unpopular opinion: safety is a red herring. A language shouldn't prevent the programmer from doing the unsafe thing; rather, it should provide an ergonomic way to do things in a safe way. If there is no such way, that's on the language designer, not the programmer. Rust is the worst offender: there is still no way to do parent links other than ECS/"data oriented", which, while it has its advantages, is both quite unergonomic and provides memory safety by flaying it, stuffing the skin with cow dung, and throwing the rest out of the window.

> strong dislike towards Go.

Go unsolves the problem without unlocking any new possibilities. Zig unsolves the problem, but it aims at niches where the "solution" doesn't work.


> Rust has some support for zero-runtime use, but a lot of its features are either useless or outright in the way when you are not using a heap.

Could you give some examples?


Genuinely curious because I don't know: when you group Swift with Rust here, do you mean in terms of memory safety guarantees or in the sense of being used for systems-level projects? I've always thought of Swift as having runtime safety (via ARC), not the same compile-time model as Rust, and mostly confined to Apple platforms.

I'm surprised to see them mentioned alongside each other, but I may very well be missing something basic.


Swift is mostly runtime-enforced now, but there are a lot of cultural affinities (for lack of a better term) between Swift and Rust, and there's a proposal to add ownership: https://github.com/swiftlang/swift/blob/main/docs/OwnershipM...


This is a very friendly and cordial response, given that the parent comment was implying that the creators of Zed don't actually know how to build software. Based on their credentials building Rails CRUD apps, I suppose.


Put another way: errors tend to either be handled "close by" or "far away", but rarely "in the middle".

So Java's checked exceptions force you to write verbose and pointless code in all the wrong places (the "in the middle" code that can't handle and doesn't care about the exception).


> So Java's checked exceptions force you to write verbose and pointless code in all the wrong places (the "in the middle" code that can't handle and doesn't care about the exception).

It doesn't; you can just declare that the function throws these as well. You don't have to handle them directly.


It pollutes type signatures. If some method deep down the call stack changes its implementation details from throwing exception A, which you don't care about, to throwing exception B, which you also don't care about, you have to change the `throws` clause on your method.

This is annoying enough to deal with in concrete code, but interfaces make it a nightmare.


You mean like using Result with a long list of possible errors, thus having crates that handle this magically with macros?


Yes, the exact same problem is present in languages where "errors are just values".

To solve this, Rust does allow you to just Box<dyn Error> (or equivalents like anyhow). And Go has the Error interface. People who list out all concrete error types are just masochists.


Go, as usual, got this clever idea of using strings and having people parse error messages.

It took until version 1.13 to get something better, and even now too many people still do errors.New("....."), because that's how the Go world is.


It feels like there's a lot of shifting goalposts. A year ago, the hype was that knowledge work would cease to exist by 2027.

Now we are trying to hype up enhanced email autocomplete and data analysis as revolutionary?

I agree that those things are useful. But it's not really addressing the criticism. I would have zero criticisms of AI marketing if it was "hey, look at this new technology that can assist your employees and make them 20% more productive".

I think there's also a healthy dose of skepticism after the internet and social media age. Those were also society altering technologies that purported to democratize the political and economic system. I don't think those goals were accomplished, although without a doubt many workers and industries were made more productive. That effect is definitely real and I'm not denying that.

But in other areas, the last 3 decades of technological advancement have been a resounding failure. We haven't made a dent in educational outcomes or intergenerational poverty, for instance.


I think benchmark targeting is going to be a serious problem going forward. The recent Nate Silver podcast on poker performance is interesting. Basically, the LLM models still suck at playing poker.

Poker tests intelligence. So what gives? One interesting thing is that, for whatever reason, poker performance isn't used as a benchmark in the LLM showdown between big tech companies.

The models have definitely improved in the past few years. I'm skeptical that there's been a "breakthrough", and I'm growing more skeptical of the exponential growth theory. It looks to me like the big tech companies are just throwing huge compute and engineering budgets at the existing transformer tech, to improve benchmarks one by one.

I'm sure if Google allocated 10 engineers and a dozen million dollars to improving Gemini's poker performance, it would increase. The idea behind AGI and the exponential growth hypothesis is that you don't have to do that, because the AI gets smarter in a general sense all on its own.


I think that's generally fair, but this point goes too far:

> improve benchmarks one by one

If you're right about that in the strong sense — that each task needs to be optimised in total isolation — then it would be a longer, slower road to a really powerful humanlike system.

What I think is really happening, though, is that each specific task (eg. coding) is having large spillover effects on other areas (eg. helping them to be better at extended verbal reasoning even when not writing any code). The AI labs can't do everything at once, so they're focusing where:

- It's easy to generate more data and measure results (coding, maths etc.)

- There's a relative lack of good data in the existing training corpus (eg. good agentic reasoning logic - the kinds of internal monologues that humans rarely write down)

- Areas where it would be immediately useful for the models to get better in a targeted way (eg. agentic tool-use; developing great hypothesis-generation instincts in scientific fields like algorithm design, drug discovery and ML research)

By the time those tasks are optimised, I suspect the spill over effects will be substantial and the models will generally be much more capable.

Beyond that, the labs are all pretty open about the fact that they want to use the resulting AI talents for coding, reasoning and research skills to accelerate their own research. If that works (definitely not obvious yet) then finding ways to train a much broader array of skills could be much faster because that process itself would be increasingly automated.


This is my experience, too. As a concrete example, I'll need to write a mapper function to convert between a protobuf type and Go type. The types are mirror reflections of each other, and I feed the complete APIs of both in my prompt.

I've yet to find an LLM that can reliably generate mapping code between proto.Foo{ID string} and gomodel.Foo{ID string}.

It still saves me time, because even 50% accuracy is still half the code I don't have to write myself.

But it makes me feel like I'm taking crazy pills whenever I read about AI hype. I'm open to the idea that I'm prompting wrong, need a better workflow, etc. But I'm not a luddite, I've "reached up and put in the work" and am always trying to learn new tools.


An LLM's ability to do a task is roughly correlated with the number of times that task has been done on the internet before. If you want to see the hype version, you need to write a todo web app in TypeScript or similar. So it's probably not something you can fix with prompts, but having a model with more focus on relevant training data might help.


These days, they'll sometimes also RL on a task if it's easy to validate outputs and if it seems worth the effort.


This honestly seems like something that could be better handled with pre-LLM technology, like a 15-line Perl script that reads one on stdin, applies some crufty regexes, and writes the other to stdout. Are there complexities I'm not seeing?

