superlopuh's comments | Hacker News

In MLIR, there are two representations of memory, `tensor` and `memref`. Having both lets you do some high-level transformations[0] while still in SSA form, before "bufferizing" tensors into memrefs, which are eventually lowered to LLVM pointers.

[0]: https://mlir.llvm.org/docs/Dialects/TensorOps/


Python with Pyright in strict mode. I work on a ~200kLOC fully typed Python project [0] and am having fun.

[0]: https://github.com/xdslproject/xdsl


Have you worked with TypeScript? I work with both every day and I'm always frustrated by the limits of the 'type' system in Python. Sure, it's better than nothing, but it's very basic compared to what you can do in TypeScript. Advanced generics are easy to use in TypeScript but hell (sometimes outright impossible) to express in Python.
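
For a concrete taste, a signature-preserving decorator via ParamSpec is roughly the ceiling of what's ergonomic in Python today (a sketch, requires Python 3.10+):

    from collections.abc import Callable
    from typing import ParamSpec, TypeVar

    P = ParamSpec("P")
    R = TypeVar("R")

    def logged(f: Callable[P, R]) -> Callable[P, R]:
        # The checker sees the wrapped function's exact signature.
        def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
            print(f"calling {f.__name__}")
            return f(*args, **kwargs)

        return wrapper

Anything like TypeScript's mapped or conditional types has no Python equivalent at all.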


Yep, although never in a project of a similar size. One advantage of the Python setup is that the types are ignored at runtime, so there's no transpilation step or startup overhead. That's also a disadvantage in terms of what you can do in the type system, of course.


Deno and the latest versions of Node.js run TS code without a transpilation step.


I agree it is pretty nice (with uv, and as long as you REALLY don't care about performance). But even if you are one of the enlightened few to use that setup, you still have to deal with dependencies that don't have type annotations, or that only have basic ones like `dict`.
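
For the untyped ones, one workaround is to keep local stub files and point Pyright at them via the `stubPath` setting in pyrightconfig.json. A sketch, with a hypothetical untyped `legacylib` dependency:

    # typings/legacylib.pyi -- hand-written stub for a hypothetical
    # dependency that ships no type annotations
    class Connection:
        def close(self) -> None: ...

    def connect(url: str, *, timeout: float = ...) -> Connection: ...
    def load_config(path: str) -> dict[str, str]: ...

It's busywork, but it beats `Any` leaking through the whole codebase.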

Typescript (via Deno) is still a better option IMO.


Can someone familiar with performance of LLMs please tell me how important this is to the overall perf? I'm interested in looking into optimizing tokenizers, and have not yet run the measurements. I would have assumed that the cost is generally dominated by matmuls but am encouraged by the reception of this post in the comments.


Tokenization is typically done on CPU and is rarely (if ever) a bottleneck for training or inference.

GPU kernels typically dominate in terms of wall clock time; the only exception might be very small models.

Thus the latency of tokenization can essentially be “hidden” by having the CPU prepare the next batch while the GPU finishes the current one.
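
A minimal sketch of that overlap (real frameworks use dedicated dataloader worker processes, but the shape is the same):

    import queue
    import threading
    from collections.abc import Callable, Iterable, Iterator

    def prefetch_tokenized(
        texts: Iterable[str],
        tokenize: Callable[[str], list[int]],
        depth: int = 2,
    ) -> Iterator[list[int]]:
        """Tokenize on a background CPU thread while the consumer
        (e.g. the GPU step) is busy with earlier batches."""
        q: queue.Queue = queue.Queue(maxsize=depth)
        sentinel = object()

        def producer() -> None:
            for text in texts:
                q.put(tokenize(text))  # blocks once `depth` items are queued
            q.put(sentinel)

        threading.Thread(target=producer, daemon=True).start()
        while (item := q.get()) is not sentinel:
            yield item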


Tokenizing text is a ridiculously small part of the overall computation that goes into serving a request. That said, if you're doing this on petabytes of data, it never hurts to have something faster.


A language that isn’t memory-safe can definitely hurt. AI needs more security, not less.


To echo the other replies, the tokenizer is definitely not the bottleneck. It just happens to be the first step in inference, so it's what I did first.


Tokenization performance is complicated, but your guidepost is that the institutions with the resources and talent to do so choose to write extremely fast tokenizers: sentencepiece and tiktoken both pay dearly in complexity (particularly complexity of deployment, because now you've got another axis of architecture-specific build/bundle/dylib to manage on top of whatever your accelerator burden already was: it's now aarch64 cross x86_64 cross CUDA capability...)

Sometimes tokenization can overlap with accelerator work, but pros look at flame graphs: a CPU core running the AVX lanes hard while still not keeping the bus fed, a million things like that. People pre-tokenize big runs all the time.
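
For example, pre-tokenizing a corpus once, up front, with tiktoken (a sketch, assuming tiktoken is installed; `encode_batch` fans the work out across threads):

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    corpus = ["first document", "second document"]
    # Tokenize the whole corpus ahead of the training/inference run.
    token_ids = enc.encode_batch(corpus, num_threads=8)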

I don't know why this thread is full of "nothing to see here": this obliterates the SOTA from the money-is-no-object status quo. I'd like to think better of the community than the obvious explanation, which is that C++ is threatening a modest mindshare comeback against a Rust narrative that's already under pressure from the explosion of interest in Zig. Maybe there's a better reason.


I really want to switch to Zed from Cursor but the battery usage for my Python project with Pyright is unjustifiable. There are issues for this on GitHub and I'm just sad that the team isn't prioritising this more.


It’s funny you mention this because I have an issue with Cursor where if I leave it open for more than a few hours my M3 Max starts to melt, the fans spin up, and it turns out to be using most of my CPU when it should be idling.

Zed on the other hand works perfectly fine for me. Goes to show how a slightly different stack can completely change one’s experiences with a tool.


I love that language and frequently show it to people. I'm sad to see that my local install doesn't work any more. I actually used it to solve a puzzle in Evoland 2 that I'm relatively sure was added as a joke, and is not solvable in a reasonable time without a solver. I'm doing a PhD in compilers right now, and would love to chat about Sentient if you have the time. My email is sasha@lopoukhine.com.


You might be interested in looking at MiniZinc (https://minizinc.org/), an open-source modelling language for combinatorial problems. The system comes from a constraint programming background, but the language is solver-agnostic and can be used to compile to many different types of solvers.
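
A tiny taste via the minizinc-python bindings (a sketch, assuming MiniZinc and the `minizinc` package are installed, with the bundled Gecode solver):

    from minizinc import Instance, Model, Solver

    model = Model()
    model.add_string(
        """
        var 1..9: x;
        var 1..9: y;
        constraint x + y = 10;
        constraint x * y = 21;
        solve satisfy;
        """
    )
    instance = Instance(Solver.lookup("gecode"), model)
    result = instance.solve()
    print(result["x"], result["y"])  # 3 7 (or 7 3)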


I just get a bunch of stars.


    $ traceroute bad.horse --resolve-hostnames
    traceroute to bad.horse (162.252.205.157), 64 hops max
      1   192.168.218.150 (_gateway)  0.483ms  0.580ms  0.548ms 
      2   *  *  * 
      3   172.24.41.146 (172.24.41.146)  35.920ms  29.950ms  30.014ms 
      4   *  *  * 
      5   172.24.195.1 (172.24.195.1)  30.097ms  24.669ms  28.539ms 
      6   172.24.195.10 (172.24.195.10)  30.086ms  40.166ms  31.028ms 
      7   80.233.113.58 (80.233.113.58)  29.500ms  23.967ms  43.113ms 
      8   185.6.36.58 (3ireland.ipv4.sw01.inex.ie)  33.895ms  24.975ms  34.959ms 
      9   185.6.36.86 (inex-01.ixp.dubie.as8220.net)  22.309ms  19.770ms  19.886ms 
     10   217.163.150.193 (1-1-c21-1.ear2.Dublin6.Level3.net)  36.257ms  41.033ms  39.510ms 
     11   4.69.218.54 (ae1.8.bar4.Toronto1.net.lumen.tech)  113.272ms  112.558ms  126.953ms 
     12   4.16.51.30 (level3-gw.core02.tor1.prioritycolo.com)  115.108ms  123.140ms  109.593ms 
     13   67.223.96.90 (67.223.96.90)  121.502ms  119.570ms  121.227ms 
     14   162.252.205.130 (bad.horse)  113.161ms  116.657ms  133.119ms 
     15   162.252.205.131 (bad.horse)  119.704ms  128.309ms  116.360ms 
     16   162.252.205.132 (bad.horse)  125.646ms  125.660ms  130.224ms 
     17   162.252.205.133 (bad.horse)  132.803ms  122.749ms  126.137ms 
     18   162.252.205.134 (he.rides.across.the.nation)  133.363ms  132.422ms  145.895ms 
     19   162.252.205.135 (the.thoroughbred.of.sin)  154.954ms  136.148ms  142.345ms 
     20   162.252.205.136 (he.got.the.application)  158.392ms  151.055ms  148.331ms 
     21   162.252.205.137 (that.you.just.sent.in)  156.020ms  146.436ms  159.945ms 
     22   162.252.205.138 (it.needs.evaluation)  167.918ms  157.544ms  171.169ms 
     23   162.252.205.139 (so.let.the.games.begin)  161.076ms  173.227ms  162.558ms 
     24   162.252.205.140 (a.heinous.crime)  182.682ms  175.853ms  167.833ms 
     25   162.252.205.141 (a.show.of.force)  186.703ms  222.181ms  590.896ms 
     26   162.252.205.142 (a.murder.would.be.nice.of.course)  602.997ms  175.406ms  179.810ms 
     27   162.252.205.143 (bad.horse)  196.847ms  187.985ms  190.071ms 
     28   162.252.205.144 (bad.horse)  193.553ms  184.281ms  196.659ms 
     29   162.252.205.145 (bad.horse)  187.502ms  195.218ms  194.829ms 
     30   162.252.205.146 (he-s.bad)  207.271ms  194.007ms  200.673ms 
     31   162.252.205.147 (the.evil.league.of.evil)  204.942ms  200.672ms  199.780ms 
     32   162.252.205.148 (is.watching.so.beware)  202.099ms  214.924ms  213.818ms 
     33   162.252.205.149 (the.grade.that.you.receive)  233.816ms  220.345ms  207.857ms 
     34   162.252.205.150 (will.be.your.last.we.swear)  229.135ms  224.107ms  221.103ms 
     35   162.252.205.151 (so.make.the.bad.horse.gleeful)  272.907ms  241.476ms  222.392ms 
     36   162.252.205.152 (or.he-ll.make.you.his.mare)  250.987ms  224.407ms  223.308ms 
     37   162.252.205.153 (o_o)  246.874ms  234.844ms  234.278ms 
     38   162.252.205.154 (you-re.saddled.up)  269.535ms  241.234ms  238.331ms 
     39   162.252.205.155 (there-s.no.recourse)  243.611ms  244.293ms  236.522ms 
     40   162.252.205.156 (it-s.hi-ho.silver)  241.793ms  268.596ms  281.158ms 
     41   162.252.205.157 (signed.bad.horse)  286.400ms  267.379ms  280.296ms


Beautiful


A bunch of stars aren't a very good horse?


I once used Sentient[0] to solve a similar puzzle in a different game, where I think a puzzle requiring a constraint solver was included as a joke on the player. I'm a bit saddened to see that the repo hasn't been updated in 6 years, and that a Node update broke the binary installed on my computer. I find it a much more ergonomic environment than Prolog; hopefully someone else will pick up the mantle.

[0]: https://sentient-lang.org/


My experience with Zig is that it's also a plausible replacement for C++


explanation needed, bro.

nerdy langnoob or noobie langnerd here. not sure which is which, cuz my parsing skills are nearly zilch. ;)


Zig’s comptime feature gives a lot of the power of C++ templates. This makes it easier to implement a lot of libraries which are just a pain to do in plain C. Even seemingly simple things like a generic vector are annoying to do in C, unless you abandon type safety and just start passing pointers to void.

I believe Zig even has some libraries for doing more exotic things easily, such as converting between an array of structs and a struct of arrays (and back).


>Zig’s comptime feature gives a lot of the power of C++ templates. This makes it easier to implement a lot of libraries which are just a pain to do in plain C. Even seemingly simple things like a generic vector are annoying to do in C, unless you abandon type safety and just start passing pointers to void.

I get it, thanks. In C you have to cast everything to (void *) to do things like a generic vector.

>I believe Zig even has some libraries for doing more exotic things easily, such as converting between an array of structs and a struct of arrays (and back).

yes, iirc, there was a Zig thread about this on HN recently.


Depends on which C++ version you're talking about.


C programmers have long suffered the cumbersome ritual of struct manipulation—resorting to C++ merely to dress up structs as classes.

Zig shatters that with comptime and the 'type' type.


jay. eff. cee.

the prejudice and preconceived notions of some of these hn frigtards would be amazing, if it was not known in advance, for a long time now, by me and many others around here, who have more maturity and sense, and who don't have insecurity chips on their shoulders.

i mean, downvoting a perfectly innocuous comment. maybe due to use of the words nerd or noob, even though applied by me to myself, triggers their atavistic, dumbfuck response?

it's good that more sensible people like chongli, pjmlp (old standby here, and usually interesting to read his proglang comments) and infamouscow, are around here (referring to this subthread), otherwise this site would not be worth visiting at all. I, and many others probably, only stick around here, for the relatively rare better threads, that are worth checking out.

as amy hoy of stacking the bricks famously said, fuck hn. it's only optional, not part of my daily tech / biz fix (any more), and i am actively looking out for other, better places to hang out online, to read comments by, and chat with, more sensible people.

https://stackingthebricks.com/


Possibly your GP comment came across as a snarky attack because of the first sentence. It's clear that you didn't mean it that way, but there isn't enough information in "explanation needed, bro" to disambiguate your intent (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...).


saw this just now.

will reply in a day or two after reviewing my own comments, yours, and your link.


I know of a few projects looking in that direction, each optimising for different things, and none getting near the capability of LLVM, which is going to take some time. I spoke with some of the core MLIR developers about this, and they're generally open to the notion, but it's going to take a lot of volunteer effort to get there, and it's not clear who the sherpa will be, especially given that the major sponsors of the LLVM project aren't in a particular hurry. If you're interested in this, feel free to look up our paper in a week or two; we've had a bit of trouble uploading it to arXiv but should be ready soon.

https://2025.cgo.org/details/cgo-2025-papers/39/A-Multi-Leve...

Here's a quick presentation from the last dev meeting on how this can be leveraged to compile NNs to a RISC-V-based accelerator core: https://www.youtube.com/watch?v=RSTjn_wA16A&t=1s


The typing version is still useful when you want to communicate that the result conforms to a certain interface (which, in the case of Set, doesn't include mutability) but not to an exact type.

Edit: I see that he imported the typing Set, which is deprecated, instead of collections.abc.Set, which is still useful, so your comment is correct.
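
To illustrate, the abstract Set accepts anything set-like, mutable or not (a sketch):

    from collections.abc import Set

    def shared(a: Set[str], b: Set[str]) -> Set[str]:
        # The Set ABC promises membership, iteration, and the boolean
        # operators, but no mutation.
        return a & b

    shared({"a", "b"}, frozenset({"b", "c"}))  # set and frozenset both qualify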

