
cshenton · on April 26, 2024

My experience is that the ecosystem is a mess. I have hit winit, wgpu, and countless bevy bugs; iteration times are abysmal; documentation is nonexistent. In the time it would take me to make a game in popular Rust tooling, I could build the game and engine from scratch in C and also have something that would crash less.

Klonoar · on April 27, 2024

> documentation is nonexistent

You know, I think this point is important to get right: there are generally docs, and Rust does a very good job of making it easy to write docs.

What doesn't always exist are guides that explain how to piece things together. Sometimes you wind up needing to really know the inner platform to piece together things in Rust, and while I love the language, this is one area where the community could improve.

logicprog · on April 27, 2024

Yeah, in general with large and powerful libraries or frameworks, I find that pure API documentation, even if very thorough and well explained on an individual function or data structure level, is simply not enough. I also want a reference manual type experience, where that API reference is integrated with explanations of the reasoning behind how the framework was designed, how to actually think about using it, and examples of many common things you might want to do that integrate well together. The gold standard for this, in my opinion, is the OpenGL Red Book.

logicprog · on April 27, 2024

This, and the fact that correctness and safety and stability aren't quite as important in game development, or even game engine development, as they are in other fields where Rust is applicable, is why I purposefully and happily use the powerful, featureful, well established, copiously documented C or C++ libraries I need, instead of the Rust alternatives, for almost everything. It works extremely well for me because I get to leverage the power and amazing ecosystem around things like Dear ImGui or SDL2 or OpenGL or PhysX, while being able to use Rust, which grants me essentially a cleaner, even more modern version of C++ with all of the features I love from OCaml, in a way that restricts any weird crash or memory safety errors to the places where I interface with the lower level libraries, and sometimes not even there, depending on how high level the bindings are. It's honestly pretty nice for me.

cshenton · on April 23, 2024

You absolutely cannot implement stream compaction “at the speed of native”, as WebGPU is missing the wave/subgroup intrinsics and globally coherent memory necessary to do that as efficiently as possible.
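For readers who have not met the term: stream compaction keeps only the elements that pass a predicate and packs them contiguously. Below is a minimal CPU sketch of the classic scan-based formulation (flag, exclusive prefix sum, scatter). It is not taken from the article or from any WebGPU implementation, and the name `compact` is made up for illustration; on a GPU the scan is the step that wave/subgroup intrinsics would accelerate.

```python
from itertools import accumulate


def compact(values, keep):
    """Scan-based stream compaction (sequential sketch of the GPU pattern)."""
    # Step 1: one flag per element, 1 if it survives.
    flags = [1 if keep(v) else 0 for v in values]
    if not flags:
        return []
    # Step 2: exclusive prefix sum of the flags gives each survivor its output slot.
    inclusive = list(accumulate(flags))
    offsets = [0] + inclusive[:-1]
    # Step 3: scatter. On a GPU every iteration would be an independent thread.
    out = [None] * inclusive[-1]
    for i, v in enumerate(values):
        if flags[i]:
            out[offsets[i]] = v
    return out


result = compact([3, 0, 7, 0, 0, 5], lambda v: v != 0)
print(result)
```

The output order matches the input order, which is the property that separates this from a plain "append with an atomic counter" approach.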

tehsauce · on April 24, 2024

It's possible you might not need direct access to wave/subgroup ops to implement efficient stream compaction. There's a great old Nvidia blog post on "warp-aggregated atomics"

https://developer.nvidia.com/blog/cuda-pro-tip-optimized-fil...

where they show that their compiler is sometimes able to automatically convert global atomic operations into the warp local versions, and achieve the same performance as manually written intrinsics. I was recently curious if 10 years later these same optimizations had made it into other GPUs and platforms besides cuda, so I put together a simple atomics benchmark in WebGPU.

https://github.com/PWhiddy/webgpu-atomics-benchmark

The results seem to indicate that these optimizations are accessible through webgpu on chrome on both MacOS and Linux (with nvidia gpu). Note that I'm not directly testing stream compaction, just incrementing a single global atomic counter. So that would need to be tested to know for sure if the optimization still holds there. If you see any issues with the benchmark or this reasoning please let me know! I am hoping to solidify my knowledge in this area :)
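To make the idea concrete, here is a small CPU simulation of what warp aggregation buys, written as a sketch rather than as the benchmark's actual code. It assumes a subgroup width of 32 (`WARP_SIZE`, which varies on real hardware) and uses hypothetical function names. It only counts how many atomic operations each strategy would issue against the global counter; it says nothing about real timings.

```python
WARP_SIZE = 32  # assumed subgroup width; real hardware varies


def naive_slots(active):
    """Every active thread does its own atomic add of 1 on the global counter."""
    counter = 0
    atomics = 0
    slots = [None] * len(active)
    for tid, is_active in enumerate(active):
        if is_active:
            slots[tid] = counter  # the value an atomic add would return
            counter += 1
            atomics += 1
    return slots, atomics


def warp_aggregated_slots(active):
    """One atomic add per warp: a leader reserves a range, lanes take base + rank."""
    counter = 0
    atomics = 0
    slots = [None] * len(active)
    for start in range(0, len(active), WARP_SIZE):
        end = min(start + WARP_SIZE, len(active))
        lanes = [tid for tid in range(start, end) if active[tid]]
        if not lanes:
            continue  # no active lane in this warp, so no atomic at all
        base = counter  # leader adds len(lanes) once and shares the old value
        counter += len(lanes)
        atomics += 1
        for rank, tid in enumerate(lanes):
            slots[tid] = base + rank
    return slots, atomics


active = [tid % 3 == 0 for tid in range(128)]
naive, naive_ops = naive_slots(active)
aggregated, aggregated_ops = warp_aggregated_slots(active)
print(naive_ops, aggregated_ops)
```

Both strategies hand out the same set of slots; the aggregated one just touches the contended counter once per warp instead of once per thread.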

FL33TW00D · on April 23, 2024

Oh look it's subgroup support landing last week: https://github.com/gfx-rs/wgpu/pull/5301

jsheard · on April 23, 2024

That's a wgpu-specific extension, not part of the actual WebGPU spec, so you can't use it on the web.

https://github.com/gpuweb/gpuweb/blob/main/proposals/subgrou...

There is a proposal for supporting subgroups in WebGPU proper but it's still in the draft stage.

FL33TW00D · on April 23, 2024

I'm aware. It is an implementation of the linked proposal.

The `wgpu` implementation linked will make its way into Firefox eventually. Dawn will follow up with a similar one for Chrome.

I was linking it to demonstrate there are no technical hurdles and it's only really approval remaining.

sp332 · on April 24, 2024

Ok, but that's not what "landing" means.

pjmlp · on April 23, 2024

Native extensions unusable on Web browsers don't count.

littlestymaar · on April 23, 2024

Then nothing involving WebGPU counts since it's not implemented on other browsers than Chromium and not on Linux even in Chromium…

WebGPU is brand new, and the paint is still wet. It doesn't make sense to dismiss things that haven't landed in browsers yet as “unusable on the web”.

mr_toad · on April 24, 2024

There’s an advanced setting in Safari to enable it, but I can’t say how well it works. In this instance it doesn’t.

senorrib · on April 24, 2024

It doesn't work at all. Doesn't even exist in Safari anymore because they ditched the old implementation and are rewriting everything.

FL33TW00D · on April 24, 2024

Multiple engineers are working on adding it back: https://github.com/WebKit/WebKit/pulls?q=is%3Apr+is%3Aclosed...

pjmlp · on April 24, 2024

Welcome to Web standards, and Google's ChromeOS transformation of the Web, with help of many Web developers out there.

Doesn't change the fact that it is a Web standard, for Web browsers.

littlestymaar · on April 24, 2024

It is a WIP web standard. And the spec is still evolving (most things are stable at this point, but new features are still being added, like this one!).

And that's how the web works, it was the same for WebRTC which spent 2-3 years in such a state, same for MSE, etc.

torginus · on April 24, 2024

I think compilers should be smart enough to substitute group-shared atomics with horizontal ops. If they're not already doing it, they should be!

But anyway, Histogram Pyramids is a more efficient algorithm for implementing parallel scan. It essentially builds a series of 3D buffers, each having half the dimension of the previous level, with each value containing the sum of the amounts in its underlying cells, and with the top cube being just a single value, the total amount.

Then, instead of doing the second pass where you figure out what index each thread is supposed to write to, and writing it to a buffer, you simply drill down into said cubes and figure out the index at the invocation of the meshing part by looking at your thread index (let's say 1616), and looking at the 8 smaller cubes (okay, cube 1 has 516 entries, so 1100 to go, cube 2 has 1031 entries, so 69 to go, cube 3 has 225 entries, so we go to cube 3), and recursively repeat until you find the index. Since all threads in a group tend to go into the same cubes, all threads tend to read the same bits of memory until getting down to the bottom levels, making it very GPU cache friendly (divergent reads kill GPGPU perf).
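A rough CPU sketch of that drill-down, under some simplifying assumptions: the 3D volume is flattened so that each parent owns 8 consecutive children (standing in for the 2x2x2 sub-cubes), and the names `build_pyramid` and `find_cell` are invented here, not taken from the HistoPyramids paper. The demo reuses the three cube counts mentioned above with an arbitrary index.

```python
BRANCH = 8  # children per cell, standing in for the 2x2x2 sub-cubes


def build_pyramid(counts):
    """levels[0] is the base; each higher level sums groups of BRANCH cells."""
    levels = [list(counts)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([sum(prev[i:i + BRANCH]) for i in range(0, len(prev), BRANCH)])
    return levels


def find_cell(levels, k):
    """Return (base_cell, index_within_cell) for the k-th output element."""
    if not 0 <= k < levels[-1][0]:
        raise IndexError("k is outside the total count")
    cell = 0
    # Walk from the level just below the top down to the base.
    for level in reversed(levels[:-1]):
        child = cell * BRANCH
        # Skip every child whose whole subtree lies before k.
        while k >= level[child]:
            k -= level[child]
            child += 1
        cell = child
    return cell, k


levels = build_pyramid([516, 1031, 225, 0, 0, 0, 0, 0])
cell, local = find_cell(levels, 1600)
print(cell, local)  # cell is zero-based, so 2 means the third cube
```

Each thread only reads a handful of counts per level, and neighbouring thread indices follow the same path for most of the descent, which is where the cache friendliness comes from.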

Forgive me if I got the technical terminology wrong, I haven't actually worked on GPGPU in more than a decade, but it's fun to note that something that I did circa 2011 as an undergrad is suddenly relevant again (I implemented HistoPyramids from a 2007ish paper, and Marching Cubes, a 1980s algorithm). Everything old is new again.

masspro · on April 23, 2024

You seem knowledgeable, and I’m possibly going back into a GPGPU project after many years out of the game, so: overall do you see a good future for filling these compute-related gaps in the WebGPU API? Really I’m wondering whether wgpu is an okay choice versus raw Vulkan for native GPGPU outside the browser.

jsheard · on April 23, 2024

The answer to that for any given feature is "can untrusted code be trusted with that?". Wave intrinsics are probably doable. Bindless maybe, but expect a bunch of bounds checking overhead. Pointers/BDA, absolutely not.

Native libraries like wgpu can do whatever they want in extensions, safety be damned, but you're stepping outside of the WebGPU spec in that case.

littlestymaar · on April 23, 2024

What's BDA in that context, please? I can only confidently assume it's not “battle damage assessment”.

jsheard · on April 23, 2024

Buffer Device Address, the Vulkan name for raw pointers in shaders.

littlestymaar · on April 23, 2024

tormeh · on April 23, 2024

Don't know about GPGPU, but I can give you a probably correct answer: compared to "native" APIs, you trade features for compatibility. It's always going to lag behind Vulkan/DX/Metal. Are you OK with excluding platforms? If so, use Vulkan/Metal/DX. If not, then I'd give wgpu a chance. Wgpu is also higher-level than Vulkan, which is both a pro and a con.

pjmlp · on April 24, 2024

Middleware, the portability, latest features of native APIs, and nice GPGPU tooling.

dekhn · on April 23, 2024

shhh... you might cause their reality distortion field to fail!

Archit3ch · on April 23, 2024

The demo doesn't work on mobile Chrome. Worse, the blog post crashes the embedded browser in the HN app. May I suggest just linking to the demo instead?

spintin · on April 23, 2024

This is the eternal browserbros' attempt to make us think native has zero value now that we have a completely captured and bloated browser.

The browser is dead, the only thing you can use it for is filling out HTML forms and maybe some light inventory management.

The final app is C+Java where you put the right stuff where it is needed. Just like the browser used to be before Oracle did its magic on the applet.

worik · on April 23, 2024

> The browser is dead,

Yea. Nah!

That obit is a bit premature

teaearlgraycold · on April 23, 2024

So you're telling me you write Java professionally?

pdpi · on April 23, 2024

Funnily enough, in a world with WASM, we might actually have Java in the backend and C in the frontend rather than vice versa as it would've been likelier in the 90s.

pjmlp · on April 24, 2024

The irony of half the world, backed by VC money, trying to reinvent Erlang, Java and .NET application servers, while pretending to be innovative.

spintin · on April 23, 2024

WASM is adding GC... recreating the wheel of the applet but without escaping the problem of javascript glue.

Go is just Java without the VM.

Rust is just a native compiler that creates slow programs and complains a lot.

worik · on April 23, 2024

> Rust is just a native compiler that creates slow programs and complains a lot.

Good morning Troll

I'll give you "complains a lot."

neonsunset · on April 24, 2024

Corrective upvote from me - the comment is too funny

junon · on April 24, 2024

You had me all the way up until the rust bit.

spintin · on April 23, 2024

It's pretty much the only professional language you can write.

If you consider respect and responsibility.

cshenton · on Oct 9, 2023

Really lovely. A lot here reminds me of design in Odin lang. Short integral types, no const, composite returns over out params. Big fan of the approach of designing for a single translation unit and exploiting the optimisations that provides from RVO etc.

cshenton · on July 22, 2023

Odin is lovely and a great fit for graphics and games programming. I’ve been doing my real time graphics research in it (the CPU parts at least).

cshenton · on May 19, 2023

Absolutely! On a related note, looking for C development jobs is a real pain for this reason.

cshenton · on Dec 28, 2022

I personally find the experience of writing GPU compute code pretty nice on graphics APIs. The interface is pretty much the same “dispatch a 1-3D set of 1-3D work group indices”.
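As a sketch of what that interface boils down to, the snippet below enumerates a dispatch on the CPU. The terms workgroup, local ID and global ID follow the usual WebGPU/Vulkan vocabulary, but the helper names (`workgroups_needed`, `invocations`) and the 8x8x1 workgroup size are assumptions made for illustration only.

```python
from itertools import product


def workgroups_needed(n_items, workgroup_size):
    """Ceil-divide per axis so every item is covered by some invocation."""
    return tuple(-(-n // s) for n, s in zip(n_items, workgroup_size))


def invocations(num_workgroups, workgroup_size):
    """Yield (workgroup_id, local_id, global_id) for every invocation in a dispatch."""
    for wg in product(*(range(n) for n in num_workgroups)):
        for local in product(*(range(s) for s in workgroup_size)):
            # global = workgroup index * workgroup size + local index, per axis
            global_id = tuple(w * s + l for w, s, l in zip(wg, workgroup_size, local))
            yield wg, local, global_id


size = (8, 8, 1)                                 # assumed workgroup size
groups = workgroups_needed((20, 10, 1), size)    # covering a 20x10 grid of items
ids = [g for _, _, g in invocations(groups, size)]
print(groups, len(ids))
```

Because the item counts are rounded up to whole workgroups, some invocations fall outside the 20x10 grid, which is why kernels normally start with a bounds check on the global ID.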

The main pain points vs dedicated compute stuff like CUDA are libraries and the boilerplate to manage memory and launch kernels.

synergy20 · on Dec 29, 2022

And how to make the kernel and memory-allocation code work with TensorFlow/PyTorch. As far as ML is concerned, GPGPU is really now just a few libraries made for TensorFlow and PyTorch to invoke, same as CUDA.

cshenton · on Dec 28, 2022

For 3D, use a game engine. Unity, Unreal, maybe Godot depending on your perf requirements.

bmitc · on Dec 28, 2022

Thanks. That seems to be the case. My project is currently on top of .NET, so Godot and Stride have been the two I've looked at, although Stride is the only one supporting .NET Core (i.e., .NET 5+). Godot and Unity require Mono still.

cshenton · on Dec 28, 2022

When you look at how much nicer the programming experience is for something like Metal it’s hard to blame them.

Think I’d prefer to write 3 backends for Metal level complexity APIs than a single one targeting Vulkan.

thewebcount · on Dec 29, 2022

Yes, so much this. It’s like someone looked at OpenGL and said, “Hey, how can we take all the hard, ugly, terrible parts of OpenGL and throw out all the nice, useful parts?” And that became Vulkan. I’ve written code for OpenGL since the 1.0 days. These days I do Metal. I wanted to check out what it would take to port something to Vulkan. I couldn’t make it through the basic Vulkan tutorial of putting a single triangle on the screen. It was too long, and required too many really low-level things. It felt almost like you needed to write your own memory allocator to use it properly. It was nuts.

pjmlp · on Dec 29, 2022

The worst part is that Khronos doesn't seem to get it; as with OpenGL, they expect the community to come up with ways to fix it.

There are some steps into that direction with LunarG SDK efforts, NVidia was the one coming up with C++ bindings, but that is mostly it.

Nothing as nice out of the box as the proprietary APIs.

A guy from a studio I know described it best: one needs to be both a graphics developer and a device driver expert to properly code against Vulkan.

cshenton · on Dec 28, 2022

If someone wants to get into graphics API programming that’s probably a bit too high level, as lovely as it is.

pjmlp · on Dec 29, 2022

In 2022 starting with raw 3D APIs is like telling someone they need to learn Assembly from day 1.

cshenton · on Dec 28, 2022

I know you mean major desktop platforms, but in terms of market share for real time 3D applications, mobile and console dwarf Mac and Linux.

tmtvl · on Dec 28, 2022

Yeah, but mobile means Linux and Apple, and console means Windows, BSD, and whatever Switch is. Vulkan works across pretty much all of that. Actually, considering Zink exists, OpenGL may be the most universal...

pjmlp · on Dec 29, 2022

Zero Vulkan or OpenGL on PlayStation and Xbox.

On the Switch, if you really want the full 3D capabilities, NVN is the answer.