> The overhead of vulkan is absolutely insane (and not warranted, in my opinion)...

mschuetz · on July 6, 2023

I meant the development/learning overhead. With Vulkan you can do incredible low-level optimizations to squeeze every last bit of performance out of your 3D application, but because you are basically mandated to do it that way, you have a very harsh learning curve and need lots of code for the simplest tasks. I'd rather prefer approaches that make the common things that everyone wants to do easy (draw your first simple scenes), and then optionally gives you all the features to sqeeze out performance where you really need to. Because at least for me, I don't work with massive scenes with millions of instances of thousands of different objects. I do real-time graphics research, mostly with compute, and I'd just like to present the triangles I've created or the framebuffers I created via compute (like shadertoy).

I'll readily admit, I'm neither smart nor patient enough for Vulkan so I quickly gave up and learned CUDA instead, because it was way easier to write a naive software-rasterizer for triangles in CUDA, than it was to combine a compute shader and a vertex+fragment shader in Vulkan. I'm just rendering a single buffer with ~100k compute-generated triangles, and learning Vulkan for that just wasn't worth it.

DeathArrow · on July 6, 2023

Well, then there's OpenGL and DirectX.

dosshell · on July 7, 2023

And webgpu.

Sure, the implementations today are thin layers over D3D12, Vulkan and Mantel. But it is a proper API, cross platform and without the implementation overhead.

Next time I need to draw some graphics I will definitely use WebGPU instead of OpenGL or Vulkan.

PS. Don't let the name fool you. It is a proper render API, Mozilla has written their implementation in rust and Google in C++.

Would not be surprised if WebGPU do get native support by the drivers in the future.

mschuetz · on July 6, 2023

Yeah, but OpenGL doesn't get updates anymore. My timeline goes like: I needed pointers(and pointer casting) for my compute shaders so I checked the corresponding GLSL extension, which was only available in Vulkan so I tried switching from OpenGL to Vulkan. After a week I gave up - the pointer/Buffer reference extension did not look promising anyway - and I tried out CUDA instead. That's when I found out that CUDA is the greatest shit ever. That's what I want graphics programming to be like. Since then I just render all the triangles and lines in CUDA, because it easily handles hundreds of thousands of them in real-time with a naive, unoptimized software-rasterizer, and that's all I need. In addition to the billion points you can also render in real-time in CUDA with atomics.

cesarb · on July 6, 2023

> That's when I found out that CUDA is the greatest shit ever. That's what I want graphics programming to be like.

Something which requires you to buy new hardware from a specific brand, and load an out-of-tree binary-only module on your kernel? That's not what I want graphics programming to be like.

The Vulkan API might be clunkier (I don't know, I haven't looked at the CUDA API, since I don't have the required hardware), but at least it can work everywhere.

mschuetz · on July 7, 2023

It's the exact reason why I've avoided CUDA for years, but I hit a dead end with OpenGL and Vulkan, and CUDA happened to be a fantastic, easy and fast solution. Of course I don't want graphics programming to be NVIDIA-only, but I want it to be like CUDA, just for all platforms.

pjmlp · on July 7, 2023

Except it doesn't work everywhere, far from it.

adastra22 · on July 7, 2023

I'm confused by what you mean here by "software-rasterizer" when you are compiling it to run on the GPU. What do you think the graphics driver is doing when you program it with OpenGL? It's doing more or less the same thing under the hood.

I guess you can technically call that software rasterization now that GPUs are very programmable. But it's not how the word is usually used.

TazeTSchnitzel · on July 7, 2023

GPUs have dedicated hardware for rasterisation. Using compute shaders to do it would be wasteful.

adastra22 · on July 7, 2023

Not necessarily. I'm working on an application that does exactly that, as we are rendering spheres. It ends up being much more efficient to rasterize the sphere in a combination of vertex and fragment shaders than to instantiate triangles.

mschuetz · on July 7, 2023

Nanite renders small triangles faster by avoiding the dedicated hardware and using compute, instead.

TazeTSchnitzel · on July 9, 2023

Micro-triangles are a special case and Nanite still uses the hardware for more reasonably sized triangles.

mschuetz · on July 9, 2023

"Turns out we can beat the hardware with triangles much bigger than expected, far past micropoly. We software rasterize any clusters whos triangles are less than 32 pixels long." - https://advances.realtimerendering.com/s2021/Karis_Nanite_SI...

It's not just micro-triangles, it works for triangles that span multiple pixels. And that's not a special case, that's the standard nowadays, except for games targeting very low-end devices.

The dedicated hardware is nice for general-purpose support for arbitrary triangle-soups. But if you structure triangles a certain way and have a certain amount of density (which you want for modern games), you can specifically optimize for that and beat the general-purpose hardware rasterizer.