Bindless Vulkan is in some ways simpler than binding for each draw. But there are now more synchronization conditions.
That big table of descriptors is an unusual data object. It's normally mapped as writable from the CPU and readable from the GPU. The CPU side has to track which slots are in use. When the CPU is done with a texture slot, the GPU might not be done yet, so all deletes have to be deferred until the end of the frame. This isn't inherently difficult, but it has to be designed in.
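A minimal sketch of that CPU-side bookkeeping, in Rust. The names (`SlotAllocator`, `release`, `collect`) and the frame-number keying are illustrative assumptions, not any particular engine's API; the point is just that freeing a slot only parks it, and nothing is handed out again until the fence for the frame that last used it has signaled.

```rust
use std::collections::VecDeque;

/// Tracks which slots of a bindless descriptor table are in use on the CPU
/// side, deferring reuse of freed slots until the GPU has finished the frame
/// that may still reference them.
struct SlotAllocator {
    capacity: u32,
    next_unused: u32,              // high-water mark of never-used slots
    free: Vec<u32>,                // slots that are safe to hand out again
    pending: VecDeque<(u64, u32)>, // (frame released, slot) possibly still in flight
}

impl SlotAllocator {
    fn new(capacity: u32) -> Self {
        Self { capacity, next_unused: 0, free: Vec::new(), pending: VecDeque::new() }
    }

    /// Grab a slot for a newly created texture descriptor.
    fn allocate(&mut self) -> Option<u32> {
        if let Some(slot) = self.free.pop() {
            return Some(slot);
        }
        if self.next_unused < self.capacity {
            let slot = self.next_unused;
            self.next_unused += 1;
            Some(slot)
        } else {
            None // descriptor table is full
        }
    }

    /// The CPU is done with this texture, but the GPU may still read the
    /// descriptor this frame, so only park the slot here.
    fn release(&mut self, slot: u32, current_frame: u64) {
        self.pending.push_back((current_frame, slot));
    }

    /// Call once per frame, after the fence shows the GPU has finished
    /// `gpu_completed_frame`; anything released at or before that frame
    /// can now be recycled.
    fn collect(&mut self, gpu_completed_frame: u64) {
        while let Some(&(frame, slot)) = self.pending.front() {
            if frame > gpu_completed_frame {
                break;
            }
            self.pending.pop_front();
            self.free.push(slot);
        }
    }
}
```

Keying the deferral to a timeline semaphore value instead of a frame counter works the same way; either way the delete happens after the GPU is provably done, not when the CPU decides it is.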
The usual curse of inefficient Vulkan is maxing out the main CPU thread long before the GPU is fully utilized. This is fixable. There can be multiple draw threads. Assets can be loaded into the GPU while drawing is in progress, using transfer queues and DMA. Except that for integrated memory GPUs, you don't have to copy from CPU to GPU at all. If you do all this, GPU utilization should reach 100% before CPU utilization does.
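Here's a rough std-only Rust sketch of that threading structure. There's no real Vulkan in it; `DecodedImage` and `submit_to_transfer_queue` are placeholders invented for this example, standing in for a staging-buffer copy submitted on a transfer-capable queue.

```rust
use std::sync::mpsc;
use std::thread;

/// Placeholder for pixel data decoded on a worker thread.
struct DecodedImage {
    width: u32,
    height: u32,
    pixels: Vec<u8>,
}

/// Stand-in for recording a staging-buffer copy and submitting it on a
/// transfer-capable queue (DMA). On integrated-memory GPUs this step can
/// write straight into GPU-visible memory with no copy at all.
fn submit_to_transfer_queue(img: &DecodedImage) {
    println!("upload {}x{} ({} bytes)", img.width, img.height, img.pixels.len());
}

fn main() {
    let (tx, rx) = mpsc::channel::<DecodedImage>();

    // Loader thread: file I/O and image decoding never block the draw thread.
    let loader = thread::spawn(move || {
        for _ in 0..4 {
            let img = DecodedImage { width: 256, height: 256, pixels: vec![0; 256 * 256 * 4] };
            if tx.send(img).is_err() {
                break; // render side shut down
            }
        }
    });

    // Frame loop: drain whatever has finished decoding, without blocking,
    // then record and submit draw commands as usual.
    for _frame in 0..60 {
        while let Ok(img) = rx.try_recv() {
            submit_to_transfer_queue(&img);
        }
        // ... record draw commands, submit on the graphics queue ...
    }

    loader.join().unwrap();
}
```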
Except most of this stuff doesn't work on mobile or WebGPU yet. Portable code has way too many cases. Look at WGPU.
What you get by using Unity or Unreal Engine is that a big team already worked through all this overhead. Most of the open source renderers aren't there yet. Or weren't as of a year ago.
> What you get by using Unity or Unreal Engine is that a big team already worked through all this overhead.
Obviously, given the low-level subject matter, this post wasn't aimed at turnkey engine users. In fact, people interested in this stuff are probably actively trying to be independent of those engines.
I will add that we probably need more people able to implement rendering engines. Having to choose between two corporate giants and suboptimal engines isn't ideal, for developers or for players.
In the Rust graphics community, we have My First Renderer about four times, and no good renderers. It's about a year of work to get to where you can load the standard glTF examples and render them. Those are all small static scenes. Then you hit the scaling, performance, and dynamic update issues, where it gets hard.
The effort level for a good renderer in Rust is not yet known, but it's above two developer years. I'd guess three to five. Currently, there are three failed efforts and one in progress with some EU funding.
For multi-platform (desktop, laptop, browser, phone) add another year or two. All those use Vulkan, but with slightly different subsets and constraints.
I'm a user of renderers, not a developer of one.
But it's Yet Another Reason, if you are looking to develop a game, to shitcan your dreams of writing it from scratch Jonathan Blow-style and just use one of the established engines.
GPU-driven rendering in video games still requires some data transfer from the CPU, at least for the transforms of game objects. In some cases that can become a bottleneck. That's why it's worth considering implementing the game logic on the GPU as well and limiting CPU-to-GPU transfer to player input only.
I once experimented with such an approach. It generally works, but it's hard to do: you have to write general game logic in a shading language, and debugging is practically non-existent.
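For the transform traffic mentioned above, one common way to keep it from becoming a bottleneck is to re-send only the objects that actually changed each frame. A small illustrative Rust sketch, where `write_to_mapped_buffer` is a made-up stand-in for a copy into a persistently mapped (or staging) buffer:

```rust
/// Illustrative only: the per-frame CPU-to-GPU traffic that GPU-driven
/// rendering still needs, namely updated object transforms.
type Mat4 = [[f32; 4]; 4];

struct Scene {
    transforms: Vec<Mat4>, // one model matrix per object, indexed by object id
    dirty: Vec<u32>,       // ids of objects that moved this frame
}

/// Stand-in for writing into GPU-visible memory at a byte offset.
fn write_to_mapped_buffer(offset_bytes: usize, data: &[f32]) {
    let _ = (offset_bytes, data);
}

fn upload_dirty_transforms(scene: &mut Scene) {
    const MAT4_BYTES: usize = std::mem::size_of::<Mat4>();
    // Only objects that actually moved are re-sent; static geometry costs no
    // bandwidth, which is what keeps this transfer small.
    for &id in &scene.dirty {
        let flat: Vec<f32> = scene.transforms[id as usize]
            .iter()
            .flatten()
            .copied()
            .collect();
        write_to_mapped_buffer(id as usize * MAT4_BYTES, &flat);
    }
    scene.dirty.clear();
}
```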
In terms of textures, you could already do pseudo-bindless using texture arrays. There were quite a few limitations, though; I think the two big ones are that the array was considered to be conceptually one resource, and the textures all had to be the same format, size, etc. I recall some engines did this kind of bindless for particle effects.
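For reference, a texture array like that might be set up roughly as follows with wgpu in Rust (field names follow recent wgpu releases and may differ in older ones). One size and one format apply to every layer, which is exactly the limitation described above.

```rust
/// Create a 64-layer "texture array": one resource, indexed by layer in the
/// shader, with every layer forced to the same size and format.
fn create_sprite_array(device: &wgpu::Device) -> (wgpu::Texture, wgpu::TextureView) {
    let texture = device.create_texture(&wgpu::TextureDescriptor {
        label: Some("pseudo-bindless sprite array"),
        size: wgpu::Extent3d {
            width: 256,
            height: 256,
            depth_or_array_layers: 64, // one layer per sprite/particle texture
        },
        mip_level_count: 1,
        sample_count: 1,
        dimension: wgpu::TextureDimension::D2,
        format: wgpu::TextureFormat::Rgba8UnormSrgb, // every layer: same format
        usage: wgpu::TextureUsages::TEXTURE_BINDING | wgpu::TextureUsages::COPY_DST,
        view_formats: &[],
    });
    // Viewed as a D2Array so the shader can index a layer per instance/particle.
    let view = texture.create_view(&wgpu::TextureViewDescriptor {
        dimension: Some(wgpu::TextureViewDimension::D2Array),
        ..Default::default()
    });
    (texture, view)
}
```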
That being said, the non-indexable feature that remains is the pipeline itself. Some engines have tens of thousands of shaders.
Fingers crossed that WebGPU bindless keeps moving forward! https://hackmd.io/PCwnjLyVSqmLfTRSqH0viA?view