For those out of the loop: Unreal Engine 5 is in early access, and two of its most exciting new features are Nanite and Lumen.
Nanite is a system for efficiently rendering masses of geometry (far more than is practical with traditional methods).
Lumen is a technique that enables global illumination (rendering with multiple light bounces; traditional engines do one bounce plus trickery) by first converting geometry to signed distance fields. Signed distance field techniques are very interesting: by swapping triangles for analytic surface equations, they can accelerate complex physics and lighting calculations. They've only recently started making their way into big commercial engines, starting with Media Molecule's Dreams and now UE5.
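To make the "analytic surface equations" bit concrete, here's a minimal Python sketch (the scene and names are made up for illustration): a sphere's SDF is one line of math, and combining shapes is just a min over distances, which is part of why SDFs compose so cheaply.

```python
import math

def sd_sphere(p, center, radius):
    """Signed distance from point p to a sphere: negative inside,
    zero on the surface, positive outside."""
    return math.dist(p, center) - radius

def sd_union(d1, d2):
    """The union of two SDF shapes is just the min of their distances."""
    return min(d1, d2)

# Distance from the origin to a unit sphere centered at (0, 0, 3) is 2.
d = sd_sphere((0.0, 0.0, 0.0), (0.0, 0.0, 3.0), 1.0)
```

The same distance value doubles as a conservative "how far can I safely step" bound for ray marching, which is what makes these fields useful for lighting queries.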
The PS4 game "Dreams" (released in 2020) is pretty much entirely signed distance fields and software-based rendering. It's a great example of what you can do with them.
Here is a link to an admittedly very long and very large PDF detailing the methods they used.
What do you mean by "software-based rendering"? I was under the impression that all of the rendering is done on the GPU: first the point clouds are calculated in a compute shader, and after that the points are converted to 'splats' and rendered as quads.
It is strange that Dreams didn't make a bigger splash. It's the first engine I've seen focused on making low-polygon assets look as aesthetically pleasing as possible. The ability to turn up the rendering roughness in early sketches, and then tune it down once more detail is added, is incredible. I expected a small revolution, with many competing engines reusing the tech, but nothing has happened so far?
Software-based rendering doesn't use the fixed-function hardware stages of the GPU. It won't touch the ROPs or other fixed-function rasterisation hardware; instead it does everything in software using GPGPU techniques (compute shaders).
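To illustrate the distinction, here's a toy Python stand-in for what a compute-shader-based rasterizer does: the depth test and pixel writes happen in plain code rather than in fixed-function hardware. Everything here (the `splat` helper, the sample points) is invented for illustration.

```python
# Tiny "software rasterizer" sketch: a framebuffer plus a manual depth test,
# standing in for what a compute shader would do instead of the ROPs.
W, H = 4, 4
color = [[0] * W for _ in range(H)]
depth = [[float("inf")] * W for _ in range(H)]

def splat(x, y, z, c):
    """Write one pixel with an explicit depth test --
    the blend/depth work a ROP would normally handle."""
    if 0 <= x < W and 0 <= y < H and z < depth[y][x]:
        depth[y][x] = z
        color[y][x] = c

# Two points land on the same pixel; the nearer one (z=2.0) should win.
points = [(1, 1, 5.0, 7), (1, 1, 2.0, 9)]
for x, y, z, c in points:
    splat(x, y, z, c)
```

On a real GPU this loop would be a compute dispatch with atomic min/packed writes, but the logic is the same: nothing in the fixed-function pipeline ever sees these pixels.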
I watched the talk you linked to, but I couldn't understand most of it. I'm currently reading through the Real-Time Rendering book to make sense of it.
I thought once they create the splats as quads, the rest of the pipeline is standard? Each quad should be two triangles that get clip-space coordinates, then the pixel shader paints the brush texture, and then the ROPs blend them into the framebuffer. Could you point out where I'm wrong? And in what way can a pixel end up on screen without going through the ROPs?
I'm not actually sure anymore after looking at more content. Based on the video here https://www.youtube.com/watch?v=1Gce4l5orts&t=1225s , it appears that they aren't even using the splat engine anymore (which I thought they were), and are instead using the older 'brick'-based engine.
Correct. An SDF is basically an acceleration structure for an "omnidirectional" raycast. A "sphere-cast", if you will. If you are extra careful, you can also approximate a cone trace. Unreal Engine 4 has long used SDFs as a way of doing shadows for objects that don't animate [0].
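For the curious, the "sphere-cast" works roughly like this (a hedged Python sketch with a made-up toy scene): at each step, the SDF value is the radius of a sphere guaranteed to contain no geometry, so you can safely advance the ray by that much.

```python
import math

def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-4, max_dist=100.0):
    """March along the ray; the SDF value at each point is a safe step size."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        d = sdf(p)
        if d < eps:
            return t  # hit: we're within eps of a surface
        t += d        # safe to advance by the distance to the nearest surface
        if t > max_dist:
            break
    return None       # miss

# Toy scene: a unit sphere at (0, 0, 3); a ray straight down +z hits at t = 2.
unit_sphere = lambda p: math.dist(p, (0.0, 0.0, 3.0)) - 1.0
t_hit = sphere_trace((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), unit_sphere)
```

The cone-trace approximation mentioned above comes from comparing the SDF value against a cone radius that grows with t, instead of against a fixed eps.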
But it's now a pretty standard staple of VFX and other industries as a way to do a cheap approximate raycast on the GPU.
Lumen simply uses the same data structure to accelerate bounce lighting cone tracing.
There's a lot more practical information which hints at how it works in the Nanite documentation.[0]
The biggest repeated claim there is that Nanite makes rendering cost scale primarily with screen resolution, with little impact from scene complexity.
I haven't had the opportunity to test it myself, but I've spoken to colleagues who got the demo running, ran some tests, and analysed it in RenderDoc. As incredible as it sounds, at a glance it seems Nanite largely delivers on what it promises. I've been very skeptical, but this is really exciting.
Super cool stuff. There are few things in computer science as mentally stimulating and satisfying as computer graphics. And this new wave of UE5 stuff just goes to show it.
My gamedev path runs (stubbornly?) low-level, through C and Haskell. So I doubt I'll ever use UEx. But I can't wait to know enough to learn from this and build libraries for myself.
I'm by far more proficient in Haskell than any other language (half a decade professionally at this point), and in general I find that it really helps me manage complexity, which games bring in spades. Abstraction feels like the key, and so far Haskell has really allowed for nice abstractions.
I've mostly done 2D stuff, at most using some fixed-function-pipeline sort of things (blending). I'm still early in my computer graphics journey, but the Haskell Vulkan bindings are good (there's even a Quake 3 engine written with Haskell and Vulkan), so I figure learning Vulkan will be a good starting point. I'm a CompE by education, so I'd really prefer to start low-level, I think.
Really impressive. I sometimes wonder whether, 30 years from now, it will still be possible to write a 3D engine from scratch and reach the graphical level of that time.
It's possible. General-purpose game engines can still be many times slower than a custom one. The main reason general-purpose engines are used is that they allow faster development and have a standardized workflow, which is very important for game studios with large numbers of people working together.
Impressive as it is, and not to take away from your main point, I thought I'd mention that that particular demo uses a screen-space method, which doesn't have to deal with the main challenge of full global illumination.
For a single person, it's unclear if that's really even possible now. The only way I see that changing is if hardware advances to the point where the trivial rendering approach can actually run in real time. But even then, so much of the impressive visual output of 3D engines is actually the assets / procedural effects you feed into them, so large teams might have the upper hand indefinitely.
I don't see why not; people were thinking like that 20 years ago. It isn't as if your 3D engine will have to implement every technique from the last 50-60 years (no need for Wolf3D-style raycasting, for example, and a lot of the shading and lighting code will become simpler once everything is capable of raytracing). Also, AFAIK UE5 contains something like five render paths to cover all the different hardware and game types it needs to cater to, but a specialized engine won't need to be that generic.
I don't think it's been possible for years. A person can write a cutting edge tech demo but it's not really usable as a general 3D engine. A small team might be able to write a limited 3D engine that would be state of the art in some circumstances.
It probably will be, just extremely computationally expensive. But 30 years is also further out than when most experts predict the singularity will arrive; after which all bets are off.
There is a lot of complexity here not covered by the overview: the way the clustering works (adaptive splitting like that means we have not just a tree but a full DAG traversed by the GPU), the way the software rasterizer works (lots of careful balancing to use the GPU well), the way materials are calculated on this geometry (GPU derivatives aren't accurate enough, so analytic derivatives are used instead), and the interaction with the rest of the frame (decals use stencil, so a lot of juggling is done to rearrange things). Heck, even the two rasterization passes to handle occlusion are pretty unobvious.
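As a rough illustration of just one of those details, the LOD side boils down to walking from coarse to fine and stopping at the first cluster level whose simplification error projects to under a pixel on screen. This Python sketch uses a toy error model and made-up numbers; it is not Nanite's actual math.

```python
# Hypothetical sketch of a screen-space-error LOD test. All constants are toy.
def projected_error_px(world_error, distance, screen_height_px, fov_tan):
    """Approximate on-screen size (in pixels) of a world-space error."""
    return world_error * screen_height_px / (2.0 * distance * fov_tan)

def pick_lod(lods, distance, screen_height_px=1080, fov_tan=0.5):
    """lods: list of (world_error, triangle_count), coarsest (largest error) first.
    Return the triangle budget of the first level that's sub-pixel accurate."""
    for world_error, tris in lods:
        if projected_error_px(world_error, distance, screen_height_px, fov_tan) < 1.0:
            return tris
    return lods[-1][1]  # nothing is sub-pixel: fall back to the finest level

lods = [(0.10, 100), (0.01, 1_000), (0.001, 10_000)]
near = pick_lod(lods, distance=2.0)    # close up: a finer level is needed
far = pick_lod(lods, distance=500.0)   # far away: the coarsest level suffices
```

The hard part Nanite solves is doing this per cluster across a DAG, on the GPU, without cracks between clusters at different levels; the sketch only shows the selection criterion.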
The 10,000 ft overview is perhaps easy to understand, but the details are anything but basic.
Marketing speak makes a lot of tech sound more complex, advanced, or magical than it usually is. This sadly seems to be becoming more common lately in the 3D graphics industry.
(In a similar marketing trend, Nvidia's "Deep Learning Super Sampling" is more like 2x2 grid deinterlacing, not magical AI upscaling.)
DLSS is TAAU: it's still temporal, based on a 2x2 grid; 2.0 is simply an ML implementation. It doesn't upscale single frames, and it doesn't add additional detail to textures. It's "temporal upscaling", which is pretty much just a fancy way of saying deinterlacing.
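The temporal part can be illustrated with a tiny Python sketch (the signal, noise level, and blend factor are all arbitrary): per-frame samples of the same signal are blended into a history buffer, which converges on detail no single noisy frame contains. Real TAAU/DLSS adds sub-pixel jitter, motion-vector reprojection, and much smarter blending (a neural network, in DLSS 2's case).

```python
import random

def accumulate(history, sample, alpha=0.1):
    """Exponential blend of the current frame's sample into the history buffer."""
    return [(1 - alpha) * h + alpha * s for h, s in zip(history, sample)]

random.seed(0)
truth = [0.0, 1.0, 0.0, 1.0]   # the signal we'd like to reconstruct
history = [0.5] * 4            # history buffer starts uninformative

# Each frame contributes a noisy sample; the running blend converges on truth.
for _ in range(500):
    noisy = [t + random.uniform(-0.2, 0.2) for t in truth]
    history = accumulate(history, noisy)
```

This is also why the technique struggles with disocclusion and fast motion: when the history can't be reprojected, the accumulated detail is thrown away and you're back to a single noisy frame.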
If you discount the entirety of the implementation responsible for the quality improvement as "simply ML", then it is indeed just fancy deinterlacing. Just like every ML breakthrough is just fancy matrix multiplication.
But the fact remains that DLSS 2 is friggin' amazing: it achieves a combination of visual quality and performance that was not possible before it, and in some cases it looks even better than rendering at the native resolution. It's solving a hard problem, and it does so very, very well.
My point is rather that most of the marketing speak makes it sound like an _impossible_ upscaling tech, rather than an impressive temporal super sampling tech.
I do wonder about the "better than rendering at native" claim, though. In true TV-set marketing fashion, DLSS 2.0 comes topped up with a free sharpening filter, on by default, to make it look slightly sharper than native renders.
Alternatively, for each cluster: render the cluster's edge triangles at the neighbouring cluster's resolution with Z-write disabled, then render the cluster (overdrawing the real edge triangles), then render the Z-write of the edge. That'll fill the cracks reasonably well with little overhead.
This is one of those things where it's surprising it wasn't available before.
As far as I can tell this architecture was possible at least starting from the introduction of DX11 hardware more than 10 years ago, and it seems to be the most straightforward way of rendering a realistic 3D scene.
A lot of what makes Nanite possible wasn't available 10 years ago, namely the way compute is integrated into the 3D pipeline. There were tricks back then, but now it's supported by design. As you can see in the breakdown, a whole lot of the pipeline is done in compute.
Parts of this pipeline have been used since 2015; you can see most of the techniques explained here: https://www.advances.realtimerendering.com/s2015/aaltonenhaa... What Nanite does is continue that line of development by combining it with visibility buffers (which decouple rasterization from materials) and adding an incredibly fancy LOD system and a rasterizer for small triangles.
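For anyone unfamiliar with visibility buffers, here's a toy Python sketch of the decoupling (the mesh names, material table, and precomputed coverage are all invented): the raster pass stores only IDs per pixel, and a separate material pass shades from those IDs.

```python
# Visibility-buffer sketch: rasterization writes IDs, shading happens later.
W, H = 2, 2
vis = [[None] * W for _ in range(H)]  # per-pixel (instance, triangle) IDs only

# "Raster" pass: record which triangle covers each pixel.
# (Coverage is precomputed here; a real rasterizer would derive it.)
coverage = {(0, 0): ("mesh_a", 0), (1, 0): ("mesh_a", 1), (0, 1): ("mesh_b", 7)}
for (x, y), tri_id in coverage.items():
    vis[y][x] = tri_id

# Material pass: shade purely from the stored IDs, fully decoupled from raster.
materials = {"mesh_a": "stone", "mesh_b": "wood"}
def shade(tri_id):
    return materials[tri_id[0]] if tri_id else "sky"

frame = [[shade(vis[y][x]) for x in range(W)] for y in range(H)]
```

The payoff is that the expensive rasterization never touches material shaders, so tiny triangles and heavy materials stop multiplying each other's cost.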
The architecture may have been possible in the sense that, feature-wise, you had the pieces (or most of them). But resources like memory, bandwidth, and GPU speed would have fallen way below the thresholds and tradeoffs needed to make this useful rather than just "possible".
Lumen deep dive: https://docs.unrealengine.com/5.0/en-US/RenderingFeatures/Lu...
Signed Distance Functions, Inigo Quilez: https://iquilezles.org/www/articles/distfunctions/distfuncti...