This is entirely the wrong way around: raytracing scales less than linearly with scene complexity, while rasterization is mostly linear in the number of visible primitives. It is not a simple complexity function in either case, though.
I meant that in the sense that O(log n) < O(n). But when comparing raytracing to rasterization you have the problem that the complexity depends mostly on different variables for each algorithm.
On the other hand, if we assume that every pixel is covered by a constant number of triangles and that a typical scene doesn't push more triangles than there are pixels, rasterization suddenly looks a lot more like O(pixels), i.e. O(1) per pixel, while ray tracing is still O(log triangles) per pixel, accounting only for primary rays.
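To make those asymptotics concrete, here is a toy per-frame cost model. It is my own simplification under the assumptions above (uniform coverage, a balanced BVH, a handful of triangle tests per leaf); the function names and constants are made up for illustration, not taken from any real renderer.

```python
import math

def raster_cost(pixels, triangles, overdraw=2):
    # Rasterization touches each visible triangle once and shades each
    # covered pixel a constant number of times (the overdraw factor).
    return triangles + overdraw * pixels

def raytrace_cost(pixels, triangles, leaf_tris=4):
    # One primary ray per pixel descends a balanced BVH:
    # ~log2(triangles) node visits plus a few triangle tests in the leaf.
    return pixels * (math.log2(triangles) + leaf_tris)

pixels = 1920 * 1080
for tris in (10_000, 1_000_000, 100_000_000):
    print(tris, raster_cost(pixels, tris), round(raytrace_cost(pixels, tris)))
```

Growing the triangle count by 100x multiplies the rasterization term by up to 100 but only adds a constant ~6.6 node visits per ray, which is the "less than linear" scaling in a nutshell.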
Then again, it's the constant factors that really matter. A couple of arithmetic operations plus a single spatially coherent memory fetch (the depth test) and a single write to the frame- or G-buffer, versus 10+ fetches for the kd-tree/BVH traversal followed by several full ray-vs-triangle tests (each one taking at least an order of magnitude more arithmetic ops than the sign-based test in rasterization).
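For a feel of that gap, here are the two inner-loop tests side by side: the 2D edge-function sign test a rasterizer evaluates per pixel, and a Möller-Trumbore ray-triangle intersection. This is a plain reference sketch (helper names are mine), not tuned code.

```python
def edge_test(ax, ay, bx, by, px, py):
    # Rasterizer coverage test: the sign of one 2D cross product,
    # i.e. 2 multiplies and 4 subtractions per edge.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax) >= 0.0

def ray_triangle(orig, d, v0, v1, v2, eps=1e-9):
    # Moller-Trumbore: two cross products, four dot products and a
    # division; roughly an order of magnitude more arithmetic.
    sub = lambda a, b: [a[i] - b[i] for i in range(3)]
    dot = lambda a, b: a[0]*b[0] + a[1]*b[1] + a[2]*b[2]
    cross = lambda a, b: [a[1]*b[2] - a[2]*b[1],
                          a[2]*b[0] - a[0]*b[2],
                          a[0]*b[1] - a[1]*b[0]]
    e1, e2 = sub(v1, v0), sub(v2, v0)
    pvec = cross(d, e2)
    det = dot(e1, pvec)
    if abs(det) < eps:
        return None                  # ray parallel to the triangle plane
    inv = 1.0 / det
    tvec = sub(orig, v0)
    u = dot(tvec, pvec) * inv
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, e1)
    v = dot(d, qvec) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, qvec) * inv
    return t if t > eps else None    # hit distance along the ray
```

And the rasterizer amortizes its per-triangle setup over every covered pixel, while the ray tracer pays the full test per ray.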
I do wonder: when we are not yet able to use Voxel Cone Tracing in AAA games, will temporal filtering or RNN-based denoising really be enough to make ray-traced GI worth the cost?
[This became quite a rambling post, but I'm too lazy to shorten it. Sorry.]
For a proper response to this I'd need to dig up the literature that analyzed the complexity in detail. I haven't had a reason to look at that yet. In practice, we do not care that much about the theoretical complexity of algorithms. It does not tell you anything really useful. For certain problems, a grid beats a BVH, and for others, a BVH beats a grid. Sometimes, switching the heuristic used for BVH construction makes or breaks performance. Sometimes, rasterization performs worse than ray tracing.
Voxel cone tracing is at its core a volume rendering technique. It requires a brute-force sampling of the generated reflectance volume at each grid cell along the ray. The dynamic generation of the volumetric data is a three-dimensional rasterization step that is not cheap. And the output is only really good for surfaces with a certain amount of glossiness. I was a bit surprised that Epic axed it from UE4 before release (implementing it takes a lot of work!), but I think in the end the combined results from reflection mapping and screen space reflections were of similar quality. It's a shame, though, that Cyril Crassin's research work went essentially unused.
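The "brute-force sampling along the ray" part is essentially this marching loop. A hedged sketch of one cone trace through a mip-mapped reflectance volume; the `sample_volume` callback, the front-to-back blending and all parameter names are my own simplification, not code from any engine.

```python
import math

def cone_trace(origin, direction, sample_volume, voxel_size=0.1,
               aperture=0.3, max_dist=10.0):
    # March one cone through the volume, accumulating front to back.
    color, occlusion = 0.0, 0.0
    dist = voxel_size  # small offset to avoid sampling the start surface
    while dist < max_dist and occlusion < 0.99:
        radius = aperture * dist                        # cone footprint grows
        lod = max(0.0, math.log2(radius / voxel_size))  # coarser mip further out
        pos = [origin[i] + direction[i] * dist for i in range(3)]
        c, a = sample_volume(pos, lod)                  # stands in for a 3D texture fetch
        color += (1.0 - occlusion) * a * c              # front-to-back blending
        occlusion += (1.0 - occlusion) * a
        dist += max(radius, voxel_size)                 # step ~ footprint size
    return color, occlusion

# Toy volume: constant grey "fog" whose opacity falls off at coarser mips.
fog = lambda pos, lod: (0.5, 0.4 / (1.0 + lod))
print(cone_trace([0.0, 0.0, 0.0], [0.0, 0.0, 1.0], fog))
```

Note how the footprint-sized steps make wide cones cheap but blurry, which is why the output mostly suits glossy rather than mirror-like surfaces.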
This is fundamentally different from path tracing with BVH traversal. A lot of manpower and money has been sunk into the latter problem over the last couple of decades. Ray intersection kernels like OptiX use every trick in the book to run fast on the hardware they are designed for - and they are really, really fast when you consider what they have to work with. Unfortunately, the hardware manufacturers are hell-bent on keeping a lot of their tricks secret.
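The loop those kernels spend all that effort optimizing is, stripped of every trick, just a stack-based tree walk. A naive sketch under my own assumptions (dict-based nodes, no ordered traversal, a `hit_leaf` callback for the triangle tests); real kernels use packed node layouts, front-to-back child ordering and hardware intrinsics instead.

```python
def hits_aabb(ray, aabb):
    # Standard slab test of a ray (origin, direction) against a box (lo, hi).
    origin, direction = ray
    lo, hi = aabb
    tmin, tmax = 0.0, float("inf")
    for o, d, l, h in zip(origin, direction, lo, hi):
        if abs(d) < 1e-12:
            if o < l or o > h:
                return False
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        tmin = max(tmin, min(t0, t1))
        tmax = min(tmax, max(t0, t1))
    return tmin <= tmax

def intersect_bvh(ray, nodes, hit_leaf):
    # nodes: index -> ("inner", aabb, left, right) or ("leaf", aabb, prims)
    stack, best = [0], None
    while stack:
        kind, aabb, *rest = nodes[stack.pop()]
        if not hits_aabb(ray, aabb):
            continue                      # the many dependent fetches live here
        if kind == "leaf":
            t = hit_leaf(ray, rest[0])    # full ray-vs-triangle tests
            if t is not None and (best is None or t < best):
                best = t
        else:
            stack.extend(rest)            # push both children, unordered
    return best
```

Every node visit is a dependent memory fetch, which is exactly why the secret sauce is mostly about memory layout and traversal order rather than arithmetic.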
Wavelet-filter-based denoising really brings the required input down to about 1 or 2 paths per pixel. I have had that demonstrated to me in real time on quite complex scenes (one was San Miguel) - on currently available commodity hardware, too. Otherwise I wouldn't have believed it. These filters are what make realtime path tracing work.
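The core of these filters is an iterated edge-aware a-trous wavelet pass. A minimal single-channel sketch, assuming the standard B3-spline taps and a simplified luminance-difference edge-stopping weight (real denoisers in the SVGF family also weight by normals, depth and variance):

```python
import math

def atrous_pass(img, step, sigma=0.5):
    # img: 2D list of floats; step doubles each iteration (1, 2, 4, ...),
    # so repeated passes cover a large footprint with few taps.
    kernel = [1/16, 1/4, 3/8, 1/4, 1/16]   # B3 spline taps
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc, wsum = 0.0, 0.0
            for j, kj in enumerate(kernel):
                for i, ki in enumerate(kernel):
                    yy = min(max(y + (j - 2) * step, 0), h - 1)  # clamp to border
                    xx = min(max(x + (i - 2) * step, 0), w - 1)
                    d = img[yy][xx] - img[y][x]
                    wgt = kj * ki * math.exp(-(d * d) / (sigma * sigma))  # edge stop
                    acc += wgt * img[yy][xx]
                    wsum += wgt
            out[y][x] = acc / wsum
    return out
```

Run a few passes with step = 1, 2, 4 and the per-pixel noise averages out while the edge-stopping term keeps hard silhouettes from smearing, which is how 1-2 spp input becomes presentable.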