For the weekend, I’ll be writing just sidenote posts! I have two in mind, given the big week in gaming tech we just had, and also, as I start this post, I’m about 20 hours from getting married, so I’ll save the posts with deep dives into WoW and FFXIV content for after that!
So, last generation, Nvidia made a bold bet on a new set of technologies for their graphics cards. Rather than pushing pure rasterization performance to deliver a bigger leap in current games, Nvidia bet on new technology. Introduced with the Turing architecture in the RTX 2080 Ti, 2080, 2070, 2060, and the various Super flavors, these cards made a risky tradeoff – a reduced generational leap in traditional raster performance in exchange for two new types of processing cores – Tensor cores (previously used in the Volta cards, including the gaming-adjacent Titan V) and RT cores (hardware designed specifically to accelerate bounding-volume hierarchy calculations, used to determine the bounces of lighting rays). The end result was that the top-end RTX 2080 Ti represented a smaller generational leap of at best 30% in rasterization performance over the GTX 1080 Ti, while also carrying a higher price.
This was a bet that Nvidia could make for a few reasons. First, Nvidia gambled that AMD wouldn’t have a Radeon card to compete, which did in fact come true. The early 2019 launch of the Radeon VII failed to beat the GTX 1080 Ti in gaming despite having 16GB of RAM and 1 TB/s of memory bandwidth, and the mid-2019 launch of the Radeon RX 5700 XT, while good, was mostly competitive on performance and price with the RTX 2070. The second bet, however, was that rolling out these new technologies during this window would let Nvidia grab headlines with legitimately cool technology, then scale and optimize it later, after seeing how developers worked with it.
Ampere, in the RTX 3000 cards announced this week, continues the RTX technologies. It optimizes several elements of the GPU architecture to increase performance drastically, offering large gains in shader and raster performance while substantially boosting raytraced performance, and it reduces prices in the mainstream segments of the lineup, making Ampere cards look vastly better than their Turing equivalents did. It also leaves options on the table for more performance segmentation later (Ti versions, a Titan with more SMs than the 3090, doubled-RAM versions of the 3080 and 3070, etc.), and it gives Nvidia a second generation of ray-tracing cards before AMD launches their first GPU with the technology.
But the question is, ultimately – was RTX as a technology kit a bust? Did it underperform?
First, we have to define what RTX is, because it is a set of multiple different technologies that synergize with one another.
The namesake tech, the RT in RTX, is real-time raytracing. Raytracing is a different approach to 3D rendering that offers more lifelike and realistic output at the cost of being incredibly computationally expensive. Real-time raytracing in modern games and GPUs is not pure raytracing, however – it is hybrid rendering. The game renders a frame using traditional rasterization, and the RT cores in the RTX GPUs handle the BVH calculations – determining which objects each light ray hits as it casts around the scene. While raytracing covers a broad set of visual elements, RTX allows developers to pick and choose, implementing only selected elements for the sake of performance or artistic styling. You can choose global illumination (using raytraced scene lighting to illuminate the whole scene), raytraced reflections (adding realistic reflections instead of shortcuts like screen-space reflections, or a complete lack of reflections, on surfaces like mirrors, glass, and water), raytraced soft shadows (using the light-path calculations to place and draw shadows realistically), or a combination of these effects, increasing visual fidelity and realism through the mix of these techniques. The RT cores, despite their naming, cannot trace enough rays for full resolution and fidelity in real time, so they compute a sparse result and the Tensor cores then run an AI denoising pass to remove visual artifacts and complete the scene.
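At the heart of those BVH calculations is a very simple primitive: testing whether a ray intersects an axis-aligned bounding box. As a rough illustration (a toy pure-Python sketch of the standard "slab" method, not how the fixed-function RT cores are actually implemented), the math an RT core runs millions of times per frame looks something like this:

```python
def ray_hits_aabb(origin, direction, box_min, box_max):
    """Return True if a ray (origin + t*direction, t >= 0) enters the box.

    This is the "slab" test: intersect the ray against each pair of
    axis-aligned planes and check the intervals overlap.
    """
    t_near, t_far = 0.0, float("inf")
    for axis in range(3):
        if direction[axis] == 0.0:
            # Ray is parallel to this slab: it can only hit if the origin
            # already lies between the two planes.
            if not (box_min[axis] <= origin[axis] <= box_max[axis]):
                return False
        else:
            t1 = (box_min[axis] - origin[axis]) / direction[axis]
            t2 = (box_max[axis] - origin[axis]) / direction[axis]
            t_near = max(t_near, min(t1, t2))
            t_far = min(t_far, max(t1, t2))
            if t_near > t_far:
                return False
    return True

# A ray fired down the z-axis toward a unit box hits it...
print(ray_hits_aabb((0, 0, -5), (0, 0, 1), (-1, -1, -1), (1, 1, 1)))
# ...while one offset to the side misses.
print(ray_hits_aabb((5, 0, -5), (0, 0, 1), (-1, -1, -1), (1, 1, 1)))
```

Walking a BVH means running this cheap box test down a tree of nested boxes to quickly discard geometry a ray can't possibly hit, which is exactly the workload Turing moved into dedicated hardware.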
In addition, however, RTX offered a couple of other benefits. One of these is offloading other software tasks to the hardware added alongside RTX. RTX Voice and further AI-driven features coming with Ampere use the Tensor cores to accelerate their workloads at no cost to the core GPU's rasterization performance, while the improved generation of NVENC video encoding handles streaming and recording on its own dedicated silicon. The Tensor cores, in particular, can accelerate multiple types of AI and machine learning workloads, which makes them useful outside of raytracing.
The biggest feature, arguably, however, is DLSS (Deep Learning Super Sampling). The idea is simple: the game renders internally at a lower resolution, and an AI scaling model, trained on Nvidia’s supercomputers and run on the Tensor cores, upscales the result to the target resolution, improving performance. The v1.0 of DLSS, frankly, sucked – it required that each game be specifically trained for DLSS, with Nvidia’s supercomputer rendering “perfect frames” from the game at very high quality and training a per-title neural network to reconstruct them from low-resolution input, smoothing out the aliasing artifacts the scaling introduced. This meant that you needed a game that supported it, with a per-game DLSS profile that was only distributed with the Nvidia drivers. Unless a developer worked with Nvidia to complete the neural-net training for DLSS prior to launch and to get that training included in a pre-launch Game Ready Driver, it wouldn’t be a setting you could simply turn on, and the end results varied wildly in quality, with the foundational title for DLSS 1.0, Final Fantasy XV, being a smudgy, blurry mess.
In early 2020, Nvidia rolled out DLSS 2.0, which was intended to fix a lot of these issues. A game still has to support it, but per-game training is no longer required: Nvidia trains a single generalized network against pairs of images – a “ground truth” image rendered at 16K resolution and a low-resolution version of the same scene – along with per-pixel motion vectors that show where objects are moving, which lets the network reuse detail from previous frames without the smearing and blurry artifacts caused by the scaling. The DLSS 2.0 component of the drivers is now a set of neural-network parameters designed to properly scale images while maintaining fidelity and accounting for motion; on the game side, the developer’s main job is supplying those motion vectors.
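To see why motion vectors matter so much for this kind of temporal upscaling, consider that reusing detail from the previous frame requires knowing where each pixel *came from* – otherwise moving objects smear. Here is a deliberately tiny sketch of that reprojection step (names and shapes are illustrative, not Nvidia's actual implementation):

```python
def reproject(prev_frame, motion_vectors):
    """Fetch each pixel's history sample from where it was last frame.

    prev_frame: 2D grid of brightness values from the previous frame.
    motion_vectors: 2D grid of (dx, dy) pairs; pixel (x, y) moved by
    (dx, dy) between the previous frame and this one.
    """
    h, w = len(prev_frame), len(prev_frame[0])
    history = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = motion_vectors[y][x]   # pixel moved by (dx, dy)...
            px, py = x - dx, y - dy         # ...so its history sample is here
            if 0 <= px < w and 0 <= py < h:
                history[y][x] = prev_frame[py][px]
    return history

# A bright pixel at (0, 0), with everything moving one pixel right:
prev = [[0.0] * 4 for _ in range(4)]
prev[0][0] = 1.0
motion = [[(1, 0)] * 4 for _ in range(4)]
history = reproject(prev, motion)  # brightness lands at (1, 0) this frame
```

A real temporal upscaler blends these reprojected history samples with the new low-resolution frame through the trained network; without the motion vectors, the history would be fetched from the wrong place and the output would smear, which is roughly the failure mode DLSS 1.0 suffered from.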
These technologies feed each other in a logical progression – raytracing is computationally expensive, and its cost grows as resolution scales up, so the idea is that DLSS lets a game render at a lower resolution, perform the raytracing calculations on that smaller image with far fewer pixels to account for, and then scale the result up using the trained neural network and motion-vector data to maintain a good-looking image. As DLSS has rolled out and improved, the results have gotten a lot better, with DLSS 2.0 implementations showing sharpness better than native rendering in some cases, as Nvidia is all too eager to show with its Death Stranding comparisons.
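The arithmetic behind that synergy is worth making concrete. Per-pixel ray cost scales with pixel count, so rendering internally below the output resolution slashes the ray budget; the internal resolutions below are my own illustrative picks (roughly matching commonly cited DLSS quality and performance modes at 4K), not official figures:

```python
def ray_budget_savings(internal, output):
    """Fraction of per-pixel ray work saved by tracing at a lower
    internal resolution and upscaling to the output resolution."""
    internal_px = internal[0] * internal[1]
    output_px = output[0] * output[1]
    return 1.0 - internal_px / output_px

native_4k = (3840, 2160)

# Tracing at 1440p and upscaling to 4K skips about 56% of the ray work...
quality_savings = ray_budget_savings((2560, 1440), native_4k)

# ...and tracing at 1080p skips a full 75% of it.
performance_savings = ray_budget_savings((1920, 1080), native_4k)

print(f"1440p -> 4K saves {quality_savings:.0%} of per-pixel ray cost")
print(f"1080p -> 4K saves {performance_savings:.0%} of per-pixel ray cost")
```

That is the whole pitch in two numbers: if the upscaler can reconstruct a convincing 4K image, three quarters of the raytracing bill simply disappears.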
So now we come back to the original question – was RTX a bust? Well, in some ways, yes, it was. Real-time raytracing has largely failed to catch on as of yet, with a very small number of titles launching with it, and most of the best titles with the feature are still those that launched near the first-gen RTX cards – Battlefield V, Metro Exodus, and Shadow of the Tomb Raider. Some newer additions have been popular, like Minecraft RTX – but limits on which worlds you can use in Minecraft RTX mean that not every player who likes Minecraft and has an RTX card would even enable it.
DLSS has gotten better, and titles like Control and Wolfenstein: Youngblood make very good use of it, in addition to Nvidia’s showcase with Death Stranding above. DLSS, however, complicates things for me as an enthusiast in two ways. The first is that it complicates benchmarks, as Nvidia is incredibly eager to show performance data with DLSS enabled improving framerates, rather than a native 4K-to-4K test, in their marketing materials. The second issue is that even with the image quality improvements it can bring, developers have to support it, and a part of that (probably the biggest part for Nvidia) was selling supercomputer time or DGX-1 towers to developers to run the neural-network training needed to build a per-game DLSS profile under 1.0. It is a great technology, but it is native to Nvidia, requires their hardware on both the training/deep learning side and the client PC side, and until game support expands sharply, it remains a sort of niche.
The biggest thing that I think hurt RTX technologies for the last two years, however, is that Nvidia did themselves no favors with the lineup. There was no low-end ray-tracing card – the cheapest RTX card was the 2060, which was still $300+, and the lower-end Turing cards dropped the RT and Tensor cores entirely to become the GTX 16-series, without raytracing or DLSS support. AMD not having ray-tracing was another point of failure: with only a minority of the PC gaming audience even able to use the feature, developer support lagged. Outside of Nvidia-partnered titles, there just wasn’t enough reason to add real-time raytracing in any form to a game in the last two years. Now, with a second-generation RTX GPU lineup coming out, support for ray-tracing coming in the Radeon 6000 cards this year, and the next-gen consoles supporting it via that same AMD Radeon graphics technology, you see more games this year starting to add support for ray-tracing.
Lastly, it is worth visiting the sales numbers of Turing GPUs. In Q4 2018, after the launch of the RTX 2000 lineup, Nvidia reported lower-than-expected sales numbers, bellyaching that the Turing deal offered to gamers wasn’t being taken up. The generational improvement was too low for many gamers (myself being one of them) to justify moving from a 10-series GPU to an equivalent 20-series part. 30% in the most ideal scenarios for $1,200 is, by nearly all metrics, a poor bargain.
Nvidia, knowing they didn’t have AMD competition for the generation, made two choices – one I think was good, and the other bad. The first was adding ray-tracing, a genuinely cool technology with long-term implications. Being the first to do it was hard, and I commend them for making the effort. It will be a foundational technology for enhancing artistry in games going forward, and gives us generations of performance-enhancing hardware to look forward to. However, knowing that competition would be lacking, and with sizzle-reel demos crafted to portray best-possible RTX scenarios, Nvidia jacked prices up substantially with Turing. The 80 Ti tier card, launched in the 10-series at $700, instead launched at $1,200 (technically, there were supposed to be options from $999 up, but most partner cards and the vast majority of Nvidia direct sales were in that $1,200+ bracket). With the 2080 coming in at a price higher than the 1080 Ti for the same rasterization performance (and worse at higher resolutions due to reduced VRAM), and the 2070 at $500+, Nvidia attempted a brazen cash grab – little or no performance increase, worse performance per dollar – and most gamers with Pascal-based cards saw through it. The problem compounds when you consider that the leap from the Maxwell-based GTX 900 series to the 10-series was much larger than the equivalent jump from 10 to 20 series, so many gamers who love to upgrade had already gotten a huge increase thanks to Pascal and had no real reason to buy into Turing. Couple that with poor early support for RT and disappointing visual quality from DLSS 1.0, and the lineup kind of went splat.
It wasn’t until DLSS 2.0 that adoption of the technology ticked up noticeably, and with ray-tracing cards selling not so well, ray-tracing was slow-rolled.
So in the end, yeah, RTX kind of was a bust for the first generation. At the launch of Turing, RTX 20-series cards offered minor performance gains at high prices with halo features that didn’t work for most people and in most games, after a generation where Nvidia offered substantially better performance gains and had convinced a large part of the enthusiast base to upgrade to 10-series cards, which compared favorably to their 20-series brothers. However, as with many first-generation technologies, something has to come out to clear the way for improvements and support to happen, and hopefully, with Ampere, RDNA2, and the next-generation consoles all supporting ray tracing and new methods of resolution scaling, we’ll see some interesting and new art come to life.
However, if you bought a Turing card? Yeah, that kinda sucks.