Initially, I wanted to write this as a series of posts about each manufacturer and topic, but they feed into each other in ways that are worth investigating as a whole, comprehensive picture.
Gaming tech is set to take a major leap in the back half of 2020, on the back of new-generation consoles and the escalating wars between AMD and Intel on the CPU front and AMD and Nvidia on the GPU front. The advancements coming are, frankly, long overdue, but they have also been brewing over the last handful of years thanks to AMD’s resurgence and the competitive response it has forced.
With the framing done, let’s start with CPUs!
Intel in 2020
To say Intel was caught on the back foot by the Ryzen launch and the generational improvements AMD has made over the last several years would be an understatement. Part of it was a genuine attempt at making strong advances in process technology that backfired, with their 10 nanometer silicon manufacturing process failing to come to market in any meaningful capacity even now, 4 years after it was supposed to be Intel’s mainstream process. A large part of it, though, was their utter lack of desire to push the envelope forward, leaving 4 cores and 8 threads as the desktop vanguard for nearly a decade before suddenly boosting core counts 3 times in the last 3 years (the most recent increase of which is about to hit the market).
Competition has been a mixed blessing for Intel. Their improved core counts and the faster clock speeds they’ve pushed out of the now 5-year-old Skylake core architecture have been interesting to see, and they highlight Intel’s core competencies in design and manufacturing in a way that helpfully contrasts with the failures of their 10nm process. On the other hand, Intel is starting to falter with that 5-year-old core design. AMD now holds an IPC lead that would be more widely noticed if Intel couldn’t push most of their Skylake designs to 5 GHz or higher with relative ease, while AMD’s newest Ryzen lineup caps out at 4.7 GHz short of exotic cooling and extreme overclocking. Meanwhile, Intel has been hit by the revelation of a boatload of security vulnerabilities that AMD has nearly all evaded, with new exploits being found in Intel CPUs on a regular cadence over the last 2 years, affecting designs as far back as the first-generation Core CPUs from nearly 15 years ago.
Coming up this summer, Intel has their 10th-generation Core lineup, codenamed Comet Lake. These CPUs are…well, they’re Skylake again, but with the maximum core count in the mainstream desktop lineup now topping out at 10. For the first time in a while, Intel’s lineup is fairly easy to decipher: everything has Hyper-Threading for once, and the brackets sit at fixed core counts, with the i9 at 10 cores, the i7 at 8, the i5 at 6, and the i3 at 4, all with HT doubling the effective thread count. The usual suffixes apply: K-series unlocked CPUs that allow overclocking but ship without coolers, F-series chips without integrated graphics, and the KF combination that most pure gamers would go for.
The only really new feature in this lineup, specific to the i9, is a new iteration of boost tech. Called “Intel Thermal Velocity Boost,” it allows the top i9 to reach a single-core clock speed of 5.3 GHz and an all-core boost of 4.9 GHz across all 10 cores, provided you can keep the processor below 70 degrees Celsius. This seems to be a curse more than a blessing, though, because the new CPUs only hit their rated clock speeds by making a complete joke of their TDP ratings. While the top i9 is rated as a 125-watt CPU, rumors from motherboard manufacturers have it that the chip can draw as much as 224 W and tends to run very hot under load even with a strong closed-loop liquid cooler, so hopes of sustaining those TVB boost clocks should be pushed out of mind.
On top of all of this, the chips need a new motherboard, as a new socket and pinout are used, so upgrading to this lineup isn’t easy for existing Intel users. The last bit of rumor is that Rocket Lake, a late-2020 release from Intel, will at least still use the Z490 boards now being made available, and will unlock features manufacturers have built into those boards even though Comet Lake doesn’t support them, like PCI Express Gen 4. Rocket Lake will be Intel’s first new desktop core design since Skylake, but with a tradeoff: it is a design backported from 10nm to 14nm. It should bring Intel’s first IPC gains since Skylake, but it will be competing against AMD’s Zen 3 designs, which (spoiler alert for later!) raise IPC over the already-improved Zen 2 we currently have in Ryzen 3rd-gen parts. The new architecture also faces a conundrum: while it will be built on the same 14nm process Intel has used for half a decade now, it may not clock as high as Skylake does, and if Intel does push it to around 5 GHz, the risk is that the refined designs that eventually arrive on Intel’s 10nm or 7nm processes may not reach those clocks, because Intel’s engineers will not have refined those nodes to the same degree.
All in all, here’s the summary: Intel continues to have troubles, but simple math tells us one good thing for them. The Core i9-10900K will likely be the fastest available pure gaming CPU at its launch. They’re going to spend much of the next few years cleaning up the mess of complacency they made for themselves, though, and they may not be able to hold that crown for long…
AMD CPUs in 2020
AMD has been living a come-from-behind, feel-good story with a few blunders and unforced errors along the way. For as long as I’ve been a PC hardware enthusiast, AMD has been a scrappy underdog capable of big wins that never breaks out of that underdog mindset, and Ryzen has been the perfect example of this. When the first Ryzen CPUs launched in 2017, they were legitimately great; while they didn’t dethrone Intel on gaming performance, they brought enough to the table to make them worthwhile for content creation and most gaming rigs, save for absolute bleeding-edge high-refresh-rate gaming. Zen+ in the second generation brought modest gains and kept the status quo, and it is where I personally got on the Ryzen train. The newest lineup, Ryzen 3rd gen, has brought sizable performance gains, more cores, and a feeling that the exciting innovations in CPUs are coming mostly from AMD.
In 2020, this is supposed to continue with Ryzen 4000 CPUs, part of the Zen 3 architecture. With Zen 3, AMD is rumored to be bringing yet another 10%+ gain in IPC through some major architectural improvements.
For the uninitiated, Zen 1 was built on a design hinging on the CCX. A CCX, or core complex, is a set of 4 cores, and all Zen CPUs to date use them: an 8-core design is two CCXes, and higher core counts use multiple chips, each with its own pair of CCXes. Getting to even multiples of 8 is easy (a 16-core chip is 4 CCXes, a 32-core is 8, and so on), and paring down has traditionally been done symmetrically. My Ryzen 9 3900X, for example, gets to 12 cores by disabling 1 core in each of its 4 CCXes, resulting in four 3-core CCXes. Shared resources sit in the middle of each CCX (the large L3 cache), and while two CCXes are on each die, if a core in one CCX needs to talk to a core in a different CCX, the traffic has to take a trip over AMD’s Infinity Fabric interconnect, which adds a lot of latency. The IF clock is based on the RAM clock, which is why the most common advice when buying Ryzen is to buy the fastest RAM you can: faster RAM makes for a faster Infinity Fabric, which makes for faster CCX-to-CCX communication, which improves performance more than you would expect from the RAM speed alone.
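The symmetric cut-down rule above can be sketched in a few lines of Python. This is purely illustrative (not any AMD tool); the function and its names are my own invention to show how core counts map onto 4-core CCXes:

```python
# Illustrative sketch: how Zen-style core counts map onto 4-core CCXes
# when cores are disabled symmetrically across the whole chip.

CORES_PER_CCX = 4  # a CCX (core complex) is a cluster of 4 cores sharing L3


def zen_layout(total_cores: int, ccx_count: int) -> list:
    """Return the number of enabled cores in each CCX for a symmetric part."""
    if total_cores % ccx_count != 0:
        raise ValueError("symmetric layouts need an even split across CCXes")
    per_ccx = total_cores // ccx_count
    if per_ccx > CORES_PER_CCX:
        raise ValueError("a CCX holds at most 4 cores")
    return [per_ccx] * ccx_count


# Ryzen 9 3900X: 12 cores over 4 CCXes -> one core disabled per CCX
print(zen_layout(12, 4))  # [3, 3, 3, 3]
# An 8-core part: two fully-enabled CCXes
print(zen_layout(8, 2))   # [4, 4]
```

The takeaway is that a 12-core part isn’t “three full CCXes plus nothing” but four partially-enabled ones, which is exactly why cross-CCX latency matters so much on these chips.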
This design has some tradeoffs – it is great for adding and increasing core counts quickly, and works really well with the modular approach AMD took. In the first and second generation, AMD could make a large number of a single product – the Zeppelin die that was a Ryzen CPU, and then portion it out differently for each product segment. One full die makes a Ryzen 8 core, a die with defective cores could be binned down to a 6-core Ryzen 5 or 4-core Ryzen 3, and then the server Epyc chips and high-end desktop Threadripper chips used 4 dies with varying levels of enabled cores and other hardware.
However, with the 7nm chiplet design used in Ryzen 3rd gen, it doesn’t make a lot of sense to keep CCXes. My Ryzen 9 3900X, to continue the example, is somewhat arbitrarily divided into two CPU core chiplets, each split in half into two CCXes. If two cores on the same physical chiplet have to talk to one another, it stands to reason that should be fast, with the latency-heavy IF trips reserved for communication between different chiplet dies. AMD maintained the old design anyway, since developers were starting to code for CCXes and OS updates were making Windows 10 aware of Ryzen’s topology. However, the latency of this design is holding Ryzen back in gaming, which is latency-sensitive, and in some other workloads.
For me, the most exciting rumor of Zen 3 isn’t just the improved IPC but rather that the chips are apparently going to use a new topology, maintaining the CCX in a way, but making it a full-chip CCX. This will mean a few things – within 8 core CPUs in the mainstream, latency will drop tremendously which will help performance, but also that each core will functionally have access to twice as much L3 cache, which was a huge help in Zen 2 designs. For multi-chiplet designs like Ryzen 9 12 and 16-core parts, Threadripper, and Epyc, this will mean the topology loses a layer of complexity and will have more resources available in a low-latency trip within the same die. This is even more exciting because a recent CPU from AMD, the Ryzen 3 3300X, uses a single-CCX design for the first time. Rather than being two CCXes with 2 cores each, this 4-core part uses a single fully-enabled CCX, and as a result, it has substantially improved gaming performance clock for clock with its peers, and is less reliant on higher memory clocks to drive more performance.
However, it wouldn’t be AMD without noting that they’ve shot themselves in the foot slightly here. While the mantra of Zen and the AM4 desktop socket has been continued support through 2020, the new line is that Ryzen 4000 CPUs will only be supported on the newest motherboard chipsets: the 500 series released alongside Ryzen 3000 CPUs, and whatever comes next (presumably a 600 series launching with the new CPUs). This is a sharp change from what AMD has been publicizing, and in the case of one manufacturing partner (MSI), it means they have now falsely advertised a motherboard as supporting “all future AM4 CPUs.” Yikes. The excuse given is that motherboard BIOS sizes can’t be guaranteed to hold all the microcode needed for the full Zen family on AM4, since each motherboard manufacturer picks its own ROM size to suit its cost-reduction desires (MSI famously used 16 Mbit ROMs on their early Ryzen boards, making them too small and forcing the manufacture of new B450 boards with “MAX” branding, which led to the claim I mentioned above!). Whether this is valid depends on your perspective, but a lot of enthusiasts have called bullshit, and I tend to fall on that side. While I am planning a huge new system that would include a new motherboard and a 16-core Ryzen 4000 when it launches, the fact that this was announced so late feels rather iffy, and it steps on the shots AMD has taken at Intel, even as recently as last year, over the lack of upgrade path Intel so often offers.
Despite that, AMD in the CPU market is doing very well and there is a lot of evidence that they are going to be closer than ever to taking the gaming performance crown with Ryzen 4th gen, provided that the rumors mentioned pan out!
AMD GPUs in 2020
The year started kind of iffy for AMD on the GPU front. While the Navi-based RX 5000 lineup has done fairly well and had pretty good performance, the AMD driver situation has been a hot mess with all kinds of different issues. This hasn’t stopped AMD from pushing forward the new RDNA architecture, and it has some huge design wins with RDNA2 based GPUs being the foundation of both next-generation consoles from Sony and Microsoft.
Since the RX 5600 XT was announced at CES in January and subsequently launched, AMD has been relatively quiet on the desktop graphics front. Speculation is rampant, but it seems likely we’ll see RDNA2-based GPUs from AMD in 2020, bringing the next-gen console graphics feature set to desktops: AMD’s hardware raytracing, variable rate shading, and a 50% performance-per-watt improvement over the power-hungry Navi designs AMD launched as the RX 5000 lineup. While not much is really known about this design outside the console details and what AMD shared at their recent Investor Day event, it is exciting if only because the hardware underpinning these cards will be functionally identical to the console hardware that will define the next 5 years of gaming.
Nvidia in 2020
Nvidia has been on top of gaming graphics for a long time now, with larger, higher-performance dies and aggressive competitive behavior including closed-source libraries and partner deals with developers. However, they have produced compelling products for that time, and whatever you can say about their business practices, for the last several years, they’ve simply offered the best experience when compared to AMD.
2018 brought the Turing architecture, which brought RTX technology and real-time raytracing to gaming. While raytracing has been a curiosity more than a real, tangible thing that most games benefit from, the number of games offering it has increased (and will include WoW by the end of the year!), and with AMD’s RDNA2 supporting it and bringing it to the next-gen consoles, the number of titles that make some use of raytracing features is going to explode.
Turing brought a mix of features that helped raytracing and other Nvidia technology, like an improved NVENC encoder that offloads video encoding for livestreaming and capture. It was largely a guinea-pig generation, though: it brought a first-draft version of RTX technology to the forefront, along with the tensor cores Nvidia uses in server tech to accelerate AI, improved pipelining that allows concurrent execution of integer and floating-point math instead of FIFO queuing, and non-raytracing RTX technologies like DLSS upscaling.
While Nvidia has not yet shown off their next generation gaming hardware, this week through their virtual GTC event, they did show off the datacenter implementation of Ampere, their next architecture, and confirmed via interviews that it will be making its way into gaming GPUs as well.
Ampere for the server is a poor basis for deriving the gaming performance of the next GeForce, as the server part has no raytracing cores. What it does tell us is some of the directions Nvidia will take. For one, the number of tensor cores per shader module has decreased versus the Turing and Volta datacenter architectures, but Nvidia indicates this has only helped performance: the tensor cores that remain are more performant and capable of more varied mathematical operations. Raw FP32 performance doesn’t increase much, but the combined performance of the mixed math modes can be as much as 9 times higher than Volta’s in the datacenter, and Volta’s desktop introduction, limited to the Titan V, was pretty close to Turing RTX cards in its own right. With a 7nm process versus Turing’s 12nm, the chips can be more transistor-dense and either clock higher or use the same power envelope to feed far more hardware, which, judging from datacenter Ampere at least, points to a much higher CUDA core count and larger caches.
Rumors about Ampere for gaming, however, paint the picture a bit better. Ampere for gaming is expected to have raytracing performance around 4 times that of the Turing-era RTX cards, and could offer 30%-70% more overall performance than the RTX 2000-series equivalents. It’s also likely that the new lineup won’t include any GTX non-raytracing cards, which would be nice. Turing had some GTX cards introduced at the low end with the GeForce GTX 1660 and 1650 families, cards which did inherit some Turing improvements like INT8 capability and enhanced video encoding for streaming, but their existence has also kept raytracing a niche feature, as the cheapest buy-in for playable real-time raytracing is the $300 RTX 2060. A top-to-bottom product stack with raytracing at all price points would help increase adoption of the feature, and it is definitely in Nvidia’s interest to beat AMD to market with a full lineup of raytracing hardware before RDNA2 comes out and the consoles built around that raytracing feature set begin to influence things.
For me, the percentage increases are most appealing. As a 1080 Ti owner, I’m pretty happy with my card: the 2080 Ti posts at most a 30% improvement over it while costing over $1,000, and while I’m certainly not one to shy away from that as a long-term goal, it wasn’t enough to make me move, given that I had just built my new system prior to the RTX 2000 announcement. But if the (theoretical) 3080 Ti offers a 70% increase over the 2080 Ti, then I’m staring a doubling of performance in the face, along with features I don’t have now like RTX and the improved NVENC hardware, and it starts to make a lot of sense to bundle a 3080 Ti into my next system build (which is my plan, coincidentally!).
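The “doubling” claim is just compounded percentages. A quick back-of-envelope check, using the rumored numbers above (the 70% figure is the upper end of the rumored range, not a confirmed spec):

```python
# Compounding the generational gains: a 2080 Ti is ~30% faster than a
# 1080 Ti, and a rumored "3080 Ti" is said to be up to ~70% faster than
# a 2080 Ti. Relative speedups multiply, so:

gain_2080ti_over_1080ti = 1.30  # ~30% uplift (author's estimate)
gain_3080ti_over_2080ti = 1.70  # ~70% uplift (upper end of rumors)

total = gain_2080ti_over_1080ti * gain_3080ti_over_2080ti
print(f"Rumored 3080 Ti vs 1080 Ti: ~{total:.2f}x")  # ~2.21x
```

Multiplying the two uplifts gives roughly a 2.2x jump over a 1080 Ti, which is where the “doubling of performance” framing comes from.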
The idea that we might get close to such a performance jump over generations, similar to when I was a teenager and power doubled every 12-18 months, is incredibly exciting!
Sure, but what about software?
This week, Epic Games showed off Unreal Engine 5, the newest version of their massively successful middleware that powers a ton of games and their first stab at an engine for the next-gen consoles coming this year. The demo was impressive, using the new hardware and a lot of new rendering techniques to build a legitimately stunning scene running in real time on a PlayStation 5.
While the claim that the PS5’s technology is completely unobtainable to a consumer was somewhat questionable (a RAID 0 array of PCI-E Gen 4 NVMe SSDs can beat the PS5’s compressed throughput in both read and write, and top-end PC hardware is currently still stronger in all regards, albeit with the overhead of a general-purpose OS), the demo was genuinely jaw-dropping. The engine will also power tons of games on PC as-is, so it will be fascinating to see which settings and features port neatly over, given that Tim Sweeney talked about it being built largely for the PS5.
Overall, even if I weren’t planning a new build for late in the year, this would still be one of the most exciting technology years in a long time, and I can’t wait to see how the various bits of fact and rumors we’ve seen come together into actual products!