This is not correct. Despite being a single die, the Zen2/Zen3 IOD is internally divided into four quadrants; each chiplet (or pair of chiplets) has preferential access to its own quadrant and has to cross an internal interconnect to reach the other quadrants (and their memory). This is still a non-uniform topology.
This is explicitly called out in the BIOS of these systems: the terminology is "NPS4" (Nodes Per Socket), and it is documented in AMD's reference materials. The system can also be run in "NPS1" mode, where all memory channels are interleaved into a single node; since accesses are then striped across all four quadrants rather than staying local to one, average memory latency goes up somewhat.

https://developer.amd.com/wp-content/resources/56338_1.00_pu...
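To make the NPS1-vs-NPS4 distinction concrete, here's a toy sketch of the address-to-quadrant mapping. This is not AMD's actual interleave hash - the stripe size and mapping function are assumptions purely for illustration - but it shows why a linear scan under NPS1 lands on remote quadrants most of the time, while NPS4 keeps a node's allocations in one quadrant:

```python
# Toy model (NOT AMD's real hash): how NPS1 vs NPS4 might map physical
# addresses to IOD quadrants. STRIPE is an assumed interleave granularity.
STRIPE = 256  # bytes per interleave stripe (illustrative only)

def quadrant_nps1(addr: int) -> int:
    """NPS1: consecutive stripes rotate across all four quadrants,
    so a linear scan touches remote quadrants ~3/4 of the time."""
    return (addr // STRIPE) % 4

def quadrant_nps4(addr: int, node: int) -> int:
    """NPS4: memory allocated on a NUMA node stays in that node's quadrant."""
    return node

# A 4 KiB linear scan under NPS1 spreads evenly over the quadrants:
counts = [0] * 4
for addr in range(0, 4096, STRIPE):
    counts[quadrant_nps1(addr)] += 1
print(counts)  # [4, 4, 4, 4]
```

Under NPS4, that same scan (allocated on one node) would hit a single quadrant 16 times - all local, hence the lower latency.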
AnandTech did some really good coverage of this in their 3rd Gen Epyc (Milan) deep-dive, including latency measurements; Zen2 is largely similar architecturally in this area. Comparing the NPS1 and NPS4 charts, Zen2 pays around 12.8% higher memory latency in NPS1 mode, and Zen3 narrows that penalty to about 6.4%.

https://www.anandtech.com/show/16529/amd-epyc-milan-review/4
If you ignore the physical placement of functionality and simply look at a bird's-eye view of the data paths, it's not that different from Zen1. Zen1 had four "chiplets", each of which had its own memory controller (and associated uncore). In Zen2, the uncore has simply been pulled out of the chiplets onto the standalone IOD, but it is still implemented as four quadrants - just as Naples was four monolithic dies interconnected and packaged together.
As mentioned, since effectively the entire uncore is divided into quadrants, this has a few other quirks. The one that comes up most often is memory channel population - you really, really want to populate all memory channels on Epyc, even when running in NPS1 mode. If you're not going to populate sets of 8, make sure it's one of the "balanced" configurations; otherwise some quadrants have no local memory at all, and the performance hit can be substantial. For example, Lenovo's documentation shows that populating only 6 of 8 channels costs 29% (relative to the theoretical potential of a 6-DIMM configuration) even in the "correct" configuration (two quadrants each lose one channel), while an improper configuration (one quadrant with no attached memory) costs 60%. Populating 7 channels costs 65% relative to the theoretical maximum - you are losing about 2/3rds of your performance largely due to the NUMA topology!

https://lenovopress.com/lp1268.pdf
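A crude way to see why unbalanced configurations hurt: if NPS1 interleaving spreads traffic evenly across the four quadrants, the quadrant with the least local bandwidth bottlenecks everyone. This min-quadrant model is my own simplification - it captures the direction of the effect, not Lenovo's exact measured figures (real penalties also depend on interleave granularity, and a memory-less quadrant falls back to remote access rather than stalling entirely):

```python
# Toy model (assumption: NPS1 interleaves traffic evenly across the four
# quadrants, so the weakest quadrant limits aggregate throughput).
def effective_channels(channels_per_quadrant: list) -> int:
    """Interleaved bandwidth is capped at 4x the weakest quadrant's channels."""
    return 4 * min(channels_per_quadrant)

configs = {
    "8 DIMMs, balanced":   [2, 2, 2, 2],  # all channels populated
    "6 DIMMs, balanced":   [2, 2, 1, 1],  # two quadrants each lose one channel
    "7 DIMMs":             [2, 2, 2, 1],  # one quadrant down a channel
    "6 DIMMs, improper":   [2, 2, 2, 0],  # one quadrant with no memory at all
}
for name, cfg in configs.items():
    print(f"{name}: {effective_channels(cfg)} / 8 channels' worth")
```

In this model the memory-less quadrant drives interleaved throughput to zero; on real hardware remote access rescues it, at the large (~60%) cost Lenovo measured.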
PCIe latency is also slightly higher when crossing quadrants. Not the sort of thing most people will be paying attention to, but the guy above doing HFT and worrying about NUMA affinity is probably also watching which cores sit in the same quadrant as the PCIe lanes his FPGAs hang off of, because it does matter. Netflix also ran into similar issues with bandwidth - needlessly pushing data across NUMA domains will eventually bottleneck performance if you're moving enough of it; keeping it inside the quadrant avoids that bottleneck.
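On Linux you can actually do this mapping yourself: sysfs exposes each PCIe device's NUMA node and each node's local CPU list. A minimal sketch (the BDF in the usage comment is hypothetical; `numa_node` reads -1 when the kernel doesn't know or the system is effectively UMA):

```python
# Sketch: find the CPUs local to a PCIe device's NUMA node via Linux sysfs,
# so the process talking to that device can be pinned to same-quadrant cores.
import os

def parse_cpulist(s: str) -> set:
    """Parse a sysfs cpulist like '0-3,8-11' into a set of CPU ids."""
    cpus = set()
    for part in s.strip().split(","):
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus

def cpus_for_pci_device(bdf: str) -> set:
    """Return the CPUs local to the device's NUMA node."""
    with open(f"/sys/bus/pci/devices/{bdf}/numa_node") as f:
        node = int(f.read())
    if node < 0:  # kernel reports -1 when locality is unknown
        return set(range(os.cpu_count()))
    with open(f"/sys/devices/system/node/node{node}/cpulist") as f:
        return parse_cpulist(f.read())

# Usage (hypothetical device address; Linux-only):
#   os.sched_setaffinity(0, cpus_for_pci_device("0000:41:00.0"))
```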
It really is a testament to how well AMD made NUMA work that it doesn't "feel" like NUMA - and I think they even turn the NPS1 mode on by default now. But architecturally, it is NUMA underneath, and you can extract a small amount of additional performance by pulling the veneer of UMA away and addressing the hardware as it is actually implemented.
Thanks for the clarifications. I wasn't aware the EPYC IOD was so severely sliced, and just assumed the NPS4 mode would be for isolating neighbour VMs and improving DRAM row buffer locality, both mostly by reducing channel interleaving and setting up somewhat-explicit NUMA.
Yeah! Most people don't realize it because it does pretty much just behave like UMA until you get to the extremes of performance tuning. The one gotcha that does potentially affect the general public is that thing about making sure you populate sets of 8 sticks if at all possible, but most server users will be populating sets of 8 anyway.
It's actually stunning how good a job AMD did there, I'm not dumping on it at all - for 99% of users it might as well be UMA. Naples very much acted like a four-socket system, while Rome's quadrants more or less Just Work. I've always been very curious about what changed to make it behave so differently, whether it's the off-chip interconnects being that much higher-latency than the on-chip interconnects, or what.