AMD stacks memory cache in 3D to boost datacenter CPUs

Advanced Micro Devices announced that it is shipping its third-generation AMD Epyc processors with AMD 3D V-Cache, which AMD says are the highest-performing x86 server processors for technical computing.

Code-named Milan-X, AMD’s processor is the first datacenter central processing unit (CPU) to incorporate 3D die stacking, a way of building memory components in three dimensions to pack more memory into a confined space and keep processors fed with data.

Milan-X will enable AMD to give its Epyc CPU family a mid-cycle boost and deliver up to 66% higher performance on technical computing workloads than comparable third-generation AMD Epyc processors without stacked cache. That’s important as AMD and Intel are slugging it out to be the performance leader in the lucrative server processor market.

Milan-X enables a third-level memory storage unit, dubbed L3 cache, to stack in three dimensions, making it the largest L3 cache in the industry, said Ram Peddibhotla, corporate vice president of product management at AMD, in an interview with VentureBeat. AMD calls the 3D cache a “chiplet,” or a component that it can drop into its processors.

“We have unquestioned leadership, as third-gen Epyc currently owns more than 250 performance world records,” Peddibhotla said. “We span the gamut of datacenter use cases. For the technical computing segment, we’re raising that bar even further with Milan-X processors.”

That will help applications such as computational fluid dynamics (CFD), finite element analysis (FEA), electronic design automation (EDA), and structural analysis, he said.

Milan-X processors have access to 768MB of L3 cache, three times as much as third-generation Epyc processors without 3D V-Cache, delivering faster time-to-results on targeted workloads.

For electronic design automation (or chip design software), the AMD Epyc 7373X CPU can deliver up to 66% more performance for EDA RTL simulations, such as Synopsys VCS, when compared to the Epyc 73F3 CPU.

The 64-core AMD Epyc 7773X processor can deliver on average 44% more performance on Altair Radioss simulation applications compared to the competition’s top-of-stack processor, Peddibhotla said.

“The interesting thing about Milan-X is that it shows AMD is focusing on-chip packaging technologies as well, just as we’ve seen Intel do recently. By stacking the cache on top of the CPU cores, Milan-X gets a performance advantage without needing to make any major architectural changes,” said Bob O’Donnell, chief analyst at Technalysis Research, in an email to VentureBeat.

And the 32-core AMD Epyc 7573X processor can solve up to 118% more CFD problems per day than the competition’s 32-core processor while running ANSYS CFX, he said.

Peddibhotla said that with Milan-X 32-core processors, you can get the same amount of work done with half the servers. On top of that, the servers use almost half the energy, dropping from 178 kilowatt hours per year to 91 kilowatt hours per year. That adds up to a 51% lower total cost of ownership. Peddibhotla said the energy saved by choosing AMD’s solution over Intel’s is equivalent to the carbon sequestered by 80 acres of forest land every year.

“When you think about this, we’re replacing 20 Intel servers with 10 Epyc servers. So saving power across all of those servers can really result in a tremendous impact from an environmental sustainability point of view,” he said.
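The consolidation figures can be checked with a little arithmetic. The sketch below is only an illustration: it uses the numbers quoted in this article and assumes, without confirmation from AMD, that the 20-to-10 server comparison is based on the 118% CFD throughput uplift cited for the 32-core part. The function and constant names are hypothetical.

```python
import math

# Figures quoted in the article.
ENERGY_BEFORE_KWH = 178    # stated energy use before consolidation
ENERGY_AFTER_KWH = 91      # stated energy use after consolidation
BASELINE_SERVERS = 20      # Intel servers being replaced
CFD_UPLIFT_PERCENT = 118   # "up to 118% more CFD problems per day"


def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    return (before - after) / before * 100.0


def servers_needed(baseline_servers: int, uplift_percent: float) -> int:
    """Servers required to match the baseline's total throughput.

    Assumption: throughput scales linearly with server count, and each
    new server does (1 + uplift) times the work of a baseline server.
    """
    speedup = 1.0 + uplift_percent / 100.0
    return math.ceil(baseline_servers / speedup)


energy_cut = percent_reduction(ENERGY_BEFORE_KWH, ENERGY_AFTER_KWH)
epyc_servers = servers_needed(BASELINE_SERVERS, CFD_UPLIFT_PERCENT)

print(f"Energy reduction: {energy_cut:.1f}%")
print(f"Servers needed:   {epyc_servers}")
```

Under these assumptions the energy drop works out to just under half, and a 2.18x per-server throughput is enough for 10 servers to cover the work of 20, which is consistent with the "half the servers" claim.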

Milan-X will be available from a wide array of manufacturing partners, including Atos, Cisco, Dell, Gigabyte, HPE, Lenovo, QCT, and Supermicro. It is also supported by AMD software ecosystem partners, including Altair, Ansys, Cadence, Dassault Systèmes, Siemens, and Synopsys.

“The Milan-X is the start of more innovative packaging of die to meet specialized workloads,” said Kevin Krewell, an analyst at Tirias Research, in an email to VentureBeat. “Some workloads can use large caches effectively and you can get a generational jump in performance with the V-Cache.”

The L3 cache is three times larger, with 96 megabytes per CCD compared to 32 megabytes in current products. Overall, that adds up to 768 megabytes of cache per processor socket. Milan-X uses the same microarchitecture as its predecessor, Milan, so it is completely software compatible, Peddibhotla said.
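The cache figures fit together as simple multiplication. The sketch below uses only the per-CCD and per-socket numbers quoted above; the CCD count is not stated in the article and is derived here from those two figures, so treat it as an inference. The names are hypothetical.

```python
# Figures quoted in the article (megabytes).
L3_PER_CCD_MILAN_X_MB = 96
L3_PER_CCD_MILAN_MB = 32
L3_PER_SOCKET_MILAN_X_MB = 768


def implied_ccd_count(total_mb: int, per_ccd_mb: int) -> int:
    """Derive how many CCDs the per-socket total implies."""
    if total_mb % per_ccd_mb != 0:
        raise ValueError("total is not a whole multiple of the per-CCD size")
    return total_mb // per_ccd_mb


# Inferred, not quoted: number of CCDs per socket.
ccds = implied_ccd_count(L3_PER_SOCKET_MILAN_X_MB, L3_PER_CCD_MILAN_X_MB)

# Per-socket L3 for the non-stacked part, assuming the same CCD count.
milan_total_mb = ccds * L3_PER_CCD_MILAN_MB

ratio = L3_PER_SOCKET_MILAN_X_MB / milan_total_mb

print(f"Implied CCDs per socket: {ccds}")
print(f"Milan L3 per socket:     {milan_total_mb} MB")
print(f"Milan-X vs Milan L3:     {ratio:.0f}x")
```

The per-socket ratio matches the per-CCD ratio (96 to 32), which is why both the "three times" and the 768MB figures hold at once.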

“Milan-X is the world’s highest performance x86 server processor for technical compute,” Peddibhotla said. “When we drive this unprecedented 768 megabytes of L3 cache per socket, and it’s also fully compatible with existing Milan platforms, it drives great productivity and lower total cost of ownership and better energy efficiency.”

Peddibhotla said that AMD has been on a multi-year journey on 3D packaging and now it has come to fruition.
