Intel Sapphire Rapid-SP Xeon CPUs To Feature Up To 64 GB HBM2e Memory, Also Talks Next-Gen Xeon & Data Center GPUs For 2023+

At SC21 (Supercomputing 2021), Intel hosted a brief session where they discussed their next-generation data center roadmap and talked about their upcoming Ponte Vecchio GPUs & the Sapphire Rapids-SP Xeon CPUs.

Intel Talks Sapphire Rapids-SP Xeon CPUs & Ponte Vecchio GPUs at SC21 – Also Reveals Next-Gen Data Center Lineup For 2023+

Intel had already discussed most of the technical details regarding its next-gen data center CPU & GPU lineup at Hot Chips 33. They are reaffirming what they’ve said and also revealing a few more tidbits at SuperComputing 21.

Intel Celebrates 50th Anniversary of 4004: The World’s First Commercial Microprocessor – A Retrospective

The current generation of Intel Xeon Scalable processors has been extensively adopted by our HPC ecosystem partners, and we are adding new capabilities with Sapphire Rapids – our next-generation Xeon Scalable processor that is currently sampling with customers. This next-generation platform delivers multi-capabilities for the HPC ecosystem, bringing for the first time in-package high bandwidth memory with HBM2e that leverages the Sapphire Rapids multi-tile architecture. Sapphire Rapids also brings enhanced performance, new accelerators, PCIe Gen 5 and other exciting capabilities optimized for AI, data analytics and HPC workloads.

HPC workloads are evolving rapidly. They are becoming more diverse and specialized, requiring a mix of heterogeneous architectures. While the x86 architecture continues to be the workhorse for scalar workloads, if we are to deliver orders-of magnitude performance gains and move beyond the exascale era, we must critically look at how HPC workloads are run within vector, matrix and spatial architectures, and we must ensure these architectures seamlessly work together.Intel has adopted an “entire workload” strategy, where workload-specific accelerators and graphics processing units (GPU) can seamlessly work with central processing units (CPU) from both hardware and software perspectives.

We are deploying this strategy with our next-generation Intel Xeon Scalable processors and Intel Xe HPC GPUs (code-named “Ponte Vecchio”) that will power the 2 exaflop Aurora supercomputer at Argonne National Laboratory. Ponte Vecchio has the highest compute density per socket and per nodes, packing 47 tiles with our advanced packaging technologies: EMIB and Foveros. There are over 100 HPC applications running on Ponte Vecchio. We are also working with partners and customers including – ATOS, Dell, HPE, Lenovo, Inspur, Quanta and Supermicro – to deploy Ponte Vecchio in their latest supercomputers.

via Intel

Intel Sapphire Rapids-SP Xeon Data Center CPUs

According to Intel, the Sapphire Rapids-SP will come in two package variants, a standard, and an HBM configuration. The standard variant will feature a chiplet design composed of four XCC dies that will feature a die size of around 400mm2. This is the die size for a singular XCC die and there will be four in total on the top Sapphire Rapids-SP Xeon chip. Each die will be interconnected via EMIB which has a pitch size of 55u and a core pitch of 100u.

The standard Sapphire Rapids-SP Xeon chip will feature 10 EMIB interconnects and the entire package will measure at a mighty 4446mm2. Moving over to the HBM variant, we are getting an increased number of interconnects which sit at 14 and are needed to interconnect the HBM2E memory to the cores.

Intel Core i7-12800H Alder Lake-P CPU Benchmarks Leak Out Too, Up To 25% Faster Than AMD Ryzen 7 5800H CPU In Single-Threaded Test

The four HBM2E memory packages will feature 8-Hi stacks so Intel is going for at least 16 GB of HBM2E memory per stack for a total of 64 GB across the Sapphire Rapids-SP package. Talking about the package, the HBM variant will measure at an insane 5700mm2 or 28% larger than the standard variant. Compared to the recently leaked EPYC Genoa numbers, the HBM2E package for Sapphire Rapids-SP would end up 5% larger while the standard package will be 22% smaller.

Intel Sapphire Rapids-SP Xeon (Standard Package) – 4446mm2
Intel Sapphire Rapids-SP Xeon (HBM2E Package) – 5700mm2
AMD EPYC Genoa (12 CCD Package) – 5428mm2

Intel also states that the EMIB link provides twice the bandwidth density improvement and 4 times better power efficiency compared to standard package designs. Interestingly, Intel calls the latest Xeon lineup Logically monolithic which means that they are referring to the interconnect that’ll offer the same functionality as a single-die would but technically, there are four chiplets that will be interconnected together. You can read the full details regarding the standard 56 core & 112 thread Sapphire Rapids-SP Xeon CPUs here.

Intel Xeon SP Families:

Family Branding	Skylake-SP	Cascade Lake-SP/AP	Cooper Lake-SP	Ice Lake-SP	Sapphire Rapids	Emerald Rapids	Granite Rapids	Diamond Rapids
Process Node	14nm+	14nm++	14nm++	10nm+	Intel 7	Intel 7	Intel 4	Intel 3?
Platform Name	Intel Purley	Intel Purley	Intel Cedar Island	Intel Whitley	Intel Eagle Stream	Intel Eagle Stream	Intel Mountain Stream Intel Birch Stream	Intel Mountain Stream Intel Birch Stream
MCP (Multi-Chip Package) SKUs	No	Yes	No	No	Yes	TBD	TBD (Possibly Yes)	TBD (Possibly Yes)
Socket	LGA 3647	LGA 3647	LGA 4189	LGA 4189	LGA 4677	LGA 4677	LGA 4677	TBD
Max Core Count	Up To 28	Up To 28	Up To 28	Up To 40	Up To 56	Up To 64?	Up To 120?	TBD
Max Thread Count	Up To 56	Up To 56	Up To 56	Up To 80	Up To 112	Up To 128?	Up To 240?	TBD
Max L3 Cache	38.5 MB L3	38.5 MB L3	38.5 MB L3	60 MB L3	105 MB L3	120 MB L3?	TBD	TBD
Memory Support	DDR4-2666 6-Channel	DDR4-2933 6-Channel	Up To 6-Channel DDR4-3200	Up To 8-Channel DDR4-3200	Up To 8-Channel DDR5-4800	Up To 8-Channel DDR5-5600?	TBD	TBD
PCIe Gen Support	PCIe 3.0 (48 Lanes)	PCIe 3.0 (48 Lanes)	PCIe 3.0 (48 Lanes)	PCIe 4.0 (64 Lanes)	PCIe 5.0 (80 lanes)	PCIe 5.0	PCIe 6.0?	PCIe 6.0?
TDP Range	140W-205W	165W-205W	150W-250W	105-270W	Up To 350W	Up To 350W	TBD	TBD
3D Xpoint Optane DIMM	N/A	Apache Pass	Barlow Pass	Barlow Pass	Crow Pass	Crow Pass?	Donahue Pass?	Donahue Pass?
Competition	AMD EPYC Naples 14nm	AMD EPYC Rome 7nm	AMD EPYC Rome 7nm	AMD EPYC Milan 7nm+	AMD EPYC Genoa ~5nm	AMD Next-Gen EPYC (Post Genoa)	AMD Next-Gen EPYC (Post Genoa)	AMD Next-Gen EPYC (Post Genoa)
Launch	2017	2018	2020	2021	2022	2023?	2024?	2025?

Intel Ponte Vecchio Data Center GPUs

Moving over to Ponte Vecchio, Intel outlined some key features of its flagship data center GPU such as 128 Xe cores, 128 RT units, HBM2e memory, and a total of 8 Xe-HPC GPUs that will be connected together. The chip will feature up to 408 MB of L2 cache in two separate stacks that will connect via the EMIB interconnect. The chip will feature multiple dies based on Intel’s own ‘Intel 7’ process and TSMC’s N7 / N5 process nodes.

Intel also previously detailed the package and die size of its flagship Ponte Vecchio GPU based on the Xe-HPC architecture. The chip will consist of 2 tiles with 16 active dies per stack. The maximum active top die size is going to be 41mm2 while the base die size which is also referred to as the ‘Compute Tile’ sits at 650mm2.

The Ponte Vecchio GPU makes use of 8 HBM 8-Hi stacks and contains a total of 11 EMIB interconnects. The whole Intel Ponte Vecchio package would measure 4843.75mm2. It is also mentioned that the bump pitch for Meteor Lake CPUs using High-Density 3D Forveros packaging will be 36u.

Aside from these, Intel also posted a roadmap in which they confirm that the next-generation Xeon Sapphire Rapids-SP family and the Ponte Vecchio GPUs will be available in 2022 but there’s also the next-generation product lineup which is planned for 2023 and beyond. Intel hasn’t explicitly told what it plans to bring but we know that Sapphire Rapids successor will be known as Emerald and Granite Rapids and the successor to that will be known as Diamond Rapids.

For the GPU side, we don’t know what the successor to Ponte Vecchio will be known but expect it to be competing with NVIDIA’s and AMD’s next-generation GPUs for the data center market.

Moving forward, Intel has several next-generation solutions for advanced packaging designs such as Forveros Omni and Forveros Direct as they enter the Angstrom Era of transistor development.

Next-Gen Data Center GPU Accelerators

GPU Name	AMD Instinct MI200	NVIDIA Hopper GH100	Intel Xe HPC
Flagship Product	AMD Instinct MI250X	NVIDIA H100	Intel Ponte Vecchio
Packaging Design	MCM (Infinity Fabric)	MCM (NVLINK)	MCM (EMIB + Forveros)
GPU Architecture	Aldebaran (CDNA 2)	Hopper GH100	Xe-HPC
GPU Process Node	6nm	5nm?	7nm (Intel 4)
GPU Cores	14,080	18,432?	32,768?
GPU Clock Speed	1700 MHz	TBA	TBA
L2 / L3 Cache	2 x 8 MB	TBA	2 x 204 MB
FP16 Compute	383 TOPs	TBA	TBA
FP32 Compute	95.7 TFLOPs	TBA	~45 TFLOPs (A0 Silicon)
FP64 Compute	47.9 TFLOPs	TBA	TBA
Memory Capacity	128 GB HBM2E	128 GB HBM2E?	TBA
Memory Clock	3.2 Gbps	TBA	TBA
Memory Bus	8192-bit	8192-bit?	8192-bit
Memory Bandwidth	3.2 TB/s	~2.5 TB/s?	5 TB/s
Form Factor	Dual Slot, Full Length / OAM	Dual Slot, Full Length / OAM	OAM
Cooling	Passive Cooling Liquid Cooling	Passive Cooling Liquid Cooling	Passive Cooling Liquid Cooling
TDP	Q4 2021	2H 2022	2022-2023?

[ad_2]