LUMI-C: The CPU Partition

The LUMI-C partition consists of 1536 compute nodes with an aggregated LINPACK performance of 5.63 Petaflops.

| Nodes | CPUs | CPU cores | Memory | Local storage | Network |
|-------|------|-----------|--------|---------------|---------|
| 1376 | AMD EPYC 7763 (2.45 GHz base, 3.5 GHz boost) | 128 (2x64) | 256 GiB | none | 1x 100 Gb/s |
| 128 | AMD EPYC 7763 (2.45 GHz base, 3.5 GHz boost) | 128 (2x64) | 512 GiB | none | 1x 100 Gb/s |
| 32 | AMD EPYC 7763 (2.45 GHz base, 3.5 GHz boost) | 128 (2x64) | 1024 GiB | none | 1x 100 Gb/s |

CPU

Each LUMI-C compute node is equipped with two AMD EPYC 7763 CPUs with 64 cores each, running at a 2.45 GHz base clock, for a total of 128 cores per node. The cores support 2-way simultaneous multithreading (SMT), allowing up to 256 threads per node.
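The core and thread counts above follow directly from the node layout; a minimal sketch of the arithmetic, using only the numbers stated in the text:

```python
# Core and hardware-thread counts for a LUMI-C node (figures from the text).
sockets = 2               # two EPYC 7763 CPUs per node
cores_per_socket = 64
smt_threads_per_core = 2  # 2-way simultaneous multithreading

cores_per_node = sockets * cores_per_socket               # 128 cores
threads_per_node = cores_per_node * smt_threads_per_core  # 256 threads

print(cores_per_node, threads_per_node)  # 128 256
```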

The EPYC 7763 CPU cores are "Zen 3" compute cores, the same core found in the Ryzen 5000 series consumer CPUs. These cores are fully x86-64 compatible and support AVX2 256-bit vector instructions, for a maximum throughput of 16 double-precision FLOP/clock (AVX2 FMA operations). Each core has 32 KiB of private L1 data cache, a 32 KiB L1 instruction cache, and 512 KiB of private L2 cache.
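From the 16 DP FLOP/clock figure, the theoretical double-precision peak of a node can be estimated. As a hedged sketch, this uses the 2.45 GHz base clock; the achievable clock under full AVX2 load may differ:

```python
# Theoretical peak double-precision throughput of one LUMI-C node.
flop_per_clock = 16      # AVX2 FMA: per-core DP FLOP per cycle (from the text)
base_clock_hz = 2.45e9   # base clock; sustained AVX2 clocks may vary
cores_per_node = 128

peak_flops = flop_per_clock * base_clock_hz * cores_per_node
print(f"{peak_flops / 1e12:.2f} TFLOP/s per node")  # ~5.02 TFLOP/s
```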

Overview of an AMD EPYC 7763 CPU

The EPYC CPUs consist of multiple chiplets, so-called core complex dies (CCDs). There are 8 CCDs per processor with 8 cores each. The L3 cache is shared between the eight cores of a CCD and has a capacity of 32 MiB, for a total of 256 MiB of L3 cache per processor. Note that this differs from the earlier Zen 2-based EPYC 7002-series processors, where the L3 cache was shared by groups of 4 cores (a "CCX"), with two such groups inside each CCD. The unified cache can improve the performance of certain workloads, as a single core has access to more L3 cache.

The CCD units are all connected to a central I/O die, which contains the memory controller. There are 8 memory channels per socket, with a peak theoretical bandwidth of 204.8 GB/s per socket.
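The 204.8 GB/s figure is consistent with 8 channels of DDR4-3200 memory; the DDR4-3200 transfer rate is an assumption here (the text states only the channel count and total bandwidth), but the arithmetic works out:

```python
# Peak theoretical memory bandwidth per socket.
channels = 8
transfer_rate = 3.2e9    # assumed DDR4-3200: 3200 MT/s per channel
bytes_per_transfer = 8   # 64-bit-wide channel

bandwidth = channels * transfer_rate * bytes_per_transfer
print(f"{bandwidth / 1e9:.1f} GB/s per socket")  # 204.8 GB/s
```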

The LUMI-C compute nodes are configured with 4 NUMA zones ("quadrant mode") with 2 CCDs per quadrant. The figure below gives an overview of the distances between the NUMA nodes.

Distances between NUMA nodes

The two processors within a node are connected by four bi-directional, 16-bit-wide links operating at 16 Gbps each. This provides a peak bandwidth of 256 GB/s.
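The 256 GB/s figure can be reconstructed from the link parameters given above; a sketch, assuming the quoted total is the bidirectional aggregate (128 GB/s per direction):

```python
# Peak socket-to-socket bandwidth from the per-link parameters in the text.
links = 4
width_bits = 16
rate_gbps = 16  # per bit lane, Gb/s

one_direction = links * width_bits * rate_gbps / 8  # GB/s, one direction
aggregate = one_direction * 2                       # bidirectional total
print(one_direction, aggregate)  # 128.0 256.0
```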

Network

Initially, the LUMI-C nodes will have a 100 Gb/s HPE Cray Slingshot-10 network card.

The nodes will later be upgraded to the 200 Gb/s Slingshot-11 interconnect when LUMI-G becomes operational in 2022.

Storage

There is no local storage on the compute nodes in LUMI-C. You have to use one of the parallel file systems.