HPC market shifts: dominant Intel faces a squeeze from Arm and AMD in 2020

Intel has long held more than 90% of the lucrative server and HPC (high-performance computing) processor market, but that market is not static. With the arrival of AMD’s Epyc processors and the entry of Arm-architecture processors, Intel will struggle to avoid new competition in 2020.

Market share will not shift quickly, however. Intel x86 processors will continue to dominate for the next five years or more, which makes Intel’s new products all the more important in the face of strong competitors.

The high-performance computing (HPC) market has grown for many years. At least three CPU architectures are currently available for HPC work (x86, Arm and Power) from more than six reliable vendors, along with two (soon to be three) GPU architectures. Even so, the vast majority of HPC systems today are powered by Intel CPUs, sometimes paired with Nvidia GPUs. Starting this year, that will begin to change.

The nearest-term changes are likely to come from the x86 space, where AMD’s Epyc momentum gives Intel its fiercest competition since the Opteron (AMD’s 64-bit processor) era that began in 2003. In particular, it is almost certain that the second-generation Epyc chip, also known as “Rome”, will take market share from Intel in the server space, including HPC.

Rome’s impressive price/performance ratio is undoubtedly the key reason supercomputer projects in the US, UK, Germany and Finland have chosen it, and most of those systems will be launched this year. The key question now is how much Intel’s updated processors, the 14nm “Cooper Lake” Xeon Scalable processor and the future 10nm “Ice Lake” Xeon Scalable processor, will limit that loss of market share.

Arm is also slowly entering the HPC market. We believe the main reason for the slow pace is that, from a technical point of view, the architecture has no particular advantage over x86 or any other general-purpose processor. Arm’s advantage is that its IP is licensed, so the architecture can be adapted into custom processors for different markets while still sharing a global software ecosystem.

Note that this malleability is a long-term advantage, not a short-term one. Fujitsu spent at least five years designing and developing the A64FX, the first HPC-specific processor based on the Arm architecture. It will debut in RIKEN’s 400-petaflops Fugaku supercomputer, which will test the viability of the Arm architecture and its ecosystem in high-end HPC. Incidentally, it will also show the advantages and disadvantages of a design with no accelerators and no external memory system.

The ThunderX2 SoC from Cavium, which Marvell bought in a deal announced in 2017, is a more general-purpose chip aimed at entry-level HPC. It was launched in 2018 and quickly won orders for Marvell, becoming the basis for some of the first Arm-based HPC clusters in the UK and elsewhere. Although ThunderX2 is not an ultra-high-performance part, it benefits from a well-integrated memory controller and performs well in applications that are limited by memory bandwidth. Marvell hopes to build on the success of the ThunderX2 with a successor, which is expected to be released early this year.

Marvell predicts that the third generation, built on a 7nm process, will compete with AMD’s “Rome” Epyc 7002 and Intel’s Ice Lake Xeon Scalable processors, offering more than twice the performance of ThunderX2, higher clock frequencies and better energy efficiency.

This year, Arm-based high-performance computing will gain another important option: commercial systems based on the A64FX. For example, under a cooperation agreement with Fujitsu, customers can now order A64FX-equipped CS500 clusters from Cray/HPE. For the Japanese and European markets, Fujitsu will also offer its own A64FX-based FX700 and FX1000 systems.

If these systems attract enough customers in their respective regions, other OEMs may strike similar agreements with Fujitsu.

The first commercial A64FX deployments are already in sight. Isambard 2, the successor to the University of Bristol’s original ThunderX2-powered Isambard cluster, will use the A64FX-based Cray CS500. Although nothing has been announced, it would not be surprising if one (or more) of Europe’s three exascale-class supercomputers, machines able to perform a billion billion calculations per second, also used the A64FX chip.

We believe the current enthusiasm of users and vendors for Arm-based clusters reflects the fact that the shift appears to have reached an inflection point. Hyperion Research, which has been tracking Arm sales in HPC, expects a compound annual growth rate of 64.7 percent over the next five years.

Although only 50,000 Arm chips were used for HPC in 2019, Hyperion expects that number to exceed 233,000 by 2020 and 610,000 by 2024. Many of these systems will be outside the United States, reflecting the fact that all of the first large-scale Arm-based systems will be built and deployed in Europe, China and Japan. Together these regions account for more than half of the high-performance computing market. That said, even if Arm sustains high growth rates in this area, x86 processors will dominate the market for the next five years or more.
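The growth figures above can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch, not Hyperion's method: it assumes the 64.7 percent rate is measured from the 2019 baseline of 50,000 chips to the 2024 forecast of 610,000, and the function names `cagr` and `project` are illustrative only.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` years."""
    return (end / start) ** (1.0 / years) - 1.0


def project(start: float, rate: float, years: int) -> float:
    """Value reached after compounding `start` at `rate` for `years` years."""
    return start * (1.0 + rate) ** years


# Figures quoted in the article; treating 2019 -> 2024 as the
# five-year window is an assumption made for this illustration.
chips_2019 = 50_000
chips_2024 = 610_000

implied_rate = cagr(chips_2019, chips_2024, 5)
projected_2024 = project(chips_2019, 0.647, 5)

print(f"Implied CAGR 2019-2024: {implied_rate:.1%}")
print(f"50,000 chips compounded at 64.7% for 5 years: {projected_2024:,.0f}")
```

Under that assumption the implied rate comes out close to the quoted 64.7 percent, and compounding the 2019 figure at 64.7 percent lands just above 600,000 chips, so the article's numbers are mutually consistent.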

As for the Power architecture, IBM is the only real player despite the OpenPower program. The Power10 processor was originally scheduled to launch this year but now looks set to arrive in 2021, and the company is not relying on HPC to increase shipments. Although the Power10 could be an impressive chip for high-performance computing, no large systems are lined up to use it (the Department of Energy passed over IBM and the Power10 for its CORAL-2 contracts).

One potential growth point is that the European Laboratory for Open Computer Architecture (LOCA) plans to select OpenPower as one of three architectures for developing open source HPC processors.

For the foreseeable future, the Power architecture seems destined to play a secondary role in high-performance computing.

GPUs and the wider accelerator space are sure to grow, especially once you count the custom designs being pursued in China (Sugon’s DCU and the Matrix-3000 DSP) and Europe (RISC-V and other domain-specific accelerators from the European Processor Initiative), along with the countless AI accelerators entering the market. For example, Intel recently introduced its neural network processors, the NNP-T and NNP-I. And of course there are the various FPGA generations from Xilinx and Intel for HPC applications that need semi-custom hardware.

However, for mainstream HPC users, GPUs will remain the preferred accelerator platform. Nvidia dominates the field, but AMD and its Radeon Instinct line are poised to capture some market share. The top-of-the-line MI60 delivers 7.4 teraflops of double-precision performance, with 32 GB of HBM2 memory and 200 GB/s Infinity Fabric links to other GPUs. In future iterations, that connectivity will extend to AMD’s Epyc CPUs so that the GPU and CPU can communicate over the same fabric. This feature will be tested at scale in the “Frontier” exascale supercomputer at Oak Ridge National Laboratory, where Infinity Fabric will connect four Radeon Instinct GPUs and one Epyc CPU in each node. Frontier is scheduled to launch in 2021.

In the same year, the Aurora exascale supercomputer is expected to come online at Argonne National Laboratory. The system will be equipped with Intel’s Xe GPU, a coprocessor designed to accelerate HPC and neural network training, much like Nvidia’s V100 and T4. Aurora will therefore be the first large-scale test of this processor on HPC and AI workloads. Since no Xe processors are available yet (they are scheduled for release later this year), their performance and programmability are unknown.

In this regard Nvidia has an advantage, because the company has been methodically building out its CUDA software around its hardware for more than a decade and has a large base of developers and users. Its GPUs have also proved to be something of a moving target for competitors, and with a new-generation architecture (Ampere) that could launch later this year, Nvidia may once again pull ahead.

But now, at least, it is a three-player game. As the new decade begins, that will make the accelerator market even more interesting.