Intel recently released the 38th revision of its instruction set extensions reference, confirming a very odd arrangement: an important instruction set introduced in the 14nm era disappears on the first 10nm generation and returns only with the second. The extension in question is AVX512_BF16, a set of vector neural-network instructions that supports the bfloat16 (BF16) format.
BF16 sits between the standardized half-precision FP16 and single-precision FP32 formats. It gives up precision within its 16 bits in exchange for a wider numerical range. Compared with FP32, it lets more data fit in memory, cuts the time spent moving data in and out, and reduces circuit complexity, all of which ultimately leads to higher computing speed.
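To make the trade-off concrete: BF16 keeps the sign bit and all 8 exponent bits of FP32 but only 7 of its 23 mantissa bits, so a BF16 value is essentially the upper half of an FP32 value. The following is a minimal Python sketch of that conversion, not Intel's implementation. The function names are invented for illustration, and round-to-nearest-even is assumed as the rounding mode.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Return a 16-bit BF16 pattern for x (illustrative helper, not a real API).

    Assumes x fits in FP32; struct raises OverflowError otherwise.
    """
    # Reinterpret the value as its 32-bit FP32 bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # NaN: exponent all ones and a non-zero mantissa. Force a mantissa bit
    # so that truncation cannot turn the NaN into an infinity.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Round to nearest, ties to even, then keep the upper 16 bits.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(b: int) -> float:
    """Expand a 16-bit BF16 pattern back to a float by zero-filling the low bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


if __name__ == "__main__":
    for value in (1.0, 3.14159, 1e38):
        b = float_to_bf16_bits(value)
        print(f"{value!r} -> 0x{b:04X} -> {bf16_bits_to_float(b)!r}")
```

A value such as 1e38 survives the round trip with only a small relative error because BF16 shares FP32's exponent range, whereas FP16, with its 5 exponent bits, cannot represent anything above 65504.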
The format has become the de facto standard for deep learning and is supported by Google's TPU and by Intel's FPGAs and neural network processors. Cooper Lake, which will be delivered in the first half of this year, will bring it to the Xeon product line for the first time.
Cooper Lake still uses the 14nm process and the Skylake architecture, with up to 56 cores, although the socket changes to LGA4189. AVX512_BF16 is arguably its only bright spot, and with prospects that limited, Intel has repositioned it for the very small four-socket and eight-socket market.
More eagerly awaited is the next generation of Xeon, the 10nm Ice Lake, which brings a new process and a new architecture and is due later this year. According to Intel’s latest documentation, Ice Lake adds new instruction sets such as PCONFIG, WBNOINVD, MKTME and ENCLV, but AVX512_BF16 has somehow disappeared, confirming earlier speculation.
Intel didn’t explain why. The most likely reason is that Ice Lake’s new architecture was designed without AVX512_BF16 in mind.
Fortunately, the second 10nm generation, Sapphire Rapids, will support AVX512_BF16 again and will add a large wave of other instruction sets and features, including AVX512_VP2INTERSECT, CET, ENQCMD, PTWRITE, TPAUSE/UM, Arch LBRs, HLAT, SERIAL and TSXLDTRK. It remains unclear, however, whether PCONFIG, WBNOINVD, MKTME, ENCLV and the others will continue to be supported.
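Because support comes and goes between generations, software cannot infer AVX512_BF16 from the CPU family and has to check for it at run time. As a rough sketch, assuming a Linux system where the kernel reports the feature as the avx512_bf16 flag in /proc/cpuinfo, the check could look like this (the helper names are invented for illustration):

```python
def has_cpu_flag(cpuinfo_text: str, flag: str) -> bool:
    """Return True if `flag` appears in any 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags" and flag in value.split():
            return True
    return False


def host_supports_avx512_bf16(path: str = "/proc/cpuinfo") -> bool:
    """Best-effort check on Linux; returns False if the file cannot be read."""
    try:
        with open(path, encoding="utf-8") as f:
            return has_cpu_flag(f.read(), "avx512_bf16")
    except OSError:
        return False


if __name__ == "__main__":
    print("AVX512_BF16 available:", host_supports_avx512_bf16())
```

On other operating systems the same information has to come from the CPUID instruction or a platform API instead.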
Sapphire Rapids will be released next year, and Aurora, America’s next top supercomputer, will be built on the platform (there is also an AMD version).
On the consumer side, the latest document also confirms Alder Lake, which is expected to bring the 10nm process to the desktop but is rumoured to change the socket yet again, this time to LGA1700.