AVX512 has so many more features above-and-beyond Intel's typical SIMD implement...

jabl · on Dec 22, 2017

I wonder why they did the VL thing instead of just defining a vector length register like other vector ISAs (which would have made it possible to get rid of the remainder loop, leading to less code bloat and more efficient execution for short loops where the number of iterations is not an integer multiple of the ISA vector length).
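To make the remainder-loop point concrete, here is a rough sketch in plain Python that only emulates the two loop shapes; it is not real SIMD code. `VLMAX`, `add_fixed_width` and `add_vector_length` are made-up names for illustration, and 16 lanes is an assumption (what a 512-bit register holds with 32-bit elements).

```python
VLMAX = 16  # assumed lane count: 512-bit register / 32-bit elements


def add_fixed_width(a, b):
    """Fixed-width style: full-width main loop plus a scalar remainder loop."""
    n = len(a)
    out = [0] * n
    i = 0
    # Main loop: only runs while a full VLMAX-wide chunk is available.
    while i + VLMAX <= n:
        out[i:i + VLMAX] = [x + y for x, y in zip(a[i:i + VLMAX], b[i:i + VLMAX])]
        i += VLMAX
    # Remainder loop: leftover elements are handled one at a time.
    while i < n:
        out[i] = a[i] + b[i]
        i += 1
    return out


def add_vector_length(a, b):
    """Vector-length-register style: one loop, vl shrinks on the last pass."""
    n = len(a)
    out = [0] * n
    i = 0
    while i < n:
        # Emulates setting the vector length register for this iteration.
        vl = min(VLMAX, n - i)
        out[i:i + vl] = [x + y for x, y in zip(a[i:i + vl], b[i:i + vl])]
        i += vl
    return out
```

Both functions return the same result; the second one just has no separate tail loop, which is the code-size saving being described.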

thecompilr · on Dec 22, 2017

VL - Extend AVX512 to operate on only 256 bits or 128 bits at a time. (vector length extension)

DQ - Extend AVX512 to 32-bit and 64-bit integers. (doubleword and quadword extension)

BW - Extend AVX512 to Bytes, Shorts. (byte and word extension)

dman · on Dec 22, 2017

Intel scored an own goal here by limiting availability of AVX512 to select Xeon SKUs.