
3 Video Links: Unlock Skylake X full performance via AVX512

restsugavan
Level 13
It has been 3 years since the Skylake X SKUs were released, but it looks like people still can't use the microarchitecture's full potential.
I found the most interesting videos about AVX512, linked below.

https://www.youtube.com/watch?v=D-mM6X5xnTY
https://www.youtube.com/watch?v=I3efQKLgsjM
https://www.youtube.com/watch?v=543a1b-cPmU

I hope everyone finds the path to unlocking the full performance of the Skylake X SKUs. It can also be applied to Ice Lake X, Tiger Lake, Rocket Lake, Alder Lake and
the new wave of Intel CPUs with AVX512 support. :cool:
W11CANARY 26100.1 Core i9 7980XE 02007108 MCE ME 11.12.95.2499 R6E Modified BIOS 3801 SAMSUNG OG9 FW 1019.0 SSD 970 EVO PLUS 1 TB x 3 NVIDIA RTX 4090 GAME READY 552.22 64GB GSKILL DDR4 3200MHz JBL 9.1 Sound Bar DTS-X
1,104 Views
1 REPLY 1

Int8bldr
Level 12
There are a lot of misconceptions about AVX-512, and it has an undeservedly bad reputation (Linus Torvalds's comments did not help either).
Some so-called experts are even saying that there are only a dozen or so people in the world who can use AVX-512 properly to extract performance out of these Intel CPUs.

The truth is that you need real programming skills to vectorize your code, but the rewards are amazing!
Speedups of 10x are not uncommon.
BUT it requires you to actually know how to design and build your systems vectorized!
You cannot just take normal sequential code written in some high-level language and hope that the compiler will somehow find a way to compile it into an efficient AVX-512 binary. This is not going to happen!
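To make "writing it vectorized yourself" a bit more concrete, here is a minimal sketch in C of a y = a*x + y loop written with AVX-512 intrinsics. The function name `saxpy` and the 16-floats-per-step layout are just my choices for illustration, not anything taken from the videos. The intrinsic path is only compiled when the compiler defines `__AVX512F__` (for example with `-mavx512f` on GCC or Clang); otherwise the plain scalar loop does all the work.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper: y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(size_t n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);
    size_t i = 0;
#if defined(__AVX512F__)
    /* Broadcast the scalar into all 16 float lanes of a 512-bit register. */
    const __m512 va = _mm512_set1_ps(a);
    for (; i + 16 <= n; i += 16) {
        __m512 vx = _mm512_loadu_ps(x + i);  /* unaligned load of 16 floats */
        __m512 vy = _mm512_loadu_ps(y + i);
        vy = _mm512_fmadd_ps(va, vx, vy);    /* vy = va * vx + vy, fused */
        _mm512_storeu_ps(y + i, vy);
    }
#endif
    /* Scalar loop: handles the tail, or everything when AVX-512 is off. */
    for (; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

The point of the sketch is that the 16-wide structure is in the source code itself, not something the compiler has to discover.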

You have to design your system with a SIMD target platform in mind from the very beginning and vectorize your algorithms.
This requires an understanding of linear algebra and multivariable calculus, which makes the threshold high.
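One common way "designing for SIMD from the beginning" shows up is in data layout. The sketch below is only an illustration under my own assumptions (the struct and function names are hypothetical): it computes squared lengths of 3D points stored as an array of structures and as a structure of arrays. Both give the same numbers, but in the second layout each field is contiguous in memory, so 16 consecutive x values can be fetched with one 512-bit load instead of being picked out from between the y and z values.

```c
#include <assert.h>
#include <stddef.h>

/* Array-of-structures: x, y, z of one point sit next to each other,
   so consecutive x values are 3 floats apart (strided access). */
struct point_aos { float x, y, z; };

/* Structure-of-arrays: every field is its own contiguous array,
   so consecutive x values are adjacent (unit-stride access). */
struct points_soa { const float *x; const float *y; const float *z; size_t n; };

void norms_sq_aos(const struct point_aos *p, size_t n, float *out)
{
    assert(p != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = p[i].x * p[i].x + p[i].y * p[i].y + p[i].z * p[i].z;
}

void norms_sq_soa(const struct points_soa *p, float *out)
{
    assert(p != NULL && out != NULL);
    /* Each iteration touches element i of three separate arrays;
       16 iterations correspond to three plain 16-float loads. */
    for (size_t i = 0; i < p->n; ++i)
        out[i] = p->x[i] * p->x[i] + p->y[i] * p->y[i] + p->z[i] * p->z[i];
}
```

Changing from the first layout to the second late in a project usually means touching every piece of code that uses the data, which is why it pays to decide it up front.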

BUT again, the payback is enormous if you find a way to write efficient code that uses AVX-512. I've seen speedups of 100x for certain workloads, especially when you can organize the problem and the data into large matrices and vectors.

Matrix-matrix multiplication is one example of this. The bottleneck is no longer the CPU but the memory subsystem: AVX-512 just plows through these workloads, and a 10980XE has 2 (!) AVX-512 FMA units per core, that's right, 2, so effectively you have 36 FMA units in a 10980XE, not 18. Basically, the memory subsystem cannot feed the CPU fast enough, which requires you to align data for optimal reading and writing (with optimal cache access, etc.).
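To put rough numbers on that, here is a small sketch in C. The helper names are hypothetical and it only covers two things: the per-cycle arithmetic peak implied by 18 cores with 2 FMA units each, and allocating a buffer on a 64-byte boundary (the size of one 512-bit register and of one cache line on these CPUs) with C11 `aligned_alloc`. It deliberately leaves clock speed out, since the sustained frequency under AVX-512 load depends on the chip and settings.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Single-precision floating-point operations per clock cycle, chip-wide.
   Each FMA unit processes vector_bits / 32 floats and counts as
   2 operations per lane (one multiply plus one add). */
unsigned peak_flops_per_cycle(unsigned cores, unsigned fma_units,
                              unsigned vector_bits)
{
    unsigned lanes = vector_bits / 32;  /* 512 / 32 = 16 floats */
    return cores * fma_units * lanes * 2;
}

/* Allocate room for n floats starting on a 64-byte boundary.
   aligned_alloc expects the size to be a multiple of the alignment,
   so the byte count is rounded up. Release the buffer with free(). */
float *alloc_floats_64(size_t n)
{
    size_t bytes = n * sizeof(float);
    size_t rounded = (bytes + 63) / 64 * 64;
    if (rounded == 0)
        rounded = 64;
    return aligned_alloc(64, rounded);
}
```

With 18 cores, 2 FMA units and 512-bit vectors, `peak_flops_per_cycle(18, 2, 512)` comes out at 1152 operations per cycle, which gives a feel for how much data the memory subsystem would have to deliver to keep the units busy.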


In my view, AVX-512 is an absolute pleasure to work with! It is so incredibly powerful and rewarding! It's not an obscurity, BUT it requires the developer to design and build vectorized code, not hope the compiler will somehow do that for you!