I still don't see any misconceptions. The reason the kernel uses SIMD with explicit FPU save/restore is to keep context switches cheap. We addressed the topic in https://netdevconf.info/0x12/session.html?kernel-tls-handsha... . I guess early versions of WireGuard used the same approach: save the FPU context at the beginning of the softirq, process many packets with SIMD in one shot, and then restore the FPU state.
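A rough userspace sketch of that batching idea, in C. It is not kernel code: `fpu_begin()` and `fpu_end()` are hypothetical stand-ins for the kernel's `kernel_fpu_begin()`/`kernel_fpu_end()` and only count calls, and `process_packet()` is a placeholder XOR loop rather than real SIMD crypto. The only point is how many save/restore pairs each strategy pays for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel_fpu_begin()/kernel_fpu_end().
 * They only count how often FPU state would be saved and restored. */
static unsigned long fpu_saves, fpu_restores;

static void fpu_begin(void) { fpu_saves++; }
static void fpu_end(void)   { fpu_restores++; }

/* Placeholder for the SIMD work done on one packet (assumption: any
 * per-packet transform would do; a plain XOR is used here). */
static void process_packet(uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++)
        data[i] ^= 0x5a;
}

/* Naive variant: one save/restore pair per packet. */
static void process_per_packet(uint8_t **pkts, const size_t *lens, size_t n)
{
    assert(n == 0 || (pkts != NULL && lens != NULL));
    for (size_t i = 0; i < n; i++) {
        fpu_begin();
        process_packet(pkts[i], lens[i]);
        fpu_end();
    }
}

/* Batched variant: save once, process every queued packet, restore once. */
static void process_batched(uint8_t **pkts, const size_t *lens, size_t n)
{
    assert(n == 0 || (pkts != NULL && lens != NULL));
    if (n == 0)
        return;             /* nothing queued: do not touch FPU state */
    fpu_begin();
    for (size_t i = 0; i < n; i++)
        process_packet(pkts[i], lens[i]);
    fpu_end();
}
```

With N queued packets, `process_per_packet()` pays for N save/restore pairs while `process_batched()` pays for one, which is where the saving comes from when a softirq handles a burst of packets.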
There are also other issues with kernel code, which I addressed in the article; they are why it doesn't make sense (though it is still possible) to use C++ in the kernel.