I'm not sure that unified memory is particularly relevant for that. E.g. on Zen 4/Zen 5 EPYC there is more than enough arithmetic power that LLM inference is purely memory-bandwidth limited.

On dual-socket (SP5) EPYC I believe the memory bandwidth is somewhat greater than this Apple product's too... and at Apple's price points you can have about twice the RAM.
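
To put rough numbers on the bandwidth-limited argument, here is a back-of-the-envelope sketch. It assumes 12 channels of DDR5-4800 per SP5 socket and 8 bytes per transfer per channel, and it assumes every generated token streams all of the weights from RAM once. The 40 GB model size is a made-up illustration, and the function names are mine. These are theoretical peaks; real dual-socket numbers will be lower because of NUMA and achievable efficiency.

```python
def peak_bandwidth_gbs(channels, mt_per_s, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes)."""
    # channels * mega-transfers/s * bytes per transfer = MB/s; divide by 1000 for GB/s
    return channels * mt_per_s * bytes_per_transfer / 1000


def max_tokens_per_s(bandwidth_gbs, model_gb):
    """Upper bound on decode speed if each token reads all weights once."""
    return bandwidth_gbs / model_gb


# Assumption: 12 x DDR5-4800 channels per socket, two sockets.
one_socket = peak_bandwidth_gbs(12, 4800)
two_sockets = 2 * one_socket

# Hypothetical model: 40 GB of weights resident in RAM.
model_gb = 40.0

print(f"one socket:  {one_socket:.1f} GB/s")
print(f"two sockets: {two_sockets:.1f} GB/s")
print(f"upper bound: {max_tokens_per_s(two_sockets, model_gb):.1f} tokens/s")
```

The point is just that the ceiling scales with bandwidth divided by model size, so adding compute does nothing once you are at that ceiling.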

Presumably the Apple solution is more power efficient.