I don't have prefill numbers at hand, but the 4090 has ~4x the memory bandwidth, and since single-stream decode is typically memory-bandwidth-bound, that should translate to roughly 4x faster decode.
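For a rough sense of where that ratio comes from, here is a back-of-the-envelope sketch. It assumes decode reads every weight once per generated token, so the ceiling on tokens/s is bandwidth divided by model size. The 1008 GB/s figure is the 4090's published spec; the baseline bandwidth (simply 1/4 of that) and the 4 GB model size are placeholders I made up for illustration, not measurements.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough upper bound on decode speed, assuming each generated token
    requires streaming all model weights from memory once."""
    return bandwidth_gb_s / model_size_gb


RTX_4090_BANDWIDTH_GB_S = 1008.0  # published spec for the RTX 4090
# Assumption: the other device has ~1/4 of that bandwidth (hypothetical value).
BASELINE_BANDWIDTH_GB_S = RTX_4090_BANDWIDTH_GB_S / 4
# Assumption: a hypothetical 4 GB quantized model that fits on both devices.
MODEL_SIZE_GB = 4.0

fast = decode_tokens_per_second(RTX_4090_BANDWIDTH_GB_S, MODEL_SIZE_GB)
slow = decode_tokens_per_second(BASELINE_BANDWIDTH_GB_S, MODEL_SIZE_GB)
speedup = fast / slow

print(f"4090 ceiling:     {fast:.0f} tok/s")
print(f"baseline ceiling: {slow:.0f} tok/s")
print(f"speedup:          {speedup:.1f}x")
```

The model size cancels out of the ratio, so the speedup tracks the bandwidth ratio regardless of which model you pick. Real throughput will sit below these ceilings, and this says nothing about prefill, which tends to be compute-bound instead.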