Here for the example mentioned in the blog Making Deep Learning go Brrrr From First Principles. Example here.
Why is memory bw capped at 250 GB/sec, where as Nvidia A100 has peak of 1.9 TB/sec?
Why is it memory bound at 250 GB/sec?
Here for the example mentioned in the blog Making Deep Learning go Brrrr From First Principles. Example here.
Why is memory bw capped at 250 GB/sec, where as Nvidia A100 has peak of 1.9 TB/sec?
Why is it memory bound at 250 GB/sec?