Some questions regarding TPU

  1. Which is preferred: using a TPU, or multiple GPUs in parallel?
  2. Are we going to see TPUs used in parallel in the future, similar to how GPUs are used in parallel today?
  3. All of the GPUs I have heard of also work in gaming / video rendering, but I never hear of TPUs being used in those areas. How can they be designed primarily for neural networks and yet not deliver improved performance in these other areas?
  4. When it comes to GPUs, people have them in their personal computers too, but there is nothing like that for TPUs. Why do we only hear about cloud TPUs, and nothing like personal TPUs?
  5. On Wikipedia it says, “Compared to a graphics processing unit, it is designed for a high volume of low precision computation (e.g. as little as 8-bit precision)[7] with more input/output operations per joule, and lacks hardware for rasterisation/texture mapping.” Does this mean that most upcoming neural network training will use quantized weight parameters (for example, 32-bit reduced to 8-bit), trained on TPUs?
  6. If 5 is true, is the reason we do not use TPUs for games that representing each pixel with fewer bits would make graphics processing faster but would not look convincing, so it is avoided?
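To make question 5 concrete, here is a minimal sketch of what "making 32-bit into 8-bit" typically means: symmetric linear quantization of a weight tensor. This is my own illustrative NumPy code, not any particular TPU or framework API; the function names are hypothetical.

```python
import numpy as np

def quantize_int8(weights):
    """Map a float32 tensor to int8 with one shared scale factor."""
    scale = np.abs(weights).max() / 127.0          # largest value maps to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 values from the int8 representation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)   # stand-in for a weight tensor

q, scale = quantize_int8(w)
w_approx = dequantize(q, scale)

print(q.dtype)                              # int8: 4x smaller than float32
print(float(np.abs(w - w_approx).max()))    # small per-weight rounding error
```

The point is that the int8 tensor occupies a quarter of the memory and can be processed with cheap 8-bit arithmetic, at the cost of a small rounding error per weight, which neural networks usually tolerate but rendered pixels (question 6) may not.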