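There may also be a difference in the order in which the operations are executed. I suppose the dot product on the CPU is accumulated sequentially, while on the GPU it has to be done as a parallel reduction. Floating-point addition is not associative, so the two orderings can round differently and give slightly different results.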
Example:
CPU:
(((a + b) + c) + d)
GPU:
((a + b) + (c + d))
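To make the effect concrete, here is a minimal sketch in plain Python that computes the same dot product with both groupings. The function names and the input values are made up for illustration, and the pairwise tree is only a stand-in for a GPU reduction; the actual order used by a given GPU library or kernel may well be different.

```python
def sequential_dot(x, y):
    """Accumulate left to right: (((p0 + p1) + p2) + p3)."""
    total = 0.0
    for xi, yi in zip(x, y):
        total += xi * yi
    return total


def pairwise_sum(values):
    """Sum by splitting in halves: ((p0 + p1) + (p2 + p3))."""
    if not values:
        return 0.0
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return pairwise_sum(values[:mid]) + pairwise_sum(values[mid:])


def pairwise_dot(x, y):
    """Tree-style reduction over the elementwise products (illustrative only)."""
    return pairwise_sum([xi * yi for xi, yi in zip(x, y)])


# Hypothetical inputs chosen so that rounding is visible in double precision:
# near 1e16 the spacing between adjacent doubles is 2, so adding 1.0 is lost.
x = [1e16, 1.0, 1.0, 1.0]
y = [1.0, 1.0, 1.0, 1.0]

seq = sequential_dot(x, y)   # each +1.0 is rounded away one at a time
tree = pairwise_dot(x, y)    # 1.0 + 1.0 = 2.0 survives when added to 1e16

print(tree - seq)  # 2.0
```

Neither result is "wrong" here; both are valid roundings of the same sum, which is why small CPU/GPU mismatches are usually compared with a tolerance rather than with exact equality.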