Using forward hook for data parallel with multiple GPUs

For those who still need help, I solved the problem by applying the solution from https://discuss.pytorch.org/t/aggregating-the-results-of-forward-backward-hook-on-nn-dataparallel-multi-gpu/28981/10.
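For anyone who wants a concrete starting point, here is a minimal sketch of that aggregation idea (not the exact code from the linked thread). It assumes a torchvision `resnet18` with a hook on `avgpool` purely for illustration; the key points are that each `nn.DataParallel` replica calls the hook in its own thread, so the collector guards its state with a lock, stores each replica's output keyed by device, and concatenates the chunks afterwards in device order.

```python
import threading

import torch
import torch.nn as nn
import torchvision.models as models


class FeatureCollector:
    """Forward hook that aggregates outputs from every DataParallel replica."""

    def __init__(self):
        self._lock = threading.Lock()   # replicas run in separate threads
        self._outputs = {}              # device -> tensor

    def __call__(self, module, inputs, output):
        with self._lock:
            # Keep each chunk on its own GPU; move/concatenate later.
            self._outputs[output.device] = output.detach()

    def gather(self, target_device="cuda:0"):
        # Concatenate per-GPU chunks in device-index order so the result
        # matches the original (un-scattered) batch order.
        devices = sorted(self._outputs, key=lambda d: d.index)
        return torch.cat(
            [self._outputs[d].to(target_device) for d in devices], dim=0
        )


# Register the hook before wrapping: the replicas created by DataParallel
# keep the hook, so the same collector is called once per GPU per forward.
model = models.resnet18(weights=None)
collector = FeatureCollector()
model.avgpool.register_forward_hook(collector)

model = nn.DataParallel(model).cuda()
x = torch.randn(8, 3, 224, 224).cuda()
_ = model(x)

features = collector.gather()
print(features.shape)   # e.g. torch.Size([8, 512, 1, 1])
```

Clear `collector._outputs` (or create a fresh collector) between forward passes if you only want the activations of the latest batch.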
