Can I compare CUDA and CPU operations in a hook?

I want to do an accuracy comparison inside a forward hook.

In each hook, every nn.Module should run once on the CUDA device and once on the CPU.

Then I compare whether the two results are consistent (within an allowable tolerance).

Could someone guide me on how to do this? Can it be done with hooks (register_module_forward_hook, register_module_full_backward_hook)?

Here’s what I’m thinking (probably all wrong):

import copy

# 1. Deep-copy the module, input, and output in the hook.
module_copy = copy.deepcopy(module)
input_copy = copy.deepcopy(input)
output_copy = copy.deepcopy(output)

# 2. Move the copied module to the CPU.
module_copy = module_copy.to('cpu')

# 3. Call the function, maybe like this

# 4. Compare output_copy with the result of step 3

This sounds OK. Are you running into any issues?