How can I update a PyTorch model with supervision provided by a Caffe model?

I want to do something like this:

# pytorch_model is being trained; caffe_model is frozen
torch_out = pytorch_model(input)
caffe_out = caffe_model(torch_out)
loss = criterion(caffe_out, label)
loss.backward() # or something like torch_out.backward()

I can easily get the gradient with respect to torch_out from caffe_model.backward(), but how can I use it to update pytorch_model?

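Re-implement the Caffe model in PyTorch, integrate the two models at the code level, and then load the saved Caffe model parameters into the new PyTorch model. Once both parts live in a single autograd graph, loss.backward() propagates gradients through the frozen part into pytorch_model.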
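
The steps above could look roughly like the sketch below. It is only an illustration under several assumptions: the Caffe model is taken to be a single fully connected layer, the class name CaffeReimpl and the layer name "fc" are hypothetical, and random NumPy arrays stand in for the weights that would be exported from the real Caffe model.

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-ins for parameters exported from the Caffe model. With pycaffe these
# would typically be read from something like net.params["fc"][0].data and
# net.params["fc"][1].data; the layer name "fc" is an assumption.
rng = np.random.default_rng(0)
caffe_weight = rng.standard_normal((3, 8)).astype(np.float32)  # (out, in)
caffe_bias = rng.standard_normal(3).astype(np.float32)


class CaffeReimpl(nn.Module):
    """Hypothetical PyTorch re-implementation of a one-layer Caffe model."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 3)

    def forward(self, x):
        return self.fc(x)


# Load the saved Caffe parameters into the re-implemented model.
# Check the weight layout of each layer type before copying.
caffe_model = CaffeReimpl()
with torch.no_grad():
    caffe_model.fc.weight.copy_(torch.from_numpy(caffe_weight))
    caffe_model.fc.bias.copy_(torch.from_numpy(caffe_bias))

# Freeze it: gradients still flow through it, but its weights are not updated.
for p in caffe_model.parameters():
    p.requires_grad_(False)
caffe_model.eval()

# The model to train (a placeholder architecture for this sketch).
pytorch_model = nn.Linear(4, 8)
optimizer = torch.optim.SGD(pytorch_model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

# Dummy batch: 5 samples, 4 features, 3 classes.
inputs = torch.randn(5, 4)
labels = torch.randint(0, 3, (5,))

weights_before = pytorch_model.weight.detach().clone()

# One training step, mirroring the pseudocode in the question.
optimizer.zero_grad()
torch_out = pytorch_model(inputs)
caffe_out = caffe_model(torch_out)
loss = criterion(caffe_out, labels)
loss.backward()
optimizer.step()
```

Only pytorch_model's parameters are passed to the optimizer, so the re-implemented Caffe part acts purely as a fixed, differentiable function between torch_out and the loss.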