Hi, I'm using PyTorch right now and there is one problem I've run into.
I previously used the Torch7 framework, where any criterion lets me explicitly get the df_do, i.e. the error back-propagated into the network.
I'm wondering: can I obtain that in PyTorch?
I need to modify it during training.
Looking at the examples, the code is like this:

```python
optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
```
So the whole network is back-propagated when I call loss.backward()?
You can add hooks to the variables (or modules) of the network.
This is explained in this part of the documentation.
In your case, you could add a hook on the loss to change the gradients as you wish.
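A minimal sketch of that idea (the tiny linear model, MSE loss, and tensor shapes here are made up for illustration): registering a hook on the output tensor lets you both inspect the back-propagated error (the df_do) and return a modified version of it before it flows into the network.

```python
import torch

model = torch.nn.Linear(4, 2)       # toy model, just for illustration
criterion = torch.nn.MSELoss()
data = torch.randn(8, 4)
target = torch.randn(8, 2)

captured = {}

def hook(grad):
    captured["df_do"] = grad.clone()  # inspect the back-propagated error
    return grad * 0.5                 # optionally rescale it before it continues

output = model(data)
output.register_hook(hook)            # must be registered before backward()
loss = criterion(output, target)
loss.backward()                       # hook fires here; model grads use the scaled df_do
```

The hook receives the gradient with respect to `output`; whatever it returns replaces that gradient for the rest of the backward pass.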
Thanks!
I wasn’t so sure what a hook meant when I first read the documentation,
but as you point out, I think I get it now.
I’ll take a look at the documentation.
Hi Francisco,
Can a user-defined hook function take any arguments?
For example, for different batches of input data, the hook might need to behave differently based on some input.
I couldn’t find related answers.
You need to make the hook a closure - just redefine it at every step, and use the data in its body:
```python
optimizer.zero_grad()
output = model(data)

def hook(d_output):
    return d_output * data.mean()  # you can use the data here

output.register_hook(hook)
loss = criterion(output, target)
loss.backward()
optimizer.step()
```
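If redefining the hook inline at every step feels awkward, the same closure trick can be packaged as a small factory function that builds a fresh hook for each batch. A sketch, assuming a hypothetical `make_hook` helper and a toy model (neither is from the thread):

```python
import torch

def make_hook(batch):
    # Hypothetical factory: returns a closure over this batch,
    # so each training step gets batch-specific gradient behavior.
    def hook(d_output):
        return d_output * batch.mean()
    return hook

model = torch.nn.Linear(3, 1)       # toy model for illustration
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(2):                  # two illustrative steps with fresh data
    data = torch.randn(5, 3)
    target = torch.randn(5, 1)
    optimizer.zero_grad()
    output = model(data)
    output.register_hook(make_hook(data))
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()
```

Note that the hook must be re-registered on the new `output` tensor each step anyway, since a fresh graph is built every forward pass.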