Autograd Function vs nn.Module?

Shihan_su · March 22, 2017, 9:09pm

Hi, I am new to PyTorch. I want to implement a custom layer and insert it between two LSTM layers within an RNN.
The layer should take an input h and do the following:

parameters = W @ h + b  # W is the weight of the layer, b its bias
a = parameters[0:x]
b = parameters[x:2*x]
k = parameters[2*x:]
return some_complicated_function(a, b, k)

It seems that both autograd Function and nn.Module are used to design custom layers.
My questions are:

What is the difference between them in the single-layer case?
An autograd Function usually takes weights as input arguments. Can it store weights internally?
Which one should I pick for my implementation?
When do I need to specify a backward function if the gradients are all computed automatically?

Thanks!

albanD · March 23, 2017, 10:01am

Hi,

The post "Difference of methods between torch.nn and functional" should answer most of your questions.

2: I would say nn.Module, since you have parameters.
3: You need to specify the backward function if you implement a Function, because it works with Tensors. An nn.Module, on the other hand, works with Variables and is therefore differentiated by autograd.
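
To make the two points above concrete, here is a minimal sketch rather than a definitive implementation. It assumes a recent PyTorch release, where Variable has been merged into Tensor and a Function is written with static forward/backward methods and called through apply. The names SplitLayer and SquareFn are hypothetical, the output size 3 * x (so that k also has length x) is an assumption, and the expression in forward is only a placeholder for some_complicated_function.

```python
import torch
import torch.nn as nn


class SquareFn(torch.autograd.Function):
    """Toy custom Function: forward computes x**2, backward is written by hand."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # keep x for the backward pass
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * 2 * x  # chain rule: d(x^2)/dx = 2x


class SplitLayer(nn.Module):
    """Hypothetical layer: affine map, then split the result into a, b, k."""

    def __init__(self, in_features, x):
        super().__init__()
        self.x = x
        # nn.Linear stores W and the bias as parameters of the module.
        # Assumption: the output has size 3 * x, so a, b and k each have length x.
        self.linear = nn.Linear(in_features, 3 * x)

    def forward(self, h):
        params = self.linear(h)
        a = params[..., : self.x]
        b = params[..., self.x : 2 * self.x]
        k = params[..., 2 * self.x :]
        # Placeholder for some_complicated_function(a, b, k).
        # The built-in ops are differentiated by autograd automatically;
        # SquareFn uses the hand-written backward defined above.
        return a * torch.tanh(b) + SquareFn.apply(k)


layer = SplitLayer(in_features=8, x=4)
h = torch.randn(5, 8)      # e.g. a batch of 5 hidden states from the first LSTM
out = layer(h)             # shape (5, 4)
out.sum().backward()       # fills .grad on layer.linear.weight and layer.linear.bias
```

The module owns the weights and needs no backward method as long as its forward is built from differentiable operations. A hand-written backward is only required for the part wrapped in a custom Function.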