Unclear about tensor dimensions when extending PyTorch with a custom function

I’m unclear about the dimensions of the tensors PyTorch expects when implementing the forward() and backward() steps for a custom function. Basically, the function has the signature f(x, y) = z, where x is an n-vector, y an m-vector, and z a k-vector.

Now backward() gets a gradient tensor back (grad_output, in PyTorch’s terminology). Should it be k-dimensional? Is it batched? And what should be the dimensions of the pair of tensors I return?

The current documentation seems to cover everything except the dimensions, and since the example is quite simple, it doesn’t showcase this.

It depends a lot on what sort of layers the data has already been through, so there is no single answer. You have to work it out by checking the docs for similar existing layers; a quick shape check is sketched after the list.

  • A Conv2d layer takes data of shape (batches, channels, height, width) and outputs data of shape (batches, filters, height, width), where the spatial dimensions may change depending on kernel size, stride, and padding.
  • An LSTM takes data of shape (timesteps, batches, features) unless you specify batch_first=True.
  • A Linear layer takes data of shape (batches, *, in_features), where * represents zero or more additional dimensions, and returns data of shape (batches, *, out_features).
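
A quick way to confirm these shapes is to instantiate the layers and print what comes out. A minimal sketch (the layer sizes here are arbitrary, chosen only for illustration):

```python
import torch
import torch.nn as nn

# Conv2d: (batches, channels, height, width) -> (batches, filters, height', width')
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
print(conv(torch.randn(8, 3, 32, 32)).shape)  # torch.Size([8, 16, 32, 32])

# LSTM: (timesteps, batches, features) by default
lstm = nn.LSTM(input_size=10, hidden_size=20)
out, _ = lstm(torch.randn(5, 8, 10))
print(out.shape)                              # torch.Size([5, 8, 20])

# Linear: only the last dimension changes, extra dimensions pass through
linear = nn.Linear(in_features=10, out_features=4)
print(linear(torch.randn(8, 7, 10)).shape)    # torch.Size([8, 7, 4])
```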

In your example, I think forward() should take tensors of shape (batches, n) and (batches, m) and should return something of shape (batches, k).

If your function can be built out of existing differentiable PyTorch operations, you don’t have to code backward() at all; autograd derives it for you. If you do write a custom torch.autograd.Function, grad_output will be of shape (batches, k) and backward() should return tensors of shape (batches, n) and (batches, m), one gradient per input to forward(). That part follows directly once you know what shapes forward() takes and returns.
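
For the case where you do write your own Function, here is a minimal sketch of the shape handling. The function itself, f(x, y) = x @ A.T + y @ B.T with fixed matrices A and B, is a made-up example chosen only because its gradients are easy to write down; the point is the shapes:

```python
import torch

class AffinePair(torch.autograd.Function):
    # Hypothetical f(x, y) = x @ A.T + y @ B.T with fixed A: (k, n), B: (k, m)

    @staticmethod
    def forward(ctx, x, y, A, B):
        # x: (batches, n), y: (batches, m) -> z: (batches, k)
        ctx.save_for_backward(A, B)
        return x @ A.T + y @ B.T

    @staticmethod
    def backward(ctx, grad_output):
        # grad_output: (batches, k), the gradient of the loss w.r.t. z
        A, B = ctx.saved_tensors
        grad_x = grad_output @ A  # (batches, k) @ (k, n) -> (batches, n)
        grad_y = grad_output @ B  # (batches, k) @ (k, m) -> (batches, m)
        # One gradient per forward() input; A and B need none, so return None
        return grad_x, grad_y, None, None

# Shape check with batches=8, n=5, m=3, k=2
x = torch.randn(8, 5, requires_grad=True)
y = torch.randn(8, 3, requires_grad=True)
A, B = torch.randn(2, 5), torch.randn(2, 3)
z = AffinePair.apply(x, y, A, B)   # z: (8, 2)
z.sum().backward()
print(x.grad.shape, y.grad.shape)  # torch.Size([8, 5]) torch.Size([8, 3])
```

Note that backward() returns one gradient per argument of forward(), in order, with None for anything that doesn’t need a gradient.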
