Setting part of the training value to zero

When training my network, I want to set a specific part of the latent space to zero based on the label.

For example, after the input passes through a convolution layer, a 4-D tensor of shape (B, C, H, W) comes out.

I want to set half of the channel dimension to zero based on the label.

[image: code that assigns zero to half of the channels]

Like this, but I got this error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 128, 15, 15]], which is output 0 of ReluBackward1, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

How can I assign zero to a specific part of the tensor?
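In case it helps, here is roughly what I am doing (a simplified reconstruction with made-up layer sizes, not my exact model):

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 128, 3)
relu = nn.ReLU(inplace=True)

x = torch.randn(1, 3, 17, 17)
h = relu(conv(x))   # (B, C, H, W) = (1, 128, 15, 15)
h[:, :64] = 0       # zero the first half of the channels, in place
h.sum().backward()  # raises the RuntimeError above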


Hi,

This problem arises in two kinds of situations:

  1. When you use inplace=True in your code and, depending on your model, backpropagation needs the values unchanged.
  2. When you change a value in place, like in the image you attached, and backprop needs the previous value again to compute the gradient.

In the first situation, if you cannot change the way you have implemented your model, you can just set inplace=False and it will be fine.
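Here is a toy illustration of the first situation (not your model; it uses the functional relu_, which is what inplace=True calls under the hood):

import torch

x = torch.randn(5, requires_grad=True)
y = torch.sigmoid(x)       # sigmoid saves its output for backward
torch.relu_(y)             # in-place ReLU overwrites that saved output
try:
    y.sum().backward()
except RuntimeError as e:
    print(e)               # ... modified by an inplace operation ...

x = torch.randn(5, requires_grad=True)
y = torch.sigmoid(x)
z = torch.relu(y)          # out-of-place: y stays intact
z.sum().backward()         # works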

But in the second one, you have to preserve the original value; let's call it X.

Here is a small example you can test:

import torch

A = torch.nn.Parameter(torch.randn(3, 3))
B = torch.randn(2, 4, 3)
B[0, :, :] = B[0, :, :].mm(A)  # in-place write into B
loss = B.sum()
loss.backward()  # RuntimeError: mm's backward needs the original B[0]

This will give you the same error you posted. And here is the solution:

A = torch.nn.Parameter(torch.randn(3, 3))
B = torch.randn(2, 4, 3)
z = B[0, :, :].mm(A)  # result goes into a new tensor; B is untouched
loss = z.sum()
loss.backward()  # works: B still holds the values the backward needs

As you can see, in the second example the matrix B (X in your case) has not been changed, so the gradient could be computed.

By the way, operations like x += y, x[:, ...] = y, etc. are considered in-place operations.
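So for your channel case, one way that avoids any in-place write on X is to multiply by a mask instead of assigning into X. A sketch with made-up shapes (the label logic is just an assumption about your setup):

import torch

x = torch.randn(1, 128, 15, 15, requires_grad=True)  # (B, C, H, W)
label = 0  # assumed: the label decides which half of the channels to zero

mask = torch.ones_like(x)          # mask has no grad history,
if label == 0:                     # so writing into it is safe
    mask[:, :x.size(1) // 2] = 0   # zero the first half of the channels
else:
    mask[:, x.size(1) // 2:] = 0   # zero the second half

out = x * mask                     # out-of-place: x itself is untouched
out.sum().backward()               # no RuntimeError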

Good luck


So, I think I solved the problem.

  1. I used inplace=False in the previous layer,
  2. assigned zero to the specific part of the tensor, and
  3. sent it to the next layer (see the sketch after this list).
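In code, it looks roughly like this (a sketch with made-up layers, not my exact model; I clone before the assignment so the write does not touch the output that ReLU saved for backward):

import torch
import torch.nn as nn

conv1 = nn.Conv2d(3, 128, 3)
relu = nn.ReLU(inplace=False)  # step 1
conv2 = nn.Conv2d(128, 128, 3)

x = torch.randn(1, 3, 17, 17)
label = 0

h = relu(conv1(x))
h = h.clone()         # keep ReLU's saved output intact
if label == 0:        # step 2: assign zero to half of the channels
    h[:, :64] = 0
else:
    h[:, 64:] = 0
out = conv2(h)        # step 3: send to the next layer
out.sum().backward()  # no error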

Thank you very much Nikronic!

Hello Yupjun,
Does the way you are setting parts of your latent vector take a lot of memory? Do you get a CUDA out-of-memory error?

Nope, it works fine for me. Sorry for the late response.