Hello everyone.
I have an autoencoder whose input size is 15, and at the decoder I only want the first 10 numbers. If I use slicing, autograd will capture that as a grad_fn; I wanted to know whether this affects the network's behavior. I made a simple example, and the gradients for the sliced-out elements were zero.
a = torch.tensor([1,2,3], requires_grad=True, dtype=torch.float)
b = 2 * a
c = 2 * a + b
d = c[:1] # like the slicing in the last layer of decoder before calculating the loss
d.backward()
print(a.grad) # prints: tensor([4., 0., 0.])
# if I did this for c instead, backward would need a scalar (or an explicit gradient argument)
# c.sum().backward()
# print(a.grad) # prints: tensor([4., 4., 4.])
But I still want these 5 elements to influence the latent space (does slicing preserve that?). If their gradient is zero, then the weights producing them won't change — is this correct? What is the best way to do this?
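To show the part I'm asking about more concretely, here is a minimal sketch (the decoder is a single hypothetical `nn.Linear` layer, latent size 5, output size 15 — just an illustration, not my real model). Only the rows of the last layer's weight matrix that feed the discarded outputs get zero gradients; everything upstream still receives gradients through the 10 kept outputs:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy decoder: latent size 5 -> output size 15
decoder = nn.Linear(5, 15)
z = torch.randn(1, 5, requires_grad=True)  # stand-in for the latent code

out = decoder(z)          # shape (1, 15)
out_sliced = out[:, :10]  # keep only the first 10 outputs, as in my model

target = torch.randn(1, 10)
loss = nn.functional.mse_loss(out_sliced, target)
loss.backward()

# Weight rows feeding the discarded outputs 10..14 receive exactly zero grad
print(decoder.weight.grad[10:].abs().sum())  # tensor(0.)
# The latent code itself still gets gradients through the kept outputs
print(z.grad)
```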