Hi, I want to replicate features coming from the top two parallel layers, pass them to a linear layer at the bottom, and still use the result as a Variable for the backward pass.
import numpy as np

transition_matrix = np.eye(10, 10)
transition_matrix[4, :] = 1
transition_matrix[5, :] = 1
transition_matrix[6, :] = 1
N = np.count_nonzero(transition_matrix)  # N and transition_matrix are predefined
Let's say fm1 and fm2 are two feature maps, each of size (5, 10, 256), for a batch size of 5.
import torch
import torch.nn as nn

final_fm = torch.zeros(5, N, 256 * 2)
count = 0
for i in range(transition_matrix.shape[0]):
    for j in range(transition_matrix.shape[1]):
        if transition_matrix[i, j] > 0:
            final_fm[:, count, :256] = fm1[:, i, :]
            final_fm[:, count, 256:] = fm2[:, j, :]
            count += 1
l_layer = nn.Linear(N * 256 * 2, 20)
out = l_layer(final_fm.view(5, -1))  # flatten per sample so this generates a 5x20 output
loss = loss_function(out, gt)
Thanks for the reply.
I tried doing final_fm = Variable(torch.zeros(5, N, 256*2), requires_grad=True).
That gives me an error about an 'in-place modification'; apparently you are not allowed to do that. If I set requires_grad=False, the forward pass is fine, but I am concerned about the gradients. Will this allow gradients to flow back properly during the backward pass?
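In case it matters, an in-place-free way I could build final_fm would be to gather the indices of the non-zero transitions and concatenate the indexed slices; this is only a minimal sketch, assuming fm1, fm2, and transition_matrix as defined above:

import numpy as np
import torch

# (i, j) indices of the N non-zero transitions, as index tensors.
rows, cols = np.nonzero(transition_matrix)
row_idx = torch.from_numpy(rows).long()
col_idx = torch.from_numpy(cols).long()

# fm1[:, row_idx, :] and fm2[:, col_idx, :] are both (5, N, 256);
# concatenating along the feature dimension gives (5, N, 512) and keeps the
# graph connected to fm1 and fm2 without any in-place writes.
final_fm = torch.cat([fm1[:, row_idx, :], fm2[:, col_idx, :]], dim=2)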
requires_grad should be False. Don't worry about the gradient; I've tested this before.
If you don't believe it, you can register a hook on fm1 and fm2 and print out the gradients.
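For example, something like this minimal self-contained sketch (the shapes are made up, and fm1/fm2 here are just stand-ins for the real layer outputs):

import torch

fm1 = torch.randn(5, 10, 256, requires_grad=True)
fm2 = torch.randn(5, 10, 256, requires_grad=True)

# Hooks fire during backward and receive the gradient w.r.t. each tensor.
fm1.register_hook(lambda g: print("fm1 grad norm:", g.norm().item()))
fm2.register_hook(lambda g: print("fm2 grad norm:", g.norm().item()))

N = 30
final_fm = torch.zeros(5, N, 256 * 2)  # requires_grad stays False here

# Copy one (i, j) pair in-place, exactly as in the loop above.
final_fm[:, 0, :256] = fm1[:, 0, :]
final_fm[:, 0, 256:] = fm2[:, 0, :]

final_fm.sum().backward()  # both hooks print non-zero gradient norms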