I am building a network which includes a bidirectional GRU layer. After getting all the dimensions to fit, I am now feeding it simple inputs to check that the network actually behaves as intended. However, I cannot seem to get the expected output from the GRU layer.
My code looks like this:
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.nHidden = 30
        self.nFilter = 20
        self.conv1 = nn.Conv2d(1, self.nFilter, (5, 5), stride=(1, 1), padding=(2, 2))
        self.birnn = nn.GRU(self.nFilter * 129, self.nHidden, 1, bidirectional=True, batch_first=True)
        # debugging: zero the biases
        self.conv1.bias[:] = 0
        self.birnn.bias_ih_l = 0
        self.birnn.bias_hh_l = 0
        # self.birnn.weight_ih_l = 0
        # self.birnn.weight_hh_l = 0

    def forward(self, x):
        # filtering:
        x = F.relu(self.conv1(x))
        # biGRU: (batch, filters, time, freq) -> (batch, time, filters*freq)
        x = x.permute((0, 2, 1, 3))
        x = x.reshape((-1, 29, 20 * 129))
        print(x.size())
        print(torch.sum(x, (1, 2)))
        x, hn = self.birnn(x)
        print(torch.sum(x, (1, 2)))
        return x

net = Net()
x = torch.zeros(3, 1, 29, 129)
x[1, :, 2, :] = 1
output = net(x)
Running it gives the following output:
torch.Size([3, 29, 2580])
tensor([ 0.0000, 1551.1537, 0.0000], grad_fn=<SumBackward1>)
tensor([-6.9812, 5.4175, -6.9812], grad_fn=<SumBackward1>)
My intention is that one data point in the batch is non-zero (ones at a single time step) and the other two are all zeros. As I understand nn.GRU, if the inputs are zero and the biases are zero, shouldn't the outputs be zero too? So my expectation was an output whose per-sample sums look like [0, K, 0], just like the input. Why is that not happening?
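For reference, the reasoning behind my expectation, using the GRU update as documented: with zero input, zero biases, and the default zero initial hidden state, the gates are r = z = sigmoid(0) = 0.5, the candidate is n = tanh(0 + r * 0) = 0, and so h = (1 - z) * n + z * h stays at zero for every step. Below is a minimal sketch of how I would test that in isolation: a standalone bidirectional nn.GRU whose biases I zero explicitly by iterating named_parameters(). The sizes (4, 3, batch of 2, sequence of 5) are just placeholders, not the ones from my network.

import torch
import torch.nn as nn

# standalone check: all-zero input, all-zero biases, default zero initial hidden state
gru = nn.GRU(input_size=4, hidden_size=3, num_layers=1,
             bidirectional=True, batch_first=True)
with torch.no_grad():
    # zero every bias tensor (bias_ih_l0, bias_hh_l0 and their _reverse counterparts)
    for name, param in gru.named_parameters():
        if "bias" in name:
            param.zero_()

x = torch.zeros(2, 5, 4)          # (batch, seq, features), all zeros
out, hn = gru(x)
print(torch.sum(out, (1, 2)))     # sums to zero for every sample in the batch

That sketch gives exactly zero output, which is the behaviour I expected to see from my network as well.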
Thank you for reading