I am building a network that includes a bidirectional GRU layer. After getting all the dimensions to fit, I am now trying simple inputs to check that the network is actually doing what I intend. However, I cannot seem to get the expected output from the GRU layer.
My code looks like this:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.nHidden = 30
        self.nFilter = 20
        self.conv1 = nn.Conv2d(1, self.nFilter, (5, 5), stride=(1, 1), padding=(2, 2))
        self.birnn = nn.GRU(self.nFilter*129, self.nHidden, 1,
                            bidirectional=True, batch_first=True)
        # debugging:
        self.conv1.bias[:] = 0
        self.birnn.bias_ih_l = 0
        self.birnn.bias_hh_l = 0
        # self.birnn.weight_ih_l = 0
        # self.birnn.weight_hh_l = 0

    def forward(self, x):
        # filtering:
        x = F.relu(self.conv1(x))
        # biGRU:
        x = x.permute((0, 2, 1, 3))
        x = x.reshape((-1, 29, 20*129))
        print(x.size())
        print(torch.sum(x, (1, 2)))
        x, hn = self.birnn(x)
        print(torch.sum(x, (1, 2)))
        return x

net = Net()
x = torch.zeros(3, 1, 29, 129)
x[1, :, 2, :] = 1
output = net(x)
```
and running it gives the following:
```
torch.Size([3, 29, 2580])
tensor([   0.0000, 1551.1537,    0.0000], grad_fn=<SumBackward1>)
tensor([-6.9812,    5.4175,   -6.9812], grad_fn=<SumBackward1>)
```
My intention is that one data point is all 1s and the rest are all zeros. As I understand nn.GRU, if the inputs are zero and the biases are zero, shouldn't the outputs be zero too? So my expectation was an output summing to [0, K, 0], mirroring the input. Why is that not happening?
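For reference, here is a minimal standalone sketch (not the network above; the sizes are arbitrary) that checks the claim in isolation: with all-zero input, zero initial hidden state, and biases that are genuinely zeroed in place, a GRU does emit all zeros. Note that the registered parameter names end in `l0` (and `l0_reverse` for the backward direction), e.g. `bias_ih_l0`; plain attribute assignment like `gru.bias_ih_l = 0` only creates a new, unused attribute and leaves the real parameters untouched.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
gru = nn.GRU(input_size=4, hidden_size=3, num_layers=1,
             bidirectional=True, batch_first=True)

# Zero every bias parameter in place, covering both directions.
with torch.no_grad():
    for name, p in gru.named_parameters():
        if name.startswith("bias"):
            p.zero_()

x = torch.zeros(2, 5, 4)   # (batch, seq, features), all zeros
out, hn = gru(x)
print(out.abs().sum())     # tensor(0.)
```

With zero biases, the update gate is sigmoid(0) = 0.5 and the candidate state is tanh(0) = 0, so the hidden state stays at 0 for every step.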
Thank you for reading