I wrote this function to imitate a time-delay neural network (TDNN):
import numpy as np
import torch
from torch.autograd import Variable

def SGD(batch, weight, bias):
    Layers = [0, 0, 0, 0, 0, 0, 0, 0]
    # number of output classes = width of the label matrix
    userCount = np.asarray(batch["lable"]).shape[1]
    # frame-level layers: five 1-d convolutions over the time axis
    Layers[0] = torch.nn.Conv1d(24, 512, 5).cuda()
    Layers[1] = torch.nn.Conv1d(512, 512, 3, dilation=2).cuda()
    Layers[2] = torch.nn.Conv1d(512, 512, 3, dilation=3).cuda()
    Layers[3] = torch.nn.Conv1d(512, 512, 1).cuda()
    Layers[4] = torch.nn.Conv1d(512, 1500, 1).cuda()
    # segment-level layers, applied after statistics pooling (1500 means + 1500 stds = 3000)
    Layers[5] = torch.nn.Linear(3000, 512).cuda()
    Layers[6] = torch.nn.Linear(512, 512).cuda()
    Layers[7] = torch.nn.Linear(512, userCount).cuda()

    lable = Variable(torch.FloatTensor(batch["lable"]).cuda(), requires_grad=False)
    layer0input = Variable(torch.FloatTensor(batch["data"]).cuda(), requires_grad=False)

    # frame-level forward pass
    layer1input = torch.sigmoid(Layers[0](layer0input))
    layer2input = torch.sigmoid(Layers[1](layer1input))
    layer3input = torch.sigmoid(Layers[2](layer2input))
    layer4input = torch.sigmoid(Layers[3](layer3input))
    layer4out = torch.sigmoid(Layers[4](layer4input))

    # statistics pooling: mean and std over the time dimension
    mean = torch.mean(layer4out, dim=2)
    std = torch.std(layer4out, dim=2)
    layer5input = torch.cat([mean, std], dim=1)

    # segment-level forward pass
    layer6input = torch.sigmoid(Layers[5](layer5input))
    layer7input = torch.sigmoid(Layers[6](layer6input))
    softmax_input = torch.sigmoid(Layers[7](layer7input))
    softmax_out = torch.nn.functional.softmax(softmax_input, dim=1)

    # cross-entropy loss summed over the batch
    loss = -torch.trace(torch.mm(torch.log(softmax_out), torch.t(lable)))
    print(loss)
    loss.backward()
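(The loss line is just cross-entropy summed over the batch: with $P$ = softmax_out and $Y$ = lable, $-\operatorname{trace}(\log(P)\,Y^\top) = -\sum_i \sum_c Y_{ic} \log P_{ic}$.)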
The problem is that it fills up GPU memory on each call, so after a few batches I get “cuda runtime error (2) : out of memory”.
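A minimal loop that reproduces it (the shapes and user count below are made up for illustration; my real batches are similar):

for step in range(100):
    # illustrative shapes: 64 utterances, 24 features, 200 frames, 1000 users
    batch = {
        "data": np.random.randn(64, 24, 200).astype("float32"),
        "lable": np.eye(1000, dtype="float32")[np.random.randint(0, 1000, 64)],  # one-hot rows
    }
    SGD(batch, None, None)  # weight/bias are currently unused
    # GPU memory usage grows with every iteration until the OOM error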
I tried “del loss” and “del Layers” at the end of the function, but it doesn't help.
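Concretely, this is the cleanup I added at the end of SGD:

    # added at the very end of SGD() - did not stop the memory growth
    del loss
    del Layers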
I’m using PyTorch 0.3.1 and Python 3.6.4.