I am running a latent space model on a network. The network itself doesn't take up more than 2 GB when stored in the computer's RAM.

But when I try to run my PyTorch model, I get the following error:

RuntimeError: cuda runtime error (2) : out of memory at c:\anaconda2\conda-bld\pytorch_1519501749874\work\torch\lib\thc\generic/THCTensorMathPointwise.cu:343

I am running on a GTX 1080 Ti (11 GB), which should have more than enough memory to handle this.

```
import math

import numpy as np
import torch
from torch import autograd


def distMatrix(m):
    # Pairwise Euclidean distances between the rows of m (n x d) -> (n x n).
    n = m.size(0)
    d = m.size(1)
    x = m.unsqueeze(1).expand(n, n, d)
    y = m.unsqueeze(0).expand(n, n, d)
    # x - y materializes a full n x n x d tensor; 1e-4 keeps sqrt away from 0.
    return torch.sqrt(torch.pow(x - y, 2).sum(2) + 1e-4)


def loss(tY):
    # Negative Bernoulli log-likelihood of tY given positions tZ and bias B.
    n = tY.size(0)
    d = -distMatrix(tZ) + B
    sigmoidD = torch.sigmoid(d)
    reduce = tY * torch.log(sigmoidD) + (1 - tY) * torch.log(1 - sigmoidD)
    # remove diagonal
    reduce[torch.eye(n).byte().cuda()] = 0
    return -reduce.sum()


# Z and Ytest are loaded elsewhere (shapes are given below).
tZ = autograd.Variable(torch.cuda.FloatTensor(Z), requires_grad=True)
B = autograd.Variable(torch.cuda.FloatTensor([0]), requires_grad=True)
tY = autograd.Variable(torch.cuda.FloatTensor(Ytest), requires_grad=False)

losses = []
biases = []
testScore = []
learning_rate = 1e-3
epochs = 10000
sigmoid = np.vectorize(lambda x: math.exp(-np.logaddexp(0, -x)))
percentDone = 0
percent = 2

for i in range(1, epochs):
    # Print progress every `percent` percent of the run.
    count = (float(i) / epochs) * 100
    if count % percent == 0:
        print(count)
    l = loss(tY)
    l.backward(retain_graph=True)
    losses.append(float(l))
    biases.append(B.data)
    # Plain gradient descent step, then reset the gradients.
    tZ.data = tZ.data - learning_rate * tZ.grad.data
    B.data = B.data - learning_rate * B.grad.data
    tZ.grad.data.zero_()
    B.grad.data.zero_()
```

Z is 25059 by 2 and Ytest is 25059 by 25059.
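
For reference, here is a rough back-of-the-envelope estimate of the tensor sizes these shapes imply. It assumes 4-byte float32 elements (which is what `torch.cuda.FloatTensor` stores), and the helper name `tensor_bytes` is made up for illustration. It only counts individual dense tensors, so it ignores whatever autograd additionally keeps around for the backward pass.

```python
def tensor_bytes(*shape, bytes_per_element=4):
    """Bytes needed for a dense tensor of the given shape (float32 by default)."""
    total = bytes_per_element
    for dim in shape:
        total *= dim
    return total


# Shapes from the question: Z is n x d, Ytest is n x n.
n, d = 25059, 2
GIB = 1024 ** 3

sizes = {
    "tY (n x n)": tensor_bytes(n, n),
    # expand() itself is only a view, but x - y allocates the full result.
    "x - y in distMatrix (n x n x d)": tensor_bytes(n, n, d),
    "each n x n intermediate in loss": tensor_bytes(n, n),
}

for name, size in sizes.items():
    print(f"{name}: {size / GIB:.2f} GiB")
```

Under these assumptions a single n x n float tensor is roughly 2.3 GiB and the n x n x d difference is roughly 4.7 GiB, so a handful of such tensors alive at once would already be in the range of the card's memory.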

Any hints on how to avoid the memory error would be greatly appreciated.