Hi guys,
I’m training a model with PyTorch.
I built a simple two-layer neural network on the MNIST dataset and apply a custom method named LS to every neuron between the two layers.
However, during training the memory usage grows steadily until I run out of memory, and the process gets killed before the first epoch finishes.
Below is my implementation of the network and the training loop:
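(LS itself is my own function and I’ve left its real implementation out of this post; it takes two scalar tensors and returns a scalar, so a trivial stand-in with the same signature would look like this:)

def LS(a, b):
    # stand-in only: same signature as my real LS
    # (two scalar tensors in, one scalar tensor out)
    return a * b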
import torch
import torch.nn as nn
from tqdm import tqdm

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(28 * 28 * 1, 20)
        self.fc2 = nn.Linear(20, 10)

    def forward(self, x):
        x1 = self.fc1(x)
        x2 = torch.sigmoid(x1)
        x3 = self.fc2(x2)
        com_x3 = torch.sigmoid(x3)
        x2 = x2.clone().requires_grad_()
        # apply LS to every (hidden unit, output unit) pair, sample by sample
        for i in range(x2.shape[0]):              # batch dimension
            for j in range(x2.shape[1]):          # hidden units (20)
                for k in range(com_x3.shape[1]):  # output units (10)
                    x2[i, j] = LS(x2[i, j], com_x3[i, k])
        return x3
def train(epoch):
    train_loss = 0
    train_accuracy = 0
    model.train()
    for data, label in tqdm(loader_train, desc="Training"):
        data, label = data.view(-1, 28 * 28).to(device), label.to(device)
        optimizer.zero_grad()
        y_pred_prob = model(data)
        torch.autograd.set_detect_anomaly(True)  # turned on while debugging
        loss = loss_fn(y_pred_prob, label)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
        y_pred_label = torch.max(y_pred_prob, 1)[1]
        train_accuracy += torch.sum(y_pred_label == label).item() / len(label)
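For completeness, the surrounding setup is roughly the following; the optimizer, learning rate, epoch count and the torchvision-based data loading are only representative placeholders to make the snippet self-contained:

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# MNIST loader with the batch size of 128 mentioned below
loader_train = DataLoader(
    datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor()),
    batch_size=128, shuffle=True)

model = Net().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # representative choice
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):  # in practice it never gets past the first epoch
    train(epoch)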
I’m using PyTorch 2.6 and running the code on WSL2 (24 GB of RAM allocated to WSL). The batch size is 128.
Please help me with this problem.
Thank you and best regards.