Why is testing a CNN taking a lot of memory?

Hi,

I am training a very simple CNN. Training goes well, but during testing I get a RuntimeError about not having enough memory. I am new to PyTorch. I am posting the testing portion of the code below; any help will be highly appreciated!

import math

import numpy as np
import torch.nn as nn

# net, test_dataloader, and test_dataset are defined in the training portion (not shown)
loss_function = nn.MSELoss()
net.eval()
Variable_store_3 = np.empty((0,1)) 
Variable_store_1 = []  
for epoch in range(1):
    running_loss = 0
    Variable_store_2 = []
    for batch in test_dataloader:
        
        x, y = batch
        outputs = net(x)
        Variable_store_1.extend(outputs)  # for plotting purposes
        Variable_store_2.extend(y)        # for plotting purposes
        outputs_np = outputs.detach().numpy()
        Variable_store_3 = np.append(Variable_store_3, outputs_np, axis=0)
        loss = loss_function(outputs, y)
        running_loss += loss.item() * x.size(0)  # batch size (last batch may be smaller than 128)
        
    final_loss = math.sqrt(running_loss / len(test_dataset))
    print(f"{epoch+1} epoch | testing loss = {final_loss}") 

Could you wrap your testing code in a with torch.no_grad() block?
This would avoid storing the intermediate tensors, which would otherwise be kept to calculate the gradients in the backward pass.

Also, you might want to define separate training and validation functions, if that’s not already the case.
Since Python uses function scoping, some tensors from the training would additionally be freed once the function returns.
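
A minimal sketch of what this split could look like (the function signatures are just for illustration, and the torch.no_grad() suggestion from above is included in the evaluation function):

def train(net, train_dataloader, optimizer, loss_function):
    net.train()
    for x, y in train_dataloader:
        optimizer.zero_grad()
        loss = loss_function(net(x), y)
        loss.backward()
        optimizer.step()
    # tensors created here go out of scope once the function returns

def evaluate(net, test_dataloader, loss_function):
    net.eval()
    running_loss = 0.0
    with torch.no_grad():  # no computation graphs are stored during evaluation
        for x, y in test_dataloader:
            outputs = net(x)
            running_loss += loss_function(outputs, y).item() * x.size(0)
    return running_loss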

How large is your data? I think this happens because you are appending all the outputs to Variable_store_3, and all of this data can’t fit in memory. Also try using torch.no_grad(), which tells PyTorch that you are not saving the gradients for backprop (because you are not training):

for batch in test_dataloader:
    with torch.no_grad():
        x, y = batch
        outputs = net(x)
        Variable_store_1.extend(outputs)  # for plotting purposes
        Variable_store_2.extend(y)        # for plotting purposes
        outputs_np = outputs.detach().numpy()  # detach() is optional inside no_grad()
        Variable_store_3 = np.append(Variable_store_3, outputs_np, axis=0)
        loss = loss_function(outputs, y)
        running_loss += loss.item() * x.size(0)  # batch size (last batch may be smaller)
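
As a side note, np.append copies the entire array on every call, so if the collected predictions themselves grow large, it is usually cheaper to collect the per-batch arrays in a list and concatenate once at the end. A rough sketch of that pattern:

outputs_list = []
with torch.no_grad():
    for x, y in test_dataloader:
        # inside no_grad() the outputs carry no graph, so .numpy() works directly
        outputs_list.append(net(x).cpu().numpy())
all_outputs = np.concatenate(outputs_list, axis=0)  # single concatenation at the end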
        

Thank you very much for replying and for the solution.

As you advised, I have wrapped my testing code in torch.no_grad(). I just did a short training run followed by testing, and it seems the problem is solved! I will now do a longer training and testing run and will post an update here.

Again, thank you very much.

Thank you for your reply and for the solution.

My testing data originally contained 6,080,000 samples.
It was then downsampled to 23,850,
and the resulting tensor had shape torch.Size([23750, 100, 1]).
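
For reference, assuming float32 values, the stored predictions alone are quite small, so the memory blow-up most likely came from the autograd graphs kept alive for every batch before torch.no_grad() was added:

# rough size of a [23750, 100, 1] float32 tensor (4 bytes per element)
print(23750 * 100 * 1 * 4 / 1024**2, "MiB")  # ~9.06 MiB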