CUDA always runs out of memory in Google Colab

I have a 2D CNN model that classifies images into just 2 classes, with 300 images per class.
Here is my nn.Module class:

from torch.nn import (BatchNorm1d, BatchNorm2d, Conv2d, Dropout, Linear,
                      MaxPool2d, Module, PReLU, ReLU, Sequential)

class Flatten(Module):
    # Flatten everything except the batch dimension
    def forward(self, input):
        return input.view(input.size(0), -1)

class ActionNet(Module):
    def __init__(self, num_class=4):
        super(ActionNet, self).__init__()

        self.cnn_layer = Sequential(
            # conv1
            Conv2d(in_channels=1, out_channels=32, kernel_size=1, bias=False),
            BatchNorm2d(32),
            PReLU(num_parameters=32),
            MaxPool2d(kernel_size=3),
            # conv2
            Conv2d(in_channels=32, out_channels=64, kernel_size=1, bias=False),
            BatchNorm2d(64),
            PReLU(num_parameters=64),
            MaxPool2d(kernel_size=3),
            # flatten
            Flatten(),
            Linear(576, 128),  # 576 = 64 channels * 3 * 3 for a 1x32x32 input
            BatchNorm1d(128),
            ReLU(inplace=True),
            Dropout(0.5),
            Linear(128, num_class)
        )

    def forward(self, x):
        x = self.cnn_layer(x)
        return x

I split the data 80%:20% into training and test sets. For more detail, here is my training code: https://pastebin.com/uzBFTrDc . When I try to train on Colab, I get this error:

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.90 GiB total capacity; 15.20 GiB already allocated; 1.88 MiB free; 15.20 GiB reserved in total by PyTorch)

I use Google Colab because I don't have a powerful GPU. I implemented batching, but I don't know whether it's correct. The training data is just 400 images of size 32x32, so I think Colab should be powerful enough to train on my data. Can anyone help me?

I cannot access the training code, but apparently the device memory is not sufficient for your current training.
Do you get this error in the first iteration or after a few iterations or epochs?
In the latter case, do you see an increase in the device memory?
If so, make sure you are not storing tensors that are still attached to the computation graph, such as the loss or the output of your model.
If you are storing these tensors in e.g. a list, detach them before storing via:

losses.append(loss.detach())
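
To put that line in context, here is a minimal sketch of a training loop that logs the loss without keeping the graph alive. The model, data, and hyperparameters are hypothetical stand-ins, not the code from the original post. `loss.item()` is an alternative to `loss.detach()` when only the scalar value is needed.

```python
import torch
from torch import nn

# Hypothetical stand-in model and data, only for illustration
model = nn.Linear(4, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []           # detached tensors
loss_values = []      # plain Python floats

for _ in range(3):
    x = torch.randn(8, 4)
    y = torch.randint(0, 2, (8,))

    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

    # detach() returns a tensor that no longer references the computation
    # graph, so the graph of this iteration can be freed
    losses.append(loss.detach())
    # item() copies the scalar to a Python float, which also keeps no graph
    loss_values.append(loss.item())
```

Appending `loss` itself would keep every iteration's graph and its intermediate activations in memory, which shows up as steadily growing device memory.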

I solved this problem. The out-of-memory error occurred because I didn't use batches for the validation data.
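
For anyone landing here with the same error, a batched validation pass might look roughly like the sketch below. The dataset, the tiny model, the batch size, and the `evaluate` helper are all hypothetical placeholders (the original training code is not shown in this thread). The two points that matter are iterating over a `DataLoader` instead of pushing the whole validation set through the model at once, and wrapping the loop in `torch.no_grad()` so no graph is built.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical validation data: 120 single-channel 32x32 images, 2 classes
val_images = torch.randn(120, 1, 32, 32)
val_labels = torch.randint(0, 2, (120,))
val_loader = DataLoader(TensorDataset(val_images, val_labels), batch_size=32)

# Hypothetical stand-in model; replace with the real network
model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 2)).to(device)
criterion = nn.CrossEntropyLoss(reduction="sum")


def evaluate(model, loader):
    """Return (mean loss, accuracy) computed batch by batch."""
    model.eval()  # use running BatchNorm statistics and disable Dropout
    total_loss, correct, seen = 0.0, 0, 0
    with torch.no_grad():  # no computation graph is stored during validation
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            logits = model(images)
            total_loss += criterion(logits, labels).item()
            correct += (logits.argmax(dim=1) == labels).sum().item()
            seen += labels.size(0)
    return total_loss / seen, correct / seen


val_loss, val_acc = evaluate(model, val_loader)
```

Only one batch of images lives on the GPU at a time, so peak memory depends on `batch_size` rather than on the size of the validation set.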