R-squared score for neural network model comparison

I am trying to create a neural network with geometric data as output, which I have unrolled into a vector. I would like to share my file with you through Jovian. I have a few doubts.
I am training on the data, but the training loss does not go below 80. Can you please tell me what I am doing wrong?
The main idea is to create an NN with a variable number of hidden layers and a pruning value alpha.

First, I am using the R-squared score as a metric and I am getting a very bad score. How can I improve it?

How do I overfit the model? I have just 441 samples, each with an output of size 2043x3 and an input of just 2x1.

Also, I get the following error when I change the hidden layers to 8 or 10:
CUDA out of memory. Tried to allocate 76.00 MiB (GPU 0; 4.00 GiB total capacity; 1.85 GiB already allocated; 37.20 MiB free; 1.88 GiB reserved in total by PyTorch)
I have used torch.cuda.empty_cache(), but it still doesn't work.

Also, my next goal is to optimise this whole code to be used with CUDA.
Can you please help me?

You could try to overfit a really small dataset first by playing around with hyperparameters and make sure your model is able to do so.
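
A minimal sketch of that sanity check, assuming a plain MLP and random stand-in data with the shapes from your post (2 inputs, 2043x3 outputs flattened to a vector). The layer sizes, learning rate, and step count are placeholders I picked for illustration, not values from your notebook:

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in data: 8 samples, 2 inputs, 2043*3 flattened outputs.
x_small = torch.rand(8, 2)
y_small = torch.rand(8, 2043 * 3)

# Small MLP; the hidden width of 64 is an arbitrary choice for this sketch.
model = nn.Sequential(
    nn.Linear(2, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 2043 * 3),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
criterion = nn.MSELoss()

losses = []
for step in range(300):
    optimizer.zero_grad()
    loss = criterion(model(x_small), y_small)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())  # .item() stores a float, not the graph

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

If the loss on such a tiny subset refuses to drop, the problem is more likely in the data or the training loop than in the model capacity.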

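torch.cuda.empty_cache() won't help in this case and will only slow down your code, since it releases unused cached memory but cannot free memory held by tensors that are still alive.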
Your GPU doesn’t have enough memory for 10 layers, so you could either lower the batch size or use torch.utils.checkpoint to trade compute for memory.
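
Here is a rough sketch of the checkpointing option, assuming an MLP with 10 hidden blocks. The class name, hidden width, and batch size are made up for the example, and `use_reentrant=False` assumes a reasonably recent PyTorch release:

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

torch.manual_seed(0)

class CheckpointedMLP(nn.Module):
    """Hypothetical MLP whose hidden blocks are checkpointed."""

    def __init__(self, in_dim=2, hidden=128, out_dim=2043 * 3, n_hidden=10):
        super().__init__()
        self.inp = nn.Linear(in_dim, hidden)
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
             for _ in range(n_hidden)]
        )
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, x):
        h = torch.relu(self.inp(x))
        for block in self.blocks:
            # Intermediate activations inside `block` are not stored;
            # they are recomputed during the backward pass.
            h = checkpoint(block, h, use_reentrant=False)
        return self.out(h)

model = CheckpointedMLP()
x = torch.rand(16, 2)            # a small batch also keeps memory low
y = torch.rand(16, 2043 * 3)

loss = nn.functional.mse_loss(model(x), y)
loss.backward()

grads_ok = all(p.grad is not None for p in model.parameters())
print("all parameters received gradients:", grads_ok)
```

The sketch runs on the CPU as written; moving the model and tensors to the GPU with `.to("cuda")` does not change the checkpointing logic. Lowering the `batch_size` of your DataLoader is the simpler option and worth trying first.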

Thank you, it helped. There were random points in the dataset. As soon as I printed and inspected the dataset I found the issue, and after fixing it my R-squared was around 0.9 and above.
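
For anyone hitting the same problem, here is a small sketch of how a few corrupted target rows can drag R-squared down and how to spot them. Everything in it is synthetic: the `r2_score` helper is my own pooled definition (1 - SS_res / SS_tot over all outputs, which is not necessarily what every library reports), and the 5x-median threshold is an arbitrary heuristic:

```python
import torch

torch.manual_seed(0)

def r2_score(y_true, y_pred):
    """Pooled R^2 over all outputs: 1 - SS_res / SS_tot."""
    ss_res = torch.sum((y_true - y_pred) ** 2)
    ss_tot = torch.sum((y_true - y_true.mean(dim=0, keepdim=True)) ** 2)
    return (1.0 - ss_res / ss_tot).item()

# Synthetic targets and a "good" prediction with small noise.
y_true = torch.randn(441, 6)
y_pred = y_true + 0.1 * torch.randn(441, 6)
r2_clean = r2_score(y_true, y_pred)

# Corrupt 20 target rows with unrelated large values.
y_bad = y_true.clone()
y_bad[:20] = torch.rand(20, 6) * 100.0 + 50.0
r2_bad = r2_score(y_bad, y_pred)

# Simple inspection: flag rows whose norm is far above the median norm.
row_norm = y_bad.norm(dim=1)
suspicious = row_norm > 5 * row_norm.median()

print(f"R^2 on clean targets:     {r2_clean:.3f}")
print(f"R^2 on corrupted targets: {r2_bad:.3f}")
print("suspicious rows:", suspicious.nonzero().flatten().tolist())
```

Printing per-sample statistics like this before training is a cheap way to catch stray points in the dataset.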