Linear layer gives CUDA out of memory

Hi,

I have a linear layer that fails with the error below, at this line:

output = input.matmul(weight.t())


RuntimeError: CUDA out of memory. Tried to allocate 56.00 MiB (GPU 0; 10.92 GiB total capacity; 9.72 GiB already allocated; 17.44 MiB free; 11.40 MiB cached)
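
To make the numbers in that message easier to compare, here is a minimal sketch in plain Python (no PyTorch needed) that pulls the figures out of the error text and converts them to MiB. The helper name `parse_oom` and the regular expressions are only an illustration and are not part of PyTorch; the message string is copied from the error above.

```python
import re

# Error text copied verbatim from the message above.
MESSAGE = (
    "CUDA out of memory. Tried to allocate 56.00 MiB "
    "(GPU 0; 10.92 GiB total capacity; 9.72 GiB already allocated; "
    "17.44 MiB free; 11.40 MiB cached)"
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def parse_oom(message):
    """Hypothetical helper: return the sizes quoted in the message, in MiB."""
    pattern = (
        r"(\d+(?:\.\d+)?) (MiB|GiB) "
        r"(total capacity|already allocated|free|cached)"
    )
    sizes = {
        label: float(num) * UNIT_TO_MIB[unit]
        for num, unit, label in re.findall(pattern, message)
    }
    # The requested size has no trailing label, so it is matched separately.
    requested = re.search(r"Tried to allocate (\d+(?:\.\d+)?) (MiB|GiB)", message)
    sizes["requested"] = float(requested.group(1)) * UNIT_TO_MIB[requested.group(2)]
    return sizes


sizes = parse_oom(MESSAGE)
# How much more memory the failed allocation needed than was free.
shortfall = sizes["requested"] - sizes["free"]
print(
    f"requested {sizes['requested']:.2f} MiB, "
    f"free {sizes['free']:.2f} MiB, short by {shortfall:.2f} MiB"
)
```

Read this way, the 56.00 MiB request is larger than the 17.44 MiB reported free, and almost all of the 10.92 GiB card is already allocated before this line runs.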

How can I fix this?