VRAM explosion with Custom Linear

Hi everyone,
I’m struggling with very high memory consumption from a custom Linear module, which you can find here:


If I switch w_r, w_i, w_j to a single w and don’t perform any concatenation (so it is effectively just an nn.Linear), the consumption is normal (equal to an nn.Linear, 6 GB out of 12 GB). But when I use torch.cat, I get an OOM (12 GB+). Is it normal that a single torch.cat operation blows up the allocated memory?
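
To make the question concrete, here is a minimal sketch of the pattern I mean (not my actual code; the shapes, the fourth component w_k and the block layout are just illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch only: several small weight matrices are concatenated
# into one large block matrix before the matmul.
class ConcatLinear(nn.Module):
    def __init__(self, in_features=128, out_features=128):
        super().__init__()
        self.w_r = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.w_i = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.w_j = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.w_k = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x):
        # Every forward pass builds a fresh (4*out) x (4*in) matrix,
        # here 512 x 512, on top of the four 128 x 128 parameters.
        rows = [
            torch.cat([self.w_r, -self.w_i, -self.w_j, -self.w_k], dim=1),
            torch.cat([self.w_i,  self.w_r, -self.w_k,  self.w_j], dim=1),
            torch.cat([self.w_j,  self.w_k,  self.w_r, -self.w_i], dim=1),
            torch.cat([self.w_k, -self.w_j,  self.w_i,  self.w_r], dim=1),
        ]
        w = torch.cat(rows, dim=0)
        return F.linear(x, w)
```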

Thank you!

Hi,

torch.cat has to create a new tensor the size of all the concatenated tensors combined, so it will use double the memory.
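
A quick way to see the extra allocation (just a sketch; needs a CUDA device):

```python
import torch

# torch.cat copies the data of its inputs into a brand-new tensor, while
# the inputs themselves stay alive, so the concatenated data exists twice.
parts = [torch.randn(128, 128, device="cuda") for _ in range(4)]
before = torch.cuda.memory_allocated()

full = torch.cat(parts, dim=0)                 # new 512 x 128 tensor

after = torch.cuda.memory_allocated()
print(after - before)                          # ~4 * 128 * 128 * 4 bytes extra
print(full.data_ptr() == parts[0].data_ptr())  # False: no storage is shared
```

And if the cat happens inside forward during training, the concatenated tensor is typically saved for the backward pass as well, so the extra copy sticks around for every such layer at once.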

Hi,

So you’re saying that in memory I’ll have the 4 (128x128) matrices and one (512x512) matrix, right?

OK, I need to find something to alleviate this.

Yes, the memory will contain both the small matrices and the concatenated one.
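
One possible direction to alleviate it (just a sketch, assuming the large matrix is only built to feed a matmul): compute the output block by block, so the full concatenated weight never has to be materialised. The helper below and its block layout are illustrative, not code from this thread.

```python
import torch
import torch.nn.functional as F

def block_linear(x, blocks):
    """Apply a block matrix [[W_00, W_01, ...], ...] to x without ever
    concatenating the blocks into one large weight tensor.

    blocks[p][q] has shape (out_f, in_f); x has shape (batch, n_cols * in_f).
    """
    xs = x.chunk(len(blocks[0]), dim=-1)   # split x to match the column blocks
    out_blocks = []
    for row in blocks:
        # Each output block is a sum of small matmuls instead of one big one.
        out_blocks.append(sum(F.linear(xq, wq) for xq, wq in zip(xs, row)))
    return torch.cat(out_blocks, dim=-1)   # only the (much smaller) output is concatenated
```

Whether this actually saves memory depends on the batch size and on how many such layers the model has, so it is worth checking with torch.cuda.memory_allocated().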