+ 50% VRAM used on torch 1.3 compared to 1.2

This is a generative seq2seq model with three main modules: an encoder, a decoder, and a postnet.
More here
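
For illustration only (not the actual model): a hypothetical skeleton of a model with these three modules. Module names, sizes, and the fixed-context trick below are made up.

```python
import torch
import torch.nn as nn

# Hypothetical skeleton of an encoder / decoder / postnet seq2seq model;
# names and dimensions are illustrative, not the real code.
class Seq2Seq(nn.Module):
    def __init__(self, in_dim=512, out_dim=80):
        super().__init__()
        self.encoder = nn.Conv1d(in_dim, in_dim, kernel_size=5, padding=2)
        self.decoder_rnn = nn.LSTMCell(in_dim, 1024)   # autoregressive decoder step
        self.proj = nn.Linear(1024, out_dim)
        self.postnet = nn.Conv1d(out_dim, out_dim, kernel_size=5, padding=2)

    def forward(self, x, steps=100):
        memory = self.encoder(x)                 # (B, in_dim, T)
        ctx = memory.mean(dim=2)                 # crude fixed context, stands in for attention
        h = x.new_zeros(x.size(0), 1024)
        c = x.new_zeros(x.size(0), 1024)
        frames = []
        for _ in range(steps):                   # one LSTMCell step per output frame
            h, c = self.decoder_rnn(ctx, (h, c))
            frames.append(self.proj(h))
        out = torch.stack(frames, dim=2)         # (B, out_dim, steps)
        return out + self.postnet(out)           # postnet adds a residual correction
```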

I have removed the encoder and postnet; decomposing the decoder is a bit harder.
Now it looks like this

I tried again with a setup similar to the one you tried in this post: + 50% VRAM used on torch 1.3 compared to 1.2
But using just linear layers does not seem to cause any issue. Could you share the definition of the LinearNorm forward function, please? Do you see a similar issue when you use a single LinearNorm layer?
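
For reference, if this is the LinearNorm from NVIDIA's Tacotron 2 implementation, it should just be a thin wrapper around nn.Linear with Xavier initialization, roughly:

```python
import torch

class LinearNorm(torch.nn.Module):
    def __init__(self, in_dim, out_dim, bias=True, w_init_gain='linear'):
        super(LinearNorm, self).__init__()
        self.linear_layer = torch.nn.Linear(in_dim, out_dim, bias=bias)
        torch.nn.init.xavier_uniform_(
            self.linear_layer.weight,
            gain=torch.nn.init.calculate_gain(w_init_gain))

    def forward(self, x):
        # forward is just the plain linear layer; only the init differs
        return self.linear_layer(x)
```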

I tried with default linear layers; still 9 GB vs 6 GB cached.

I tested LSTMCell in a separate notebook; it seems like that's the culprit:
https://colab.research.google.com/drive/18R1aMLcM2uL91gTbYdhm9urWRJaEQUuE
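
Roughly, an isolated LSTMCell test looks like this (a sketch only; the batch size, hidden size, and step count are placeholders, not the notebook's actual values):

```python
import torch
import torch.nn as nn

# Minimal sketch of isolating LSTMCell memory use; sizes and step
# count are placeholders, not necessarily what the notebook uses.
device = torch.device('cuda')
cell = nn.LSTMCell(1024, 1024).to(device)

x = torch.randn(64, 1024, device=device)
h = torch.zeros(64, 1024, device=device)
c = torch.zeros(64, 1024, device=device)

for _ in range(500):                 # decoder-style unrolled loop
    h, c = cell(x, (h, c))

h.sum().backward()
# torch.cuda.memory_cached() was the cached-memory query in the 1.2/1.3 era
print(torch.cuda.memory_cached() / 1024 ** 3, 'GB cached')
```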

Maybe I should just use an LSTM layer instead? But I'm not sure how to adjust the decoder code.
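
A sketch of the usual rewrite, assuming the decoder currently steps an LSTMCell in a Python loop over time (the names and sizes here are made up):

```python
import torch
import torch.nn as nn

# Before (hypothetical): stepping an LSTMCell manually over time
# cell = nn.LSTMCell(512, 1024)
# for t in range(T):
#     h, c = cell(inputs[t], (h, c))
#     outputs.append(h)

# After: let nn.LSTM run the whole sequence in one call
lstm = nn.LSTM(input_size=512, hidden_size=1024)   # sizes are placeholders
inputs = torch.randn(100, 8, 512)                  # (T, B, input_size)
outputs, (h_n, c_n) = lstm(inputs)                 # outputs: (T, B, hidden_size)
```

The catch is that this only works when the whole input sequence is known up front (e.g. teacher forcing during training). If each decoder step feeds on its own previous output, as at inference time, you still have to step one frame at a time, though you can do that by calling nn.LSTM with a length-1 sequence per step.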
