What is wrong with my masking?

hadaev8 · September 24, 2019, 9:05am

PyTorch 1.2, Tacotron GST model.
I want to mask out the GST outputs.
I have two 3D tensors with shape [32, 177, 512].

So I just want zeros in the first tensor at the same positions as in the second.

I take the mask and expand its shape:
gst_mask = ~get_mask_from_lengths(text_lengths)
gst_mask = gst_mask.unsqueeze(-1).expand_as(gst_outputs)
Seems ok
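
For readers without the model code at hand, here is a minimal sketch of the mask construction on toy shapes. The body of `get_mask_from_lengths` is not shown in this thread, so the version below is a hypothetical stand-in that assumes it returns True at valid (non-padded) positions, which is why the result is inverted with `~`.

```python
import torch


def get_mask_from_lengths(lengths):
    # Hypothetical stand-in (assumption): True where the position is inside
    # the sequence, False where it is padding.
    max_len = int(lengths.max())
    ids = torch.arange(max_len, device=lengths.device)
    return ids.unsqueeze(0) < lengths.unsqueeze(1)  # [B, T]


# Toy sizes instead of [32, 177, 512]: batch 2, 3 time steps, 4 features.
text_lengths = torch.tensor([3, 1])
gst_outputs = torch.ones(2, 3, 4)

gst_mask = ~get_mask_from_lengths(text_lengths)            # True at padding
gst_mask = gst_mask.unsqueeze(-1).expand_as(gst_outputs)   # [B, T, D]
```

With these toy lengths, the first sequence has no padding and the second is padded from time step 1 onward, so only `gst_mask[1, 1:]` is True.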

Then I do
gst_outputs.data.masked_fill_(gst_mask, 0.0)

But I get unexpected output:

albanD · September 24, 2019, 2:06pm

Hi,

How is gst_outputs created? Can you add a gst_outputs = gst_outputs.clone() before doing the masking?

hadaev8 · September 24, 2019, 5:02pm

Basically it is like this

github.com/mozilla/TTS/blob/master/models/tacotrongst.py#L70

def inference(self, characters, speaker_ids=None, style_mel=None):
    B = characters.size(0)
    inputs = self.embedding(characters)
    encoder_outputs = self.encoder(inputs)
    encoder_outputs = self._add_speaker_embedding(encoder_outputs,
                                                  speaker_ids)
    if style_mel is not None:
        gst_outputs = self.gst(style_mel)
        gst_outputs = gst_outputs.expand(-1, encoder_outputs.size(1), -1)
        encoder_outputs = encoder_outputs + gst_outputs
    mel_outputs, alignments, stop_tokens = self.decoder.inference(
        encoder_outputs)
    mel_outputs = mel_outputs.view(B, -1, self.mel_dim)
    linear_outputs = self.postnet(mel_outputs)
    linear_outputs = self.last_linear(linear_outputs)
    return mel_outputs, linear_outputs, alignments, stop_tokens


def _add_speaker_embedding(self, encoder_outputs, speaker_ids):
    if hasattr(self, "speaker_embedding") and speaker_ids is not None:
        speaker_embeddings = self.speaker_embedding(speaker_ids)

Will the gradient still propagate if I copy the tensor?

albanD · September 24, 2019, 6:02pm

Yes, it will.
As I suspected, gst_outputs is created with an expand. This means it is not backed by its own memory: every position along the expanded dimension points at the same underlying values, so writing to one place also affects the others.
Using clone will force it to be backed by real memory and remove this problem!
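
To make this concrete, here is a small sketch on toy shapes. The tensor `style` is an assumed stand-in for the output of `self.gst(style_mel)` with shape [B, 1, D], and `lengths` is made up for illustration. It checks that the expanded dimension has stride 0 (so all time steps share one row of memory), then clones before the in-place mask and confirms that gradients still reach the original tensor.

```python
import torch

# Assumed stand-in for self.gst(style_mel): one style vector per batch item.
style = torch.ones(2, 1, 4, requires_grad=True)      # [B, 1, D]
expanded = style.expand(-1, 3, -1)                    # [B, T, D], no copy

# Stride 0 along T means every time step reads the same memory.
shares_memory = expanded.stride(1) == 0

# Padding mask built from made-up lengths: True where the step is padding.
lengths = torch.tensor([3, 1])
pad_mask = torch.arange(3).unsqueeze(0) >= lengths.unsqueeze(1)  # [B, T]
pad_mask = pad_mask.unsqueeze(-1)                     # [B, T, 1], broadcasts over D

# clone() gives the tensor its own memory, so the in-place fill is safe.
masked = expanded.clone()
masked.masked_fill_(pad_mask, 0.0)

# Autograd tracks both clone() and masked_fill_, so the gradient flows back.
masked.sum().backward()
grad = style.grad                                     # [B, 1, D]
```

Here `grad[0]` is 3 everywhere (all three time steps are kept) and `grad[1]` is 1 everywhere (only the first step is kept), which shows the masked positions contribute no gradient. Note that this sketch calls `masked_fill_` on the tensor itself rather than on `.data`, since operations done through `.data` are not recorded by autograd.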