Hi,
I am struggling with a problem and can’t find a way to debug it; I really hope I can find some direction here…
I built an autoencoder (AE) where, between the encoder and decoder, I add some operations: an IFFT, noise addition, and then an FFT. The data is complex-valued, so I separated it into real and imaginary parts and built the FFT/IFFT functions myself. The loss used for backward is the sum of F.mse_loss and an extra loss I compute with a function I wrote.

The problem is that without zero-padding (and its later removal) I get very different results compared to when I do use padding. I can’t understand why; I expect the results to be similar, since I am just adding zeros at the sides and removing them later. I also normalized the FFT and IFFT so that the variance per sample is the same with and without padding.

I validated all the operations and functions I use in my forward pass, so I suspect the problem is in the built-in backward, but I have no idea how to check whether something I am doing causes a problem there.
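Since my custom transforms are plain matrix multiplies, I assume something like torch.autograd.gradcheck should in principle apply to them. A minimal sketch of that kind of check (dft_block below is just a random placeholder standing in for my dft_block_torch matrix, and fft_calc is simplified to a single matmul):

import torch

torch.manual_seed(0)
N = 8
# placeholder for the precomputed real/imag DFT matrix (dft_block_torch)
dft_block = torch.randn(2 * N, 2 * N, dtype=torch.double)

def fft_calc(x, dft_block):
    # interleaved real/imag layout, transformed with one matrix multiply
    return x @ dft_block

x = torch.randn(4, 2 * N, dtype=torch.double, requires_grad=True)
# gradcheck compares the analytic backward against finite differences
ok = torch.autograd.gradcheck(lambda inp: fft_calc(inp, dft_block), (x,))
print("gradcheck passed:", ok)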
Sample code:
def forward(self, x, noise, device, is_train=1):
    mean_x = x.mean()
    var_x = x.var()
    fft_size = int(x.shape[1] / 2)  # number of complex bins (half the feature dim)
    x = self.encode(x)
    x = self.layer_norm_2(x, mean_x, var_x, is_train)
    #############ZERO-PADDING#####################
    p1d = [fft_size * 2, fft_size * 2]  # pad last dim by fft_size complex bins (2 * fft_size values) on each side
    x = F.pad(input=x, pad=p1d, mode="constant", value=0)  # effectively zero padding
    fft_size_oversamp = int(x.shape[1] / 2)
    ##############################################
    x = ifft_calc(x, fft_size_oversamp, self.idft_block_torch)
    x_ifft = x.clone()
    x = x + noise
    x = fft_calc(x, fft_size_oversamp, self.dft_block_torch)
    x = x[:, fft_size * 2:fft_size * 4].clone()  # remove the padding
    x = self.decode(x)
    return x, x_ifft
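Without the noise step, I expect the padding to be an exact no-op: pad, IFFT, FFT, crop should reproduce the input. A minimal sketch of that sanity check in isolation, using torch.fft with norm="ortho" as a stand-in for my own ifft_calc/fft_calc:

import torch

N = 16
X = torch.randn(N, dtype=torch.cfloat)
pad = N  # zeros added on each side, as in my forward
zeros = torch.zeros(pad, dtype=torch.cfloat)
X_pad = torch.cat([zeros, X, zeros])
x_time = torch.fft.ifft(X_pad, norm="ortho")  # to time domain
X_back = torch.fft.fft(x_time, norm="ortho")  # back to frequency domain
X_crop = X_back[pad:pad + N]                  # remove the padding
print(torch.allclose(X_crop, X, atol=1e-6))   # True: the round trip is exact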
The loss calculation in the training function:
pred_log_probs, tx_ifft = estimator(x_batch, noise_batch, device)  # call the module, not .forward(), so hooks run
sec_loss = sec_calc(tx_ifft, Nsc + Nsc * pad_cnt)
sec_loss_max = torch.max(sec_loss, dim=0).values  # worst sample in the batch
model_optimizer.zero_grad()
loss1 = F.mse_loss(pred_log_probs, x_batch) + lamda * sec_loss_max
loss1.backward()
for p in estimator.parameters():
    if p.grad is None:  # flag any parameter that received no gradient
        print(p.grad)
model_optimizer.step()
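Beyond checking for None gradients, a deeper inspection could look at the per-parameter gradient norms and at the gradient of the padded intermediate itself. A self-contained sketch, with a toy nn.Linear standing in for my estimator:

import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(8, 8)      # toy stand-in for the estimator
x = torch.randn(4, 8)
out = model(x)
padded = F.pad(out, [4, 4])  # same kind of zero-padding as in the forward
padded.retain_grad()         # keep the gradient of this non-leaf intermediate
loss = padded.pow(2).mean()
loss.backward()
print("padded grad norm:", padded.grad.norm().item())
for name, p in model.named_parameters():
    print(name, "grad norm:", p.grad.norm().item())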
def sec_calc(x, fft_size):
    # x is laid out as interleaved real/imag pairs along dim 1
    even_number = torch.arange(0, 2 * fft_size, 2)
    odd_number = torch.arange(1, 2 * fft_size, 2)
    x_real = x[:, even_number]
    x_imag = x[:, odd_number]
    abs_val = torch.sqrt(torch.pow(x_real, 2) + torch.pow(x_imag, 2))  # magnitude of each complex bin
    numi = torch.max(torch.pow(abs_val, 2), dim=1)  # peak power per sample
    numi_vals = numi.values
    denumi = torch.mean(torch.pow(abs_val, 2), dim=1)  # mean power per sample
    sec_res = numi_vals / denumi  # peak-to-average power ratio
    return sec_res
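For cross-checking, the same ratio can be computed directly with complex tensors. A minimal sketch, assuming the interleaved [re0, im0, re1, im1, ...] layout along dim 1 as in sec_calc above:

import torch

def sec_calc_complex(x):
    # pair up (real, imag) columns and view them as one complex tensor
    xc = torch.view_as_complex(x.reshape(x.shape[0], -1, 2).contiguous())
    power = xc.abs().pow(2)
    # peak-to-average power ratio per sample
    return power.max(dim=1).values / power.mean(dim=1)

x = torch.randn(4, 32)
# should agree with sec_calc as defined above
print(torch.allclose(sec_calc_complex(x), sec_calc(x, x.shape[1] // 2)))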
I hope I was clear, and that I can finally find some direction to solve my problem…
Thanks, guys!