Need advice on how to modify a GRU to make it work with raw vibration data

Hello,

I am trying to feed raw vibration data to a GRU layer. The idea is that the GRU extracts the features from the raw vibration signal itself, so the feature extraction can be optimized end-to-end for better predictions.

This is the GRU layer I have:

import torch
from torch import nn

class GruPeakFR(nn.Module):
    def __init__(self, num_inputs, num_hiddens=1, sigma=0.01):
        super().__init__()
        self.num_hiddens = num_hiddens
        init_weight = lambda *shape: nn.Parameter(torch.randn(*shape) * sigma)

        # Full fft is used below, so the feature size stays num_inputs
        # (an rfft would give num_inputs // 2 + 1 bins instead)
        self.rfft_size = num_inputs

        # One (input weights, hidden weights, bias) triple per gate
        triple = lambda: (init_weight(num_inputs, num_hiddens),
                          init_weight(num_hiddens, num_hiddens),
                          nn.Parameter(torch.zeros(num_hiddens)))
        self.W_xz, self.W_hz, self.b_z = triple()  # Update gate
        self.W_xr, self.W_hr, self.b_r = triple()  # Reset gate
        self.W_xh, self.W_hh, self.b_h = triple()  # Candidate hidden state
        self.input_dropout = nn.Dropout(0.0)  # note: never applied in forward()

    def forward(self, inputs, H=None):
        # inputs: (seq_len, batch_size, num_inputs)
        if H is None:
            H = torch.zeros((inputs.shape[1], self.num_hiddens), device=inputs.device)
        outputs = []
        for X in inputs:
            # Real part of the spectrum replaces the raw signal in the gates
            fft_x = torch.fft.fft(X).real

            # Update gate, computed in the frequency domain
            update_gate = (torch.matmul(fft_x, self.W_xz) +
                           torch.matmul(H, self.W_hz) + self.b_z)
            Z_fft = torch.sigmoid(update_gate)
            Z = torch.fft.ifft(Z_fft).real
            # calculate_the_peak is a helper defined elsewhere in my code;
            # it reduces the reconstructed signal to its peak value
            Z_peak = calculate_the_peak(Z)

            # Reset gate, same treatment (R_peak is currently unused;
            # R itself gates the previous hidden state below)
            reset_gate = (torch.matmul(fft_x, self.W_xr) +
                          torch.matmul(H, self.W_hr) + self.b_r)
            R_fft = torch.sigmoid(reset_gate)
            R = torch.fft.ifft(R_fft).real
            R_peak = calculate_the_peak(R)

            # Candidate hidden state
            candidate_hidden_gate = (torch.matmul(fft_x, self.W_xh) +
                                     torch.matmul(R * H, self.W_hh) + self.b_h)
            candidate_hidden_gate = torch.fft.ifft(candidate_hidden_gate).real

            H_tilde = torch.tanh(candidate_hidden_gate)
            H_tilde = calculate_the_peak(H_tilde)
            # Standard GRU interpolation, driven by the peak of the update gate
            H = Z_peak * H + (1 - Z_peak) * H_tilde
            outputs.append(H)

        outputs = torch.stack(outputs, dim=0)
        if outputs.isnan().any():
            print("GruPeakFR producing NaN values")
        return outputs, H

Since the sigmoid squashes the vibration signal, I apply the gates to the real part of the FFT instead (I also tried using the absolute value, i.e. the magnitude). After the sigmoid and tanh, I reconstruct the signal with the ifft and extract the max value of the recreated vibration signal.
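Roughly, calculate_the_peak does something like the following (just a sketch, assuming it takes the maximum absolute value along the last dimension; the exact implementation may differ):

def calculate_the_peak(signal):
    # Sketch of the helper used above: reduce the reconstructed signal
    # to its peak (maximum absolute) value along the last dimension,
    # keeping the dim so the result still broadcasts against H.
    return signal.abs().amax(dim=-1, keepdim=True)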

It does train, but it fails on the test set. I was thinking that this layer increases the risk of overfitting, and I tried different methods to prevent overfitting, but to no avail. I'm still new to working with RNNs and to changing how they work internally. If anyone with more GRU experience can give some tips about creating custom GRU layers tailored to a task like this, I would appreciate it.
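For reference, the usual regularization knobs for a layer like this can be wired in roughly as follows (a generic sketch; the sizes, learning rate, and penalty values are placeholders, and train_loader / criterion are assumed to exist):

model = GruPeakFR(num_inputs=1024, num_hiddens=8)  # placeholder sizes
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
                             weight_decay=1e-4)    # L2 penalty on the gate weights

for seq, target in train_loader:                   # assumed defined
    optimizer.zero_grad()
    outputs, _ = model(seq)
    loss = criterion(outputs, target)              # assumed defined
    loss.backward()
    # Gradient clipping also helps keep recurrent training stable
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()

Note that input_dropout only has an effect if it is actually applied to X (or fft_x) inside forward.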

One possibility is to replace this layer with a convolutional one and tune the kernel, but I would like to give this GRU one more shot before giving up on it. If you also have resources regarding GRUs that can help me better understand how to customize them, I am all ears.
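The convolutional version I have in mind would look roughly like this (a sketch; the channel, kernel, and stride sizes are placeholders):

class ConvGruExtractor(nn.Module):
    # Sketch of the convolutional alternative: Conv1d learns local features
    # from the raw vibration signal, a stock GRU models the sequence.
    def __init__(self, num_hiddens=8):
        super().__init__()
        self.conv = nn.Conv1d(in_channels=1, out_channels=16,
                              kernel_size=64, stride=8, padding=32)
        self.gru = nn.GRU(input_size=16, hidden_size=num_hiddens,
                          batch_first=True)

    def forward(self, x):
        # x: (batch, signal_length) raw vibration samples
        feats = torch.relu(self.conv(x.unsqueeze(1)))  # (batch, 16, steps)
        outputs, h = self.gru(feats.transpose(1, 2))   # (batch, steps, num_hiddens)
        return outputs, h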

Thanks !

Instead of a matmul on real values, can you try masking a percentage of the data with 0s and 1s? There are papers that talk about the efficacy of masking data this way.
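Something along these lines (just a sketch; the masking ratio is a placeholder):

def random_binary_mask(x, mask_ratio=0.15):
    # Zero out a random fraction of the input samples during training;
    # the 0/1 mask forces the model to learn patterns robust to missing data.
    mask = (torch.rand_like(x) > mask_ratio).float()
    return x * mask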

Oh interesting, I will look into this masking. Do you have links to these papers so I can read them?

Is it this one: Online additive updates with FFT-IFFT operator on the GRU neural networks | IEEE Conference Publication | IEEE Xplore?

@Mystify2823 there are many papers and articles on this.

It is used in places where you want the model to learn patterns without overfitting.
