I have this simple TensorFlow code and want to write the equivalent in PyTorch. I am stuck porting it and keep running into errors caused by tensor dimensions.

This is the TensorFlow code:

```
Bidirectional(GRU(units=50, return_sequences=True)),
tfa.layers.GroupNormalization(50),
Dropout(0.2),
Dense(units=1, activation='sigmoid')
```

How can I implement the same in PyTorch? I'm stuck at this step:

```
class GRU(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, n_layers, drop_prob=0.2):
        super(GRU, self).__init__()
        self.hidden_dim = hidden_dim
        self.n_layers = n_layers
        self.gru = nn.GRU(input_size=input_dim, hidden_size=hidden_dim, num_layers=n_layers, batch_first=True, bidirectional=True)
        self.gn = nn.GroupNorm(50, hidden_dim)
        self.dr = nn.Dropout(drop_prob)
        self.lin = nn.Linear(input_dim, output_dim)
        self.sig = nn.Sigmoid()

    def forward(self, x, h):
        print(x.shape)
        print(h.shape)
        out, h = self.gru(x, h)
        print(out.shape)
        out = self.gn(out)
        out = self.lin(out)
        out = self.sig(out)
        return out, h
```

I get this output:

```
torch.Size([32, 64, 7])
torch.Size([2, 32, 64])
torch.Size([32, 64, 128])
```

and this error:

```
15 out, h = self.gru(x, h)
16 print(out.shape)
---> 17 out = self.gn(out)
18 out = self.lin(out)
19 out = self.sig(out)
Expected number of channels in input to be divisible by num_groups, but got input of shape [32, 64, 128] and num_groups=50
```