This is a technical proof of concept prior to experimenting with an MoE RNN.

Essentially, I want to run several RNNs in parallel on the same input sequence. The caveat is that the number of RNNs is not known in advance, hence the for loop.

```
import math

import torch
import torch.nn as nn


class pRNN(nn.Module):
    def __init__(self, input_size=0, hidden_size=0, num_layers=1, bidirectional=False, dropout=0, subunit_count=1):
        super().__init__()
        self.subunit_count = subunit_count
        subunit_size = math.ceil(hidden_size / subunit_count)
        rnns = []
        hidden_size_remaining = hidden_size
        for _ in range(self.subunit_count):
            # The last subunit takes whatever is left, so the sizes sum to hidden_size.
            size = min(hidden_size_remaining, subunit_size)
            hidden_size_remaining -= size
            rnns.append(nn.GRU(input_size=input_size, hidden_size=size, num_layers=num_layers, bidirectional=bidirectional, dropout=dropout))
        self.rnn = nn.ModuleList(rnns)

    def forward(self, x):
        # Every subunit sees the same input; outputs and hidden states are
        # concatenated along the feature dimension.
        results = [rnn(x) for rnn in self.rnn]
        out = torch.cat([o for o, _ in results], dim=-1)
        hidden = torch.cat([h for _, h in results], dim=-1)
        return out, hidden
```

Unfortunately, this is roughly my fifth failed attempt at a technique that runs in parallel. No matter what I try, the RNNs run sequentially, even after reducing the already minimal batch size of 128 by the number of RNNs. I am running on a single V100.
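For reference, one direction that might be worth testing is to fuse equally sized subunits into a single module whose recurrent weights are stacked per group, so that each time step is one `torch.bmm` call instead of one launch per RNN. The sketch below is only that: `GroupedGRU` is a hypothetical name, and it assumes a single layer, one direction, no dropout, equal subunit sizes, and a simplified initialisation. It still loops over time steps in Python and gives up the fused cuDNN kernel behind `nn.GRU`, so whether it is actually faster on a V100 would have to be measured.

```python
import math

import torch
import torch.nn as nn


class GroupedGRU(nn.Module):
    """Hypothetical sketch: G independent single-layer GRUs of equal size,
    evaluated together with batched matmuls (gate order r, z, n as in nn.GRU)."""

    def __init__(self, input_size, subunit_size, subunit_count):
        super().__init__()
        self.G, self.H = subunit_count, subunit_size
        G, H = self.G, self.H
        bound = 1.0 / math.sqrt(H)
        # Input weights of all groups and gates side by side: (input_size, G * 3H).
        self.w_ih = nn.Parameter(torch.empty(input_size, G * 3 * H).uniform_(-bound, bound))
        # Recurrent weights, one independent block per group: (G, H, 3H).
        self.w_hh = nn.Parameter(torch.empty(G, H, 3 * H).uniform_(-bound, bound))
        # Assumption: zero biases for simplicity (nn.GRU initialises them uniformly).
        self.b_ih = nn.Parameter(torch.zeros(G * 3 * H))
        self.b_hh = nn.Parameter(torch.zeros(G, 1, 3 * H))

    def forward(self, x):
        # x: (seq_len, batch, input_size), the default nn.GRU layout.
        T, B, _ = x.shape
        G, H = self.G, self.H
        # One matmul covers the input projection for every group and time step.
        gi_all = (x @ self.w_ih + self.b_ih).view(T, B, G, 3 * H).transpose(1, 2)  # (T, G, B, 3H)
        h = x.new_zeros(G, B, H)
        outs = []
        for t in range(T):
            # One batched matmul advances all groups by one time step.
            gh = torch.bmm(h, self.w_hh) + self.b_hh  # (G, B, 3H)
            i_r, i_z, i_n = gi_all[t].chunk(3, dim=-1)
            h_r, h_z, h_n = gh.chunk(3, dim=-1)
            r = torch.sigmoid(i_r + h_r)
            z = torch.sigmoid(i_z + h_z)
            n = torch.tanh(i_n + r * h_n)
            h = (1 - z) * n + z * h
            outs.append(h)
        # Lay the groups out along the feature dimension, like torch.cat over subunits.
        out = torch.stack(outs).permute(0, 2, 1, 3).reshape(T, B, G * H)
        hidden = h.permute(1, 0, 2).reshape(1, B, G * H)
        return out, hidden


if __name__ == "__main__":
    model = GroupedGRU(input_size=32, subunit_size=16, subunit_count=4)
    out, hidden = model(torch.randn(10, 128, 32))
    print(out.shape, hidden.shape)
```

The design choice here is to trade the per-RNN cuDNN kernels for fewer, larger matmuls. That only pays off if the individual subunits are too small to keep the GPU busy on their own.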

Any suggestions are appreciated.