I have been working on video compression lately. Even though a 3D CNN is an option, I am eager to use something like ConvGRU (a GRU whose gates are convolutions instead of dense layers). The concept has been around for almost half a decade now, but I am unable to find any ready-to-use ConvGRU module in PyTorch.
Even though I have found Git repos containing ConvGRU models, their architectures assume by default that there isn't any spatial max pooling between layers, so spatially compressing the data becomes tedious.
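To make the question concrete, here is a minimal sketch of what I have in mind: a ConvGRU cell with max pooling between stacked layers. The names `ConvGRUCell` and `PooledConvGRUEncoder` are my own placeholders, not an existing PyTorch API, and the kernel size, two-layer depth, and gate layout are assumptions rather than a reference implementation.

```python
import torch
import torch.nn as nn


class ConvGRUCell(nn.Module):
    """GRU cell whose gates are 2D convolutions instead of dense layers (hypothetical name)."""

    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        padding = kernel_size // 2  # keeps height/width unchanged for odd kernel sizes
        self.hidden_channels = hidden_channels
        # One convolution produces both the update gate z and the reset gate r.
        self.gates = nn.Conv2d(in_channels + hidden_channels, 2 * hidden_channels,
                               kernel_size, padding=padding)
        # A separate convolution produces the candidate hidden state.
        self.candidate = nn.Conv2d(in_channels + hidden_channels, hidden_channels,
                                   kernel_size, padding=padding)

    def forward(self, x, h=None):
        if h is None:
            # Start from a zero hidden state with the same spatial size as the input.
            h = x.new_zeros(x.size(0), self.hidden_channels, x.size(2), x.size(3))
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        n = torch.tanh(self.candidate(torch.cat([x, r * h], dim=1)))
        # Assumed convention: z weights the new candidate, (1 - z) the old state.
        return (1 - z) * h + z * n


class PooledConvGRUEncoder(nn.Module):
    """Two ConvGRU layers with 2x2 max pooling after each one (hypothetical layout)."""

    def __init__(self, in_channels, hidden_channels):
        super().__init__()
        self.cell1 = ConvGRUCell(in_channels, hidden_channels)
        self.cell2 = ConvGRUCell(hidden_channels, hidden_channels)
        self.pool = nn.MaxPool2d(2)

    def forward(self, video):
        # video: (batch, time, channels, height, width)
        h1 = h2 = None
        for t in range(video.size(1)):
            h1 = self.cell1(video[:, t], h1)
            # The second layer sees a pooled input, so its state lives at half resolution.
            h2 = self.cell2(self.pool(h1), h2)
        return self.pool(h2)
```

With this layout each layer keeps its hidden state at its own resolution, so a `(2, 5, 3, 16, 16)` clip passed through `PooledConvGRUEncoder(3, 8)` comes out as a `(2, 8, 4, 4)` code.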
Any help would be appreciated.
On a second note, if I am to implement an autoencoder using ConvGRU layers, would replacing Conv2d with ConvTranspose2d inside the ConvGRU modules do the trick? Or should I just carry on with ConvGRU and only use MaxUnpool2d() to build the decoder part?
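For comparison, here is a small sketch of the two upsampling routes I am weighing for the decoder, applied to a single feature map. The tensor sizes and channel counts are arbitrary assumptions. Note that MaxUnpool2d needs the indices returned by the matching MaxPool2d (created with `return_indices=True`), so the encoder would have to pass them along.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 8, 16, 16)  # stand-in for a ConvGRU hidden state

# Option A: MaxUnpool2d, which places each pooled value back at its saved index
# and fills the rest of the output with zeros.
pool = nn.MaxPool2d(2, return_indices=True)
unpool = nn.MaxUnpool2d(2)
pooled, indices = pool(x)
restored = unpool(pooled, indices, output_size=x.shape)

# Option B: a strided transposed convolution as a learned upsampling step.
up = nn.ConvTranspose2d(8, 8, kernel_size=2, stride=2)
upsampled = up(pooled)

print(pooled.shape, restored.shape, upsampled.shape)
```

Both options bring the 8x8 pooled map back to 16x16; the difference is that option A has no learnable parameters and depends on the encoder's indices, while option B learns its own upsampling filter.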