Convert TwoStream Inception I3D from Keras to Pytorch

Hello @ptrblck,
x1 and x2 shapes are :

print(x1.shape, x2.shape)

=> torch.Size([2, 11, 2]), torch.Size([2, 11, 2])
  • 1st dimension (2) represents the batch_size
  • 2nd dimension (11) represents the number of classes.
  • I don’t know what the 3rd dimension represents.

Here is a reminder of the logits layer:

end_point = 'Logits'
self.avg_pool = nn.AvgPool3d(kernel_size=[2, 7, 7],
                              stride=(1, 1, 1))
self.dropout = nn.Dropout(dropout_keep_prob)
self.logits = Unit3D(in_channels=384+384+128+128, output_channels=self._num_classes,
                      kernel_shape=[1, 1, 1],
                      padding=0,
                      activation_fn=None,
                      use_batch_norm=False,
                      use_bias=True,
                      name='logits')
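To see where that trailing dimension of size 2 could come from, here is a minimal shape trace, assuming the feature map entering the Logits block has 3 time steps left (20 input frames downsampled by a factor of 8 in the temporal max pools) and that Unit3D wraps a plain 1×1×1 nn.Conv3d when batch norm and the activation are disabled. These are assumptions about your network, not facts from the snippet:

```python
import torch
import torch.nn as nn

# Hypothetical feature map entering the Logits block:
# 20 frames / temporal downsampling of 8 -> ceil(20 / 8) = 3 time steps.
features = torch.randn(2, 1024, 3, 7, 7)  # [N, C, T, H, W], C = 384+384+128+128

avg_pool = nn.AvgPool3d(kernel_size=[2, 7, 7], stride=(1, 1, 1))
pooled = avg_pool(features)  # [2, 1024, 2, 1, 1]: T shrinks to 3 - 2 + 1 = 2

# Assumed stand-in for Unit3D with use_batch_norm=False, activation_fn=None
logits_conv = nn.Conv3d(1024, 11, kernel_size=1, bias=True)
logits = logits_conv(pooled)        # [2, 11, 2, 1, 1]
out = logits.squeeze(4).squeeze(3)  # [2, 11, 2] -- matches x1 / x2
```

If this holds, the 3rd dimension would just be the leftover temporal axis, since the AvgPool3d kernel of 2 does not cover all 3 remaining time steps.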

My inputs are the following:

rgb_clips: torch.Size([2, 3, 20, 224, 224])
flow_clips: torch.Size([2, 2, 20, 224, 224])
  • 1st dimension represents the batch_size
  • 2nd dimension the channels (3 for rgb, 2 for optical flow)
  • 3rd represents the number of frames that were selected from the clips
  • 4th dimension the width
  • 5th the height

My problem seems to be similar to this one: How to Concatenate layers in PyTorch similar to tf.keras.layers.Concatenate
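In case it helps frame the question: one common way to fuse the two streams' logits (a sketch, not necessarily what your Keras model does) is to collapse that trailing temporal axis with a mean and then sum or average the stream outputs:

```python
import torch

# x1 / x2 stand in for the RGB and flow logits, shape [batch, classes, time]
x1 = torch.randn(2, 11, 2)
x2 = torch.randn(2, 11, 2)

# Average the per-time-step logits, then late-fuse the two streams
fused = x1.mean(dim=2) + x2.mean(dim=2)  # [2, 11]
```

Whether averaging over dim 2 is correct depends on what that dimension actually is, which is exactly what I am unsure about.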