Convert TwoStream Inception I3D from Keras to Pytorch

yellowishlight · May 14, 2020, 8:51am

Hello @ptrblck,
The shapes of x1 and x2 are:

print(x1.shape, x2.shape)

=> torch.Size([2, 11, 2]), torch.Size([2, 11, 2])

The 1st dimension (2) is the batch size.
The 2nd dimension (11) is the number of classes.
I don’t know what the 3rd dimension represents.

As a reminder, here is the logits layer:

end_point = 'Logits'
self.avg_pool = nn.AvgPool3d(kernel_size=[2, 7, 7],
                              stride=(1, 1, 1))
# Note: nn.Dropout expects the probability of dropping a unit, not a keep probability.
self.dropout = nn.Dropout(dropout_keep_prob)
self.logits = Unit3D(in_channels=384+384+128+128, output_channels=self._num_classes,
                      kernel_shape=[1, 1, 1],
                      padding=0,
                      activation_fn=None,
                      use_batch_norm=False,
                      use_bias=True,
                      name='logits')
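One possible reading of the 3rd dimension is that it is the temporal axis left over after the average pool. The sketch below is only an illustration under assumptions: the feature map entering the pool is assumed to have 3 time steps and a 7x7 spatial size, a plain `nn.Conv3d` stands in for `Unit3D`, and the two `squeeze` calls are assumed to mirror what the forward pass does.

```python
import torch
import torch.nn as nn

num_classes = 11
in_channels = 384 + 384 + 128 + 128  # 1024

# Assumed feature map entering the logits stage:
# (batch, channels, time, height, width). The temporal size 3 is an assumption.
features = torch.randn(2, in_channels, 3, 7, 7)

avg_pool = nn.AvgPool3d(kernel_size=(2, 7, 7), stride=(1, 1, 1))
# Stand-in for Unit3D with kernel_shape=[1, 1, 1]: a plain 1x1x1 convolution.
logits_conv = nn.Conv3d(in_channels, num_classes, kernel_size=1, bias=True)

pooled = avg_pool(features)             # time: 3 - 2 + 1 = 2, space: 7 - 7 + 1 = 1
logits = logits_conv(pooled)            # (2, 11, 2, 1, 1)
logits = logits.squeeze(3).squeeze(3)   # drop the two spatial axes -> (2, 11, 2)

# One common way to get a single score per class: average over the time axis.
clip_logits = logits.mean(dim=2)        # (2, 11)
print(logits.shape, clip_logits.shape)
```

If this reading is right, reducing over `dim=2` (mean or max) would give one score per class for each clip.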

My inputs have the following shapes:

rgb_clips: torch.Size([2, 3, 20, 224, 224])
flow_clips: torch.Size([2, 2, 20, 224, 224])

The 1st dimension is the batch size,
the 2nd is the number of channels (3 for RGB, 2 for optical flow),
the 3rd is the number of frames selected from each clip,
the 4th is the height,
and the 5th is the width.
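Since the original model comes from Keras, it may help to check the axis order when porting. The sketch below assumes the Keras model used a channels-last layout (batch, frames, height, width, channels). That is an assumption, not something confirmed here, and the `*_channels_last` names are hypothetical. It shows how such a batch could be reordered into the channels-first layout that `nn.Conv3d` expects.

```python
import torch

# Hypothetical Keras-style batches: (batch, frames, height, width, channels).
rgb_channels_last = torch.randn(2, 20, 224, 224, 3)
flow_channels_last = torch.randn(2, 20, 224, 224, 2)

# Reorder to (batch, channels, frames, height, width) for nn.Conv3d.
rgb_clips = rgb_channels_last.permute(0, 4, 1, 2, 3).contiguous()
flow_clips = flow_channels_last.permute(0, 4, 1, 2, 3).contiguous()

print(rgb_clips.shape, flow_clips.shape)
```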

My problem seems similar to this one: "How to Concatenate layers in PyTorch similar to tf.keras.layers.Concatenate"
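For reference, the closest PyTorch counterpart to `tf.keras.layers.Concatenate` is `torch.cat`. The sketch below uses random stand-ins for x1 and x2 with the shapes printed above. Which axis to concatenate on depends on what the Keras model did, so `dim=1` is only an assumption, and the averaged variant is shown as an alternative way of fusing two streams rather than as the original model's method.

```python
import torch

# Random stand-ins for the two stream outputs, with the shapes reported above.
x1 = torch.randn(2, 11, 2)  # e.g. RGB stream logits
x2 = torch.randn(2, 11, 2)  # e.g. optical-flow stream logits

# Concatenate along the class/channel axis (assumed axis): (2, 22, 2).
concatenated = torch.cat([x1, x2], dim=1)

# Alternative fusion: element-wise average keeps the shape at (2, 11, 2).
averaged = (x1 + x2) / 2

print(concatenated.shape, averaged.shape)
```

All dimensions other than the one passed as `dim` must match for `torch.cat` to succeed.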