Is my TimeDistributed layer doing the same thing as a convolutional layer with a (1,1) kernel?

So after using multiple convolutional layers on an image, I end up with a tensor of shape (b, c, x, y). I now want each spatial position, with its channels, to go through the same linear layers independently.

I believe I have accomplished this with a time-distributed layer via this code: (Please excuse my coding verbosity)

class TimeDistributed(nn.Module):
    def __init__(self, module):
        super(TimeDistributed, self).__init__()
        self.module = module

    def forward(self, x):
        # input shape = (batch, channel, h, w)
        b, c, h, w = x.size()
        # move channels last and flatten the spatial dims: (b, h*w, c)
        x = x.permute(0, 2, 3, 1).contiguous().view(b, h * w, c)
        # apply the module independently at every spatial position
        x = self.module(x)
        # reshape back to (b, c_out, h, w); the module may change the channel count
        x = x.permute(0, 2, 1).contiguous()
        x = x.view(b, -1, h, w)
        return x

Here the neural module is a series of linear layers. I believe this should work, since nn.Linear accepts 3D input and applies its transformation over the last dimension (correct me if I am wrong).
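A quick sketch to confirm that claim (the sizes here are made up for illustration): nn.Linear only looks at the last dimension, so a (batch, positions, channels) tensor gets the same linear map applied independently at every position.

```python
import torch
import torch.nn as nn

# hypothetical sizes: batch=4, 25 spatial positions (5x5), 16 channels
lin = nn.Linear(16, 32)
x = torch.randn(4, 25, 16)

# the linear map is applied over the last dim at each of the 25 positions
y = lin(x)
print(y.shape)  # torch.Size([4, 25, 32])
```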

Now that I have done this, however, I am wondering whether I could have accomplished the same thing with a conv layer with a (1,1) kernel. Is my thinking correct? Are there any advantages or disadvantages to either technique?

Let future me answer past me: the answer is yes. A 1×1 convolution applies the same linear map at every spatial position, which is exactly what the time-distributed linear layer does over the spatial dimensions. If you are in this situation, I suggest the 1×1 conv: it skips the permute/reshape round trip, so it is more efficient.
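The equivalence can be checked numerically. This sketch (with made-up sizes) copies a Linear layer's weights into a Conv2d with a (1,1) kernel and compares the two paths on the same input:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
b, c_in, c_out, h, w = 2, 8, 4, 5, 5
x = torch.randn(b, c_in, h, w)

# per-position linear layer (the time-distributed approach)
linear = nn.Linear(c_in, c_out)

# 1x1 convolution carrying the same weights and bias
conv = nn.Conv2d(c_in, c_out, kernel_size=1)
with torch.no_grad():
    conv.weight.copy_(linear.weight.view(c_out, c_in, 1, 1))
    conv.bias.copy_(linear.bias)

# time-distributed path: flatten spatial dims, apply linear, reshape back
y_lin = linear(x.permute(0, 2, 3, 1).reshape(b, h * w, c_in))
y_lin = y_lin.permute(0, 2, 1).reshape(b, c_out, h, w)

# 1x1 conv path: no reshaping needed
y_conv = conv(x)

print(torch.allclose(y_lin, y_conv, atol=1e-6))  # True
```

The outputs match to floating-point tolerance, and the conv path avoids the two permute/contiguous copies entirely.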