Does PyTorch optimize the groups parameter in convs?

(XingChen) #1

Does PyTorch optimize the groups parameter? If so, the MobileNet architecture could be implemented efficiently.


No, the groups parameter is currently handled by a naive for loop.
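To illustrate what a per-group loop means, here is a minimal sketch. It is not PyTorch's internal code, only an assumed equivalent written with public calls: the input channels and the filters are split into `groups` chunks, each pair is convolved independently, and the results are concatenated. The sizes and variable names are made up for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

groups = 4
conv = nn.Conv2d(8, 16, kernel_size=3, groups=groups, bias=False)
x = torch.randn(2, 8, 10, 10)

with torch.no_grad():
    # Reference: the built-in grouped convolution.
    expected = conv(x)

    # Per-group loop: split input channels and filters into `groups`
    # chunks, convolve each pair on its own, then concatenate.
    x_chunks = x.chunk(groups, dim=1)            # each chunk: (2, 2, 10, 10)
    w_chunks = conv.weight.chunk(groups, dim=0)  # each chunk: (4, 2, 3, 3)
    outputs = [F.conv2d(xc, wc) for xc, wc in zip(x_chunks, w_chunks)]
    looped = torch.cat(outputs, dim=1)

# True when the loop reproduces the grouped convolution.
matches = torch.allclose(expected, looped, atol=1e-5)
```

With `groups` equal to the number of input channels, such a loop runs one tiny convolution per channel, which is why it gets slow for depthwise layers.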

(Lolong) #3

Does this do a depthwise convolution when the groups parameter is set to the input depth?

I couldn't find any information about this new feature in the PyTorch docs. I imagine that if I do

import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, stride=2),
    nn.MaxPool2d(kernel_size=3, stride=2),  # output has 64 channels
    # will this layer use SpatialDepthWiseConvolution instead of the group loop?
    nn.Conv2d(64, 64, kernel_size=3, groups=64),
    # ... next layers
)

Will this model use SpatialDepthWiseConvolution?


(Francisco Massa) #4

No, it won’t use that depthwise function, just the standard code path for groups.
Also, note that the existing depthwise convolution implementation is very naive, and will be as slow as setting groups to 64 in your case.

(Feras Almasri) #5

I didn’t really get your point here.
So if he used groups=64, doesn’t that mean each output channel is convolved with only one input channel?
I’m testing this function, but the network is not converging, so I really want to know whether the code is working or not.
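One way to check this directly is to perturb a single input channel and see which output channels change. The following is a minimal sketch under assumed sizes (64 channels, 8x8 input, channel index 5 chosen arbitrarily). It only tests channel independence and says nothing about why a given network fails to converge.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

channels = 64
depthwise = nn.Conv2d(channels, channels, kernel_size=3, groups=channels)

# With groups == in_channels, each filter sees exactly one input channel.
weight_shape = tuple(depthwise.weight.shape)  # (64, 1, 3, 3)

x = torch.randn(1, channels, 8, 8)
x_perturbed = x.clone()
x_perturbed[:, 5] += torch.randn(8, 8)  # modify input channel 5 only

with torch.no_grad():
    # Largest absolute change per output channel.
    diff = (depthwise(x_perturbed) - depthwise(x)).abs().amax(dim=(0, 2, 3))

# Indices of output channels affected by the perturbation.
changed = torch.nonzero(diff > 1e-6).flatten().tolist()
```

If only output channel 5 appears in `changed`, each output channel depends on a single input channel, which is the depthwise behaviour regardless of which code path computes it.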