Can I directly load pre-trained weights from a standard convolution into a dilated convolution?

lity · March 8, 2019, 3:05am

Hi, I want to experiment with dilated convolutions. However, I don't have the time or the GPUs to pre-train on ImageNet, so I want to load ImageNet pre-trained weights, even though they come from standard convolutions with a dilation rate of 1. A 3x3 standard conv (rate 1) and a 3x3 dilated conv (rate 2) have the same number of parameters.

I tried it, and the dilated convolution seems to load the weights successfully. However, the performance is very bad and the network does not work properly. So I want to know: can I directly load pre-trained weights from a standard convolution into a dilated convolution? Will it cause problems? Or is the performance poor simply because the dilated network was never actually pre-trained on ImageNet?
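For reference, a minimal sketch of what "loading succeeds but behaves differently" can look like. The layer sizes and the randomly initialised `standard` layer are assumptions standing in for a real pre-trained checkpoint; only `nn.Conv2d`, `state_dict` and `load_state_dict` are actual PyTorch API.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a pre-trained layer (hypothetical: real weights would come
# from an ImageNet checkpoint, not from random initialisation).
standard = nn.Conv2d(3, 8, kernel_size=3, padding=1, dilation=1)

# Same 3x3 kernel, but with dilation=2. padding=2 keeps the spatial size
# of the output equal to that of the input.
dilated = nn.Conv2d(3, 8, kernel_size=3, padding=2, dilation=2)

# Loading works because both layers hold a weight of shape (8, 3, 3, 3)
# and a bias of shape (8,). Dilation is not stored in the state dict.
dilated.load_state_dict(standard.state_dict())

x = torch.randn(1, 3, 32, 32)
with torch.no_grad():
    out_standard = standard(x)
    out_dilated = dilated(x)

same_weights = torch.equal(standard.weight, dilated.weight)
same_shape = out_standard.shape == out_dilated.shape
same_values = torch.allclose(out_standard, out_dilated)
print(same_weights, same_shape, same_values)
```

The weights and output shapes match, but the output values do not, since each weight is now applied to a different input location.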

ptrblck · March 11, 2019, 5:09pm

I guess the performance is bad, as the activations might be completely different using dilated convolutions.
Are you just checking the performance or are you also fine tuning the model?
The latter might help adapt the dilated convolutions to your use case.
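One way to see why the activations can differ: with dilation d, a k x k kernel samples inputs spread over an extent of k + (k - 1)(d - 1) pixels, while the number of weights stays the same. A rough sketch of that arithmetic follows; the helper names are hypothetical and not part of PyTorch.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel along one dimension."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Number of weights in a square 2D conv kernel (bias excluded).

    Dilation does not appear here: it changes where the weights are
    applied, not how many there are.
    """
    return in_channels * out_channels * kernel_size * kernel_size


extent_rate1 = effective_kernel_size(3, 1)  # standard 3x3 conv
extent_rate2 = effective_kernel_size(3, 2)  # 3x3 conv with dilation 2
weights = conv_weight_count(64, 64, 3)      # identical for both rates

print(extent_rate1, extent_rate2, weights)
```

So a 3x3 filter pre-trained on adjacent pixels ends up covering a 5x5 area with gaps at rate 2, which is a plausible reason the pre-trained features no longer match without fine-tuning.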

lity · March 14, 2019, 1:47am

I fine-tuned the model and the result is a little bit bad. Perhaps I need to train from scratch on ImageNet?

MingHongL · May 9, 2019, 2:04am

Hi,
I also did some experiments with dilated convolutions, but I found that training with dilated convolutions takes several times longer than with convolutions without dilation. I want to know if this is normal.
If you know the answer, could you tell me? Thanks!

ptrblck · May 9, 2019, 11:00am

You might solve the performance issue by setting torch.backends.cudnn.benchmark = True, if your input sizes are static (i.e. they don’t change in each iteration).
Have a look at this issue for more information.

This issue might be fixed by cuDNN 7.5.1, which was shipped in PyTorch 1.1.
Please give it a try.
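A minimal sketch of the flag mentioned above, set once near the top of the training script before the model runs. Printing the versions is only a convenience for checking which PyTorch and cuDNN builds are in use; `torch.backends.cudnn.version()` may return None on builds without cuDNN.

```python
import torch

# Let cuDNN time the available convolution algorithms for each new input
# shape and cache the fastest one. This tends to help when input sizes are
# static; with varying shapes the repeated tuning can cost time instead.
torch.backends.cudnn.benchmark = True

benchmark_enabled = torch.backends.cudnn.benchmark
print("PyTorch:", torch.__version__)
print("cuDNN:", torch.backends.cudnn.version())
print("benchmark mode:", benchmark_enabled)
```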

MingHongL · May 10, 2019, 2:00am

Thanks for your reply! It is really helpful.