Pre-train a fully connected network

Chieh_Wu · January 10, 2018, 5:57pm

Hello

I am not getting good results training my simple fully connected network. I am aware that, since the loss is a highly non-convex function, I am probably just finding a local minimum. I have constructed my own ResNet with Kaiming initialization and ReLU as the activation function. In search of better results, I am now considering pre-training the network with an RBM, since I think it will regularize my optimization problem.

My question is whether anyone has experience doing pre-training in PyTorch. Does PyTorch have a built-in method to do it (I didn’t see any)? Is this even a good idea?

Thank you

ptrblck · January 10, 2018, 11:29pm

You could try fine tuning an already pre-trained network.
Depending on the amount of your data and its mismatch to the dataset used to pre-train the network, you can decide how many layers to fine tune.
The bigger the mismatch (e.g. pre-trained on “natural” images, while your dataset contains medical X-ray images), the more layers I would try to fine tune.

Here you can find a nice transfer learning tutorial.
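
As a rough illustration of choosing how many layers to fine tune, the sketch below freezes everything except the last few parameterized layers. The small `nn.Sequential` model and the helper `set_trainable_layers` are hypothetical stand-ins, not part of PyTorch; in practice the model would be one loaded with pre-trained weights.

```python
import torch
from torch import nn

# Hypothetical stand-in for a pre-trained network.
model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

def set_trainable_layers(model, num_trainable):
    """Leave only the last `num_trainable` parameterized layers trainable."""
    # Keep only child modules that own parameters (skips ReLU etc.).
    layers = [m for m in model.children() if len(list(m.parameters())) > 0]
    cutoff = len(layers) - num_trainable
    for i, layer in enumerate(layers):
        for p in layer.parameters():
            p.requires_grad = i >= cutoff
    # Return the parameters the optimizer should update.
    return [p for p in model.parameters() if p.requires_grad]

# Small mismatch to the pre-training data: fine tune only the last layer.
trainable = set_trainable_layers(model, num_trainable=1)
optimizer = torch.optim.SGD(trainable, lr=1e-3, momentum=0.9)
```

With a larger mismatch, a larger `num_trainable` would unfreeze more of the network; the learning rate and momentum here are arbitrary placeholder values.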

Chieh_Wu · January 11, 2018, 4:37am

That is an interesting idea for my image data. I think PyTorch has a model pre-trained on MNIST, right? I will use this idea for my CNN model. However, what if a pre-trained model doesn’t already exist? For example, what if my network is not a CNN but a simple fully connected ResNet? Then I assume I will have to do the pre-training myself.

ptrblck · January 11, 2018, 12:20pm

You can have a look at the various pre-trained models in torchvision.models.
I think these models are pre-trained on ImageNet.

You are right: if you have a specific network architecture, you have to train the network yourself.
At least, I don’t know of any obvious technique for transferring weights between different architectures.
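
For a network you pre-trained yourself, one simple option is to copy over only the parameters whose names and shapes match the new model and leave everything else at its fresh initialization. This is only a sketch under that assumption, not a general method for transferring between different architectures; `make_mlp` and `transfer_matching_weights` are hypothetical helpers.

```python
import torch
from torch import nn

def make_mlp(in_dim, hidden, out_dim):
    """A plain fully connected network used as a stand-in for your own model."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )

def transfer_matching_weights(source, target):
    """Copy parameters from `source` into `target` where name and shape agree."""
    src = source.state_dict()
    tgt = target.state_dict()
    matched = {k: v for k, v in src.items()
               if k in tgt and v.shape == tgt[k].shape}
    tgt.update(matched)
    target.load_state_dict(tgt)
    return sorted(matched)  # names of the entries that were copied

pretrained = make_mlp(32, 64, 10)   # imagine this one was trained already
new_model = make_mlp(32, 64, 3)     # same body, different output size
copied = transfer_matching_weights(pretrained, new_model)
```

Here the two hidden layers are copied, while the output layer keeps its new initialization because its shape differs.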