tldr:
I want to transfer trained weights from a conv layer like:
self.conv_init = nn.Conv1d(num_in_channels, num_hidden, 1)
to a new one with more input channels (initializing the extra channels with zeros or according to some initialization scheme), with everything else staying the same, e.g.:
self.conv_init = nn.Conv1d(num_in_channels+5, num_hidden, 1).
As far as I know, load_state_dict will either ignore that layer or throw an error for the size mismatch. Is there a built-in way to do the above? …Or another way?
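To make it concrete, what I'm after is something like the manual copy below: slot the old weights into the first input channels of the new layer and zero the rest (channel counts here are just placeholders):

```python
import torch
import torch.nn as nn

num_in_channels, num_hidden = 8, 32  # placeholder sizes

old_conv = nn.Conv1d(num_in_channels, num_hidden, 1)
new_conv = nn.Conv1d(num_in_channels + 5, num_hidden, 1)

with torch.no_grad():
    # new weight has shape (num_hidden, num_in_channels + 5, 1)
    nn.init.zeros_(new_conv.weight)  # or leave the default init for the new channels
    new_conv.weight[:, :num_in_channels, :] = old_conv.weight
    new_conv.bias.copy_(old_conv.bias)
```

This only touches the one mismatched layer, so the rest of the model could still be loaded normally with load_state_dict.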
story for context:
I’m working on building a net for time series data. The dataset contains sensor readings (let’s call those sensors ‘group a’) going back to the early '00s. At some point after '17, extra sensors (‘group b’) were added.
I’d like to train a model on the original group a (to take advantage of the two decades of data), and then use that as a starting point to train another model that also includes group b.
The input is a 1D multichannel tensor: a fixed-size sliding window over the data, where every channel is one sensor’s readings.
The 1st conv layer (shown above) has kernel size 1 and is used to create the hidden channels that the rest of the layers operate on; those layers stay the same between the two models.