(Anyone?) question: Can I update my model structure (not the trainable parameters) during the training process?

LongWU · March 29, 2021, 1:52pm

Hi there,
I want to know whether I can change the model structure during training. I know the trainable parameters get updated at every training step, and that’s not what I am asking about.

For example, say I have an initial model with an nn.Conv2d that takes an input of size (1, 3, 11, 22) and one nn.Linear(11, 22). During training I find that some features are unimportant or pure noise, so I need to change my nn.Linear to (9, 22).

Basically, I want the model to update its own structure, so the model after training might be totally different from the initial one. It would become more efficient by adjusting itself during training to focus on the important features rather than the noise.

AutoML is another story, and I think it’s a totally different strategy. Here I just want to make some minor adjustments to the model without AutoML.
Thanks for any advice.

ptrblck · March 31, 2021, 3:56am

Yes, you can create new nn.Parameters after each iteration manually.
However, you would need to pass these new parameters to the optimizer again, so that they’ll be updated. If the optimizer has internal state (e.g. momentum buffers), you could also try to reuse (some of) it, or just let the optimizer create fresh state for the new parameters.
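
A minimal sketch of that idea, assuming a single nn.Linear trained with SGD plus momentum. The `keep` indices are a hypothetical choice made only for illustration, and the state key `"momentum_buffer"` is specific to SGD; other optimizers store different entries. Simply constructing a new optimizer after the change is the simpler alternative if losing the state is acceptable.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(11, 22)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1, momentum=0.9)

# One ordinary training step, so the optimizer holds a momentum buffer.
layer(torch.randn(4, 11)).sum().backward()
optimizer.step()
optimizer.zero_grad()

# Hypothetical choice: keep input features 0..8 and drop features 9 and 10.
keep = torch.arange(9)

old_weight = layer.weight
new_weight = nn.Parameter(old_weight.detach()[:, keep].clone())
layer.weight = new_weight          # re-registers the parameter on the module
layer.in_features = keep.numel()   # keep the module's metadata consistent

# Point the optimizer at the new parameter and carry over the sliced momentum.
group = optimizer.param_groups[0]
group["params"] = [new_weight if p is old_weight else p for p in group["params"]]
old_state = optimizer.state.pop(old_weight)
optimizer.state[new_weight] = {
    "momentum_buffer": old_state["momentum_buffer"][:, keep].clone()
}

# Training continues on the reduced input.
out = layer(torch.randn(4, 11)[:, keep])
out.sum().backward()
optimizer.step()
```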

Also, you would have to come up with a strategy to change the parameters. For example, you could slice the weight matrix or reduce it in some other way (e.g. mean) to decrease the number of features, and you could concatenate a randomly initialized tensor to the weight matrix to increase its size.
Note that changing the out_features of one layer would also need changes in the subsequent layer in its in_features (and thus also internal parameters).
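
The slicing and concatenation described above could look roughly like this for two stacked nn.Linear layers. The helper name `resize_hidden` and its arguments are made up for this sketch, and giving the new hidden units zero outgoing weights (so the network output is unchanged right after growing) is an assumption of the example, not a requirement.

```python
import torch
import torch.nn as nn

def resize_hidden(fc1, fc2, keep=None, extra=0):
    """Return new (fc1, fc2) keeping hidden units `keep` and appending `extra` new ones."""
    if keep is None:
        keep = torch.arange(fc1.out_features)
    # Shrink: rows of fc1 (its out_features) and the matching columns of fc2.
    w1, b1 = fc1.weight.detach()[keep], fc1.bias.detach()[keep]
    w2 = fc2.weight.detach()[:, keep]
    if extra > 0:
        # Grow: randomly initialized incoming weights for the new hidden units.
        w1 = torch.cat([w1, 0.01 * torch.randn(extra, fc1.in_features)], dim=0)
        b1 = torch.cat([b1, torch.zeros(extra)], dim=0)
        # Assumption: zero outgoing weights, so the output is initially unchanged.
        w2 = torch.cat([w2, torch.zeros(fc2.out_features, extra)], dim=1)
    new_fc1 = nn.Linear(fc1.in_features, w1.shape[0])
    new_fc2 = nn.Linear(w2.shape[1], fc2.out_features)
    with torch.no_grad():
        new_fc1.weight.copy_(w1)
        new_fc1.bias.copy_(b1)
        new_fc2.weight.copy_(w2)
        new_fc2.bias.copy_(fc2.bias)
    return new_fc1, new_fc2

fc1, fc2 = nn.Linear(11, 22), nn.Linear(22, 5)
small1, small2 = resize_hidden(fc1, fc2, keep=torch.arange(16))  # 22 -> 16 hidden units
big1, big2 = resize_hidden(fc1, fc2, extra=4)                    # 22 -> 26 hidden units
```

The returned layers own new parameters, so they still have to be handed to the optimizer before training continues.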

Given that, it might be easier to use the functional API and work directly with all parameters.
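
A rough sketch of the functional approach, assuming a small two-layer network whose parameters live in a plain dict. The names `params`, `forward`, and `drop_inputs` are illustrative only. Because no nn.Module holds the shapes, changing the structure amounts to replacing a tensor in the dict and building a new optimizer over the current tensors.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
params = {
    "w1": (0.1 * torch.randn(22, 11)).requires_grad_(),
    "b1": torch.zeros(22, requires_grad=True),
    "w2": (0.1 * torch.randn(5, 22)).requires_grad_(),
    "b2": torch.zeros(5, requires_grad=True),
}

def forward(params, x):
    # The layer shapes are implied entirely by the tensors passed in.
    h = F.relu(F.linear(x, params["w1"], params["b1"]))
    return F.linear(h, params["w2"], params["b2"])

def drop_inputs(params, keep):
    """Return a new dict whose first layer only reads the input features in `keep`."""
    new_params = dict(params)
    new_params["w1"] = params["w1"].detach()[:, keep].clone().requires_grad_()
    return new_params

x = torch.randn(4, 11)
keep = torch.arange(9)  # hypothetical: features 9 and 10 were judged to be noise

params = drop_inputs(params, keep)
optimizer = torch.optim.SGD(list(params.values()), lr=0.1)  # fresh optimizer, fresh state

out = forward(params, x[:, keep])
loss = out.pow(2).mean()
loss.backward()
optimizer.step()
```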

LongWU · March 31, 2021, 9:24am

Thanks for the answer. I have thought of that, and it sounds very error-prone.
To keep training the existing parameters after an adjustment, you have to keep the right parameters and prune the unnecessary ones. I haven’t tried it yet, but considering the large number of training epochs, and hence the number of adjustments that need to be made, it’s very likely that the wrong parameters get kept for the next round of training.
I personally think adjusting the model in real time is very agile and should be an in-demand feature, so I wonder if there are any best practices or frameworks that already address this?