I want to reproduce the paper "Incremental Network Quantization".
In this paper, the authors quantize 50% of the parameters of a CNN to powers of two and freeze them (the other 50% of the parameters keep updating, to compensate for the accuracy loss caused by quantization).
I made a mask that indicates whether each parameter has been quantized.
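To make the setup concrete, here is a minimal sketch of the partition-and-quantize step as I understand it (the function name, the magnitude-based partition, and the rounding rule are my own illustration; the paper additionally clamps the exponents to a bit-width-dependent range, which I omit here):

```python
import torch

def quantize_pow2(weight, frac=0.5):
    """Quantize the largest `frac` of weights (by magnitude) to signed
    powers of two; return the quantized tensor and a 0/1 mask marking
    the quantized positions (1 = quantized/frozen)."""
    flat = weight.abs().flatten()
    k = int(frac * flat.numel())
    # threshold: the k largest-magnitude weights get quantized first
    thresh = flat.topk(k).values.min()
    mask = (weight.abs() >= thresh).float()
    # snap each magnitude to the nearest power of two, keep the sign
    exp = torch.round(torch.log2(weight.abs().clamp(min=1e-12)))
    pow2 = torch.sign(weight) * torch.pow(2.0, exp)
    return weight * (1 - mask) + pow2 * mask, mask
```

The returned mask is what I then reuse for the gradient question below.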
My code is like this:
qua_weight = qua_tensor(weight, pos_shreshold, mask_weight, max_ind, 2**3)
net.state_dict()['features.0.weight'].data = qua_weight
The code runs without errors, but the line
net.state_dict()['features.0.weight'].data = qua_weight
does not actually modify the weights of the first layer.
Why, and what should I do?
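For reference, here is a self-contained sketch of the in-place alternative I am comparing against, using `copy_()` (the model and layer names are illustrative toys, not my actual network):

```python
import torch
import torch.nn as nn

# Toy one-layer stand-in for the real network.
net = nn.Sequential(nn.Conv2d(1, 2, 3))
qua_weight = torch.zeros_like(net[0].weight)

# An in-place copy writes the quantized values into the parameter's
# own storage, so the layer sees them on the next forward pass.
with torch.no_grad():
    net[0].weight.copy_(qua_weight)

# Equivalently, through the state-dict key, since the state dict's
# tensors share storage with the parameters:
with torch.no_grad():
    net.state_dict()['0.weight'].copy_(qua_weight)
```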
I also need to modify the gradients. A very kind person on this forum told me that I should use the register_hook function. In fact, I instead multiply the gradient by the mask directly, and my solution also seems to work. Am I right?
The mask is a tensor of 0 and 1 elements (one mask per layer's weights). My code is essentially layer.weight.grad × mask, followed by optimizer.step().
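To show exactly what I mean, here are the two variants side by side in a minimal sketch (toy layer; the mask here is random for illustration, with 0 at positions that should stop updating):

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 4)
mask = torch.randint(0, 2, layer.weight.shape).float()

# Variant 1 (suggested to me): a hook that multiplies the gradient by
# the mask while backward() runs.
layer.weight.register_hook(lambda g: g * mask)

# Variant 2 (what I do now): scale the accumulated gradient in place
# after backward(), just before optimizer.step().
def mask_grad_inplace():
    layer.weight.grad.mul_(mask)
```

Either way, the gradient ends up zero at the masked (quantized) positions before the optimizer update.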