Hello Cyril,
As far as I am aware, the pruning module does not prune models by actually removing weights. Instead, it adds two new tensors to the module: `weight_orig` and `weight_mask`. `weight_orig` is a parameter that stores the original weights, and `weight_mask` is a buffer of 0s and 1s marking which weights are pruned. The pruning is performed by a `forward_pre_hook` that recomputes `weight` as `weight_orig * weight_mask` before every forward pass, so the pruned entries contribute nothing to the output and receive zero gradient in the backward pass. So pruning currently requires a bit more memory and a bit more computation, not less.
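To see this for yourself, here is a minimal sketch using `torch.nn.utils.prune` on a small `nn.Linear` layer. The layer size and the 50% pruning amount are arbitrary choices for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
layer = nn.Linear(4, 3)  # toy layer, 12 weights (illustrative size)

# Mask out the 50% of weights with the smallest absolute value.
prune.l1_unstructured(layer, name="weight", amount=0.5)

param_names = [n for n, _ in layer.named_parameters()]
buffer_names = [n for n, _ in layer.named_buffers()]
print(param_names)   # 'weight_orig' has replaced 'weight' as a parameter
print(buffer_names)  # 'weight_mask' is stored alongside it as a buffer

# The effective weight is just the elementwise product of the two tensors.
same = torch.equal(layer.weight, layer.weight_orig * layer.weight_mask)

# The tensor is still dense: the pruned entries are zeros, not missing values.
num_total = layer.weight.numel()
num_zero = int((layer.weight == 0).sum())
print(same, num_total, num_zero)

# prune.remove makes the pruning permanent and drops the extra tensors,
# but 'weight' remains a dense tensor containing zeros.
prune.remove(layer, "weight")
print([n for n, _ in layer.named_parameters()])
```

Even after `prune.remove`, the layer still does a dense matrix multiply, which is why no speedup shows up by default.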
I recently asked a question myself about the most straightforward way to obtain actual speedups from pruning, but unfortunately it is not trivial (Discussion of practical speedup for pruning).