torch.nn.utils.prune has no effect on speed or memory consumption

I am developing a GPU-friendly app, so I pruned the model weights in the hope of reducing GPU memory usage and increasing inference speed.

Pruning strategy:

```python
parameters_to_prune = (
    (generator.dense_motion_network.hourglass.encoder.down_blocks[0].conv, 'weight'),
    (generator.dense_motion_network.hourglass.encoder.down_blocks[1].conv, 'weight'),
    (generator.dense_motion_network.hourglass.encoder.down_blocks[2].conv, 'weight'),
    (generator.dense_motion_network.hourglass.encoder.down_blocks[3].conv, 'weight'),
    (generator.dense_motion_network.hourglass.encoder.down_blocks[4].conv, 'weight'),
    (generator.dense_motion_network.hourglass.decoder.up_blocks[0].conv, 'weight'),
    (generator.dense_motion_network.hourglass.decoder.up_blocks[1].conv, 'weight'),
    (generator.dense_motion_network.hourglass.decoder.up_blocks[2].conv, 'weight'),
    (generator.dense_motion_network.hourglass.decoder.up_blocks[3].conv, 'weight'),
    (generator.dense_motion_network.hourglass.decoder.up_blocks[4].conv, 'weight'),
    (generator.first.conv, 'weight'),
    (generator.down_blocks[0].conv, 'weight'),
    (generator.down_blocks[1].conv, 'weight'),
    (generator.up_blocks[0].conv, 'weight'),
    (generator.up_blocks[1].conv, 'weight'),
)

prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.2,
)
```

Varying `amount` changes the final output, so the pruning itself seems to be applied correctly, but GPU memory usage and inference time are identical whether `amount` is 0.1 or 0.9.
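For reference, here is a minimal sketch of how I verified that the pruning is applied, using a toy stack of `Conv2d` layers as a stand-in for my generator (the module names here are placeholders, not my actual model). It counts the zeroed weights after `prune.global_unstructured` and also shows that each pruned module keeps a dense `weight` plus a `weight_orig`/`weight_mask` pair, which would be consistent with memory not going down:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy stand-in for the generator's conv blocks (placeholder modules).
convs = nn.ModuleList([nn.Conv2d(8, 8, 3, padding=1) for _ in range(4)])

parameters_to_prune = tuple((c, 'weight') for c in convs)
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.2,
)

# Fraction of weights zeroed out across all pruned tensors.
zeros = sum(float(torch.sum(c.weight == 0)) for c in convs)
total = sum(c.weight.nelement() for c in convs)
print(f"global sparsity: {zeros / total:.2%}")  # roughly 20%

# The pruned weight is still a dense tensor; prune only masks
# weight_orig with weight_mask on the fly.
print(hasattr(convs[0], 'weight_orig'), hasattr(convs[0], 'weight_mask'))
```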