Understanding the use of PyTorch pruning

Hello, everyone! I have a question I haven't been able to clarify for myself. Is the torch.nn.utils.prune module only meant to simulate how pruning algorithms would affect your model, or can you use it to actually decrease the size and inference time of your model?

I saw in a few discussions that unstructured pruning cannot do that yet, due to the sparse nature of the resulting weights, but from what I understood, you may be able to do so with structured pruning. However, I haven't been able to reduce the size or inference time with the structured version either, and I haven't seen any clear way to do it. So, tied in with the initial question: if there is a way to reduce size and time, what is a simple piece of code that prunes and saves a model so as to achieve this?
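To make the question concrete, here is roughly the workflow I have in mind (a minimal sketch on a toy model; the layer sizes, pruning amount, and file name are just placeholders). As far as I can tell, even after prune.remove the weight tensor keeps its original shape and just contains zeros, so neither the checkpoint size nor the inference time goes down:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy model, just to illustrate the workflow
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

# Structured pruning: zero out 50% of the rows (output channels) of the
# first Linear layer's weight, ranked by their L2 norm along dim=0
prune.ln_structured(model[0], name="weight", amount=0.5, n=2, dim=0)

# The layer now carries weight_orig and a weight_mask buffer;
# prune.remove folds the mask in permanently, but the result is still
# a dense tensor of the original shape that simply contains zeros
prune.remove(model[0], "weight")

print(model[0].weight.shape)  # unchanged shape, half the rows are zero

# Saving therefore stores the same number of parameters as before
torch.save(model.state_dict(), "pruned_model.pt")
```

Is there an additional step (e.g. physically rebuilding the layers without the zeroed channels, or converting to a sparse representation) that I'm missing?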