How can I make a model disregard neurons with zero weights?

In the paper “Learning both Weights and Connections for Efficient Neural Networks”, the authors mention that they mask the model so that it disregards neurons with zero weights. I’m trying to implement this pruning technique, but I found that neither the model size nor the inference time changes if I only apply the mask. Is it possible to make the model actually skip the neurons with zero weights to reduce inference time, or to save the model with sparse tensors to reduce its size?
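To make the two options concrete, here is a minimal sketch of what I have in mind. It is not the paper's implementation. It assumes a toy two-layer ReLU network held in plain NumPy arrays, and the helper names (`magnitude_mask`, `compact_layers`, `to_sparse`, `from_sparse`) and the threshold of 0.5 are made up for illustration. The idea is that masking only writes zeros into dense arrays, so to save time the dead neurons have to be physically sliced out, and to save space only the nonzero values and their positions should be stored.

```python
import numpy as np

def magnitude_mask(w, threshold):
    # Boolean mask: True where the weight is large enough to keep.
    return np.abs(w) > threshold

def compact_layers(w1, b1, w2):
    # w1: (hidden, inputs), b1: (hidden,), w2: (outputs, hidden).
    # A hidden neuron can be removed exactly when nothing reads its output
    # (all outgoing weights are zero), or when it always outputs zero under
    # ReLU (all incoming weights and its bias are zero).
    no_output = ~w2.any(axis=0)
    no_input = ~w1.any(axis=1) & (b1 == 0)
    keep = ~(no_output | no_input)
    return w1[keep], b1[keep], w2[:, keep]

def forward(x, w1, b1, w2):
    hidden = np.maximum(x @ w1.T + b1, 0.0)  # ReLU hidden layer
    return hidden @ w2.T

def to_sparse(w):
    # Store only the shape, the flat positions of nonzeros, and their values.
    idx = np.flatnonzero(w).astype(np.int32)
    return w.shape, idx, w.ravel()[idx]

def from_sparse(shape, idx, values):
    w = np.zeros(int(np.prod(shape)), dtype=values.dtype)
    w[idx] = values
    return w.reshape(shape)

rng = np.random.default_rng(0)
w1 = rng.normal(size=(8, 4)).astype(np.float32)
b1 = np.zeros(8, dtype=np.float32)
w2 = rng.normal(size=(3, 8)).astype(np.float32)

# Masking alone: the weights become zero but the arrays keep their shapes.
w1 = w1 * magnitude_mask(w1, 0.5)
w2 = w2 * magnitude_mask(w2, 0.5)

# Hypothetical: force a few neurons dead so there is something to remove.
w1[[1, 5]] = 0.0
w2[:, 6] = 0.0

# Physically drop the dead neurons, which shrinks the dense matrices.
w1_small, b1_small, w2_small = compact_layers(w1, b1, w2)

x = rng.normal(size=(2, 4)).astype(np.float32)
same = np.allclose(forward(x, w1, b1, w2),
                   forward(x, w1_small, b1_small, w2_small), atol=1e-5)
print("hidden units:", w1.shape[0], "->", w1_small.shape[0], "same output:", same)

# Sparse storage of a masked matrix, with a lossless round trip.
shape, idx, values = to_sparse(w1)
restored = from_sparse(shape, idx, values)
```

Slicing out whole neurons only helps when pruning leaves entire rows or columns at zero. Scattered zeros inside a dense matrix would still need a sparse matrix-multiply kernel to speed anything up, and index-plus-value storage only gets smaller than the dense array once the matrix is sparse enough to offset the cost of storing the indices.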