I have the same question. I'm not sure whether it is currently possible to speed up multiplication of sparse tensors. Global structured pruning could reduce the number of multiplications.
If anyone knows more about this, please take a look at my related question here: Global structured pruning.
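To illustrate why structured pruning can cut the multiplication count even when sparse kernels do not help, here is a minimal NumPy sketch. It is not any library's pruning API: the function names `global_structured_prune` and `matvec_mults` are hypothetical, rows are ranked by L2 norm as an assumed importance score, and each matrix is treated as an independent layer (a real network would also need the matching input columns of the next layer removed).

```python
import numpy as np

def global_structured_prune(weights, fraction):
    """Drop the given fraction of rows, ranked by L2 norm across ALL matrices.

    Hypothetical helper for illustration; returns smaller dense matrices.
    """
    # Score every row of every matrix: (norm, matrix index, row index).
    scores = [
        (float(np.linalg.norm(w[r])), li, r)
        for li, w in enumerate(weights)
        for r in range(w.shape[0])
    ]
    scores.sort()  # smallest norms first, compared globally
    n_drop = int(len(scores) * fraction)
    drop = {(li, r) for _, li, r in scores[:n_drop]}

    pruned = []
    for li, w in enumerate(weights):
        keep = [r for r in range(w.shape[0]) if (li, r) not in drop]
        pruned.append(w[keep])  # physically remove rows, result stays dense
    return pruned

def matvec_mults(weights):
    """Scalar multiplications needed for one dense W @ x per matrix."""
    return sum(w.shape[0] * w.shape[1] for w in weights)

rng = np.random.default_rng(0)
layers = [rng.normal(size=(4, 3)), rng.normal(size=(6, 3))]
pruned_layers = global_structured_prune(layers, fraction=0.5)

mults_before = matvec_mults(layers)
mults_after = matvec_mults(pruned_layers)
print(mults_before, mults_after)
```

Because whole rows are removed, the remaining matrices are smaller dense matrices, so the saving shows up with ordinary dense multiplication. Zeroing individual weights instead would leave the dense multiplication count unchanged unless a sparse kernel is used.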