Hello everyone, I am new to PyTorch, but I am loving the experience so far. Recently I have been trying to prune the TimeSformer model to get better inference times. I prune the model and save it as follows:
ARG = [12, 1, 'model.pyth']
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TimeSformer(img_size=224, num_classes=400, num_frames=8,
                    attention_type='divided_space_time', ARGS=ARG).to(device=device)
# model.head = torch.nn.Linear(in_features=768, out_features=50, bias=True)

num_zeros, num_elements, sparsity = measure_global_sparsity(model)
print(num_zeros, num_elements, sparsity)

amount = 0.9
for module_name, module in model.named_modules():
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=amount)

pruned_model = remove_parameters(model)

num_zeros, num_elements, sparsity = measure_global_sparsity(pruned_model)
print(num_zeros, num_elements, sparsity)

torch.save(pruned_model.state_dict(), "pruned_model_" + str(amount) + ".pyth")
measure_global_sparsity counts the number of zeros, the total number of elements, and the resulting sparsity before and after pruning, while remove_parameters makes the pruning permanent by removing the reparametrization (the weight_orig/weight_mask pairs) that prune adds:
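For reference, measure_global_sparsity is roughly equivalent to the following sketch (the helper name matches my code above; the body here is just an illustration that counts zero entries over all parameter tensors):

```python
import torch

def measure_global_sparsity(model):
    # Count zero-valued entries across every parameter tensor in the model
    num_zeros = 0
    num_elements = 0
    for param in model.parameters():
        num_zeros += torch.sum(param == 0).item()
        num_elements += param.nelement()
    sparsity = num_zeros / num_elements
    return num_zeros, num_elements, sparsity
```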
def remove_parameters(model):
    # Make pruning permanent: prune.remove() drops the *_orig/*_mask
    # reparametrization and leaves the pruned tensor as the plain parameter.
    for module_name, module in model.named_modules():
        if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear)):
            try:
                prune.remove(module, "weight")
            except ValueError:
                pass  # this module was never pruned on "weight"
            try:
                prune.remove(module, "bias")
            except ValueError:
                pass  # this module was never pruned on "bias"
    return model
When I prune the model, the size of the saved model does not change and the inference speed is also unaffected, although the measured sparsity does increase as I increase the pruning amount. Why are the model size and inference speed unaffected? Does anyone know what I am doing wrong here?
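For what it's worth, a minimal standalone check (with a toy Linear layer rather than the actual TimeSformer) reproduces the observation: after prune.l1_unstructured plus prune.remove, the weight tensor is mostly zeros, but it is still a dense float tensor occupying exactly the same number of bytes as before pruning.

```python
import torch
from torch.nn.utils import prune

layer = torch.nn.Linear(256, 256)
bytes_before = layer.weight.nelement() * layer.weight.element_size()

prune.l1_unstructured(layer, name="weight", amount=0.9)
prune.remove(layer, "weight")  # make the pruning permanent

# About 90% of the entries are now zero...
sparsity = (layer.weight == 0).float().mean().item()

# ...but the tensor is still a dense float32 tensor of the same size
bytes_after = layer.weight.nelement() * layer.weight.element_size()
print(sparsity, bytes_before, bytes_after)
```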
Thank you.