I am trying to run code for the lottery ticket hypothesis. First I initialise a model named model_pre, train it for 30 epochs with a linearly increasing learning-rate warmup, and then save its model state dict and optimizer state dict.
Next I start pruning, first keeping 100% of the weights and then progressively fewer. I initialise a second model, model, with the same architecture as model_pre, load the model_pre weights into it, and train model (still at 100% of the weights) with the same optimizer that was used for the warmup, before pruning. This gives an accuracy of around 87%, but it is expected to give around 93%.
When I do the exact same steps but additionally re-initialise the optimizer for the model holding the post-warmup weights and then load the optimizer state saved after the warmup, I get an accuracy of around 93%. As far as I can tell I am doing the same thing both ways, so I don't understand why this happens. Can anyone point me in the right direction?
# --- Warmup phase: train model_pre, then save the model and optimizer state dicts ---
opt_class, opt_kwargs = load.optimizer(args.optimizer)
optimizer = opt_class(generator.parameters(model_pre), lr=0.1, weight_decay=1e-4, momentum=0.9, **opt_kwargs)
# Also tried (currently disabled):
# scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[20, 40], gamma=10)
# scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=args.pre_epochs)
# optimizer = optim.SGD(model_pre.parameters(), lr=0.1, weight_decay=1e-4, momentum=0.9)
full_train(model_pre, loss, optimizer, train_loader, test_loader, device, args.warmepochs, args.verbose)
torch.save(model_pre.state_dict(), "{}/model.pt".format(args.result_dir))
torch.save(optimizer.state_dict(), "{}/optimizer.pt".format(args.result_dir))
# torch.save(scheduler.state_dict(), "{}/scheduler.pt".format(args.result_dir))
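(full_train is not shown above; the linear warmup it performs is roughly the following. This is only a simplified, self-contained sketch with a toy model and plain SGD, not the actual full_train code.)

import torch
import torch.nn as nn

# Sketch of a linear LR warmup stepped once per epoch (stand-in for full_train's warmup).
toy_model = nn.Linear(10, 2)
toy_opt = torch.optim.SGD(toy_model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
warmup_epochs = 30
warmup = torch.optim.lr_scheduler.LambdaLR(
    toy_opt, lr_lambda=lambda epoch: min((epoch + 1) / warmup_epochs, 1.0))
for epoch in range(warmup_epochs):
    # ... one training epoch with toy_opt ...
    warmup.step()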
# --- Pruning phase: build a fresh model, load the warmup weights, restore the optimizer ---
model = load.model(args.model, args.model_class)(input_shape,
                                                 num_classes,
                                                 args.dense_classifier,
                                                 args.pretrained).to(device)
# Also tried here (currently disabled):
# optimizer = opt_class(generator.parameters(model), lr=0.1, weight_decay=5e-4, momentum=0.9, **opt_kwargs)
# scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[20, 40], gamma=10)
model.load_state_dict(model_pre.state_dict())
torch.save(model.state_dict(), "{}/model.pt".format(args.result_dir))
torch.save(optimizer.state_dict(), "{}/optimizer.pt".format(args.result_dir))
model.load_state_dict(torch.load("{}/model.pt".format(args.result_dir), map_location=device))
# The line in question: re-create the optimizer over model's parameters
# before loading the saved optimizer state.
optimizer = opt_class(generator.parameters(model), lr=0.1, weight_decay=1e-4, momentum=0.9, **opt_kwargs)
optimizer.load_state_dict(torch.load("{}/optimizer.pt".format(args.result_dir), map_location=device))
In the second code snippet, the line where I initialise the optimizer a second time (just before optimizer.load_state_dict) is what makes it work well; if I comment out that line and run the training, the accuracy is noticeably worse.
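To make the comparison concrete, here it is reduced to a minimal, self-contained sketch; nn.Linear, plain SGD and the file names model.pt / optimizer.pt are stand-ins for the real load.model / load.optimizer objects and result paths.

import torch
import torch.nn as nn
import torch.optim as optim

# Stand-in for the warmup phase: train model_pre, save model and optimizer state.
model_pre = nn.Linear(10, 2)
optimizer = optim.SGD(model_pre.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
# ... warmup training of model_pre ...
torch.save(model_pre.state_dict(), "model.pt")
torch.save(optimizer.state_dict(), "optimizer.pt")

# Stand-in for the pruning phase: a fresh model loaded with the warmup weights.
model = nn.Linear(10, 2)
model.load_state_dict(torch.load("model.pt"))

# Variant A (gives ~87% in my real runs): keep the optimizer object that was
# built for model_pre's parameters and just load the saved state into it.
# optimizer.load_state_dict(torch.load("optimizer.pt"))

# Variant B (gives ~93% in my real runs): build a new optimizer over model's
# parameters first, then load the saved state into it.
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
optimizer.load_state_dict(torch.load("optimizer.pt"))

# The only difference between A and B is whether the optimizer is re-created
# over model's parameters before loading its state dict.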