Hello!
I want to know how to check the gradient values while training.
I'm training a ResNet-34 on CIFAR-10 (an image classification task).
Storing every gradient would require too much memory.
So I want to check the mean & variance of the gradients at every epoch.
How can I do this?
(My main script is attached below.)
import os
import time

import torch
import torch.nn as nn

# get_parser, build_dataset, build_model, load_checkpoint, Adam, train and test
# are helpers imported from my own project modules.

def main():
    parser = get_parser()
    args = parser.parse_args()
    train_loader, test_loader = build_dataset(args)
    device = 'cuda' if torch.cuda.is_available() else 'cpu'

    if args.resume:
        # Resume from a checkpoint and reload the recorded curves.
        # ckpt_name is defined elsewhere in the full script.
        ckpt = load_checkpoint(ckpt_name)
        start_epoch = ckpt['epoch']
        curve = os.path.join('curve', ckpt_name)
        curve = torch.load(curve)
        train_losses = curve['train_loss']
        test_accuracies = curve['test_acc']
    else:
        ckpt = None
        start_epoch = -1
        train_losses = []
        test_accuracies = []

    net = build_model(args, device, ckpt=ckpt)
    criterion = nn.CrossEntropyLoss()
    optimizer = Adam(args, net.parameters())

    start_time = time.time()
    for epoch in range(start_epoch + 1, args.total_epoch):
        start = time.time()
        train_loss, train_acc = train(net, epoch, device, train_loader, optimizer, criterion, args)
        test_loss, test_acc = test(net, epoch, device, test_loader, criterion)
        end = time.time()
        print('Time {}'.format(end - start))
        train_losses.append(train_loss)
        test_accuracies.append(test_acc)
    end_time = time.time()
    print('End Time: {}'.format(end_time - start_time))
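
For context, here is a minimal sketch of the kind of thing I have in mind (the log_grad_stats name is just a placeholder, not part of the script above): after a backward pass, flatten all parameter gradients into one vector and take its mean and variance.

    import torch

    def log_grad_stats(net):
        # Collect the gradients that exist, flatten them into one vector,
        # and return the mean and variance as Python floats.
        grads = [p.grad.detach().flatten() for p in net.parameters() if p.grad is not None]
        if not grads:
            return None, None
        all_grads = torch.cat(grads)
        return all_grads.mean().item(), all_grads.var().item()

Called once per epoch right after train() in the loop above, e.g.

    grad_mean, grad_var = log_grad_stats(net)
    print('Epoch {}: grad mean {:.3e}, grad var {:.3e}'.format(epoch, grad_mean, grad_var))

this would only report the gradients left over from the last mini-batch of the epoch. To get statistics over the whole epoch, would I instead have to accumulate these values inside train() after each loss.backward() and average them at the end of the epoch?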