Hi, I’m trying to port a model architecture from Keras to PyTorch, but the problem is that there is no progress in training. It’s as if no actual learning is happening.
import math
import torch
from tqdm import tqdm

###################
# train the model #
###################
model.train()
steps_train = math.ceil(loaders['train_size'] / batch_size)
print(f"***** training steps: {steps_train} *****")
for batch_idx in tqdm(range(steps_train)):
    data, true_map, true_binary = next(loaders['train'])
    # move to GPU
    if use_cuda:
        data, true_binary, true_map = data.cuda(), true_binary.cuda(), true_map.cuda()
    ## find the loss and update the model parameters accordingly
    # clear the gradients of all optimized variables
    optimizer.zero_grad()
    with torch.set_grad_enabled(True):  # redundant in a training loop, but harmless
        # forward pass: compute predicted outputs by passing inputs to the model
        output_map, output_binary = model(data)
        # calculate the batch loss
        loss = criterion(output_binary, output_map, true_binary, true_map)
        # backward pass: compute gradient of the loss with respect to model parameters
        loss.backward()
        # perform a single optimization step (parameter update)
        optimizer.step()
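To see whether gradients are flowing at all, here is a small diagnostic I can drop in right after loss.backward() (just a debugging sketch I added; it is not part of the original Keras code):

# diagnostic: total gradient norm over the parameters of the network in the loop
total_norm = 0.0
for p in model.parameters():
    if p.grad is not None:
        total_norm += p.grad.norm().item() ** 2
total_norm = total_norm ** 0.5
print(f"grad norm: {total_norm:.3e}")  # a value stuck near zero would explain the flat loss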
This is my custom loss function (the criterion in the training loop above is bound to it):
def my_loss(output_binary, output_map, true_binary, true_map):
    loss_binary = torch.mean((output_binary - true_binary) ** 2)
    loss_map = torch.mean((output_map - true_map) ** 2)
    return 0.5 * loss_binary + 0.5 * loss_map

criterion = my_loss  # used as criterion in the training loop above
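As far as I can tell, this is just the average of two mean-squared-error terms, so it should match torch.nn.MSELoss with its default 'mean' reduction. Here is the quick self-contained check I used to convince myself the loss itself is fine (the tensor shapes are made up just for the test):

import torch
import torch.nn as nn

mse = nn.MSELoss()  # default reduction='mean', same as torch.mean((a - b) ** 2)

out_bin, tgt_bin = torch.rand(4, 1), torch.rand(4, 1)
out_map, tgt_map = torch.rand(4, 1, 32, 32), torch.rand(4, 1, 32, 32)

reference = 0.5 * mse(out_bin, tgt_bin) + 0.5 * mse(out_map, tgt_map)
assert torch.allclose(my_loss(out_bin, out_map, tgt_bin, tgt_map), reference)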
And this is how I build the optimizer:

from torch import optim

optimizer = optim.Adam(denseNet.parameters())
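One thing I noticed while writing this up: the training loop calls model(data), but the optimizer is built from denseNet.parameters(). If those names point to two different objects, Adam would be updating weights the forward pass never uses, which would look exactly like "no actual training". A minimal sanity check (assuming both names are supposed to refer to the same network):

# if this fails, the optimizer and the forward pass use different networks
assert model is denseNet, "optimizer and forward pass disagree on the model"

# safer: always build the optimizer from the model actually used in the loop
optimizer = optim.Adam(model.parameters())  # Adam's default lr is 1e-3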
Is this part of the code legit, or have I made a rookie mistake here?