Using the output of the model being trained as input to another pretrained model in a training loop

    for batch_idx, img in enumerate(loaders['rgbimage_dataloader']):
        optimizer.zero_grad()
        # move to the target device (.to(device) covers both CPU and GPU)
        real = img[0].to(device)
        sketch_image = cycle_model(real)

        styleGan_output = model(real, sketch_image, Target_color.float(), noise)
        # pretrained model
        fakeEdge_image = cycle_model(styleGan_output)

        Edge_loss = criterion(fakeEdge_image, sketch_image)
        total_loss = Edge_loss
        total_loss.backward()
        optimizer.step()

    # this is the way I am saving
    torch.save(model.state_dict(), f"model_history/mu_stylegan{epoch}.pth")

Does saving the model this way also store the state of the pretrained model?
If not, how could we do that?

I don’t know which model is the pretrained one, but in your current code you are storing the state_dict of model only, which would contain all properly registered parameters, buffers, and other submodules.
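To illustrate what "properly registered" means here, the following is a minimal sketch with small `nn.Linear` layers standing in for the real networks (the `Combined` wrapper is hypothetical and not part of the original script). A model's state_dict only lists a second model if that model is registered as one of its submodules:

```python
from torch import nn

# Stand-ins for the real networks (assumptions for illustration only).
model = nn.Linear(4, 4)        # the model being trained
cycle_model = nn.Linear(4, 2)  # the pretrained model


class Combined(nn.Module):
    """Hypothetical wrapper that registers both networks as submodules."""

    def __init__(self, generator, pretrained):
        super().__init__()
        self.generator = generator
        self.pretrained = pretrained


# Used separately: the state_dict of `model` knows nothing about `cycle_model`.
separate_keys = list(model.state_dict().keys())

# Registered as submodules: both sets of parameters show up.
combined_keys = list(Combined(model, cycle_model).state_dict().keys())

print(separate_keys)
print(combined_keys)
```

In the training loop from the question the two networks are separate objects, so the first case applies.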

Hi Patrick,

The cycle model is the pretrained one in the script.

Is it necessary to save it?

The problem is that my model does not improve when I use the criterion as shown above.

It does improve if I use the following instead (not in the script above):

criterion(styleGan_output, real)

(I don’t want to use this second criterion, as I need to disentangle the output.)

If cycle_model has parameters or buffers you would also need to save its state_dict.
Make sure that all parameters get a valid gradient after the first backward call and thus that the computation graph is not detached.

OK Patrick.

I apply a few transformations to the outputs of the model and the CycleGAN, such as extracting color information, and then use the result in a criterion.

Is this state also saved? If not, how would I save it?

I’m not exactly sure which issue you are currently trying to solve.

To save and restore the training you would need to save all trained parameters, buffers, etc.
To do so, you would need to save the state_dict from all models, the optimizer, and schedulers.
This will make sure that the internal “states” of these objects are properly restored and you could continue the training or perform the inference.
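A minimal sketch of such a checkpoint, assuming small `nn.Linear` stand-ins for `model` and `cycle_model`, an Adam optimizer, and a temporary file path (all placeholders, not the setup from the question). A scheduler's state_dict could be added to the dictionary in the same way:

```python
import os
import tempfile

import torch
from torch import nn

# Stand-ins for the real objects (assumptions for illustration only).
model = nn.Linear(4, 4)
cycle_model = nn.Linear(4, 4)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
epoch = 3

checkpoint_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth")

# Save everything needed to resume training in one dictionary.
torch.save(
    {
        "epoch": epoch,
        "model": model.state_dict(),
        "cycle_model": cycle_model.state_dict(),
        "optimizer": optimizer.state_dict(),
    },
    checkpoint_path,
)

# Restore: recreate the objects first, then load their states.
restored_model = nn.Linear(4, 4)
restored_cycle_model = nn.Linear(4, 4)
restored_optimizer = torch.optim.Adam(restored_model.parameters(), lr=1e-3)

checkpoint = torch.load(checkpoint_path, map_location="cpu")
restored_model.load_state_dict(checkpoint["model"])
restored_cycle_model.load_state_dict(checkpoint["cycle_model"])
restored_optimizer.load_state_dict(checkpoint["optimizer"])
start_epoch = checkpoint["epoch"] + 1
```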

On the other hand, if you are trying to debug the issue that your models are not training properly, make sure the computation graph is not detached e.g. by checking the .grad attributes of all parameters which should be trained.
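One way to run that check is sketched below, with a toy network in place of the real one. The helper name `params_without_grad` is made up for this example. A parameter that was never reached by the backward pass keeps `.grad` set to `None`:

```python
import torch
from torch import nn

# Toy stand-ins (assumptions for illustration only).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
unused_layer = nn.Linear(4, 2)  # never used in the forward pass below

loss = model(torch.randn(5, 4)).pow(2).mean()
loss.backward()


def params_without_grad(module):
    """Return the names of trainable parameters whose .grad is still None."""
    return [
        name
        for name, param in module.named_parameters()
        if param.requires_grad and param.grad is None
    ]


missing_in_model = params_without_grad(model)          # expected to be empty
missing_in_unused = params_without_grad(unused_layer)  # both parameters listed
print(missing_in_model, missing_in_unused)
```

Calling the helper on the trained model right after the first `backward()` would list every parameter that is cut off from the loss.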

If these transformations do not have an internal state you wouldn’t have to save anything (there won’t be anything to save besides the actual source code).

Hi Patrick,

Apologies for the confusion. This is my entire training loop.

Below is my code:

valid_loss_min = valid_loss_min_input

for epoch in range(start_epochs, n_epochs+1):
    # initialize variables to monitor training and validation loss
    train_loss = 0.0
    ###################
    # train the model #
    ###################
    model.train()

    for batch_idx, img in enumerate(loaders['rgbimage_dataloader']):
        optimizer.zero_grad()
        # move to the target device (.to(device) covers both CPU and GPU)
        real = img[0].to(device)
        sketch_image = cycle_model(real)

        # get the output of the main model
        styleGan_output = model(real, sketch_image, Target_color.float(), noise)
        # feed it to the pretrained model
        fakeEdge_image = cycle_model(styleGan_output)

        # unnormalize the image to feed it to PIL
        x_tensor = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])(styleGan_output)
        min_i = x_tensor.min(dim=(1), keepdim=True).values.min(dim=(2), keepdim=True).values
        max_i = x_tensor.max(dim=(1), keepdim=True).values.max(dim=(2), keepdim=True).values
        x_tensor = ((x_tensor-min_i) / (max_i - min_i)) * 255
        styleGAN_transformedimage = [transforms.ToPILImage()(x_) for x_ in x_tensor.type(torch.uint8)]

        # get the dominant color
        styleGAN_transformedimage = [get_dominant_color(x) for x in styleGAN_transformedimage]

        Avg_color = [torch.tensor(x,dtype=float,requires_grad=False) for x in styleGAN_transformedimage]
        Avg_color = [color.to(device) for color in Avg_color]
        Avg_color = torch.stack(Avg_color)

        Target_color_match = torch.cat(len(Avg_color)*[Target_color])
        Avg_color = torch.unsqueeze(Avg_color, 1)

        # loss calculation
        Edge_loss = criterion(fakeEdge_image, sketch_image)
        color_loss = criterion(Avg_color, Target_color_match)
        total_loss = Edge_loss + color_loss
        total_loss.backward()
        optimizer.step()

        # running mean of the loss over the batches seen so far
        train_loss = train_loss + ((1 / (batch_idx + 1)) * (total_loss.item() - train_loss))

    # print training statistics
    print('Epoch: {} \tTraining Loss: {:.6f}'.format(epoch, train_loss))

    del total_loss
    del styleGan_output
    del fakeEdge_image
    del Edge_loss

    torch.save(model.state_dict(), f"model_history/mu_stylegan{epoch}.pth")

That is my entire training loop.

The following inputs to the losses in the code above do not come directly from the model output: sketch_image, Avg_color, and Target_color_match.

Edge_loss = criterion(fakeEdge_image,sketch_image)
color_loss = criterion(Avg_color,Target_color_match)

Is the state of all of these also saved by the call below?

torch.save(model.state_dict(),f"model_history/mu_stylegan{epoch}.pth")

This is how I am evaluating:

Target_color = torch.zeros(1, 1, 3, dtype=torch.float)
Target_color[:, :, 2] = 255
Target_color_blue = Target_color.to(device)

model.load_state_dict(torch.load(os.path.join("model_history/mu_stylegan178.pth"), map_location=torch.device('cuda')))
#progress_bar = tqdm(enumerate(trainA_dataloader,trainB_dataloader),total=len(trainA_dataloader))
model.eval()
outputs = []
for batch_idx, img in enumerate(loaders['test_rgbimage_dataloader']):
    if use_cuda:
        real = img[0].to(device)
        sketch_image = Cyclegan_model(real)
    recon = mu_stylegan(real, sketch_image, Target_color_blue.float(), noise)
    sketch_fake_test = recon.detach().cpu().numpy()
    for img in sketch_fake_test:
        img = img/2+0.5
        plt.imshow(np.transpose(img, (1, 2, 0)))
        plt.show()

I’m still unsure which issue we are debugging, but in any case:

color_loss = criterion(Avg_color,Target_color_match)

won’t calculate the gradients for the model parameters, since you were recreating tensors and are thus detaching the computation graph in:

Avg_color = [torch.tensor(x,dtype=float,requires_grad=False) for x in styleGAN_transformedimage]
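The effect can be reproduced in a few lines. This is a small sketch with a made-up `weights` tensor standing in for the model parameters: rebuilding a tensor from plain Python values gives a result without a `grad_fn`, while staying in PyTorch operations keeps the link to the parameters:

```python
import torch

# Stand-in for a trainable parameter (assumption for illustration only).
weights = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
output = weights * 2  # tracked by autograd

# Recreating a tensor from Python numbers cuts the graph:
recreated = torch.tensor(output.tolist())
print(recreated.requires_grad, recreated.grad_fn)

# Staying in PyTorch operations keeps the graph:
tracked = output.mean()
print(tracked.requires_grad, tracked.grad_fn is not None)

tracked.backward()
print(weights.grad)  # populated, because `tracked` is attached to `weights`
```

A loss computed from `recreated` could not send any gradient back to `weights`.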

Hi Patrick,

I converted it back to a tensor because I pass the image in PIL format to the get_dominant_color function:

styleGAN_transformedimage = [get_dominant_color(x) for x in styleGAN_transformedimage]

How can I do this so that it stays a tensor?

Also, what does “detaching the computation graph” mean? What do I have to do to fix it in this step?

Avg_color = [torch.tensor(x,dtype=float,requires_grad=False) for x in styleGAN_transformedimage]

If you are using other libraries, such as numpy or PIL, you would detach the tensor from the computation graph since Autograd isn’t able to track these operations. This would mean that previous operations are not attached to the newly created tensors in:

Avg_color = [torch.tensor(x,dtype=float,requires_grad=False) for x in styleGAN_transformedimage]

and thus no parameters used in the previous operation will get any gradients.
You would have to either use PyTorch operations or write custom autograd.Functions, as described in the PyTorch notes on extending torch.autograd.
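For the second option, here is a minimal sketch of a custom `torch.autograd.Function`. The class name `NumpySquare` and the squaring operation are chosen only for illustration and are unrelated to the color extraction in the question. The forward pass leaves PyTorch and uses numpy, so the backward pass has to be written by hand:

```python
import numpy as np
import torch


class NumpySquare(torch.autograd.Function):
    """Squares the input with numpy and supplies the gradient manually."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # Autograd cannot see what happens inside numpy.
        result = np.square(x.detach().cpu().numpy())
        return torch.from_numpy(result).to(x.device)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # d(x^2)/dx = 2x, chained with the incoming gradient.
        return grad_output * 2 * x


x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = NumpySquare.apply(x).sum()
y.backward()
print(x.grad)
```

This only works when a meaningful gradient for the external operation can be written down by hand.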

How can we find the dominant color of an image in PyTorch from the incoming tensor? I have to remove all the white pixels outside the edges in the tensor.

These operations need to take place in the training loop.

I built this using PIL in my training loop above.

I don’t know what exactly get_dominant_color does, but you should be able to mask the pixels in the tensor directly.
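A minimal sketch of such a mask is below. It assumes images shaped (N, 3, H, W) with values in [0, 1], treats a pixel as white when every channel is above a threshold, and uses the mean of the remaining pixels as a stand-in for whatever `get_dominant_color` computes. The function name `masked_mean_color` and the `white_threshold` parameter are hypothetical:

```python
import torch


def masked_mean_color(images, white_threshold=0.95):
    """Mean RGB color per image, ignoring (near-)white pixels.

    images: float tensor of shape (N, 3, H, W) with values in [0, 1].
    Returns a tensor of shape (N, 3).
    """
    # A pixel counts as white if all three channels exceed the threshold.
    is_white = (images > white_threshold).all(dim=1, keepdim=True)  # (N, 1, H, W)
    mask = (~is_white).to(images.dtype)

    color_sum = (images * mask).sum(dim=(2, 3))        # (N, 3)
    pixel_count = mask.sum(dim=(2, 3)).clamp(min=1.0)  # (N, 1), avoids 0/0
    return color_sum / pixel_count


# One 2x2 image: three white pixels and one colored pixel.
images = torch.ones(1, 3, 2, 2)
images[0, :, 0, 0] = torch.tensor([0.2, 0.4, 0.6])
images.requires_grad_()

colors = masked_mean_color(images)
colors.sum().backward()  # gradients reach `images`, so the graph is intact
print(colors)
```

The comparison that builds the mask is not differentiable, but the gradient still flows through the masked pixel values, which is what a loss on the color needs.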

Hello Patrick, by dominant color I mean the most frequently used color in an image.
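Read literally, "most frequently used color" can be computed on a tensor by quantizing the colors into bins and counting them. The sketch below assumes a (3, H, W) uint8 image and a hypothetical helper `most_frequent_color`; it is not the `get_dominant_color` from the training loop. Picking the winning bin uses an argmax, so no gradient flows through this result and it is suited to inspection rather than to a loss term:

```python
import torch


def most_frequent_color(image, bins_per_channel=8):
    """Center of the most populated color bin of a (3, H, W) uint8 image."""
    bin_width = 256 // bins_per_channel
    pixels = image.reshape(3, -1).long()  # (3, H*W)
    bins = pixels // bin_width            # per-channel bin index

    # Combine the three bin indices into one index per pixel.
    flat_index = (bins[0] * bins_per_channel + bins[1]) * bins_per_channel + bins[2]
    counts = torch.bincount(flat_index, minlength=bins_per_channel ** 3)

    winner = int(counts.argmax())
    red_bin = winner // (bins_per_channel ** 2)
    green_bin = (winner // bins_per_channel) % bins_per_channel
    blue_bin = winner % bins_per_channel
    centers = torch.tensor([red_bin, green_bin, blue_bin], dtype=torch.float32)
    return (centers + 0.5) * bin_width


# 2x2 image with three blue pixels and one red pixel.
image = torch.zeros(3, 2, 2, dtype=torch.uint8)
image[2] = 255           # all pixels blue
image[:, 0, 0] = 0
image[0, 0, 0] = 255     # top-left pixel red

dominant = most_frequent_color(image)
print(dominant)
```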