Hi, I’m facing a weird problem. I use an Ultralytics model to run object detection, and it gives reasonable results with predicted bounding boxes. Later I rebuilt the model myself without reading that yaml file; since it starts from random weights, it gives bounding boxes with all-zero values. Then the weird thing happens: when I load the parameters from the pretrained Ultralytics model into the rebuilt model, it gives empty bounding boxes. I spot-checked the maximum of the loaded parameters in the rebuilt model against the pretrained one, and they are the same. I don’t know what happened, but I’m sure it’s a problem with the parameter transfer. Here’s the loading code I used. Hope someone can help me. Thank you so much, I really appreciate it.
rebuilt_model.conv1.load_state_dict(pretrain_model.model._modules['model'][0].state_dict())
rebuilt_model.conv2.load_state_dict(pretrain_model.model._modules['model'][1].state_dict())
rebuilt_model.c2f1.load_state_dict(pretrain_model.model._modules['model'][2].state_dict())
rebuilt_model.conv3.load_state_dict(pretrain_model.model._modules['model'][3].state_dict())
rebuilt_model.c2f2.load_state_dict(pretrain_model.model._modules['model'][4].state_dict())
rebuilt_model.conv4.load_state_dict(pretrain_model.model._modules['model'][5].state_dict())
rebuilt_model.c2f3.load_state_dict(pretrain_model.model._modules['model'][6].state_dict())
…
Sorry, later I tried removing the parameter-transfer code for the detect layer, and now it gives bounding boxes with 0 values again, but I want the model to output the same bounding boxes as the pretrained model. Can someone help me? Thank you so much again.
In your code snippet it seems you are loading state_dicts for a few specific layers, not all modules. Could you explain why loading the “full” state_dict won’t work? I would speculate a few layers might be missed and are thus still randomly initialized.
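For example, a single full load with `strict=False` reports every key that did not line up, so nothing stays randomly initialized unnoticed. A minimal sketch (the two-layer models and the renamed `conv2b` layer are hypothetical stand-ins for your real ones):

```python
from collections import OrderedDict
import torch.nn as nn

# Hypothetical models: the rebuilt one renames a layer, so one pair
# of checkpoint keys has nowhere to go.
pretrained = nn.Sequential(OrderedDict(conv1=nn.Conv2d(3, 8, 3),
                                       conv2=nn.Conv2d(8, 8, 3)))
rebuilt = nn.Sequential(OrderedDict(conv1=nn.Conv2d(3, 8, 3),
                                    conv2b=nn.Conv2d(8, 8, 3)))

# strict=True (the default) would raise on the mismatch;
# strict=False loads what it can and returns the leftovers.
result = rebuilt.load_state_dict(pretrained.state_dict(), strict=False)
print(result.missing_keys)     # keys the rebuilt model still needs
print(result.unexpected_keys)  # checkpoint keys with no destination
```

If both lists come back empty, every tensor in the checkpoint found a matching layer.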
Hi, thank you so much for your reply. Actually I load the parameters block by block; I only showed part of the blocks above, but I’ve loaded all 22 of them. I feel the problem happens in the last block, the detection block:
rebuilt_model.detect.load_state_dict(pretrain_model.model._modules['model'][22].state_dict())
Could you explain why you feel this block fails? Do you see any errors or warnings when loading the parameters?
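Also note that checking only the maximum of the parameters can miss a mismatch. One way to rule out a transfer failure is to compare every tensor in the two state_dicts; a sketch (the `Conv2d` modules here are just placeholders for your real blocks):

```python
import torch
import torch.nn as nn

def assert_transferred(src: nn.Module, dst: nn.Module) -> None:
    """Compare every tensor in the two state_dicts, not just a
    summary statistic like the maximum."""
    src_sd, dst_sd = src.state_dict(), dst.state_dict()
    assert src_sd.keys() == dst_sd.keys(), "key sets differ"
    for name, tensor in src_sd.items():
        if not torch.equal(tensor, dst_sd[name]):
            raise AssertionError(f"mismatch in {name}")

# Toy example standing in for one transferred block.
src = nn.Conv2d(3, 16, 3)
dst = nn.Conv2d(3, 16, 3)
dst.load_state_dict(src.state_dict())
assert_transferred(src, dst)
print("all tensors match")
```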
I’m so sorry, it turns out there is a block where I mistakenly enabled a shortcut. The state_dict loading can’t detect that kind of mismatch. Thank you for your quick and kind answer.
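To show what I mean, here is a toy sketch (this `Block` class and its `shortcut` flag are hypothetical, not the real C2f code): `load_state_dict` only matches tensor names and shapes, so a plain attribute that changes the forward pass slips through silently.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """Toy block: identical parameters either way, but `shortcut`
    changes the forward pass (a stand-in for a bottleneck's
    residual flag)."""
    def __init__(self, shortcut: bool):
        super().__init__()
        self.conv = nn.Conv2d(4, 4, 3, padding=1)
        self.shortcut = shortcut  # plain attribute, not in state_dict

    def forward(self, x):
        y = self.conv(x)
        return x + y if self.shortcut else y

torch.manual_seed(0)
a = Block(shortcut=True)
b = Block(shortcut=False)

# Transfer succeeds without any warning: the flag is not a tensor,
# so load_state_dict has nothing to compare.
b.load_state_dict(a.state_dict())

x = torch.randn(1, 4, 8, 8)
same_params = all(torch.equal(p, q)
                  for p, q in zip(a.parameters(), b.parameters()))
same_output = torch.allclose(a(x), b(x))
print(same_params, same_output)  # parameters match, outputs differ
```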
I’m unsure if I understand your response correctly, but is this difference in your model causing the issue?