Size mismatch error. RuntimeError: Error(s) in loading state_dict for DAGMM:

Ajay_Chawda · July 7, 2022, 11:12am

I am trying to load the model for evaluation and get a size mismatch error. All the parameters are the same as the ones I used in training.

The error:

Traceback (most recent call last):
  File "/p/fm/AjayChawda/ajay_chawda/eval.py", line 168, in <module>
    net.load_state_dict(torch.load(save_path), strict=False)
  File "/p/fm/AjayChawda/envs/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1497, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DAGMM:
        size mismatch for gmm.mixtures.0.Phi: copying a param with shape torch.Size([]) from checkpoint, the shape in current model is torch.Size([1]).
        size mismatch for gmm.mixtures.1.Phi: copying a param with shape torch.Size([]) from checkpoint, the shape in current model is torch.Size([1]).
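
For reference, the same message can be reproduced in isolation. The sketch below is only an illustration: `ToyMixture` is a hypothetical stand-in for one mixture component (the real `Mixture` class is not shown in this post), and the checkpoint is built by hand so that `Phi` has shape `torch.Size([])`. It shows that `strict=False` only relaxes missing or unexpected keys and does not suppress a size mismatch. The reshape at the end is one possible workaround, assuming the saved scalar is meant to fill the single-element parameter.

```python
import torch
import torch.nn as nn


class ToyMixture(nn.Module):
    """Hypothetical stand-in for one mixture component, not the real class."""

    def __init__(self):
        super().__init__()
        # Registered with shape [1], like the "current model" in the traceback.
        self.Phi = nn.Parameter(torch.ones(1))


# Hand-built checkpoint whose Phi is a 0-dim tensor (shape []), like the saved file.
checkpoint = {"Phi": torch.tensor(0.5)}

model = ToyMixture()
try:
    model.load_state_dict(checkpoint, strict=False)
    mismatch_raised = False
except RuntimeError as err:
    # strict=False does not help here: shape mismatches are always reported.
    mismatch_raised = True
    print(err)

# Possible workaround (assumption: element counts match and the values are
# meant to be identical): reshape each saved tensor to the model's shape.
target_shapes = {name: t.shape for name, t in model.state_dict().items()}
patched = {
    name: t.reshape(target_shapes[name])
    if name in target_shapes and t.numel() == target_shapes[name].numel()
    else t
    for name, t in checkpoint.items()
}
model.load_state_dict(patched, strict=False)
```

If the reshape is acceptable, the cleaner long-term fix is probably to find where `Phi` becomes 0-dim during training, but that depends on code not shown here.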

Training Block

compression = CompressionNetwork(embedding, numerical, input_dim, output_dim, emb, args.latent_dim)
estimation = EstimationNetwork(args.dim_embed, args.num_mixtures)
gmm = GMM(args.num_mixtures, args.dim_embed)
mix = Mixture(args.dim_embed)
net = DAGMM(compression, estimation, gmm)
optimizer = optim.Adam(net.parameters(), lr=1e-1)
for epoch in range(epochs):
    print('EPOCH {}:'.format(epoch + 1))
    running_loss = 0
    for i, data in enumerate(dataloader):
        rec_data = torch.cat([data[0], data[1]], -1)
        if numerical:
            rec_data = data[1]
        out = net(data[0], data[1], rec_data)
        optimizer.zero_grad()
        L_loss = compression.reconstruction_loss(data[0], data[1], rec_data)
        G_loss = mix.gmm_loss(out=out, L1=0.1, L2=0.005)
        loss = (L_loss + G_loss) / len(data[1])
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    writer.add_scalar("Loss/train", running_loss, epoch)
    print(running_loss)
torch.save(net.state_dict(), save_path)

Training command - `python train.py --dataset vehicle_insurance --model dagmm --latent_dim 4 --num_mixtures 2 --dim_embed 6 --encoding gel_encode --epoch 50 --batch_size 512 --file_name vehicle_insurance_dagmm_GEL_4_2`

Evaluation Block

compression = CompressionNetwork(embedding, numerical, input_dim, output_dim, emb, args.latent_dim)
estimation = EstimationNetwork(args.dim_embed, args.num_mixtures)
gmm = GMM(args.num_mixtures, args.dim_embed)
mix = Mixture(args.dim_embed)
net = DAGMM(compression, estimation, gmm)
net.load_state_dict(torch.load(save_path), strict=False)
net.eval()

for i, data in enumerate(dataloader):
    rec_data = torch.cat([data[0], data[1]], -1)
    if numerical:
        rec_data = data[1]
    out = net(data[0], data[1], rec_data)
    out = out.detach().numpy().reshape(-1)
    L = data[2].detach().numpy().reshape(-1)
    score = np.hstack((score, out))
    label = np.hstack((label, L))
threshold = np.percentile(score, args.threshold, axis=0)
y_pred = (score > threshold).astype(int)
y_test = label

Evaluation command - `python eval.py --dataset vehicle_insurance --model dagmm --latent_dim 4 --num_mixtures 2 --dim_embed 6 --encoding GEL --save_path model/vehicle_insurance_dagmm_GEL_4_2 --threshold 95`

Please help me with this issue. I have searched a lot and found that this error usually means the dimensions do not match, but that does not seem to be the case here.
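
One way to check whether the dimensions really agree is to compare the shapes stored in the checkpoint with the shapes the freshly built model expects. The helper below is a minimal sketch: `shape_mismatches` is a hypothetical name, it works on plain name-to-shape mappings so it runs without PyTorch, and the sample shapes are copied from the traceback except for `example.weight`, which is a made-up matching entry.

```python
def shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for shared keys whose shapes differ."""
    return {
        name: (checkpoint_shapes[name], model_shapes[name])
        for name in checkpoint_shapes
        if name in model_shapes and checkpoint_shapes[name] != model_shapes[name]
    }


# With PyTorch, the two mappings could be built like this (not executed here):
#   checkpoint_shapes = {k: tuple(v.shape) for k, v in torch.load(save_path).items()}
#   model_shapes = {k: tuple(v.shape) for k, v in net.state_dict().items()}

# Shapes taken from the traceback; "example.weight" is an invented matching key.
checkpoint_shapes = {
    "gmm.mixtures.0.Phi": (),
    "gmm.mixtures.1.Phi": (),
    "example.weight": (6, 2),
}
model_shapes = {
    "gmm.mixtures.0.Phi": (1,),
    "gmm.mixtures.1.Phi": (1,),
    "example.weight": (6, 2),
}

mismatches = shape_mismatches(checkpoint_shapes, model_shapes)
for name, (saved, current) in mismatches.items():
    print(f"{name}: checkpoint {saved} vs model {current}")
```

Here only the two `Phi` entries are reported, which matches the traceback: the command-line dimensions agree, but `Phi` was saved as a 0-dim tensor while the model registers it with one element.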

Thank you.