3D object reconstruction: dataset and model problem

Hello everyone. I am working on 3D object reconstruction from a single image. I have trained my model, but it is not working. Since I have little guidance on 3D, I can’t figure out the error. Please help me if there is a mistake in my approach to training.

First, let me explain what I am following.

1. I have taken a synthetic dataset (I am uploading a mini dataset with 1 type of object per class).
Dataset:
https://drive.google.com/open?id=1AMjKQUujUlNSRRqnoKsJSteNed9mTEiI

2. I have taken 4 classes (objects) with 100 different types of objects per class.

   1. Each object has 24 images, and all 24 images share one binvox file.

3. So I have associated each image with its 3D model separately.

For example, take the chair class: one type of chair has 24 images and one 3D model.

So I associate each of the 24 images with that 3D model.

4. Now I put them into training with a batch size of 24, without shuffling.

5. The learning rate and loss are according to your paper.

6. After training for 50 epochs, my average loss reduces from 11.92 to 4.76.

When I check after training, my model produces the same output for all the classes.

7. I have also used batch normalisation with Xavier initialisation.

8. After training, to test, I take the argmax (which returns the indices of the max value) along the depth or channel dimension (dim=1).

Input shape to the model: [400,24,128,128,3]

(Here 400 means 4 classes * 100 different objects per class.)

Model output shape, something like: [24,2,32,32,32]

After argmax: [24,32,32,32]

This is all I have done.

Please correct me if I have done something wrong in the preparation of the dataset (or if I have to extend my dataset with more images or augmentation), if there is a different way to train these types of models, or if the argmax step is wrong.
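
One thing worth checking in step 4: with 24 views per object, a batch size of 24 and no shuffling, every batch holds views of a single object, so all 24 targets in a batch are the same voxel grid. Below is a minimal sketch of the difference shuffling makes, using only the standard library. The helper `make_batches`, the object ids and the toy sizes (8 objects) are hypothetical and not taken from the training code.

```python
import random

def make_batches(pairs, batch_size, shuffle, seed=0):
    """Split (image_id, object_id) pairs into batches, optionally shuffled."""
    pairs = list(pairs)
    if shuffle:
        random.Random(seed).shuffle(pairs)
    return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]

# Toy stand-in for the dataset: 8 objects, 24 views each.
# Each view is paired with the id of the object (and hence binvox file) it shows.
pairs = [(f"obj{o}_view{v}", o) for o in range(8) for v in range(24)]

ordered = make_batches(pairs, batch_size=24, shuffle=False)
mixed = make_batches(pairs, batch_size=24, shuffle=True)

# Number of distinct target objects inside each batch.
ordered_counts = [len({obj for _, obj in batch}) for batch in ordered]
mixed_counts = [len({obj for _, obj in batch}) for batch in mixed]

print("distinct objects per batch, no shuffle:", ordered_counts)
print("distinct objects per batch, shuffled:  ", mixed_counts)
```

In PyTorch the usual way to get the shuffled behaviour is to wrap the image/voxel pairs in a Dataset and pass shuffle=True to torch.utils.data.DataLoader. Whether this is the cause of the identical outputs here is only a guess.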

Here is the given model and training code–

n_epochs = 10
# model = model.cuda()  # uncomment if the model is not already on the GPU
for epoch in range(1, n_epochs + 1):
    print("number of epoch", epoch)
    model.train()
    train_loss = 0.0
    n_samples = 0

    for i in range(len(arr)):
        data = arr[i].cuda()
        tar = arr_3d[i].cuda().long()  # CrossEntropyLoss needs integer class indices

        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, tar)
        loss.backward()
        # perform a single optimization step (parameter update)
        optimizer.step()

        # update training loss, weighted by the batch size
        train_loss += loss.item() * data.size(0)
        n_samples += data.size(0)

        if i % 100 == 0 and i != 0:
            scheduler.step()
            # get_last_lr() replaces get_lr() in recent PyTorch versions
            print(scheduler.get_last_lr())

    # report the average per-sample loss once the epoch has finished
    print("ALL ABOUT LOSS(training)--------", train_loss / n_samples, "----------\n")


    

Model –


@ptrblck Please help if possible
I have also redesigned the dataset by taking 4 classes with 25 different objects each (ShapeNet dataset), and I use augmentation with different transformations to get around 4 different images for each image per object per class.

The code looks generally alright.
Which criterion are you using and what is the last non-linearity in your model?

PS: Tagging certain people might discourage others from answering in your thread, and as you can see I’m quite clueless here. :wink:

@ptrblck thanks for your reply.
I am using Cross Entropy Loss as the criterion. The last non-linearity in my model is a softmax across dim=1, i.e. across the channels (or, put differently, across the depth).
In my case the model outputs [batch_size,2,32,32,32], so I took the softmax across dim=1.

PS: Your answers always help, and this gives me motivation to carry my work further. Thanks to you. :innocent:

nn.CrossEntropyLoss expects raw logits, so remove the softmax at the output of your model and try to train it again.
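
To see why the extra softmax hurts: nn.CrossEntropyLoss already applies a log-softmax to its input, so a model that ends in a softmax gets normalised twice. Here is a minimal sketch in plain Python for the two-channel case. It does not use PyTorch; `cross_entropy` is a hand-written stand-in for the per-voxel loss, and the logit values are made up for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(inputs, target):
    """Log-softmax followed by negative log-likelihood, for one voxel."""
    return -math.log(softmax(inputs)[target])

# Two-channel prediction for one voxel; the model is very sure of class 1.
logits = [-10.0, 10.0]
target = 1

loss_from_logits = cross_entropy(logits, target)          # intended usage
loss_from_probs = cross_entropy(softmax(logits), target)  # softmax applied twice

# Probabilities lie in [0, 1], so after the second softmax the loss for two
# classes can never drop below log(1 + e^-1), however confident the model is.
floor = math.log(1.0 + math.exp(-1.0))

print(f"loss from raw logits:    {loss_from_logits:.6f}")
print(f"loss from softmax output: {loss_from_probs:.6f}")
print(f"lower bound (2 classes):  {floor:.6f}")
```

With the softmax left in the model, even a perfect prediction keeps a sizeable loss, and the gradients that reach the network are squashed, which may be part of why training stalls.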

That’s good to hear and you are most welcome! :wink:

Yeah, I will try it. Thanks.

Hey @Ashish_Gupta1, I am working on the same problem. If you don’t mind, can you share the code you used to load and render 3D models in Python?

@Muhammad_Wajahat For my problem I have used Pascal VOC and ShapeNet. The ShapeNet 3D models come in .binvox format, and the link for the binvox lib is: https://github.com/dimatura/binvox-rw-py
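
For reference, the .binvox layout is simple enough to parse by hand: a short text header followed by run-length encoded (value, count) byte pairs. The sketch below is a standard-library illustration of that layout, not the binvox-rw-py API. `read_binvox` is a hypothetical helper, and the 2x2x2 sample is made up.

```python
import io

def read_binvox(fp):
    """Parse a binary binvox stream into (dims, flat list of 0/1 voxel values)."""
    if not fp.readline().strip().startswith(b"#binvox"):
        raise ValueError("not a binvox file")
    dims = None
    # Header lines: dim, translate, scale, then a line that reads 'data'.
    while True:
        line = fp.readline()
        if not line:
            raise ValueError("unexpected end of header")
        parts = line.strip().split()
        if not parts:
            continue
        if parts[0] == b"dim":
            dims = tuple(int(p) for p in parts[1:4])
        elif parts[0] == b"data":
            break
    if dims is None:
        raise ValueError("missing dim line")
    # Body: run-length encoded (value, count) byte pairs.
    raw = fp.read()
    voxels = []
    for value, count in zip(raw[0::2], raw[1::2]):
        voxels.extend([value] * count)
    if len(voxels) != dims[0] * dims[1] * dims[2]:
        raise ValueError("voxel count does not match dims")
    return dims, voxels

# Tiny hand-made 2x2x2 grid: 3 empty voxels, then 5 occupied ones.
sample = (b"#binvox 1\ndim 2 2 2\ntranslate 0 0 0\nscale 1\ndata\n"
          + bytes([0, 3, 1, 5]))
dims, voxels = read_binvox(io.BytesIO(sample))
print(dims, voxels)
```

For real ShapeNet files the linked library is the more practical choice, since it also reshapes the flat data into a 3D array; this sketch leaves the voxels flat and does not deal with axis order.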

Let’s connect via LinkedIn: https://www.linkedin.com/in/ashish-gupta-36934874

@Ashish_Gupta1 Sure, let’s connect via LinkedIn: https://www.linkedin.com/in/muhammad-wajahat-614249118/
I am also using ShapeNet, but I am using the .obj files because I want to use point clouds for my neural network and meshes to render images of ShapeNet.
I would also like to know how you rendered the images, because I am using the Blender API to render them and it is messing up the textures in my renders.