Determining the number of output classes

Hello Dear all,

I have recently started using PyTorch. My question is actually not strictly about PyTorch.
I want to know whether an image contains a specific object or not. Put simply: does a picture contain a plane?

To determine this with a CNN, what should the number of output classes be? My code is posted below; the number of output classes is set to 1 in the last layer (fc2).

This approach gives the following error:

    File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 156, in backward
      torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
    File "/usr/local/lib/python2.7/dist-packages/torch/autograd/__init__.py", line 98, in backward
      variables, grad_variables, retain_graph)
    RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THC/THCTensorCopy.cu:100

Thank you all.

    import torch.nn as nn
    import torch.nn.functional as F

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.pool = nn.MaxPool2d(2, 2)
            self.conv1 = nn.Conv2d(3, 10, 5)
            self.conv2 = nn.Conv2d(10, 20, 5)
            self.fc1 = nn.Linear(20 * 5 * 5, 256)
            self.fc2 = nn.Linear(256, 1)  # number of output classes: 1

        def forward(self, x):
            # print x.size()
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = x.view(-1, 20 * 5 * 5)  # flatten the tensor
            x = F.relu(self.fc1(x))
            x = self.fc2(x)
            return F.log_softmax(x)

Try F.logsigmoid if you have one output.
I’m not sure the error comes from the softmax, but the function doesn’t make sense in your use case.
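
To illustrate why log_softmax does not make sense with a single output: softmax normalizes across the class dimension, so with only one class the probability is always 1 and its log is always 0, whatever the input. Here is a minimal pure-Python sketch; the helpers `log_softmax` and `log_sigmoid` are illustrative stand-ins written for this example, not the PyTorch implementations.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x)), computed as -softplus(-x)."""
    return -(max(-x, 0.0) + math.log1p(math.exp(-abs(x))))

inputs = (-3.0, 0.0, 5.0)

# With one "class", log-softmax carries no information: every entry is 0.
single_logit_outputs = [log_softmax([x])[0] for x in inputs]

# Log-sigmoid of the same inputs still depends on the input value.
log_sigmoid_outputs = [log_sigmoid(x) for x in inputs]

print(single_logit_outputs)
print(log_sigmoid_outputs)
```

So a network ending in log_softmax over one output produces a constant, and nothing can be learned from it.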

First, thank you @ptrblck for your quick response. I will try your recommendation as soon as possible.
Maybe I should have been clearer.
As I said, I am trying to detect the presence of a specific object in an image. My dataset actually has two classes: half of the images contain a plane and the other half do not. (All images show sky; half of them have a plane in it, the others don't.)

Given this, I could not clearly determine what my network output should be. Is a single output okay?

Yes, one output is fine. You could use an nn.Sigmoid layer as the model's output together with BCELoss.
If you would rather work with logits, you could skip the sigmoid and use BCEWithLogitsLoss directly on the raw output. The second approach should be numerically more stable.

In my previous post I suggested logsigmoid, which was wrong for use with BCELoss.
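
The two options could be sketched roughly as follows. The tensors `logits` and `targets` are made-up stand-ins for the raw fc2 output and the labels (1.0 = plane, 0.0 = no plane), not data from this thread. Note that both losses expect float targets with the same shape as the model output.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins: raw fc2 outputs and binary labels for a batch of 8.
logits = torch.randn(8, 1)                      # shape (batch, 1), no sigmoid applied
targets = torch.randint(0, 2, (8, 1)).float()   # float targets, same shape as logits

# Option A: apply a sigmoid to get probabilities, then use BCELoss.
loss_a = nn.BCELoss()(torch.sigmoid(logits), targets)

# Option B: feed the raw logits straight into BCEWithLogitsLoss.
loss_b = nn.BCEWithLogitsLoss()(logits, targets)

# For moderate logits both options give the same value up to rounding.
print(loss_a.item(), loss_b.item())
```

With option B the model's forward pass returns the fc2 output unchanged, and a sigmoid is applied only at inference time when probabilities are needed.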

Dear @ptrblck, I have one more question. After adding a sigmoid layer as shown below:

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.pool = nn.MaxPool2d(2, 2)
            self.conv1 = nn.Conv2d(3, 10, 5)
            self.conv2 = nn.Conv2d(10, 20, 5)
            self.fc1 = nn.Linear(20 * 5 * 5, 256)
            self.fc2 = nn.Linear(256, 1)

        def forward(self, x):
            # print x.size()
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = x.view(-1, 20 * 5 * 5)  # flatten the tensor
            x = F.relu(self.fc1(x))
            x = self.fc2(x)
            return F.sigmoid(x)

How can I determine the output label?

I mean, after the image passes through the network, the output is a value in the range [0, 1]. How can I decide from this value whether the object is present in the image? For example, should I choose a threshold value such as 0.5?

In the simplest case, you can just use 0.5 as the threshold to calculate your predictions.
You could also compute the ROC curve, choose a point with a suitable TPR vs. FPR trade-off, and use the corresponding value as your threshold.
Have a look at some examples from scikit-learn.
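
As a rough sketch of both options with scikit-learn's `roc_curve`: the labels and scores below are invented validation data, and picking the point that maximizes TPR - FPR (Youden's J statistic) is just one possible criterion assumed here for illustration. Which trade-off is "suitable" depends on your application.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical validation set: true labels and the network's sigmoid outputs.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1])
y_score = np.array([0.10, 0.20, 0.30, 0.55, 0.45, 0.60, 0.70, 0.80, 0.90])

# Simplest case: a fixed threshold of 0.5.
pred_default = (y_score >= 0.5).astype(int)

# ROC-based choice: one (FPR, TPR) pair per candidate threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Assumed criterion: maximize TPR - FPR (Youden's J).
best_index = np.argmax(tpr - fpr)
best_threshold = thresholds[best_index]
pred_tuned = (y_score >= best_threshold).astype(int)

print("chosen threshold:", best_threshold)
print("TPR:", tpr[best_index], "FPR:", fpr[best_index])
```

The threshold should be chosen on held-out validation data rather than on the training set, so that it reflects how the model behaves on unseen images.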
