I have the following data:
Ground truth image
Segmentation mask image
.npy file that contains the pixel-wise labels for the ground truth.
I want to use this data to train an FCN from scratch. The structure of the FCN is as follows:
This block is repeated three times to finish the model.
How do I pass the .npy data to an FCN so that I can train it from scratch to generate segmentation masks?
You should be able to load the numpy array via np.load, transform it to a tensor via torch.from_numpy(array), and pass the tensor to the PyTorch model.
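For reference, a minimal sketch of that loading step. The file name, the 64 x 64 image size, and the 34 classes are placeholders I made up for illustration; your real arrays will have their own shapes.

```python
import os
import tempfile

import numpy as np
import torch

# Stand-in for the real label file: an H x W array of class indices.
# The file name, the image size, and the 34 classes are assumptions.
labels = np.random.randint(0, 34, size=(64, 64), dtype=np.int64)
path = os.path.join(tempfile.mkdtemp(), "labels.npy")
np.save(path, labels)

# Load the array and convert it to a tensor of int64 class indices.
label_array = np.load(path)
label_tensor = torch.from_numpy(label_array).long()

# Stand-in RGB image (H x W x 3, uint8), scaled to [0, 1].
image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
image_tensor = torch.from_numpy(image).float().div(255.0)

# For a per-pixel classifier: one row per pixel, shaped N x 3 x 1 x 1,
# with a matching flat vector of N target labels.
pixels = image_tensor.reshape(-1, 3, 1, 1)
targets = label_tensor.reshape(-1)
```

A convolutional model that takes whole images would instead want the image as 3 x H x W, e.g. via image_tensor.permute(2, 0, 1).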
Thanks, this worked! However, my classification accuracy is really bad. My network is as follows:
import torch.nn as nn

class pixel_classifier(nn.Module):
    def __init__(self):
        super().__init__()
        # 1x1 convolutions act as a per-pixel MLP on the RGB values
        self.conv1 = nn.Conv2d(3, 3, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(3)
        self.act1 = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(3, 2, kernel_size=1)
        self.bn2 = nn.BatchNorm2d(2)
        self.act2 = nn.ReLU(inplace=True)
        self.conv3 = nn.Conv2d(2, 34, kernel_size=1)  # 34 output classes
        self.act3 = nn.Softmax(dim=1)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.act1(x)
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.act2(x)
        x = self.conv3(x)
        x = self.act3(x)
        return x
It's a pixel-wise classifier, so I pass batches of N pixels, which makes the input N x 3 x 1 x 1 for a given RGB image. Any tips on how to improve the accuracy?
I guess you are using nn.CrossEntropyLoss as the loss function for your segmentation use case. If so, remove the nn.Softmax layer, as nn.CrossEntropyLoss expects raw logits, not probabilities.
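A small check that illustrates the point, using random tensors as stand-ins for real model outputs (the batch size of 8 is arbitrary). It compares the loss on raw logits with a manual log-softmax plus negative log-likelihood, and with the value you get when softmax has already been applied.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(8, 34)          # raw, unnormalized scores (stand-in data)
target = torch.randint(0, 34, (8,))  # class indices

criterion = nn.CrossEntropyLoss()
loss_from_logits = criterion(logits, target)

# Same computation done by hand: log-softmax, then negative log-likelihood.
manual = F.nll_loss(F.log_softmax(logits, dim=1), target)

# Passing probabilities means softmax is effectively applied twice,
# which yields a different loss and flatter gradients.
loss_from_probs = criterion(torch.softmax(logits, dim=1), target)

print(loss_from_logits.item(), manual.item(), loss_from_probs.item())
```

The first two printed values should agree, while the third differs.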
Gotcha, I assume nn.CrossEntropyLoss has a built-in Softmax operation. I did what you suggested, but my accuracy is still really bad.
Could you try to overfit a small dataset (e.g. just 10 samples) by playing around with some hyper-parameters? If this still doesn’t work there might be another issue in the code which I missed.
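Such an overfitting test could look roughly like this. The random data, the stand-in model with a hidden width of 64, the learning rate, and the step count are all assumptions for the sketch, not your actual setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical tiny dataset: 10 RGB pixels shaped N x 3 x 1 x 1 with random labels.
x = torch.rand(10, 3, 1, 1)
y = torch.randint(0, 34, (10,))

# Stand-in pixel classifier (the width of 64 is an assumption); outputs raw logits.
model = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=1),
    nn.ReLU(),
    nn.Conv2d(64, 34, kernel_size=1),
    nn.Flatten(),  # N x 34 x 1 x 1 -> N x 34
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for step in range(300):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Accuracy on the same 10 samples; argmax is only used for evaluation here.
with torch.no_grad():
    train_acc = (model(x).argmax(dim=1) == y).float().mean().item()
print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}, train accuracy {train_acc:.2f}")
```

If the loss on such a tiny set does not drop clearly, the problem is more likely in the data pipeline or the training loop than in the model capacity.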
Hi, I think my features were just too weak, since I was only trying to see whether I could get some classification results with a few samples. I am now trying to plot the intermediate features.
I have a tensor of size (1, 32, 256, 256). Can I plot a t-SNE for this? I just want to visualize the data; I don't want to compare it to the ground truth.
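One possible way to do this, as a sketch: treat every pixel as a 32-dimensional point, subsample the pixels (t-SNE on all 65,536 points would be slow), and embed them in 2D. The helper names, the sample count of 2000, and the use of scikit-learn's TSNE are my assumptions, not something from this thread.

```python
import torch


def features_to_points(feats, n_samples=2000, seed=0):
    """Turn a (1, C, H, W) feature map into an (n_samples, C) array of per-pixel vectors."""
    c = feats.shape[1]
    # (1, C, H, W) -> (C, H, W) -> (H, W, C) -> (H*W, C)
    points = feats.detach().squeeze(0).permute(1, 2, 0).reshape(-1, c)
    # Random subset of pixels, reproducible via the seed.
    gen = torch.Generator().manual_seed(seed)
    idx = torch.randperm(points.shape[0], generator=gen)[:n_samples]
    return points[idx].cpu().numpy()


def embed_tsne(points, perplexity=30.0):
    """Project the points to 2D with scikit-learn's t-SNE (assumed to be installed)."""
    # Imported here so the reshaping helper works without scikit-learn.
    from sklearn.manifold import TSNE

    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=0)
    return tsne.fit_transform(points)


feats = torch.randn(1, 32, 256, 256)  # stand-in for the real intermediate features
points = features_to_points(feats)
print(points.shape)
```

Calling embed_tsne(points) then returns an array with one 2D coordinate per sampled pixel, which can be drawn as a scatter plot.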
I figured out the problem. I am doing
m = pixel_classifier()
pred = m(train_batch)
# <AddmmBackward0 object at 0x7f865771f590>
_, pred = torch.max(pred, dim = 1)
torch.max() is breaking the graph, and for this reason my loss isn't backpropagating and my model parameters aren't updating.
Is there a workaround for torch.max()? I need it to get the labels for the predictions.
torch.argmax (i.e. the second return value from torch.max) is not differentiable, as the gradients would be zero almost everywhere. You can use it to calculate the accuracy but not to train the model. For multi-class classification, use nn.CrossEntropyLoss and pass the logits to it.
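A short sketch of how the two uses can be separated. The nn.Linear layer is only a hypothetical stand-in for the real classifier, and the data is random.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 34)  # hypothetical stand-in for the pixel classifier
x = torch.randn(16, 3)
target = torch.randint(0, 34, (16,))

# Training path: the loss is computed on the raw logits, so gradients flow.
logits = model(x)
loss = nn.CrossEntropyLoss()(logits, target)
loss.backward()

# Evaluation path: argmax returns integer labels that are detached from the graph.
pred = logits.argmax(dim=1)
accuracy = (pred == target).float().mean().item()

print(pred.grad_fn)                   # None
print(model.weight.grad is not None)  # True
```

The predicted labels carry no grad_fn, which is why a loss built on them cannot update the parameters, while the loss on the logits fills in model.weight.grad.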