I have the following data:
Ground truth image
Segmentation mask image
And a .npy file that contains the pixel-wise labels for the ground truth.
I want to use this data to train an FCN from scratch. The structure of the FCN is as follows:
Conv2D
Dropout
BN
Activation
This block is repeated three times to finish the model.
How do I pass the .npy data to an FCN so that I can train it from scratch to generate segmentation masks?
You should be able to load the numpy array via np.load, transform it to a tensor via torch.from_numpy(array), and pass the tensor to the PyTorch model.
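A minimal sketch of the loading step described above. The file name, the 4 x 4 image size, and the 34 classes are placeholder assumptions, and a temporary file stands in for the real .npy label file:

```python
import os
import tempfile

import numpy as np
import torch

# Hypothetical stand-in for the real label file: a 4 x 4 map of class indices.
labels_np = np.random.randint(0, 34, size=(4, 4)).astype(np.int64)
path = os.path.join(tempfile.mkdtemp(), "labels.npy")
np.save(path, labels_np)

# Load the array and convert it to a tensor.
array = np.load(path)
labels = torch.from_numpy(array).long()  # class indices as int64

# Hypothetical RGB image of the same spatial size, in C x H x W layout.
image = torch.rand(3, 4, 4)

# Add a batch dimension: image -> 1 x 3 x H x W, labels -> 1 x H x W.
image_batch = image.unsqueeze(0)
label_batch = labels.unsqueeze(0)
print(image_batch.shape, label_batch.shape)
```

The image batch is what would be passed to the model, while the label batch would go to the loss function as the target.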
Thanks, this worked! However, my classification accuracy is really bad. My network is as follows:
class label_net_3c(nn.Module):
    def __init__(self):
        super(label_net_3c, self).__init__()
        self.conv1 = nn.Conv2d(3, 3, kernel_size = 1)
        self.bn1 = nn.BatchNorm2d(3)
        self.act1 = nn.ReLU(inplace = True)
        self.conv2 = nn.Conv2d(3, 2, kernel_size = 1)
        self.bn2 = nn.BatchNorm2d(2)
        self.act2 = nn.ReLU(inplace = True)
        self.conv3 = nn.Conv2d(2, 34, kernel_size = 1)
        self.act3 = nn.Softmax(dim = 1)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.act1(x)
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.act2(x)
        x = self.conv3(x)
        x = self.act3(x)
        return x
It’s a pixel-wise classifier, so I pass batches of N pixels such that the input is N x 3 x 1 x 1 for a given RGB image. Any tips on how to improve the accuracy?
I guess you are using nn.CrossEntropyLoss as the loss function for your segmentation use case.
If so, remove the nn.Softmax layer, as nn.CrossEntropyLoss expects raw logits, not probabilities.
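To illustrate why this matters, here is a small sketch with made-up logits for a single pixel and 34 classes. It compares the loss computed on raw logits with the loss computed after an extra softmax, for a prediction that is confidently correct:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Made-up logits for one pixel: class 5 gets a large score, all others zero.
logits = torch.zeros(1, 34)
logits[0, 5] = 10.0
target = torch.tensor([5])

# Intended usage: pass the raw logits.
loss_logits = criterion(logits, target)

# Usage with an extra softmax layer: the loss sees probabilities in [0, 1],
# so softmax is effectively applied twice.
probs = torch.softmax(logits, dim=1)
loss_probs = criterion(probs, target)

print(f"loss on logits: {loss_logits.item():.4f}")
print(f"loss on probabilities: {loss_probs.item():.4f}")
```

Even though the prediction is correct in both cases, the loss on probabilities stays far from zero, because values squashed into [0, 1] cannot produce a sharp distribution after the second softmax.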
Gotcha, I assume nn.CrossEntropyLoss has a built-in softmax operation. I did what you suggested, but my accuracy is still really bad.
Yes, nn.CrossEntropyLoss calls F.log_softmax and nn.NLLLoss internally.
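This relationship can be checked numerically. The sketch below uses random logits and targets (the batch size of 8 and the 34 classes are arbitrary choices) and compares both ways of computing the loss:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Random logits for 8 samples and 34 classes, with random class targets.
logits = torch.randn(8, 34)
target = torch.randint(0, 34, (8,))

# Loss computed directly from raw logits.
loss_ce = nn.CrossEntropyLoss()(logits, target)

# Same loss computed in two explicit steps: log-softmax, then negative log-likelihood.
loss_manual = nn.NLLLoss()(F.log_softmax(logits, dim=1), target)

print(torch.allclose(loss_ce, loss_manual))
```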
Could you try to overfit a small dataset (e.g. just 10 samples) by playing around with some hyper-parameters? If this still doesn’t work there might be another issue in the code which I missed.
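A rough sketch of such an overfitting check, assuming 10 random RGB pixels with random labels and a small stand-in model (not the network from the post above). The learning rate and the number of steps are arbitrary starting points:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical tiny dataset: 10 RGB pixels (N x 3 x 1 x 1) with random labels in [0, 34).
x = torch.rand(10, 3, 1, 1)
y = torch.randint(0, 34, (10, 1, 1))

# Small stand-in model; the last layer outputs raw logits (no softmax).
model = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=1),
    nn.ReLU(),
    nn.Conv2d(64, 34, kernel_size=1),
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for step in range(500):
    optimizer.zero_grad()
    loss = criterion(model(x), y)  # logits: N x 34 x 1 x 1, target: N x 1 x 1
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}")
```

If the loss does not drop clearly on such a tiny dataset, the problem is more likely in the training code or the data pipeline than in the hyper-parameters.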
Hi, I think my features were just too weak since I was trying to see if I could get some classification results with a few samples only. I am trying to now plot the intermediate features.
I have a tensor of size (1, 32, 256, 256). Can I plot a t-SNE for this? I just want to visualize the data, not compare it to the ground truth.
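One possible way to prepare such a tensor for t-SNE, assuming each pixel is treated as one 32-dimensional sample. The random tensor is a placeholder for the real features, and the subset size of 2000 is an arbitrary choice to keep the embedding step manageable:

```python
import torch

torch.manual_seed(0)

# Placeholder for the real intermediate features (N x C x H x W).
features = torch.rand(1, 32, 256, 256)

# Move channels last and flatten: one row per pixel, one column per channel.
n, c, h, w = features.shape
samples = features.permute(0, 2, 3, 1).reshape(-1, c)

# Pick a random subset of pixels and convert it to a NumPy array.
idx = torch.randperm(samples.shape[0])[:2000]
subset = samples[idx].detach().cpu().numpy()

print(samples.shape, subset.shape)
```

The resulting (2000, 32) array could then be passed to a t-SNE implementation, for example the one in scikit-learn, and the 2D output plotted as a scatter plot.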
I figured out the problem. I am doing
m = pixel_classifier()
pred = m(train_batch)
print(pred.grad_fn)
# <AddmmBackward0 object at 0x7f865771f590>
_, pred = torch.max(pred, dim = 1)
print(pred.grad_fn)
# None
It seems that torch.max() is breaking the graph, and my loss function and optimizer aren't updating the parameters for this reason.
Is there a workaround for torch.max()? I need it to get the labels for the predictions.
torch.argmax (i.e. the second return value from torch.max) is not differentiable, as the gradients would be zero almost everywhere.
You can use it to calculate the accuracy but not to train the model.
For multi-class classification, use nn.CrossEntropyLoss and pass the logits to it.
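A minimal sketch of this split between training and evaluation. The linear layer is a hypothetical stand-in for the pixel classifier, and the batch and targets are random placeholders:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the pixel classifier: 3 input features, 34 classes.
model = nn.Linear(3, 34)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

train_batch = torch.rand(16, 3)
target = torch.randint(0, 34, (16,))

# Training: compute the loss on the raw logits, which keeps the graph intact.
logits = model(train_batch)
loss = criterion(logits, target)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Evaluation only: argmax gives the predicted labels for the accuracy.
with torch.no_grad():
    pred = logits.argmax(dim=1)
    accuracy = (pred == target).float().mean().item()

print(logits.grad_fn is not None)  # logits are attached to the graph
print(pred.grad_fn)                # predicted labels are not
print(f"accuracy: {accuracy:.3f}")
```

The predicted labels are only used for reporting, so the missing grad_fn on them does not affect training.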