Tensorboard writer.add_embedding example

Xanthan · March 23, 2022, 8:46am

I want to create a visualization on tensorboard of the data just like the one here:

I’m loading my image data (.jpg files) from a folder, so I get errors saying that .data and .targets are not present. What would be the correct way to use add_embedding in my case?

This is how I’m loading my data:

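from torchvision import datasets, transforms

train_data_dir = "D:\\dataset\\train"
# resize, crop to 224x224, and convert each image to a [3, 224, 224] tensor
transform = transforms.Compose([
    transforms.Resize(255),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

train_dataset = datasets.ImageFolder(train_data_dir, transform=transform)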

ptrblck · March 24, 2022, 2:20am

The targets are used to get the class labels and pass them to add_embedding.
However, add_embedding only requires the input tensor; the other arguments are optional. If you don’t have the targets or don’t want to add labels to the plot, you can skip them.
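
To illustrate the point about optional arguments, here is a minimal sketch. The helper name log_embedding, the log directory, and the random features are assumptions made for illustration; only SummaryWriter.add_embedding and its mat, metadata and label_img parameters come from the library.

```python
import torch


def log_embedding(mat, log_dir, metadata=None, label_img=None):
    """Write one embedding to TensorBoard; only `mat` is required."""
    # Imported here so the rest of the sketch runs without tensorboard installed.
    from torch.utils.tensorboard import SummaryWriter

    writer = SummaryWriter(log_dir)
    writer.add_embedding(mat, metadata=metadata, label_img=label_img)
    writer.close()


# Random stand-in for real features: 16 points with 64 features each.
features = torch.randn(16, 64)

# Unlabeled plot (hypothetical log directory):
#   log_embedding(features, "runs/embedding_demo")
# Labeled plot, one metadata entry per row of `features`:
#   log_embedding(features, "runs/embedding_demo",
#                 metadata=[str(i % 4) for i in range(16)])
```

If metadata is passed, it needs one entry per row of the tensor.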

Xanthan · March 24, 2022, 7:06am

Yeah, I get that. My actual problem was how to use writer.add_embedding with an ImageFolder or DataLoader object, since neither of them has data and targets as attributes. Later I tried images, labels = next(iter(train_dataloader)), which I guess works as well. The only remaining problem is that the images in train_dataloader are not grayscale, so I get the error AssertionError: #labels should equal with #data points. Below is the implementation:

train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)
images, labels = next(iter(train_dataloader))
image, label = select_n_random(images, labels)

# get the class labels for each image
class_labels = [classes[lab] for lab in label]

# log embeddings
print(image.shape, label.shape)

features = image.view(-1, 28 * 28)
print(features.shape)
writer.add_embedding(features,
                    metadata=class_labels,
                    label_img=images.unsqueeze(1))
writer.close()

Since image.view(-1, 28 * 28) merges my batch and channel (RGB) dimensions, it throws an error. Is there a way to convert the RGB tensor to grayscale? And no, I can’t use transforms.Grayscale for this, as my model requires RGB input.
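
For reference, the shape arithmetic behind that assertion can be sketched as follows. The batch of 32 images at 3 x 224 x 224 is an assumption based on the batch_size and CenterCrop(224) shown earlier; the random tensors are stand-ins for the real data.

```python
import torch

# Stand-in batch with the shapes from the post: 32 RGB images of 224x224.
images = torch.rand(32, 3, 224, 224)
labels = torch.randint(0, 2, (32,))

# The 28 * 28 reshape assumes small single-channel images, so here it
# spreads each image over many rows instead of producing one row per image.
wrong = images.view(-1, 28 * 28)
print(wrong.shape)   # torch.Size([6144, 784])
print(len(labels))   # 32
```

With 6144 rows but only 32 labels, the row count no longer matches the label count, which is the condition the assertion checks.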

ptrblck · March 24, 2022, 7:12am

ImageFolder uses the internal .samples and .targets attributes so you could use these.
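
A rough sketch of how those attributes could be turned into label strings is below. The helper class_names and the stand-in dataset are hypothetical; a real datasets.ImageFolder exposes .classes, .samples and .targets with the same layout.

```python
from types import SimpleNamespace


def class_names(dataset, indices):
    """Map sample indices to class-name strings via .targets and .classes."""
    return [dataset.classes[dataset.targets[i]] for i in indices]


# Stand-in exposing the same attributes as torchvision.datasets.ImageFolder,
# so the sketch runs without an image folder on disk. File names are made up.
fake_dataset = SimpleNamespace(
    classes=["cat", "dog"],
    samples=[("cat/a.jpg", 0), ("dog/b.jpg", 1), ("dog/c.jpg", 1)],
    targets=[0, 1, 1],
)

print(class_names(fake_dataset, [2, 0]))  # ['dog', 'cat']
```

The resulting list can be passed as metadata to add_embedding, as long as it has one entry per embedded sample.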

Xanthan · March 24, 2022, 7:20am

Yes, that solves the missing .data attribute problem. But the tensor still needs to be converted to a 1-channel image before doing features = image.view(-1, 28 * 28). Any ideas on that?

ptrblck · March 24, 2022, 7:28am

add_embedding expects a 2D embedding matrix in the shape [batch_size, features] not an input image.
I don’t know why you are trying to use the inputs, but you could transform them to grayscale (you wouldn’t pass them to the model anyway) or flatten the channel dimension with the spatial dimensions into the “feature” dimension: x = x.view(x.size(0), -1).
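
Both options can be sketched on a stand-in batch. The shapes are assumed from the earlier posts, and the channel average is only a crude grayscale chosen to keep the example dependency-free, not a luminance-weighted conversion.

```python
import torch

x = torch.rand(32, 3, 224, 224)  # stand-in batch [N, C, H, W]

# Option 1: fold channels and pixels into one feature dimension.
flat = x.view(x.size(0), -1)
print(flat.shape)  # torch.Size([32, 150528])

# Option 2: plain channel average as a crude grayscale, then flatten.
gray = x.mean(dim=1, keepdim=True)
gray_flat = gray.view(gray.size(0), -1)
print(gray_flat.shape)  # torch.Size([32, 50176])
```

Either result has one row per image, so it matches the [batch_size, features] layout described above.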

Xanthan · March 24, 2022, 8:40am

Oh okay, got it. I solved it by extracting the features from my model (the output of the fully connected layer), which I think I was supposed to do in the first place. Thanks for the help!
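
The poster's code for this step is not shown in the thread. One possible way to do it, assuming a forward hook on a fully connected layer, is sketched below. TinyNet, extract_features and the log directory are hypothetical stand-ins, not the poster's model or method.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the poster's model."""

    def __init__(self, num_classes=2):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc1 = nn.Linear(8, 16)  # layer whose output is used as the embedding
        self.fc2 = nn.Linear(16, num_classes)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x))).flatten(1)
        return self.fc2(torch.relu(self.fc1(x)))


def extract_features(model, layer, images):
    """Run one forward pass and return the detached output of `layer`."""
    captured = []
    handle = layer.register_forward_hook(
        lambda module, inputs, output: captured.append(output.detach())
    )
    model.eval()
    with torch.no_grad():
        model(images)
    handle.remove()  # leave the model without the hook afterwards
    return captured[0]


model = TinyNet()
images = torch.rand(8, 3, 224, 224)  # stand-in RGB batch
features = extract_features(model, model.fc1, images)
print(features.shape)  # torch.Size([8, 16])

# With tensorboard installed, the features can then be logged:
# from torch.utils.tensorboard import SummaryWriter
# writer = SummaryWriter("runs/embedding_demo")
# writer.add_embedding(features, label_img=images)
# writer.close()
```

Here the model still receives RGB input, while add_embedding gets a 2D [batch_size, features] matrix, so no grayscale conversion is needed.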