[cs231n Query] output with shape [1, 32, 32] doesn't match the broadcast shape [3, 32, 32]

VDzungsignature · June 18, 2022, 9:39am

Hello everyone. I am doing the last cs231n assignment (self-supervised learning) and have run into a problem.
Here is the code:

The error is:
…
return F.normalize(tensor, self.mean, self.std, self.inplace)
File "c:\Users\Admin\Anaconda3\lib\site-packages\torchvision\transforms\functional.py", line 363, in normalize
tensor.sub_(mean).div_(std)
RuntimeError: output with shape [1, 32, 32] doesn't match the broadcast shape [3, 32, 32]
…

I know this issue is related to the transform. Here is the transform (it is also the required solution of the assignment, so I don't want to change it):
…
def compute_train_transform(seed=123456):
    """
    This function returns a composition of data augmentations to a single training image.
    Complete the following lines. Hint: look at available functions in torchvision.transforms
    """
    random.seed(seed)
    torch.random.manual_seed(seed)
    # Transformation that applies color jitter with brightness=0.4, contrast=0.4, saturation=0.4, and hue=0.1
    color_jitter = transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)
    train_transform = transforms.Compose([
        # Step 1: Randomly resize and crop to 32x32.
        transforms.RandomResizedCrop(32),
        # Step 2: Horizontally flip the image with probability 0.5
        transforms.RandomHorizontalFlip(p=0.5),
        # Step 3: With a probability of 0.8, apply color jitter (you can use "color_jitter" defined above).
        transforms.RandomApply(torch.nn.ModuleList([color_jitter]), p=0.8),
        # Step 4: With a probability of 0.2, convert the image to grayscale
        transforms.RandomApply(torch.nn.ModuleList([transforms.Grayscale()]), p=0.2),
        transforms.ToTensor(),
        transforms.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010])])
    return train_transform
…

I did not have this issue before. My current versions are:
The torch version is 1.11.0
and torchvision version is 0.12.0

Does anyone know how to solve this issue?

Thank you very much

ptrblck · June 19, 2022, 9:41pm

The issue is raised in the Normalize transformation, as it expects an input with 3 channels (since the stats contain 3 values each), while the previous Grayscale transformation creates an output containing a single channel.
You could use Grayscale(num_output_channels=3) to repeat the gray color channel, but would then have to consider changing the stats of Normalize as you are currently using different values for each (RGB) channel.

VDzungsignature · June 25, 2022, 1:22pm

I have tried transforms.Grayscale(num_output_channels=3) and it works.
When I train the model with the following code, I get another issue. I guess it is related to the transform function mentioned above:

…

# Prepare the data.
train_transform = compute_train_transform(seed=2147483647)
train_data = CIFAR10Pair(root='E:/ALL_COURSE/CS231n/Lecture 12/assignment3/cs231n/datasets/CIFA10', train=True, transform=train_transform, download=True)
train_data = torch.utils.data.Subset(train_data, list(np.arange(int(len(train_data)*percentage))))
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True, num_workers=2, pin_memory=True, drop_last=True)

# Set up the model and optimizer config.
model = Model(feature_dim)
model.load_state_dict(torch.load(pretrained_path, map_location='cpu'), strict=False)
model = model.to(device)
flops, params = profile(model, inputs=(torch.randn(1, 3, 32, 32).to(device),))
flops, params = clever_format([flops, params])
optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-6)
c = len(memory_data.classes)

# Training loop.
results = {'train_loss': [], 'test_acc@1': [], 'test_acc@5': []}  # <<-- output
best_acc = 0.0
for epoch in range(1, epochs + 1):
    train_loss = train(model, train_loader, optimizer, epoch, epochs, batch_size=batch_size, temperature=temperature, device=device)  # ==> this is where I get the error

…

I got the following error:

TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'torchvision.transforms.transforms.Compose'>

This error occurs when the code loops over the data loader, as below:

…

total_loss, total_num, train_bar = 0.0, 0, tqdm(data_loader)
for data_pair in train_bar:  # <-- this is where the error occurs

……

My train_transform now looks like this:

…
def compute_train_transform(seed=123456):
    """
    This function returns a composition of data augmentations to a single training image.
    Complete the following lines. Hint: look at available functions in torchvision.transforms
    """
    random.seed(seed)
    torch.random.manual_seed(seed)
    color_jitter = transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)
    train_transform = transforms.Compose([
        transforms.RandomResizedCrop(32),
        transforms.RandomHorizontalFlip(p=0.5),
        transforms.RandomApply(torch.nn.ModuleList([color_jitter]), p=0.8),
        transforms.RandomApply(torch.nn.ModuleList([transforms.Grayscale(num_output_channels=3)]), p=0.2),
        transforms.ToTensor(),
        transforms.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010])])
    return train_transform
…

I have tried several things, such as including transforms.ToTensor() in the function; however, it does not work.
Note: I am not asking for the solution to the assignment. The problem is that the given code does not work in my environment (such as the PyTorch and torchvision versions, …), and it has taken me quite a lot of time to fix it.

ptrblck · June 25, 2022, 11:57pm

Your CIFAR10Pair dataset seems to return a torchvision.transforms.transforms.Compose instead of the transformed samples.
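
To illustrate what "return the transformed samples" means, here is a minimal sketch. `PairDataset` and `add_noise` are hypothetical stand-ins invented for this example; they are not the assignment's `CIFAR10Pair` or its transform. The key point is that `__getitem__` calls the transform on the image and returns the results, rather than returning the transform object itself.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class PairDataset(Dataset):
    """Hypothetical stand-in for CIFAR10Pair: yields two augmented views per image."""

    def __init__(self, images, targets, transform=None):
        self.images = images
        self.targets = targets
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        img, target = self.images[index], self.targets[index]
        if self.transform is not None:
            # Apply the transform to the image. Returning self.transform
            # itself would hand a transform object to default_collate.
            x_i = self.transform(img)
            x_j = self.transform(img)
        else:
            x_i, x_j = img, img
        return x_i, x_j, target


def add_noise(img):
    # Hypothetical random augmentation, used only for this sketch.
    return img + 0.1 * torch.randn_like(img)


images = torch.rand(8, 3, 32, 32)   # stand-in image tensors
targets = torch.arange(8)           # stand-in labels
loader = DataLoader(PairDataset(images, targets, transform=add_noise), batch_size=4)

# Each batch collates cleanly because every element is a tensor.
x_i, x_j, target = next(iter(loader))
print(x_i.shape, x_j.shape, target.shape)
```

If `__getitem__` returned `self.transform` in place of `x_i` or `x_j`, the default collate function would raise the `TypeError` quoted above.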

PS: you can post code snippets by wrapping them into three backticks ```, which makes debugging easier.

VDzungsignature · July 1, 2022, 4:08am

Thank you,
The above problem is solved. When I run training (with the same code as above), I get the following error:
PicklingError: Can't pickle <class 'cs231n.simclr.data_utils.CIFAR10Pair'>: it's not the same object as cs231n.simclr.data_utils.CIFAR10Pair
Could you help me solve this issue?
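
For context on this kind of message: pickle stores classes by module and name, and it raises this error when the name currently resolves to a different class object than the one the instance was built from. One common way to get there is re-executing or reloading the module (for example with notebook autoreload) after the dataset object was created, combined with DataLoader workers (num_workers > 0), which on Windows require the dataset to be pickled. Whether that is what happened here is an assumption. The sketch below reproduces the mechanism with the standard library only; the module name `demo_data_utils` and the class `PairDataset` are hypothetical.

```python
import pickle
import sys
import types

# Source of a tiny module containing one class (hypothetical example).
SOURCE = "class PairDataset:\n    pass\n"

# Register a throwaway module and define the class in it.
module = types.ModuleType("demo_data_utils")
sys.modules["demo_data_utils"] = module
exec(SOURCE, module.__dict__)

# Object created from the first version of the class.
dataset = module.PairDataset()

# Simulate a reload: the class name is rebound to a new class object.
exec(SOURCE, module.__dict__)

# Pickling the old object now fails, because its class is no longer
# the object found under demo_data_utils.PairDataset.
stale_error = None
try:
    pickle.dumps(dataset)
except pickle.PicklingError as err:
    stale_error = err
    print("pickling failed:", err)

# An object created after the reload pickles without problems.
fresh_bytes = pickle.dumps(module.PairDataset())
```

If this is the cause, recreating the dataset after the reload (for example by restarting the notebook kernel and rerunning all cells) or setting num_workers=0 so that the dataset is not pickled are two things worth trying.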