RuntimeError: Expected tensor to have size 256 at dimension 1, but got size 3 for argument #2 'batch2' (while checking arguments for bmm)

When I execute the line mapping_mat = torch.matmul(frame, scaling_mat), there is a RuntimeError: Expected tensor to have size 256 at dimension 1, but got size 3 for argument #2 'batch2' (while checking arguments for bmm).
frame has shape torch.Size([1, 3, 256, 256])
scaling_mat has shape torch.Size([1, 3, 3])

scaling_mat should be of size (1, 256, 3) for matmul to work.

Check the PyTorch documentation for torch.matmul for the input dimension requirements.
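For reference, torch.matmul treats the last two dimensions of each argument as matrices and broadcasts the leading dimensions as batch dimensions, so the last dimension of the first argument must match the second-to-last dimension of the second. A minimal sketch with the shapes from this thread (random data, purely illustrative):

import torch

frame = torch.randn(1, 3, 256, 256)         # treated as a (1, 3) batch of (256, 256) matrices
scaling_mat = torch.eye(3).repeat(1, 1, 1)  # a batch of (3, 3) matrices

# Fails: the inner dimensions 256 and 3 do not match.
# torch.matmul(frame, scaling_mat)  # RuntimeError ... 'batch2' ... bmm

# A second argument of shape (1, 256, 3) multiplies cleanly:
out = torch.matmul(frame, torch.randn(1, 256, 3))
print(out.shape)  # torch.Size([1, 3, 256, 3])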

@Kushaj Thanks for your reply. frame is an input image, and templateimgw is obtained by warping the template with a homography.
I have written a program to compute the IoU between two images, but there are some errors in the code below.
ZOOM_OUT_SCALE = 4
batch_size = frame.shape[0]
fake_template = torch.ones([batch_size, 1, 74 * ZOOM_OUT_SCALE, 115 * ZOOM_OUT_SCALE], device=frame.device)
scaling_mat = torch.eye(3, device=frame.device).repeat(batch_size, 1, 1)
scaling_mat[:, 0, 0] = scaling_mat[:, 1, 1] = ZOOM_OUT_SCALE
target_mask = warp.warp_image(fake_template, scaling_mat, out_shape=fake_template.shape[-2:])
mapping_mat = torch.matmul(frame, scaling_mat)
mapping_mat = torch.matmul(templateimgw.inverse(), mapping_mat)
output_mask = warp.warp_image(fake_template, mapping_mat, out_shape=fake_template.shape[-2:])
output_mask = (output_mask >= 0.5).float()
target_mask = (target_mask >= 0.5).float()
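The snippet above stops at the two thresholded masks. For the IoU step itself, a minimal sketch that is not part of the original code, assuming output_mask and target_mask are binary masks of the same shape (B, 1, H, W):

import torch

def mask_iou(output_mask, target_mask, eps=1e-8):
    # Per-sample IoU of two binary masks of shape (B, 1, H, W).
    intersection = (output_mask * target_mask).flatten(1).sum(dim=1)
    union = torch.clamp(output_mask + target_mask, max=1).flatten(1).sum(dim=1)
    return intersection / (union + eps)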

What is the error? Show the complete log for easy debugging.

I want to compute the IoU between two images, "frame" and "templateimgw".
The code is as follows.

ZOOM_OUT_SCALE = 4
batch_size = frame.shape[0]
fake_template = torch.ones([batch_size, 1, 74 * ZOOM_OUT_SCALE, 115 * ZOOM_OUT_SCALE], device=frame.device)
scaling_mat = torch.eye(3, device=frame.device).repeat(batch_size, 1, 1)
scaling_mat[:, 0, 0] = scaling_mat[:, 1, 1] = ZOOM_OUT_SCALE
target_mask = warp.warp_image(fake_template, scaling_mat, out_shape=fake_template.shape[-2:])
mapping_mat = torch.matmul(frame, scaling_mat)
mapping_mat = torch.matmul(templateimgw.inverse(), mapping_mat)
output_mask = warp.warp_image(fake_template, mapping_mat, out_shape=fake_template.shape[-2:])
output_mask = (output_mask >= 0.5).float()
target_mask = (target_mask >= 0.5).float()

The error is as follows.

Traceback (most recent call last):
File "ioucompute.py", line 199, in
mapping_mat = torch.matmul(frame, scaling_mat)
RuntimeError: Expected tensor to have size 256 at dimension 1, but got size 3 for argument #2 'batch2' (while checking arguments for bmm)

I haven't checked the logic of your code, but for your code to work, you have to change this line

scaling_mat = torch.eye(3, device=frame.device).repeat(batch_size, 1, 1)

to

scaling_mat = torch.eye(256, device=frame.device).repeat(batch_size, 1, 1)

I have changed this line

scaling_mat = torch.eye(3, device=frame.device).repeat(batch_size, 1, 1)

to

scaling_mat = torch.eye(256, device=frame.device).repeat(batch_size, 1, 1)

but there is an error as follows.

Traceback (most recent call last):
File "ioucompute.py", line 197, in
target_mask = warp.warp_image(fake_template, scaling_mat, out_shape=fake_template.shape[-2:])
File "warp_image.py", line 46, in warp_image
xy_warped = torch.matmul(H, xy) # H.bmm(xy)
RuntimeError: Expected tensor to have size 256 at dimension 1, but got size 3 for argument #2 'batch2' (while checking arguments for bmm)
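This is the same shape rule failing one level deeper: warp_image (listed below) builds homogeneous grid coordinates xy of shape (B, 3, N), so H must be (B, 3, 3); a (B, 256, 256) matrix cannot multiply it. A minimal shape check with illustrative sizes:

import torch

B, N = 1, 296 * 460                      # illustrative batch size and number of grid points
xy = torch.rand(B, 3, N)                 # homogeneous (x, y, 1) coordinates, as in warp_image

H_ok = torch.eye(3).repeat(B, 1, 1)      # (B, 3, 3): shapes line up
print(torch.matmul(H_ok, xy).shape)      # torch.Size([1, 3, 136160])

H_bad = torch.eye(256).repeat(B, 1, 1)   # (B, 256, 256): inner dimensions 256 vs 3 mismatch
# torch.matmul(H_bad, xy)                # raises the same bmm RuntimeError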

The "warp_image" function is as follows.
def warp_image(img, H, out_shape=None, input_grid=None):
    if out_shape is None:
        out_shape = img.shape[-2:]
    if len(img.shape) < 4:
        img = img[None]
    if len(H.shape) < 3:
        H = H[None]
    assert img.shape[0] == H.shape[0], 'batch size of images do not match the batch size of homographies'
    batchsize = img.shape[0]
    # create grid for interpolation (in frame coordinates)
    if input_grid is None:
        y, x = torch.meshgrid([
            torch.linspace(-utils.BASE_RANGE, utils.BASE_RANGE,
                           steps=out_shape[-2]),
            torch.linspace(-utils.BASE_RANGE, utils.BASE_RANGE,
                           steps=out_shape[-1])
        ])
        x = x.to(img.device)
        y = y.to(img.device)
    else:
        x, y = input_grid
    x, y = x.flatten(), y.flatten()

    # append ones for homogeneous coordinates
    xy = torch.stack([x, y, torch.ones_like(x)])
    xy = xy.repeat([batchsize, 1, 1])  # shape: (B, 3, N)
    # warp points to model coordinates
    xy_warped = torch.matmul(H, xy)  # H.bmm(xy)
    xy_warped, z_warped = xy_warped.split(2, dim=1)

    # we multiply by 2, since our homographies map to
    # coordinates in the range [-0.5, 0.5]
    xy_warped = 2.0 * xy_warped / (z_warped + 1e-8)
    x_warped, y_warped = torch.unbind(xy_warped, dim=1)
    # build grid
    grid = torch.stack([
        x_warped.view(batchsize, *out_shape[-2:]),
        y_warped.view(batchsize, *out_shape[-2:])
    ], dim=-1)

    warped_img = torch.nn.functional.grid_sample(
        img, grid, mode='bilinear', padding_mode='zeros')
    batchsize0 = img.shape[0]
    batchsize1 = img.shape[1]
    batchsize2 = img.shape[2]
    if utils.hasnan(warped_img):
        print('nan value in warped image! set to zeros')
        warped_img[utils.isnan(warped_img)] = 0
    return warped_img
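So the H that warp_image expects is a batch of 3x3 homographies, not an image-sized matrix. A minimal call sketch (assuming warp_image and the utils module it relies on are importable from the project; sizes are illustrative):

import torch
# from warp_image import warp_image   # assumed project import

img = torch.ones(1, 1, 74 * 4, 115 * 4)  # (B, C, H, W) input, like fake_template
H = torch.eye(3).repeat(1, 1, 1)          # (B, 3, 3) homography, never (B, 256, 256)

warped = warp_image(img, H, out_shape=img.shape[-2:])
print(warped.shape)  # torch.Size([1, 1, 296, 460])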

You have to check the logic of your code again. The warp function takes scaling_mat as input; scaling_mat was earlier of shape (3, 3) but is now (256, 256).

scaling_mat should be the homography matrix, with shape [B, 3, 3] or [3, 3]; a matrix of shape [256, 256] is not a valid transformation matrix.
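In other words, the two matmul lines that build mapping_mat should compose 3x3 homographies, not the image tensors themselves. A minimal sketch of that composition (not the original code): H_frame and H_template are hypothetical stand-ins for the (B, 3, 3) homographies associated with frame and with the warp that produced templateimgw.

import torch

# Hypothetical stand-ins; in practice these come from the pipeline.
batch_size = 1
H_frame = torch.eye(3).repeat(batch_size, 1, 1)      # homography tied to `frame`
H_template = torch.eye(3).repeat(batch_size, 1, 1)   # homography that produced `templateimgw`

ZOOM_OUT_SCALE = 4
scaling_mat = torch.eye(3).repeat(batch_size, 1, 1)
scaling_mat[:, 0, 0] = scaling_mat[:, 1, 1] = ZOOM_OUT_SCALE

# Compose (B, 3, 3) matrices, mirroring the two matmul lines from the thread.
mapping_mat = torch.matmul(H_frame, scaling_mat)               # (B, 3, 3)
mapping_mat = torch.matmul(H_template.inverse(), mapping_mat)  # (B, 3, 3)

fake_template = torch.ones(batch_size, 1, 74 * ZOOM_OUT_SCALE, 115 * ZOOM_OUT_SCALE)

# warp.warp_image is the project function listed above (assumed importable).
target_mask = warp.warp_image(fake_template, scaling_mat, out_shape=fake_template.shape[-2:])
output_mask = warp.warp_image(fake_template, mapping_mat, out_shape=fake_template.shape[-2:])

output_mask = (output_mask >= 0.5).float()
target_mask = (target_mask >= 0.5).float()
# The IoU can then be taken from the two masks, e.g. with the mask_iou sketch above.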