Understanding the error message "view size is not compatible with input tensor's size and stride"

Ideally, I would like to execute something like
t = torch.zeros([4, 3, 64, 64])
t[:, :, ::8, ::8].view(4, -1)
but that produces the error

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Unfortunately, I can’t use .reshape() or .contiguous() because of memory consumption. This code is called too often to make a copy of the tensor each time. Instead I would like to create one big tensor and slice it each time.

Is there some way to use .transpose() or something similar in combination with the above .view() to achieve my goal? Is there a way to get a more detailed error message to understand which dimension exactly is the problem?
Thanks a lot in advance!

The problem is that element spacing is irregular when you merge dimensions:

t[:,:,::8,::8].stride()
(12288, 4096, 512, 8)

So it is impossible to collapse the last three dimensions into one.
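A minimal sketch of what the sizes and strides look like here (assuming the same t = torch.zeros([4, 3, 64, 64]) from the first post); note that .reshape() succeeds only because it silently copies:

import torch

t = torch.zeros([4, 3, 64, 64])
s = t[:, :, ::8, ::8]                   # strided slice, no copy yet

print(s.size())                         # torch.Size([4, 3, 8, 8])
print(s.stride())                       # (12288, 4096, 512, 8)
print(s.is_contiguous())                # False

r = s.reshape(4, -1)                    # works, but allocates new storage
print(r.data_ptr() == t.data_ptr())     # False -> the data was copied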

Thanks for your reply, could you elaborate a little? Can’t the result of
t[:,:,::8,::8].view(4, -1)
be a tensor with the same underlying data in storage as t, size = (4, 3*8*8), and stride = (12288, 8)? Both 4096 and 512 are divisible by 8, so wouldn’t that be the solution view() goes for?

Yes, sorry, the output of stride() should be reasoned about together with the tensor’s size. Here, your last dimension says: increase the pointer by 8 (last stride) 8 times (last dim size), but the third dimension wants to take steps of 512 elements (it skips over 7 whole rows each time).
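To make that jump concrete, a small illustration (using torch.arange so each value equals its flat offset in storage):

import torch

t = torch.arange(4 * 3 * 64 * 64).reshape(4, 3, 64, 64)
s = t[:, :, ::8, ::8]

# within one kept row the pointer advances in steps of 8 ...
print(s[0, 0, 0, :3].tolist())   # [0, 8, 16]
# ... but moving to the next kept row jumps to offset 512,
# skipping 7 full rows of 64 elements in between
print(s[0, 0, 1, 0].item())      # 512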

A stride of 8 would correspond to t[:,:,:,::8], where you have stride[i] = stride[i+1] * size[i+1].
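As an illustration of that relation (again assuming the t from the first post), slicing only the last dimension keeps stride[i] = stride[i+1] * size[i+1] satisfied for the merged dimensions, so .view() works without a copy:

import torch

t = torch.zeros([4, 3, 64, 64])
u = t[:, :, :, ::8]                  # shape (4, 3, 64, 8)

print(u.stride())                    # (12288, 4096, 64, 8)
# stride[2] == stride[3] * size[3]:  64 == 8 * 8
# stride[1] == stride[2] * size[2]:  4096 == 64 * 64
v = u.view(4, -1)                    # dims 1..3 collapse cleanly
print(v.shape, v.stride())           # torch.Size([4, 1536]) (12288, 8)
print(v.data_ptr() == t.data_ptr())  # True -> still a view of the same storage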

Hi @googlebot @ToddChavezz,
I am facing a similar issue:
def train(epoch):
    tr_loss, tr_accuracy = 0, 0
    nb_tr_examples, nb_tr_steps = 0, 0
    tr_preds, tr_labels = [], []
    # put model in training mode
    model.train()

    for idx, batch in enumerate(training_loader):

        ids = batch['ids']
        ids = ids.transpose(0, 1)
        ids = ids.to(device)
        mask = batch['mask']
        mask = mask.transpose(0, 1)
        mask = mask.to(device)
        targets = batch['targets']
        targets = targets.transpose(0, 1)
        targets = targets.to(device)

        optimizer.zero_grad()
        print(ids.shape)
        print(mask.shape)
        outputs = model(input_ids=ids, attention_mask=mask, labels=targets)
        loss, tr_logits = outputs.loss, outputs.logits
        tr_loss += loss.item()

        nb_tr_steps += 1
        nb_tr_examples += targets.size(0)

        if idx % 100 == 0:
            loss_step = tr_loss / nb_tr_steps
            print(f"Training loss per 100 training steps: {loss_step}")

        # compute training accuracy
        flattened_targets = targets.view(-1)  # shape (batch_size * seq_len,)
        active_logits = tr_logits.view(-1, model.num_labels)  # shape (batch_size * seq_len, num_labels)
        flattened_predictions = torch.argmax(active_logits, axis=1)  # shape (batch_size * seq_len,)
        # now, use mask to determine where we should compare predictions with targets (includes [CLS] and [SEP] token predictions)
        active_accuracy = mask.view(-1) == 1  # active accuracy is also of shape (batch_size * seq_len,)
        targets = torch.masked_select(flattened_targets, active_accuracy)
        predictions = torch.masked_select(flattened_predictions, active_accuracy)

        tr_preds.extend(predictions)
        tr_labels.extend(targets)

        tmp_tr_accuracy = accuracy_score(targets.cpu().numpy(), predictions.cpu().numpy())
        tr_accuracy += tmp_tr_accuracy

        # gradient clipping
        torch.nn.utils.clip_grad_norm_(
            parameters=model.parameters(), max_norm=MAX_GRAD_NORM
        )

        # backward pass
        loss.backward()
        optimizer.step()

    epoch_loss = tr_loss / nb_tr_steps
    tr_accuracy = tr_accuracy / nb_tr_steps
    print(f"Training loss epoch: {epoch_loss}")
    print(f"Training accuracy epoch: {tr_accuracy}")

Training epoch: 1
torch.Size([512, 4])
torch.Size([512, 4])

RuntimeError                              Traceback (most recent call last)
in <cell line: 1>()
      1 for epoch in range(EPOCHS):
      2     print(f"Training epoch: {epoch + 1}")
----> 3     train(epoch)

3 frames
/usr/local/lib/python3.10/dist-packages/transformers/models/roberta/modeling_roberta.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, labels, output_attentions, output_hidden_states, return_dict)
   1230             elif self.config.problem_type == "single_label_classification":
   1231                 loss_fct = CrossEntropyLoss()
-> 1232                 loss = loss_fct(logits.contiguous().view(-1, self.num_labels), labels.contiguous().view(-1))
   1233             elif self.config.problem_type == "multi_label_classification":
   1234                 loss_fct = BCEWithLogitsLoss()

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Could you provide a minimal example? Which command exactly throws the error? What do you want to reshape to? What are the relevant tensors’ sizes?
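In the meantime, a hedged guess at the general pattern (not specific to your model): if a transposed, and therefore non-contiguous, tensor reaches a plain .view(), you get exactly this error, and .reshape() or .contiguous().view() avoids it at the cost of a copy:

import torch

labels = torch.zeros(4, 512, dtype=torch.long)   # (batch_size, seq_len)
labels_t = labels.transpose(0, 1)                # (512, 4), non-contiguous

print(labels_t.is_contiguous())                  # False
try:
    labels_t.view(-1)                            # raises the same RuntimeError
except RuntimeError as e:
    print(e)

flat = labels_t.reshape(-1)                      # ok, may copy
flat = labels_t.contiguous().view(-1)            # ok, copies explicitly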