Deactivate .cuda() output

Due to an error about having tensors on both the CPU and on CUDA, I had to add .cuda() to my code (as suggested here), so that it now contains the following line:

loss_fct = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 100.0])).cuda()

This works, but it produces output that really clutters my training output and makes it unreadable. How can I deactivate this output?
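(As a side note: one way to avoid the .cuda() call on the loss module entirely may be to create the weight tensor directly on the device the logits live on. A minimal sketch, where weighted_loss is just a hypothetical helper and not part of my actual code:)

import torch
from torch import nn

def weighted_loss(logits, labels):
    # Build the class weights on the same device as the logits,
    # so no explicit .cuda() call on the loss module is needed.
    weight = torch.tensor([1.0, 100.0], device=logits.device)
    loss_fct = nn.CrossEntropyLoss(weight=weight)
    return loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))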

Could you describe the issue in more detail, please? I don’t fully understand which output is causing the issue.

The output looks like this:

0%| | 0/12 [00:00<?, ?it/s]
17%|█████████▋ | 2/12 [00:20<01:40, 10.02s/it]
25%|██████████████▌ | 3/12 [00:40<02:07, 14.20s/it]
33%|███████████████████▎ | 4/12 [01:00<02:11, 16.38s/it]
42%|████████████████████████▏ | 5/12 [01:20<02:03, 17.65s/it]
50%|█████████████████████████████ | 6/12 [01:40<01:50, 18.45s/it]
58%|█████████████████████████████████▊ | 7/12 [02:00<01:34, 18.96s/it]
67%|██████████████████████████████████████▋ | 8/12 [02:20<01:17, 19.30s/it]
75%|███████████████████████████████████████████▌ | 9/12 [02:40<00:58, 19.54s/it]
83%|███████████████████████████████████████████████▌ | 10/12 [03:00<00:39, 19.69s/it]
92%|████████████████████████████████████████████████████▎ | 11/12 [03:20<00:19, 19.80s/it]
100%|█████████████████████████████████████████████████████████| 12/12 [03:41<00:00, 18.46s/it]

I’m running my models in a screen session, and when I come back to it later, this output partly covers the output that I actually need to see. (I tried redirecting my output to a file instead, but that didn’t work either and is a bit awkward anyway.)

This seems to be a tqdm issue rather than a PyTorch one, or are you only seeing it when the loss function is moved to the GPU?
If so, could you post a minimal and executable code snippet showing this behavior?

I use my own dataset, but I guess this snippet also shows the behavior, although with this model and dataset combination it doesn’t produce as many lines:

from transformers import DefaultDataCollator, AutoTokenizer, \
    AutoModelForSequenceClassification, TrainingArguments, Trainer
from datasets import load_dataset
import torch
from torch import nn

num_train_epochs = 3
# use GPU instead of CPU
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
tokenizer = AutoTokenizer.from_pretrained("ishan/distilbert-base-uncased-mnli")  # PyTorch model
dataset = load_dataset("rotten_tomatoes")
encoded_dataset = dataset.map(lambda examples: tokenizer(examples['text'], padding='max_length', truncation=True, max_length=8))
train_dataset = encoded_dataset["train"]
test_dataset = encoded_dataset["test"]


class CustomTrainer(Trainer):
    def __init__(self, model, train_dataset, eval_dataset, args):
        super().__init__(model=model, args=args, train_dataset=train_dataset, eval_dataset=eval_dataset)
    def compute_loss(self, some_model, inputs, return_outputs=False):
        labels = inputs.get("labels")
        outputs = some_model(**inputs)
        logits = outputs.get("logits")
        loss_fct = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 10.0])).cuda()
        loss = loss_fct(logits.view(-1, self.model.config.num_labels), labels.view(-1))
        return (loss, outputs) if return_outputs else loss


model = AutoModelForSequenceClassification.from_pretrained("ishan/distilbert-base-uncased-mnli", num_labels=2,
                                                           ignore_mismatched_sizes=True)
model.to(device)
data_collator = DefaultDataCollator(return_tensors="pt")
training_args = TrainingArguments(
    output_dir="./output",
    num_train_epochs=num_train_epochs, 
    per_device_train_batch_size=16,  
    per_device_eval_batch_size=64,  
)
trainer = CustomTrainer(
    model=model,  
    args=training_args, 
    train_dataset=train_dataset,  
    eval_dataset=test_dataset,  
)
trainer.train()

I can’t say for sure if it only happens when moving the loss function to the GPU. I tried removing the lines I thought were responsible for moving everything to the GPU, but apparently I didn’t catch every one, because I got the error RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument weight in method wrapper_nll_loss_forward), and I already had problems with that when introducing the loss function.
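(For reference, assuming the interleaved lines really do come from the Trainer’s built-in tqdm progress bars rather than from the .cuda() call itself, the disable_tqdm flag of TrainingArguments may be one way to silence them; a minimal sketch based on the arguments above:)

training_args = TrainingArguments(
    output_dir="./output",
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    disable_tqdm=True,  # assumption: the Trainer's progress bars are what clutters the log
)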

Your code is unfortunately not a minimal code snippet, as it uses the transformers library, which is apparently using tqdm somewhere internally. If this issue is specific to HuggingFace, you might want to create an issue in their GitHub repository or in their discussion board. If not, and you can reproduce it in “plain PyTorch”, please post the code here and I’ll try to see what might be causing this output.
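(For what it’s worth, a minimal sketch of the kind of snippet that would isolate the progress bar from PyTorch and transformers entirely; note that tqdm writes to stderr by default, which may be why redirecting stdout to a file didn’t help:)

import time
from tqdm import tqdm

# A bare tqdm loop, independent of PyTorch and transformers. If this already
# produces the interleaved lines inside screen, the issue is purely tqdm.
for _ in tqdm(range(12)):
    time.sleep(0.5)

# tqdm can also be silenced explicitly:
#   tqdm(range(12), disable=True)   # always off
#   tqdm(range(12), disable=None)   # off automatically when the output is not a TTY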