Due to an error about having tensors on both the CPU and on CUDA, I had to add .cuda() to my code (as suggested here), so that it now contains the following line:
loss_fct = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 10.0])).cuda()
I’m running my models in a screen session, and when I later come back to it, this output partially obscures the output that I actually need to see. (I tried redirecting my output to a file instead, but that didn’t work either and is a bit awkward in any case.)
This seems to be a tqdm issue rather than a PyTorch one, or are you only seeing it when the loss function is moved to the GPU?
If so, could you post a minimal and executable code snippet showing this behavior?
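For example, something as bare-bones as the following would already help narrow it down. This is just a sketch using a dummy model and random data (it assumes a CUDA device, since your setup uses one, and none of these names come from your code):

import torch
from torch import nn
from tqdm import tqdm

# dummy model and a class-weighted loss moved to the GPU, to mirror your setup
model = nn.Linear(10, 2).cuda()
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 10.0])).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(3):
    # tqdm wraps the inner loop; this is the progress output in question
    for _ in tqdm(range(100), desc=f"epoch {epoch}"):
        x = torch.randn(16, 10, device="cuda")
        y = torch.randint(0, 2, (16,), device="cuda")
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

If a snippet like that already garbles the output in your screen session, it points to tqdm (or the terminal); if not, the extra output is probably coming from somewhere else.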
I use my own dataset, but I guess this also shows the behavior, although with this model and dataset combination it doesn’t output as many lines:
from transformers import DefaultDataCollator, AutoTokenizer, \
    AutoModelForSequenceClassification, TrainingArguments, Trainer
from datasets import load_dataset
import torch
from torch import nn

num_train_epochs = 3

# use GPU instead of CPU
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

tokenizer = AutoTokenizer.from_pretrained("ishan/distilbert-base-uncased-mnli")  # PyTorch model
dataset = load_dataset("rotten_tomatoes")
encoded_dataset = dataset.map(lambda examples: tokenizer(examples['text'], padding='max_length', truncation=True, max_length=8))
train_dataset = encoded_dataset["train"]
test_dataset = encoded_dataset["test"]


class CustomTrainer(Trainer):
    def __init__(self, model, train_dataset, eval_dataset, args):
        super().__init__(model=model, args=args, train_dataset=train_dataset, eval_dataset=eval_dataset)

    def compute_loss(self, some_model, inputs, return_outputs=False):
        labels = inputs.get("labels")
        outputs = some_model(**inputs)
        logits = outputs.get("logits")
        loss_fct = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 10.0])).cuda()
        loss = loss_fct(logits.view(-1, self.model.config.num_labels), labels.view(-1))
        return (loss, outputs) if return_outputs else loss


model = AutoModelForSequenceClassification.from_pretrained("ishan/distilbert-base-uncased-mnli", num_labels=2,
                                                           ignore_mismatched_sizes=True)
model.to(device)

data_collator = DefaultDataCollator(return_tensors="pt")

training_args = TrainingArguments(
    output_dir="./output",
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
)

trainer = CustomTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
)

trainer.train()
I can’t say for sure if it only happens when moving the loss function to the GPU. I tried removing the lines I thought were responsible for moving everything to the GPU, but apparently I didn’t catch every line, as I got the error RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument weight in method wrapper_nll_loss_forward), and I already had problems with that when introducing the loss function.
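From what I understand, the weight tensor just has to end up on the same device as the logits, so a variant of my compute_loss without the hard-coded .cuda() would look roughly like this (just a sketch of the CustomTrainer method above, not what I’m actually running):

def compute_loss(self, some_model, inputs, return_outputs=False):
    labels = inputs.get("labels")
    outputs = some_model(**inputs)
    logits = outputs.get("logits")
    # create the class weights directly on the logits' device instead of calling .cuda(),
    # so the same code would also run on CPU
    loss_fct = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 10.0], device=logits.device))
    loss = loss_fct(logits.view(-1, self.model.config.num_labels), labels.view(-1))
    return (loss, outputs) if return_outputs else loss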
Your code is unfortunately not a minimal code snippet, as it uses the transformers library, which apparently uses tqdm somewhere internally. If this issue is specific to HuggingFace, you might want to create an issue in their GitHub repository or in their discussion board. If not, and you can reproduce it in “plain PyTorch”, please post the code here and I’ll try to see what might be causing this output.
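In the meantime, a possible workaround (just an idea, not a fix) might be to silence the progress bars entirely. If I remember correctly, TrainingArguments exposes a disable_tqdm flag and the datasets library has its own switch, roughly like this (please double-check the exact names against the versions you have installed):

import datasets
from transformers import TrainingArguments

# turn off the progress bars from dataset.map etc.
datasets.disable_progress_bar()

training_args = TrainingArguments(
    output_dir="./output",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    disable_tqdm=True,  # turn off the Trainer's own progress bars
)

That would at least keep your screen session readable while you track down where the extra output is coming from.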