ValueError: --optim adamw_torch_fused with --fp16 requires PyTorch>2.0

I installed torch version 2, but it still gives me the error:

ValueError: --optim adamw_torch_fused with --fp16 requires PyTorch>2.0

Can you check the version of PyTorch imported in your script: print(torch.__version__). Maybe you installed PyTorch 2.0 into your system/venv but are using the venv/system Python to run your script. Another thing to note is that the latest stable version of PyTorch at the moment is 2.0.0. So are you getting the error PyTorch>2.0 or PyTorch>=2.0?
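To rule out that kind of venv/system mismatch, a quick standard-library diagnostic is to print which interpreter is actually running your script and compare it to where you installed PyTorch:

```python
import sys

# The interpreter actually executing this script. If this path is not
# inside the venv where you installed PyTorch 2.0, you're running the
# wrong Python, and the Trainer will see the old torch.
print(sys.executable)

# The prefix of the (virtual) environment this interpreter belongs to.
print(sys.prefix)
```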

Yeah, it shows the PyTorch version 2.0.0+cu117.

But it's still not working.

Here is the code

import transformers
from datasets import load_dataset

# `data` (the loaded dataset), `tokenize`, `model`, and `tokenizer`
# are defined earlier in the script (not shown here).
datasets = data.map(
    tokenize, batched=True, remove_columns=data["train"].column_names
)

trainer = transformers.Trainer(
    model=model,
    train_dataset=datasets["train"],
    args=transformers.TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        num_train_epochs=3,
        torch_compile=True,
        optim="adamw_torch_fused",
        warmup_steps=100,
        max_steps=200,
        report_to="tensorboard",
        learning_rate=2e-4,
        fp16=True,
        logging_steps=1,
        output_dir="PB7B",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!
trainer.train()

For faster training I added these two parameters:

torch_compile=True,
optim="adamw_torch_fused",
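The error message suggests the Trainer gates adamw_torch_fused behind a PyTorch version check. As an illustration only (not the Trainer's actual code), a hypothetical fallback helper that picks the fused optimizer only when the running PyTorch is new enough could look like:

```python
def pick_optim(torch_version: str) -> str:
    """Return the optim name to pass to TrainingArguments, falling back
    to plain AdamW when the fused variant isn't supported (< 2.0).

    Handles version strings like "2.0.0+cu117" (local build tag after
    "+") and nightlies like "2.1.0.dev20230401" (".dev" suffix).
    """
    core = torch_version.split("+")[0].split(".dev")[0]
    major, minor = (int(p) for p in core.split(".")[:2])
    if (major, minor) >= (2, 0):
        return "adamw_torch_fused"
    return "adamw_torch"

print(pick_optim("2.0.0+cu117"))        # adamw_torch_fused
print(pick_optim("1.13.1"))             # adamw_torch
print(pick_optim("2.1.0.dev20230401"))  # adamw_torch_fused
```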


As it says here, you might need a nightly version of PyTorch, meaning the unstable master release, which would explain the PyTorch > 2.0 error. I did not get a chance to try this yet, but it is definitely worth checking. (Image is from the Trainer documentation.)

Is that error coming from the HF Trainer or from PyTorch? Regardless, you can try casting your data to fp16, but I'd double-check in the HF forums what that error is about.

It comes from the Hugging Face Trainer. It suggests that you need to upgrade your PyTorch version to greater than 2.0.

Related PR: [trainer] add `--optim adamw_torch_fused` for pt-2.0+ by stas00 · Pull Request #22144 · huggingface/transformers · GitHub

That was not helpful. I still have the issue and haven't been able to solve it yet.

Well, did you update to the nightly release?

Yeah,
I just updated to 2.0.0 using the official nightly…

The nightlies should start with 2.1.0 now; you can use the selector here to get the right version: https://pytorch.org/
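For reference, the selector on pytorch.org produces a pip command along these lines; the index URL here assumes a cu117 CUDA build to match the poster's 2.0.0+cu117, so adjust it for your own CUDA version:

```shell
# Install the current PyTorch nightly build (pre-release)
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu117

# Verify: a nightly should report something like 2.1.0.dev...
python -c "import torch; print(torch.__version__)"
```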


Thanks for answering :hugs: