ValueError: --optim adamw_torch_fused with --fp16 requires PyTorch>2.0

I installed torch version 2, but it still gives me the error:

ValueError: --optim adamw_torch_fused with --fp16 requires PyTorch>2.0

Can you check the version of PyTorch imported in your script: print(torch.__version__). Maybe you installed PyTorch 2.0 into your system/venv but are using the venv/system Python to run your script. Another thing to note is that the latest stable version of PyTorch at the moment is 2.0.0. So are you getting the error PyTorch>2.0 or PyTorch>=2.0?
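To rule out that kind of venv/system mismatch, a quick standard-library diagnostic is to print which interpreter is actually running your script and compare it to where you installed PyTorch:

```python
import sys

# The interpreter actually executing this script. If this path is not
# inside the venv where you installed PyTorch 2.0, you're running the
# wrong Python, and the Trainer will see the old torch.
print(sys.executable)

# The prefix of the (virtual) environment this interpreter belongs to.
print(sys.prefix)
```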

Yeah, it shows the PyTorch version 2.0.0+cu117.

But it's still not working.

Here is the code

import transformers
from datasets import load_dataset

# `data` (the loaded dataset), `tokenize`, `model`, and `tokenizer`
# are defined earlier in the script (not shown here).
datasets = data.map(
    tokenize, batched=True, remove_columns=data["train"].column_names
)

trainer = transformers.Trainer(
    model=model,
    train_dataset=datasets["train"],
    args=transformers.TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        num_train_epochs=3,
        torch_compile=True,
        optim="adamw_torch_fused",
        warmup_steps=100,
        max_steps=200,
        report_to="tensorboard",
        learning_rate=2e-4,
        fp16=True,
        logging_steps=1,
        output_dir="PB7B",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!
trainer.train()

For faster training I added these two parameters:

torch_compile=True,
optim="adamw_torch_fused",
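The error message suggests the Trainer gates adamw_torch_fused behind a PyTorch version check. As an illustration only (not the Trainer's actual code), a hypothetical fallback helper that picks the fused optimizer only when the running PyTorch is new enough could look like:

```python
def pick_optim(torch_version: str) -> str:
    """Return the optim name to pass to TrainingArguments, falling back
    to plain AdamW when the fused variant isn't supported (< 2.0).

    Handles version strings like "2.0.0+cu117" (local build tag after
    "+") and nightlies like "2.1.0.dev20230401" (".dev" suffix).
    """
    core = torch_version.split("+")[0].split(".dev")[0]
    major, minor = (int(p) for p in core.split(".")[:2])
    if (major, minor) >= (2, 0):
        return "adamw_torch_fused"
    return "adamw_torch"

print(pick_optim("2.0.0+cu117"))        # adamw_torch_fused
print(pick_optim("1.13.1"))             # adamw_torch
print(pick_optim("2.1.0.dev20230401"))  # adamw_torch_fused
```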


As it says here, you might need a nightly version of PyTorch, meaning the unstable master release, which would explain the PyTorch > 2.0 error. I did not get a chance to try this yet, but it is definitely worth checking. (Image is from the Trainer documentation.)

Is that error coming from the HF Trainer or from PyTorch? Regardless, you can try casting your data to fp16, but I'd double-check in the HF forums what that error is about.

It comes from the Hugging Face Trainer. It suggests that you need to upgrade your PyTorch version to greater than 2.0.

Related PR: [trainer] add `--optim adamw_torch_fused` for pt-2.0+ by stas00 · Pull Request #22144 · huggingface/transformers · GitHub

That was not helpful. I still have the issue and haven't been able to solve it yet.

Well, did you update to the nightly release?

Yeah,
I just updated to 2.0.0 using the official nightly…

The nightlies should start with 2.1.0 now; you can use the selector here to get the right version: https://pytorch.org/
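For reference, the selector on pytorch.org produces a pip command along these lines; the index URL here assumes a cu117 CUDA build to match the poster's 2.0.0+cu117, so adjust it for your own CUDA version:

```shell
# Install the current PyTorch nightly build (pre-release)
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu117

# Verify: a nightly should report something like 2.1.0.dev...
python -c "import torch; print(torch.__version__)"
```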


Thanks for answering :hugs: