AttributeError: 'TextDataset' object has no attribute 'to'

Chukwujike_Ezemandu · June 30, 2023, 9:04am

I tried to move my inputs and model to the device during training, as instructed. However, I got the following error:

Cell In[27], line 27, in train_chatbot(directory, model_output_path)
     11 training_args = TrainingArguments(
     12     output_dir=model_output_path,
     13     overwrite_output_dir=True,
   (...)
     19     logging_dir='./logs',
     20 )
     22 # Train the model
     23 trainer = Trainer(
     24     model=model.to(device),
     25     args=training_args,
     26     data_collator=data_collator,
---> 27     train_dataset=train_dataset.to(device),
...
     29 )
     31 trainer.train()
     32 trainer.save_model(model_output_path)

AttributeError: 'TextDataset' object has no attribute 'to'

Here is the code I used:

def train_chatbot(directory, model_output_path):
    device = torch.device("cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu")
    tokenizer = GPT2Tokenizer.from_pretrained("./pretrained")
    model = GPT2LMHeadModel.from_pretrained("./pretrained")

    train_dataset = TextDataset(tokenizer=tokenizer, file_path= (directory + "train.txt"), block_size=128)
    val_dataset = TextDataset(tokenizer=tokenizer, file_path=(directory + "test.txt"), block_size=128)
    data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

    # Set up the training arguments
    training_args = TrainingArguments(
        output_dir=model_output_path,
        overwrite_output_dir=True,
        per_device_train_batch_size=4,
        per_device_eval_batch_size=4,
        num_train_epochs=100,
        save_steps=10_000,
        save_total_limit=2,
        logging_dir='./logs',
    )

    # Train the model
    trainer = Trainer(
        model=model.to(device),
        args=training_args,
        data_collator=data_collator,
        train_dataset=train_dataset.to(device),
        eval_dataset=val_dataset.to(device),
    )

    trainer.train()
    trainer.save_model(model_output_path)
    
    # Save the tokenizer
    tokenizer.save_pretrained(model_output_path)

ptrblck · June 30, 2023, 4:10pm

The .to() method is defined for tensors and modules, but not datasets. Are you using a custom Dataset where this method should be available but isn’t?
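To illustrate the point above, here is a minimal sketch in plain PyTorch. The toy model, tensor shapes, and variable names are made up for illustration and are not from the original post. It shows that a module has .to(), that a dataset object does not, and that the usual pattern is to move each batch to the device inside the loop. When using the Hugging Face Trainer, this batch placement is handled for you, so the datasets can be passed in as they are.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Modules define .to(), so the model can be moved directly.
model = nn.Linear(4, 2).to(device)

# A dataset is only a container of samples; it has no .to() method.
dataset = TensorDataset(torch.randn(8, 4), torch.randint(0, 2, (8,)))
has_to = hasattr(dataset, "to")

# Instead, move each batch (which is made of tensors) to the device.
loader = DataLoader(dataset, batch_size=4)
for inputs, targets in loader:
    inputs, targets = inputs.to(device), targets.to(device)
    logits = model(inputs)
```

Calling dataset.to(device) here would raise the same kind of AttributeError as in the traceback above.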

Chukwujike_Ezemandu · June 30, 2023, 9:13pm

I fixed the issue. Simply set use_mps_device in the training arguments to True.
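For readers hitting the same error, a rough sketch of what the fix might look like is below. The helper name build_trainer is hypothetical, and the batch sizes are copied from the code above. The assumption is that the installed transformers version still accepts use_mps_device; newer releases may pick MPS automatically and deprecate the flag, so check your version. Note that neither the model nor the datasets are moved with .to(device), since the Trainer places the model and batches on the device chosen by the training arguments.

```python
import torch
from transformers import Trainer, TrainingArguments


def build_trainer(model, train_dataset, val_dataset, data_collator, output_dir):
    """Build a Trainer without calling .to(device) on the model or datasets."""
    extra = {}
    # Only request MPS when the backend is actually available.
    # Assumption: this transformers version still accepts use_mps_device.
    if torch.backends.mps.is_available():
        extra["use_mps_device"] = True

    training_args = TrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=4,
        per_device_eval_batch_size=4,
        **extra,
    )
    # Datasets are passed as-is; the Trainer moves each batch to the device.
    return Trainer(
        model=model,
        args=training_args,
        data_collator=data_collator,
        train_dataset=train_dataset,
        eval_dataset=val_dataset,
    )
```

In train_chatbot this would replace the Trainer(...) call, for example trainer = build_trainer(model, train_dataset, val_dataset, data_collator, model_output_path).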

Chukwujike_Ezemandu · June 30, 2023, 9:14pm

I do have another question, though it is specific to Macs: is there a way to train on the CPU only, rather than on MPS?
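One option that may be worth trying, offered as a sketch rather than a confirmed answer: TrainingArguments has had a no_cuda flag for a long time, and more recent transformers releases add use_cpu as its replacement. Which one applies depends on the installed version, so the hypothetical helper below (cpu_only_kwargs is not a library function) inspects the dataclass fields and picks whichever is present.

```python
import dataclasses

from transformers import TrainingArguments


def cpu_only_kwargs():
    """Return the keyword argument that requests CPU-only training.

    Assumption: newer transformers versions expose `use_cpu`, while older
    ones only have `no_cuda`. Verify the behaviour on your own version.
    """
    field_names = {f.name for f in dataclasses.fields(TrainingArguments)}
    if "use_cpu" in field_names:
        return {"use_cpu": True}
    return {"no_cuda": True}
```

The result can be unpacked into the arguments from the code above, for example TrainingArguments(output_dir=model_output_path, **cpu_only_kwargs()). After building the arguments, printing training_args.device is a quick way to confirm which device was actually selected.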