Getting a "UnpicklingError: invalid load key, '\xff'." error

I’m getting a weird error trying to feed data into my first PyTorch conv net. Here is the error traceback:


UnpicklingError                           Traceback (most recent call last)
<ipython-input> in <module>
      6
      7 for epoch in range(max_epochs):
----> 8 for i in (partition['train'], labels) in enumerate(training_generator):
      9     # Run the forward pass
     10     outputs = model(images)

C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py in __next__(self)
    344     def __next__(self):
    345         index = self._next_index()  # may raise StopIteration
--> 346         data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    347         if self._pin_memory:
    348             data = _utils.pin_memory.pin_memory(data)

C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py in fetch(self, possibly_batched_index)
     42     def fetch(self, possibly_batched_index):
     43         if self.auto_collation:
---> 44             data = [self.dataset[idx] for idx in possibly_batched_index]
     45         else:
     46             data = self.dataset[possibly_batched_index]

C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py in <listcomp>(.0)
     42     def fetch(self, possibly_batched_index):
     43         if self.auto_collation:
---> 44             data = [self.dataset[idx] for idx in possibly_batched_index]
     45         else:
     46             data = self.dataset[possibly_batched_index]

<ipython-input> in __getitem__(self, index)
     24
     25         # Load data and get label
---> 26         X = torch.load(root_dir + ID)
     27         y = self.labels[ID]
     28

C:\ProgramData\Anaconda3\lib\site-packages\torch\serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    384         f = f.open('rb')
    385     try:
--> 386         return _load(f, map_location, pickle_module, **pickle_load_args)
    387     finally:
    388         if new_fd:

C:\ProgramData\Anaconda3\lib\site-packages\torch\serialization.py in _load(f, map_location, pickle_module, **pickle_load_args)
    561         f.seek(0)
    562
--> 563     magic_number = pickle_module.load(f, **pickle_load_args)
    564     if magic_number != MAGIC_NUMBER:
    565         raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, '\xff'.

My image file names and data reside in two dictionaries. The first, called partition, has a 'train' and a 'test' key, each holding a list of image filenames; the second, labels, maps each filename to its label (that's what y = self.labels[ID] looks up below).
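For example, they’re shaped roughly like this (made-up filenames, not my real data):

```python
# Hypothetical sketch of the two dictionaries; the real filenames differ.
partition = {
    'train': ['img001.jpg', 'img002.jpg'],
    'test': ['img003.jpg'],
}
# labels maps each filename to its class index.
labels = {'img001.jpg': 0, 'img002.jpg': 1, 'img003.jpg': 0}

print(len(partition['train']), labels['img001.jpg'])
```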

Here are the dataset class, transform, and data loaders:

root_dir = 'D:\\CIS inspection images 0318\\train\\roof\\'

class roof_dataset(Dataset):
    'Characterizes a dataset for PyTorch'

    def __init__(self, list_IDs, labels, transform):
        'Initialization'
        self.labels = labels
        self.list_IDs = list_IDs
        self.transform = transform

    def __len__(self):
        'Denotes the total number of samples'
        return len(self.list_IDs)

    def __getitem__(self, index):
        'Generates one sample of data'
        # Select sample
        ID = self.list_IDs[index]

        # Load data and get label
        X = torch.load(root_dir + ID)
        y = self.labels[ID]

        if self.transform:
            X = self.transform(X)

        return X, y

CUDA for PyTorch

use_cuda = torch.cuda.is_available()
device = torch.device("cuda:0" if use_cuda else "cpu")

Parameters

params = {'batch_size': 1000,
          'shuffle': True,
          'num_workers': 0}
max_epochs = 100

Generators

training_set = roof_dataset(partition['train'], labels, transform=train_transforms)
training_generator = data.DataLoader(training_set, **params)

test_set = roof_dataset(partition['test'], labels, transform=train_transforms)
test_generator = data.DataLoader(test_set, **params)

Finally, here is where I get the error:

total_step = len(training_generator)
loss_list = []
acc_list = []

for epoch in range(max_epochs):
    for i in (partition['train'], labels) in enumerate(training_generator):
        # Run the forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss_list.append(loss.item())

        # Backprop and perform Adam optimisation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Track the accuracy
        total = labels.size(0)
        _, predicted = torch.max(outputs.data, 1)
        correct = (predicted == labels).sum().item()
        acc_list.append(correct / total)

        if (i + 1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}, Accuracy: {:.2f}%'
                  .format(epoch + 1, max_epochs, i + 1, total_step, loss.item(),
                          (correct / total) * 100))

I think it’s in the data class but I’m not sure. Does anyone see what I’m doing wrong?

This is a data loading issue. pickle is the Python de/serialization module that torch.load (and hence your DataLoader) uses under the hood. '\xff' is hexadecimal FF, and notably FF D8 are the first two bytes of every JPEG file. A missing file would raise FileNotFoundError instead, so most probably the file exists but starts with bytes that are not a valid pickle stream, i.e. it is not something torch.save produced.
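You can reproduce the exact error message with the standard-library pickle module, which is what torch.load delegates to (the fake bytes here are a JPEG header, purely for illustration):

```python
import io
import pickle

# \xff\xd8 is the JPEG start-of-image marker; pickle has no opcode 0xFF,
# so pickle.load fails with the same message as the traceback above.
fake_jpeg = io.BytesIO(b'\xff\xd8\xff\xe0' + b'\x00' * 16)
try:
    pickle.load(fake_jpeg)
except pickle.UnpicklingError as exc:
    print(exc)  # invalid load key, '\xff'.
```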

X = torch.load(root_dir + ID)

I suggest debugging or printing out root_dir + ID and checking whether the file really exists — and if it does, whether it can actually be unpickled (it may be corrupted, or simply in the wrong format, e.g. a raw image rather than a torch.save file).
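A small helper along those lines (inspect_sample is a name I made up for illustration):

```python
import os

def inspect_sample(path):
    """Report whether the file exists and, if so, its first bytes."""
    if not os.path.exists(path):
        print('missing:', path)
        return False, None
    with open(path, 'rb') as f:
        head = f.read(4).hex()
    # 'ffd8...' means JPEG, '89504e47' means PNG -- neither is a torch.save file.
    print('exists:', path, 'first bytes:', head)
    return True, head

# e.g. call inspect_sample(root_dir + ID) inside __getitem__ to see
# what torch.load is actually being handed.
```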

related: https://github.com/learnables/learn2learn/issues/310
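If the files under root_dir turn out to be plain JPEG/PNG images rather than tensors written by torch.save, the usual fix is to open them with PIL inside __getitem__ and let the transform pipeline (e.g. torchvision's ToTensor) do the conversion. A sketch under that assumption (Pillow required; I've dropped the torch Dataset base class so it runs standalone, but you'd keep it in real code):

```python
from PIL import Image

class roof_dataset:  # subclass torch.utils.data.Dataset in the real code
    def __init__(self, root_dir, list_IDs, labels, transform=None):
        self.root_dir = root_dir
        self.list_IDs = list_IDs
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.list_IDs)

    def __getitem__(self, index):
        ID = self.list_IDs[index]
        # Open the raw image file; torch.load only reads files written by
        # torch.save, so it chokes on JPEG bytes (which start with \xff).
        X = Image.open(self.root_dir + ID).convert('RGB')
        y = self.labels[ID]
        if self.transform:
            X = self.transform(X)  # e.g. torchvision transforms.ToTensor()
        return X, y
```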