Can't pickle local object 'DataLoader.__init__.<locals>.<lambda>'

Hi all,
I hope everybody reading this is having a great day.

I have a problem with torchvision.transforms.Lambda() when the resulting data loader is iterated with Python's enumerate. I am using the transform to turn my single-channel image into a multi-channel tensor. It works fine and produces a data loader instance for torchvision datasets, but when I iterate over the batches with enumerate(<batch_name>) it gives the following error:


  File "C:\Users\Arsalan\Desktop\project\mlmi\trainer\trainer.py", line 63, in _train_epoch
    for batch_idx, (data, target) in enumerate(self.data_loader):

  File "C:\Users\Arsalan\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 501, in __iter__
    return _DataLoaderIter(self)

  File "C:\Users\Arsalan\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 289, in __init__
    w.start()

  File "C:\Users\Arsalan\Anaconda3\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)

  File "C:\Users\Arsalan\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)

  File "C:\Users\Arsalan\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)

  File "C:\Users\Arsalan\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)

  File "C:\Users\Arsalan\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)

AttributeError: Can't pickle local object 'NotMNISTDataLoader.__init__.<locals>.<lambda>'

On a side note:

This lambda transform works fine on my linux machine but on windows it is giving this problem.

Would anybody be kind enough to tell me what the problem is?

@peterjc123 lambda objects cannot be pickled on Windows?

Yes, on Windows they can't be at the moment.
Do you know any solutions?

So they can be pickled?
The error then does not come from the fact that you try to pickle a lambda function?

Sorry, I mistyped; I have edited my earlier comment.

I guess you will need to change NotMNISTDataLoader and use a proper function there instead of a lambda.

So, for my problem, transforming a single-channel image into a multi-channel tensor, what can I do?
This works fine on Linux, but what is the alternative for Windows users?

For example, if my batch is a tensor of shape (128, 32, 32), after the transformation I want a tensor of shape (128, 3, 32, 32).

Do you have the code for the __init__ of NotMNISTDataLoader? There is a lambda function in there, something like lambda x: x+1, that needs to be changed to

def tmp_func(x):
    return x + 1

And replace the lambda by tmp_func.

Here you go:

class NotMNISTDataLoader(BaseDataLoader):
    """
    NotMnist data loading demo using BaseDataLoader
    """
    def __init__(self, data_dir, batch_size, shuffle, validation_split, num_workers, training=True):
        print("NotMnist is used")
        self.data_dir = data_dir + "NotMNIST/"
        
        trsfm = transforms.Compose([
            transforms.Resize((32, 32)),
            transforms.ToTensor(),
            transforms.Normalize((0.1307,), (0.3081,)),
            transforms.Lambda(lambda x: x.repeat(3, 1, 1))
            ])
       
        self.dataset = NotMNIST(self.data_dir, train = training, download = True, transform = trsfm)
        super(NotMNISTDataLoader, self).__init__(self.dataset, batch_size, shuffle, validation_split, num_workers)
        trsfm = transforms.Compose([
            transforms.Resize((32, 32)),
            transforms.ToTensor(),
            transforms.Normalize((0.1307,), (0.3081,)),
            transforms.Lambda(lambda x: x.repeat(3, 1, 1))
            ])

Should become

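# At module level (outside the class). A function nested inside __init__
# would also be a local object and still could not be pickled.
def tmp_func(x):
    return x.repeat(3, 1, 1)

# Inside NotMNISTDataLoader.__init__:
        trsfm = transforms.Compose([
            transforms.Resize((32, 32)),
            transforms.ToTensor(),
            transforms.Normalize((0.1307,), (0.3081,)),
            transforms.Lambda(tmp_func)
            ])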
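
If the transform needs a parameter, a small callable class is another option. The sketch below is only an illustration: RepeatChannels is a made-up name, not something torchvision provides, and it assumes the input is a (1, H, W) tensor as in the question. It has to be defined at module level so that pickle can look the class up by name.

```python
class RepeatChannels:
    """Picklable stand-in for `lambda x: x.repeat(3, 1, 1)`.

    Hypothetical helper, not part of torchvision. Define it at module
    level (not inside __init__) so that pickle can find it by name.
    """

    def __init__(self, num_channels=3):
        self.num_channels = num_channels

    def __call__(self, x):
        # x is assumed to be a (1, H, W) tensor; repeating along the first
        # dimension gives a (num_channels, H, W) tensor.
        return x.repeat(self.num_channels, 1, 1)

    def __repr__(self):
        return f"{type(self).__name__}(num_channels={self.num_channels})"
```

An instance can go straight into the pipeline in place of the Lambda, for example RepeatChannels(3) as the last entry of transforms.Compose([...]).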

Thank you so much man :slight_smile:
It’s working now.
Have a great day.

By the way, can you tell me what the problem was before? Why can't it be pickled on Windows when it can on Linux?

I have no idea :confused: That's why I pinged peterjc, who knows Windows much better.

Ok thank you for your help.
Let’s hope he replies

No, it is not supported on Windows. The reason is that the multiprocessing library doesn't implement it on Windows. There are some alternatives, like dill, that can pickle more objects.
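
To add some background, as far as I understand it: on Windows, multiprocessing starts worker processes with the spawn method, which pickles the process object (and with it the dataset and its transforms) to send it to the child. Linux has traditionally defaulted to fork, which copies the parent process and does not need that pickling step, which would explain why the same code runs there. The standard pickle module stores functions by their importable name, so lambdas and functions nested inside another function fail. A minimal check using only the standard library (can_pickle and make_local_function are names made up for this sketch):

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialise obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def make_local_function():
    # A function defined inside another function is a "local object",
    # just like the lambda created inside NotMNISTDataLoader.__init__.
    def local_func(x):
        return x + 1
    return local_func


print(can_pickle(lambda x: x + 1))         # lambda: no importable name
print(can_pickle(make_local_function()))   # nested function: same problem
print(can_pickle(len))                     # builtin: pickled by name
```

The first two checks should report False and the last one True, which matches the fix above: give the transform a name that can be imported from the module.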

Could you please do me a favor and write this one into the official Windows doc?

Excuse my ignorance on the matter, but would you like to tell me how exactly can one use dill for .
I tried installing it too, but that didn't solve the problem. Maybe I am not using it right and have to make some changes to my code in order to use it.
Can you tell me how to use it to solve my problem?

You mean here:

Windows Discussion Forum
Or if I am wrong, would you share the link?

No, things should be done at the backend. I don’t think we will introduce dill to resolve this problem. So what @albanD said is the correct solution, at least for now. As for the document, it can be accessed here and modified here.

Hi,

Has there been any update on this in the official repo? I just ran into this issue on the latest Torch from Conda.

Cheers!

Hi @albanD, I am facing the same issue while using the Pool method. As you suggested, I removed the lambda functions, but the error still persists. (I am using Windows. I also tried the same code on Colab, and it shows the same error there.) Can you suggest what can be done?