Are Python lists and .cuda() incompatible?

Hi,
I put my training data and labels into a list of zip objects, “training_data”, as shown in the following code:

training_data.append(zip(train_data_list, target_list))    # build separate batches for the training data

When I tried to split the data back into two separate lists, I ran the following code:

for batch, batch_dataLabel in enumerate(training_data):
    data = []
    label = []

    for D, true_label in batch_dataLabel:
        D = Variable(D.float())
        true_label = Variable(true_label.float())
        data.append(D)
        label.append(true_label)

    if gpu:
        data = data.cuda()
        label = label.cuda()

However, this raised an AttributeError stating that ‘list’ object has no attribute ‘cuda’. So I tried another way suggested on the forum, using nn.ModuleList():

for batch, batch_dataLabel in enumerate(training_data):
    data = nn.ModuleList()
    label = nn.ModuleList()

    for D, true_label in batch_dataLabel:
        D = Variable(D.float())
        true_label = Variable(true_label.float())
        data.append(D)
        label.append(true_label)

    if gpu:
        data = data.cuda()
        label = label.cuda()

However, this time there is a TypeError: torch.FloatTensor is not a Module subclass.

Do you have any ideas on how to solve this? Is the list of zip objects too complicated, so that it would be better for me to use a simpler data structure?

Thank you very much in advance!

The problem with your first approach is that a list is a built-in Python type which does not have a cuda method.

The problem with your second approach is that torch.nn.ModuleList is designed to handle the registration of torch.nn.Module components and therefore does not allow you to add plain tensors to it.
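To make the distinction concrete, here is a minimal sketch (the layer sizes are made up for illustration): nn.ModuleList happily accepts nn.Module instances such as layers, while appending a plain tensor reproduces the TypeError you saw, and calling .cuda() on a plain Python list reproduces the AttributeError.

import torch
import torch.nn as nn

# nn.ModuleList is meant for submodules (layers), so this works:
layers = nn.ModuleList()
layers.append(nn.Linear(10, 5))   # registers the layer and its parameters
layers.append(nn.ReLU())

# Appending a plain tensor is rejected with a TypeError like the one above
# (the exact message may vary between versions):
try:
    layers.append(torch.randn(3, 10))
except TypeError as e:
    print(e)

# A plain Python list has no .cuda() method, hence the AttributeError:
# [torch.randn(3)].cuda()  -> AttributeError: 'list' object has no attribute 'cuda'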

There are two ways to overcome this:

  1. You could call .cuda() on each element independently, like this:
if gpu:
    data = [_data.cuda() for _data in data]
    label = [_label.cuda() for _label in label]

or

  2. You could store your data elements in one large tensor (e.g. via torch.cat) and then call .cuda() on the whole tensor:
data = []
label = []

for D, true_label in batch_dataLabel:
    D = D.float()
    true_label = true_label.float()
    # add a new dimension to the tensors and append them to the lists
    data.append(D.unsqueeze(0))
    label.append(true_label.unsqueeze(0))

data = torch.cat(data, dim=0)
label = torch.cat(label, dim=0)

if gpu:
    data = data.cuda()
    label = label.cuda()

Note: I removed the Variable conversion, since Variables and tensors were merged a while ago in PyTorch 0.4.
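As a small illustration of that merge (the shape here is arbitrary), a tensor can track gradients itself since 0.4, so the Variable wrapper is no longer needed:

import torch

# Since PyTorch 0.4, plain tensors can require gradients directly;
# wrapping them in Variable still works but is effectively a no-op.
x = torch.randn(4, 3, requires_grad=True)
y = (x * 2).sum()
y.backward()
print(x.grad.shape)  # torch.Size([4, 3])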


data = torch.cat(data, dim=0)

I am pretty sure that

data = torch.tensor(data)

would already do the job here. And here

data = data.cuda()

you can use

data = data.to(device)

where “device” is e.g. a GPU, depending on which one you want. For the first GPU,

device = torch.device('cuda:0')

Mentioning this because that’s the recommended way since v0.4, although your suggestions would also work fine.
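For completeness, here is a minimal device-agnostic sketch of that pattern (the tensor shape is arbitrary); it falls back to the CPU when no GPU is available:

import torch

# Pick the first GPU if available, otherwise stay on the CPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

data = torch.randn(8, 3)    # arbitrary example tensor
data = data.to(device)      # no-op if data already lives on that device
print(data.device)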


Would this effectively make a difference? As far as I know, they both create a copy of the underlying memory.

Not sure, but it does seem more idiomatic to me if you just want to convert a list of lists into a 2D tensor. If data is a list of tensors, then torch.cat would be the way to go; if it’s a list of lists, then torch.cat will probably not work.
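A small sketch of that distinction, with made-up toy data:

import torch

list_of_tensors = [torch.randn(3) for _ in range(4)]    # four 1D tensors
list_of_lists = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]       # plain nested Python lists

# torch.cat joins existing tensors along a dimension -> shape (4, 3) here
stacked = torch.cat([t.unsqueeze(0) for t in list_of_tensors], dim=0)

# torch.tensor builds a new tensor from (nested) Python numbers -> shape (2, 3)
converted = torch.tensor(list_of_lists)

# torch.tensor on a list of multi-element tensors typically raises an error,
# which matches the "torch.tensor(data) does not work" remark further down.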


Thanks a lot for your help! The code works well!
Since training_data is a list of zip objects, it has to be unzipped first. The following code works:


for batch, batch_imgclass in enumerate(training_data):
    data = []
    label = []
    img, classLab = zip(*batch_imgclass)
    for image in img:
        image = image.float()
        data.append(image.unsqueeze(0))
    for true_label in classLab:
        true_label = true_label.float()
        label.append(true_label.unsqueeze(0))

Thanks for your comments, @rasbt
Unfortunately torch.tensor(data) does not work…