Manual Streaming?

Jordan_Howell · June 26, 2020, 6:59pm

Hello,

I’m on a tight deadline and having a hard time getting a custom dataset to load correctly. So far I have created the label, numerical column, categorical column, and image tensors separately. However, when I load them onto the GPU, I get a memory error.

Is there a way to manually stream batches out of the tensors onto the GPU, 10 at a time?

ptrblck · June 28, 2020, 8:50am

What kind of “memory error” are you getting?
Could you post the error message with the stack trace, please?

Jordan_Howell · June 29, 2020, 9:21am

Hi @ptrblck. I already ran a process where I pulled all of the data prep, including the transformation to tensors, out of the custom dataset.

Now my custom data set looks like this:

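from torch.utils.data import Dataset

class tensor_data(Dataset):
    # Wraps four pre-built tensors that share the same first dimension.
    def __init__(self, num, cats, labels, images):
        self.num = num  # was `num_tensor`, which is not a parameter here
        self.cats = cats
        self.labels = labels
        self.images = images

    def __len__(self):
        # Number of samples, taken from the numerical tensor
        return len(self.num)

    def __getitem__(self, idx):
        # One sample: (numerical, categorical, label, image)
        return self.num[idx], self.cats[idx], self.labels[idx], self.images[idx]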
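
For anyone landing here with the original question, below is a minimal sketch of manually streaming aligned tensors 10 rows at a time. The helper name `iter_batches` and its `batch_size` parameter are made up for illustration and are not part of PyTorch. It only relies on `len()` and slicing, so it should work on CPU tensors as well as on the plain lists used in the demo.

```python
def iter_batches(*columns, batch_size=10):
    """Yield tuples of aligned slices, batch_size rows at a time.

    `columns` can be any sliceable sequences of equal length
    (hypothetical helper; lists here, CPU tensors in practice).
    """
    if not columns:
        return
    n = len(columns[0])
    if any(len(c) != n for c in columns):
        raise ValueError("all columns must have the same length")
    for start in range(0, n, batch_size):
        # Slice every column with the same window so rows stay aligned.
        yield tuple(c[start:start + batch_size] for c in columns)


# Demo with plain lists so the sketch runs without a GPU.
nums = list(range(25))
labels = [i % 2 for i in range(25)]
sizes = [len(n_b) for n_b, l_b in iter_batches(nums, labels, batch_size=10)]

# With tensors kept on the CPU, the loop could move one batch at a time
# (assumes a CUDA device is available):
#     for num_b, cat_b, label_b, img_b in iter_batches(num, cats, labels, images):
#         img_b = img_b.to("cuda")  # only this slice is copied to the GPU
```

The more usual route is to pass the dataset above to `torch.utils.data.DataLoader` with `batch_size=10` and call `.to(device)` on each batch inside the training loop, which keeps the full tensors on the CPU in the same way.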

That said, it’s also solved the single-image-per-batch issue, and it seems to be running faster. It’s just as easy to define a function to do the prep work, so I’m happy. Thank you, as always, for the great support. The community is lucky to have you.