Designing a custom dataset that uses multiple data tables

Hello. I'm currently working on a recommendation system, and I'm trying to build a custom dataset that draws on multiple data tables. There are three tables:

  • rating_data : rating data between users and items. The first column is the user ID and the second is the item ID. Type np.array, shape [N x 2] (N ratings).
  • negative_data : negative items for each user. The i-th row is the list of item IDs that are negative items for the user whose ID is i. Type np.array. The number of rows equals the number of users, and the number of columns varies from user to user.
  • image_path_list : 1D array of image paths for the items. The i-th element is the path to the image describing the item whose ID is i. Type np.array, with size equal to the number of items.

In each iteration, I need to: 1) get a (user ID, positive item ID) pair from rating_data; 2) sample 4 negative items from negative_data[userID]; 3) look up the image paths for the positive item and the 4 negative items in image_path_list.
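To make the three steps concrete, here is a minimal sketch on toy data. The table contents, the "img/..." paths, and the helper name sample_paths are all made up for illustration; only the three table names come from the description above.

```python
import numpy as np

# Toy stand-ins for the three tables (all values are invented for illustration).
rating_data = np.array([[0, 2], [1, 0], [2, 5]])  # rows of (userID, positive itemID)
negative_data = np.array(
    [
        np.array([0, 1, 3, 4, 5]),     # negatives of user 0
        np.array([1, 2, 3, 4]),        # negatives of user 1
        np.array([0, 1, 2, 3, 4, 6]),  # negatives of user 2
    ],
    dtype=object,  # rows have different lengths, so this is an object array
)
image_path_list = np.array([f"img/{i}.jpg" for i in range(7)])  # hypothetical paths


def sample_paths(index, rng, num_neg=4):
    """Hypothetical helper covering steps 1-3 for one rating row."""
    # 1) (userID, positive itemID) pair
    user, item_p = rating_data[index]
    # 2) sample num_neg distinct negative items for this user
    item_n = rng.choice(negative_data[user], size=num_neg, replace=False)
    # 3) look up the image paths; indexing with an array returns an array of paths
    p_path = image_path_list[item_p]
    n_paths = image_path_list[item_n]
    return int(user), int(item_p), item_n.tolist(), str(p_path), n_paths.tolist()


user, item_p, item_n, p_path, n_paths = sample_paths(0, np.random.default_rng(0))
```

Note that step 3 yields a single path for the positive item but a collection of four paths for the negative items.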

I wrote my code like this.

from torch.utils.data import Dataset
from PIL import Image
import numpy as np

class CustomDataset(Dataset):
    def __init__(self, rating_data, negative_data, image_path_list, transform=None):
        super(CustomDataset, self).__init__()
        self.rating_data = rating_data
        self.negative_data = negative_data
        self.image_path_list = image_path_list
        self.transform = transform

    def __len__(self):
        return len(self.rating_data)  # one sample per rating row

    def __getitem__(self, index):
        # Get the (userID, positive itemID) pair (e.g., user=3, item_p=0).
        user, item_p = self.rating_data[index]

        # Negative items of this user.
        ng_pool = self.negative_data[user]
        # Sample 4 distinct negative items (e.g., item_n = [1, 2, 7, 10]).
        idx = np.random.choice(len(ng_pool), 4, replace=False)
        item_n = ng_pool[idx].tolist()

        p_path = self.image_path_list[item_p]  # path to the image describing item_p
        n_path = self.image_path_list[item_n]  # array of 4 paths to the images describing item_n

        img_p = Image.open(p_path)
        img_n = Image.open(n_path)  # Error occurs here

        if self.transform is not None:
            img_p = self.transform(img_p)
            img_n = self.transform(img_n)

        return user, item_p, item_n, img_p, img_n

Since there are 4 negative items, n_path is an array containing four paths, and Image.open() seems to accept only a single path (str) as input. When I run

dataset = CustomDataset(rating_data, negative_data, image_path_list)
dataset[0]

I get the following error message:

*** AttributeError: 'numpy.ndarray' object has no attribute 'read'

How can I make Image.open() work on multiple paths? Can anybody help me fix this code?
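
One possible direction, offered as a sketch rather than a definitive fix: Image.open() handles one file per call, so the four paths could be opened one at a time in a loop. The helper name open_images is hypothetical, and the convert("RGB") call assumes the images are meant to be 3-channel color. The temporary PNG files exist only so the example runs on its own.

```python
import os
import tempfile

from PIL import Image


def open_images(paths, transform=None):
    """Hypothetical helper: open each path with its own Image.open() call."""
    images = []
    for path in paths:
        # str() also covers numpy.str_ elements taken from an np.array of paths.
        with Image.open(str(path)) as img:
            # Assumption: RGB images. convert() loads the pixel data
            # before the underlying file is closed.
            img = img.convert("RGB")
        images.append(transform(img) if transform is not None else img)
    return images


# Stand-in image files so the sketch is runnable without real data.
tmp_dir = tempfile.mkdtemp()
demo_paths = []
for i in range(4):
    path = os.path.join(tmp_dir, f"{i}.png")
    Image.new("RGB", (8, 8), color=(i * 10, 0, 0)).save(path)
    demo_paths.append(path)

img_n = open_images(demo_paths)  # list of 4 PIL images
```

In __getitem__, this would mean replacing Image.open(n_path) with something like img_n = open_images(n_path, self.transform). If the transform returns tensors of identical shape, torch.stack(img_n) could then combine them into a single [4, C, H, W] tensor.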