Hello. I'm currently working on a recommendation system, and I'm trying to build a custom dataset that draws on multiple data tables. There are three data tables:
rating_data: Contains rating data between users and items. The first column is the user ID and the second is the item ID. It is an np.array of shape [N x 2] (N ratings).
negative_data: Contains the negative items for each user. The i-th row is a list of item IDs, and these are the negative items of the user whose ID is i. It is an np.array. The number of rows equals the number of users, and the number of columns varies from user to user.
image_path_list: A 1D array that contains the paths to the image data for the items. The i-th element is the path to the image that describes the item whose ID is i. It is an np.array whose size equals the number of items.
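To make the layout concrete, here is a toy version of the three tables. All values and file names below are invented purely for illustration, and storing the ragged negative_data as an object array is an assumption about how it is held in memory.

```python
import numpy as np

# Toy values, invented for illustration only.
# Each row is a (userID, itemID) pair, so the shape is [N x 2] with N = 3.
rating_data = np.array([[0, 2], [1, 0], [1, 3]])

# One row per user. Rows have different lengths, so an object array is
# used here (an assumption; a list of arrays would behave similarly).
negative_data = np.empty(2, dtype=object)
negative_data[0] = np.array([0, 1, 3, 4, 5])  # negatives of user 0
negative_data[1] = np.array([1, 2, 4, 5])     # negatives of user 1

# The i-th element is the image path of the item whose ID is i.
image_path_list = np.array([f"images/item_{i}.jpg" for i in range(6)])

# Indexing with a list of item IDs returns an array of paths, not one str.
sampled_paths = image_path_list[[1, 3, 4, 5]]
```

Note that sampled_paths is itself an np.ndarray holding four paths, which is the kind of value my code ends up passing to Image.open().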
In each iteration, I need the following:
1) First, a pair of user ID and positive item ID from rating_data.
2) Then I have to sample 4 negative items from negative_data[userID].
3) Finally, I need the image paths corresponding to the positive item and the 4 negative items from image_path_list.
I wrote my code like this.
from torch.utils.data import Dataset
from PIL import Image
import numpy as np


class CustomDataset(Dataset):
    def __init__(self, rating_data, negative_data, image_path_list, transform=None):
        super(CustomDataset, self).__init__()
        self.rating_data = rating_data
        self.negative_data = negative_data
        self.image_path_list = image_path_list
        self.transform = transform

    def __len__(self):
        # One sample per (user, positive item) rating.
        return len(self.rating_data)

    def __getitem__(self, index):
        # Get (userID, positive itemID) pair. (e.g., user=3, item_p=0)
        user, item_p = self.rating_data[index]
        # Negative items of this specific user.
        ng_pool = self.negative_data[user]
        # Sample 4 negative items (e.g., item_n = [1, 2, 7, 10])
        idx = np.random.choice(len(ng_pool), 4, replace=False)
        item_n = ng_pool[idx].tolist()
        p_path = self.image_path_list[item_p]  # path to the image that describes item_p
        n_path = self.image_path_list[item_n]  # paths to the images that describe item_n
        img_p = Image.open(p_path)
        img_n = Image.open(n_path)  # Error occurs here
        if self.transform is not None:
            img_p = self.transform(img_p)
            img_n = self.transform(img_n)
        return user, item_p, item_n, img_p, img_n
Since there are 4 negative items, n_path is an array containing four paths, and Image.open() seems to accept only a single str path as input. When I run
dataset = CustomDataset(rating_data, negative_data, image_path_list)
dataset[0]
I get this error message:
*** AttributeError: 'numpy.ndarray' object has no attribute 'read'
How can I make Image.open() work on multiple paths? Can anybody help me fix this code?
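For reference, the direction I am considering is to open each path separately, since Image.open() takes one path (or file object) at a time. This is only a minimal sketch: the helper name open_images is hypothetical, and converting every image to RGB is an assumption about my data rather than a requirement.

```python
import numpy as np
from PIL import Image


def open_images(paths, transform=None):
    """Open each path on its own and return a list of images.

    `open_images` is a hypothetical helper, not part of PIL or PyTorch.
    """
    images = []
    for path in paths:
        # Image.open() expects a single path, so pass one str at a time.
        with Image.open(str(path)) as img:
            # convert() loads the pixel data and returns a new image,
            # so the file can be closed safely afterwards.
            # Converting to RGB is an assumption about the images.
            img = img.convert("RGB")
        if transform is not None:
            img = transform(img)
        images.append(img)
    return images


# Possible use inside __getitem__ (sketch):
#     img_n = open_images(self.image_path_list[item_n], self.transform)
```

If the transform returns tensors of equal shape, the resulting list could presumably be combined with torch.stack so that the four negatives form one tensor, but I have not verified this with my data.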