Suppose I have two directories: A contains 70 images and B contains 50 images. I need to randomly choose one image from A and one from B and then combine them (for example, with element-wise multiplication). Could someone give me a code snippet showing how to write this? I can't work out how to override __len__ and __getitem__ properly.
How would you like to sample from these datasets, given that they have different numbers of samples?
Would you like to get every sample of A and sample the pair from B with repetition?
Once you have both Datasets defined, you could wrap them in another Dataset class and use your sampling logic inside __getitem__.
Here is a code example:
import torch
from torch.utils.data import Dataset, TensorDataset

class MyDataset(Dataset):
    def __init__(self, datasetA, datasetB):
        self.datasetA = datasetA
        self.datasetB = datasetB

    def __getitem__(self, index):
        xA, yA = self.datasetA[index]
        # implement your sampling logic here
        # .item() converts the index to a plain int, so datasetB returns a
        # single sample without an extra leading dimension
        indexB = torch.randint(0, len(self.datasetB), (1,)).item()
        xB, yB = self.datasetB[indexB]
        # your operations on the datasets
        x = xA * xB
        return x

    def __len__(self):
        # use the largest dataset's length
        return len(self.datasetA)

# Create dummy datasets
datasetA = TensorDataset(torch.randn(70, 3, 224, 224), torch.randint(0, 10, (70,)))
datasetB = TensorDataset(torch.randn(50, 3, 224, 224), torch.randint(0, 10, (50,)))
dataset = MyDataset(datasetA, datasetB)
I'm not sure how you would like to use the targets of the datasets (if available).
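If the targets turn out to be needed, one option is to return both of them alongside the combined image. The following is only a sketch under that assumption: the class name PairWithTargets is made up for illustration, and the small 8x8 dummy tensors just keep the example light. It also shows how the wrapped dataset behaves when passed to a DataLoader.

```python
import torch
from torch.utils.data import DataLoader, Dataset, TensorDataset


class PairWithTargets(Dataset):  # hypothetical name, not from the thread
    """Pairs each sample of A with a randomly drawn sample of B and keeps both targets."""

    def __init__(self, datasetA, datasetB):
        self.datasetA = datasetA
        self.datasetB = datasetB

    def __getitem__(self, index):
        xA, yA = self.datasetA[index]
        # Draw a random partner from B (with repetition across calls)
        indexB = torch.randint(0, len(self.datasetB), (1,)).item()
        xB, yB = self.datasetB[indexB]
        # Return the combined image together with both targets
        return xA * xB, yA, yB

    def __len__(self):
        # One item per sample of A
        return len(self.datasetA)


# Dummy data: 70 samples in A, 50 in B (small images to keep this fast)
datasetA = TensorDataset(torch.randn(70, 3, 8, 8), torch.randint(0, 10, (70,)))
datasetB = TensorDataset(torch.randn(50, 3, 8, 8), torch.randint(0, 10, (50,)))
loader = DataLoader(PairWithTargets(datasetA, datasetB), batch_size=16, shuffle=True)

x, yA, yB = next(iter(loader))
print(x.shape, yA.shape, yB.shape)
```

Because the partner from B is drawn inside __getitem__, each epoch pairs the samples of A with a different random selection from B.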
I'd like to take a clean image and an arbitrary mask and apply the mask to the original image to get a masked image.
So I could combine A and B exhaustively, in which case the true length of the dataset would be len(A)*len(B).
But I think your code example is a good one: len(dataset) = len(A), and the mask is sampled randomly by the randint call.
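For comparison, the exhaustive len(A)*len(B) variant could look roughly like the sketch below. It is an illustration only: the class name ProductDataset, the binary dummy masks, and the use of divmod to map a flat index to an (image, mask) pair are assumptions, not something given in the thread.

```python
import torch
from torch.utils.data import Dataset, TensorDataset


class ProductDataset(Dataset):  # hypothetical name, not from the thread
    """Enumerates every (image, mask) combination exactly once."""

    def __init__(self, images, masks):
        self.images = images
        self.masks = masks

    def __len__(self):
        # Every image is paired with every mask
        return len(self.images) * len(self.masks)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Map the flat index to (image index, mask index)
        img_idx, mask_idx = divmod(index, len(self.masks))
        (image,) = self.images[img_idx]  # TensorDataset returns a 1-tuple here
        (mask,) = self.masks[mask_idx]
        # The single-channel mask broadcasts over the image channels
        return image * mask


# Dummy data: 70 images and 50 random binary masks
images = TensorDataset(torch.randn(70, 3, 8, 8))
masks = TensorDataset((torch.rand(50, 1, 8, 8) > 0.5).float())
dataset = ProductDataset(images, masks)
print(len(dataset))  # 3500
```

With shuffle=True in a DataLoader this visits all 3500 pairs per epoch in random order, whereas the randint version sees only len(A) pairs per epoch but draws new ones each time.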