Train two Datasets together in a Capsule Network

I am working with this code: https://github.com/AlexHex7/CapsNet_pytorch

Here the training images come from train_loader. I want to train on the images from valid_loader at the same time. How can I do that?

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=cfg.batch_size, shuffle=True)
valid_loader = torch.utils.data.DataLoader(valid_dataset, batch_size=cfg.batch_size, shuffle=True)

for epoch_index in range(cfg.epoch):
    for train_batch_index, (img_batch, label_batch) in enumerate(train_loader):
        img_batch = variable(img_batch)
        label_batch = variable(label_batch).unsqueeze(dim=1)
        predict, reconstruct_img = net(img_batch, label_batch, train=True)

        acc = net.calc_acc(predict, label_batch)
        margin_loss = net.margin_loss(predict, label_batch)
        reconstruct_loss = net.reconstruction_loss(img_batch, reconstruct_img)
        loss = margin_loss + reconstruct_loss
        net.zero_grad()
        loss.backward()
        opt.step()

Actually, my intention is something like this:

for train_batch_index, (img_batch_v, label_batch_v) in enumerate(valid_loader):

so that I can get the values of img_batch_v and label_batch_v as well as img_batch and label_batch.

It depends on whether you need to know if a sample comes from the training or the validation dataset.
If it doesn’t matter, the most elegant solution might be to use a ConcatDataset.
With this approach you simply concatenate both datasets into a new one. A DataLoader wrapped around it can take care of shuffling if needed.

Here is a small example:

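import torch
from torch.utils.data import ConcatDataset, DataLoader, Dataset

class MyData(Dataset):
    def __init__(self):
        self.data = torch.randn(100, 1)
        self.target = torch.Tensor(100).uniform_(0, 10).long()

    def __getitem__(self, index):
        x = self.data[index]
        y = self.target[index]  # return the label, not the data again
        return x, y

    def __len__(self):
        return len(self.data)

train_dataset = MyData()
val_dataset = MyData()
train_val_dataset = ConcatDataset([train_dataset, val_dataset])

# The DataLoader batches and shuffles across both datasets
# (batch_size=10 is an arbitrary choice for this example)
train_val_loader = DataLoader(train_val_dataset, batch_size=10, shuffle=True)

for batch_idx, (data, target) in enumerate(train_val_loader):
    print(batch_idx)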

However, if you need to call both data sets separately, you could zip both sets in the for loop.
One drawback of this approach (at least in this simple form) is that both datasets should have the same length: zip stops as soon as the shorter one is exhausted, so the remaining samples of the longer one are skipped.
Here is a small modification to the aforementioned code:

train_loader = DataLoader(train_dataset)
val_loader = DataLoader(val_dataset)

# Zip the loaders (not the raw datasets) so each step yields one batch from each
for batch_idx, (train_batch, val_batch) in enumerate(zip(train_loader, val_loader)):
    print(batch_idx)
    data, target = train_batch[0], train_batch[1]
    val, target_val = val_batch[0], val_batch[1]
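
To see what happens when the two lengths differ, here is a minimal sketch with toy data. The dataset sizes and batch size are arbitrary values picked for illustration, not anything from the CapsNet repository.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy datasets of different sizes (10 vs. 4 samples, chosen arbitrarily).
long_dataset = TensorDataset(torch.randn(10, 1), torch.zeros(10, dtype=torch.long))
short_dataset = TensorDataset(torch.randn(4, 1), torch.ones(4, dtype=torch.long))

long_loader = DataLoader(long_dataset, batch_size=2)
short_loader = DataLoader(short_dataset, batch_size=2)

# zip stops as soon as the shorter loader runs out of batches.
num_steps = sum(1 for _ in zip(long_loader, short_loader))

# Number of batches in each loader, and number of zipped steps actually run.
print(len(long_loader), len(short_loader), num_steps)
```

The loop only runs for as many steps as the shorter loader has batches, so the rest of the longer loader is never seen in that epoch.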

I would prefer the first approach, if it fits your needs :slight_smile:


@ptrblck thank you so much for your kind reply.

Then, after training, how can I get the values?

After training, I need to get the values for source_data1 and source_data2, for example

img_batch ?
label_batch?

for both sources.

In Caffe we use a Slice layer to separate them. Is there a similar function that we can use?

layer {
  name: "fc7_slice"
  type: "Slice"
  bottom: "fc7_grl"
  top: "source_fc7_1"
  top: "target_fc7_1"
  slice_param {
    axis: 0
  }
  include: { phase: TRAIN }
}

for epoch_index in range(cfg.epoch):
    for train_batch_index, (img_batch, label_batch) in enumerate(train_loader):
        img_batch = variable(img_batch)
        label_batch = variable(label_batch).unsqueeze(dim=1)

        img_batch_s = torch.slice(img_batch)
        img_batch_t = torch.slice(img_batch)

May I use this **torch.slice** to do this?

I am not sure why you were using slice in Caffe.

After the training you can get the data and target from both datasets either by calling

data, target = train_dataset[0]
data_val, target_val = val_dataset[0]

or by wrapping it again in a DataLoader and iterating over it.

I’m not sure, but it seems the Caffe Slice layer separates a blob along an axis.
You don’t need it in this case.
Also, torch.slice does not exist. You can use narrow, however.
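
For reference, here is a small sketch of how narrow can split a batch along axis 0, roughly what a slice along the batch axis would do. The batch shape and the half-and-half split are assumptions made for illustration only.

```python
import torch

# A fake batch of 8 single-channel 28x28 images (shape is illustrative).
img_batch = torch.randn(8, 1, 28, 28)
half = img_batch.size(0) // 2

# narrow(dim, start, length) returns a view of `length` entries starting at `start`.
img_batch_s = img_batch.narrow(0, 0, half)     # first half of the batch
img_batch_t = img_batch.narrow(0, half, half)  # second half of the batch

# torch.chunk produces the same two pieces in a single call.
chunk_s, chunk_t = torch.chunk(img_batch, 2, dim=0)

print(img_batch_s.shape, img_batch_t.shape)
```

Both narrow and chunk return views of the original tensor, so no data is copied.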

Could you explain your use case a little bit more? I’m not sure I understand it correctly.


The original code is:
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=cfg.batch_size, shuffle=True)

for epoch_index in range(cfg.epoch):
    for train_batch_index, (img_batch, label_batch) in enumerate(train_loader):
        img_batch = variable(img_batch)
        label_batch = variable(label_batch).unsqueeze(dim=1)
        predict, reconstruct_img = net(img_batch, label_batch, train=True)

        acc = net.calc_acc(predict, label_batch)
        margin_loss = net.margin_loss(predict, label_batch)
        reconstruct_loss = net.reconstruction_loss(img_batch, reconstruct_img)
        loss = margin_loss + reconstruct_loss
        net.zero_grad()
        loss.backward()
        opt.step()

The original code takes data from one source. My intention is to take data from two sources and, during training, add a custom loss between the two sources.
For this reason, I need training data from two sources. For example, one is from the train folder and the other is from the valid folder.

Ok, since you need them simultaneously, I would suggest my 2nd approach (zipping both DataLoaders).

for batch_idx, (train_batch, val_batch) in enumerate(zip(train_loader, val_loader)):
    data, target = train_batch[0], train_batch[1]
    val, target_val = val_batch[0], val_batch[1]

    # Forward call
    ....
    # Loss calculation
    acc = ...
    your_custom_loss = ...
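
A fuller, self-contained sketch of this pattern could look like the following. Everything here is a placeholder: the linear model stands in for the CapsNet, the data is random, and custom_loss is a hypothetical cross-source term (squared distance between mean outputs), not a loss taken from the repository.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Toy stand-ins for the two sources (random data, equal length).
source_a = TensorDataset(torch.randn(32, 10), torch.randint(0, 3, (32,)))
source_b = TensorDataset(torch.randn(32, 10), torch.randint(0, 3, (32,)))
loader_a = DataLoader(source_a, batch_size=8, shuffle=True)
loader_b = DataLoader(source_b, batch_size=8, shuffle=True)

net = nn.Linear(10, 3)  # placeholder model, not the CapsNet
opt = torch.optim.SGD(net.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

def custom_loss(out_a, out_b):
    # Hypothetical cross-source term: squared distance between mean outputs.
    return (out_a.mean(dim=0) - out_b.mean(dim=0)).pow(2).sum()

losses = []
for batch_idx, (batch_a, batch_b) in enumerate(zip(loader_a, loader_b)):
    data_a, target_a = batch_a
    data_b, target_b = batch_b

    # Forward both sources through the same network.
    out_a = net(data_a)
    out_b = net(data_b)

    # Supervised loss on source A plus the cross-source term.
    loss = criterion(out_a, target_a) + custom_loss(out_a, out_b)

    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.item())
```
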

Will this meet your needs?


@ptrblck Thank you so much. I think it will work.

My original code is:

for epoch_index in range(cfg.epoch):
    for train_batch_index, (img_batch, label_batch) in enumerate(train_loader):
        img_batch = variable(img_batch)
        label_batch = variable(label_batch).unsqueeze(dim=1)
        predict, reconstruct_img = net(img_batch, label_batch, train=True)
        acc = net.calc_acc(predict, label_batch)
        margin_loss = net.margin_loss(predict, label_batch)
        reconstruct_loss = net.reconstruction_loss(img_batch, reconstruct_img)
        loss = margin_loss + reconstruct_loss
        net.zero_grad()
        loss.backward()
        opt.step()

My new code is:

for epoch_index in range(cfg.epoch):
    for train_batch_index, (img_batch, label_batch) in enumerate(zip(train_loader, valid_loader)):
        img_batch, label_batch = img_batch[0], img_batch[1]
        img_batch_v, label_batch_v = label_batch[0], label_batch[1]

        img_batch = variable(img_batch)
        label_batch = variable(label_batch).unsqueeze(dim=1)

        img_batch_v = variable(img_batch_v)
        label_batch_v = variable(label_batch_v).unsqueeze(dim=1)

        predict, reconstruct_img = net(img_batch, label_batch, train=True)

        acc = net.calc_acc(predict, label_batch)
        margin_loss = net.margin_loss(predict, label_batch)
        reconstruct_loss = net.reconstruction_loss(img_batch, reconstruct_img)
        Custom_loss = custom_loss(img_batch, img_batch_v)  # img_batch from one source, img_batch_v from the other

        loss = margin_loss + reconstruct_loss + Custom_loss
        net.zero_grad()
        loss.backward()
        opt.step()

But got the error:

  File "main_mahfuj.py", line 95, in <module>
    img_batch_v = variable(img_batch_v)
  File "/media/mahfuj/DATA/New_CODE/Working/CapsNet_pytorch/lib/cuda_utils.py", line 7, in variable
    return Variable(tensor, volatile=volatile).cuda()
RuntimeError: Variable data has to be a tensor, but got int

I got the value of img_batch but I did not get the value of img_batch_v.

Thanks in advance.

You mixed up the variable names a bit.
In the for loop, I would rename img_batch and label_batch to train_batch and val_batch, respectively.

Could you check what both loaders are returning, i.e. the shapes of both tensors inside each tuple?
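
To spell out the mix-up: in the posted loop, the first unpacking line overwrites label_batch with the training labels, so the second line indexes into those labels instead of the validation batch, and img_batch_v ends up being a single label. A minimal sketch of the renamed loop, using random stand-in data with assumed MNIST-like shapes, might look like this:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random stand-ins for the two sources (shapes are assumptions for the demo).
train_ds = TensorDataset(torch.randn(6, 1, 28, 28), torch.arange(6))
valid_ds = TensorDataset(torch.randn(6, 1, 28, 28), torch.arange(6))
train_loader = DataLoader(train_ds, batch_size=3)
valid_loader = DataLoader(valid_ds, batch_size=3)

for train_batch, val_batch in zip(train_loader, valid_loader):
    # Distinct names for the loop variables, so nothing is overwritten
    # before it is unpacked.
    img_batch, label_batch = train_batch
    img_batch_v, label_batch_v = val_batch

    shapes = (img_batch.shape, label_batch.shape,
              img_batch_v.shape, label_batch_v.shape)
    print(shapes)
    break  # one step is enough to inspect the shapes
```

Printing the shapes once per loader is a quick way to confirm that each tuple really holds an image batch and a label batch.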


@ptrblck Thank you so much for your kind help.

for epoch_index in range(cfg.epoch):
    for train_batch_index, (img_batch, label_batch) in enumerate(zip(train_loader, valid_loader)):
        img_batch, label_batch = img_batch[0], img_batch[1]
        img_batch_v, label_batch_v = label_batch[0], label_batch[1]

        img_batch = variable(img_batch)
        label_batch = variable(label_batch).unsqueeze(dim=1)

        img_batch_v = variable(img_batch_v)
        label_batch_v = variable(label_batch_v).unsqueeze(dim=1)

        predict, reconstruct_img = net(img_batch, label_batch, train=True)
        acc = net.calc_acc(predict, label_batch)
        margin_loss = net.margin_loss(predict, label_batch)
        reconstruct_loss = net.reconstruction_loss(img_batch, reconstruct_img)
        Custom_loss = custom_lossl(img_batch, img_batch_v)

        loss = margin_loss + reconstruct_loss + Custom_loss
        net.zero_grad()
        loss.backward()
        opt.step()

Results:
img_batch size: (28, 28)

But I did not get the value of img_batch_v:

torch.Size([28, 28])
Traceback (most recent call last):
  File "main_mahfuj.py", line 91, in <module>
    print(img_batch_v[0][0].shape)
TypeError: 'int' object is not subscriptable

You still have some errors with the sizes.
Please try print(img_batch_v.shape) (without the indexing).
Also it seems your naming is still a bit confusing. :wink:

@ptrblck Thank you so much.

Now I got it. I have changed the naming and it's working. Thank you so much for your kind help. I really appreciate it.

I’m glad it’s working now! :wink:


@ptrblck

for epoch_index in range(cfg.epoch):
    for batch_idx, (train_batch, val_batch) in enumerate(zip(train_loader, valid_loader)):
        data, target = train_batch[0], train_batch[1]
        val, target_val = val_batch[0], val_batch[1]
        img_batch = variable(data)
        label_batch = variable(target).unsqueeze(dim=1)
        img_batch_v = variable(val)
        label_batch_v = variable(target_val).unsqueeze(dim=1)

As I understand it, data and target take the data and labels from train_loader, whereas val and target_val take the data and labels from valid_loader. Is that correct?

The problem I am facing is that train_loader contains 60000 images and valid_loader contains 10000 images.
I think I need to use all the images from both train_loader and valid_loader.

You are right in your understanding.

Yeah, that’s the issue I mentioned in my first post.

If I understand you correctly, you need a pair of training and validation data.
So your validation data would be repeated (in your example 6 times) so that every training sample gets a validation partner.
Is that correct?

You could artificially make the validation Dataset larger with this somewhat ugly hack:

import torch
from torch.utils.data import DataLoader, Dataset

class MyDataTrain(Dataset):
    def __init__(self, length):
        self.data = torch.randn(length, 1)
        self.target = torch.Tensor(length).uniform_(0, 10).long()

    def __getitem__(self, index):
        x = self.data[index]
        y = self.target[index]  # return the label, not the data again
        return x, y

    def __len__(self):
        return len(self.data)

class MyDataVal(Dataset):
    def __init__(self, length, fake_length):
        self.data = torch.randn(length, 1)
        self.target = torch.Tensor(length).uniform_(0, 10).long()
        self.fake_length = fake_length
        self.real_length = len(self.data)

    def __getitem__(self, index):
        # Wrap around so indices beyond the real length reuse earlier samples
        index = index % self.real_length
        x = self.data[index]
        y = self.target[index]
        return x, y

    def __len__(self):
        # Report the pretended length so the DataLoader keeps drawing samples
        return self.fake_length

train_dataset = MyDataTrain(length=60000)
val_dataset = MyDataVal(length=10000, fake_length=60000)

train_loader = DataLoader(train_dataset)
val_loader = DataLoader(val_dataset)

for batch_idx, (train_batch, val_batch) in enumerate(zip(train_loader, val_loader)):
    pass  # your code here

Note the modulo operation in __getitem__ of the validation Dataset.
It will give fake_length samples, iterating over the same smaller dataset.
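
As an alternative to faking the length, the shorter loader could be restarted whenever it runs out. This is a different technique from the modulo trick, sketched here with a hypothetical helper named repeat_loader and toy data whose sizes are arbitrary.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def repeat_loader(loader):
    # Hypothetical helper: restart the loader whenever it is exhausted,
    # so a shuffling loader reshuffles on every pass.
    while True:
        for batch in loader:
            yield batch

train_ds = TensorDataset(torch.randn(12, 1), torch.randint(0, 10, (12,)))
val_ds = TensorDataset(torch.randn(4, 1), torch.randint(0, 10, (4,)))
train_loader = DataLoader(train_ds, batch_size=2, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=2, shuffle=True)

# The finite train_loader comes first in zip, so the loop ends when it ends;
# the validation batches repeat as often as needed.
num_steps = 0
for train_batch, val_batch in zip(train_loader, repeat_loader(val_loader)):
    data, target = train_batch
    val, target_val = val_batch
    num_steps += 1

print(num_steps)
```

Keeping the finite loader as the first argument to zip matters here, because the repeating generator never stops on its own.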

Can you work with this?


@ptrblck Thank you so much for your kind help and extraordinary explanation. Yes, I need to make the validation Dataset larger artificially. Then it will serve my purpose.

transform = transforms.Compose([
                    transforms.Resize((28,28)),
                    transforms.Grayscale(num_output_channels=1),
                    transforms.ToTensor(),      
                    #transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
                    transforms.Normalize((0.1307,), (0.3081,))])
train_dataset = datasets.SVHN('SVHN', download=True, transform=transform, split='train')
valid_dataset = datasets.MNIST('MNIST', download=True, transform=transform, train=True)

class MyDataTrain(Dataset):
    def __init__(self, length):
        self.data = torch.randn(length, 1)
        self.target = torch.Tensor(length).uniform_(0, 10).long()

    def __getitem__(self, index):
        x = self.data[index]
        y = self.target[index]
        return x, y

    def __len__(self):
        return len(self.data)

class MyDataVal(Dataset):
    def __init__(self, length, fake_length):
        self.data = torch.randn(length, 1)
        self.target = torch.Tensor(length).uniform_(0, 10).long()
        self.fake_length = fake_length
        self.real_length = len(self.data)

    def __getitem__(self, index):
        index = index % self.real_length
        x = self.data[index]
        y = self.target[index]
        return x, y

    def __len__(self):
        return self.fake_length

train_dataset = MyDataTrain(length=60000)
val_dataset = MyDataVal(length=10000, fake_length=60000)
train_loader = DataLoader(train_dataset)
val_loader = DataLoader(val_dataset)

This gives me an error:

NameError: name 'Dataset' is not defined

You have to import the Dataset class:

from torch.utils.data import Dataset
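
One more thing to watch for: the snippet above reassigns train_dataset to the random toy data, so the SVHN and MNIST datasets are no longer used. A wrapper around an existing dataset can apply the same modulo trick instead. Below is a rough sketch, where RepeatedDataset is a hypothetical name and small random TensorDatasets stand in for the real SVHN and MNIST objects.

```python
import torch
from torch.utils.data import DataLoader, Dataset, TensorDataset

class RepeatedDataset(Dataset):
    """Hypothetical wrapper: presents `base` as if it had `fake_length` samples."""

    def __init__(self, base, fake_length):
        self.base = base
        self.fake_length = fake_length

    def __getitem__(self, index):
        # Wrap around so indices past the real length reuse earlier samples.
        return self.base[index % len(self.base)]

    def __len__(self):
        return self.fake_length

# Stand-ins for the real datasets (sizes scaled down for the demo).
train_dataset = TensorDataset(torch.randn(60, 1, 28, 28), torch.randint(0, 10, (60,)))
valid_dataset = TensorDataset(torch.randn(10, 1, 28, 28), torch.randint(0, 10, (10,)))

# Stretch the smaller dataset to the length of the larger one.
valid_repeated = RepeatedDataset(valid_dataset, fake_length=len(train_dataset))

train_loader = DataLoader(train_dataset, batch_size=5, shuffle=True)
valid_loader = DataLoader(valid_repeated, batch_size=5, shuffle=True)

num_steps = sum(1 for _ in zip(train_loader, valid_loader))
print(len(valid_repeated), num_steps)
```

With the real datasets, the same wrapper would be applied to the smaller of the two, with fake_length set to the length of the larger one.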