Help! Adding data from the target subject to training makes performance worse

Hi PyTorch community,

I am running a leave-one-patient-out experiment with a CNN+RNN network. Because my dataset is imbalanced, I use a sampler within the DataLoader during training.
I am now adding a few labeled samples (e.g. 10) from the target subject to my training set, with the goal of improving performance on my task. I expected this to help, but in some cases performance drops. So my question is: how can I be sure the DataLoader is also using the added target samples? I would expect adding samples from the target subject to be at least as good as not adding them, never worse. Any help would be much appreciated.

I train my model from scratch and stop training after 20 epochs. I use BatchNorm with the PyTorch momentum parameter set to 0.01.
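
To make the momentum convention explicit, here is a minimal sketch of how PyTorch uses that value: the running statistics are updated as running = (1 - momentum) * running + momentum * batch_statistic, so 0.01 means they move slowly. The layer size and the toy batch are assumptions for illustration only, not my actual network.

```python
import torch
from torch import nn

# Hypothetical single-feature layer, using the same momentum as in my setup.
bn = nn.BatchNorm1d(num_features=1, momentum=0.01)
bn.train()  # running statistics are only updated in training mode

# Toy batch: mean 10, unbiased variance 8/3.
x = torch.tensor([[8.0], [10.0], [10.0], [12.0]])
bn(x)

# PyTorch convention: running = (1 - momentum) * running + momentum * batch_statistic.
# The running mean starts at 0 and the running variance at 1.
expected_mean = 0.99 * torch.zeros(1) + 0.01 * x.mean(dim=0)
expected_var = 0.99 * torch.ones(1) + 0.01 * x.var(dim=0, unbiased=True)

print(bn.running_mean, expected_mean)
print(bn.running_var, expected_var)
```

With such a small momentum, the statistics used at evaluation time change only slightly per batch, which may matter if the target subject's data distribution differs from the other patients'.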

This is a snippet of my code.

import pandas as pd
import torch
from torch.utils.data import DataLoader

torch.manual_seed(0)  # the same seed, to ensure the same weight init

  
# train_df_aux is the training data without any labelled samples from the target subject.
# seeds are the labelled samples from the target subject; this subset is also unbalanced,
# since labelled samples are collected until X positive-class samples are found.
train_df = pd.concat([train_df_aux, seeds], ignore_index=True)
  
   
# DATA LOADERS
# train_data_trf1 is the training data with augmentation strategies applied.
train_data = torch.utils.data.ConcatDataset([train_data_ori, train_data_trf1])

# weights is built with custom functions so that batches are balanced during training.
sampler = torch.utils.data.sampler.WeightedRandomSampler(weights, len(weights))

# During training
kwargs = {'num_workers': hparams["num_workers"], 'pin_memory': True} if use_cuda else {}
train_loader = DataLoader(train_data, batch_size=hparams["batch_size"],
                          sampler=sampler,
                          **kwargs)
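
For context, one common way to build such per-sample weights is inverse class frequency. The sketch below only illustrates that idea under stated assumptions: the helper name `inverse_frequency_weights` and the toy labels are hypothetical, and this is not necessarily what my custom functions do.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per sample: 1 / (number of samples in that sample's class)."""
    class_counts = Counter(labels)
    return [1.0 / class_counts[y] for y in labels]


# Hypothetical labels: 8 negatives and 2 positives.
labels = [0] * 8 + [1] * 2
weights = inverse_frequency_weights(labels)

# Each class carries the same total weight, so draws are balanced in expectation.
total_per_class = {
    c: sum(w for w, y in zip(weights, labels) if y == c) for c in set(labels)
}
print(total_per_class)
```

A list like this can be passed directly as the `weights` argument of `WeightedRandomSampler`. It needs one entry per item of the dataset given to the DataLoader, in the same order.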

Thank you in advance!

If you just iterate through the DataLoader and print out each sample, do you see the labeled samples that you added?
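
A quicker version of that check is to iterate over the sampler itself, since it yields the dataset indices the DataLoader will request. This is a minimal sketch on toy data: the sizes, the uniform placeholder weights, and the assumption that the added samples sit at the last 10 indices are all hypothetical, so substitute your own weights and index positions.

```python
from collections import Counter

import torch
from torch.utils.data import WeightedRandomSampler

torch.manual_seed(0)

# Hypothetical setup: 100 samples, the last 10 being the added target-subject samples.
n_total, n_seed = 100, 10
seed_indices = set(range(n_total - n_seed, n_total))

# Placeholder weights; use the real per-sample weights here.
weights = torch.ones(n_total)

sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)

drawn = list(sampler)  # indices requested during one epoch
counts = Counter(drawn)

seed_draws = sum(counts[i] for i in seed_indices)
never_drawn = [i for i in sorted(seed_indices) if counts[i] == 0]

print(f"draws from added samples this epoch: {seed_draws} of {len(drawn)}")
print(f"added samples never drawn this epoch: {never_drawn}")
```

Two things may be worth checking with this. First, `WeightedRandomSampler` draws with replacement by default, so in any single epoch some samples are drawn several times and others not at all. Second, the sampler only produces indices in `range(len(weights))`, so `len(weights)` should equal `len(train_data)` after the `ConcatDataset`, and the weights should be in the same order as the concatenated dataset.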

Hi nivek! Thank you for your reply. Yes, I have checked, and I see the added samples!