Hello everyone. I'm building a simple CNN for a classification problem, adding pieces to it incrementally. Even though I've tried to seed every operation involving randomness, rerunning the same cell still gives me different results.
Below is what I did to limit randomness:
import os
import random

import numpy as np
import torch

def set_seed(seed=0):
    """Set seeds for reproducibility."""
    # Python random
    random.seed(seed)
    # NumPy
    np.random.seed(seed)
    # PyTorch
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # If using multi-GPU
    # CuDNN
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Python hash
    os.environ['PYTHONHASHSEED'] = str(seed)

set_seed(42)  # Set the seed for reproducibility
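For context, this is the kind of check I'd expect to pass after seeding. It's a minimal sketch on CPU only; the `nn.Linear(4, 2)` layer and the input shape are placeholders, not my real CNN:

```python
import torch
import torch.nn as nn

def run_once(seed=42):
    # Reset the RNG, build a fresh layer, and run one forward pass
    torch.manual_seed(seed)
    model = nn.Linear(4, 2)   # placeholder layer, not my real model
    x = torch.randn(3, 4)     # placeholder input
    return model(x)

out1 = run_once()
out2 = run_once()
# With the seed reset before each run, the outputs match bit-for-bit on CPU
print(torch.equal(out1, out2))
```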
And here is my data loading setup:
from torch.utils.data import DataLoader, random_split

# Set up a generator for reproducibility
g = torch.Generator()
g.manual_seed(42)

def seed_worker(worker_id):
    # Re-seed NumPy and Python's random inside each worker process
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)

# 3. Split train and validation sets
train_subset, val_subset = random_split(train_dataset, [train_size, val_size], generator=g)

# 4. Data Loaders
train_loader = DataLoader(train_subset, batch_size=32, shuffle=True, worker_init_fn=seed_worker, generator=g)
val_loader = DataLoader(val_subset, batch_size=32, shuffle=False, worker_init_fn=seed_worker, generator=g)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False, worker_init_fn=seed_worker, generator=g)
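The batch order itself does seem reproducible for me: re-creating the loader with a freshly seeded generator gives the same shuffle. A toy sketch, with a `TensorDataset` standing in for my actual data:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(100).float())

def first_batch():
    # A fresh generator with the same seed reproduces the shuffle order
    g = torch.Generator()
    g.manual_seed(42)
    loader = DataLoader(dataset, batch_size=8, shuffle=True, generator=g)
    return next(iter(loader))[0]

# Same seed -> same first batch
print(torch.equal(first_batch(), first_batch()))
```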
And here is also the initialization of the weights:
def my_init_weights(m):
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        # Draw weights from a Gaussian distribution
        nn.init.normal_(m.weight, mean=0.0, std=0.01)
        # Bias at 0
        if m.bias is not None:
            nn.init.zeros_(m.bias)

# Apply to the model
model.apply(my_init_weights)
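Just to rule the init out as the culprit: reseeding right before `apply` gives me bit-identical weights. A minimal sketch using a throwaway `Conv2d` layer instead of my real model:

```python
import torch
import torch.nn as nn

def my_init_weights(m):
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.normal_(m.weight, mean=0.0, std=0.01)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

def init_once(seed=42):
    # Reseed, build a throwaway layer, and apply the custom init
    torch.manual_seed(seed)
    model = nn.Conv2d(3, 8, kernel_size=3)  # placeholder layer
    model.apply(my_init_weights)
    return model.weight.detach().clone()

# Same seed -> identical initialized weights
print(torch.equal(init_once(), init_once()))
```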
I know I've been a little overkill, but it's become a matter of principle. Is there still something random that affects my results? Any other tips are appreciated.