Hi,
I am training a small autoencoder on 4 GPUs, but it appears that the GPUs aren't being used properly:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.59       Driver Version: 440.59       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN Xp COLLEC...  Off  | 00000000:02:00.0 Off |                  N/A |
| 25%   40C    P8    12W / 250W |   1031MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN Xp            Off  | 00000000:04:00.0 Off |                  N/A |
| 26%   41C    P8    10W / 250W |    749MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  TITAN Xp            Off  | 00000000:83:00.0 Off |                  N/A |
| 32%   50C    P8    10W / 250W |    743MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  TITAN Xp COLLEC...  Off  | 00000000:84:00.0 Off |                  N/A |
| 28%   45C    P8    13W / 250W |    749MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     25246      C   python                                      1021MiB |
|    1     25246      C   python                                       739MiB |
|    2     25246      C   python                                       733MiB |
|    3     25246      C   python                                       739MiB |
+-----------------------------------------------------------------------------+
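For what it's worth, PyTorch does appear to see all four devices; a quick visibility check along these lines (minimal sketch, run separately from the training script) lists them:

import torch

# Quick sanity check of device visibility (run outside the training script).
print(torch.cuda.is_available())            # expect True
print(torch.cuda.device_count())            # expect 4 on this machine
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))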
My autoencoder class looks like this:
import torch
import torch.nn as nn


class AutoEncoder(nn.Module):
    def __init__(self, n_embedded):
        super(AutoEncoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Linear(6144, n_embedded))
        self.decoder = nn.Sequential(
            nn.Linear(n_embedded, 6144))

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return encoded, decoded
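For reference, a quick CPU sanity check of the forward pass (just a sketch with random data, not part of the training script) gives the expected shapes:

model = AutoEncoder(2048)
x = torch.randn(8, 6144)              # dummy batch of 8 feature vectors
encoded, decoded = model(x)
print(encoded.shape, decoded.shape)   # torch.Size([8, 2048]) torch.Size([8, 6144])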
My dataset is stored in HDF5 format, so I use a custom Dataset class:
import h5py
from torch.utils import data


class Features_Dataset(data.Dataset):
    def __init__(self, archive, phase):
        self.archive = h5py.File(archive, 'r')
        self.labels = self.archive[str(phase) + '_labels']
        self.data = self.archive[str(phase) + '_all_arrays']
        self.img_paths = self.archive[str(phase) + '_img_paths']

    def __getitem__(self, index):
        datum = self.data[index]
        label = self.labels[index]
        path = self.img_paths[index]
        return datum, label, path

    def __len__(self):
        return len(self.data)

    def close(self):
        self.archive.close()
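For completeness, the DataLoaders are built roughly like this (the HDF5 path and batch size below are placeholders rather than my exact values):

datasets = {phase: Features_Dataset('features.hdf5', phase)   # placeholder path
            for phase in ['train', 'val']}
dataloaders_dict = {phase: data.DataLoader(datasets[phase],
                                           batch_size=128,    # placeholder batch size
                                           shuffle=(phase == 'train'),
                                           num_workers=0)
                    for phase in ['train', 'val']}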
I initialize and train/evaluate my model like this:
device = torch.device("cuda")
if torch.cuda.device_count() > 1:
    print("Let's use", torch.cuda.device_count(), "GPUs!")

model = AutoEncoder(2048)
nn.DataParallel(model)
model.to(device)

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), weight_decay=1e-5)

for epoch in range(args.start_epoch, args.num_epochs + 1):
    train_loss = 0
    model.train()
    for i, (inputs, labels, paths) in enumerate(dataloaders_dict['train']):
        inputs = inputs.to(device)
        inputs = inputs.view(-1, 6144)
        optimizer.zero_grad()
        # ===================forward=====================
        encoded, decoded = model(inputs)
        loss = criterion(decoded, inputs)
        # ===================backward====================
        loss.backward()
        train_loss += loss.item()

    model.eval()
    with torch.no_grad():
        val_loss = 0
        for i, (inputs, labels, paths) in enumerate(dataloaders_dict['val']):
            inputs = inputs.to(device)
            inputs = inputs.view(-1, 6144)
            encoded, decoded = model(inputs)
            val_loss += criterion(decoded, inputs).item()
I'm not sure whether the issue is my dataset class, whether the GPUs are actually being used properly at all, or whether I have placed my model evaluation in an inconvenient place…
I should also mention that num_workers for the DataLoaders is set to 0.
Cheers,
Taran