I am trying to implement the Fourier-MIONet architecture from this paper ([2303.04778] Fourier-MIONet: Fourier-enhanced multiple-input neural operators for multiphase modeling of geological carbon sequestration). In this architecture, the batch size changes during the forward pass. Below is the code for the network up to the branch-trunk operation (please refer to Table 2 in the paper).
```python
def forward(self, xfield, xscalars, xtime):
    batchsize_x = xfield.shape[0]
    size_x, size_y = xfield.shape[1], xfield.shape[2]
    size_t = xtime.shape[-1]

    # ==================================== MIOnet ====================================
    x_b1 = self.fc_b1(xfield)                 # [bs, nz, nx, width]
    x_b2 = self.fc_b2(xscalars)               # [bs, width]
    x_branch = x_b1 + x_b2[:, None, None, :]  # [bs, nz, nx, width]
    x_trunk = self.fc_t1(xtime)               # [nt, width]

    x_branch = x_branch.unsqueeze(1)
    x_branch = x_branch.repeat(1, size_t, 1, 1, 1)  # [bs, nt, nz, nx, width]

    # Reshape trunk output to match the dimensions of the branch output
    x_trunk = x_trunk.unsqueeze(0).unsqueeze(-2).unsqueeze(-2)  # [1, nt, 1, 1, width]

    # Multiply branch and trunk output, then fold time into the batch dimension
    x = x_branch * x_trunk
    x = x.reshape((batchsize_x * size_t, size_x, size_y, -1))

    # ==================================== Fourier layers ====================================
    x = x.reshape((batchsize_x, size_t, size_x, size_y, -1))
    return x
```
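The branch-trunk combination above can be sketched standalone with dummy layers and sizes. All dimensions and input widths here are assumptions for illustration (not taken from the paper); the point is only to check that the broadcasting produces the commented shapes:

```python
import torch
import torch.nn as nn

# Hypothetical sizes and input channel counts, for illustration only.
bs, nz, nx, nt, width = 2, 8, 8, 5, 16

fc_b1 = nn.Linear(3, width)  # field branch (3 input channels assumed)
fc_b2 = nn.Linear(4, width)  # scalar branch (4 scalars assumed)
fc_t1 = nn.Linear(1, width)  # trunk network on time

xfield = torch.randn(bs, nz, nx, 3)
xscalars = torch.randn(bs, 4)
xtime = torch.randn(nt, 1)

x_b1 = fc_b1(xfield)                      # [bs, nz, nx, width]
x_b2 = fc_b2(xscalars)                    # [bs, width]
x_branch = x_b1 + x_b2[:, None, None, :]  # broadcast add -> [bs, nz, nx, width]
x_trunk = fc_t1(xtime)                    # [nt, width]

x_branch = x_branch.unsqueeze(1).repeat(1, nt, 1, 1, 1)     # [bs, nt, nz, nx, width]
x_trunk = x_trunk.unsqueeze(0).unsqueeze(-2).unsqueeze(-2)  # [1, nt, 1, 1, width]

x = x_branch * x_trunk                    # [bs, nt, nz, nx, width]
x = x.reshape(bs * nt, nz, nx, -1)        # fold time into batch for the Fourier layers
print(x.shape)                            # torch.Size([10, 8, 8, 16])
```

This is where the batch dimension seen by the Fourier layers becomes `bs * nt`, which matters for how DataParallel splits and gathers tensors.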
The model works on a single GPU. However, I cannot use it with PyTorch's DataParallel option: with 4 GPUs and a batch size of 4, the model's output is just one sample instead of 4.
```python
for x, s, y in train_loader:
    x, s, y = x.to(device), s.to(device), y.to(device)
    t = train_t.to(device)
    optimizer.zero_grad()
    pred = dp_model(x, s, t)
```
When I tested this with 4 GPUs and a batch size of 4 for `train_loader`, `pred` contained only 1 sample.
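One thing worth checking (a sketch, not a confirmed diagnosis): `nn.DataParallel` scatters *every* tensor argument along dim 0. `x` has a batch dimension, but a time tensor shaped `[nt]` does not, so with 4 GPUs each replica would receive only about `nt/4` time points alongside its batch slice. A common workaround is to give the time tensor a leading batch dimension via `expand`, so each replica's chunk still carries the full time grid; the names below follow the question:

```python
import torch

bs, nt = 4, 10
train_t = torch.linspace(0.0, 1.0, nt)           # [nt] - no batch axis
t_batched = train_t.unsqueeze(0).expand(bs, nt)  # [bs, nt] - scattered along dim 0

# DataParallel now splits t_batched the same way it splits x, and inside
# forward() the shared time grid can be recovered from the per-replica chunk:
#   xtime = xtime[0]  # [nt]
assert torch.equal(t_batched[0], train_t)
```

`expand` creates a broadcast view rather than copying data, so this costs essentially nothing per batch.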