I have a PyTorch model whose forward pass looks roughly like this:

    def forward(self, x):
        # following two encoders could be run in parallel
        lidar_features = self.lidar_encoder(x['pointcloud'])
        camera_features = self.camera_encoder(x['images'])
        # need to sync here
        combined_features = torch.stack((lidar_features, camera_features))
        predictions = self.prediction_head(combined_features)
        return predictions

If the model is in eval mode, is PyTorch 2 smart enough to know that the lidar encoder and the camera encoder can run at the same time on the GPU, and that a sync then needs to be inserted before the torch.stack? Or will the kernels be run in the serial order of the Python code?
What about PyTorch 1.x?
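
For reference, this is roughly what I imagine the manual version would look like if the framework does not do it for me. It is only a sketch using explicit CUDA streams, not a benchmarked solution: `TwoStreamModel`, the `nn.Linear` layers, and all tensor sizes are made-up placeholders for my real encoders, and it falls back to plain serial execution when the inputs are not on a GPU.

```python
import torch
import torch.nn as nn


class TwoStreamModel(nn.Module):
    """Toy stand-in for the real model; layers and sizes are hypothetical."""

    def __init__(self):
        super().__init__()
        self.lidar_encoder = nn.Linear(16, 8)
        self.camera_encoder = nn.Linear(32, 8)
        self.prediction_head = nn.Linear(8, 4)

    def forward(self, x):
        if not x['pointcloud'].is_cuda:
            # CPU fallback: no streams, ordinary serial execution.
            lidar_features = self.lidar_encoder(x['pointcloud'])
            camera_features = self.camera_encoder(x['images'])
        else:
            current = torch.cuda.current_stream()
            lidar_stream = torch.cuda.Stream()
            camera_stream = torch.cuda.Stream()
            # The side streams must not start before the inputs are ready.
            lidar_stream.wait_stream(current)
            camera_stream.wait_stream(current)
            with torch.cuda.stream(lidar_stream):
                lidar_features = self.lidar_encoder(x['pointcloud'])
            with torch.cuda.stream(camera_stream):
                camera_features = self.camera_encoder(x['images'])
            # Sync point: the current stream waits for both encoders.
            current.wait_stream(lidar_stream)
            current.wait_stream(camera_stream)
            # Tell the caching allocator these tensors are used on `current`.
            lidar_features.record_stream(current)
            camera_features.record_stream(current)
        combined_features = torch.stack((lidar_features, camera_features))
        return self.prediction_head(combined_features)


device = "cuda" if torch.cuda.is_available() else "cpu"
model = TwoStreamModel().to(device).eval()
x = {
    'pointcloud': torch.randn(3, 16, device=device),
    'images': torch.randn(3, 32, device=device),
}
with torch.no_grad():
    predictions = model(x)
```

Even with separate streams, the kernels only overlap if each one leaves enough of the GPU idle for the other, so I would rather not maintain this by hand if PyTorch can already do it.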