RuntimeError: 1only batches of spatial targets supported (3D tensors) but got targets of size: : [1, 3, 96, 128]

For multi-class segmentation with nn.CrossEntropyLoss or nn.NLLLoss, the target tensor should have the shape [batch_size, height, width] (no channel dimension), use dtype torch.long, and contain the class indices in the range [0, nb_classes-1].
If your current target is an RGB image containing color codes for each class, you would have to map these colors to class indices first, e.g. by using a lookup table.
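
As a rough sketch of such a mapping, something like the following might work. Note that `palette`, `rgb_to_class_indices`, and the three colors are made up for illustration, so you would have to replace them with the color codes of your dataset. I'm also assuming the target is a uint8 tensor of shape [batch_size, 3, height, width], as in your error message.

```python
import torch
import torch.nn as nn

# Hypothetical lookup table: row i holds the RGB color used for class i.
# These colors are an assumption; use the ones from your dataset.
palette = torch.tensor([
    [0, 0, 0],      # class 0
    [255, 0, 0],    # class 1
    [0, 255, 0],    # class 2
], dtype=torch.uint8)


def rgb_to_class_indices(target_rgb, palette):
    """Map an RGB target [N, 3, H, W] to class indices [N, H, W] (torch.long)."""
    n, _, h, w = target_rgb.shape
    # Pixels whose color is not in the palette keep the initial value 0.
    target = torch.zeros(n, h, w, dtype=torch.long)
    for idx, color in enumerate(palette):
        # True where all three channels match this class color.
        mask = (target_rgb == color.view(1, 3, 1, 1)).all(dim=1)
        target[mask] = idx
    return target


# Build a fake RGB target of shape [1, 3, 96, 128] from random class indices.
nb_classes = palette.size(0)
class_map = torch.randint(0, nb_classes, (1, 96, 128))
target_rgb = palette[class_map].permute(0, 3, 1, 2)

target = rgb_to_class_indices(target_rgb, palette)  # [1, 96, 128]
output = torch.randn(1, nb_classes, 96, 128)        # model logits [N, C, H, W]
loss = nn.CrossEntropyLoss()(output, target)
```

The loop over the palette is fine for a handful of classes. In this sketch, pixels with a color missing from the palette silently end up as class 0, so you might want to check your targets for unexpected colors first.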