In my model, I take two tracks (track1, track2) as input and I have an interaction label (1 or 0). During a training epoch, when I print my target, I get something like torch.tensor([1,0]) or torch.tensor([0,0]), etc.

How does it convert my single number into a two-element vector with different values? I’d expect label 1 to become [1,1] and label 0 to become [0,0]. I don’t see where [0,1] or [1,0] would come from.

I’m not sure I understand the question, since you define the model architecture and are responsible for interpreting the model output. Based on your description, I guess you are using two output neurons, applying a sigmoid to them, and dealing with a multi-label classification use case?
If that’s the case, each output is treated as its own binary decision: the model would try to output a high value (close to 1) if the class is active, and a low value (close to 0) otherwise. That is why mixed results such as [1,0] or [0,1] are possible.
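
To illustrate the multi-label guess above, here is a minimal sketch. Everything in it is an assumption and not taken from your code: the nn.Linear layer stands in for your model, the 8 input features and the batch of 4 samples are made up, and nn.BCEWithLogitsLoss is used as the loss (it applies the sigmoid internally).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real model: 8 input features, 2 output neurons.
model = nn.Linear(8, 2)
criterion = nn.BCEWithLogitsLoss()  # applies the sigmoid internally

# Made-up batch of 4 samples.
features = torch.randn(4, 8)
# Multi-label targets: each column is an independent 0/1 label,
# so rows such as [1, 0] or [0, 1] are valid.
targets = torch.tensor([[1.0, 0.0],
                        [0.0, 0.0],
                        [0.0, 1.0],
                        [1.0, 1.0]])

logits = model(features)            # raw scores, shape [4, 2]
loss = criterion(logits, targets)   # one binary cross-entropy term per output

probs = torch.sigmoid(logits)       # per-class probabilities in (0, 1)
preds = (probs > 0.5).long()        # threshold each class separately
print(preds.shape)                  # torch.Size([4, 2])
```

Since each output neuron is thresholded on its own, nothing forces the two entries in a row to agree. If you only have a single interaction label per pair of tracks, a single output neuron with a target of shape [batch_size, 1] might be closer to what you want.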