Multiple output heads

As explained before, you could split each input batch by its targets and forward the samples to the corresponding head during training. However, this won’t work during testing, when the targets are not available, so you would have to come up with a strategy for selecting the head at that point, and be clear about why you want to use different heads in the first place.
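A minimal sketch of the training-time routing, assuming PyTorch and a made-up setup: four classes, where classes 0-1 go to head 0 and classes 2-3 go to head 1. The `MultiHeadNet` name, the layer sizes, and the target-to-head mapping are all hypothetical and only illustrate the idea.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiHeadNet(nn.Module):
    """Shared trunk with several output heads (hypothetical layout)."""

    def __init__(self, in_features=8, hidden=16, out_features=2, num_heads=2):
        super().__init__()
        self.out_features = out_features
        self.trunk = nn.Sequential(nn.Linear(in_features, hidden), nn.ReLU())
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, out_features) for _ in range(num_heads)]
        )

    def forward(self, x, head_idx):
        # head_idx holds one head index per sample, shape [batch_size]
        features = self.trunk(x)
        out = features.new_zeros(x.size(0), self.out_features)
        for i, head in enumerate(self.heads):
            mask = head_idx == i
            if mask.any():
                # Only the samples assigned to head i pass through it
                out[mask] = head(features[mask])
        return out


torch.manual_seed(0)
model = MultiHeadNet()
x = torch.randn(6, 8)
targets = torch.tensor([0, 1, 2, 3, 0, 2])

# Assumed mapping: classes 0-1 -> head 0, classes 2-3 -> head 1
head_idx = (targets >= 2).long()
# Each head predicts a class index local to its own group (0 or 1)
local_targets = targets - 2 * head_idx

out = model(x, head_idx)
loss = F.cross_entropy(out, local_targets)
loss.backward()
```

Note that `head_idx` is computed from `targets` here, which is exactly the part that is unavailable at test time. For inference, `head_idx` would have to come from somewhere else, e.g. a known attribute of the sample or a separate model that predicts it.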