I followed this guide and trained the model. Then I saved the full model and tried to use it for inference, but the loaded model outputs a tuple of 4 tensors: 3 tensors whose length equals the number of actions, and one tensor holding a single value. I can’t figure out what they are.

I’m still a beginner, so I might be missing some basics. From the guide I understood that the model is trained to output mean and scale values that define a distribution (specifically a normal distribution). Through experiments I figured out that the first 2 tensors are the mean and scale; I used them to build the distribution and it seemed to work. But what’s the purpose of the other 2 tensors?
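
To make the question concrete, here is a minimal sketch of roughly what I am doing with the first 2 tensors. It assumes PyTorch; the `outputs` tuple is a hand-made placeholder standing in for what my loaded model returns (the values and the 3-action size are made up), and treating entries 0 and 1 as mean and scale is my guess from experiments, not something I found documented.

```python
import torch
from torch.distributions import Normal

# Placeholder for the 4-tensor tuple returned by the loaded model.
# Values and sizes are made up for illustration only.
num_actions = 3
outputs = (
    torch.zeros(num_actions),  # assumed: mean of the distribution
    torch.ones(num_actions),   # assumed: scale (std), must be positive
    torch.zeros(num_actions),  # unknown: third tensor, same length
    torch.tensor(0.0),         # unknown: tensor with a single value
)

# Build a normal distribution from the first two tensors.
mean, scale = outputs[0], outputs[1]
dist = Normal(mean, scale)

# Draw one action per action dimension.
action = dist.sample()
```

Sampling this way gives me one value per action dimension, which is why I believe the first 2 tensors are the mean and scale; the last 2 entries of the tuple are never used here.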