Cannot sample n_sample > prob_dist.size(-1) samples without replacement

Hello. I am trying to use imitation learning. However, when I try to load the recorded .demo file, the following torch-related error appears. Thanks.

RuntimeError: cannot sample n_sample > prob_dist.size(-1) samples without replacement

ERROR LOG BELOW
------------------------------
Version information:
ml-agents: 0.27.0,
ml-agents-envs: 0.27.0,
Communicator API: 1.5.0,
PyTorch: 1.9.0+cu111
[INFO] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
[INFO] Connected to Unity environment with package version 2.0.0-pre.3 and communication version 1.5.0
[INFO] Connected new brain: ZARAgoal?team=1
[WARNING] Deleting TensorBoard data events.out.tfevents.1624485266.AndreasPC.18212.0 that was left over from a previous run.
[INFO] Hyperparameters for behavior name ZARAgoal:
trainer_type: ppo
hyperparameters:
batch_size: 128
buffer_size: 2048
learning_rate: 0.0003
beta: 0.01
epsilon: 0.2
lambd: 0.95
num_epoch: 3
learning_rate_schedule: linear
network_settings:
normalize: False
hidden_units: 256
num_layers: 2
vis_encode_type: simple
memory: None
goal_conditioning_type: hyper
reward_signals:
extrinsic:
gamma: 0.99
strength: 1.0
network_settings:
normalize: False
hidden_units: 128
num_layers: 2
vis_encode_type: simple
memory: None
goal_conditioning_type: hyper
gail:
gamma: 0.99
strength: 0.01
network_settings:
normalize: False
hidden_units: 128
num_layers: 2
vis_encode_type: simple
memory: None
goal_conditioning_type: hyper
learning_rate: 0.0003
encoding_size: None
use_actions: False
use_vail: False
demo_path: Demos/ZARAdemos/
init_path: None
keep_checkpoints: 5
checkpoint_interval: 500000
max_steps: 100000
time_horizon: 64
summary_freq: 60000
threaded: False
self_play: None
behavioral_cloning:
demo_path: Demos/ZARAdemos/
steps: 50000
strength: 1.0
samples_per_update: 0
num_epoch: None
batch_size: None
d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\nn\init.py:388: UserWarning: Initializing zero-element tensors is a no-op
warnings.warn(“Initializing zero-element tensors is a no-op”)
d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\nn\init.py:426: UserWarning: Initializing zero-element tensors is a no-op
warnings.warn(“Initializing zero-element tensors is a no-op”)
Traceback (most recent call last):
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\trainer_controller.py”, line 176, in start_learning
n_steps = self.advance(env_manager)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py”, line 305, in wrapped
return func(*args, **kwargs)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\trainer_controller.py”, line 234, in advance
new_step_infos = env_manager.get_steps()
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\env_manager.py”, line 124, in get_steps
new_step_infos = self._step()
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\subprocess_env_manager.py”, line 298, in _step
self._queue_steps()
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\subprocess_env_manager.py”, line 291, in _queue_steps
env_action_info = self._take_step(env_worker.previous_step)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py”, line 305, in wrapped
return func(*args, **kwargs)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\subprocess_env_manager.py”, line 429, in _take_step
all_action_info[brain_name] = self.policies[brain_name].get_action(
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\policy\torch_policy.py”, line 212, in get_action
run_out = self.evaluate(decision_requests, global_agent_ids)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py”, line 305, in wrapped
return func(*args, **kwargs)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\policy\torch_policy.py”, line 178, in evaluate
action, log_probs, entropy, memories = self.sample_actions(
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py”, line 305, in wrapped
return func(*args, **kwargs)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\policy\torch_policy.py”, line 140, in sample_actions
actions, log_probs, entropies, memories = self.actor.get_action_and_stats(
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\networks.py”, line 626, in get_action_and_stats
action, log_probs, entropies = self.action_model(encoding, masks)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\nn\modules\module.py”, line 1051, in _call_impl
return forward_call(*input, **kwargs)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\action_model.py”, line 194, in forward
actions = self._sample_action(dists)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\action_model.py”, line 84, in _sample_action
discrete_action.append(discrete_dist.sample())
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\distributions.py”, line 114, in sample
return torch.multinomial(self.probs, 1)
RuntimeError: cannot sample n_sample > prob_dist.size(-1) samples without replacement

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “C:\Users\antre\AppData\Local\Programs\Python\Python39\lib\runpy.py”, line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File “C:\Users\antre\AppData\Local\Programs\Python\Python39\lib\runpy.py”, line 87, in _run_code
exec(code, run_globals)
File "D:\Desktop\Crowds-and-ML-Agents\venv\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\learn.py”, line 250, in main
run_cli(parse_command_line())
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\learn.py”, line 246, in run_cli
run_training(run_seed, options)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\learn.py”, line 125, in run_training
tc.start_learning(env_manager)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py”, line 305, in wrapped
return func(*args, **kwargs)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\trainer_controller.py”, line 201, in start_learning
self._save_models()
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py”, line 305, in wrapped
return func(*args, **kwargs)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\trainer_controller.py”, line 80, in _save_models
self.trainers[brain_name].save_model()
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py”, line 185, in save_model
model_checkpoint = self._checkpoint()
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py”, line 305, in wrapped
return func(*args, **kwargs)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py”, line 157, in _checkpoint
export_path, auxillary_paths = self.model_saver.save_checkpoint(
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\model_saver\torch_model_saver.py”, line 59, in save_checkpoint
self.export(checkpoint_path, behavior_name)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\model_saver\torch_model_saver.py”, line 64, in export
self.exporter.export_policy_model(output_filepath)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\model_serialization.py”, line 159, in export_policy_model
torch.onnx.export(
File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\onnx\__init__.py", line 275, in export
return utils.export(model, args, f, export_params, verbose, training,
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\onnx\utils.py”, line 88, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\onnx\utils.py”, line 689, in _export
_model_to_graph(model, args, verbose, input_names,
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\onnx\utils.py”, line 458, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args,
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\onnx\utils.py”, line 422, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\onnx\utils.py”, line 373, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\jit\_trace.py”, line 1160, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\nn\modules\module.py”, line 1051, in _call_impl
return forward_call(*input, **kwargs)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\jit\_trace.py”, line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\jit\_trace.py”, line 118, in wrapper
outs.append(self.inner(*trace_inputs))
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\nn\modules\module.py”, line 1051, in _call_impl
return forward_call(*input, **kwargs)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\nn\modules\module.py”, line 1039, in _slow_forward
result = self.forward(*input, **kwargs)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\networks.py”, line 664, in forward
) = self.action_model.get_action_out(encoding, masks)
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\action_model.py”, line 171, in get_action_out
discrete_out_list = [
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\action_model.py”, line 172, in <listcomp>
discrete_dist.exported_model_output()
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\distributions.py”, line 136, in exported_model_output
return self.sample()
File “d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\distributions.py”, line 114, in sample
return torch.multinomial(self.probs, 1)
RuntimeError: cannot sample n_sample > prob_dist.size(-1) samples without replacement

This error is raised when n_sample is larger than the number of categories in the probability tensor passed to torch.multinomial (prob_dist.size(-1)), so sampling fails when replacement is set to False.
Check the shape and values of self.probs and make sure that sampling is possible, i.e. that the last dimension holds at least n_sample entries.
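
To illustrate, here is a minimal sketch of how the same message can be reproduced outside of ML-Agents. It assumes that self.probs ends up with an empty last dimension, i.e. shape (batch, 0); the tensors below are made up for the example and are not taken from the log.

```python
import torch

# Assumption: the distribution has no categories at all, so the
# probability tensor has shape (batch, 0).
probs = torch.empty(1, 0)

try:
    # Same kind of call as in distributions.py: draw one sample per row.
    torch.multinomial(probs, 1)
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print("number of categories:", probs.shape[-1])
print("error:", error_message)

# With at least one category, drawing a single sample succeeds.
sample = torch.multinomial(torch.ones(1, 1), 1)
print("sample shape:", tuple(sample.shape))
```

Printing self.probs.shape right before the multinomial call should show whether the last dimension is 0 in your setup.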

Hello, thanks for your reply. I finally found the solution to the problem. It was an ML-Agents setting: I had set the branch size to 0 instead of 1. I am posting it in case someone encounters the same issue 🙂
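
For anyone who wants to catch this before training, a small sanity check along these lines might help. This is only a sketch: find_invalid_branches is a hypothetical helper, not part of ML-Agents, and it assumes you copy the discrete branch sizes from your agent's settings into a plain list by hand.

```python
from typing import List, Sequence


def find_invalid_branches(branch_sizes: Sequence[int]) -> List[int]:
    """Return the indices of discrete branches with fewer than one action.

    Hypothetical helper: a branch of size 0 leaves nothing to sample from.
    """
    return [index for index, size in enumerate(branch_sizes) if size < 1]


# Branch sizes copied by hand from the agent's settings (example values).
broken_config = [0]
fixed_config = [1]

print(find_invalid_branches(broken_config))
print(find_invalid_branches(fixed_config))
```

An empty result means every branch has at least one action to choose from.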