0%| | 0/20000 [1:09:45<?, ?it/s]
RuntimeError Traceback (most recent call last)
in <cell line: 7>()
11 # We re-compute it at each epoch as its value depends on the value
12 # network which is updated in the inner loop.
---> 13 advantage_module(tensordict_data)
14 data_view = tensordict_data.reshape(-1)
15 replay_buffer.extend(data_view.cpu())
7 frames
/usr/local/lib/python3.10/dist-packages/torchrl/objectives/value/functional.py in vec_generalized_advantage_estimate(gamma, lmbda, state_value, next_state_value, reward, done, terminated, time_dim)
299 == terminated.shape
300 ):
--> 301 raise RuntimeError(SHAPE_ERR)
302 dtype = state_value.dtype
303 *batch_size, time_steps, lastdim = terminated.shape
RuntimeError: All input tensors (value, reward and done states) must share a unique shape.
time: 2.52 s (started: 2024-04-24 04:20:38 +00:00)
How can I edit the tensordict so that the value, reward, and done entries all have the same shape?
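To narrow this down, it may help to first find out which entry has the odd shape. The check that raises in the traceback compares the shapes of `state_value`, `next_state_value`, `reward`, `done` and `terminated`. Below is a minimal diagnostic sketch. The helper `report_shape_mismatches` is hypothetical (it is not part of TorchRL), and the shapes in the example are made up for illustration. It accepts any objects with a `.shape` attribute, so stand-ins are used here in place of real tensors.

```python
from collections import Counter
from types import SimpleNamespace


def report_shape_mismatches(tensors):
    """Return {name: shape} for entries whose shape differs from the most common one.

    `tensors` maps a label to any object exposing `.shape` (e.g. a torch.Tensor).
    Hypothetical helper for debugging; not a TorchRL function.
    """
    shapes = {name: tuple(t.shape) for name, t in tensors.items()}
    # Treat the most frequent shape as the reference that the others should match.
    reference, _ = Counter(shapes.values()).most_common(1)[0]
    return {name: shape for name, shape in shapes.items() if shape != reference}


# Stand-ins for the five inputs compared by the failing check.
# The shapes are illustrative assumptions, not taken from the post.
example = {
    "state_value": SimpleNamespace(shape=(1000, 1)),
    "next_state_value": SimpleNamespace(shape=(1000, 1)),
    "reward": SimpleNamespace(shape=(1000,)),  # missing trailing dimension
    "done": SimpleNamespace(shape=(1000, 1)),
    "terminated": SimpleNamespace(shape=(1000, 1)),
}

mismatches = report_shape_mismatches(example)
print(mismatches)
```

With real data, the same helper could be given the tensors read from `tensordict_data`, assuming the default key names (`"state_value"`, `("next", "state_value")`, `("next", "reward")`, `("next", "done")`, `("next", "terminated")`). If the mismatch turns out to be a missing trailing dimension, as in the made-up example, adding it with `unsqueeze(-1)` on that entry before calling `advantage_module` is a common fix, though correcting the shape where the environment produces it is usually cleaner.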