Describe the bug
When you get a tensordict rollout of shape (N_envs, N_steps, C, H, W)
out of a collector and you want to apply an advantage module that starts with conv2d
layers:
- directly applying the module will crash, with the `conv2d` layer complaining about the input size, e.g. `RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [2, 128, 4, 84, 84]`
- flattening the tensordict first with `rollout.reshape(-1)` so that it has shape `[B, C, H, W]` and then calling the advantage module will run, but issues the warning `torchrl/objectives/value/advantages.py:99: UserWarning: Got a tensordict without a time-marked dimension, assuming time is along the last dimension.`, leaving you unsure of whether the advantages were computed correctly.
So it's not clear how one should proceed.
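For reference, here is a minimal plain-PyTorch sketch of the shape problem and one possible workaround: merge the leading `(N_envs, N_steps)` dimensions only around the conv layers, then restore them so the time dimension is not lost. The sizes and the small `conv` stack are hypothetical stand-ins for the real rollout and value network, and this may not be the approach TorchRL intends.

```python
import torch
from torch import nn

# Hypothetical sizes standing in for a (N_envs, N_steps, C, H, W) rollout.
n_envs, n_steps, c, h, w = 2, 8, 4, 84, 84
pixels = torch.randn(n_envs, n_steps, c, h, w)

# Assumed stand-in for the first layers of the value network.
conv = nn.Sequential(
    nn.Conv2d(c, 16, kernel_size=8, stride=4),
    nn.ReLU(),
    nn.Flatten(),  # flattens everything after the batch dimension
)

# Feeding the 5D tensor directly reproduces the conv2d input-size error.
try:
    conv(pixels)
    raised = False
except RuntimeError:
    raised = True

# Merge the two leading dims, apply the conv stack, then restore them.
flat = pixels.flatten(0, 1)  # (n_envs * n_steps, C, H, W)
features = conv(flat).reshape(n_envs, n_steps, -1)

print(raised, tuple(features.shape))
```

Because the flatten is row-major, element `env * n_steps + t` of `flat` is `pixels[env, t]`, so reshaping back recovers the per-environment trajectories in their original order.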
- I have checked that there is no similar issue in the repo (required)
- I have read the documentation (required)
- I have provided a minimal working example to reproduce the bug (required)