[BUG] It's not clear how to call an advantage module with batched envs and pixel observations. #1522

## Describe the bug

When you get a rollout tensordict out of a collector with pixel observations of shape `(N_envs, N_steps, C, H, W)` and you want to apply an advantage module that starts with `conv2d` layers:
1. Applying the module directly crashes, with the `conv2d` layer complaining about the input size, e.g. `RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [2, 128, 4, 84, 84]`
2. Flattening the tensordict first with `rollout.reshape(-1)`, so that the observations have shape `[B, C, H, W]`, and then calling the advantage module runs but issues the warning `torchrl/objectives/value/advantages.py:99: UserWarning: Got a tensordict without a time-marked dimension, assuming time is along the last dimension.`, leaving you unsure whether the advantages were computed correctly.
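
To illustrate why the warning in point 2 is worrying, here is a small pure-Python sketch of a textbook GAE recursion. It is not TorchRL's implementation, and the `gae` helper and the toy numbers are made up for illustration. It only shows that, when no `done` flag separates the environments, treating the flattened batch as one long trajectory lets the estimate for one environment leak into another. Whether a real rollout is affected depends on the done/truncated flags it carries.

```python
def gae(rewards, values, next_values, dones, gamma=0.99, lmbda=0.95):
    """Textbook GAE along a single time-ordered trajectory (illustrative helper)."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error at time t.
        delta = rewards[t] + gamma * next_values[t] * not_done - values[t]
        # Discounted accumulation of the TD errors that follow in time.
        running = delta + gamma * lmbda * not_done * running
        advantages[t] = running
    return advantages


# Toy data: two environments, three steps each, no episode end in the rollout.
rewards = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
values = [[0.5, 0.5, 0.5], [0.2, 0.2, 0.2]]
next_values = [[0.5, 0.5, 0.5], [0.2, 0.2, 0.2]]
dones = [[False] * 3, [False] * 3]

# Time handled per environment: each row is its own trajectory.
per_env = [
    gae(r, v, nv, d) for r, v, nv, d in zip(rewards, values, next_values, dones)
]

# Flattened batch treated as one trajectory of length N_envs * N_steps.
flat = gae(sum(rewards, []), sum(values, []), sum(next_values, []), sum(dones, []))

print("env 0, per-env :", per_env[0])
print("env 0, flattened:", flat[:3])
```

In this toy case the two results for environment 0 differ, because the recursion on the flattened data keeps accumulating across the boundary between the two environments.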

So it's not clear how one should proceed.
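
One possible workaround, sketched here in plain PyTorch only, is to flatten the leading `(N_envs, N_steps)` dimensions just around the convolutional network and restore them afterwards, so that whatever consumes the values still sees a time dimension. The `cnn` below is a hypothetical stand-in for the value network, and this sketch makes no claim about what the advantage module itself supports.

```python
import torch
from torch import nn

n_envs, n_steps = 2, 8
# Dummy pixel observations shaped like the rollout: (N_envs, N_steps, C, H, W).
pixels = torch.rand(n_envs, n_steps, 4, 84, 84)

# Hypothetical value network: a conv layer followed by a linear head.
# With an 84x84 input, kernel 8 and stride 4 give a 20x20 feature map.
cnn = nn.Sequential(
    nn.Conv2d(4, 8, kernel_size=8, stride=4),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 20 * 20, 1),
)

# Merge env and time dims so conv2d receives a 4D batch: (N_envs * N_steps, C, H, W).
flat = pixels.flatten(0, 1)

# Run the network, then split the batch dim back into (N_envs, N_steps).
values = cnn(flat).unflatten(0, (n_envs, n_steps))

print(values.shape)  # (N_envs, N_steps, 1)
```

The point of the design is that only the network call sees the merged batch; the time ordering within each environment is untouched.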

- [x] I have checked that there is no similar issue in the repo (**required**)
- [x] I have read the [documentation](https://github.com/pytorch/rl/tree/main/docs/) (**required**)
- [x] I have provided a minimal working example to reproduce the bug (**required**)

