Grid Sample #20538

josauder · 2024-04-02T17:21:23Z

It would be great to have an equivalent of torch.nn.functional.grid_sample in JAX. This is widely used in 3D vision (view synthesis, re-projection to other camera positions, etc.). My understanding is that currently, to do something similar in JAX, one would have to implement it from scratch like this TensorFlow example, which seems verbose and slow, whereas the PyTorch version seems to do this in a single CUDA kernel.

Thank you for your consideration!

jakevdp · 2024-04-02T17:31:30Z

Maintainer

Hi - the PyTorch grid_sample documentation link gives a 404, and I can't seem to find any other reference to what the grid_sample function does. Can you describe more fully what API you're interested in?

6 replies

jakevdp Apr 2, 2024
Maintainer

Thanks - I haven't encountered this before. In the simplest case, it seems like it would be fairly straightforward to construct this in terms of a vmapped scatter operation. If you wanted a custom kernel, it could probably be implemented in terms of Pallas for GPU and/or TPU.

I'm not aware of anyone working on this type of application in JAX.

josauder Apr 2, 2024
Author

I see - I take it as a respectful "no" then? Thanks for your response!

jakevdp Apr 2, 2024
Maintainer

I don't think it's a "no", but I also don't think anyone on the team would push things off their TODO list in order to implement this. I suspect we'd accept contributions, though.

zz-f-g Jul 30, 2024

I think you can work around this using jax.scipy.ndimage.map_coordinates:

import jax
import jax.numpy as jnp
from jax import vmap
from jax.scipy.ndimage import map_coordinates


def grid_sample_jax(input, grid):
    # input: [B, C, Hi, Wi]; grid: [B, Ho, Wo, 2] holding (x, y) in [-1, 1]
    assert isinstance(input, jax.Array)
    assert isinstance(grid, jax.Array)
    assert len(input.shape) == 4
    assert len(grid.shape) == 4
    assert input.shape[0] == grid.shape[0]
    assert grid.shape[-1] == 2
    B, C, Hi, Wi = input.shape
    _, Ho, Wo, _ = grid.shape

    # Flip (x, y) to (row, col), then map [-1, 1] to pixel indices [0, size - 1]
    coordinates = (
        (jnp.flip(grid, axis=-1) + 1.0) / 2.0 * jnp.array([Hi - 1.0, Wi - 1.0]).reshape(1, 1, 1, 2)
    )
    # Bilinear (order=1) sampling of one channel; out-of-range points read as 0
    bilinear_sample_grey = lambda grey, coords: map_coordinates(
        grey, coords.reshape(-1, 2).transpose(), order=1
    )
    # Map over channels (shared coordinates), then over the batch
    bilinear_sample_image = vmap(bilinear_sample_grey, in_axes=(0, None))
    return vmap(bilinear_sample_image)(input, coordinates).reshape(B, C, Ho, Wo)
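
One caveat about the scaling used above: `(grid + 1) / 2 * (size - 1)` corresponds to `align_corners=True` in PyTorch, whereas `torch.nn.functional.grid_sample` defaults to `align_corners=False`. A minimal sketch of both mappings follows; `unnormalize` is a hypothetical helper name used only for illustration and is not part of JAX or PyTorch.

```python
import jax.numpy as jnp


def unnormalize(coord, size, align_corners=False):
    """Map normalized coordinates in [-1, 1] to pixel indices along an axis of length `size`.

    Hypothetical helper, for illustration only.
    """
    if align_corners:
        # -1 and +1 refer to the centers of the first and last pixels.
        return (coord + 1.0) / 2.0 * (size - 1)
    # -1 and +1 refer to the outer edges of the first and last pixels.
    return ((coord + 1.0) * size - 1.0) / 2.0


normalized = jnp.array([-1.0, 0.0, 1.0])
aligned = unnormalize(normalized, 4, align_corners=True)     # values: 0.0, 1.5, 3.0
unaligned = unnormalize(normalized, 4, align_corners=False)  # values: -0.5, 1.5, 3.5
```

With `align_corners=False`, the extreme grid values land half a pixel outside the first and last pixel centers, so the choice of padding behavior matters more near the border.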

NILOIDE Nov 5, 2024

I was looking to bilinearly interpolate a grid and obtain the gradients with respect to the input coordinates. The jax.scipy.ndimage.map_coordinates function appears to return zero-filled gradients when calling jax.grad with respect to the coordinates, regardless of the grid values.

I ended up coding my own 2D/3D interpolation function if anyone is interested. Currently only supports 'linear' and 'nearest' interpolation.

import jax.numpy as jnp


def grid_sample(
        image: jnp.ndarray,
        coords: jnp.ndarray,
        mode: str = 'linear',
        index: str = 'ij',
) -> jnp.ndarray:
    """
    Sample an image at arbitrary coordinates.

    Args:
        image: Array of shape [B, H, W, C] or [B, H, W, D, C]
        coords: Array of shape [B, h, w, 2] or [B, h, w, d, 3] containing coordinates in [-1, 1] range
        mode: Interpolation mode ('linear'/'bilinear'/'trilinear' or 'nearest')
        index: Coordinate order, 'ij' (used as-is) or 'xy' (first two components are swapped)
    Returns:
        Interpolated values of shape [B, h, w, C] or [B, h, w, d, C]
    """

    B, *spatial_dims, C = image.shape
    if index == 'xy':  # Careful about how coordinates are swapped in 3D array
        coords = jnp.concatenate((coords[..., 1:2], coords[..., 0:1], coords[..., 2:]), -1)
    elif index != 'ij':
        raise ValueError(f'Unsupported indexing type: {index}')

    # Scale coordinates from [-1, 1] to [0, size - 1] along each spatial axis
    coords = (coords + 1) * (jnp.array(spatial_dims) - 1) / 2
    b_idx = jnp.arange(0, B).reshape(B, *[1]*len(spatial_dims))

    if mode in {'linear', 'bilinear', 'trilinear'}:
        # Get corner coordinates
        i0 = jnp.floor(coords[..., 0]).astype(jnp.int32)
        j0 = jnp.floor(coords[..., 1]).astype(jnp.int32)
        i1 = i0 + 1
        j1 = j0 + 1
        # Clip coordinates to valid range
        i0 = jnp.clip(i0, 0, spatial_dims[0] - 1)
        i1 = jnp.clip(i1, 0, spatial_dims[0] - 1)
        j0 = jnp.clip(j0, 0, spatial_dims[1] - 1)
        j1 = jnp.clip(j1, 0, spatial_dims[1] - 1)

        # Calculate interpolation weights
        wi = coords[..., 0] - i0
        wi = wi[..., None]
        wj = coords[..., 1] - j0
        wj = wj[..., None]
        if len(spatial_dims) == 2:
            output = (
                    image[b_idx, i0, j0] * (1-wi) * (1-wj) +
                    image[b_idx, i1, j0] * wi * (1 - wj) +
                    image[b_idx, i0, j1] * (1 - wi) * wj +
                    image[b_idx, i1, j1] * wi * wj
            )
        else:
            k0 = jnp.floor(coords[..., 2]).astype(jnp.int32)
            k1 = k0 + 1
            k0 = jnp.clip(k0, 0, spatial_dims[2] - 1)
            k1 = jnp.clip(k1, 0, spatial_dims[2] - 1)
            wk = coords[..., 2] - k0
            wk = wk[..., None]
            output = (
                    image[b_idx, i0, j0, k0] * (1 - wi) * (1 - wj) * (1 - wk) +
                    image[b_idx, i1, j0, k0] * wi * (1 - wj) * (1 - wk) +
                    image[b_idx, i0, j1, k0] * (1 - wi) * wj * (1 - wk) +
                    image[b_idx, i0, j0, k1] * (1 - wi) * (1 - wj) * wk +
                    image[b_idx, i1, j0, k1] * wi * (1 - wj) * wk +
                    image[b_idx, i0, j1, k1] * (1 - wi) * wj * wk +
                    image[b_idx, i1, j1, k0] * wi * wj * (1 - wk) +
                    image[b_idx, i1, j1, k1] * wi * wj * wk
            )
    elif mode == 'nearest':
        # Round coordinates to nearest integer
        y = jnp.clip(jnp.round(coords[..., 0]).astype(jnp.int32), 0, spatial_dims[0] - 1)
        x = jnp.clip(jnp.round(coords[..., 1]).astype(jnp.int32), 0, spatial_dims[1] - 1)
        if len(spatial_dims) == 2:
            output = image[b_idx, y, x]
        else:
            z = jnp.clip(jnp.round(coords[..., 2]).astype(jnp.int32), 0, spatial_dims[2] - 1)
            output = image[b_idx, y, x, z]
    else:
        raise ValueError(f"Unsupported interpolation mode: {mode}")
    return output
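
For anyone who wants to check gradients with respect to the sampling coordinates, here is a minimal, self-contained sketch (separate from the function above). `bilinear_at` is a hypothetical single-point helper that assumes the point lies inside the image. On a linear ramp the bilinear interpolant is exact, so the gradient should match the ramp's slopes.

```python
import jax
import jax.numpy as jnp


def bilinear_at(image, point):
    """Bilinearly interpolate a 2D `image` at one (row, col) `point` in pixel units.

    Hypothetical helper; assumes the point lies inside the image.
    """
    h, w = image.shape
    i0 = jnp.clip(jnp.floor(point[0]).astype(jnp.int32), 0, h - 2)
    j0 = jnp.clip(jnp.floor(point[1]).astype(jnp.int32), 0, w - 2)
    # The integer corners carry no gradient; the fractional weights do.
    wi = point[0] - i0
    wj = point[1] - j0
    return (
        image[i0, j0] * (1 - wi) * (1 - wj)
        + image[i0 + 1, j0] * wi * (1 - wj)
        + image[i0, j0 + 1] * (1 - wi) * wj
        + image[i0 + 1, j0 + 1] * wi * wj
    )


# Ramp image with value 2 * row + 3 * col, so the exact gradient is (2, 3).
rows, cols = jnp.meshgrid(jnp.arange(5.0), jnp.arange(6.0), indexing="ij")
image = 2.0 * rows + 3.0 * cols

points = jnp.array([[1.25, 2.5], [3.75, 0.1]])
# vmap over the sample points; the image is shared.
values = jax.vmap(bilinear_at, in_axes=(None, 0))(image, points)
grads = jax.vmap(jax.grad(bilinear_at, argnums=1), in_axes=(None, 0))(image, points)
```

Here `values` should be close to 10.0 and 7.8, and every row of `grads` close to (2, 3).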

bhyun-kim · 2025-01-19T02:24:38Z

Hi @josauder and @NILOIDE,

Even though there hasn't been an update in a while, I'm sharing my implementation of grid_sample in JAX here. I have tested it, and it works the same as the PyTorch version. It currently only supports 2D input. While it outperforms the PyTorch version on CPU (it's about 5× faster), it is roughly 10× slower on GPU.

Please check it out, and let me know if you have any feedback on this code.

https://github.com/bhyun-kim/grid-sample-jax.git

adam-hartshorne · 2025-01-19T17:22:39Z

This seems like a good potential use of the FFI interface to call a C++/CUDA kernel.

jiyuuchc · 2025-06-19T12:15:22Z

Here's an implementation that works for any number of dimensions. However, it is bilinear only, and it uses unnormalized coordinates, unlike torch.

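from functools import partial

import jax
import jax.numpy as jnp
from jax import Array
from jax.typing import ArrayLike


def _retrieve_value_at(img, loc, out_of_bound_value=0):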
    iloc = jnp.floor(loc).astype(int)
    res = loc - iloc

    offsets = jnp.asarray(
        [[(i >> j) % 2 for j in range(len(loc))] for i in range(2 ** len(loc))]
    )
    ilocs = jnp.swapaxes(iloc + offsets, 0, 1)

    weight = jnp.prod(res * (offsets == 1) + (1 - res) * (offsets == 0), axis=1)

    max_indices = jnp.asarray(img.shape)[: len(loc), None]
    values = jnp.where(
        (ilocs >= 0).all(axis=0) & (ilocs < max_indices).all(axis=0),
        jnp.swapaxes(img[tuple(ilocs)], 0, -1),
        out_of_bound_value,
    )

    value = (values * weight).sum(axis=-1)

    return value

def sub_pixel_samples(
    img: ArrayLike,
    locs: ArrayLike,
    out_of_bound_value: float = 0,
    align_corners: bool = False,
) -> Array:
    """Retrieve image values at non-integer locations by interpolation

    Args:
        img: Array of shape [D1,D2,..,Dk, ...]
        locs: Array of shape [d1,d2,..,dn, k]
        out_of_bound_value: Value used for corner pixels that fall outside the image
        align_corners: If True, shift the locations by +0.5 before sampling

    Returns:
        values: [d1,d2,..,dn, ...], float
    """

    # Convert first so that plain lists or NumPy inputs also expose .shape
    locs = jnp.asarray(locs)
    img = jnp.asarray(img)
    loc_shape = locs.shape
    img_shape = img.shape
    d_loc = loc_shape[-1]

    if align_corners:
        locs = locs + 0.5

    img = img.reshape(img_shape[:d_loc] + (-1,))
    locs = locs.reshape(-1, d_loc)
    op = partial(_retrieve_value_at, out_of_bound_value=out_of_bound_value)

    values = jax.vmap(op, in_axes=(None, 0))(img, locs)
    out_shape = loc_shape[:-1] + img_shape[d_loc:]

    values = values.reshape(out_shape)

    return values
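
The corner enumeration in `_retrieve_value_at` may be easier to follow with concrete numbers. The sketch below reproduces just the `offsets` and `weight` expressions for the 2D case; the values of `k` and `res` are made up for illustration.

```python
import jax.numpy as jnp

k = 2  # number of spatial dimensions (illustrative choice)
# Bit j of corner index i says whether that corner steps +1 along axis j.
offsets = jnp.asarray([[(i >> j) % 2 for j in range(k)] for i in range(2**k)])
# offsets rows: [0, 0], [1, 0], [0, 1], [1, 1]

res = jnp.array([0.25, 0.5])  # fractional part of the sample location
# Per corner: use `res` along axes where it stepped, `1 - res` elsewhere.
weights = jnp.prod(res * (offsets == 1) + (1 - res) * (offsets == 0), axis=1)
# weights: 0.375, 0.125, 0.375, 0.125 (they sum to 1)
```

Because the weights over the 2**k corners always sum to 1, a constant image is reproduced exactly as long as every corner is in bounds.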
