Inconsistent and unexpected results when grouping by more than one coordinate #11264
Description
What happened?
Grouping by more than one coordinate creates a group for every combination of the coordinates' unique values, including combinations that never occur together in the data.
What did you expect to happen?
I would expect only the observed combinations to be used.
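To make the difference concrete, here is a minimal sketch in plain Python (no xarray involved) using the same `test1` and `test2` values as the example in this report. The names `all_combinations` and `observed_combinations` are only illustrative; this shows the two sets of group labels, not how xarray computes them internally.

```python
from itertools import product

# Same values as the `test1` and `test2` coordinates in the example.
test1 = [1, 2, 3, 4, 5]
test2 = [1, 1, 1, 2, 2]

# Every combination of the unique values of both coordinates
# (the set of groups the multi-coordinate groupby appears to use).
all_combinations = sorted(product(sorted(set(test1)), sorted(set(test2))))

# Only the pairs that actually occur together along "y"
# (the set of groups I would expect).
observed_combinations = sorted(set(zip(test1, test2)))

print(len(all_combinations))   # 10
print(observed_combinations)   # [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2)]
```

Five of the ten combinations are never observed, so any groups produced for them have no data behind them.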
Minimal Complete Verifiable Example
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "xarray[complete]@git+https://github.com/pydata/xarray.git@main",
# ]
# ///
#
# This script automatically imports the development branch of xarray to check for issues.
# Please delete this header if you have _not_ tested this script with `uv run`!
import xarray as xr
xr.show_versions()
import numpy as np
import pandas as pd
df = pd.DataFrame(data=dict(test1=[1, 2, 3, 4, 5], test2=[1, 1, 1, 2, 2]))
df['test3'] = df[['test1', 'test2']].apply(tuple, axis=1)
coords = {}
for c in df.columns:
    coords[c] = ("y", df[c].values)
d = xr.DataArray(np.ones((5, 5)), dims=("y", "x"), coords=coords)
d.groupby(["test1", "test2"]).mean()  # generates all combinations of test1 and test2
d.groupby("test3").mean() # works as expected
# NotImplementedError
# d.set_xindex(["test1", "test2"], PandasMultiIndex).groupby(["test1", "test2"]).mean()
Steps to reproduce
No response
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
Anything else we need to know?
No response
Environment
INSTALLED VERSIONS
commit: None
python: 3.12.12 | packaged by conda-forge | (main, Jan 27 2026, 00:01:15) [Clang 19.1.7 ]
python-bits: 64
OS: Darwin
OS-release: 25.3.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('C', 'UTF-8')
libhdf5: 1.14.6
libnetcdf: None
xarray: 2026.2.0
pandas: 2.3.3
numpy: 2.4.3
scipy: 1.17.1
netCDF4: None
pydap: None
h5netcdf: 1.8.1
h5py: 3.14.0
zarr: 2.18.7
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: 2025.11.0
distributed: 2025.11.0
matplotlib: 3.10.8
cartopy: None
seaborn: 0.13.2
numbagg: None
fsspec: 2026.2.0
cupy: None
pint: 0.25.2
sparse: 0.18.0
flox: 0.11.2
numpy_groupies: 0.11.3
setuptools: 82.0.1
pip: 26.0.1
conda: None
pytest: 9.0.2
mypy: None
IPython: 9.10.0
sphinx: None