Suggest changing the default of the chunks kwarg from 'auto' to {} or None #779

@chiaweh2

Description

Description

Is your feature request related to a problem? Please describe.
I’ve noticed an error when upgrading from Xarray 2025.9.1 to 2026.2.0. In the older version, chunks='auto' handled these datasets without issue. However, the newer version triggers a NotImplementedError when encountering object dtypes. This suggests that the internal size-estimation logic in recent Xarray/Dask updates is now strictly enforcing a check that was previously bypassed.

Failed to load dataset with key='bvf2-theta-an-gauss.bvf2-theta-an-gauss'
                 You can use `cat['bvf2-theta-an-gauss.bvf2-theta-an-gauss'].df` to inspect the assets/files for this key.
                 
  File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/intake_esm/source.py", line 292, in _open_dataset
    datasets = dask.compute(*datasets)
  File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/dask/base.py", line 685, in compute
    results = schedule(expr, keys, **kwargs)
  File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/intake_esm/source.py", line 67, in _delayed_open_ds
    return _open_dataset(*args, **kwargs)
  File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/intake_esm/source.py", line 109, in _open_dataset
    ds = xr.open_dataset(url, **xarray_open_kwargs)
  File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/xarray/backends/api.py", line 613, in open_dataset
        backend_ds,
        ^^^^^^^^^^
    ...<13 lines>...
        chunked_array_type,
  File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/xarray/backends/api.py", line 308, in _dataset_from_backend_dataset
        ds,
        ^^
    ...<9 lines>...
        inline_array,
  File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/xarray/backends/api.py", line 251, in _chunk_ds
        var._data,
        ^
    ...<5 lines>...
        preferred_chunks=var.encoding.get("preferred_chunks", {}),
  File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/xarray/namedarray/utils.py", line 239, in _get_chunk
        chunk_shape,
        ^^
    ...<5 lines>...
        limit=limit,
  File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/xarray/namedarray/daskmanager.py", line 57, in normalize_chunks
        chunks,
        ^^^^
    ...<5 lines>...
        dtype=dtype,
  File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/dask/array/core.py", line 3302, in auto_chunks
    "Can not use auto rechunking with object dtype. "
  ...<2 lines>...
NotImplementedError: Can not use auto rechunking with object dtype. We are unable to estimate the size in bytes of object data

Describe the solution you'd like
This error generally doesn't occur in standard xarray.open_dataset workflows because the default for chunks is None. However, since intake-esm defaults to chunks='auto', it forces a re-evaluation of chunk sizes that fails on object dtypes. I suggest changing the default chunks kwarg to either None (to align with xarray's default) or {} (to enable Dask while respecting the dataset's native on-disk structure).
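As a minimal illustration of the difference (a toy sketch, not the intake-esm code path itself — the `labels` variable here is a stand-in for the object-dtype variables in the real catalog): chunking with `{}` keeps each variable as a single native chunk, while `'auto'` raises the error above.

```python
import numpy as np
import xarray as xr

# Toy dataset with an object-dtype variable (hypothetical stand-in).
ds = xr.Dataset({"labels": ("x", np.array(["a", "b", "c"], dtype=object))})

# chunks={} wraps each variable in dask using its native (full-size) chunks.
chunked = ds.chunk({})
print(chunked["labels"].chunks)  # ((3,),)

# chunks="auto" needs a per-element byte size, which object dtype lacks.
try:
    ds.chunk("auto")
except NotImplementedError as e:
    print("auto chunking failed:", e)
```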

Describe alternatives you've considered
I suggest changing the default for the chunks kwarg from 'auto' to None. This would align more closely with user expectations; if no xarray_open_kwargs are provided, most users assume xarray's native defaults apply. Alternatively, if a Dask-compatible default is preferred, using chunks={} would be a safer choice, as it preserves the original on-disk chunking and avoids the NotImplementedError currently triggered by object dtypes during 'auto' rechunking.
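The underlying Dask behavior that `'auto'` triggers can be seen directly, without xarray or intake-esm in the loop (a sketch assuming a recent Dask version):

```python
import numpy as np
import dask.array as da

arr = np.array(["x", "y", "z"], dtype=object)

# Explicit chunk sizes work: dask never needs a byte-size estimate.
x = da.from_array(arr, chunks=2)
print(x.chunks)  # ((2, 1),)

# chunks="auto" goes through dask's auto_chunks, which refuses object
# dtype because it cannot estimate the size in bytes per element.
try:
    da.from_array(arr, chunks="auto")
except NotImplementedError as e:
    print(type(e).__name__, e)
```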

Additional context
I’m happy to open a PR to address this. I'd appreciate your thoughts on the best approach, as well as any insight into why auto chunking is being triggered for these specific dtypes. Thanks! I will also open a related issue on the xarray tracker.

The following code reproduces the error above:

import intake

# Open the GDEX OSDF intake-esm catalog
catalog_url = "https://data.gdex.ucar.edu/d640000/catalogs/d640000-osdf.json"
cat = intake.open_esm_datastore(catalog_url)

# Select the variable whose dataset fails to load
cat_analysis = cat.search(variable='bvf2-theta-an-gauss')

# Fails with NotImplementedError because intake-esm defaults to chunks='auto'
dict_datasets = cat_analysis.to_dataset_dict(xarray_open_kwargs={'engine': 'kerchunk'})

The conda environment YAML:

name: arco_test
channels:
  - conda-forge
  - defaults
dependencies:
  - python>3.11
  - pip
  # Core Data & Geospatial
  - xarray == 2026.2.0
  - netcdf4
  - zarr
  - fastparquet
  - kerchunk
  - dask-jobqueue
  - intake-esm >=2025.12.12
  # Visualization & Utilities
  - matplotlib
  - jupyterlab
  - pip:
    - pelicanfs>=1.3.1
