how to handle 285 open issues in netcdf-c #3288

@edhartnett

There are so many open issues that the backlog can feel quite intimidating! This freezes action and creates a vicious cycle where more issues lead to more paralysis, which leads to more issues.

I asked Claude Sonnet 4.6 to look at our issues and suggest what could be done to close large numbers of them. Here's the analysis:

Top 10 Bugfixes/Improvements to Close the Most Open Issues on Unidata/netcdf-c

Analysis of all 285 open issues on Unidata/netcdf-c, clustered by root cause. Issues often span multiple categories, so a single fix in a high-impact area has an outsized effect.


1. Fix nc_get/put_vars Stride Performance (~15 issues)

Categories: Performance, Windows perf, nccopy slowness

The NCDEFAULT_get/put_vars code path is the single most complained-about performance bottleneck. It causes:

Fix: Rewrite NCDEFAULT_get/put_vars to use chunk-aware bulk I/O instead of element-by-element dispatch. Also reconcile HDF5 vs netCDF stride semantics for unlimited dimensions.
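To make the difference between element-by-element dispatch and bulk I/O concrete, here is a minimal one-dimensional sketch in C. It is not the netcdf-c implementation: `store_t`, `store_read`, `strided_read_naive`, and `strided_read_bulk` are hypothetical names, and an in-memory array stands in for the file so that read requests can be counted.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory "file"; stands in for the storage layer, not netcdf-c code. */
typedef struct {
    const int *data;    /* stored values */
    size_t     len;     /* number of stored values */
    size_t     ncalls;  /* read requests issued so far */
} store_t;

/* One I/O request: copy `count` contiguous values starting at `start`. */
static int store_read(store_t *s, size_t start, size_t count, int *out)
{
    if (s == NULL || out == NULL || start > s->len || count > s->len - start)
        return -1;
    memcpy(out, s->data + start, count * sizeof *out);
    s->ncalls++;
    return 0;
}

/* Element-by-element strided read: one request per output value. */
static int strided_read_naive(store_t *s, size_t start, size_t count,
                              size_t stride, int *out)
{
    if (stride == 0)
        return -1;
    for (size_t i = 0; i < count; i++)
        if (store_read(s, start + i * stride, 1, &out[i]) != 0)
            return -1;
    return 0;
}

/* Bulk strided read: one request for the bounding range, then subsample in memory. */
static int strided_read_bulk(store_t *s, size_t start, size_t count,
                             size_t stride, int *out)
{
    if (count == 0)
        return 0;
    if (s == NULL || out == NULL || stride == 0)
        return -1;
    if (count - 1 > (SIZE_MAX - 1) / stride)   /* span would overflow size_t */
        return -1;
    size_t span = (count - 1) * stride + 1;    /* contiguous values covering the request */
    if (start > s->len || span > s->len - start)
        return -1;
    int *buf = malloc(span * sizeof *buf);
    if (buf == NULL)
        return -1;
    int rc = store_read(s, start, span, buf);
    if (rc == 0)
        for (size_t i = 0; i < count; i++)
            out[i] = buf[i * stride];
    free(buf);
    return rc;
}
```

The bulk version trades extra bytes read for fewer requests, so for very large strides a chunk-aware variant that reads only the chunks actually touched would likely be the better fit.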


2. Modernize CMake Build System & Fix Windows/MSVC Builds (~35 unique issues)

Categories: CMake (22), Windows/MSVC (20), Static/Linking (19) — heavy overlap

The build system is the #1 source of user frustration. Recurring themes:

Fix: CMake modernization (#2713) using proper targets, find_package configs, and generator expressions. Fix Windows symbol exports with a single .def file or __declspec audit. This alone would close 30+ issues.


3. Fix NCZarr Interoperability & Correctness (~25 issues)

Categories: NCZarr/Zarr (25), overlaps with S3 and filters

NCZarr is the newest major feature and has the most open bugs per feature:

Fix: Zarr V2 spec compliance audit + Xarray interop testing. Most of these are metadata-handling bugs in libnczarr. A systematic pass through the Zarr V2 spec would close ~15 issues.


4. Fix DAP2/DAP4 Client Bugs (~24 issues)

Categories: DAP/OPeNDAP (24), overlaps with ncdump, authentication

DAP issues cluster into three sub-problems:

Fix: (a) Rewrite the cookie/auth handling to use libcurl's cookie jar properly. (b) Fix DAP4 string/attribute parsing. These two fixes would close ~15 DAP issues.


5. Fix Big-Endian / s390x Support (~10 issues, blocks CI)

Categories: Big-Endian (10), overlaps with test infrastructure

Every release breaks on big-endian:

Fix: Add a big-endian CI workflow (#3282) and fix the ncx.m4 byte-swap code. Most of these are the same root cause — untested byte-swap paths. A CI + ncx.m4 fix would close all 10.
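As an illustration of why untested byte-swap paths break, the sketch below compares a host-dependent conversion (swap only on little-endian hosts) against a shift-based reference that gives the same bytes on any host. This is a generic C example, not code from ncx.m4; all function names here are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reference encoder: write v as 4 big-endian bytes, independent of host order. */
static void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Reference decoder: read 4 big-endian bytes, independent of host order. */
static uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Runtime probe of the host byte order. */
static int host_is_big_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Unconditional byte swap of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Host-dependent path: the branch taken differs between host types,
 * so each branch only gets exercised if CI runs on that kind of host. */
static uint32_t host_to_be32(uint32_t v)
{
    return host_is_big_endian() ? v : swap32(v);
}
```

A test that checks the in-memory bytes of `host_to_be32(v)` against `put_be32` passes on both host types only if both branches are correct, which is the kind of check a big-endian CI job would exercise.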


6. Fix VLEN/Compound Type Handling (~9 issues)

Categories: VLEN/Compound (9), overlaps with memory safety

VLEN types are a persistent source of crashes and data corruption:

Fix: Audit the VLEN reclaim/allocation paths in libhdf5 and libdispatch. The crashes (#2181, #2496) and the charvlenbug (#2160) likely share a root cause in how VLEN memory is managed during read-back with unlimited dimensions.


7. Harden Memory Safety in libhdf5 (~14 issues)

Categories: Memory/Crash/Segfault (14)

Multiple fuzzer-found and user-reported crashes:

Fix: Add bounds checking and NULL guards in hdf5open.c:get_attached_info() and hdf5var.c:NC4_get_vars(). Fix the hashmap leak. This is ~4 functions that account for 8+ crash reports.


8. Fix Filter/Plugin Path & Discovery (~25 issues)

Categories: Filter/Plugin/Compression (25)

Plugin handling is broken in multiple ways:

Fix: Centralize plugin path resolution — one function that checks HDF5_PLUGIN_PATH, configure-time path, and install-time path in order. Fix the CMake find_package for optional compression libs. Would close ~12 issues.
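The "one function" idea can be sketched as follows. This is a minimal illustration, not existing netcdf-c code: `resolve_plugin_path` and `plugin_path_from_environment` are hypothetical names, and the configure-time and install-time directories are passed in as parameters because how the build system would supply them is an assumption.

```c
#include <stddef.h>
#include <stdlib.h>

/* Treat NULL and "" the same way: as "not set". */
static const char *nonempty(const char *s)
{
    return (s != NULL && *s != '\0') ? s : NULL;
}

/* Return the first candidate that is set, in priority order:
 * environment value, configure-time directory, install-time directory.
 * Returns NULL when none is set. */
static const char *resolve_plugin_path(const char *env_value,
                                       const char *configure_dir,
                                       const char *install_dir)
{
    const char *p;
    if ((p = nonempty(env_value)) != NULL)
        return p;
    if ((p = nonempty(configure_dir)) != NULL)
        return p;
    return nonempty(install_dir);
}

/* Convenience wrapper that reads HDF5_PLUGIN_PATH from the environment. */
static const char *plugin_path_from_environment(const char *configure_dir,
                                                const char *install_dir)
{
    return resolve_plugin_path(getenv("HDF5_PLUGIN_PATH"),
                               configure_dir, install_dir);
}
```

Keeping the pure resolver separate from the `getenv` call makes the priority order testable without touching the process environment. The sketch returns the value as a single string; splitting a multi-directory path into its components is left out.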


9. Thread Safety (~7 issues, high user impact)

Categories: Thread Safety (7)

Thread safety has been requested since 2017 (#382) and remains unfixed:

Fix: Implement per-thread HDF5 error stack isolation and audit global state in libdispatch. Even partial thread safety (read-only concurrent access) would satisfy most users and close 5+ issues.
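One small piece of such an audit is moving library-side global state into thread-local storage. The sketch below shows the general C11 technique with a hypothetical "last error" record; it does not reflect the actual layout of libdispatch or HDF5's own error stack, and the names `err_state_t`, `set_last_error`, `last_error_code`, and `last_error_message` are invented for illustration.

```c
#include <string.h>

/* Hypothetical per-thread error record. */
typedef struct {
    int  code;
    char msg[64];
} err_state_t;

/* C11 thread-local storage: every thread sees its own zero-initialized copy,
 * so one thread's error cannot overwrite another's. */
static _Thread_local err_state_t last_error;

/* Record an error for the calling thread; the message is truncated to fit. */
static void set_last_error(int code, const char *msg)
{
    last_error.code = code;
    strncpy(last_error.msg, msg != NULL ? msg : "", sizeof last_error.msg - 1);
    last_error.msg[sizeof last_error.msg - 1] = '\0';
}

/* Read back the calling thread's most recent error code. */
static int last_error_code(void)
{
    return last_error.code;
}

/* Read back the calling thread's most recent error message. */
static const char *last_error_message(void)
{
    return last_error.msg;
}
```

Thread-local storage only isolates state that the library itself owns; shared resources such as open file handles would still need locking or a read-only access discipline.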


10. Documentation Overhaul (~14 issues)

Categories: Documentation (14)

Recent audit found massive gaps:

Fix: A systematic doxygen pass through the public API headers + fixing the doc build system (#2581) would close all 14 in one effort.


Summary Table

| # | Fix | Issues Closed | Key Issue Numbers |
|---|---|---|---|
| 1 | Stride/vars performance rewrite | ~15 | #1381, #1757, #1877, #2721, #1947 |
| 2 | CMake modernization + Windows fixes | ~35 | #2713, #554, #1108, #3172, #2697 |
| 3 | NCZarr spec compliance & Xarray interop | ~25 | #2449, #3214, #3108, #2474, #2657 |
| 4 | DAP2/DAP4 auth + correctness fixes | ~24 | #3042, #1966, #3113, #3151, #2184 |
| 5 | Big-endian CI + ncx.m4 fix | ~10 | #3286, #3284, #3282, #2696, #1687 |
| 6 | VLEN/Compound type handling | ~9 | #2181, #2212, #2738, #2160, #1489 |
| 7 | Memory safety in libhdf5 | ~14 | #2664-2668, #2626, #2436, #3044 |
| 8 | Filter/plugin path & discovery | ~12 | #3025, #3048, #2381, #2831, #1245 |
| 9 | Thread safety (at least read-only) | ~7 | #382, #1373, #3193, #2496 |
| 10 | Documentation overhaul | ~14 | #3274, #3278, #2483, #2952, #2566 |

Total unique issues addressable: ~130-150 out of 285 (many issues span multiple categories, so the raw sum double-counts). The CMake/Windows cluster (#2) and NCZarr (#3) are the two highest-leverage targets by sheer volume.
