It would be possible (maybe slightly cursed?) to implement NonMaxU32 (and larger) in terms of some stable standard library type that already has a niche at u32::MAX, removing the run-time costs.
Unfortunately there are not many of these, but I found std::os::fd::BorrowedFd, which is available in std on Unix platforms and WASI and has been stable since 1.63.0.
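To make the idea concrete, here is a minimal sketch of such a wrapper. The name NonMaxU32 and its new/get methods are illustrative assumptions, not an existing API, and the sketch only builds on targets where std::os::fd exists. The big caveat: BorrowedFd::borrow_raw documents that the fd must stay open for the lifetime of the BorrowedFd, and this sketch ignores that part of the contract and only upholds the "not -1" part. It never performs I/O on the value, but it is still outside the documented safety contract, which is exactly the cursed bit.

```rust
use std::os::fd::{AsRawFd, BorrowedFd, RawFd};

/// Hypothetical wrapper (name and API are illustrative only):
/// a u32 that is never u32::MAX, stored inside a BorrowedFd so that
/// the niche at -1 (the bit pattern of u32::MAX) is inherited.
#[derive(Clone, Copy, Debug)]
pub struct NonMaxU32(BorrowedFd<'static>);

impl NonMaxU32 {
    /// Returns None for u32::MAX, the one value BorrowedFd cannot hold.
    pub fn new(value: u32) -> Option<Self> {
        if value == u32::MAX {
            return None;
        }
        // ASSUMPTION: the value is used purely as an integer and never as a
        // real file descriptor. This upholds only the "fd != -1" requirement
        // of borrow_raw; the "fd stays open" requirement is NOT upheld, so
        // this is outside the documented contract.
        Some(Self(unsafe { BorrowedFd::borrow_raw(value as RawFd) }))
    }

    /// Reads the value back; no arithmetic is needed, just a reinterpreting cast.
    pub fn get(self) -> u32 {
        self.0.as_raw_fd() as u32
    }
}
```

With this layout, Option<NonMaxU32> should be the same size as u32, and get() is a plain cast rather than a computed conversion.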
Proof of concept