Verify heterogeneous keys hash to the same value in set/map#6950
Verify heterogeneous keys hash to the same value in set/map#6950dwblaikie wants to merge 1 commit intocarbon-language:trunkfrom
Conversation
This doesn't feel like the nicest way to do it (adding the check in 2 different places per lookup function) but since it has to wait until after the insertion callback has constructed the key, with the current structure I don't see a great alternative. The return operation could be restructured into a function that DCHECKs and then constructs the key value result type, and some of this code could be commoned into an intermediate base (between BaseImpl and MapBase/SetBase)? This only tests one of the 6 checks introduced in this patch, because death tests are especially expensive (partly innate, and partly due to the cost of symbolizing failures - if we figure out how to disable the symbolizer in death tests (llvm has this problem too/could benefit from a solution) then maybe we could more comprehensively test these paths) (the test could be made more explicit by using custom types that intentionally hash differently - but that seems more verbose/perhaps harder to follow that relying on the pointer conversion that was where the problem was initially observed) Discovered while working on carbon-language#6940
| static_cast<void*>(base_pointer)); | ||
|
|
||
| Map<Base2*, int> m; | ||
| ASSERT_DEATH( |
There was a problem hiding this comment.
This test will fail in opt builds. I'm not sure we need this test anyway; we generally don't test the behavior of violating an invariant, and I'm not sure this invariant is so critical that it needs to be an exception.
There was a problem hiding this comment.
Yeah, fair point indeed.
(I'm a bit more on the fence when it comes to invariant violation checking in core/reusable components where its ability to not be misused is a feature, etc - compared to more opportunistic assertions/checks we might litter throughout the codebase - but happy to go with your leaning here)
| CARBON_DCHECK(entry, "Should always result in a valid index."); | ||
|
|
||
| if (LLVM_LIKELY(!inserted)) { | ||
| CARBON_DCHECK( |
There was a problem hiding this comment.
I'm concerned about the performance impact of this. I know it's disabled in opt mode, but inefficiency in fastbuild is still a problem if it slows down the edit-compile-test loop. I think we can and should solve this statically rather than dynamically.
I think the derived/base pointer issue shows that heterogeneous pointer == is "not suitable for hash tables", so the prescribed fix is to add a CarbonHashtableEq overload for heterogeneous pointer types, which presumably casts to void* before performing the comparison.
However, I think we chose the wrong default here: we shouldn't be supporting heterogeneous lookup between arbitrary types just because they're equality-comparable; instead, we should only support it for pairs of types where we affirmatively know that equal values will have equal hashes. In other words, we should get rid of the generic heterogeneous overload of CarbonHashtableEq, and replace it with a generic homogeneous overload, plus heterogeneous overload pairs for whatever types we need to support heterogeneous lookup on (I think that's just StringRef/std::string and ArrayRef/MutableArrayRef).
There was a problem hiding this comment.
Ah, fair points all. I'll look into that.
This doesn't feel like the nicest way to do it (adding the check in 2
different places per lookup function) but since it has to wait until
after the insertion callback has constructed the key, with the current
structure I don't see a great alternative.
The return operation could be restructured into a function that DCHECKs
and then constructs the key value result type, and some of this code
could be commoned into an intermediate base (between BaseImpl and
MapBase/SetBase)?
This only tests one of the 6 checks introduced in this patch, because
death tests are especially expensive (partly innate, and partly due to
the cost of symbolizing failures - if we figure out how to disable the
symbolizer in death tests (llvm has this problem too/could benefit from
a solution) then maybe we could more comprehensively test these paths)
(the test could be made more explicit by using custom types that
intentionally hash differently - but that seems more verbose/perhaps
harder to follow that relying on the pointer conversion that was where
the problem was initially observed)
Discovered while working on #6940