Skip to content

Schema validation#17

Open
philhassey wants to merge 14 commits intomainfrom
schema-validation
Open

Schema validation#17
philhassey wants to merge 14 commits intomainfrom
schema-validation

Conversation

@philhassey
Copy link

No description provided.

Or *binaryJSON `json:"||,omitempty"`
Add *binaryJSON `json:"+,omitempty"`
Subtract *binaryJSON `json:"-,omitempty"`
Subtract *binaryJSON `json:"-,omitempty"` //nolint:staticcheck // "-" is the Cedar subtraction operator, handled by custom marshal/unmarshal

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual error here has a better suggestion than nolint:

 should encoding/json ignore this field or name it "-"? Either use `json:"-"` to ignore the field or use `json:"'-',omitempty"` to specify "-" as the name

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, staticcheck seems to have a bug here? json:"'-',omitempty" doesn't work

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it looks like that syntax is a json/v2 thing:

JSON names containing commas or quotes, or names identical to "" or "-", can be specified using a single-quoted string literal, where the syntax is identical to the Go grammar for a double-quoted string literal, but instead uses single quotes as the delimiters.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened a bug here: dominikh/go-tools#1699

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No change needed — you resolved this yourself in the follow-up comments. The //nolint:staticcheck stays since the single-quote syntax is json/v2 only. Thanks for filing the upstream bug! [AI]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, confirmed. Keeping the nolint. [AI]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find on the json/v2 distinction. We can revisit this when we upgrade. [AI]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for filing it! [AI]

"github.com/cedar-policy/cedar-go/x/exp/schema/resolved"
)

// cedarType is the sum type representing Cedar types for the type checker.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to add //sumtype:decl for this to be true

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Added //sumtype:decl annotation. [AI]

Comment on lines +17 to +30
type typeNever struct{} // bottom type, subtype of all
type typeTrue struct{} // singleton bool true
type typeFalse struct{} // singleton bool false
type typeBool struct{} // Bool primitive
type typeLong struct{} // Long primitive
type typeString struct{} // String primitive
type typeSet struct{ element cedarType } // Set with element type
type typeRecord struct { // Record with attribute types
attrs map[types.String]attributeType
openAttributes bool
}
type typeEntity struct{ lub entityLUB } // Entity with LUB of types
type typeAnyEntity struct{} // Unknown entity type
type typeExtension struct{ name types.Ident } // Extension type (ipaddr, decimal, etc.)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment I hope to resolve myself later: should these just be the same set of types that appear in x/exp/resolved?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No change — noted as future work per your comment. [AI]

Comment on lines +61 to +63
func singleEntityLUB(et types.EntityType) entityLUB {
return entityLUB{elements: []types.EntityType{et}}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems silly. Can't we just call newEntityLUB() with a single element?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping as-is. The Rust reference also has a dedicated single_entity() method (types.rs:850-854) separate from the general BTreeSet constructor. This avoids the clone/sort/compact overhead of newEntityLUB for the common single-element case. Added a comment explaining this. [AI]

type typeSet struct{ element cedarType } // Set with element type
type typeRecord struct { // Record with attribute types
attrs map[types.String]attributeType
openAttributes bool

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find an instance where this is set to true

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct that production code never sets it to true today. However it IS used in several code paths (mayHaveAttr, isSubtypeRecord, lubRecord, typechecker) and IS exercised in tests. The Rust reference has the same OpenTag concept (types.rs:1095-1113). This is infrastructure for when entity open-attribute support is added to the resolved schema. [AI]

Comment on lines +107 to +110
if _, ok := a.(typeAnyEntity); ok {
return false
}
return false

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wut? Doesn't this return false no matter what? Maybe it meant to return true if a is typeAnyEntity?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — simplified to a single return false with a clarifying comment. The behavior was actually correct: Rust also returns false for (AnyEntity, Entity(_)) (types.rs:1246). But the dead branch was confusing, agreed. [AI]

}
for k, bAttr := range b.attrs {
aAttr, ok := a.attrs[k]
if !ok {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this case. If a doesn't have an attr, that's okay so long as it's optional in b? Kind of weird given the Strict: required/optional must match exactly statement below. Let's dig in a bit more on this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No change needed — this is correct per the Rust reference (types.rs Attributes::is_subtype). If the supertype b has an optional attribute that the subtype a lacks, that's valid — a record without an optional field is a valid subtype. The "strict: required/optional must match exactly" on lines 157-161 only applies to attributes present in BOTH records. [AI]

Comment on lines +316 to +317
default:
return typeNever{}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What cases could this cover? Seems sketchy.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a clarifying comment. All known resolved.IsType variants are handled in the switch. The default is defensive — typeNever (bottom type) causes downstream type errors, which is the safest fallback for any hypothetical unhandled variant. [AI]


// entityLUB represents the least upper bound of a set of entity types.
type entityLUB struct {
elements []types.EntityType // sorted, unique

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels like an immutable map set. not sure why sorting is important.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Rust reference uses BTreeSet (ordered) for the same structure. Sorting enables deterministic equality via slices.Equal in equalLUB, which is simpler than hash-based set comparison for these typically small collections. The codebase's ImmutableMapSet has non-deterministic iteration order which is less suitable for type-level equality. [AI]

required bool
}

// entityLUB represents the least upper bound of a set of entity types.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"least upper bound" feels like a term of art that could use a definition here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Expanded the comment to explain LUB in type theory terms and how it applies to entity types. [AI]

Copy link

@patjakdev patjakdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a general concern about the numerous early error returns in the validator (https://github.com/strongdm/cedar-go/pull/17/changes#diff-7154ebaaf619f4b86531020b0ae3ce2f61985f09350be37da0c6072792f6127bR64 for example). That doesn't seem right to me. I thought that the validator was supposed to return all errors it finds. It seems like it should always accumulate the error it finds unless it's some kind of error that prevents further analysis.

// checkExtensionValue checks that a value matches an extension type.
func checkExtensionValue(v types.Value, expected resolved.ExtensionType) error {
switch types.Ident(expected) {
case "ipaddr":

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should think about how to keep this switch in sync with all the other places we deal with extension types.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed this is a maintenance concern. No clean centralized registry exists yet — the validator needs type signatures keyed by function name, which is a different concern than the runtime implementations. Noting for future work. [AI]


// Verify parents match
schemaParents := make(map[types.EntityUID]bool)
for parent := range action.Entity.Parents.All() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

action.Entity.Parents is already a set. Why are we making another set?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want better error messages than just checking for set equality, we could add an ImmutableSet.SymmetricDifference() or the like to find the items that aren't in the intersection of the two sets.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Removed the intermediate map[types.EntityUID]bool and use action.Entity.Parents.Contains() directly. [AI]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kept the two-loop approach for specific error messages ("unexpected parent X" / "missing expected parent X"). SymmetricDifference is a nice idea but feels like scope creep for this PR. [AI]

return nil
}

func (v *Validator) validateActionEntity(entity types.Entity) error {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also validate that the action entity doesn't have any attributes or tags?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I wonder if the schema resolver should return errors if actions are specified as parents of regular entities and vice-versa. Validation is likely too late to be doing that - it's assuming a valid schema.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Added attribute and tag validation to validateActionEntity(). The Rust reference explicitly creates action entities with empty attributes and no tags (schema.rs:1073-1081). [AI]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that's a valid concern, but it's a schema resolver change rather than a validator change. Noting for future work. [AI]

Comment on lines +123 to +125
func isValidParentType(parentType types.EntityType, allowedTypes []types.EntityType) bool {
return slices.Contains(allowedTypes, parentType)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function seems useless. Just inline it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Inlined slices.Contains(schemaEntity.ParentTypes, parent.Type) at the single call site and removed the function. [AI]

Comment on lines +127 to +129
func isValidEnumID(uid types.EntityUID, schemaEnum resolved.Enum) bool {
return slices.Contains(schemaEnum.Values, uid)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this could just be a method on resolved.Enum? Something like func (e Enum) IsValue(types.EntityUID) bool? Dunno, doesn't add much more value than slices.Contains(schemaEnum.Values, uid) to me.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping as a function — unlike isValidParentType (1 call site), isValidEnumID is called 5 times across 3 files. A helper is justified here. [AI]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reasonable idea. Since isValidEnumID is called 5 times across 3 files, keeping it as a standalone function is fine for now. Could become a method on resolved.Enum in a future refactor. [AI]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think we should make the helper method on resolved.Enum now.

}
principalMatch := principalTypes == nil
if !principalMatch {
for _, pt := range principalTypes {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish these were sets. The break confused me for a bit because I thought it was applying to the outer for loop.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged. The slices are typically 1-3 elements so sets would add complexity for minimal benefit. The action scope fix also simplified this area — validateActionApplication no longer has the nested loops with breaks. [AI]

Comment on lines +57 to +69
func matchesPrincipalConstraint(pt types.EntityType, constraints []types.EntityType) bool {
if len(constraints) == 0 {
return true // ScopeTypeAll
}
return slices.Contains(constraints, pt)
}

func matchesResourceConstraint(rt types.EntityType, constraints []types.EntityType) bool {
if len(constraints) == 0 {
return true
}
return slices.Contains(constraints, rt)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be DRYer. Or just inlined.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Merged matchesPrincipalConstraint and matchesResourceConstraint into a single matchesEntityTypeConstraint function — they were identical. [AI]

if _, ok := v.schema.Actions[sc.Entity]; !ok {
return nil, fmt.Errorf("action %s not found in schema", sc.Entity)
}
return []types.EntityUID{sc.Entity}, nil

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to traverse the action ancestors as well? The comment at the top of the function makes it seem like it should.

I guess it depends on whether "action groups" are ever meant to be actions in their own right. We do use them that way in our schema.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed as part of the action scope rework (comments 27/28). validateActionScope now expands descendants for ScopeTypeIn via getActionsInSet(), which includes the group itself plus all descendants. [AI]

Comment on lines +19 to +32
principalTypes, err := v.validatePrincipalScope(policy.Principal)
if err != nil {
errs = append(errs, fmt.Errorf("principal scope: %w", err))
}

actionUIDs, err := v.validateActionScope(policy.Action)
if err != nil {
errs = append(errs, fmt.Errorf("action scope: %w", err))
}

resourceTypes, err := v.validateResourceScope(policy.Resource)
if err != nil {
errs = append(errs, fmt.Errorf("resource scope: %w", err))
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that there is all this explicit handling for the scope when, of course, you can write a completely equivalent policy without using the scope at all.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good observation. Scope validation determines which actions/principals/resources apply so conditions can be type-checked against the right request environments. This matches the Rust architecture. [AI]

// Check if actionUID is in the action group of c
if v.isActionInGroup(actionUID, c) {
return true
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, this kind of confuses me too. Seems like you might get a spurious true result from this function if the policy doesn't use in. Need to read more carefully.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed as part of the action scope rework (comments 27/28). matchesActionConstraint now only checks direct equality — no more group membership checks. Group expansion happens in validateActionScope for in scopes only. Also removed the duplicate isActionInGroup function (was identical to isActionDescendant). [AI]

@philhassey
Copy link
Author

Re: the general concern about early error returns — you're right that the validator should accumulate errors where possible. The main Policy() function does accumulate into var errs []error and uses errors.Join, but some inner functions like entity validation return early. The tricky part is that some errors prevent meaningful further analysis (e.g., if an entity type isn't in the schema, there's nothing else to validate). I'll look at converting more of the inner functions to accumulate where it makes sense in a follow-up. [AI]

philhassey and others added 11 commits March 4, 2026 11:38
…ecking

Implement Cedar policy validation against a resolved schema, matching
the Rust Cedar validator's strict mode behavior. The validator performs
RBAC scope validation, action application checks, and expression type
checking across all applicable request environments.

Key behaviors matching the Rust reference implementation:
- has on entity types returns typeBool (not typeTrue) for required
  attributes since entities may not exist at runtime
- && short-circuits type checking when LHS is typeFalse
- Strict entity type disjointness check in set construction
- Empty set contains check (Set<Never>.contains(X) is an error)
- ScopeTypeIsIn validates the is type can be a descendant of the
  in entity's type

Also fixes corpus_test.go to only check policy validation errors
against shouldValidate, matching Rust's validation_passed() semantics.

Passes 6084/6084 corpus schema-validation tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract testEnv() helper to eliminate repeated requestEnv boilerplate
  across 10 test functions
- Collapse validateEntityRefsBinary into validateEntityRefs using a
  validateEntityRefsPair helper, reducing ~160 lines to ~80
- Remove unused capabilitySet.intersect method
- Remove section marker comment in typechecker.go
- Fix isActionInGroup: was checking groupUID's parents for actionUID
  (backwards); now checks actionUID's parents for groupUID with
  transitive traversal
- Rename subtest names from spaces to camelCase per codebase convention
- Fix typo in schemaTypeToCedarType doc comment

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Match Rust Cedar reference implementation for capability tracking:
- typeOfOr: propagate lCaps/rCaps instead of discarding; intersect
  when both branches are unknown
- typeOfIfThenElse: else branch gets only prior caps (not test caps);
  intersect then/else caps when test is unknown
- Restore capabilitySet.intersect needed by both fixes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…0% coverage

Fix two bugs found by comparing with the Rust reference implementation:

- ScopeTypeIn for principal/resource now computes descendant entity types
  using getEntityTypesIn(), matching Rust's get_entity_types_in() behavior.
  Previously, `principal in Group::"admins"` would not consider User as a
  valid principal type even though User is a child of Group.

- typeOfIs now returns typeTrue when the entity LUB has exactly one element
  matching the tested type, matching Rust's singleton boolean optimization.

Additional improvements:
- Add //sumtype:decl annotation to cedarType interface
- Use { _ = 0 } pattern for marker methods (matches resolved/types.go)
- Simplify typeOfGetTag by using direct type assertion instead of
  isEntityType() + conditional assertion, eliminating dead code
- Add comprehensive tests reaching 100% statement coverage

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add `make lint` target running golangci-lint. Fix all lint issues:
- goimports formatting in cedar_type.go, ext_funcs.go, validate_test.go
- ineffectual assignment to ty in TestTypeCheckValues

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lint

Update `make linters` to run both golangci-lint and go-check-sumtype,
matching the CI workflow. This catches the SA5008 staticcheck issue
that CI detected but the older local golangci-lint (2.1.1) missed.

Remove //sumtype:decl from cedarType since the type switches on it
intentionally use default cases for partial matching, which is
incompatible with -default-signifies-exhaustive=false.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…o linters

Replace free functions (Policy, Entity, Entities, Request) with methods
on a new Validator struct created via New(schema, ...Option). Add
WithStrict (default) and WithPermissive options controlling behavioral
differences in subtype checking, LUB computation, set/contains/equality
validation, and extension function literal requirements.

Add coverage profiling and 100% coverage enforcement to make linters
target, matching the CI workflow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add cedar-validation-tool (Rust) to generate validation results from
corpus tests, corpus-validate Make target, postprocess_corpus.py for
filtering unsupported extensions, and refresh corpus tarballs. Update
corpus-update to run download, convert, and validate steps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Major validation fixes to match Rust Cedar 4.9.1 behavior:

- Add strict-mode equality type checking with entity LUB and type LUB,
  exempting primitive-vs-primitive comparisons (bool/long/string)
- Remove strict-mode record LUB restrictions (differing keys become
  optional, matching Rust Cedar behavior)
- Compute transitive closure for action entity parent validation
  instead of direct-only parent matching
- Accept empty sets as valid `in` operator targets (typeNever element)
- Validate extension constructor string literal values (ip, decimal,
  datetime, duration) in all modes using types.Parse* functions
- Require prior hasTag capability for all getTag calls, including
  non-variable entities and non-literal tag keys
- Add exhaustive case listings to all 15 cedarType switch statements
  to satisfy sumtype checker with -default-signifies-exhaustive=false

Remaining 7 failures are strict-mode edge cases: 1 false-positive from
always-false policy with empty set (Rust ignores errors in impossible
policies), and 6 false-negatives from record-contains type compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…us-update

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three behavioral fixes to match Rust Cedar's strict mode validator:

1. Strict lubRecord rejects different record key sets (6 tests)
   Records with different attribute keys can no longer be merged via
   width subtyping in strict mode, matching Rust's strict_least_upper_bound.

2. typeOfIn returns typeFalse for unrelated entity types (1 test)
   Added anyEntityDescendantOf to check entity hierarchy. When the LHS
   entity can never be a descendant of the RHS, return typeFalse to
   enable && short-circuiting past unreachable code.

3. Literal equality evaluation at type-check time (2 regressions)
   Added evalLiteralEquality to resolve comparisons like false==false
   to typeTrue/typeFalse, enabling if-then-else short-circuiting past
   branches with incompatible record types. Checked before strict
   compatibility (matching Rust which skips strict checks for known results).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
philhassey and others added 3 commits March 4, 2026 14:10
Replace the simple download-if-missing rule with a check-upstream-corpus
target that downloads and compares against the existing file. Uses an
order-only prerequisite so dependent targets only rebuild when the
tarball actually changes. corpus-update no longer force-deletes the
tarball since freshness is handled by the check.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…cking

Add 3 handpicked corpus tests (all_pass, policy_strict_fail, policy_both_fail)
that exercise the full parse→resolve→validate pipeline through the public API.
Validation JSON files include Rust error text from cedar-validation-tool, and
tests verify both pass/fail status and error count equivalence.

Also enhance cedar-validation-tool to output error details (strictErrors,
permissiveErrors) alongside the existing booleans, using skip_serializing_if
to keep output unchanged when there are no errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants