Contributor

@austnwil austnwil commented Sep 23, 2025

Addresses #307

Description of changes:

This change produces more descriptive error messages when a schema uses a $null_or:: annotation on the type: field of a type definition that also includes other constraints for which null is never valid. This should reduce confusion in cases where the user should instead have used $null_or on the type definition itself, and it points them in the right direction.

Previously, a schema like the following:

$ion_schema_2_0
type::{
    name: mystring,
    type: $null_or::string,
    codepoint_length: range::[ 1, 10 ]
}

would report the following error when validating null against mystring:

Validation failed:
- not applicable for type null

because null has no codepoint_length and is therefore invalid for this constraint. However, this error is vague and gives little help to someone trying to work out why the schema rejects null.

The same schema will now report the following when validating null:

Validation failed:
- type cannot accept null. note: type attempts to accept null via $null_or but defines one or more constraints for which null are never valid - did you mean to use $null_or on the type definition itself?
  - constraint "codepoint_length" is not applicable for null values

Multiple violations caused by constraints that do not accept null are grouped together. As an example, this Java code:

final var iss = IonSchemaSystemBuilder.standard().build();
final var is = IonSystemBuilder.standard().build();

final Schema schema = iss.newSchema("""
    $ion_schema_2_0
    type::{
        name: mystring,
        type: $null_or::string,
        annotations: closed::[ abc ],
        utf8_byte_length: range::[ min, 4096 ],
        regex: i::"^[a-z0-9_]+$",
    }
    """);

Type type = schema.getType("mystring");
final var ionValue = is.singleValue("abc::null");

Violations violations = type.validate(ionValue);
if (!violations.isValid()) {
    System.out.println(violations);
}

will output these validation errors:

Validation failed:
- type cannot accept null. note: type attempts to accept null via $null_or but defines one or more constraints for which null are never valid - did you mean to use $null_or on the type definition itself?
  - constraint "utf8_byte_length" is not applicable for null values
  - constraint "regex" is not applicable for null values
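
The grouping shown above can be pictured as a small tree of violations: one parent for the null problem and one child per constraint that rejects null. The following is only a sketch of that idea. `ViolationNode` and `NullOrGrouping` are hypothetical names invented for illustration, not classes from ion-schema-kotlin, and the parent message is shortened.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a violation tree; not the ion-schema-kotlin API.
final class ViolationNode {
    final String message;
    final List<ViolationNode> children = new ArrayList<>();

    ViolationNode(String message) {
        this.message = message;
    }

    // Render this node and its children, indenting two spaces per level.
    void render(StringBuilder out, int depth) {
        out.append("  ".repeat(depth)).append("- ").append(message).append('\n');
        for (ViolationNode child : children) {
            child.render(out, depth + 1);
        }
    }
}

final class NullOrGrouping {
    // Collect every constraint that rejects null under a single parent violation.
    static String report(List<String> constraintsRejectingNull) {
        StringBuilder out = new StringBuilder("Validation failed:\n");
        if (!constraintsRejectingNull.isEmpty()) {
            ViolationNode parent = new ViolationNode("type cannot accept null");
            for (String name : constraintsRejectingNull) {
                parent.children.add(new ViolationNode(
                        "constraint \"" + name + "\" is not applicable for null values"));
            }
            parent.render(out, 0);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(report(List.of("utf8_byte_length", "regex")));
    }
}
```

Keeping the per-constraint details as children means the hint about $null_or is printed once, however many constraints reject the null value.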

Other violations are still reported separately:

$ion_schema_2_0
type::{
    name: mystring,
    type: $null_or::string,
    codepoint_length: range::[ 1, 10 ],
    annotations: closed::[]
}
// validate not_allowed::null
Validation failed:
- found one or more unexpected annotations
- type cannot accept null. note: type attempts to accept null via $null_or but defines one or more constraints for which null are never valid - did you mean to use $null_or on the type definition itself?
  - constraint "codepoint_length" is not applicable for null values

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@austnwil austnwil marked this pull request as ready for review September 24, 2025 17:12
Contributor

@popematt popematt left a comment

While your overall approach appears to work, I think it is more complex than it needs to be. We don't have to improve the messages only when there is a $null_or annotation—we can improve the message for all null-related violations if it keeps things simpler.

Can you see if we can solve things by updating this function?

fun NULL_VALUE(constraint: IonValue) = Violation(
    constraint,
    "null_value",
    "not applicable for null values"
)

Also, are there any tests that need to be updated as a result of this change?

Comment on lines 37 to 39
!expectedClass.isInstance(value) -> issues.add(CommonViolations.INVALID_TYPE(ion, value))
// Null check needs to be first, since Class.isInstance returns false for null values
value.isNullValue -> issues.add(CommonViolations.NULL_VALUE(ion))
!expectedClass.isInstance(value) -> issues.add(CommonViolations.INVALID_TYPE(ion, value))
Contributor

This change isn't actually necessary. Here's why...

So, Class.isInstance does return false for null values, but that is for JVM nulls. In this case, value is a non-null instance of some subclass of IonValue. The isInstance is checking, in this case, whether the value is an IonNumber or similar.

The null check is only checking to see if the (non-null) IonValue instance represents a null Ion value, such as null.null, null.bool, etc.

Ion has typed nulls, so for example, if there's a codepoint_length constraint and you give it null.int, the value will be invalid both because it is (Ion) null and because it's not a string or symbol value. Unless we have some compelling reason to change it, I think we should leave it as is so that only one of these two (overlapping) problems is reported and the wrong Ion type issue takes precedence over null not being valid for the constraint.
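
To make the distinction between a JVM null and an Ion null concrete, here is a minimal sketch. The `FakeIon*` classes are hypothetical stand-ins invented for this example; they only mimic the shape of the ion-java hierarchy and are not its real classes.

```java
// Hypothetical stand-ins for the ion-java value hierarchy; names are invented.
abstract class FakeIonValue {
    private final boolean ionNull;

    FakeIonValue(boolean ionNull) {
        this.ionNull = ionNull;
    }

    // True when this (non-null) object represents an Ion null such as null.string.
    boolean isNullValue() {
        return ionNull;
    }
}

abstract class FakeIonText extends FakeIonValue {
    FakeIonText(boolean ionNull) {
        super(ionNull);
    }
}

final class FakeIonString extends FakeIonText {
    FakeIonString(boolean ionNull) {
        super(ionNull);
    }
}

final class FakeIonInt extends FakeIonValue {
    FakeIonInt(boolean ionNull) {
        super(ionNull);
    }
}

final class JvmNullVsIonNull {
    static boolean isText(Object o) {
        return FakeIonText.class.isInstance(o);
    }

    public static void main(String[] args) {
        FakeIonValue nullString = new FakeIonString(true); // models null.string
        FakeIonValue nullInt = new FakeIonInt(true);       // models null.int

        System.out.println(isText(null));       // false: a JVM null is an instance of nothing
        System.out.println(isText(nullString)); // true: the object exists and is a text type
        System.out.println(isText(nullInt));    // false: wrong Ion type, regardless of nullness
        System.out.println(nullInt.isNullValue()); // true: it still represents an Ion null
    }
}
```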

Contributor Author

I made a bit of a mistake in documenting why I did this. expectedClass.isInstance() will return false for null Ion values whose Ion type is null - i.e. instances of IonNullLite. If a constraint is expecting type IonText, expectedClass.isInstance() will be true for null.string or null.symbol but false for null or null.null. I wanted CommonViolations.NULL_VALUE to be emitted for untyped nulls as well as nulls of the expected type so that the more descriptive error is reported for that case as well, and only CommonViolations.INVALID_TYPE to be emitted if the type was actually wrong (and the type was not null).

Like you mentioned, this change does have the undesirable consequence that null violations take precedence over bad type violations, so when a constraint expecting IonText receives null.int, it will complain about the null instead of the incompatible type.
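
The trade-off between the two orderings can be sketched as follows. This is an illustration under assumptions, not the validator's code: the `Stub*` classes and `CheckOrder` are hypothetical, and the returned strings are just labels for which violation would be reported.

```java
import java.util.List;

// Hypothetical stand-ins for ion-java value classes; names are invented.
abstract class StubValue {
    final String text;     // Ion text form, used only for printing
    final boolean ionNull; // true when the value is an Ion null

    StubValue(String text, boolean ionNull) {
        this.text = text;
        this.ionNull = ionNull;
    }
}

abstract class StubText extends StubValue {
    StubText(String text, boolean ionNull) {
        super(text, ionNull);
    }
}

final class StubString extends StubText {
    StubString(boolean ionNull) {
        super(ionNull ? "null.string" : "\"abc\"", ionNull);
    }
}

final class StubInt extends StubValue {
    StubInt(boolean ionNull) {
        super(ionNull ? "null.int" : "5", ionNull);
    }
}

// Untyped null (null / null.null) has its own class and is not a StubText.
final class StubNull extends StubValue {
    StubNull() {
        super("null", true);
    }
}

final class CheckOrder {
    // Type check first: a wrong Ion type takes precedence over nullness.
    static String typeFirst(StubValue v, Class<?> expected) {
        if (!expected.isInstance(v)) return "invalid_type";
        if (v.ionNull) return "null_value";
        return "ok";
    }

    // Null check first: nullness takes precedence over a wrong Ion type.
    static String nullFirst(StubValue v, Class<?> expected) {
        if (v.ionNull) return "null_value";
        if (!expected.isInstance(v)) return "invalid_type";
        return "ok";
    }

    public static void main(String[] args) {
        List<StubValue> values = List.of(
                new StubNull(), new StubString(true), new StubInt(true), new StubString(false));
        for (StubValue v : values) {
            System.out.println(v.text
                    + "  typeFirst=" + typeFirst(v, StubText.class)
                    + "  nullFirst=" + nullFirst(v, StubText.class));
        }
    }
}
```

In this model the two orderings differ only for the untyped null and for null.int: checking the type first labels both as invalid_type, while checking for null first labels both as null_value.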

I can undo this change and still make it work, especially if I just update the error messages in CommonViolations.kt like you suggested.

override fun validate(value: IonValue, issues: Violations) {
val constraintIterator = constraints.iterator()
val typeWantsToAcceptNull = isl.get("type")?.hasTypeAnnotation("\$null_or") == true
val incompatibleConstraintsWithNullOrIssues = Violation(
Contributor

I am a little confused by the changes in this method and having a hard time determining how it's supposed to work. Can it be simplified at all? Can you describe what sort of testing you did to confirm that it does what you expect?

@austnwil
Contributor Author

> We don't have to improve the messages only when there is a $null_or annotation; we can improve the message for all null-related violations if it keeps things simpler.

I can definitely simplify this by updating the error message in CommonViolations.kt instead of using custom logic to detect specifically whether the user is applying $null_or incorrectly, but the error might be less helpful.

> Also, are there any tests that need to be updated as a result of this change?

This change doesn't affect whether or not any value is valid for a particular type, so there's nothing to really test.

I wanted to add tests, but the test suite doesn't seem capable of testing the actual violations emitted. It only allows testing that schemas and type definitions are correctly accepted or rejected as valid/invalid and that a particular type accepts or rejects a particular value in general.

@codecov

codecov bot commented Sep 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.24%. Comparing base (7a9ee6d) to head (9ffd19a).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #308   +/-   ##
=======================================
  Coverage   83.24%   83.24%           
=======================================
  Files         160      160           
  Lines        3777     3777           
  Branches      907      907           
=======================================
  Hits         3144     3144           
  Misses        360      360           
  Partials      273      273           

@austnwil
Contributor Author

I simplified the code and made null value and invalid type errors more descriptive in all cases, not just when the user attempts to use $null_or with a constraint that cannot accept null.

Below are some example error messages when validating a few different nulls against this schema:

$ion_schema_2_0
type::{
    name: mytext,
    type: $null_or::text,
    regex: "[a-z]+"
}
Validating null.symbol
Validation failed:
- expected type text, found null symbol
- null values are never valid for types defining regex constraints


Validating null.string
Validation failed:
- expected type text, found null string
- null values are never valid for types defining regex constraints


Validating null.int
Validation failed:
- expected type text, found null int
- values of type int are never valid for types defining regex constraints


Validating null
Validation failed:
- values of type null are never valid for types defining regex constraints


Validating null.null
Validation failed:
- values of type null are never valid for types defining regex constraints

These errors I believe are slightly less helpful than the previous error because they don't explicitly indicate to the user why using $null_or in this manner is incorrect and that you should instead use it on the type definition itself. However, I think they are still good enough to point people in the right direction.
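
For illustration, the second line of each example above follows a simple pattern that depends on whether the value's Ion type is one the constraint applies to. The sketch below reproduces that pattern. It is a guess at the shape of the logic, not the merged code: `NullMessageSketch`, `notApplicable`, and the set of accepted type names are all hypothetical.

```java
import java.util.Set;

final class NullMessageSketch {
    // Hypothetical helper: builds a message shaped like the examples above.
    // ionType is the Ion type name of the rejected value ("null" for null and null.null).
    // acceptedTypes lists the Ion types the constraint applies to.
    static String notApplicable(String constraint, String ionType, Set<String> acceptedTypes) {
        // A value of an accepted type only reaches this point when it is a typed null.
        String subject = acceptedTypes.contains(ionType)
                ? "null values"
                : "values of type " + ionType;
        return subject + " are never valid for types defining " + constraint + " constraints";
    }

    public static void main(String[] args) {
        Set<String> textTypes = Set.of("string", "symbol");
        System.out.println(notApplicable("regex", "symbol", textTypes)); // models null.symbol
        System.out.println(notApplicable("regex", "int", textTypes));    // models null.int
        System.out.println(notApplicable("regex", "null", textTypes));   // models null / null.null
    }
}
```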

@popematt
Contributor

> These errors I believe are slightly less helpful than the previous error because they don't explicitly indicate to the user why using $null_or in this manner is incorrect and that you should instead use it on the type definition itself. However, I think they are still good enough to point people in the right direction.

I agree—and the ideal long term solution is an Ion Schema linter that will point out things that you are doing that are probably wrong.

@popematt popematt merged commit 2796de8 into amazon-ion:master Sep 26, 2025
5 checks passed