RFC: Type Guards #124

Status: Open (2 commits to merge into base: master)

Bottersnike

Rendered

Adds type guards of the form

function isFoo(x): x is Foo
    return x.type == "foo"
end

which can be used in control statements and assertions.

@deviaze

deviaze commented May 31, 2025

Could we get this as a sort of typecast as well/instead? This could allow user-defined narrowing in a more natural inline manner.

For example:

type Entry = File | Directory | Symlink

local entry = getentry()
if str.endswith(entry.path, ".luau") :: entry is File then
    print(entry:read()) -- a method only on `File`
end

In this case we want to assume the entry is a File if its path ends with .luau. That isn't something inherent to the function str.endswith (which is just a suffix check); it is inherent to the logic of our conditional itself.
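
For comparison, here is a minimal sketch of how this narrowing can be written in current Luau with an ordinary `::` cast. The `File` and `Directory` shapes, the `kind` field, and the `endswith` and `readIfLuau` helpers are assumptions made up for illustration; the comment above does not define them.

```luau
-- Hypothetical entry types, invented for this sketch.
type File = { kind: "file", path: string, read: (self: File) -> string }
type Directory = { kind: "directory", path: string }
type Entry = File | Directory

-- Plain suffix check; it knows nothing about File or Directory.
local function endswith(s: string, suffix: string): boolean
    return suffix == "" or string.sub(s, -#suffix) == suffix
end

local function readIfLuau(entry: Entry): string?
    if endswith(entry.path, ".luau") then
        -- The checker cannot infer this from the suffix test,
        -- so the narrowing is asserted manually with a cast.
        local file = entry :: File
        return file:read()
    end
    return nil
end
```

The cast carries no runtime check, so it has to be repeated (and kept correct) at every call site. That is the ergonomic gap both proposals in this thread try to close.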

@Bottersnike
Author

In terms of as well, I think that would be cool, though potentially worth a separate RFC. In terms of instead, you then lose many of the benefits of having dedicated guard functions, so I don't think I'd want to not have those.

@gaymeowing
Contributor

gaymeowing commented Jun 10, 2025

Is there a backwards-compat issue with using == instead? I think that'd be more fitting, as type syntax currently uses symbols rather than words. Plus I think it's nice how Luau separates type syntax by not making it keyword-heavy.

Edit: On second thought, `is` is nicer because it does a better job of distinguishing the guard in the example, and it is somewhat runtime-adjacent, a bit like typeof().

@alexmccord
Contributor

alexmccord commented Jun 10, 2025

Unfortunately you can't use `is` or any contextual keywords like that. See https://github.com/luau-lang/rfcs/blob/master/docs/disallow-proposals-leading-to-ambiguity-in-grammar.md. `x is T` is an extension of the `F T` case described there. `x :: T` in the return type, however, is not a bad idea. The alternative would be to wrap it in parentheses so that in syntactic type-land there is no ambiguity, but that logic was ruled out when we used :: for casting.

But, eugh. is_leaf(node) :: node :: Leaf. That's a hard pass.

Another case not being mentioned here is the ambiguity when a field of some table is a refinement target in these types. t.p :: T could refer to a type named p. For example, given local M = require("/mod.luau") where M exports some type function F<T>, parsing an arbitrary expression upfront becomes ambiguous for M.F<"hello">.

We would therefore need to hand-write new syntax that limits what gets parsed to just identifiers, no fields. This also avoids the left recursion problem when reusing :: as a type and trying to parse any arbitrary expression.
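
The `F T` ambiguity can be seen in code that is already valid today. This is a minimal sketch, assuming a program that happens to define a function named `is` (a hypothetical name chosen to match the proposed keyword):

```luau
-- `is` is an ordinary identifier today, so a program may define it as a function.
local function is(arg: any): any
    return arg
end

-- A call with a single string or table literal needs no parentheses.
local a = is "hello"         -- parsed as is("hello")
local b = is { left = 1 }    -- parsed as is({ left = 1 })

print(a, b.left)
```

Since `is "hello"` is already a complete call, a parser that has just read the return type `x` cannot tell whether the next `is` continues the type or starts the first statement of the function body.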

```lua
type Tree = { value: number, left: Tree?, right: Tree? } | nil

function isLeaf(x: Tree): x is { left: nil, right: nil }
Contributor

@alexmccord alexmccord Jun 10, 2025

Yeah, see this part. It parses as a function call for some global `is`, with a table literal whose fields are obviously not a valid parse.

function isLeaf(x: Tree): x
  is({ left: nil, right: nil }) -- error
  -- no backtracking, and ambiguous even if
  -- there were for some types, like strings.
  return ...
end

Author

This is an outstandingly good point and I feel silly for not having noticed it. Do you think foo(...): (x is Bar) would work? It feels a little icky, but might be the simplest approach. I'm not sure I'd want to introduce any new special symbols for this, and a type function wouldn't properly convey the special semantics.


Assigning the value of a type guard to a variable (`local foo = isCat(x)`) or its use in a more complex expression (`foo(isCat(x))`) is not disallowed, though the predicate returned by the type guard is demoted to a simple boolean value and no longer serves to narrow the type of the subject variable. It is suggested that lint rules may be used to warn about these cases.

Type guard functions are not permitted to have multiple return values. `function foo(x): (x is number, string)` is disallowed, as is `function foo(x): (x is number, x is string)`.
Contributor

@alexmccord alexmccord Jun 10, 2025

Looks a bit like an arbitrary choice.

function foo(x): (x is number, x is string)
  return typeof(x) == "number", typeof(x) == "string"
end

function bar(x: unknown)
  local is_num, is_str = foo(x)
  if is_num then
    -- x : number
  elseif is_str then
    -- x : string
  else
    -- x : ~number & ~string
  end
end

Author

I wanted to allow predicates to be stored as normal variables, but consider what happens if you make a modification to x after the call to foo but before the use of your predicate booleans. Your predicates are now meaningless. We can't simply track things that might change, as the purpose of type guards is to also allow more complex logic than can be trivially identified by the type narrowing system. This is why the use of type guard functions was restricted exclusively to control flow statements, and as such any return values other than the first would be ignored. Disallowing multiple return values nips this footgun in the bud.

Contributor

@alexmccord alexmccord Jun 11, 2025

I think that's actually okay. Type refinements store a proposition to refine a specific version of a variable, so even if you update some variable and then apply an outdated proposition, it doesn't affect any variable whose version did not match. This gives us the effect of invalidating any outdated propositions (even if the type system will still commit those refinements).

end
```

This is an acceptable compromise, as situations like these are uncommon. We also have no guarantee that `pet` was not further mutated to remove `bark()`, so `wasDog` can no longer be relied upon. `isDog()` could instead be replaced with a `canBark()` type guard, giving `if isCat(pet) and canBark(pet) then`.
Contributor

@alexmccord alexmccord Jun 10, 2025

Yeah, effect systems are necessary to know which refinements survive any invalidation from side effects, but the top effect in the lattice of effects describes every effect. This means all functions, indexing, even something as basic as equality, will invalidate all refinements in relation to any globals and possibly all locals (except primitives that don't have mutation, e.g. strings and numbers, but not tables) since they could be reachable by some insidious function that mutates everything. Obviously that's just unusable, so you need to make a pragmatic call to assume the user isn't doing anything crazy.

Author

This is why the decision was made to exclusively allow type guard calls to be used in control flow (and similar statements like assert). I was unable to come up with any system of rules to allow their use elsewhere that wouldn't immediately break if you looked at it wrong; incorrect types are more dangerous than no types at all.
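
The staleness problem discussed in this thread can be reproduced with plain booleans in current Luau. In this sketch the `Pet` type and the body of `isDog` are assumptions for illustration; the RFC text only names `pet`, `wasDog` and `isDog()`.

```luau
-- Hypothetical shape: a pet that may or may not be able to bark.
type Pet = { bark: (() -> string)? }

-- Stand-in for a type guard; today it is just a boolean-returning function.
local function isDog(p: Pet): boolean
    return p.bark ~= nil
end

local pet: Pet = { bark = function() return "woof" end }

local wasDog = isDog(pet)   -- computed while pet can still bark
pet.bark = nil              -- a later mutation removes bark()

-- wasDog still holds the old answer; a fresh call reflects the mutation.
print(wasDog, isDog(pet))
```

If a stored `wasDog` were allowed to narrow `pet`, the narrowed type would claim `bark` exists after it has been removed, which is the case the restriction to control flow is meant to rule out.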

@deviaze

deviaze commented Jun 11, 2025

Looking at this RFC again, I don't like that it's mixing up the runtime return type of the guard function (a boolean) with what it actually means to the user (that it refines x into Foo). Since type guard functions work more like type casts than type annotations, I feel that annotation syntax here might be confusing for users.

> In terms of as well, I think that would be cool, though potentially worth a separate RFC. In terms of instead, you then lose many of the benefits of having dedicated guard functions, so I don't think I'd want to not have those.

I think many if not most of the use cases of type guard functions can already be solved (albeit less ergonomically) by casts, putting methods like isFoo: (self: Foo) -> true on table types, using demarking fields on unions of table types { is: "Foo" } | { is: "Boo" }, etc. That's why I feel a new refinement cast syntax would be a more general-purpose ergonomic alternative to type guard functions.
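
A minimal sketch of the demarking-field approach mentioned above, as it can be written in current Luau. The `value` and `label` fields and the `describe` function are assumptions for illustration; only the `is` tag values come from the comment.

```luau
-- Each variant carries a singleton-string tag in its `is` field.
type Foo = { is: "Foo", value: number }
type Boo = { is: "Boo", label: string }
type Item = Foo | Boo

local function describe(item: Item): string
    if item.is == "Foo" then
        -- Comparing the tag refines item to Foo in this branch.
        return "Foo with value " .. tostring(item.value)
    else
        -- The remaining variant is Boo.
        return "Boo with label " .. item.label
    end
end

print(describe({ is = "Foo", value = 1 }))
print(describe({ is = "Boo", label = "ghost" }))
```

This works without new syntax, but only when the distinguishing condition is an equality test on a field the types already declare. Checks like the `.luau` suffix test earlier in the thread fall outside it.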
