Skip to content

Conversation

niacdoial
Copy link
Contributor

@niacdoial niacdoial commented Sep 6, 2025

This is the second PR in an effort to split #134697 (refactor plus overhaul of the ImproperCTypes family of lints) into individually-mergeable parts.

Contains the changes of the first PR, and splits the core type checking function into several bits, each focused on a specific aspect of FFI-safety.
Some logic which was outside of said core function was also moved into the new functions.

Superset of: #146271

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 6, 2025
Copy link
Contributor

@tgross35 tgross35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a lot of this makes sense, but aren't there behavior changes here? It looks like tuples and arrays may be treated slightly differently.

Which is probably fine, that would ideally just be split from the refactoring and come with test updates.

View changes since this review

Comment on lines +204 to +276
/// What type indirection points to a given type.
#[derive(Clone, Copy)]
enum IndirectionType {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, maybe IndirectionType -> IndirectionTy (Ty/ty is usually used to be a bit shorter and avoid conflicting with the type keyword)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm... for me "Ty" is shorthand for "mir::Ty" or "hir::Ty", but here I don't mean either of those, just the plain "type" english word. Maybe a rename to "IndirectionKind", if that was a legitimate bit of confusion?

Comment on lines +552 to +618
state: VisitorState,
outer_ty: Option<Ty<'tcx>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like outer_ty is only used for checking where tuples are used, and taking the entire Ty seems a bit heavyweight for that. Could VisitorState instead get a new flag / flags giving the needed context?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in later commits, it is used to detect, er...

  • direct-use of a ty as an argument/return type/static (here)
  • being a pointee (and through which kind of indirection) (here)
  • if a ty::Slice is indeed a slice or if it is actually a !Sized array (here)

That seems like too much state to add into VisitorState. Unless we could forward the discriminant of outer_ty.kind() and use this? We would also need the mutability when said kind is Ref/RawPtr, though.

Also IIRC "an entire Ty" is only as big as a usize, so... actually I have no idea if that's the point you were making when saying it's "heavyweight".

Copy link
Contributor

@tgross35 tgross35 Sep 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not heavyweight as in actual size, I meant the amount of information. Since you're already looking at the outer type once in the loop, it would be it would be cleaner to extract the relevant info to some boolean flags at that time rather than passing the full type and needing to re-analyze it when you recurse. I think the relevant bits for the linked changes could easily enough be flags e.g. the existing FN_RETURN / IN_MUT_REF, IN_ARRAY, IS_UNSIZED_POINTEE.

The other benefit is not dealing with Ty * Ty possibilities, which means fewer unexpected combinations to possibly bug! on.

Maybe just use a flag for this commit? And if needed later, it can be turned into a context struct.

Mainly, we realise that the non-null assumption on a Box<_> argument
does not depend on what side of the FFI boundary the function is on.
And anyway, this is not the way to deal with this assumption being maybe violated.
no visible changes to rust users, just making the inner architecture of
the ImproperCTypes lints more sensible, with a clean separation between
the struct (now singular) that interacts with the linting system
and the struct (now singular) that visits the types to check FFI-safety
No changes should be visible by rustc users
This is just some architecture changes to the type checking to
facilitate FFI-safety decisions that depend on how the type is used
(the change here is not complete, there are still bits of "legacy" state
passing for this, but since this is a retconned commit, I can tell you
those bits will disappear before the end of the commit chain)
(there is at least one bit where the decision making code is weird, but
that this is because we do not want to change the lint's behaviour this
early in the chain)
@niacdoial niacdoial force-pushed the improperctypes-refactor2 branch from 1dbb0e2 to f54061c Compare September 6, 2025 20:31
@niacdoial
Copy link
Contributor Author

I think a lot of this makes sense, but aren't there behavior changes here? It looks like tuples and arrays may be treated slightly differently.

Which is probably fine, that would ideally just be split from the refactoring and come with test updates.

I moved the actual change in behaviour in a later commit.
The rest of the changes here are just an exercise in moving more of the type-checking logic into the visit_* methods.

@rust-log-analyzer

This comment has been minimized.

@niacdoial niacdoial force-pushed the improperctypes-refactor2 branch from f54061c to 66037fd Compare September 6, 2025 21:03
@rust-log-analyzer

This comment has been minimized.

Another interal change that shouldn't impact rustc users.
The goal is to break apart the gigantic visit_type function into more
managable and easily-editable bits that focus on specific parts of FFI safety.
@niacdoial niacdoial force-pushed the improperctypes-refactor2 branch from 66037fd to 2781ebd Compare September 6, 2025 22:23
Comment on lines +435 to +441
// (mid-retcon-commit-chain comment:)
// this is the original fallback behavior, which is wrong
if let ty::Adt(def, args) = ty.kind() {
self.visit_struct_or_union(state, ty, *def, args)
} else {
bug!("ImproperCTypes: this retcon commit was badly written")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this will be committed, turn the comment into a FIXME(ctypes).

Also, make the else else if cfg!(debug_assertions). I don't think this should run into any problems, but also don't want to introduce a new ICE for an intermediate change.

Comment on lines +444 to +452
IndirectionType::Ref | IndirectionType::RawPtr => {
if (state.is_in_defined_function() || state.is_in_fnptr())
&& inner_ty.is_sized(self.cx.tcx, self.cx.typing_env())
{
FfiSafe
} else {
self.visit_type(state, Some(ty), inner_ty)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment about what the behavior is here? Just as a FIXME if it isn't exactly correct.

help: if def.is_struct() {
Some(fluent::lint_improper_ctypes_struct_layout_help)
} else {
// (FIXME(ctypes): confirm that this makes sense for unions once #60405 / RFC2645 stabilises)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// (FIXME(ctypes): confirm that this makes sense for unions once #60405 / RFC2645 stabilises)
// FIXME(ctypes): confirm that this makes sense for unions once #60405 / RFC2645 stabilises

Nit, looks like some extra parens

Comment on lines 643 to 647
// I thought CStr (not CString) here could only be reached in non-compiling code:
// - not using an indirection would cause a compile error (this lint *currently* seems to not get triggered on such non-compiling code)
// - and using one would cause the lint to catch on the indirection before reaching its pointee
// but function *pointers* don't seem to have the same no-unsized-parameters requirement to compile
if let Some(sym::cstring_type | sym::cstr_type) =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rephrase this comment in terms of the actual behavior, or what the FIXME action item is here? I'm not sure whether it's saying the current behavior makes sense, or if the lint isn't accurate here for some reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not describing the behaviour, it's just justifying that "yes, we can also encounter sym::cstr_type in this piece of code.
(I'm removing the comment for now, it's not relevant until indirections get treated differently)

Comment on lines +392 to +406
fn visit_scalar(&self, ty: Ty<'tcx>) -> FfiResult<'tcx> {
// FIXME(ctypes): for now, this is very incomplete, and seems to assume a x86_64 target
match ty.kind() {
// note: before rust 1.77, 128-bit ints were not FFI-safe on x86_64
ty::Int(..) | ty::Uint(..) | ty::Float(..) => FfiResult::FfiSafe,
ty::Bool => FfiResult::FfiSafe,

ty::Char => FfiResult::FfiUnsafe {
ty,
reason: fluent::lint_improper_ctypes_char_reason,
help: Some(fluent::lint_improper_ctypes_char_help),
},
_ => bug!("visit_scalar is to be called with scalar (char, int, float, bool) types"),
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind linking to the changes this helps facilitate?

Thinking about this a bit more; bool is always FFI-safe and char is always FFI unsafe, so there probably isn't any reason not to move them out of this function into the big match.

(sorry for a bit of back and forth here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two things that visit_scalar would facilitate, but neither are implemented just yet.

  • look at the safety of these things outside of x86_64
  • lint on possibly-broken value assumptions (if you have a reference or a pattern type as the argument of a extern "C" fn for instance). The new visit_pattern logic would then submit the base type to visit_scalar.

Should I put this as a comment in the function?

that being said, yes I can move bool and char out of there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

look at the safety of these things outside of x86_64

Which platforms is this a problem for? As far as I know, there shouldn't be any more known incompatibilities for scalars - and if there are, we should probably just fix them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I... huh.
I kinda assumed there would be some and put off looking into them.

I expected there would be similar situations to u128-on-x86_64 where a type is defined in software (CPU registers can't hold it) and part of the standard but handled differently by different compiler/linker/stdlib stacks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not aware of any remaining problems with the stable types, so I don’t think there is anything to account for here. I.e., visit_scalar can probably remain inlined. Technically there are ABI mismatches for f16 and f128, but we’re working to fix those before stabilization so won’t be linting on them.

where a type is defined in software (CPU registers can't hold it) and part of the standard but handled differently by different compiler/linker/stdlib stacks.

Fwiw whether or not a type fits in registers doesn’t really determine the compatibility level; the ABI provides a spec for what to do (sometimes types that could fit in registers aren’t even passed that way). The problem with x86 i128 is that there the ABI specified what to do and LLVM wasn’t following it. (Not intending to self-promote, but my post on the subject is worth a read if you haven’t seen it https://blog.rust-lang.org/2024/03/30/i128-layout-update/).

There are platforms that don’t specify __int128 in the ABI (like x86-32). We also won’t be linting on that because GCC/Clang don’t let you use __int128 here, so there isn’t anything to be compatible with (bit more in the last paragraph of the PR description at #137306)

Comment on lines +552 to +618
state: VisitorState,
outer_ty: Option<Ty<'tcx>>,
Copy link
Contributor

@tgross35 tgross35 Sep 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not heavyweight as in actual size, I meant the amount of information. Since you're already looking at the outer type once in the loop, it would be it would be cleaner to extract the relevant info to some boolean flags at that time rather than passing the full type and needing to re-analyze it when you recurse. I think the relevant bits for the linked changes could easily enough be flags e.g. the existing FN_RETURN / IN_MUT_REF, IN_ARRAY, IS_UNSIZED_POINTEE.

The other benefit is not dealing with Ty * Ty possibilities, which means fewer unexpected combinations to possibly bug! on.

Maybe just use a flag for this commit? And if needed later, it can be turned into a context struct.

@tgross35
Copy link
Contributor

tgross35 commented Sep 7, 2025

I moved the actual change in behaviour in a later commit.
The rest of the changes here are just an exercise in moving more of the type-checking logic into the visit_* methods.

"split type visiting into subfunctions" still has some changes right? Array went from just checking the type to checking whether or not it is in a function. Which is probably a reasonable change to make, it should just be its own thing (and come with a test update).

(Possible I'm missing something here)

@niacdoial
Copy link
Contributor Author

Array went from just checking the type to checking whether or not it is in a function.

That's a bit of logic that was moved from check_type to visit_type

Comment on lines +691 to +697
match outer_ty.map(|ty| ty.kind()) {
// C functions can return void
None | Some(ty::FnPtr(..)) => state.is_in_function_return(),
// most of those are not even reachable,
// but let's not worry about checking that here
_ => false,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is is_in_function_return ever set when the outer type isn't FnPtr? Seems like it should be sufficient to check only that without the match.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's the subtlety here:
if you check, say extern "C" fn create_something() -> &():

  • it will start by looking at the &() type, for which outer_ty is None or Some(ty::FnPtr) (approximately) and state.is_in_function_return() is true.
  • Then, if it looks at the pointee, (), state.is_in_function_return() remains true, but outer_ty becomes Some(&()) (approximately).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants