-
Notifications
You must be signed in to change notification settings - Fork 13.7k
lint ImproperCTypes: refactor linting architecture (part 2) #146273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a lot of this makes sense, but aren't there behavior changes here? It looks like tuples and arrays may be treated slightly differently.
Which is probably fine, that would ideally just be split from the refactoring and come with test updates.
/// What type indirection points to a given type. | ||
#[derive(Clone, Copy)] | ||
enum IndirectionType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, maybe IndirectionType
-> IndirectionTy
(Ty
/ty
is usually used to be a bit shorter and avoid conflicting with the type
keyword)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm... for me "Ty" is shorthand for "mir::Ty" or "hir::Ty", but here I don't mean either of those, just the plain "type" english word. Maybe a rename to "IndirectionKind", if that was a legitimate bit of confusion?
state: VisitorState, | ||
outer_ty: Option<Ty<'tcx>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like outer_ty
is only used for checking where tuples are used, and taking the entire Ty
seems a bit heavyweight for that. Could VisitorState
instead get a new flag / flags giving the needed context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in later commits, it is used to detect, er...
- direct-use of a ty as an argument/return type/static (here)
- being a pointee (and through which kind of indirection) (here)
- if a ty::Slice is indeed a slice or if it is actually a !Sized array (here)
That seems like too much state to add into VisitorState. Unless we could forward the discriminant of outer_ty.kind()
and use this? We would also need the mutability when said kind is Ref
/RawPtr
, though.
Also IIRC "an entire Ty" is only as big as a usize, so... actually I have no idea if that's the point you were making when saying it's "heavyweight".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not heavyweight as in actual size, I meant the amount of information. Since you're already looking at the outer type once in the loop, it would be it would be cleaner to extract the relevant info to some boolean flags at that time rather than passing the full type and needing to re-analyze it when you recurse. I think the relevant bits for the linked changes could easily enough be flags e.g. the existing FN_RETURN
/ IN_MUT_REF
, IN_ARRAY
, IS_UNSIZED_POINTEE
.
The other benefit is not dealing with Ty * Ty
possibilities, which means fewer unexpected combinations to possibly bug!
on.
Maybe just use a flag for this commit? And if needed later, it can be turned into a context struct.
Mainly, we realise that the non-null assumption on a Box<_> argument does not depend on what side of the FFI boundary the function is on. And anyway, this is not the way to deal with this assumption being maybe violated.
no visible changes to rust users, just making the inner architecture of the ImproperCTypes lints more sensible, with a clean separation between the struct (now singular) that interacts with the linting system and the struct (now singular) that visits the types to check FFI-safety
No changes should be visible by rustc users This is just some architecture changes to the type checking to facilitate FFI-safety decisions that depend on how the type is used (the change here is not complete, there are still bits of "legacy" state passing for this, but since this is a retconned commit, I can tell you those bits will disappear before the end of the commit chain) (there is at least one bit where the decision making code is weird, but that this is because we do not want to change the lint's behaviour this early in the chain)
1dbb0e2
to
f54061c
Compare
I moved the actual change in behaviour in a later commit. |
This comment has been minimized.
This comment has been minimized.
f54061c
to
66037fd
Compare
This comment has been minimized.
This comment has been minimized.
Another interal change that shouldn't impact rustc users. The goal is to break apart the gigantic visit_type function into more managable and easily-editable bits that focus on specific parts of FFI safety.
66037fd
to
2781ebd
Compare
// (mid-retcon-commit-chain comment:) | ||
// this is the original fallback behavior, which is wrong | ||
if let ty::Adt(def, args) = ty.kind() { | ||
self.visit_struct_or_union(state, ty, *def, args) | ||
} else { | ||
bug!("ImproperCTypes: this retcon commit was badly written") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this will be committed, turn the comment into a FIXME(ctypes).
Also, make the else
else if cfg!(debug_assertions)
. I don't think this should run into any problems, but also don't want to introduce a new ICE for an intermediate change.
IndirectionType::Ref | IndirectionType::RawPtr => { | ||
if (state.is_in_defined_function() || state.is_in_fnptr()) | ||
&& inner_ty.is_sized(self.cx.tcx, self.cx.typing_env()) | ||
{ | ||
FfiSafe | ||
} else { | ||
self.visit_type(state, Some(ty), inner_ty) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment about what the behavior is here? Just as a FIXME if it isn't exactly correct.
help: if def.is_struct() { | ||
Some(fluent::lint_improper_ctypes_struct_layout_help) | ||
} else { | ||
// (FIXME(ctypes): confirm that this makes sense for unions once #60405 / RFC2645 stabilises) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// (FIXME(ctypes): confirm that this makes sense for unions once #60405 / RFC2645 stabilises) | |
// FIXME(ctypes): confirm that this makes sense for unions once #60405 / RFC2645 stabilises |
Nit, looks like some extra parens
// I thought CStr (not CString) here could only be reached in non-compiling code: | ||
// - not using an indirection would cause a compile error (this lint *currently* seems to not get triggered on such non-compiling code) | ||
// - and using one would cause the lint to catch on the indirection before reaching its pointee | ||
// but function *pointers* don't seem to have the same no-unsized-parameters requirement to compile | ||
if let Some(sym::cstring_type | sym::cstr_type) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you rephrase this comment in terms of the actual behavior, or what the FIXME action item is here? I'm not sure whether it's saying the current behavior makes sense, or if the lint isn't accurate here for some reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not describing the behaviour, it's just justifying that "yes, we can also encounter sym::cstr_type in this piece of code.
(I'm removing the comment for now, it's not relevant until indirections get treated differently)
fn visit_scalar(&self, ty: Ty<'tcx>) -> FfiResult<'tcx> { | ||
// FIXME(ctypes): for now, this is very incomplete, and seems to assume a x86_64 target | ||
match ty.kind() { | ||
// note: before rust 1.77, 128-bit ints were not FFI-safe on x86_64 | ||
ty::Int(..) | ty::Uint(..) | ty::Float(..) => FfiResult::FfiSafe, | ||
ty::Bool => FfiResult::FfiSafe, | ||
|
||
ty::Char => FfiResult::FfiUnsafe { | ||
ty, | ||
reason: fluent::lint_improper_ctypes_char_reason, | ||
help: Some(fluent::lint_improper_ctypes_char_help), | ||
}, | ||
_ => bug!("visit_scalar is to be called with scalar (char, int, float, bool) types"), | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind linking to the changes this helps facilitate?
Thinking about this a bit more; bool
is always FFI-safe and char
is always FFI unsafe, so there probably isn't any reason not to move them out of this function into the big match
.
(sorry for a bit of back and forth here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two things that visit_scalar would facilitate, but neither are implemented just yet.
- look at the safety of these things outside of x86_64
- lint on possibly-broken value assumptions (if you have a reference or a pattern type as the argument of a
extern "C" fn
for instance). The new visit_pattern logic would then submit the base type to visit_scalar.
Should I put this as a comment in the function?
that being said, yes I can move bool and char out of there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
look at the safety of these things outside of x86_64
Which platforms is this a problem for? As far as I know, there shouldn't be any more known incompatibilities for scalars - and if there are, we should probably just fix them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I... huh.
I kinda assumed there would be some and put off looking into them.
I expected there would be similar situations to u128-on-x86_64 where a type is defined in software (CPU registers can't hold it) and part of the standard but handled differently by different compiler/linker/stdlib stacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m not aware of any remaining problems with the stable types, so I don’t think there is anything to account for here. I.e., visit_scalar can probably remain inlined. Technically there are ABI mismatches for f16 and f128, but we’re working to fix those before stabilization so won’t be linting on them.
where a type is defined in software (CPU registers can't hold it) and part of the standard but handled differently by different compiler/linker/stdlib stacks.
Fwiw whether or not a type fits in registers doesn’t really determine the compatibility level; the ABI provides a spec for what to do (sometimes types that could fit in registers aren’t even passed that way). The problem with x86 i128 is that there the ABI specified what to do and LLVM wasn’t following it. (Not intending to self-promote, but my post on the subject is worth a read if you haven’t seen it https://blog.rust-lang.org/2024/03/30/i128-layout-update/).
There are platforms that don’t specify __int128 in the ABI (like x86-32). We also won’t be linting on that because GCC/Clang don’t let you use __int128 here, so there isn’t anything to be compatible with (bit more in the last paragraph of the PR description at #137306)
state: VisitorState, | ||
outer_ty: Option<Ty<'tcx>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not heavyweight as in actual size, I meant the amount of information. Since you're already looking at the outer type once in the loop, it would be it would be cleaner to extract the relevant info to some boolean flags at that time rather than passing the full type and needing to re-analyze it when you recurse. I think the relevant bits for the linked changes could easily enough be flags e.g. the existing FN_RETURN
/ IN_MUT_REF
, IN_ARRAY
, IS_UNSIZED_POINTEE
.
The other benefit is not dealing with Ty * Ty
possibilities, which means fewer unexpected combinations to possibly bug!
on.
Maybe just use a flag for this commit? And if needed later, it can be turned into a context struct.
"split type visiting into subfunctions" still has some changes right? (Possible I'm missing something here) |
That's a bit of logic that was moved from |
match outer_ty.map(|ty| ty.kind()) { | ||
// C functions can return void | ||
None | Some(ty::FnPtr(..)) => state.is_in_function_return(), | ||
// most of those are not even reachable, | ||
// but let's not worry about checking that here | ||
_ => false, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is is_in_function_return
ever set when the outer type isn't FnPtr
? Seems like it should be sufficient to check only that without the match
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's the subtlety here:
if you check, say extern "C" fn create_something() -> &()
:
- it will start by looking at the
&()
type, for whichouter_ty
isNone
orSome(ty::FnPtr)
(approximately) andstate.is_in_function_return()
is true. - Then, if it looks at the pointee,
()
,state.is_in_function_return()
remains true, butouter_ty
becomesSome(&())
(approximately).
This is the second PR in an effort to split #134697 (refactor plus overhaul of the ImproperCTypes family of lints) into individually-mergeable parts.
Contains the changes of the first PR, and splits the core type checking function into several bits, each focused on a specific aspect of FFI-safety.
Some logic which was outside of said core function was also moved into the new functions.
Superset of: #146271