jorisdral
Collaborator

@jorisdral jorisdral commented Sep 25, 2025

Resolves #1142.
Resolves #1050.

This PR builds on top of PR #1139, so that one should be merged first.

This PR is an experiment to come up with a more principled approach to erasing typedefs from C types where necessary. One such case arises when we want to check whether a type is a function type, so that we can decide whether to use Ptr or FunPtr in the Haskell code for the bindings. To do that reliably we should erase typedefs and const qualifiers.

Most of the effort in this PR goes into adding some type safety when inspecting C types. That is, to check whether a type is a canonical function type, one first maps the full C type to a canonical type and then checks whether that canonical type is a function type. There is currently nothing preventing us from inspecting the full C type when we should be inspecting the corresponding canonical C type instead, but the types should hopefully make it clearer that the mapping has to come before the inspection.
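To make the "map first, then inspect" idea concrete, here is a minimal sketch. It is not the code from this PR: `CType`, `Canonical`, `canonicalise`, `isFunction`, `PtrKind` and `pointerKind` are all hypothetical names, and the toy type has far fewer constructors than the real one. The only point is that the function-type check accepts a canonical type, so a caller cannot skip the mapping by accident.

```haskell
-- A toy C type representation (hypothetical, much smaller than the real one).
data CType
  = TInt
  | TPointer CType
  | TFunction [CType] CType   -- argument types, result type
  | TConst CType              -- const-qualified type
  | TTypedef String CType     -- typedef name, annotated with its underlying type
  deriving (Eq, Show)

-- A C type with typedefs and const qualifiers removed. In a real module the
-- constructor would stay unexported, so 'canonicalise' is the only way in.
newtype Canonical = Canonical CType
  deriving (Eq, Show)

-- Strip typedefs and const qualifiers at every level of the type.
canonicalise :: CType -> Canonical
canonicalise = Canonical . go
  where
    go TInt             = TInt
    go (TPointer t)     = TPointer (go t)
    go (TFunction as r) = TFunction (map go as) (go r)
    go (TConst t)       = go t
    go (TTypedef _ t)   = go t

-- The check only accepts canonical types.
isFunction :: Canonical -> Bool
isFunction (Canonical (TFunction _ _)) = True
isFunction _                           = False

-- Which Haskell pointer type to use for a pointer to the given pointee type.
data PtrKind = UsePtr | UseFunPtr
  deriving (Eq, Show)

pointerKind :: CType -> PtrKind
pointerKind pointee
  | isFunction (canonicalise pointee) = UseFunPtr
  | otherwise                         = UsePtr
```

With this sketch, a pointee such as `TTypedef "callback_t" (TConst (TFunction [TInt] TInt))` is classified as a function type even though the function constructor is hidden behind a typedef and a const qualifier.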

There are two commits in this PR, each with a separate attempt at implementing this. Attempt 1 used higher-kinded datatypes, which I didn't like so much, and it is not as complete as Attempt 2. Attempt 2 uses a Trees That Grow style, though it's more like a Trees That Shrink style.
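For readers unfamiliar with the style, the following is a rough sketch of what a "Trees That Shrink" encoding can look like. It is an illustration under assumptions, not the datatype from Attempt 2: the `Stage` tags, the `XTypedef` type family and the constructor names are all made up here. The typedef constructor carries an extension field that is `()` for full types and `Void` for erased types, so an erased type can never contain a typedef.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)
import Data.Void (Void, absurd)

-- Hypothetical type-level tags: a full C type versus one without typedefs.
data Stage = Full | Erased

-- Extension field of the typedef constructor: inhabited for 'Full',
-- uninhabited for 'Erased'. This is what makes the tree "shrink".
type family XTypedef (s :: Stage) :: Type where
  XTypedef 'Full   = ()
  XTypedef 'Erased = Void

data CType (s :: Stage)
  = TInt
  | TPointer (CType s)
  | TFunction [CType s] (CType s)
  | TTypedef !(XTypedef s) String (CType s)  -- strict, so no lazy bottom can sneak in

-- Erasure is local: each typedef reference already carries its underlying
-- type, so no lookup table of typedef definitions is needed.
eraseTypedefs :: CType 'Full -> CType 'Erased
eraseTypedefs TInt              = TInt
eraseTypedefs (TPointer t)      = TPointer (eraseTypedefs t)
eraseTypedefs (TFunction as r)  = TFunction (map eraseTypedefs as) (eraseTypedefs r)
eraseTypedefs (TTypedef () _ t) = eraseTypedefs t

-- On erased types the typedef case is statically impossible.
isFunction :: CType 'Erased -> Bool
isFunction TInt             = False
isFunction (TPointer _)     = False
isFunction (TFunction _ _)  = True
isFunction (TTypedef v _ _) = absurd v
```

The design choice illustrated here is that a single datatype serves both stages; only the tag changes, and the `absurd` case documents that the constructor cannot occur.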

@jorisdral jorisdral changed the title from "Jdral/erase typedefs" to "A principled approach to erasing typedefs from C types (and canonicalising C types)" Sep 25, 2025
@jorisdral jorisdral marked this pull request as ready for review September 25, 2025 17:57
@jorisdral jorisdral requested a review from edsko September 25, 2025 18:01
@jorisdral jorisdral self-assigned this Sep 25, 2025
@bolt12 bolt12 force-pushed the bolt12/1135 branch 2 times, most recently from 26816dd to 477c7ec Compare September 26, 2025 10:44
@bolt12 bolt12 force-pushed the bolt12/1135 branch 7 times, most recently from 5f5adbd to 58f3462 Compare October 1, 2025 10:37
Base automatically changed from bolt12/1135 to main October 1, 2025 11:06
@bolt12
Collaborator

bolt12 commented Oct 1, 2025

#1139 merged :)

@jorisdral jorisdral force-pushed the jdral/erase-typedefs branch 3 times, most recently from 4295aff to feffc1c Compare October 2, 2025 15:47
@jorisdral jorisdral linked an issue Oct 2, 2025 that may be closed by this pull request
@jorisdral jorisdral force-pushed the jdral/erase-typedefs branch from feffc1c to b8082f1 Compare October 9, 2025 08:05
Collaborator

@edsko edsko left a comment

Nice, feel free to merge this once the changes we discussed are done.

For the record, dealing correctly with macros is outsourced to #1200.

Collaborator Author

@jorisdral jorisdral left a comment

Comments from a review together with @edsko

Comment on lines 382 to 386
C.TypeTypedef (TypedefRefRegular nm uTy) -> do
  (uTyDepIds, uTy') <- resolve uTy
  let mkTypeTypedef = C.TypeTypedef . flip TypedefRefRegular uTy'
  (tdDepIds, td) <- auxN mkTypeTypedef nm C.NameKindOrdinary
  pure (uTyDepIds `Set.union` tdDepIds, td)
Collaborator Author

Suggested change
- C.TypeTypedef (TypedefRefRegular nm uTy) -> do
-   (uTyDepIds, uTy') <- resolve uTy
-   let mkTypeTypedef = C.TypeTypedef . flip TypedefRefRegular uTy'
-   (tdDepIds, td) <- auxN mkTypeTypedef nm C.NameKindOrdinary
-   pure (uTyDepIds `Set.union` tdDepIds, td)
+ C.TypeTypedef (TypedefRefRegular nm uTy) -> do
+   (_uTyDepIds, uTy') <- resolve uTy
+   let mkTypeTypedef = C.TypeTypedef . flip TypedefRefRegular uTy'
+   (tdDepIds, td) <- auxN mkTypeTypedef nm C.NameKindOrdinary
+   pure (tdDepIds, td)

Collaborator Author

@TravisCardwell does it make sense that we do not include the dependencies of the underlying type in the result?

Collaborator Author

@jorisdral jorisdral Oct 10, 2025

Some context: references to typedefs are annotated with the type Type p underlying the typedef. Applying pass p' after p, we still have to convert the annotation from Type p to Type p'. We do this here by calling resolve on the annotation, but not returning the dependencies.
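A simplified, pure sketch of that situation follows. It is an assumption-laden illustration rather than the actual pass: the real `resolve` is monadic and works on the real `Type`, whereas `CType`, `DeclId` and the constructors below are invented for the example. Only the pass names `NameAnon` and `ResolveBindingSpecs` come from this discussion. The typedef case recurses into the annotation purely to convert it to the next pass, and drops the dependencies that the recursion reports.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

import Data.Set (Set)
import qualified Data.Set as Set

-- Passes as type-level tags (names taken from the discussion).
data Pass = NameAnon | ResolveBindingSpecs

-- Hypothetical declaration identifiers, keeping the struct and typedef
-- namespaces apart.
data DeclId = StructId String | TypedefId String
  deriving (Eq, Ord, Show)

-- A toy C type indexed by pass.
data CType (p :: Pass)
  = TInt
  | TStruct String             -- reference to a struct declaration
  | TPointer (CType p)
  | TTypedef String (CType p)  -- typedef reference, annotated with its underlying type

-- Convert a type to the next pass, returning the declarations that the
-- use site depends on directly.
resolve :: CType 'NameAnon -> (Set DeclId, CType 'ResolveBindingSpecs)
resolve TInt         = (Set.empty, TInt)
resolve (TStruct n)  = (Set.singleton (StructId n), TStruct n)
resolve (TPointer t) =
  let (deps, t') = resolve t
  in  (deps, TPointer t')
resolve (TTypedef n u) =
  -- Recurse only to convert the annotation; its dependencies belong to the
  -- typedef declaration, not to this use site, so they are discarded.
  let (_underlyingDeps, u') = resolve u
  in  (Set.singleton (TypedefId n), TTypedef n u')
```

For a field of type `TTypedef "point" (TStruct "point")`, this sketch reports a dependency on the typedef only.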

@jorisdral jorisdral force-pushed the jdral/erase-typedefs branch 2 times, most recently from b7386ff to 8761811 Compare October 10, 2025 08:33
So that we can check whether we erase typedefs successfully in later commits.
This makes typedef erasure a local operation, and now we no longer need a
mapping of typedef names to typedef definitions.

Moreover, we add type-level tags to the `Type` (representing C types) datatype
in the backend. These tags allow us to statically guarantee that some type
constructs are omitted from the `Type` datatype. We add simple algorithms to map
full C `Type`s to erased (no typedefs) and canonical (no typedefs and const
qualifiers) types. These erased/canonical types and their mappings are now used
instead of ad-hoc typedef erasure using incomplete information.
@jorisdral jorisdral force-pushed the jdral/erase-typedefs branch from 8761811 to 768950a Compare October 10, 2025 08:37
@jorisdral jorisdral enabled auto-merge October 10, 2025 08:44
@jorisdral jorisdral added this pull request to the merge queue Oct 10, 2025
Merged via the queue into main with commit 8d11386 Oct 10, 2025
16 checks passed
@jorisdral jorisdral deleted the jdral/erase-typedefs branch October 10, 2025 09:05
@TravisCardwell
Collaborator

The Set C.QualPrelimDeclId returned are dependencies of the declaration currently being processed by resolveDeep, as represented by edges in the UseDeclGraph.

Example:

typedef struct point {
    double x;
    double y;
} point;

struct rect {
    point tl;
    point br;
};

graph TD;
  v1["PrelimDeclIdNamed &quot;point&quot;"]
  v0["PrelimDeclIdNamed &quot;point&quot;"]
  v2["PrelimDeclIdNamed &quot;rect&quot;"]
  v1-->|"UsedInTypedef ByValue"|v0
  v2-->|"UsedInField ByValue &quot;br&quot;"|v1
  v2-->|"UsedInField ByValue &quot;tl&quot;"|v1
Imagine the resolveDeep call for the struct rect declaration. The two fields are typedefs, so they are processed by this code. Why are we recursing on the underlying type? I do not think doing so is correct, as we are processing the use sites of external bindings in struct rect, not the typedefs themselves.

Returning the dependencies of the underlying type is indeed not necessary, as there are no edges between PrelimDeclIdNamed "rect" and the bottom PrelimDeclIdNamed "point".
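The distinction between direct edges and reachability can be sketched with a small adjacency map. This is only an illustration: it does not use the real UseDeclGraph type, the vertex names are made up, and it assumes that the bottom vertex in the graph above stands for struct point.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Set (Set)
import qualified Data.Set as Set

-- Hypothetical vertices for the example declarations.
data Vertex = StructPoint | TypedefPoint | StructRect
  deriving (Eq, Ord, Show)

-- Each declaration maps to the declarations it uses directly.
type Graph = Map Vertex (Set Vertex)

exampleGraph :: Graph
exampleGraph = Map.fromList
  [ (TypedefPoint, Set.fromList [StructPoint])   -- typedef point uses struct point
  , (StructRect,   Set.fromList [TypedefPoint])  -- fields tl and br use typedef point
  ]

-- Direct dependencies: what a single declaration contributes as edges.
directDeps :: Graph -> Vertex -> Set Vertex
directDeps g v = Map.findWithDefault Set.empty v g

-- Everything reachable through one or more edges.
transitiveDeps :: Graph -> Vertex -> Set Vertex
transitiveDeps g = go Set.empty . Set.toList . directDeps g
  where
    go seen []       = seen
    go seen (v : vs)
      | v `Set.member` seen = go seen vs
      | otherwise           = go (Set.insert v seen) (Set.toList (directDeps g v) ++ vs)
```

In this sketch `directDeps exampleGraph StructRect` contains only the typedef, while `transitiveDeps exampleGraph StructRect` also reaches the struct through the typedef's own edge.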

Hmmm, that UseDeclGraph does not distinguish between struct point and point. Perhaps it is an issue in what we are displaying? Hopefully it is not a bug.

BTW, I implemented the hs-bindgen-cli info use-decl-graph command in order to get the above. I will throw it into a PR really quick so that you can use it as well.

@edsko
Collaborator

edsko commented Oct 10, 2025

Not traversing isn't really an option, unless we do some type trickery. Perhaps we should have a chat about this with the three of us.

@jorisdral
Collaborator Author

I don't understand the resolve bindings spec pass well enough to understand why it would be incorrect to recurse into the underlying type. Is it harmful? We're mostly piggy-backing off of the recursive call to resolve so that we get a Type ResolveBindingSpecs from a Type NameAnon.

@TravisCardwell
Collaborator

Ack! Recursing on the underlying type is indeed needed when processing the typedef itself, such as point in the above example. 🙇

Development

Successfully merging this pull request may close these issues.

Come up with a better abstraction for looking through Typedef inner types Should we panic on unbound typedefs?

4 participants