Conversation

@sergisiso
Collaborator

This is a step towards reducing the internal state of Signatures so that they operate on top of the PSyIR. In this PR I removed the ComponentIndices class (and associated infrastructure) and the corresponding argument of add_accesses. Instead I moved the logic to a method of Reference. So now, we just need to do:

access_sequence.add_access(AccessType.READ, ArrayReference(Symbol("a")))

and once the component indices are needed the AccessSequence can do self.node.component_indices()
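For concreteness, here is a minimal sketch of the component_indices() side of this (the Fortran snippet, the symbol names and the exact printed representation are illustrative; only Reference.component_indices() is the method added by this PR):

    from psyclone.psyir.frontend.fortran import FortranReader
    from psyclone.psyir.nodes import Assignment

    CODE = """
    subroutine demo()
      type :: grid_type
        real, dimension(10, 10) :: b
      end type grid_type
      type(grid_type) :: a
      integer :: j, k
      a%b(j, k + 1) = 0.0
    end subroutine demo
    """

    psyir = FortranReader().psyir_from_source(CODE)
    # The LHS of the assignment is the StructureReference for a%b(j, k+1).
    lhs = psyir.walk(Assignment)[0].lhs

    # Expected shape: one tuple per component - an empty tuple for 'a'
    # (no indices) and a two-element tuple holding the PSyIR index
    # expressions j and k + 1 of component 'b'.
    print(lhs.component_indices())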

@codecov

codecov bot commented Sep 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.90%. Comparing base (2611d11) to head (5127dcc).
⚠️ Report is 44 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3164      +/-   ##
==========================================
- Coverage   99.90%   99.90%   -0.01%     
==========================================
  Files         376      374       -2     
  Lines       53048    52823     -225     
==========================================
- Hits        52996    52771     -225     
  Misses         52       52              


@sergisiso
Collaborator Author

@arporter @hiker This is ready for review. It was initially a PR working towards #2424, but I found that it may close #1320, as the component indices now contain the actual PSyIR nodes from the tree rather than strings. So I changed the title to close that one instead.

@sergisiso sergisiso requested review from arporter and hiker September 30, 2025 14:29
@sergisiso sergisiso self-assigned this Sep 30, 2025
@sergisiso sergisiso changed the title from "(towards #2424) Convert component indices to method" to "(closes #1320) Convert component indices to method" Sep 30, 2025
Member

@arporter arporter left a comment


Thanks Sergi, that's an impressive amount of code removed.
While I think it's a considerable improvement, I'm worried that it might break some of @hiker's workflows that aren't in our test suite. If so, we'll need to think about how we can handle that.

# The stencil is 100, 110, 123 - test that appropriate
# accesses were added for each direction
expected = {
# First stencil direction of 123: 1
Member


So we no longer have information on the stencil accesses of this kernel as provided by the metadata? Obviously this hasn't broken anything in the test suite, but does it break anything for @hiker (e.g. in training/tutorials)?

Collaborator Author


Let me know if it does and I can have a look

@arporter
Member

arporter commented Oct 6, 2025

I've set the integration tests going as that will help to set my mind at rest :-)

@sergisiso
Collaborator Author

@arporter This is ready for another review.

While I think it's a considerable improvement, I'm worried that it might break some of @hiker's workflows that aren't in our test suite. If so, we'll need to think about how we can handle that.

I am happy to have a look if something is reported; generally this can be solved by inlining. But I also have some ideas about how we can handle it without inlining.

Member

@arporter arporter left a comment


Thanks Sergi, very nearly there now.
I've pinged @hiker in Teams in case there's anything relating to GOcean stencils not covered by our test suite.
Just a bit of tidying to do.

a tuple of each index expression in that compoment. For example,
for a scalar it returns `(())`, for `a%b` it returns ((),()) - two
components with 0 indicies in each, and for `a(i)%b(j,k+1)` it
:returns: a tuple of tuples of index expressions; one for every
Member


To save confusion, I think it would be better to specialise this text for the class in question. i.e. "...expressions; since Member has no indices we return a tuple containing an empty tuple."

a tuple of each index expression in that compoment. For example,
for a scalar it returns `(())`, for `a%b` it returns ((),()) - two
components with 0 indicies in each, and for `a(i)%b(j,k+1)` it
:returns: a tuple of tuples of index expressions; one for every
Member


Again, please specialise this text (can be the same as Member).

:param set_of_vars: set with name of all variables.
:return: a list of sets with all variables used in the
corresponding array subscripts as strings.
Member


Strictly, this returns a list of lists of tuples now. Also the type-hint on the return type doesn't match.

for i1, idx_exprs in enumerate(comp_ind1):
for i2, _ in enumerate(idx_exprs):
try:
partition_infos.append(
Member


The description at L300 needs updating now that the content of partition_infos has changed.


# Verify that the index expression is correct. First replace its
# strings with references to that symbol
# The elements in the paratrized indices are string names, but
Member


"parameterised"

@hiker
Collaborator

hiker commented Oct 20, 2025

I have debugged why loop fusion in my training is now failing. Core reason is that (with this PR merged in) fields as kernel parameters are now detected to be scalars. E.g.:

            call invoke(count_neighbours(neighbours, current),)

will now consider current to be a scalar, because is_array = access_info.has_indices(index_variable=index_variable) returns False. I didn't debug this in more detail, but I believe what is happening is that this test:

        if access_info:
            # Access Info might not have information if a variable is used
            # as array (e.g. in case of an array expression). In this case
            # we still need to check the type information in the symbol table.
            is_array = access_info.has_indices(index_variable=index_variable)

and that seems a likely candidate, since previously this function would ask its ComponentIndices :)

In the past, I've added implicit accesses to field(i+-1, j+-1) (depending on the stencil); these indices are now not added anymore.

And then a gocean field is not considered an array (when checking the symbol info), and it fails. It should be easy to reproduce: fusing two gocean loops should trigger this problem (I am actually surprised that this was not picked up by the testing). If you need a test case, merge in branch 1623..., and check .../tutorial/training/gocean/2.6-GameOfLife-fuse/solution (and run make test).

@hiker
Collaborator

hiker commented Oct 20, 2025

I checked; surprisingly, we seem to have NO tests for fusing gocean loops :(

psyclone/src/psyclone/tests$ find . -iname \*fus\*.py
./psyir/transformations/loop_fusion_test.py

@sergisiso
Collaborator Author

Thanks for checking @hiker, I was expecting this at some point, so it is good to have an example. I'm happy for the training to be merged first if it's close to being ready.

This is expected behaviour, let me explain.

Core reason is that (with this PR merged in) fields as kernel parameters are now detected to be scalars

Not quite. To know the datatype you would do arg.datatype, and this should return ArrayType or UnresolvedType, but not ScalarType. Or arg.datatype.is_array (with None and False having different meanings).

The 'has_indices' should return False, to be consistent with the PSyIR, because it has no indices, e.g. "call kernel1(arg1%data, ...)".
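A hedged sketch of that datatype-based check (the helper name and the tri-state convention are illustrative, not an existing PSyclone utility; arg stands for the PSyIR Reference passed as the kernel argument):

    from psyclone.psyir.symbols import ArrayType, UnresolvedType

    def argument_is_array(arg):
        '''Illustrative tri-state check on a PSyIR Reference: True
        (definitely an array), False (definitely not) or None (type
        unresolved, so other information is needed).'''
        dtype = arg.datatype
        if isinstance(dtype, ArrayType):
            return True
        if isinstance(dtype, UnresolvedType):
            return None
        return False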

To fix fusing there are 3 options (coming in the next comment as I need to get off the bus)

@sergisiso
Collaborator Author

I'll just describe the two options I am thinking of (the third is not fleshed out):

  1. Inlining first. This is convenient because it is already implemented, and inlining is very beneficial for gocean anyway (because the kernels are just the loop bodies). For example, tasking is doing it to perform the dependency analysis. The negatives are that the order then matters (we cannot apply kerncall transformations after inlining), and it could fail for complex inlining or for complex kernels for which we cannot do proper dependency analysis.

  2. I actually like the virtual accesses, but not having them disconnected from the tree. I considered the idea of adding an assignment below the kernel node that would match the prototype of the kernel body given by the metadata, e.g. for invoke(kernel(arg1, ptwise, stencil)) it could be:

 KernelCall
    arg1%data(i,j) = ptwise%data(i,j) + stencil%data(i, j-1) + stencil%data(i, j+1)

This is the most similar to your current solution, and is resilient to kernels that cannot be inlined, or that even when inlined have complex bodies that cannot be analysed. And because lowering the kernel replaces it, we get rid of the prototype assignment anyway. This also fixes #3124. If we do this I also want to consider lfric (where we need to do better fusion) and things like having indices in the algorithm invoke field, which the previous implementation didn't support.

What do you think @hiker?

@hiker
Collaborator

hiker commented Oct 20, 2025

I'll just describe the two options I am thinking of (the third is not fleshed out):

  1. Inlining first. This is convenient because it is already implemented, and inlining is very beneficial for gocean anyway (because the kernels are just the loop bodies). For example, tasking is doing it to perform the dependency analysis. The negatives are that the order then matters (we cannot apply kerncall transformations after inlining), and it could fail for complex inlining or for complex kernels for which we cannot do proper dependency analysis.

Correct me if I'm wrong, but doesn't inlining need lowering first? Last time I tried (and that was a week or two ago), inlining a gocean kernel did not work (in the training I actually overwrote the validate function to accept inlining gocean kernels if the error is the known one).

While I agree that inlining is beneficial, I am:

  1. not convinced that inlining is ... feature-complete enough :)
  2. worried we are losing potentially useful information? E.g. in gocean we can easily test if the loop boundaries are the same (same loop type etc.), while after lowering we don't necessarily have that information anymore :(
  2. I actually like the virtual accesses, but not having them disconnected from the tree. I considered the idea of adding an assignment below the kernel node that would match the prototype of the kernel body given by the metadata, e.g. for invoke(kernel(arg1, ptwise, stencil)) it could be:
 KernelCall
    arg1%data(i,j) = ptwise%data(i,j) + stencil%data(i, j-1) + stencil%data(i, j+1)

This is the most similar to your current solution, and is resilient to kernels that cannot be inlined, or that even when inlined have complex bodies that cannot be analysed. And because lowering the kernel replaces it, we get rid of the prototype assignment anyway. This also fixes #3124. If we do this I also want to consider lfric (where we need to do better fusion) and things like having indices in the algorithm invoke field, which the previous implementation didn't support.

Hmm - yes, that looks very similar to what we have, but I wonder if that might not cause issues with other tools working on the tree if there is suddenly an additional statement? E.g. thinking of moving kernels etc.

In this particular case, wouldn't it be easier to just use the information that we are in the gocean API, and that therefore the variable is an array?

Now that I am writing this, that would probably only get us over the first hurdle (to detect that the fields are arrays). The next step in the array validation would take the indices into account (atm the validation for loop fusion is simplified in that it only allows the same index expressions, e.g. a(i, j-1) everywhere and we are good, but if we have a(i,j) and a(i,j-1), then we just assume that it's not safe to fuse). Likely, this test would then fail if we don't have indices?

I'll see if I can come up with a better idea? ATM ... I have nothing :)

@hiker
Collaborator

hiker commented Oct 20, 2025

Oh, we have some gocean fuse tests in ./domain/gocean/transformations/gocean1p0_transformations_test.py, but they are mostly (all?) only testing errors :(

@sergisiso
Collaborator Author

Correct me if I'm wrong, but doesn't inlining need lowering first? Last time I tried (and that was a week or two ago), inlining a gocean kernel did not work (in the training I actually overwrote the validate function to accept inlining gocean kernels if the error is the known one).

Yes, this is why I said order matters: basically, in the script you need to do psykal_transformations -> lowering -> psyir_generic transformations. Following this order worked in Nemolite2D. But I agree with your points above; I was proposing this as a solution that works now, while we implement the other solution, which may take longer.
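A minimal sketch of that ordering in a transformation script (assuming a trans(psyir) entry point; the choice of fusing the first two outermost loops of each routine is purely illustrative and would still need the usual validation):

    from psyclone.psyir.nodes import Loop, Routine
    from psyclone.psyir.transformations import LoopFuseTrans

    def trans(psyir):
        # 1. Apply any PSyKAl/domain-specific transformations here, while
        #    the kernel-call view of the schedule is still available.

        # 2. Lower the DSL-level nodes to language-level PSyIR so that the
        #    kernel bodies become ordinary loops and assignments.
        psyir.lower_to_language_level()

        # 3. Generic PSyIR transformations (which need explicit array
        #    accesses for their dependence analysis) can now be applied.
        for routine in psyir.walk(Routine):
            outer_loops = routine.walk(Loop, stop_type=Loop)
            if len(outer_loops) >= 2:
                LoopFuseTrans().apply(outer_loops[0], outer_loops[1])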

Now that I am writing this, that would probably only get us over the first hurdle (to detect that the fields are arrays).

Again, I don't think this is a hurdle: using arg.datatype should already give you this info. But has_indices should be False (as they don't have any :))

but I wonder if that might not cause issues with other tools working on the tree if there is suddenly an additional statement? E.g. thinking of moving kernels

Moving the kernel is still fine (it moves together with its children). The one that requires a bit of care is hoisting, but even now it shouldn't be allowed to hoist across an unexpected node, like the KernelCall.
And I would say that the current implementation is already causing issues with other tools. E.g. we sometimes assume that all of these are References: [ref_key.node for ref_key in region.reference_accesses().keys()]. This is true for simple examples, but it sometimes breaks when the access is actually associated with a symbol directly in a Loop, CodeBlock or KernelCall, meaning that every time we use reference_accesses we need to account for edge cases like these. Another case was looking for a virtual reference in the tree, and of course failing because it wasn't there. What I am trying to do is precisely to make this work more seamlessly with other tools without needing edge cases.

The KernelCall is a high-level node and we can define its children as we like. It could also be:
KernelCall
    (child 0) write: arg1%data(i,j)
    (child 1 to N): ptwise%data(i,j), stencil%data(i, j-1), stencil%data(i, j+1)

The advantage of the assignment is that we don't need to specialise reference_accesses for it.

@hiker
Collaborator

hiker commented Oct 21, 2025

Ah, I actually didn't think it all the way through. I thought you were adding the accesses 'after' (i.e. next to) the kernel, but as a child of the kernel that makes a lot more sense; I totally missed that.

That sounds actually good, thanks!

