-
Notifications
You must be signed in to change notification settings - Fork 60
[FIX] Evaluation: Optimize invalidation of cell depending of spreads #6827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 18.0
Are you sure you want to change the base?
Conversation
38af9dd
to
8e6ab30
Compare
Currently, we invalidate all the cells that depend on a spilling formula during its evaluation. However, some edgecases (like the one described below) can make the amount of dependant cells explode with a noticeable amount of redundancy in the cells invalidated. study case: - in A1 write `=SPLIT("1 2", " ")` - in C1, write `=B2` - autofill these over 1000 columns - in another sheet, write a VLOOKUP that search in column C of sheet 1 and autofill around 7000 times -> the evaluation becomes really slow The example above is problematic because for each cell evaluated in A1-B1, we will the corresponding C formula along with EVERY cell with a VLOOKUP, which means the cells with VLOOKUP will be invalidated 500 (# or rows) times. It so happens that the invalidation step requires to regroup all the cells in zones and this step can be costy, especially if repeated hundreds of times. This revision captures the cells that need to be invalidated during a pass of evaluation and invalidates all of them at once to limit the overhead of regrouping them. In the aforementioned example, the time to evaluate drops from 6.7 to 2.1 seconds on average (10 runs). Task: 4954710
8e6ab30
to
5426e10
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems good to me, but I'm no evaluation expert. I'll let lul or laa check if this is correct
// the result matrices are split in subzones to exclude the array formula position. | ||
// This avoids to invalidate the topleft position of an array formula | ||
// and create an infinite loop of self-invalidation | ||
const boundingboxes = this.positionsToInvalidate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const boundingboxes = this.positionsToInvalidate | |
const boundingBoxes = this.positionsToInvalidate |
) | ||
.flat(); | ||
const invalidatedPositions = this.formulaDependencies().getCellsDependingOn(boundingboxes); | ||
for (const range of this.positionsToInvalidate) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this removes the top-left cells from the positions to invalidate ... wasn't this already done in the excludeTopLeft
above ? I'm a bit confused
) | ||
.flat(); | ||
const invalidatedPositions = this.formulaDependencies().getCellsDependingOn(boundingboxes); | ||
for (const range of this.positionsToInvalidate) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So are they ranges or positions ? I feel like we could improve the naming here
.flat(); | ||
const invalidatedPositions = this.formulaDependencies().getCellsDependingOn(boundingboxes); | ||
for (const range of this.positionsToInvalidate) { | ||
invalidatedPositions.delete({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is causing the issue @hokolomopo posted on the task chatter.
A1 should be recomputed (because A3 spilled on the zone it depends on), but this line removes it from this.nextPositionsToUpdate
Currently, we invalidate all the cells that depend on a spilling formula during its evaluation. However, some edgecases (like the one described below) can make the amount of dependant cells explode with a noticeable amount of redundancy in the cells invalidated.
study case:
=SPLIT("1 2", " ")
=B2
-> the evaluation becomes really slow
The example above is problematic because for each cell evaluated in A1-B1, we will the corresponding C formula along with EVERY cell with a VLOOKUP, which means the cells with VLOOKUP will be invalidated 500 (# or rows) times.
It so happens that the invalidation step requires to regroup all the cells in zones and this step can be costy, especially if repeated hundreds of times.
This revision captures the cells that need to be invalidated during a pass of evaluation and invalidates all of them at once to limit the overhead of regrouping them.
In the aforementioned example, the time to evaluate drops from 6.7 seconds to 2.1 seconds on average.
Task: 4954710
Description:
description of this task, what is implemented and why it is implemented that way.
Task: TASK_ID
review checklist