Skip to content

Conversation

@BenoitHanotte
Copy link

@BenoitHanotte BenoitHanotte commented Jan 14, 2020

What is the purpose of the change

The RelNode's digest is used by the Calcite HepPlanner to avoid adding
duplicate vertices to the graph. If an equivalent vertex was already
present in the graph, then that vertex is used in place of the newly
generated one.
This means that the digest needs to contain all the information
necessary to identifying a vertex and distinguishing it from similar (but not equivalent) vertices.

In the case of the WindowAggregation nodes, the window specs are
currently not in the digest, meaning that two aggregations with the same
signatures and expressions but different windows are considered
equivalent by the planner, which is not correct and will lead to an
invalid Physical Plan.

This commit fixes this issue and adds a test ensuring that the window
specs are in the digest, as well as similar aggregations on two
different windows will not be considered equivalent.

Brief change log

Added window specs to the following RelNodes:

  • LogicalWindowAggregate
  • FlinkLogicalWindowAggregate
  • LogicalWindowTableAggregate
  • FlinkLogicalWindowTableAggregate

Added unit tests to the legacy planner and the blink planner to prevent future regressions.

Verifying this change

This change added tests and can be verified as follows:

  • Added unit tests for the old planner to ensure window specs are in digest and that similar aggregations with different windows are not considered equivalent in the physical plan.
  • Added unit tests for the blink planner to ensure no such regression can be introduced in the future.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@BenoitHanotte BenoitHanotte force-pushed the FLINK-15577-criteo-1.9 branch 5 times, most recently from dfceba3 to ea11525 Compare January 14, 2020 12:27
The RelNode's digest is used by the Calcite HepPlanner to avoid adding
duplicate vertices to the graph. If an equivalent vertex was already
present in the graph, then that vertex is used in place of the newly
generated one.
This means that the digest needs to contain all the information
necessary to identifying a vertex and distinguishing it from similar
- but not equivalent - vertices.

In the case of the `WindowAggregation` nodes, the window specs are
currently not in the digest, meaning that two aggregations with the same
signatures and expressions but different windows are considered
equivalent by the planner, which is not correct and will lead to an
invalid Physical Plan.

This commit fixes this issue and adds a test ensuring that the window
specs are in the digest, as well as similar aggregations on two
different windows will not be considered equivalent.
@BenoitHanotte BenoitHanotte force-pushed the FLINK-15577-criteo-1.9 branch from ea11525 to 74f02aa Compare January 14, 2020 13:55
@BenoitHanotte BenoitHanotte changed the title [FLINK-15577] Add Window specs to WindowAggregate nodes' digests [table-planner] Fix similar aggregations with different windows being considered the same Jan 14, 2020
@BenoitHanotte BenoitHanotte changed the title [table-planner] Fix similar aggregations with different windows being considered the same [FLINK-15577][table-planner] Fix similar aggregations with different windows being considered the same Jan 14, 2020
The Blink planner doesn't seem to be subject to the bug described in
FLINK-15577. For safety, we also add the tests to ensure no regression
is possible that would introduce the issue in the Blink planner.
@BenoitHanotte BenoitHanotte force-pushed the FLINK-15577-criteo-1.9 branch from 74f02aa to 266e62d Compare January 14, 2020 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant