feat: `ClassicJoin` for PWMJ #17482

jonathanc-n · 2025-09-09T04:02:34Z

Which issue does this PR close?

part of [EPIC]: Make PiecewiseMergeJoin work in Datafusion #17427

Rationale for this change

Adds regular joins (left, right, full, inner) for PWMJ as they behave differently in the code path.

What changes are included in this PR?

Adds classic join + physical planner

Are these changes tested?

Yes SLT tests + unit tests

Follow up work to this pull request

Handling partitioned queries and multiple record batches (fuzz testing will be handled with this)
Simplify physical planning
Add more unit tests for different types (another pr as the LOC in this pr is getting a little daunting)

next would be to implement the existence joins

jonathanc-n · 2025-09-09T04:04:33Z

@2010YOUY01 Would you like to take a look at if this is how you wanted to split up the work? I just wanted to put this out today then i'll clean it up better this week. Only failing one external test currently.

jonathanc-n · 2025-09-09T04:05:57Z

datafusion/core/src/physical_planner.rs

                let join: Arc<dyn ExecutionPlan> = if join_on.is_empty() {
                    if join_filter.is_none() && matches!(join_type, JoinType::Inner) {
                        // cross join if there is no join conditions and no join filter set
                        Arc::new(CrossJoinExec::new(physical_left, physical_right))
+                    } else if num_range_filters == 1


I would like to refactor this in another pull request, just a refactor but it should be quite simple to do. Just wanted to get this version in first.

jonathanc-n · 2025-09-09T04:07:10Z

datafusion/sqllogictest/test_files/joins.slt

+statement ok
+set datafusion.execution.batch_size = 8192;
+
+# TODO: partitioned PWMJ execution


Currently doesn't allow partitioned execution, this would make reviewing the tests a little messy as many of the partitioned single range queries would switch to PWMJ. Another follow up, will be tracked in #17427

…jonathanc-n/datafusion into classic-join-physical-planner

jonathanc-n · 2025-09-09T18:00:17Z

cc @2010YOUY01 @comphead this pr is now ready!

2010YOUY01 · 2025-09-11T11:30:30Z

This is great! I have some suggestions for the planning part, and I'll review the execution part tomorrow.

Refactor the in-equality extracting logic

I suggest to move the inequality-extracting logic from physical_planner.rs into https://github.com/apache/datafusion/blob/main/datafusion/optimizer/src/extract_equijoin_predicate.rs

The reason is we'd better put similar code into a single place, instead of let it scatter to multiple places. ExtractEquijoinPredicate logical optimizer rule is extracting equality join predicates like t1.v1 = t2.v1, here we want to extract t1.v1 < t2.v1, their logic should be very similar.

To do this I think we need to extend the logical plan join node with extra ie predicate field (maybe we can define a new struct for IE predicate with (Expr, Op, Expr), and we can also use that in other places)

/// Join two logical plans on one or more join columns
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Join {
    ...
    /// Equijoin clause expressed as pairs of (left, right) join expressions
    pub on: Vec<(Expr, Expr)>,                                                                 
    /// In-equility clause expressed as pairs of (left, right) join expressions           <-- HERE
    pub ie_predicates: Vec<(Expr, IEOp, Expr)>,
    /// Filters applied during join (non-equi conditions)
    pub filter: Option<Expr>,
    ...
}

To make it compatible for systems only use the LogicalPlan API, but not the physical plans, we can also provide a utility to move the IE predicates back to the filter:

Before: 
ie_predicates: [t1.v1 < t2.v1, t1.v2 < t2.v2]
filter: (t1.v3 + t2.v3) = 100

After:
ie_predicates: []
filter: ((t1.v3 + t2.v3) = 100) AND (t1.v1 < t2.v1) AND (t1.v2 < t2.v2)

Perhaps we can open a PR only for this IE predicates extracting task, and during the initial planning we can simply move the IE predicates back to the filter with the above mentioned utility.

Make it configurable to turn on/off PWMJ

I'll try to finish #17467 soon to make it easier, so let's put this on hold for now.

comphead · 2025-09-11T19:15:08Z

Thanks @jonathanc-n and @2010YOUY01

#17467 definitely would be nice to have as PWMJ can start as optional experimental join, which would be separately documented, showing benefits and limitations for the end user. Actually the same happened for SMJ being experimental feature for quite some time.

Another great point to identify bottlenecks in performance is to absorb some knowledge from #17488 and keep the join more stable.

As optional feature it is pretty safe to go, again referring to SMJ there was a separate ticket which post launch checks to make sure it is safe to use like #9846

Let me know your thoughts?

jonathanc-n · 2025-09-11T22:36:06Z

Yes I think the experimental flag should be added first and we can do the equality extraction logic as a follow up. WDYT @2010YOUY01 Do you think you want to get #17467 before this one?

2010YOUY01 · 2025-09-13T04:18:15Z

Yes I think the experimental flag should be added first and we can do the equality extraction logic as a follow up. WDYT @2010YOUY01 Do you think you want to get #17467 before this one?

Yes, so let's do other work first. If I can't get #17467 done when this PR is ready, let's add enable_piecewise_merge_join option here -- I think we can agree on this configuration.

2010YOUY01

I have gone over the exec.rs, and will continue with the stream implementation part soon.

2010YOUY01 · 2025-09-13T04:20:22Z