Skip to content

Conversation

heyihong
Copy link
Contributor

@heyihong heyihong commented Sep 12, 2025

What changes were proposed in this pull request?

This PR fixes temporal value conversion issues in the LiteralValueProtoConverter for Spark Connect. The main changes include:

  1. Fixed temporal value conversion in getConverter method: Updated the conversion logic for temporal data types (DATE, TIMESTAMP, TIMESTAMP_NTZ, DAY_TIME_INTERVAL, YEAR_MONTH_INTERVAL, TIME) to use proper utility methods from SparkDateTimeUtils and SparkIntervalUtils instead of directly returning raw protobuf values.

  2. Added comprehensive test coverage: Extended the PlanGenerationTestSuite with a new test case that includes a tuple containing all temporal types to ensure proper conversion and serialization.

  3. Updated test expectations: Modified the expected explain output and query test files to reflect the corrected temporal value handling.

Why are the changes needed?

The struct type in typedlit doesn't work well with temporal values due to bugs in type conversions. For example, the code below fails:

import org.apache.spark.sql.functions.typedlit

spark.sql("select 1").select(typedlit((1, java.time.LocalDate.of(2020, 10, 10)))).collect()
"""
org.apache.spark.SparkIllegalArgumentException: The value (18545) of the type (java.lang.Integer) cannot be converted to the DATE type.
  org.apache.spark.sql.catalyst.CatalystTypeConverters$DateConverter$.toCatalystImpl(CatalystTypeConverters.scala:356)
  org.apache.spark.sql.catalyst.CatalystTypeConverters$DateConverter$.toCatalystImpl(CatalystTypeConverters.scala:347)
  org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:110)
  org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:271)
  org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:251)
  org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:110)
  org.apache.spark.sql.catalyst.CatalystTypeConverters$.$anonfun$createToCatalystConverter$2(CatalystTypeConverters.scala:532)
  org.apache.spark.sql.connect.planner.LiteralExpressionProtoConverter$.toCatalystExpression(LiteralExpressionProtoConverter.scala:116)
"""

Does this PR introduce any user-facing change?

Yes. This PR fixes temporal value conversion in LiteralValueProtoConverter, allowing the struct type in typedlit to work with temporal values.

How was this patch tested?

build/sbt "connect-client-jvm/testOnly org.apache.spark.sql.PlanGenerationTestSuite"
build/sbt "connect/testOnly org.apache.spark.sql.connect.ProtoToParsedPlanTestSuite"

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor 1.5.11

@heyihong
Copy link
Contributor Author

@cloud-fan

@@ -32,7 +32,7 @@ import io.grpc.{Context, Status, StatusRuntimeException}
import io.grpc.stub.StreamObserver
import org.apache.commons.lang3.exception.ExceptionUtils

import org.apache.spark.{SparkEnv, TaskContext}
import org.apache.spark.{SparkEnv, SparkException, TaskContext}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert this irrelevant change, @heyihong .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants