Conversation

xxlaykxx
Contributor

@xxlaykxx xxlaykxx commented Sep 3, 2025

What's Changed

Updated ComplexCopier to support ExtensionType. It now has two copy methods:

public static void copy(FieldReader input, FieldWriter output) // kept so existing callers do not break

public static void copy(FieldReader input, FieldWriter output, ExtensionTypeWriterFactory extensionTypeWriterFactory)

Also updated ComplexCopier tests.
Closes #836.
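
One way the two overloads could relate is sketched below: the legacy two-argument method delegates to the new one with no factory, and a factory is only demanded when an extension value is actually encountered. Everything here (`CopierOverloadSketch`, `Value`, `Kind`, `WriterFactory`) is a hypothetical stand-in for illustration, not the Arrow classes, and the exception used for a missing factory is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Arrow classes.
class CopierOverloadSketch {
    enum Kind { INT, EXTENSION }

    /** Stand-in for a reader positioned on a single value. */
    static final class Value {
        final Kind kind;
        final Object payload;

        Value(Kind kind, Object payload) {
            this.kind = kind;
            this.payload = payload;
        }
    }

    /** Stand-in for ExtensionTypeWriterFactory (assumed shape). */
    interface WriterFactory {
        Object write(Value value);
    }

    /** Legacy overload: delegates with no factory so existing callers keep working. */
    static void copy(Value input, List<Object> output) {
        copy(input, output, null);
    }

    /** New overload: the factory is required only when an extension value shows up. */
    static void copy(Value input, List<Object> output, WriterFactory factory) {
        if (input.kind == Kind.EXTENSION) {
            if (factory == null) {
                // Assumed error handling; the real message and exception type may differ.
                throw new IllegalArgumentException("Must provide ExtensionTypeWriterFactory");
            }
            output.add(factory.write(input));
        } else {
            output.add(input.payload);
        }
    }

    public static void main(String[] args) {
        List<Object> out = new ArrayList<>();
        copy(new Value(Kind.INT, 7), out); // legacy path, no factory needed
        copy(new Value(Kind.EXTENSION, "uuid"), out, v -> "ext:" + v.payload);
        System.out.println(out); // prints [7, ext:uuid]
    }
}
```

With this shape, callers that never touch extension types need no changes, while callers that do get a clear error if they forget the factory.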

@xxlaykxx xxlaykxx marked this pull request as ready for review September 4, 2025 12:26
@xxlaykxx
Contributor Author

@lidavidm Could you please take a look?

Member

@lidavidm lidavidm left a comment


Why do we need the factory, vs. being able to copy the "storage type" directly?

Comment on lines 119 to 120
throw new UnsupportedOperationException(
"EXTENSIONTYPE are not supported yet. Please provide an ExtensionTypeWriterFactory." );
Member


I don't think we can ever support extension types without the factory, right?

Suggested change
- throw new UnsupportedOperationException(
-     "EXTENSIONTYPE are not supported yet. Please provide an ExtensionTypeWriterFactory.");
+ throw new IllegalArgumentException(
+     "Must provide ExtensionTypeWriterFactory");

Contributor Author


Yep, we need the factory to determine the writer implementation for an extension type.

@Override
public void copyFrom(
int fromIndex, int thisIndex, ValueVector from, ExtensionTypeWriterFactory writerFactory) {
throw new UnsupportedOperationException();
Member


Should these be abstract methods instead?

Contributor Author


No. This method is only used with complex-type vectors, and those provide an implementation. For non-complex vectors it is not supported, and that behavior is covered here.
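
The trade-off in this exchange can be sketched as follows: a non-abstract base method that throws by default means only complex vectors need an override, whereas an abstract method would force every vector class to implement it. The class names (`BaseVector`, `IntLikeVector`, `ListLikeVector`) and the `Object` factory parameter are hypothetical simplifications, not the Arrow types.

```java
// Hypothetical stand-ins; names and signatures are simplified and are not the Arrow API.
class CopyFromDefaultSketch {
    /** Base class: the factory-aware copyFrom throws unless a subclass overrides it. */
    abstract static class BaseVector {
        void copyFrom(int fromIndex, int thisIndex, BaseVector from, Object writerFactory) {
            throw new UnsupportedOperationException();
        }
    }

    /** Non-complex vector: inherits the throwing default and needs no extra code. */
    static final class IntLikeVector extends BaseVector {}

    /** Complex vector: overrides the method with a real implementation. */
    static final class ListLikeVector extends BaseVector {
        int copies = 0;

        @Override
        void copyFrom(int fromIndex, int thisIndex, BaseVector from, Object writerFactory) {
            copies++; // a real implementation would copy child values here
        }
    }

    public static void main(String[] args) {
        ListLikeVector list = new ListLikeVector();
        list.copyFrom(0, 0, new ListLikeVector(), null);
        System.out.println("complex copies: " + list.copies);

        try {
            new IntLikeVector().copyFrom(0, 0, new IntLikeVector(), null);
        } catch (UnsupportedOperationException e) {
            System.out.println("non-complex: unsupported");
        }
    }
}
```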

@xxlaykxx
Contributor Author

> Why do we need the factory, vs. being able to copy the "storage type" directly?

If we copy only the storage type, the original extension type (and its vector) is lost.

@xxlaykxx xxlaykxx requested a review from lidavidm September 12, 2025 13:28
@jhrotko

jhrotko commented Sep 26, 2025

@lidavidm Would you mind taking another look at this PR when you have a chance? We would really appreciate your feedback. Thanks!

Member

@lidavidm lidavidm left a comment


I guess my question is then, how does this work if we have more than one extension type (e.g. a struct vector with two extension type fields), given that everything is based around having 0 or 1 ExtensionTypeWriterFactorys?

@xxlaykxx
Contributor Author

> I guess my question is then, how does this work if we have more than one extension type (e.g. a struct vector with two extension type fields), given that everything is based around having 0 or 1 ExtensionTypeWriterFactorys?

ExtensionTypeWriterFactory.getWriterImpl returns a writer based on the vector type, so if there are several extension types, getWriterImpl just needs to handle each of them, something like:

if (extensionTypeVector instanceof TimeStampWithPrecisionVector) {
  return new TimeStampWithPrecisionWriterImpl(
      (TimeStampWithPrecisionVector) extensionTypeVector);
} else if (extensionTypeVector instanceof UuidVector) {
  return new UuidWriterImpl((UuidVector) extensionTypeVector);
}
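
A self-contained sketch of that dispatch idea, showing how a single factory could serve a struct with two different extension-typed fields. The vector and writer types below (`UuidLikeVector`, `TimestampLikeVector`, `Writer`) are hypothetical stand-ins rather than the Arrow classes, and the fallback exception for an unrecognised vector is an assumption that the snippet above does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Arrow classes.
class MultiExtensionFactorySketch {
    interface ExtensionVector {}

    static final class UuidLikeVector implements ExtensionVector {}

    static final class TimestampLikeVector implements ExtensionVector {}

    static final class UnknownVector implements ExtensionVector {}

    /** Stand-in for a FieldWriter; only reports which implementation was chosen. */
    interface Writer {
        String describe();
    }

    /** One factory, many extension types: dispatch on the runtime class of the vector. */
    static Writer getWriterImpl(ExtensionVector vector) {
        if (vector instanceof TimestampLikeVector) {
            return () -> "timestamp writer";
        } else if (vector instanceof UuidLikeVector) {
            return () -> "uuid writer";
        }
        // Assumed fallback; the behavior for unknown types is not stated in the thread.
        throw new IllegalArgumentException(
            "No writer for " + vector.getClass().getSimpleName());
    }

    /** Resolves a writer for each child of a struct-like list of extension vectors. */
    static List<String> describeChildren(List<ExtensionVector> children) {
        List<String> chosen = new ArrayList<>();
        for (ExtensionVector child : children) {
            chosen.add(getWriterImpl(child).describe());
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<ExtensionVector> structChildren =
            Arrays.asList(new UuidLikeVector(), new TimestampLikeVector());
        System.out.println(describeChildren(structChildren));
    }
}
```

The single factory stays the only extension point; supporting another extension type means adding one more branch rather than passing a second factory.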

@lidavidm
Member

Works for me.

Incidentally, ExtensionTypeWriterFactory is type-parameterized but it seems we never actually use it that way - can you file an issue to drop that and have getWriter just return FieldWriter?
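
To illustrate the suggestion, here is a rough comparison of a type-parameterized factory with a plain one that returns the common writer type. The interfaces (`GenericFactory`, `PlainFactory`, `Writer`) are hypothetical and their shapes are assumed from the comment, not taken from the Arrow source.

```java
// Hypothetical stand-ins; the interface shapes are assumed, not copied from Arrow.
class FactorySignatureSketch {
    /** Stand-in for FieldWriter. */
    interface Writer {
        String name();
    }

    static final class UuidLikeWriter implements Writer {
        @Override
        public String name() {
            return "uuid";
        }
    }

    /** Type-parameterized shape: callers must carry T around even if they never use it. */
    interface GenericFactory<T extends Writer> {
        T getWriter(Object vector);
    }

    /** Simplified shape: no type parameter, returns the common writer type. */
    interface PlainFactory {
        Writer getWriter(Object vector);
    }

    /** A copier-style caller only needs the common Writer interface. */
    static String copyWith(PlainFactory factory, Object vector) {
        return factory.getWriter(vector).name();
    }

    public static void main(String[] args) {
        GenericFactory<UuidLikeWriter> generic = v -> new UuidLikeWriter();
        PlainFactory plain = v -> new UuidLikeWriter();

        // Both produce the same writer; the type parameter adds nothing for this caller.
        System.out.println(generic.getWriter(new Object()).name());
        System.out.println(copyWith(plain, new Object()));
    }
}
```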

@xxlaykxx
Contributor Author

> Works for me.
>
> Incidentally, ExtensionTypeWriterFactory is type-parameterized but it seems we never actually use it that way - can you file an issue to drop that and have getWriter just return FieldWriter?

#861

@lidavidm
Member

There are still pre-commit failures; it might be worth running the formatter locally: mvn spotless:apply

@xxlaykxx xxlaykxx requested a review from lidavidm September 29, 2025 14:05
@xxlaykxx
Contributor Author

@lidavidm Could you re-check, please?

Development

Successfully merging this pull request may close these issues.

[JAVA] Add support for ExtensionType for ComplexCopier
3 participants