-
Notifications
You must be signed in to change notification settings - Fork 295
Fix createViewWithCustomMetadataLocation tests for cloudTest task #2405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
||
/** Runs PolarisRestCatalogViewIntegrationTest on AWS. */ | ||
public class PolarisRestCatalogViewAwsIntegrationTest | ||
public abstract class PolarisRestCatalogViewAwsIntegrationTestBase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're renaming this type anyway, let's better use the protocol name, not the vendor name.
public abstract class PolarisRestCatalogViewAwsIntegrationTestBase | |
public abstract class PolarisRestCatalogViewS3IntegrationTestBase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall I change these 3 (non-view test bases) as well in one go:
- PolarisRestCatalogAwsIntegrationTestBase
- PolarisRestCatalogAzureIntegrationTestBase
- PolarisRestCatalogGcpIntegrationTestBase
Also the 6 implementations:
- RestCatalogAwsIT
- RestCatalogAzureIT
- RestCatalogGcpIT
- RestCatalogViewAwsIT
- RestCatalogViewAzureIT
- RestCatalogViewGcpIT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd support it, yea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mean, we support MinIO and a bunch of other S3 compatible systems, not only AWS. Sure, we could argue about GCS or ADLS. However, GCS and ADLS are specific, where a vendor name or "product suite" name are not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, renamed the classes and related parts.
@@ -20,13 +20,12 @@ | |||
|
|||
import java.util.List; | |||
import java.util.Optional; | |||
import java.util.stream.Stream; | |||
import org.apache.hadoop.fs.Path; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, sure, this is just a test, but could we avoid Hadoop dependencies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea behind Hadoop Path came because Iceberg does something "funky" with the metadataFileLocation()
that resembles Hadoop Path behavior: It normalizes the scheme file:///...
to file:/...
, but it keeps s3://...
as s3://...
.
For the assertion later I couldn't use java.nio.Path
because that normalizes the scheme and creates an incorrect s3:/...
path.
Just dug a bit more, I found a reference to Hadoop Path in Iceberg's LocationProviders:24, maybe this is how they do it as well, I could not find the exact location where it gets normalized.
I’m open to other approaches, but the only solution that came to mind was writing a custom assertion, which I’m also happy with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argh - the damn file
URI scheme is so ambiguous (IMHO, because of multiple representations for the same thing, legacy representations, risk of interpreting those wrong).
If it's just about eliminating all host related parts, I'd say let's just use a regex to "fix" the paths?
Something like
protected static final Pattern FILE_LOCATION_PATTERN =
Pattern.compile("file:/*(.*)");
protected String fixFileUri(String location) {
var m = FILE_LOCATION_PATTERN.matcher(location);
return m.matches() ? "file:/" + m.group(1) : location;
}
WDYT? Would that work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI In Polaris we have:
org.apache.polaris.core.storage.StorageLocation
which standardizes file URIs to"file:///
- but
InMemoryStorageIntegration
does the opposite 😅 :
polaris/polaris-core/src/main/java/org/apache/polaris/core/storage/InMemoryStorageIntegration.java
Line 74 in 532ee51
.map(str -> str.replace("file:///", "file:/")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm - maybe let's ignore the file
scheme specialty and just assume that it's "correct" if supplied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I used java.nio.file.Path
to remove the scheme.
...in/java/org/apache/polaris/service/it/test/PolarisRestCatalogViewGcpIntegrationTestBase.java
Outdated
Show resolved
Hide resolved
.../java/org/apache/polaris/service/it/test/PolarisRestCatalogViewAzureIntegrationTestBase.java
Outdated
Show resolved
Hide resolved
This patch fixes the createViewWithCustomMetadataLocation tests for cloudTest tasks. The original test was generating temp directories internally, causing cloudTests to fail with BadRequestException instead of the expected ForbiddenException. Changes: - Switched to Hadoop's Path (Java Path removes slashes, e.g. s3://bucket/path -> s3:/bucket/path) - Made base classes abstract to avoid running them - Implemented createViewWithCustomMetadataLocation to allow passing a custom location Testing: - Verified locally
protected String getCustomMetadataLocationDir() { | ||
return ""; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
protected String getCustomMetadataLocationDir() { | |
return ""; | |
} | |
protected abstract String getCustomMetadataLocationDir(); |
@@ -86,6 +86,8 @@ public abstract class PolarisRestCatalogViewIntegrationBase extends ViewCatalogT | |||
private static PolarisApiEndpoints endpoints; | |||
private static PolarisClient client; | |||
private static ManagementApi managementApi; | |||
protected static final String POLARIS_IT_SUBDIR = "polaris_it"; | |||
protected static final String POLARIS_IT_CUSTOM_SUBDIR = "polaris_it_custom"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This constant seems unused.
protected boolean shouldSkip() { | ||
return Stream.of(BASE_LOCATION, TENANT_ID).anyMatch(Strings::isNullOrEmpty); | ||
protected String getCustomMetadataLocationDir() { | ||
return StorageUtil.concatFilePrefixes(BASE_LOCATION, POLARIS_IT_SUBDIR, File.separator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be:
return StorageUtil.concatFilePrefixes(BASE_LOCATION, POLARIS_IT_SUBDIR, File.separator); | |
return StorageUtil.concatFilePrefixes(BASE_LOCATION, POLARIS_IT_CUSTOM_SUBDIR, File.separator); |
This patch fixes the
createViewWithCustomMetadataLocation
tests forcloudTest
tasks. The original test was generating temp directories internally, causingcloudTest
s to fail withBadRequestException
instead of the expectedForbiddenException
.Changes:
abstract
to avoid running themcreateViewWithCustomMetadataLocation
to allow passing a custom locationTesting: