-
Notifications
You must be signed in to change notification settings - Fork 366
Block the creation of catalog with WASB[S] locations #376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -191,4 +191,12 @@ public static <T> Builder<T> builder() { | |
| "If set to true, allows tables to be dropped with the purge parameter set to true.") | ||
| .defaultValue(true) | ||
| .build(); | ||
|
|
||
| public static final PolarisConfiguration<Boolean> SUPPORT_WASB_CATALOG = | ||
| PolarisConfiguration.<Boolean>builder() | ||
| .key("SUPPORT_WASB_CATALOG") | ||
| .description( | ||
| "If set to true, allows the creation of catalogs with storage locations using Azure WASB prefix.") | ||
| .defaultValue(false) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Considering we support
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Isn't it the case that WASB does work out of the box for Polaris just like |
||
| .build(); | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -28,9 +28,13 @@ public class AzureLocation extends StorageLocation { | |
| private static final Pattern URI_PATTERN = Pattern.compile("^(abfss?|wasbs?)://([^/?#]+)(.*)?$"); | ||
|
|
||
| public static final String ADLS_ENDPOINT = "dfs.core.windows.net"; | ||
|
|
||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: spurious change? |
||
| public static final String BLOB_ENDPOINT = "blob.core.windows.net"; | ||
|
|
||
| public static final String ABFS_SCHEME = "abfs://"; | ||
| public static final String ABFSS_SCHEME = "abfss://"; | ||
| public static final String WASB_SCHEME = "wasb://"; | ||
| public static final String WASBS_SCHEME = "wasbs://"; | ||
|
|
||
|
Comment on lines
+33
to
+37
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Typically the Furthermore, isn't this duplicative of the constants in |
||
| private final String scheme; | ||
| private final String storageAccount; | ||
| private final String container; | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -80,6 +80,7 @@ | |
| import org.apache.polaris.core.storage.PolarisStorageConfigurationInfo; | ||
| import org.apache.polaris.core.storage.StorageLocation; | ||
| import org.apache.polaris.core.storage.aws.AwsStorageConfigurationInfo; | ||
| import org.apache.polaris.core.storage.azure.AzureLocation; | ||
| import org.apache.polaris.core.storage.azure.AzureStorageConfigurationInfo; | ||
| import org.jetbrains.annotations.NotNull; | ||
| import org.jetbrains.annotations.Nullable; | ||
|
|
@@ -533,6 +534,24 @@ private boolean catalogOverlapsWithExistingCatalog(CatalogEntity catalogEntity) | |
| }); | ||
| } | ||
|
|
||
| private boolean catalogLocationsContainWasb(CatalogEntity catalogEntity) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Quite a thin method and only used in one place. If we are going to make it its own method, can we at least make it more visible to test it? |
||
| boolean supportWasbCatalog = | ||
| callContext | ||
| .getPolarisCallContext() | ||
| .getConfigurationStore() | ||
| .getConfiguration( | ||
| callContext.getPolarisCallContext(), PolarisConfiguration.SUPPORT_WASB_CATALOG); | ||
| if (supportWasbCatalog) { | ||
| return false; | ||
| } | ||
|
|
||
| return getCatalogLocations(catalogEntity).stream() | ||
| .anyMatch( | ||
| location -> | ||
| location.startsWith(AzureLocation.WASB_SCHEME) | ||
| || location.startsWith(AzureLocation.WASBS_SCHEME)); | ||
| } | ||
|
|
||
| public PolarisEntity createCatalog(PolarisEntity entity) { | ||
| PolarisAuthorizableOperation op = PolarisAuthorizableOperation.CREATE_CATALOG; | ||
| authorizeBasicRootOperationOrThrow(op); | ||
|
|
@@ -543,6 +562,12 @@ public PolarisEntity createCatalog(PolarisEntity entity) { | |
| entity.getName()); | ||
| } | ||
|
|
||
| if (catalogLocationsContainWasb((CatalogEntity) entity)) { | ||
| throw new ValidationException( | ||
| "Cannot create Catalog %s. Polaris does not support catalog locations containing WASB paths", | ||
| entity.getName()); | ||
| } | ||
|
|
||
| long id = | ||
| entity.getId() <= 0 | ||
| ? entityManager | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,181 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
| package org.apache.polaris.service.admin; | ||
|
|
||
| import static org.apache.polaris.service.context.DefaultContextResolver.REALM_PROPERTY_KEY; | ||
| import static org.assertj.core.api.Assertions.assertThat; | ||
|
|
||
| import io.dropwizard.testing.ConfigOverride; | ||
| import io.dropwizard.testing.ResourceHelpers; | ||
| import io.dropwizard.testing.junit5.DropwizardAppExtension; | ||
| import io.dropwizard.testing.junit5.DropwizardExtensionsSupport; | ||
| import jakarta.ws.rs.client.Entity; | ||
| import jakarta.ws.rs.client.Invocation; | ||
| import jakarta.ws.rs.core.Response; | ||
| import java.io.IOException; | ||
| import java.util.Collections; | ||
| import java.util.List; | ||
| import java.util.UUID; | ||
| import java.util.stream.Stream; | ||
| import org.apache.polaris.core.admin.model.AzureStorageConfigInfo; | ||
| import org.apache.polaris.core.admin.model.Catalog; | ||
| import org.apache.polaris.core.admin.model.CatalogProperties; | ||
| import org.apache.polaris.core.admin.model.CreateCatalogRequest; | ||
| import org.apache.polaris.core.admin.model.StorageConfigInfo; | ||
| import org.apache.polaris.service.PolarisApplication; | ||
| import org.apache.polaris.service.config.PolarisApplicationConfig; | ||
| import org.apache.polaris.service.test.PolarisConnectionExtension; | ||
| import org.apache.polaris.service.test.PolarisRealm; | ||
| import org.junit.jupiter.api.BeforeAll; | ||
| import org.junit.jupiter.api.extension.ExtendWith; | ||
| import org.junit.jupiter.api.extension.ExtensionContext; | ||
| import org.junit.jupiter.params.ParameterizedTest; | ||
| import org.junit.jupiter.params.provider.Arguments; | ||
| import org.junit.jupiter.params.provider.ArgumentsProvider; | ||
| import org.junit.jupiter.params.provider.ArgumentsSource; | ||
|
|
||
| /** Tests for the validations performed in {@link PolarisAdminService}. */ | ||
| @ExtendWith({DropwizardExtensionsSupport.class, PolarisConnectionExtension.class}) | ||
| public class PolarisAdminServiceValidationTest { | ||
| private static final DropwizardAppExtension<PolarisApplicationConfig> EXT_BLOCK_WASB = | ||
| initExt(ConfigOverride.config("featureConfiguration.SUPPORT_WASB_CATALOG", "false")); | ||
| private static final DropwizardAppExtension<PolarisApplicationConfig> EXT_SUPPORT_WASB = | ||
| initExt(ConfigOverride.config("featureConfiguration.SUPPORT_WASB_CATALOG", "true")); | ||
|
|
||
| private static String userToken; | ||
| private static String realm; | ||
|
|
||
| private static DropwizardAppExtension<PolarisApplicationConfig> initExt( | ||
| ConfigOverride... overrides) { | ||
| return new DropwizardAppExtension<>( | ||
| PolarisApplication.class, | ||
| ResourceHelpers.resourceFilePath("polaris-server-integrationtest.yml"), | ||
| Stream.concat( | ||
| Stream.of( | ||
| // Bind to random port to support parallelism | ||
| ConfigOverride.config("server.applicationConnectors[0].port", "0"), | ||
| ConfigOverride.config("server.adminConnectors[0].port", "0")), | ||
| Stream.of(overrides)) | ||
| .toArray(ConfigOverride[]::new)); | ||
| } | ||
|
|
||
| @BeforeAll | ||
| public static void setup( | ||
| PolarisConnectionExtension.PolarisToken adminToken, @PolarisRealm String polarisRealm) | ||
| throws IOException { | ||
| userToken = adminToken.token(); | ||
| realm = polarisRealm; | ||
|
|
||
| // Set up the database location | ||
| PolarisConnectionExtension.createTestDir(realm); | ||
| } | ||
|
|
||
| private static Invocation.Builder request(DropwizardAppExtension<PolarisApplicationConfig> ext) { | ||
| return ext.client() | ||
| .target(String.format("http://localhost:%d/api/management/v1/catalogs", ext.getLocalPort())) | ||
| .request("application/json") | ||
| .header("Authorization", "Bearer " + userToken) | ||
| .header(REALM_PROPERTY_KEY, realm); | ||
| } | ||
|
|
||
| private Response createAzureCatalog( | ||
| DropwizardAppExtension<PolarisApplicationConfig> ext, | ||
| String defaultBaseLocation, | ||
| boolean isExternal, | ||
| List<String> allowedLocations) { | ||
| String uuid = UUID.randomUUID().toString(); | ||
| StorageConfigInfo config = | ||
| AzureStorageConfigInfo.builder() | ||
| .setTenantId("tenantId") | ||
| .setConsentUrl("https://consentUrl") | ||
| .setMultiTenantAppName("multiTenantAppName") | ||
| .setStorageType(StorageConfigInfo.StorageTypeEnum.AZURE) | ||
| .setAllowedLocations(allowedLocations) | ||
| .build(); | ||
| Catalog catalog = | ||
| new Catalog( | ||
| isExternal ? Catalog.TypeEnum.EXTERNAL : Catalog.TypeEnum.INTERNAL, | ||
| String.format("overlap_catalog_%s", uuid), | ||
| new CatalogProperties(defaultBaseLocation), | ||
| System.currentTimeMillis(), | ||
| System.currentTimeMillis(), | ||
| 1, | ||
| config); | ||
| try (Response response = request(ext).post(Entity.json(new CreateCatalogRequest(catalog)))) { | ||
| return response; | ||
| } | ||
| } | ||
|
|
||
| @ParameterizedTest | ||
| @ArgumentsSource(AzurePathArgs.class) | ||
| public void testWasbCatalogCreationBlocked(String defaultBaseLocation, String allowedLocation) { | ||
| int expectedStatusCode = | ||
| (defaultBaseLocation.startsWith("wasbs") || allowedLocation.startsWith("wasbs")) | ||
| ? Response.Status.BAD_REQUEST.getStatusCode() | ||
| : Response.Status.CREATED.getStatusCode(); | ||
|
|
||
| assertThat( | ||
| createAzureCatalog( | ||
| EXT_BLOCK_WASB, | ||
| defaultBaseLocation, | ||
| false, | ||
| Collections.singletonList(allowedLocation))) | ||
| .returns(expectedStatusCode, Response::getStatus); | ||
| assertThat( | ||
| createAzureCatalog( | ||
| EXT_BLOCK_WASB, | ||
| defaultBaseLocation, | ||
| true, | ||
| Collections.singletonList(allowedLocation))) | ||
| .returns(expectedStatusCode, Response::getStatus); | ||
| } | ||
|
|
||
| @ParameterizedTest | ||
| @ArgumentsSource(AzurePathArgs.class) | ||
| public void testWasbCatalogCreationAllowed(String defaultBaseLocation, String allowedLocation) { | ||
| assertThat( | ||
| createAzureCatalog( | ||
| EXT_SUPPORT_WASB, | ||
| defaultBaseLocation, | ||
| false, | ||
| Collections.singletonList(allowedLocation))) | ||
| .returns(Response.Status.CREATED.getStatusCode(), Response::getStatus); | ||
| assertThat( | ||
| createAzureCatalog( | ||
| EXT_SUPPORT_WASB, | ||
| defaultBaseLocation, | ||
| true, | ||
| Collections.singletonList(allowedLocation))) | ||
| .returns(Response.Status.CREATED.getStatusCode(), Response::getStatus); | ||
| } | ||
|
|
||
| private static class AzurePathArgs implements ArgumentsProvider { | ||
| @Override | ||
| public Stream<? extends Arguments> provideArguments(ExtensionContext extensionContext) { | ||
| return Stream.of(createArguments(), createArguments(), createArguments(), createArguments()); | ||
| } | ||
|
|
||
| private Arguments createArguments() { | ||
| String prefix = UUID.randomUUID().toString(); | ||
| String wasbPath = String.format("wasbs://[email protected]/%s/wasb", prefix); | ||
| String abfsPath = String.format("abfss://[email protected]/%s/abfs", prefix); | ||
| return Arguments.of(wasbPath, abfsPath); | ||
| } | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -903,7 +903,9 @@ components: | |
|
|
||
| AzureStorageConfigInfo: | ||
| type: object | ||
| description: azure storage configuration info | ||
| description: | ||
| azure storage configuration info; base location supports abfs[s] schemes and only supports wasb[s] schemes | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Based on the code, it's not just the base location |
||
| when featureConfiguration.SUPPORT_WASB_CATALOG is true | ||
| allOf: | ||
| - $ref: '#/components/schemas/StorageConfigInfo' | ||
| properties: | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should re-examine this approach. It's not clear to me how this pattern can extend to having more than 2 schemes -- for example S3 has
s3n,s3a, ands3. Also, conceivably, you might want to allow justabfssand notabfsor vice versa.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that the current approach is not at all extensible. I was hoping we have a test annotation like
@WithConfig(configName="true")that could directly modify the config store, but that would require a larger test framework change that would fall out of scope of this PR. Currently the config store only contains getter methods and there's also no way to modify its content via code via a setter method, so I had to proceed with this pattern of initiating twoDropwizardAppExtensioninstances.Do you think we could wait until the test framework to be more interactable with the config store that we refactor this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually thought that test with the two
DropwizardAppExtensionwas pretty good given what we have now. The suggestion you made for an annotation sounds great, but I definitely don't think that's a blocker for this PR.My comment here is specifically about having a single boolean config specific to WASB (+WASBS) and how that doesn't scale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know that we need to extend this configuration to support
s3n, etc. I do think we could push the check into theStorageLocationclass so that it can be responsible for tracking which scheme(s) it supports and which should be rejected at creation. So, e.g., theAzureLocationclass can implement acanCreateNew()method or something and it can, internally, determine whetherwasbis supported or not. Some futureS3StorageLocationcan tracks3xURLs, though AFAIK there's no reason to explicitly add/deny support for any of those variations are supported by the existing IcebergS3FileIO.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we probably should support it. Indeed, don't we already block
httppaths?Also, why is this specific to catalog creation? The approach here just seems very specific to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that rationalizing this with the
StorageLocationorFileIOorStorageConfigurationInfoorStorageTypetypes is a good idea. Right now it seems like the logic for dealing with different schemes and file systems is not centralized enough.