Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions python/setup.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,8 @@ python_requires = >=3.7
install_requires =
mmh3
singledispatch
pyyaml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this make pyyaml a dependency of the library itself? If so, is there anyway to make it just a “test” dependency or something or will we need it at runtime?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need it at runtime. Specifically in the TableMetadata class which hasn't been merged in yet but the idea is that reading in a table metadata file will be validated against the open api spec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we gonna do that for every file on every read? Or just as a test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a great question and I think we can decide on that later. The TableMetadata PR is still a WIP but that class includes this validate method. So we may have certain paths where we don't validate (loading a metadata file from a path we got from a catalog) and other paths where we do validate (like before writing+committing a new metadata file), just as an example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to pin it on a version. Currently it will just install the latest greatest, but there might be breaking changes (happens all the time in Python land :)

Suggested change
pyyaml
pyyaml==6.0

include_package_data = true
[options.extras_require]
arrow =
pyarrow
Expand All @@ -53,3 +55,6 @@ dev=
pytest
[options.packages.find]
where = src
[options.package_data]
""=
../src/iceberg/assets/rest-catalog-open-api.yaml
16 changes: 16 additions & 0 deletions python/src/iceberg/assets/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
1 change: 1 addition & 0 deletions python/src/iceberg/assets/rest-catalog-open-api.yaml
32 changes: 32 additions & 0 deletions python/src/iceberg/utils/openapi.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import os
from functools import lru_cache

import yaml

from iceberg import assets


@cache
def read_yaml_file(path: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def read_yaml_file(path: str):
def read_yaml_file(path: str) -> Dict[str, Any]:

return yaml.safe_load(open(path))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: If something happens, we'll clean up the connection:

with open(path) as f:
    return yaml.safe_load(f



def load_openapi_spec():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def load_openapi_spec():
def load_openapi_spec() -> Dict[str, Any]:

return read_yaml_file(os.path.join(list(assets.__path__)[0], "rest-catalog-open-api.yaml"))
37 changes: 37 additions & 0 deletions python/tests/utils/test_openapi.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import os
import tempfile

import yaml

from iceberg.utils import openapi


def test_reading_a_yaml_file():
with tempfile.TemporaryDirectory() as tmpdirname:
data = {"foo": "bar"}
file_location = os.path.join(tmpdirname, "foo.yaml")
with open(file_location, "w") as f:
yaml.dump(data, f)
assert openapi.read_yaml_file(file_location) == {"foo": "bar"}


def test_loading_rest_catalog_openapi_spec():
rest_catalog_openapi_spec = openapi.load_openapi_spec()
assert rest_catalog_openapi_spec["info"]["title"] == "Apache Iceberg REST Catalog API"