
Backporting changes from Private Datasets feature branch #992

Merged: 6 commits into master from chore/private-datasets-backports on Dec 17, 2024

Conversation

@s373r s373r commented Dec 16, 2024

Description

  • Some changes have been piling up in the Private Datasets branch that are better merged before the feature is finalized.
  • It also helps readers of the Private Datasets branch by keeping them from being distracted by changes that are not directly related to the epic.

Added

  • kamu-adapter-graphql: added macros (from_catalog_n!() & unsafe_from_catalog_n!())
    that simplify the extraction of components from the DI catalog
  • database-common: generalized the data-processing pagination logic into EntityPageStreamer
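The real `from_catalog_n!()` macro lives in kamu-adapter-graphql and targets the project's actual DI catalog; as a rough illustration of the idea (resolving several components in one expression instead of repeated lookups), here is a toy sketch. The `Catalog` type, its `add`/`get_one` methods, and the macro body below are invented for illustration and do not mirror the real implementation.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::Arc;

// Toy DI catalog: maps a component's type to a shared, type-erased instance.
struct Catalog {
    components: HashMap<TypeId, Arc<dyn Any + Send + Sync>>,
}

impl Catalog {
    fn new() -> Self {
        Self { components: HashMap::new() }
    }

    fn add<T: Any + Send + Sync>(&mut self, component: T) {
        self.components.insert(TypeId::of::<T>(), Arc::new(component));
    }

    // Resolves one component by type, panicking if it was never registered.
    fn get_one<T: Any + Send + Sync>(&self) -> Arc<T> {
        self.components
            .get(&TypeId::of::<T>())
            .expect("component not registered")
            .clone()
            .downcast::<T>()
            .ok()
            .expect("component type mismatch")
    }
}

// Expands a list of types into a tuple of resolved components,
// so callers can destructure everything they need in one line.
macro_rules! from_catalog_n {
    ($catalog:expr, $($T:ty),+ $(,)?) => {
        ( $( $catalog.get_one::<$T>() ),+ )
    };
}

struct AuthService;
struct DatasetRepo;

fn main() {
    let mut catalog = Catalog::new();
    catalog.add(AuthService);
    catalog.add(DatasetRepo);

    // One macro call instead of two separate lookups.
    let (_auth, _repo) = from_catalog_n!(catalog, AuthService, DatasetRepo);
    println!("resolved 2 components from the catalog");
}
```

The payoff is purely ergonomic: GraphQL resolvers that need several services can pull them all out of the catalog with a single destructuring `let`.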
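The general pattern behind `EntityPageStreamer` is repeated page fetches until a short page signals the end of the data. The real implementation in database-common is async and stream-based; the synchronous sketch below, with invented names (`PaginationOpts` fields, `collect_all_pages`), only illustrates that looping structure.

```rust
// Offset/limit pair handed to each page fetch.
struct PaginationOpts {
    offset: usize,
    limit: usize,
}

/// Repeatedly fetches fixed-size pages until a short page signals the end,
/// accumulating all entities.
fn collect_all_pages<T, F>(mut fetch_page: F, page_size: usize) -> Vec<T>
where
    F: FnMut(PaginationOpts) -> Vec<T>,
{
    let mut all = Vec::new();
    let mut offset = 0;
    loop {
        let page = fetch_page(PaginationOpts { offset, limit: page_size });
        let fetched = page.len();
        all.extend(page);
        if fetched < page_size {
            break; // short (or empty) page: no more entities
        }
        offset += fetched;
    }
    all
}

fn main() {
    // Simulate a repository of 10 entities queried in pages of 3.
    let data: Vec<u32> = (0..10).collect();
    let fetch = |opts: PaginationOpts| {
        let end = (opts.offset + opts.limit).min(data.len());
        data[opts.offset.min(data.len())..end].to_vec()
    };
    let all = collect_all_pages(fetch, 3);
    assert_eq!(all, data);
    println!("streamed {} entities in pages of 3", all.len());
}
```

Centralizing this loop means each repository only supplies the page-fetch callback, instead of re-implementing offset bookkeeping and termination checks.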

Changed

  • Sped up project build time by removing unused dependencies that automated tools had not detected

Checklist before requesting a review

@s373r s373r force-pushed the chore/private-datasets-backports branch from d8c2c24 to 8d56658 Compare December 16, 2024 21:31

@s373r s373r left a comment


Added notes for readers

@@ -1398,7 +1398,7 @@
]
},
"post": {
"description": "### Regular Queries\nThis endpoint lets you execute arbitrary SQL that can access multiple\ndatasets at once.\n\nExample request body:\n```json\n{\n \"query\": \"select event_time, from, to, close from \\\"kamu/eth-to-usd\\\"\",\n \"limit\": 3,\n \"queryDialect\": \"SqlDataFusion\",\n \"dataFormat\": \"JsonAoA\",\n \"schemaFormat\": \"ArrowJson\"\n}\n```\n\nExample response:\n```json\n{\n \"output\": {\n \"data\": [\n [\"2024-09-02T21:50:00Z\", \"eth\", \"usd\", 2537.07],\n [\"2024-09-02T21:51:00Z\", \"eth\", \"usd\", 2541.37],\n [\"2024-09-02T21:52:00Z\", \"eth\", \"usd\", 2542.66]\n ],\n \"dataFormat\": \"JsonAoA\",\n \"schema\": {\"fields\": [\"...\"]},\n \"schemaFormat\": \"ArrowJson\"\n }\n}\n```\n\n### Verifiable Queries\n[Cryptographic proofs](https://docs.kamu.dev/node/commitments) can be\nalso requested to hold the node **forever accountable** for the provided\nresult.\n\nExample request body:\n```json\n{\n \"query\": \"select event_time, from, to, close from \\\"kamu/eth-to-usd\\\"\",\n \"limit\": 3,\n \"queryDialect\": \"SqlDataFusion\",\n \"dataFormat\": \"JsonAoA\",\n \"schemaFormat\": \"ArrowJson\",\n \"include\": [\"proof\"]\n}\n```\n\nCurrently we support verifiability by ensuring that queries are\ndeterministic and fully reproducible and signing the original response with\nNode's private key. 
In future more types of proofs will be supported.\n\nExample response:\n```json\n{\n \"input\": {\n \"query\": \"select event_time, from, to, close from \\\"kamu/eth-to-usd\\\"\",\n \"queryDialect\": \"SqlDataFusion\",\n \"dataFormat\": \"JsonAoA\",\n \"include\": [\"Input\", \"Proof\", \"Schema\"],\n \"schemaFormat\": \"ArrowJson\",\n \"datasets\": [{\n \"id\": \"did:odf:fed0119d20360650afd3d412c6b11529778b784c697559c0107d37ee5da61465726c4\",\n \"alias\": \"kamu/eth-to-usd\",\n \"blockHash\": \"f1620708557a44c88d23c83f2b915abc10a41cc38d2a278e851e5dc6bb02b7e1f9a1a\"\n }],\n \"skip\": 0,\n \"limit\": 3\n },\n \"output\": {\n \"data\": [\n [\"2024-09-02T21:50:00Z\", \"eth\", \"usd\", 2537.07],\n [\"2024-09-02T21:51:00Z\", \"eth\", \"usd\", 2541.37],\n [\"2024-09-02T21:52:00Z\", \"eth\", \"usd\", 2542.66]\n ],\n \"dataFormat\": \"JsonAoA\",\n \"schema\": {\"fields\": [\"...\"]},\n \"schemaFormat\": \"ArrowJson\"\n },\n \"subQueries\": [],\n \"commitment\": {\n \"inputHash\": \"f1620e23f7d8cdde7504eadb86f3cdf34b3b1a7d71f10fe5b54b528dd803387422efc\",\n \"outputHash\": \"f1620e91f4d3fa26bc4ca0c49d681c8b630550239b64d3cbcfd7c6c2d6ff45998b088\",\n \"subQueriesHash\": \"f1620ca4510738395af1429224dd785675309c344b2b549632e20275c69b15ed1d210\"\n },\n \"proof\": {\n \"type\": \"Ed25519Signature2020\",\n \"verificationMethod\": \"did:key:z6MkkhJQPHpA41mTPLFgBeygnjeeADUSwuGDoF9pbGQsfwZp\",\n \"proofValue\": \"uJfY3_g03WbmqlQG8TL-WUxKYU8ZoJaP14MzOzbnJedNiu7jpoKnCTNnDI3TYuaXv89vKlirlGs-5AN06mBseCg\"\n }\n}\n```\n\nA client that gets a proof in response should\nperform [a few basic steps](https://docs.kamu.dev/node/commitments#response-validation) to validate\nthe proof integrity. 
For example making sure that the DID in\n`proof.verificationMethod` actually corresponds to the node you're querying\ndata from and that the signature in `proof.proofValue` is actually valid.\nOnly after this you can use this proof to hold the node accountable for the\nresult.\n\nA proof can be stored long-term and then disputed at a later point using\nyour own node or a 3rd party node you can trust via the\n[`/verify`](#tag/odf-query/POST/verify) endpoint.\n\nSee [commitments documentation](https://docs.kamu.dev/node/commitments) for details.",
"description": "### Regular Queries\nThis endpoint lets you execute arbitrary SQL that can access multiple\ndatasets at once.\n\nExample request body:\n```json\n{\n \"query\": \"select event_time, from, to, close from \\\"kamu/eth-to-usd\\\"\",\n \"limit\": 3,\n \"queryDialect\": \"SqlDataFusion\",\n \"dataFormat\": \"JsonAoA\",\n \"schemaFormat\": \"ArrowJson\"\n}\n```\n\nExample response:\n```json\n{\n \"output\": {\n \"data\": [\n [\"2024-09-02T21:50:00Z\", \"eth\", \"usd\", 2537.07],\n [\"2024-09-02T21:51:00Z\", \"eth\", \"usd\", 2541.37],\n [\"2024-09-02T21:52:00Z\", \"eth\", \"usd\", 2542.66]\n ],\n \"dataFormat\": \"JsonAoA\",\n \"schema\": {\"fields\": [\"...\"]},\n \"schemaFormat\": \"ArrowJson\"\n }\n}\n```\n\n### Verifiable Queries\n[Cryptographic proofs](https://docs.kamu.dev/node/commitments) can be\nalso requested to hold the node **forever accountable** for the provided\nresult.\n\nExample request body:\n```json\n{\n \"query\": \"select event_time, from, to, close from \\\"kamu/eth-to-usd\\\"\",\n \"limit\": 3,\n \"queryDialect\": \"SqlDataFusion\",\n \"dataFormat\": \"JsonAoA\",\n \"schemaFormat\": \"ArrowJson\",\n \"include\": [\"proof\"]\n}\n```\n\nCurrently, we support verifiability by ensuring that queries are\ndeterministic and fully reproducible and signing the original response with\nNode's private key. 
In future more types of proofs will be supported.\n\nExample response:\n```json\n{\n \"input\": {\n \"query\": \"select event_time, from, to, close from \\\"kamu/eth-to-usd\\\"\",\n \"queryDialect\": \"SqlDataFusion\",\n \"dataFormat\": \"JsonAoA\",\n \"include\": [\"Input\", \"Proof\", \"Schema\"],\n \"schemaFormat\": \"ArrowJson\",\n \"datasets\": [{\n \"id\": \"did:odf:fed0119d20360650afd3d412c6b11529778b784c697559c0107d37ee5da61465726c4\",\n \"alias\": \"kamu/eth-to-usd\",\n \"blockHash\": \"f1620708557a44c88d23c83f2b915abc10a41cc38d2a278e851e5dc6bb02b7e1f9a1a\"\n }],\n \"skip\": 0,\n \"limit\": 3\n },\n \"output\": {\n \"data\": [\n [\"2024-09-02T21:50:00Z\", \"eth\", \"usd\", 2537.07],\n [\"2024-09-02T21:51:00Z\", \"eth\", \"usd\", 2541.37],\n [\"2024-09-02T21:52:00Z\", \"eth\", \"usd\", 2542.66]\n ],\n \"dataFormat\": \"JsonAoA\",\n \"schema\": {\"fields\": [\"...\"]},\n \"schemaFormat\": \"ArrowJson\"\n },\n \"subQueries\": [],\n \"commitment\": {\n \"inputHash\": \"f1620e23f7d8cdde7504eadb86f3cdf34b3b1a7d71f10fe5b54b528dd803387422efc\",\n \"outputHash\": \"f1620e91f4d3fa26bc4ca0c49d681c8b630550239b64d3cbcfd7c6c2d6ff45998b088\",\n \"subQueriesHash\": \"f1620ca4510738395af1429224dd785675309c344b2b549632e20275c69b15ed1d210\"\n },\n \"proof\": {\n \"type\": \"Ed25519Signature2020\",\n \"verificationMethod\": \"did:key:z6MkkhJQPHpA41mTPLFgBeygnjeeADUSwuGDoF9pbGQsfwZp\",\n \"proofValue\": \"uJfY3_g03WbmqlQG8TL-WUxKYU8ZoJaP14MzOzbnJedNiu7jpoKnCTNnDI3TYuaXv89vKlirlGs-5AN06mBseCg\"\n }\n}\n```\n\nA client that gets a proof in response should\nperform [a few basic steps](https://docs.kamu.dev/node/commitments#response-validation) to validate\nthe proof integrity. 
For example making sure that the DID in\n`proof.verificationMethod` actually corresponds to the node you're querying\ndata from and that the signature in `proof.proofValue` is actually valid.\nOnly after this you can use this proof to hold the node accountable for the\nresult.\n\nA proof can be stored long-term and then disputed at a later point using\nyour own node or a 3rd party node you can trust via the\n[`/verify`](#tag/odf-query/POST/verify) endpoint.\n\nSee [commitments documentation](https://docs.kamu.dev/node/commitments) for details.",

Note for readers:
[screenshot]

@@ -1367,7 +1367,7 @@
]
},
"post": {
"description": "### Regular Queries\nThis endpoint lets you execute arbitrary SQL that can access multiple\ndatasets at once.\n\nExample request body:\n```json\n{\n \"query\": \"select event_time, from, to, close from \\\"kamu/eth-to-usd\\\"\",\n \"limit\": 3,\n \"queryDialect\": \"SqlDataFusion\",\n \"dataFormat\": \"JsonAoA\",\n \"schemaFormat\": \"ArrowJson\"\n}\n```\n\nExample response:\n```json\n{\n \"output\": {\n \"data\": [\n [\"2024-09-02T21:50:00Z\", \"eth\", \"usd\", 2537.07],\n [\"2024-09-02T21:51:00Z\", \"eth\", \"usd\", 2541.37],\n [\"2024-09-02T21:52:00Z\", \"eth\", \"usd\", 2542.66]\n ],\n \"dataFormat\": \"JsonAoA\",\n \"schema\": {\"fields\": [\"...\"]},\n \"schemaFormat\": \"ArrowJson\"\n }\n}\n```\n\n### Verifiable Queries\n[Cryptographic proofs](https://docs.kamu.dev/node/commitments) can be\nalso requested to hold the node **forever accountable** for the provided\nresult.\n\nExample request body:\n```json\n{\n \"query\": \"select event_time, from, to, close from \\\"kamu/eth-to-usd\\\"\",\n \"limit\": 3,\n \"queryDialect\": \"SqlDataFusion\",\n \"dataFormat\": \"JsonAoA\",\n \"schemaFormat\": \"ArrowJson\",\n \"include\": [\"proof\"]\n}\n```\n\nCurrently we support verifiability by ensuring that queries are\ndeterministic and fully reproducible and signing the original response with\nNode's private key. 
In future more types of proofs will be supported.\n\nExample response:\n```json\n{\n \"input\": {\n \"query\": \"select event_time, from, to, close from \\\"kamu/eth-to-usd\\\"\",\n \"queryDialect\": \"SqlDataFusion\",\n \"dataFormat\": \"JsonAoA\",\n \"include\": [\"Input\", \"Proof\", \"Schema\"],\n \"schemaFormat\": \"ArrowJson\",\n \"datasets\": [{\n \"id\": \"did:odf:fed0119d20360650afd3d412c6b11529778b784c697559c0107d37ee5da61465726c4\",\n \"alias\": \"kamu/eth-to-usd\",\n \"blockHash\": \"f1620708557a44c88d23c83f2b915abc10a41cc38d2a278e851e5dc6bb02b7e1f9a1a\"\n }],\n \"skip\": 0,\n \"limit\": 3\n },\n \"output\": {\n \"data\": [\n [\"2024-09-02T21:50:00Z\", \"eth\", \"usd\", 2537.07],\n [\"2024-09-02T21:51:00Z\", \"eth\", \"usd\", 2541.37],\n [\"2024-09-02T21:52:00Z\", \"eth\", \"usd\", 2542.66]\n ],\n \"dataFormat\": \"JsonAoA\",\n \"schema\": {\"fields\": [\"...\"]},\n \"schemaFormat\": \"ArrowJson\"\n },\n \"subQueries\": [],\n \"commitment\": {\n \"inputHash\": \"f1620e23f7d8cdde7504eadb86f3cdf34b3b1a7d71f10fe5b54b528dd803387422efc\",\n \"outputHash\": \"f1620e91f4d3fa26bc4ca0c49d681c8b630550239b64d3cbcfd7c6c2d6ff45998b088\",\n \"subQueriesHash\": \"f1620ca4510738395af1429224dd785675309c344b2b549632e20275c69b15ed1d210\"\n },\n \"proof\": {\n \"type\": \"Ed25519Signature2020\",\n \"verificationMethod\": \"did:key:z6MkkhJQPHpA41mTPLFgBeygnjeeADUSwuGDoF9pbGQsfwZp\",\n \"proofValue\": \"uJfY3_g03WbmqlQG8TL-WUxKYU8ZoJaP14MzOzbnJedNiu7jpoKnCTNnDI3TYuaXv89vKlirlGs-5AN06mBseCg\"\n }\n}\n```\n\nA client that gets a proof in response should\nperform [a few basic steps](https://docs.kamu.dev/node/commitments#response-validation) to validate\nthe proof integrity. 
For example making sure that the DID in\n`proof.verificationMethod` actually corresponds to the node you're querying\ndata from and that the signature in `proof.proofValue` is actually valid.\nOnly after this you can use this proof to hold the node accountable for the\nresult.\n\nA proof can be stored long-term and then disputed at a later point using\nyour own node or a 3rd party node you can trust via the\n[`/verify`](#tag/odf-query/POST/verify) endpoint.\n\nSee [commitments documentation](https://docs.kamu.dev/node/commitments) for details.",
"description": "### Regular Queries\nThis endpoint lets you execute arbitrary SQL that can access multiple\ndatasets at once.\n\nExample request body:\n```json\n{\n \"query\": \"select event_time, from, to, close from \\\"kamu/eth-to-usd\\\"\",\n \"limit\": 3,\n \"queryDialect\": \"SqlDataFusion\",\n \"dataFormat\": \"JsonAoA\",\n \"schemaFormat\": \"ArrowJson\"\n}\n```\n\nExample response:\n```json\n{\n \"output\": {\n \"data\": [\n [\"2024-09-02T21:50:00Z\", \"eth\", \"usd\", 2537.07],\n [\"2024-09-02T21:51:00Z\", \"eth\", \"usd\", 2541.37],\n [\"2024-09-02T21:52:00Z\", \"eth\", \"usd\", 2542.66]\n ],\n \"dataFormat\": \"JsonAoA\",\n \"schema\": {\"fields\": [\"...\"]},\n \"schemaFormat\": \"ArrowJson\"\n }\n}\n```\n\n### Verifiable Queries\n[Cryptographic proofs](https://docs.kamu.dev/node/commitments) can be\nalso requested to hold the node **forever accountable** for the provided\nresult.\n\nExample request body:\n```json\n{\n \"query\": \"select event_time, from, to, close from \\\"kamu/eth-to-usd\\\"\",\n \"limit\": 3,\n \"queryDialect\": \"SqlDataFusion\",\n \"dataFormat\": \"JsonAoA\",\n \"schemaFormat\": \"ArrowJson\",\n \"include\": [\"proof\"]\n}\n```\n\nCurrently, we support verifiability by ensuring that queries are\ndeterministic and fully reproducible and signing the original response with\nNode's private key. 
In future more types of proofs will be supported.\n\nExample response:\n```json\n{\n \"input\": {\n \"query\": \"select event_time, from, to, close from \\\"kamu/eth-to-usd\\\"\",\n \"queryDialect\": \"SqlDataFusion\",\n \"dataFormat\": \"JsonAoA\",\n \"include\": [\"Input\", \"Proof\", \"Schema\"],\n \"schemaFormat\": \"ArrowJson\",\n \"datasets\": [{\n \"id\": \"did:odf:fed0119d20360650afd3d412c6b11529778b784c697559c0107d37ee5da61465726c4\",\n \"alias\": \"kamu/eth-to-usd\",\n \"blockHash\": \"f1620708557a44c88d23c83f2b915abc10a41cc38d2a278e851e5dc6bb02b7e1f9a1a\"\n }],\n \"skip\": 0,\n \"limit\": 3\n },\n \"output\": {\n \"data\": [\n [\"2024-09-02T21:50:00Z\", \"eth\", \"usd\", 2537.07],\n [\"2024-09-02T21:51:00Z\", \"eth\", \"usd\", 2541.37],\n [\"2024-09-02T21:52:00Z\", \"eth\", \"usd\", 2542.66]\n ],\n \"dataFormat\": \"JsonAoA\",\n \"schema\": {\"fields\": [\"...\"]},\n \"schemaFormat\": \"ArrowJson\"\n },\n \"subQueries\": [],\n \"commitment\": {\n \"inputHash\": \"f1620e23f7d8cdde7504eadb86f3cdf34b3b1a7d71f10fe5b54b528dd803387422efc\",\n \"outputHash\": \"f1620e91f4d3fa26bc4ca0c49d681c8b630550239b64d3cbcfd7c6c2d6ff45998b088\",\n \"subQueriesHash\": \"f1620ca4510738395af1429224dd785675309c344b2b549632e20275c69b15ed1d210\"\n },\n \"proof\": {\n \"type\": \"Ed25519Signature2020\",\n \"verificationMethod\": \"did:key:z6MkkhJQPHpA41mTPLFgBeygnjeeADUSwuGDoF9pbGQsfwZp\",\n \"proofValue\": \"uJfY3_g03WbmqlQG8TL-WUxKYU8ZoJaP14MzOzbnJedNiu7jpoKnCTNnDI3TYuaXv89vKlirlGs-5AN06mBseCg\"\n }\n}\n```\n\nA client that gets a proof in response should\nperform [a few basic steps](https://docs.kamu.dev/node/commitments#response-validation) to validate\nthe proof integrity. 
For example making sure that the DID in\n`proof.verificationMethod` actually corresponds to the node you're querying\ndata from and that the signature in `proof.proofValue` is actually valid.\nOnly after this you can use this proof to hold the node accountable for the\nresult.\n\nA proof can be stored long-term and then disputed at a later point using\nyour own node or a 3rd party node you can trust via the\n[`/verify`](#tag/odf-query/POST/verify) endpoint.\n\nSee [commitments documentation](https://docs.kamu.dev/node/commitments) for details.",

ditto

@@ -40,7 +40,7 @@ impl From<&AccountID> for odf::AccountID {

 impl From<AccountID> for String {
     fn from(val: AccountID) -> Self {
-        val.0.as_did_str().to_string()
+        val.0.to_string()

Note for readers: the same but shorter (here and below)

async fn test_handler_panics() {
// Not expecting panic to be trapped - that's the job of an HTTP server
let schema = kamu_adapter_graphql::schema_quiet();
schema.execute(async_graphql::Request::new(indoc!(

Note for readers: now we get the error, not the panic


For this reason, I have decided to remove this test

@s373r s373r marked this pull request as ready for review December 16, 2024 21:43
@zaychenko-sergei zaychenko-sergei merged commit ba44af5 into master Dec 17, 2024
5 of 6 checks passed
@zaychenko-sergei zaychenko-sergei deleted the chore/private-datasets-backports branch December 17, 2024 09:23
s373r added a commit that referenced this pull request Dec 19, 2024
* Do not show usage error for --all flag (#960)

* Do not show usage error for --all flag

When the --all flag is set for the `repo delete` command
and there are no repositories to delete, do not show a usage error.

* Improve args validation

* Improve args validation, e2e tests

* Typo corrected in feature flags (#974)

* Images, kamu-base-git: fix collision of executable files (#975)

* 868 api server provide feature flags for UI (#976)

Separated runtime and UI configuration flags. UI config is provided by API server too.

* Release v0.210.0 + minor deps

* 854 persistent storage of dataset dependencies graph (#973)

Dependency graph service moved to 'datasets' domain.
Defined dataset dependency repository interface and created 3 implementations.
No more postponed initialization, organized initial setup in the form of an indexer.
Added telemetry extensions on the way.
Tests for repositories, stabilized other tests.
Cascading effect on delete within the dataset entry domain.

* v0.211.0 + minor deps

* Fixed image building (#977)

Replaced cascade delete of dataset entries in graph with more explicit events to allow orphan upstream dependencies where only ID is given

* Upgrade to datafusion 43

* Use thiserror v2 throughout

* trust-dns-resolver => hickory-resolver + minor deps

* Fix non-sequential offsets on ingest

* 0.212.0

* Use KAMU_CONTAINER_RUNTIME_TYPE env var in Makefile (#991)

* Use KAMU_CONTAINER_RUNTIME_TYPE env var in Makefile
* Make podman default engine for e2e tests

* Backporting changes from Private Datasets feature branch (#992)

* Backport tweaks

* Add doc strings

* Remove unused deps

* Remove obsolete test

* CHANGELOG.md: update

* Tips after self-review

* Delete env var on dataset delete (#993)

* Delete env var on dataset delete

* 984 refactoring separate planning and execution phases in key dataset manipulation services (#994)

* Draft split of `CompactionService` into planner and execution parts

* Compaction cleanups

* Compacting more cleanups

* Compacting: read old HEAD on planning phase

* Reset service split on planner and execution

* Extracted `MetadataQueryService` - to query polling, push sources and set transform, instead of ingest/transform planners

* DataWriterMetadataState became part of polling ingest item at the planning phase

* Setting watermark: separate planner and execution service

* Push ingest service prepared for split

* Push ingest split on planning and executing

* Made some order in infra/core services

* {Flow,Task,Outbox}Executor=>Agent

* Unified naming of planners and executors

* Revised telemetry in refactored components

* Review: DataWriterDataFusionBuilder flattened

* changelog

* v0.213.0 + minor deps

* kamu-dev-base: include short commit hash as well (#995)

* v0.213.1: less aggressive telemetry with `DataWriterMetadataState`

---------

Co-authored-by: Andrii Demus <[email protected]>
Co-authored-by: Sergei Zaychenko <[email protected]>
Co-authored-by: Sergii Mikhtoniuk <[email protected]>
Co-authored-by: Roman Boiko <[email protected]>
s373r added a commit that referenced this pull request Jan 17, 2025
* Private Datasets: GQL API: Ability to change dataset visibility (#814)

* Changes before rebasing

* from_catalog_n: add clippy warnings suppression
* kamu-adapter-auth-oso: add TODOs
* Migrations: re-index ReBAC properties
* test_oso: update imports
* KamuAuthOso: add TODOs
* DatasetActionAuthorizer::check_action_allowed(): add a TODO
* DatasetEntryServiceHarness: update for tests
* RebacService::{get_account_properties(),get_dataset_properties()}: return idempotency
* DatasetEntryRepository::get_dataset_entries(): implement for SQLite & Postgres
* RebacRepository::properties_count(): implement for SQLite & Postgres
* AccountRepository::get_accounts(): implement for SQLite & Postgres
* OsoResourceServiceInMem: handle DatasetLifecycleMessage's
* OsoResourceServiceInMem::initialize(): update types
* Split OsoResourceHolder to OsoResourceServiceInMem & OsoResourceServiceInitializator
* OsoResourceHolder: remove dependency to JOB_KAMU_DATASETS_DATASET_ENTRY_INDEXER
* kamu-cli: register DatasetEntryIndexer even if not in workspace
* Tests stabilization activities
* RebacIndexer: add missed #[interface(dyn InitOnStartup)]
* kamu-cli: kamu_auth_rebac_services::register_dependencies()
* kamu-cli: kamu_adapter_auth_oso::register_dependencies()
* OsoDatasetAuthorizer: integrate OsoResourceHolder
* OsoResourceHolder: introduce
* DatasetEntryIndexer::index_datasets(): increase log severity
* RebacIndexer: introduce
* kamu-adapter-auth-oso: update description
* RebacServiceImpl: dataset_id_entity -> dataset_entity
* test_multi_tenant_rebac_dataset_lifecycle_message_consumer: actualize tests
* kamu-adapter-auth-oso: add anonymous() helper
* kamu-adapter-auth-oso: use MockDatasetRepositoryWriter
* kamu-adapter-auth-oso: actualize tests
* DatasetActionAuthorizer, DatasetAction: add oso-related impls
* OsoDatasetAuthorizer::get_allowed_actions(): return <HashSet<DatasetAction>, InternalError>
* GQL, Dataset::properties(): use kamu_auth_rebac::DatasetProperties
* kamu-adapter-auth{,-rebac}: remove experimental crates
* OsoDatasetAuthorizer: initial RebacService integration
* #[allow(unused_variables)] -> #[expect(unused_variables)]
* kamu-adapter-auth: extract
* kamu-adapter-rebac: initial
* kamu-adapter-oauth, AggregatingDatasetActionAuthorizer: initial
* kamu-adapter-graphql, from_catalog_n!(): introduce
* test_multi_tenant_rebac_dataset_lifecycle_message_consumer: stabilize tests
* SmTP, AxumServerPushProtocolInstance::push_main_flow(): remove extra allocations
* Tests, test_gql_datasets: use macros for tests
* Tests, test_gql_datasets: expected first
* Fixes after rebasing
* Tests: update dataset_create_empty_*()
* RunInDatabaseTransactionLayer: remove unused
* GQL, Datasets: use pretty_assertions::assert_eq!()
* GQL, DatasetPropertyName: remove outdated scalar
* MultiTenantRebacDatasetLifecycleMessageConsumer::handle_dataset_lifecycle_created_message(): add "allows_anonymous_read" property as well
* GQL, Dataset::properties(): return flags for simplicity
* DatasetMut::set_visibility(): stabilize
* Preparations
  - DependencyGraphServiceInMemory: remove extra .int_err() calls
  - Dataset::rebac_properties(): introduce
  - RebacService::get_dataset_properties(): use DatasetPropertyName instead of PropertyName
  - kamu-auth-rebac: extract value constants
  - DatasetMut::{set_publicly_available(),set_anonymous_available()}: ensure account owns dataset
  - DatasetMut: move to own directory
  - DatasetMut::{set_publicly_available(),set_anonymous_available()}: hide methods behind logging guards
  - DatasetMut::set_property(): extract method
  - DatasetMut::set_anonymous_available(): implement
  - DatasetMut::set_publicly_available(): implement
  - RevokeResultSuccess::message(): fix typo

* Fixes after rebasing on 0.208.*

* Tests, kamu-cli: auto-register e2e-user for the e2e mode

* OSO: replace names with IDs in schema

* Tests stabilization

* sqlx: add cached queries

* Build speed-ups: remove unused deps

* test_pull_derivative_mt: correct running

* CHANGELOG: add some entries

* DatasetEntryRepository: simplify lifetimes

* kamu-adapter-auth-oso-rebac: add "-rebac" suffix

* Remove several TODOs

* CHANGELOG.md: add several entries

* OsoDatasetAuthorizer: revisit implementation

* Review 1: GQL: remove Dataset.properties

* Review 1: OsoDatasetAuthorizer::ctor(): fix param name

* database-common, EntityStreamer: introduce

* DatasetEntryServiceImpl: use EntityStreamer

* RebacServiceImpl::get_dataset_properties_by_ids(): add

* PaginationOpts::safe_limit(): add

* Tests, EntityStreamer: add tests with input data

* RebacService::get_dataset_properties_by_ids(): update interface

* DatasetEntryServiceImpl: use EntityStreamer [2]

* OsoResourceServiceInMem: rewrite to use streamed pages

* OsoDatasetAuthorizer: use get_multiple_dataset_resources()

* OsoResourceServiceInitializator: remove

* query_handler_post(): add a comma in doc

* DatasetActionAuthorizer: add TODOs

* test_flow_event_store: fix typos

* OsoDatasetAuthorizer::user_dataset_pair(): remove

* RebacIndexer::index_dataset_entries(): iterate over a stream

* EntityStreamer: remove extra int_err() & resort declarations

* AccountRepository::get_accounts(): streamed version

* RebacIndexer::index_accounts(): use iterate over a stream

* Test fixes

* RebacRepository::get_entity_properties_by_ids(): implementations

* Remove extra as_did_str() call

* RebacRepository::get_entity_properties_by_ids(): implementations[2]

* AccountRepository::accounts_count(): implementations

* PostgresAccountRepository::get_accounts(): implementation

* sqlx: update cached queries

* RebacRepository::get_entity_properties_by_ids(): implementations[3]

* DatasetEntryServiceImpl: use tokio::sync::RwLock

* PostgresDatasetEntryRepository: tweaks

* EntityStreamer -> EntityPageStreamer

* sqlite_generate_placeholders_list: extract & use

* OsoResourceServiceInMem: add a TODO about state

* Search::query(): use from_catalog_n!()

* OsoResourceServiceInMem -> OsoResourceServiceImpl

* KamuAuthOso: impl Deref to Arc<Oso>

* OsoResourceServiceImpl: concrete error types

* kamu-adapter-auth-oso-rebac: remove extra dep

* DatasetEntryRepository: use odf namespace

* DatasetEntryServiceImpl: use odf namespace

* DatasetEntryService::list_entries_owned_by(): do not clone owner_id

* DatasetEntryRepository::get_dataset_entries(): update ORDER BY column

* EntityListing -> EntityPageListing

* Tweaks before merging

* GQL: Dataset.visibility(): return back, after being deleted by mistake (#997)

* Merge actual changes (#998)

* Do not show usage error for --all flag (#960)

* Do not show usage error for --all flag

When the --all flag is set for the `repo delete` command
and there are no repositories to delete, do not show a usage error.

* Improve args validation

* Improve args validation, e2e tests

* Typo corrected in feature flags (#974)

* Images, kamu-base-git: fix collision of executable files (#975)

* 868 api server provide feature flags for UI (#976)

Separated runtime and UI configuration flags. UI config is provided by API server too.

* Release v0.210.0 + minor deps

* 854 persistent storage of dataset dependencies graph (#973)

Dependency graph service moved to 'datasets' domain.
Defined dataset dependency repository interface and created 3 implementations.
No more postponed initialization, organized initial setup in the form of an indexer.
Added telemetry extensions on the way.
Tests for repositories, stabilized other tests.
Cascading effect on delete within the dataset entry domain.

* v0.211.0 + minor deps

* Fixed image building (#977)

Replaced cascade delete of dataset entries in graph with more explicit events to allow orphan upstream dependencies where only ID is given

* Upgrade to datafusion 43

* Use thiserror v2 throughout

* trust-dns-resolver => hickory-resolver + minor deps

* Fix non-sequential offsets on ingest

* 0.212.0

* Use KAMU_CONTAINER_RUNTIME_TYPE env var in Makefile (#991)

* Use KAMU_CONTAINER_RUNTIME_TYPE env var in Makefile
* Make podman default engine for e2e tests

* Backporting changes from Private Datasets feature branch (#992)

* Backport tweaks

* Add doc strings

* Remove unused deps

* Remove obsolete test

* CHANGELOG.md: update

* Tips after self-review

* Delete env var on dataset delete (#993)

* Delete env var on dataset delete

* 984 refactoring separate planning and execution phases in key dataset manipulation services (#994)

* Draft split of `CompactionService` into planner and execution parts

* Compaction cleanups

* Compacting more cleanups

* Compacting: read old HEAD on planning phase

* Reset service split on planner and execution

* Extracted `MetadataQueryService` - to query polling, push sources and set transform, instead of ingest/transform planners

* DataWriterMetadataState became part of polling ingest item at the planning phase

* Setting watermark: separate planner and execution service

* Push ingest service prepared for split

* Push ingest split on planning and executing

* Made some order in infra/core services

* {Flow,Task,Outbox}Executor=>Agent

* Unified naming of planners and executors

* Revised telemetry in refactored components

* Review: DataWriterDataFusionBuilder flattened

* changelog

* v0.213.0 + minor deps

* kamu-dev-base: include short commit hash as well (#995)

* v0.213.1: less aggressive telemetry with `DataWriterMetadataState`

---------

Co-authored-by: Andrii Demus <[email protected]>
Co-authored-by: Sergei Zaychenko <[email protected]>
Co-authored-by: Sergii Mikhtoniuk <[email protected]>
Co-authored-by: Roman Boiko <[email protected]>

* Fixes after merging (#999)

* `DatasetOwnershipService`: moved to the `kamu-dataset` area & implemented via `DatasetEntryServiceImpl` (#1004)

* DatasetOwnershipService: use odf namespace

* DatasetEntryServiceImpl: impl DatasetOwnershipService

* DatasetOwnershipService: move to kamu-datasets scope

* CHANGELOG.md: update

* GQL, DatasetMut::set_visibility(): correct return type (#1007)

* GQL, DatasetMetadata: be prepared for not accessed datasets (#1011)

* GQL, DatasetMetadata: correct processing of dataset's dependencies that are not found (#1013)

* GQL, DatasetMetadata: update dataset's dependencies types (#1014)

* Private Datasets: absorb helpful commits from command updates (#1016)

* E2E: added the ability to create an account using CLI

* OutboxImmediateImpl::post_message_as_json(): return a dispatch error, if present

* Fixes after merge

* GQL: Datasets: auth checks (#1017)

* E2E: `DatasetMut::set_visibility()` (#1032)

* KamuApiServerClient::graphql_api_call_assert_with_token(): remove extra method

* KamuApiServerClient: introduce GraphQLResponse type

* E2E: DatasetMut::set_visibility()

* RebacIndexer: respect predefined treat_datasets_as_public (#1033)

* Chore/private datasets address comments - 1 (#1037)

* DatasetEntryServiceExt: absorb DatasetOwnershipService::get_owned_datasets()

* DatasetEntryServiceExt: absorb all rest DatasetOwnershipService

* kamu-adapter-auth-oso-rebac: remove duplicate dep

* DatasetActionAuthorizer: classify_datasets_by_allowance() -> classify_dataset_handles_by_allowance()

* DatasetRegistry: remove a TODO

* DatasetEntryRepository::get_dataset_entries(): use dataset_name column for sorting in implementations (as it was)

* OsoResourceServiceImpl: state extraction to singleton component

* DatasetActionAuthorizer::check_action_allowed(): use DatasetID instead of DatasetHandle

* DatasetActionAuthorizer::is_action_allowed(): use DatasetID instead of DatasetHandle

* DatasetActionAuthorizer::get_allowed_actions(): use DatasetID instead of DatasetHandle

* DatasetActionAuthorizer: finalization

* ODataServiceContext::list_collections(): use DatasetActionAuthorizer::filtered_datasets_stream()

* Datasets::by_account_impl(): use DatasetActionAuthorizer::filtered_datasets_stream()

* Search::query(): use DatasetActionAuthorizer::filtered_datasets_stream()

* GetDatasetDownstreamDependenciesUseCase: extract

* GetDatasetUpstreamDependenciesUseCase: extract

* AccountServiceImpl::all_accounts(): absorb list_all_accounts() method

* ExpensiveAccountRepository: extract trait

* RebacService::properties_count(): implement

* DatasetEntryService: move list-* operations within an implementation

* Fix a cached sqlx query

* chore/private-datasets-address-comments-vol-2 (#1038)

* ensure_account_owns_dataset() -> ensure_account_is_owner_or_admin()

* {Account,Dataset}Properties::apply(): add

* RebacServiceImpl: inject default properties

* Merge actual changes

* Release (minor): 0.218.0

---------

Co-authored-by: Andrii Demus <[email protected]>
Co-authored-by: Sergei Zaychenko <[email protected]>
Co-authored-by: Sergii Mikhtoniuk <[email protected]>
Co-authored-by: Roman Boiko <[email protected]>