Skip to content
This repository has been archived by the owner on Apr 22, 2024. It is now read-only.

fix(ui): the derived data must have an id field with UUID content #400

Merged
merged 7 commits into from
Oct 10, 2018

Conversation

obdulia-losantos
Copy link
Collaborator

@obdulia-losantos obdulia-losantos commented Oct 9, 2018

Reducing validation errors

  • If there is an "id" field, the random generated content will be UUID complaint.
  • If the Input does not have an "id" field the passthrough mapping will include the #!uuid rule and the "id" field in the derived Entity Type.
  • If one of the given Entity Types does not have an "id" field will throw an error.

@lordmallam
Copy link
Contributor

Why must the input schema have an id?. We have basically no control over the shape of expected incoming data.

@obdulia-losantos
Copy link
Collaborator Author

But we have some restrictions on the type and content of data we expect 🤷‍♀️

@obdulia-losantos
Copy link
Collaborator Author

I do not pretend to merge this PR, but I want to bold the fact that sometimes the validation fails due to the random data we generate with the library, and the passthrough mapping without "id" does not pass the validation 😒

@obdulia-losantos obdulia-losantos changed the title fix(ui): the schema must have an id field with UUID content fix(ui): the derived data must have an id field with UUID content Oct 9, 2018
@admbtlr
Copy link
Contributor

admbtlr commented Oct 9, 2018

What do you think @shawnsarwar ?

@shawnsarwar
Copy link
Contributor

Here's my full brain-dump on this. Every entity needs to have an ID that's unique the reason being we want to be able to have durable links between entities. From the kernel side, we need to be able to access any entity via the url /entities/{id-of-entity}. We can either have a fixed field for that ID in the payload as we do now, or complicate things a bit by using an annotation in the schema to specify which payload field is the ID. This is provided for in Avro + salad using @jsonLDPredicate -> type: @id. This is more complex to implement as we then need to understand the schema of each message in order to figure out which field needs to be come the kernel ID. It requires the producer to understand the same.

@shawnsarwar
Copy link
Contributor

To the particulars of this PR. Perhaps when creating a pass-through pipeline, if the input schema does not have a valid UUID in the ID field, one is created. If an ID field is present but isn't a UUID, we have to prepend the old _id field with some signifier and make a new one and properly handle redirection of the old ID to the not_aether_id field.

@obdulia-losantos
Copy link
Collaborator Author

More or less this PR tries to reduce the validation errors.

  • If there is an "id" field, the random generated content will be UUID complaint.
  • If the Input does not have an "id" field the passthrough mapping will include the #!uuid rule and the "id" field in the derived Entity Type.
  • If one of the given Entity Types does not have an "id" field will throw an error.

@lordmallam @shawnsarwar @adamvert is this at least enough?

@shawnsarwar
Copy link
Contributor

@obdulia-losantos I think that should be sufficient.

@shawnsarwar shawnsarwar self-requested a review October 9, 2018 15:47
@obdulia-losantos obdulia-losantos merged commit bbc50ac into parallel-extraction Oct 10, 2018
@obdulia-losantos obdulia-losantos deleted the parallel-schema-id branch October 10, 2018 09:11
lordmallam added a commit that referenced this pull request Oct 10, 2018
* feat: add mapping sets to kernel (#337)

* refactor: mapping set model

* chore: joining tables - submission, mapping

* chore: add input to mapping set model

* fix: (kernel-test) mapping set

* fix: remove unused containers

* fix: db network

* fix: set mappingset input to nullable

* chore: mappingset migration

* fix: (kernel) mapping set to mapping relationship

* fix: (kernel) extraction on only active mappings within a set

* chore: (kernel) migration clean-up

* revert: migration

* chore: merge migrations

* fix: kernel model update

* fix: filter typo

* chore: update kernel ERD

* chore: migrate mappings to mapping set

* fix: clear mapping projectschemas then readd

* fix: format code

* fix: format code

* fix: pylint whitespaces

* feat: (ui) mappingsets to pipelines  (#361)

* feat: ui contract model

* fix: kernel migration

* fix: kernel migrations

* feat: ui mapping set

* feat: react component update

* test: (ui) pipeline fetch

* fix: remove comments

* fix: readonly message

* fix: model default_name

* fix: loop mapping set pages

* fix: test add pipeline

* fix: artefacts generation with mapping sets (#350)

* fix: tweaking models (100% coverage)

* fix: tweaking filters (100% coverage)

* fix(views): distinct count in stats

* fix: upsert project artefacts with mapping sets

* fix: API urls

* fix(common): submit to kernel with mapping set

* fix(odk): submission to kernel

* fix(odk): generated mappings are read only

* fix: naming naming naming

* fix: tweaking

* fix(couchdb): adapt to mapping set model

* fix: reactivate UI tests in travis (#371)

* fix: reactivate UI tests in travis

* test: sass-lint rules

* fix mappingset migration (#379)

* chore: add mappingset to client test_fixtures

* chore: swap mapping.id for mappingset.id in submission generation

* fix: use model from swagger for submissions in integration tests

* fix: remove useless test and change scope of entity generation

* feat (ui): pipeline fetch/publish using mapping set (#373)

* fix: tweaking models (100% coverage)

* fix: tweaking filters (100% coverage)

* fix(views): distinct count in stats

* fix: upsert project artefacts with mapping sets

* fix: API urls

* fix(common): submit to kernel with mapping set

* fix(odk): submission to kernel

* fix(odk): generated mappings are read only

* feat: ui contract model

* fix: kernel migration

* fix: kernel migrations

* feat: ui mapping set

* feat: react component update

* test: (ui) pipeline fetch

* fix: remove comments

* fix: readonly message

* fix: naming naming naming

* fix: model default_name

* fix: loop mapping set pages

* fix: test add pipeline

* fix: tweaking

* fix: pipeline:contracts

* feat: pipeline publish

* fix(ui): test

* fix: temp deactivate couch-sync tests

* test (ui): mapping set 100%

* fix: project name

* fix: ui test consistency

* chore: readonly class

* chore: selected pipeline readonly class

* added styling for readonly-pipeline in overview screen

* added styling to readonly-pipeline navbar

* added styling for read-only text inputs

* better presentation of mapping-definitions json textarea.

* fix (ui): pipeline - contract infix

* fix (ui): fix pipeline view

* fix(ui): test

* fix (ui): check pipelines lenght

* fix (ui): migration fix

* fix(ui): contract migration

* fix: artefacts names (#387)

* fix(ui): filter piplines - redux

* feat(kernel): create an empty mapping along with the passthrough one (#389)

* feat(kernel): create empty mapping

* fix: run_entity_extraction

* fixed css grid and added word break for long titles without breaking space (#397)

* docs(ui): fix model comments (#401)

* feat(kernel): include a random input in the generated mapping (#402)

* feat: include input in generated mapping

* fix: do not duplicate constants

* fix(ui): the derived data must have an id field with UUID content (#400)

* fix: the schema must have an id field with UUID content

* fix: apply only to derived schemas

* fix: also derived entity type

* fix: cleaning

* fix: check id field in EntityTypes list

* test: implement id rule

* fix: including docs
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants