This repository has been archived by the owner on Apr 22, 2024. It is now read-only.

Parallel extraction (#386)
* feat: add mapping sets to kernel (#337)

* refactor: mapping set model

* chore: joining tables - submission, mapping

* chore: add input to mapping set model

* fix: (kernel-test) mapping set

* fix: remove unused containers

* fix: db network

* fix: set mappingset input to nullable

* chore: mappingset migration

* fix: (kernel) mapping set to mapping relationship

* fix: (kernel) extraction on only active mappings within a set

* chore: (kernel) migration clean-up

* revert: migration

* chore: merge migrations

* fix: kernel model update

* fix: filter typo

* chore: update kernel ERD

* chore: migrate mappings to mapping set

* fix: clear mapping projectschemas then readd

* fix: format code

* fix: format code

* fix: pylint whitespaces

* feat: (ui) mappingsets to pipelines  (#361)

* feat: ui contract model

* fix: kernel migration

* fix: kernel migrations

* feat: ui mapping set

* feat: react component update

* test: (ui) pipeline fetch

* fix: remove comments

* fix: readonly message

* fix: model default_name

* fix: loop mapping set pages

* fix: test add pipeline

* fix: artefacts generation with mapping sets (#350)

* fix: tweaking models (100% coverage)

* fix: tweaking filters (100% coverage)

* fix(views): distinct count in stats

* fix: upsert project artefacts with mapping sets

* fix: API urls

* fix(common): submit to kernel with mapping set

* fix(odk): submission to kernel

* fix(odk): generated mappings are read only

* fix: naming naming naming

* fix: tweaking

* fix(couchdb): adapt to mapping set model

* fix: reactivate UI tests in travis (#371)

* fix: reactivate UI tests in travis

* test: sass-lint rules

* fix mappingset migration (#379)

* chore: add mappingset to client test_fixtures

* chore: swap mapping.id for mappingset.id in submission generation

* fix: use model from swagger for submissions in integration tests

* fix: remove useless test and change scope of entity generation

* feat (ui): pipeline fetch/publish using mapping set (#373)

* fix: tweaking models (100% coverage)

* fix: tweaking filters (100% coverage)

* fix(views): distinct count in stats

* fix: upsert project artefacts with mapping sets

* fix: API urls

* fix(common): submit to kernel with mapping set

* fix(odk): submission to kernel

* fix(odk): generated mappings are read only

* feat: ui contract model

* fix: kernel migration

* fix: kernel migrations

* feat: ui mapping set

* feat: react component update

* test: (ui) pipeline fetch

* fix: remove comments

* fix: readonly message

* fix: naming naming naming

* fix: model default_name

* fix: loop mapping set pages

* fix: test add pipeline

* fix: tweaking

* fix: pipeline:contracts

* feat: pipeline publish

* fix(ui): test

* fix: temp deactivate couch-sync tests

* test (ui): mapping set 100%

* fix: project name

* fix: ui test consistency

* chore: readonly class

* chore: selected pipeline readonly class

* added styling for readonly-pipeline in overview screen

* added styling to readonly-pipeline navbar

* added styling for read-only text inputs

* better presentation of mapping-definitions json textarea.

* fix (ui): pipeline - contract infix

* fix (ui): fix pipeline view

* fix(ui): test

* fix (ui): check pipelines length

* fix (ui): migration fix

* fix(ui): contract migration

* fix: artefacts names (#387)

* fix(ui): filter pipelines - redux

* feat(kernel): create an empty mapping along with the passthrough one (#389)

* feat(kernel): create empty mapping

* fix: run_entity_extraction

* fixed css grid and added word break for long titles without breaking space (#397)

* docs(ui): fix model comments (#401)

* feat(kernel): include a random input in the generated mapping (#402)

* feat: include input in generated mapping

* fix: do not duplicate constants

* fix(ui): the derived data must have an id field with UUID content (#400)

* fix: the schema must have an id field with UUID content

* fix: apply only to derived schemas

* fix: also derived entity type

* fix: cleaning

* fix: check id field in EntityTypes list

* test: implement id rule

* fix: including docs
lordmallam authored Oct 10, 2018
1 parent bbcb105 commit da1e5fb
Showing 61 changed files with 2,720 additions and 1,043 deletions.
4 changes: 2 additions & 2 deletions aether-common-module/README.md
@@ -84,10 +84,10 @@ Possible responses:
 - `Always Look on the Bright Side of Life!!!`
 - `Brought to you by eHealth Africa - good tech for hard places`
 
-#### To make submissions linked to an existing project artefact (mapping).
+#### To push submissions linked to an existing project artefact (mapping set).
 
 ```python
-aether.common.kernel.utils.submit_to_kernel(submission, mapping_id, submission_id=None)
+aether.common.kernel.utils.submit_to_kernel(submission, mappingset_id, submission_id=None)
 ```
 
 ### Conf section
12 changes: 6 additions & 6 deletions aether-common-module/aether/common/kernel/tests/test_utils.py
@@ -140,12 +140,12 @@ def test__test_connection_get_fail(self, mock_get, mock_head):
         )
 
     @mock.patch.dict('os.environ', AETHER_ENV_MOCK)
-    def test_submit_to_kernel__without_mapping_id(self):
+    def test_submit_to_kernel__without_mappingset_id(self):
         self.assertRaises(
             Exception,
             utils.submit_to_kernel,
-            submission={'a': 1},
-            mapping_id=None,
+            submission={},
+            mappingset_id=None,
         )
 
     @mock.patch.dict('os.environ', AETHER_ENV_MOCK)
@@ -154,22 +154,22 @@ def test_submit_to_kernel__without_submission(self):
             Exception,
             utils.submit_to_kernel,
             submission=None,
-            mapping_id=1,
+            mappingset_id=1,
         )
 
     @mock.patch('requests.put')
     @mock.patch('requests.post')
     @mock.patch.dict('os.environ', AETHER_ENV_MOCK)
     def test_submit_to_kernel__without_submission_id(self, mock_post, mock_put):
-        utils.submit_to_kernel(submission={'_id': 'a'}, mapping_id=1, submission_id=None)
+        utils.submit_to_kernel(submission={'_id': 'a'}, mappingset_id=1, submission_id=None)
         mock_put.assert_not_called()
         mock_post.assert_called()
 
     @mock.patch('requests.put')
     @mock.patch('requests.post')
     @mock.patch.dict('os.environ', AETHER_ENV_MOCK)
     def test_submit_to_kernel__with_submission_id(self, mock_post, mock_put):
-        utils.submit_to_kernel(submission={'_id': 'a'}, mapping_id=1, submission_id=1)
+        utils.submit_to_kernel(submission={'_id': 'a'}, mappingset_id=1, submission_id=1)
         mock_put.assert_called()
         mock_post.assert_not_called()
 
8 changes: 4 additions & 4 deletions aether-common-module/aether/common/kernel/utils.py
@@ -139,16 +139,16 @@ def get_data(url):
     return results
 
 
-def submit_to_kernel(submission, mapping_id, submission_id=None):
+def submit_to_kernel(submission, mappingset_id, submission_id=None):
     '''
     Push the submission to Aether Kernel
     '''
 
     if submission is None:
         raise errors.SubmissionError(_('Cannot make submission without content!'))
 
-    if mapping_id is None:
-        raise errors.SubmissionError(_('Cannot make submission without mapping!'))
+    if mappingset_id is None:
+        raise errors.SubmissionError(_('Cannot make submission without mapping set!'))
 
     if submission_id:
         # update existing doc
@@ -164,7 +164,7 @@ def submit_to_kernel(submission, mapping_id, submission_id=None):
         url,
         json={
             'payload': submission,
-            'mapping': mapping_id,
+            'mappingset': mappingset_id,
         },
         headers=get_auth_header(),
     )
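
The two hunks above only show fragments of `submit_to_kernel`, so here is a minimal, self-contained sketch of the behaviour they imply: validate the content first, then the mapping set, and choose PUT or POST depending on whether a `submission_id` is given. The `submit` function, the injected `http` object, the `base_url` argument and the URL paths are assumptions made for illustration; they are not taken from the Aether code base, and the real function also sends auth headers.

```python
from unittest import mock


class SubmissionError(Exception):
    """Stand-in for errors.SubmissionError (hypothetical in this sketch)."""


def submit(http, base_url, submission, mappingset_id, submission_id=None):
    # Same validation order as in the diff: content first, then mapping set.
    if submission is None:
        raise SubmissionError('Cannot make submission without content!')
    if mappingset_id is None:
        raise SubmissionError('Cannot make submission without mapping set!')

    if submission_id:
        # update existing doc (URL layout is an assumption)
        method = http.put
        url = f'{base_url}/submissions/{submission_id}/'
    else:
        # create new doc (URL layout is an assumption)
        method = http.post
        url = f'{base_url}/submissions/'

    # The request body now carries 'mappingset' instead of 'mapping'.
    return method(url, json={'payload': submission, 'mappingset': mappingset_id})


# Usage with a mock in place of the ``requests`` module.
http = mock.Mock()
submit(http, 'http://kernel.test', {'_id': 'a'}, mappingset_id=1)
http.post.assert_called_once()
http.put.assert_not_called()
```

Passing the HTTP client in as an argument is a choice made for the sketch only; it keeps the example runnable without `requests` installed, whereas the tests in the commit patch `requests.put` and `requests.post` directly.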
2 changes: 1 addition & 1 deletion aether-couchdb-sync-module/aether/sync/api/couchdb_sync.py
@@ -194,5 +194,5 @@ def post_to_aether(document, aether_id=False):
         )
 
     return kernel_utils.submit_to_kernel(submission=document,
-                                         mapping_id=str(schema.kernel_id),
+                                         mappingset_id=str(schema.kernel_id),
                                          submission_id=aether_id)
@@ -35,7 +35,7 @@
 from . import clean_couch
 
 
-SUBMISSION_FK = 'mapping'
+SUBMISSION_FK = 'mappingset'
 headers_testing = kernel_utils.get_auth_header()
 device_id = 'test_import-from-couch'
 
8 changes: 7 additions & 1 deletion aether-kernel/aether/kernel/admin.py
@@ -29,13 +29,18 @@ class ProjectAdmin(CompareVersionAdmin):
 
 class MappingAdmin(CompareVersionAdmin):
     form = forms.MappingForm
     list_display = ('id', 'name', 'revision',)
     readonly_fields = ('id',)
 
 
+class MappingSetAdmin(CompareVersionAdmin):
+    list_display = ('id', 'name', 'revision', 'project',)
+    readonly_fields = ('id',)
+
+
 class SubmissionAdmin(CompareVersionAdmin):
     form = forms.SubmissionForm
-    list_display = ('id', 'revision', 'mapping', 'map_revision',)
+    list_display = ('id', 'revision', 'mappingset',)
     readonly_fields = ('id',)
 
 
@@ -57,6 +62,7 @@ class EntityAdmin(CompareVersionAdmin):
 
 
 admin.site.register(models.Project, ProjectAdmin)
+admin.site.register(models.MappingSet, MappingSetAdmin)
 admin.site.register(models.Mapping, MappingAdmin)
 admin.site.register(models.Submission, SubmissionAdmin)
 admin.site.register(models.Schema, SchemaAdmin)
110 changes: 103 additions & 7 deletions aether-kernel/aether/kernel/api/avro_tools.py
@@ -22,9 +22,14 @@
 for details.
 '''
 
-import collections
-import copy
-import uuid
+import random
+
+from collections import namedtuple
+from copy import deepcopy
+from os import urandom
+from string import ascii_letters
+from uuid import uuid4
+
 
 # Constants used by AvroValidator to distinguish between avro types
 # ``int`` and ``long``.
@@ -55,6 +60,87 @@
 NAMESPACE = 'org.ehealthafrica.aether'
 
 
+def random_string():
+    return ''.join(random.choice(ascii_letters) for i in range(random.randint(1, 30)))
+
+
+def random_avro(schema):
+    '''
+    Generates a random value based on the given AVRO schema.
+    '''
+
+    name = schema.get('name')
+    avro_type = schema['type']
+    if isinstance(avro_type, list): # UNION or NULLABLE
+        # ["null", "int", "string", {"type": "record", ...}]
+        avro_type = [t for t in avro_type if t != NULL] # ignore NULL
+        if len(avro_type) == 1: # it was NULLABLE
+            avro_type = avro_type[0]
+
+    if __has_type(avro_type): # {"type": {"type": "zzz", ...}}
+        schema = avro_type
+        avro_type = avro_type.get('type')
+
+    if avro_type == NULL:
+        return None
+
+    if avro_type == BOOLEAN:
+        return True if random.random() > 0.5 else False
+
+    if avro_type in [BYTES, FIXED]:
+        return urandom(schema.get('size', 8))
+
+    if avro_type == INT:
+        return random.randint(INT_MIN_VALUE, INT_MAX_VALUE)
+
+    if avro_type == LONG:
+        return random.randint(LONG_MIN_VALUE, LONG_MAX_VALUE)
+
+    if avro_type in [FLOAT, DOUBLE]:
+        return random.random() + random.randint(INT_MIN_VALUE, INT_MAX_VALUE)
+
+    if avro_type == STRING:
+        if name == 'id':
+            return str(uuid4()) # "id" fields contain an UUID
+        return random_string()
+
+    if avro_type == ENUM:
+        return random.choice(schema['symbols'])
+
+    if avro_type == RECORD:
+        return {
+            f['name']: random_avro(f)
+            for f in schema.get('fields', [])
+        }
+
+    if avro_type == MAP:
+        values = schema.get('values')
+        map_type = values if __has_type(values) else {'type': values}
+        return {
+            random_string(): random_avro(map_type)
+            for i in range(random.randint(1, 5))
+        }
+
+    if avro_type == ARRAY:
+        items = schema.get('items')
+        array_type = items if __has_type(items) else {'type': items}
+        return [
+            random_avro(array_type)
+            for i in range(random.randint(1, 5))
+        ]
+
+    if isinstance(avro_type, list): # UNION
+        # choose one random type and generate value
+        # ["int", "string", {"type": "record", ...}]
+        ut = avro_type[random.randint(0, len(avro_type) - 1)]
+        ut = ut if __has_type(ut) else {'type': ut}
+        return random_avro(ut)
+
+    # TODO: named types ¯\_(ツ)_/¯
+
+    return None
+
+
 class AvroValidationException(Exception):
     pass
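
The first few lines of `random_avro` decide which Avro type the rest of the function dispatches on. As a reading aid, here is a minimal sketch of just that step, assuming `NULL` is the string `'null'` (the constant's value is not shown in this hunk). The names `has_type` and `resolve_type` are hypothetical; the commit keeps this logic inline and uses the private helper `__has_type`.

```python
NULL = 'null'  # assumed value of the NULL constant in avro_tools.py


def has_type(avro_type):
    # Same idea as __has_type in the diff: a dict that carries its own "type".
    return isinstance(avro_type, dict) and bool(avro_type.get('type'))


def resolve_type(schema):
    '''
    Returns the (schema, avro_type) pair that the generator would dispatch on.
    '''
    avro_type = schema['type']
    if isinstance(avro_type, list):  # UNION or NULLABLE
        avro_type = [t for t in avro_type if t != NULL]  # ignore "null"
        if len(avro_type) == 1:  # ["null", X] collapses to X
            avro_type = avro_type[0]

    if has_type(avro_type):  # nested definition: {"type": {"type": ...}}
        schema = avro_type
        avro_type = avro_type['type']

    return schema, avro_type


# A nullable int collapses to a plain int.
print(resolve_type({'name': 'age', 'type': ['null', 'int']}))
# A nested array definition replaces the outer schema.
print(resolve_type({'name': 'tags', 'type': {'type': 'array', 'items': 'string'}}))
# A wider union stays a list; the generator then picks one branch at random.
print(resolve_type({'name': 'value', 'type': ['null', 'int', 'string']}))
```

Note that when the nested definition replaces the outer schema, the field name is no longer available from the returned schema, which is why `random_avro` reads `name` before this step.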

@@ -70,7 +156,7 @@ class AvroValidationException(Exception):
 #
 # indicates that the expected type at path "$.a.b" was a union of
 # 'null' and 'string'. The actual value was 1.
-AvroValidationError = collections.namedtuple(
+AvroValidationError = namedtuple(
     'AvroValidationError',
     ['expected', 'datum', 'path'],
 )
@@ -333,9 +419,11 @@ def avro_schema_to_passthrough_artefacts(item_id, avro_schema):
     '''
 
     if not item_id:
-        item_id = str(uuid.uuid4())
+        item_id = str(uuid4())
+
+    definition = deepcopy(avro_schema)
+    sample = random_avro(definition)
 
-    definition = copy.deepcopy(avro_schema)
     # assign default namespace
     if not definition.get('namespace'):
         definition['namespace'] = NAMESPACE
@@ -377,7 +465,15 @@ def avro_schema_to_passthrough_artefacts(item_id, avro_schema):
         'definition': {
             'entities': {name: item_id},
             'mapping': rules,
-        }
+        },
+        # this is an auto-generated mapping that shouldn't be modified manually
+        'is_read_only': True,
+        'is_active': True,
+        'input': sample, # include a data sample
     }
 
     return schema, mapping
+
+
+def __has_type(avro_type):
+    return isinstance(avro_type, dict) and avro_type.get('type')
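
To make the new keys on the generated mapping easier to see in isolation, here is a small sketch that builds only the part of the mapping shown in the last hunk. `build_passthrough_mapping` and `is_editable` are hypothetical helpers, and the rule in the usage lines is an illustrative placeholder; how `rules` is actually computed is outside the lines shown in this diff.

```python
from copy import deepcopy


def build_passthrough_mapping(item_id, name, rules, sample):
    # Only the keys visible in the hunk are reproduced here.
    return {
        'definition': {
            'entities': {name: item_id},
            'mapping': rules,
        },
        # auto-generated mappings are flagged so that clients can lock them
        'is_read_only': True,
        'is_active': True,
        'input': deepcopy(sample),  # data sample kept alongside the mapping
    }


def is_editable(mapping):
    # Hypothetical client-side guard based on the new flag.
    return not mapping.get('is_read_only', False)


mapping = build_passthrough_mapping(
    item_id='abc',
    name='Person',
    rules=[['$.id', 'Person.id']],  # placeholder rule, for illustration only
    sample={'id': 'x'},
)
print(is_editable(mapping))
```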
54 changes: 54 additions & 0 deletions aether-kernel/aether/kernel/api/filters.py
@@ -39,26 +39,71 @@ class Meta:
 
 
 class MappingFilter(filters.FilterSet):
+    mappingset = filters.CharFilter(
+        method='mappingset_filter',
+    )
     projectschema = filters.CharFilter(
         method='projectschema_filter',
     )
 
+    def mappingset_filter(self, queryset, name, value):
+        if is_uuid(value):
+            return queryset.filter(mappingset__pk=value)
+        else:
+            return queryset.filter(mappingset__name=value)
+
     def projectschema_filter(self, queryset, name, value):
         if is_uuid(value):
             return queryset.filter(projectschemas__in=[value])
         else:
             return queryset.filter(projectschemas__name__in=[value])
 
     class Meta:
         fields = '__all__'
+        exclude = ('definition',)
         model = models.Mapping
 
 
+class MappingSetFilter(filters.FilterSet):
+    project = filters.CharFilter(
+        method='project_filter',
+    )
+
+    def project_filter(self, queryset, name, value):
+        if is_uuid(value):
+            return queryset.filter(project__pk=value)
+        else:
+            return queryset.filter(project__name=value)
+
+    class Meta:
+        fields = '__all__'
+        exclude = ('input',)
+        model = models.MappingSet
+
+
 class SubmissionFilter(filters.FilterSet):
     instanceID = filters.CharFilter(
         field_name='payload__meta__instanceID',
     )
+    project = filters.CharFilter(
+        method='project_filter',
+    )
+    mappingset = filters.CharFilter(
+        method='mappingset_filter',
+    )
 
+    def project_filter(self, queryset, name, value):
+        if is_uuid(value):
+            return queryset.filter(project__pk=value)
+        else:
+            return queryset.filter(project__name=value)
+
+    def mappingset_filter(self, queryset, name, value):
+        if is_uuid(value):
+            return queryset.filter(mappingset__pk=value)
+        else:
+            return queryset.filter(mappingset__name=value)
+
     class Meta:
         fields = '__all__'
         exclude = ('payload',)
@@ -100,6 +145,9 @@ class EntityFilter(filters.FilterSet):
     project = filters.CharFilter(
         method='project_filter',
     )
+    mapping = filters.CharFilter(
+        method='mapping_filter',
+    )
 
     def project_filter(self, queryset, name, value):
         if is_uuid(value):
@@ -113,6 +161,12 @@ def schema_filter(self, queryset, name, value):
         else:
             return queryset.filter(projectschema__schema__name=value)
 
+    def mapping_filter(self, queryset, name, value):
+        if is_uuid(value):
+            return queryset.filter(mapping__pk=value)
+        else:
+            return queryset.filter(mapping__name=value)
+
     class Meta:
         fields = '__all__'
         exclude = ('payload',)
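
Every filter method added in this file follows the same pattern: if the query value parses as a UUID, filter on the related primary key, otherwise filter on the related name. A minimal sketch of that dispatch without Django follows. The body of `is_uuid` is an assumption (the helper is imported in `filters.py` but its implementation is not part of this diff), and `lookup_for` is a hypothetical name.

```python
import uuid


def is_uuid(value):
    # Assumed behaviour of the is_uuid helper used in filters.py.
    try:
        uuid.UUID(str(value))
        return True
    except ValueError:
        return False


def lookup_for(relation, value):
    # Build the keyword argument a filter method would pass to queryset.filter().
    key = f'{relation}__pk' if is_uuid(value) else f'{relation}__name'
    return {key: value}


print(lookup_for('mappingset', '123e4567-e89b-12d3-a456-426614174000'))
print(lookup_for('project', 'demo'))
```

Inside a filter method this would be used as `queryset.filter(**lookup_for('mappingset', value))`. The commit spells each method out explicitly instead, which is more verbose but keeps every lookup visible at the place it is used.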