'get_aggregation_unit' action and 'aggregation_unit' params for all spatial aggregates #5278

jc-harrison · 2022-07-07T12:37:34Z

I have:

Formatted any Python files with black
Brought the branch up to date with master
Added any relevant Github labels
Added tests for any new additions
Added or updated any relevant documentation
Added an Architectural Decision Record (ADR), if appropriate
Added an MPLv2 License Header if appropriate
Updated the Changelog

Description

Adds a new flowmachine server action, get_aggregation_unit, which returns the aggregation unit associated with a spatial aggregate query (or None if the query is not a spatial aggregate). To support this, all exposed query class instances corresponding to a spatial aggregate query now have an aggregation_unit attribute or property.

Also adds a dump-only aggregation_unit parameter to all spatial aggregates that do not have an explicit aggregation_unit parameter - a dump-only field is not accepted as an input parameter, but it is included in the API spec (marked as "read-only") so that FlowAPI can use this parameter to determine whether a query kind is a spatial aggregate when generating permission scopes. The dump-only aggregation_unit fields also mean that the aggregation_unit property of an exposed query will be included if the query is serialised - while that's a nice feature, it's not something we currently rely on for anything.

Adds a docs section on exposing a new query kind, since there are an increasing number of steps that need to be included in an exposed query.

Also a few changes to query schemas that are not directly related to the original aims of this PR:

Defines OneOfQuerySchema (subclass of OneOfSchema) for nesting a choice of multiple sub-query schemas. Consequences are that all exposed query classes now need a query_kind class attribute, and we no longer need a workaround in FlowAPI to mark query_kind parameters as required.
Fixes some custom validators so that errors in nested fields do not cause validation errors to be masked by key/attribute errors (Modal location aggregation-unit consistency validation masks other validation errors #4816)
Ensures that all required parameters are marked as such in the query schemas
Defines a Direction field, to replace the duplicated code used for direction fields in several query schemas

…tent with other query kinds

…n-units

codecov · 2022-07-07T12:38:00Z

Codecov Report

Merging #5278 (64a9ff7) into master (7a4594a) will decrease coverage by 17.27%.
The diff coverage is 86.02%.

@@             Coverage Diff             @@
##           master    #5278       +/-   ##
===========================================
- Coverage   93.43%   76.15%   -17.28%     
===========================================
  Files         276      277        +1     
  Lines       10724    10847      +123     
  Branches     1274     1284       +10     
===========================================
- Hits        10020     8261     -1759     
- Misses        576     2333     +1757     
- Partials      128      253      +125

Impacted Files	Coverage Δ
flowapi/flowapi/api_spec.py	`35.13% <ø> (-57.89%)`	⬇️
flowapi/flowapi/permissions.py	`83.92% <ø> (-9.83%)`	⬇️
flowclient/flowclient/query_specs.py	`100.00% <ø> (ø)`
...hine/core/server/query_schemas/aggregation_unit.py	`68.42% <40.00%> (-31.58%)`	⬇️
...machine/flowmachine/core/server/action_handlers.py	`57.05% <56.52%> (-29.48%)`	⬇️
...machine/core/server/query_schemas/custom_fields.py	`80.30% <66.66%> (-1.37%)`	⬇️
...ine/core/server/query_schemas/visited_most_days.py	`66.66% <66.66%> (+6.66%)`	⬆️
...re/server/query_schemas/mobility_classification.py	`91.30% <75.00%> (-8.70%)`	⬇️
...achine/core/server/query_schemas/labelled_flows.py	`92.30% <76.92%> (-7.70%)`	⬇️
...core/server/query_schemas/unique_visitor_counts.py	`80.00% <78.57%> (-5.00%)`	⬇️
... and 160 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

greenape · 2022-07-07T12:47:09Z

docs/source/developer/index.md

+
+In a new file 'flowmachine/core/server/query_schemas/my_query.py', define a new class `MyQueryExposed`. This class is responsible for constructing the appropriate `MyQuery` object from parameter values supplied in an API call.
+
+There are two options here: if users should be able to select a random sample of the rows from this query result, the exposed query class should inherit from [`BaseExposedQueryWithSampling`](../flowmachine/flowmachine/core/server/query_schemas/base_query_with_sampling/#class-baseexposedquerywithsampling) (this is usually appropriate for _individual-level_ queries). If it does not make sense to allow random sampling of the query result (as is usually the case for _aggregate_ queries), the exposed query class should inherit from [`BaseExposedQuery`](../flowmachine/flowmachine/core/server/query_schemas/base_exposed_query/#class-baseexposedquery).


We might consider changing the names of these really.

Yes, could do. What would you suggest?

I was wondering about BaseIndividualQuery and BaseAggregateQuery, but I'm not sure it makes sense given that it may not be universally the case that these are the appropriate bases.

Hmm, yes, I don't know whether it will always be the case that we'd want to allow sampling for all non-aggregates and never for aggregates.

I wonder whether we'd be better-off rolling the sampling stuff into BaseExposedQuery (i.e. apply sampling if there's a sampling parameter, otherwise don't), and turning BaseQueryWithSamplingSchema into a mixin - then we don't have to worry about having two different bases.

greenape · 2022-07-07T12:55:43Z

flowmachine/flowmachine/core/server/action_handlers.py

-    For this action handler the `action_params` are exactly the query kind plus the
-    parameters needed to construct the query.
-    """
+def _load_query_object(params: dict) -> "BaseExposedQuery":


Might make sense to make this a class method on BaseExposedQuery?

Not straightforward without leading to circular imports. And on reflection, I'm not sure it makes all that much sense anyway - the core of this function is just FlowmachineQuerySchema().load(params); the rest is just catching appropriate errors and manipulating them into a form that's convenient for constructing an appropriate ZMQReply, so I think it makes sense to keep this alongside the action handlers.

greenape · 2022-07-07T12:57:49Z

flowmachine/flowmachine/core/server/query_schemas/active_at_reference_location_counts.py

-    query_kind = fields.String(validate=OneOf(["active_at_reference_location_counts"]))
-    unique_locations = fields.Nested(UniqueLocationsSchema())
-    reference_locations = fields.Nested(ReferenceLocationSchema())
+    query_kind = fields.String(validate=OneOf([__model__.query_kind]), required=True)


Can this be moved into BaseSchema?

Unfortunately not, I think - the validator needs to get the query kind from __model__.query_kind, and this has to be set when defining the class. It would be neater if we could move this into BaseSchema, but I can't see a way to do it.

Could use __init_subclass__ to attach it to the class?

Oh true, yes, that should work.

Hmm, no, doesn't seem to work. I tried adding

def __init_subclass__(cls, **kwargs): super().__init_subclass__(**kwargs) if hasattr(cls, "__model__"): cls.query_kind = fields.String(validate=OneOf([cls.__model__.query_kind]), required=True)

and while subclasses get a query_kind class attribute as expected, it doesn't get included as a field to be deserialised. Looks like marshmallow is doing some magic behind the scenes that doesn't play nicely with this approach.

flowmachine/flowmachine/core/server/query_schemas/aggregation_unit.py

greenape · 2022-07-07T14:24:30Z

docs/source/developer/index.md

+
+In a new file 'flowmachine/core/server/query_schemas/my_query.py', define a new class `MyQueryExposed`. This class is responsible for constructing the appropriate `MyQuery` object from parameter values supplied in an API call.
+
+There are two options here: if users should be able to select a random sample of the rows from this query result, the exposed query class should inherit from [`BaseExposedQueryWithSampling`](../flowmachine/flowmachine/core/server/query_schemas/base_query_with_sampling/#class-baseexposedquerywithsampling) (this is usually appropriate for _individual-level_ queries). If it does not make sense to allow random sampling of the query result (as is usually the case for _aggregate_ queries), the exposed query class should inherit from [`BaseExposedQuery`](../flowmachine/flowmachine/core/server/query_schemas/base_exposed_query/#class-baseexposedquery).


I was wondering about BaseIndividualQuery and BaseAggregateQuery, but I'm not sure it makes sense given that it may not be universally the case that these are the appropriate bases.

…unit.py Co-authored-by: Jonathan Gray <[email protected]>

cypress · 2022-07-07T15:06:49Z

Test summary

43 • 0 • 0 • 0 • 0

Run details


Project	FlowAuth
Status	Passed
Commit	`64a9ff7`
Started	Aug 12, 2022 12:27 PM
Ended	Aug 12, 2022 12:40 PM
Duration	13:25 💡
OS	Linux Debian - 10.5
Browser	Electron 100

View run in Cypress Dashboard ➡️

This comment has been generated by cypress-bot as a result of this project's GitHub integration settings. You can manage this integration in this project's settings in the Cypress Dashboard

…n-unit

jc-harrison added 30 commits June 15, 2022 17:48

OneOfQuerySchema

8537429

Remove hack for making query_kind read-only

1a980cf

Try adding read-only aggregation unit to SpatialAggregateSchema

9f40488

Don't create new scopes from read-only aggregation unit

8cccc16

Fix get_available_queries action

67d9670

Re-approve API schema

90bac9b

Helper function to load query

16ea188

Add "get_aggregation_unit" action handler

db52326

Update "unknown action" test

43bacdb

Add some tests for get_aggregation_unit action

91c6ce2

Fix #4816

8820019

Add regression test for #4816

c18d822

Add tests for all spatial aggregate query kinds

4d4c9eb

Add aggregation_unit property to active_at_reference_location_counts

cd03113

add aggregation_unit property to aggregate_network_objects

74a4ed7

Add aggregation_unit property to flows

c662c92

Make aggregation_unit attribute of DFSTotalMetricAmountExposed consis…

11ed189

…tent with other query kinds

Don't silently treat badly-formed aggregation units as not-aggregatio…

4e3f383

…n-units

Fix default direction param

c59924f

Add aggregation_unit property to joined_spatial_aggregate

82cb141

Add aggregation_unit property to labelled_flows

1543fba

Add aggregation_unit property to labelled_spatial_aggregate

84ee9fd

Add aggregation_unit property to unique_visitor_counts

f348afb

Add aggregation_unit property to unmoving_counts

790ffb4

Add aggregation_unit property to unmoving_at_reference_location_counts

3960186

Ensure all required fields are marked as such

4763ccb

Skip schema validation on field errors

f8edee4

Fix Direction field

b3c7354

Use real db connection for action handler test

4092a35

Fix LocationIntroversion

112db6d

jc-harrison added 3 commits July 6, 2022 13:49

Integrate nocturnal_events spec change into flowclient

4a8edee

Re-approve API spec

0220f09

Add docs section on exposing a new query kind

f1abe58

jc-harrison marked this pull request as draft July 7, 2022 12:39

greenape reviewed Jul 7, 2022

View reviewed changes

jc-harrison and others added 3 commits July 7, 2022 15:35

Fix docs

47d161e

Update flowmachine/flowmachine/core/server/query_schemas/aggregation_…

ff419e1

…unit.py Co-authored-by: Jonathan Gray <[email protected]>

Merge branch 'master' into explicit-aggregation-unit

f2dba5d

jc-harrison marked this pull request as ready for review July 7, 2022 14:50

jc-harrison added 3 commits July 8, 2022 15:49

More tests

a2a6f8e

Approve query IDs for modified query specs

b2bc665

Docstrings

bc60f17

jc-harrison added enhancement New feature or request FlowMachine Issues related to FlowMachine bug Something isn't working labels Jul 8, 2022

jc-harrison added 2 commits July 8, 2022 17:03

Merge remote-tracking branch 'origin/master' into explicit-aggregatio…

96794da

…n-unit

Update CHANGELOG.md

b06785e

jc-harrison requested a review from greenape July 8, 2022 16:12

jc-harrison added 2 commits July 11, 2022 09:35

Merge branch 'master' into explicit-aggregation-unit

0560a96

Merge branch 'master' into explicit-aggregation-unit

d8faf2c

greenape approved these changes Aug 12, 2022

View reviewed changes

jc-harrison added 2 commits August 12, 2022 10:58

Quiet pip install to avert threading issue

d3af691

Merge branch 'master' into explicit-aggregation-unit

64a9ff7

jc-harrison added the ready-to-merge Label indicating a PR is OK to automerge label Aug 12, 2022

mergify bot merged commit 65a2a05 into master Aug 12, 2022

mergify bot deleted the explicit-aggregation-unit branch August 12, 2022 12:41

Thingus mentioned this pull request Aug 16, 2022

Permissions rework #5163

Merged

8 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

'get_aggregation_unit' action and 'aggregation_unit' params for all spatial aggregates #5278

'get_aggregation_unit' action and 'aggregation_unit' params for all spatial aggregates #5278

jc-harrison commented Jul 7, 2022 •

edited

Loading

codecov bot commented Jul 7, 2022 •

edited

Loading

greenape Jul 7, 2022

jc-harrison Jul 7, 2022

greenape Jul 7, 2022

jc-harrison Jul 7, 2022

greenape Jul 7, 2022

jc-harrison Jul 8, 2022

greenape Jul 7, 2022

jc-harrison Jul 7, 2022

greenape Jul 7, 2022

jc-harrison Jul 7, 2022

jc-harrison Jul 7, 2022 •

edited

Loading

greenape Jul 7, 2022

cypress bot commented Jul 7, 2022 •

edited

Loading


		In a new file 'flowmachine/core/server/query_schemas/my_query.py', define a new class `MyQueryExposed`. This class is responsible for constructing the appropriate `MyQuery` object from parameter values supplied in an API call.

		There are two options here: if users should be able to select a random sample of the rows from this query result, the exposed query class should inherit from [`BaseExposedQueryWithSampling`](../flowmachine/flowmachine/core/server/query_schemas/base_query_with_sampling/#class-baseexposedquerywithsampling) (this is usually appropriate for _individual-level_ queries). If it does not make sense to allow random sampling of the query result (as is usually the case for _aggregate_ queries), the exposed query class should inherit from [`BaseExposedQuery`](../flowmachine/flowmachine/core/server/query_schemas/base_exposed_query/#class-baseexposedquery).

'get_aggregation_unit' action and 'aggregation_unit' params for all spatial aggregates #5278

'get_aggregation_unit' action and 'aggregation_unit' params for all spatial aggregates #5278

Conversation

jc-harrison commented Jul 7, 2022 • edited Loading

I have:

Description

codecov bot commented Jul 7, 2022 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jc-harrison Jul 7, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cypress bot commented Jul 7, 2022 • edited Loading

Test summary

Run details

jc-harrison commented Jul 7, 2022 •

edited

Loading

codecov bot commented Jul 7, 2022 •

edited

Loading

jc-harrison Jul 7, 2022 •

edited

Loading

cypress bot commented Jul 7, 2022 •

edited

Loading