
Keep a live in-memory "view" of all matching resources in the cluster #661

Merged: nolar merged 11 commits into master from live-views on Feb 24, 2021

Conversation

nolar (Owner) commented on Jan 31, 2021

Indexers automatically maintain in-memory overviews of resources (indices), grouped by keys that are usually calculated based on these resources.

See the detailed description and several recipes in the .rst file in the PR's changes. The essence:

import kopf

# Index all pods that lack the "excluded" label, keyed by (namespace, name),
# with the pod's phase as the indexed value.
@kopf.index('pods', labels={'excluded': kopf.ABSENT})
def pod_phases(namespace, name, status, **_):
    return {(namespace, name): status.get('phase')}

# Every 5 seconds, put the currently running pods into each "kex" object's
# status by reading the in-memory index (no Kubernetes API calls involved).
@kopf.timer('kex', interval=5)
def tick(patch, pod_phases: kopf.Index, **_):
    patch.status['children'] = [
        f"{pod_namespace}/{pod_name}"
        for (pod_namespace, pod_name), pod_phase in pod_phases.items()
        if pod_phase == 'Running'
    ]
    print(patch.status['children'])

The indices can be used for cross-resource awareness: e.g., when a resource of kind X is changed, it can get all the information about all resources of kind Y without talking to the Kubernetes API.
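
For illustration, a minimal sketch of such cross-resource awareness, reusing the pod_phases index from the example above (the handler name and the reaction logic are placeholders, not part of the PR):

import kopf

# A change to a "kex" object (kind X) inspects all pods (kind Y) via the
# in-memory index, without a single call to the Kubernetes API.
@kopf.on.update('kex')
def check_children(namespace, pod_phases: kopf.Index, **_):
    pending = [
        pod_name
        for (pod_namespace, pod_name), pod_phase in pod_phases.items()
        if pod_namespace == namespace and pod_phase != 'Running'
    ]
    if pending:
        raise kopf.TemporaryError(f"Pods not running yet: {pending}", delay=10)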

The indices are optimized for lookups by keys (O(1), the same as Python's dicts), though iterating over them is also an option. Updates & deletions are O(k), where "k" is the number of keys per object (not the overall number of keys!).
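
For example, inside any handler that requests the pod_phases index from above (a sketch, assuming the index exposes the usual read-only mapping interface; the pod name is hypothetical):

# O(1) lookup by key, the same as with a plain dict:
phase = pod_phases.get(('default', 'my-pod'))

# Iterating over the whole index is also possible, but is O(n) over all keys:
running = [key for key, phase in pod_phases.items() if phase == 'Running']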

A side-note: The in-memory indices are a step further towards cross-resource handlers (an example: when a child pod is changed, the cross-resource handler should be invoked in the context of its parent, not of the child itself, with all other pods' information at hand). In that case, only one resource is known from the arriving event, while the other resource must either be remembered in-memory or fake-patched with the full info of each child to trigger the handling. The latter approach looks stressful for K8s & the K8s API, so the in-memory way seems the only viable solution.


Things left to do (can take time):

  • Decide: do we need a key= callback, or must the key structure be frozen (mapping to K8s API uniqueness: resource, namespace, name)? What if 2+ handlers for the same function have different key structures?
    • No. Custom keys complicate things and give no benefits. A minimalistic (namespace, name) key is enough. Resources are not needed as a part of the key, as they can be replaced by a cache name/id.
  • Decide: are views retriable? What do the TemporaryErrors/PermanentErrors mean? Backoffs? Timeouts?
    • No, not retriable. However, the interpretation of errors has to be decided: do they remove the cache entry on permanent error? Mark it as out-of-date on a temporary error?
  • Decide: can we implement the same with the existing handlers/causes/registries (watching-changing-spawning), or do we need a new type?
    • A new type and a new registry are needed. The cache handlers are executed separately from all other handlers, to ensure that the caches are pre-populated before the regular handlers run. Besides, they might have their own properties: e.g. ttl.
  • Decide: naming: "cacheing" (rare), "caching" (common), "indexing", or something else. Formally, it is not caching in the sense of a faster memoisation technique.
    • Indexing. Also: indices (the user-facing structures), indexers (internal managers of indices).
  • Implement: proper Cache/Caches classes (currently a defaultdict). Separate the read-only (operator) and read-write (framework) accessors.
  • Implement: filtering in and filtering out (i.e., not just deletion, but changing labels/annotations/fields/values to become included/excluded).
  • EXCLUDED: Implement: a condition (sync/async) to wait for changes in the daemon — for immediate reaction in another resource, not just time-intervalled. (maybe as a separate enhancement).
  • EXCLUDED: TTL for cache entries — to save memory on long runs. (questionable! no resync is possible!).
  • EXCLUDED: Up-to-date/out-of-sync/stale read-only flags. (see errors interpretation above).
  • Clean up the code.
  • Write the tests.
  • Write the docs.
  • Manually test that all docs examples do actually work.

nolar added the "enhancement" (New feature or request) label on Jan 31, 2021
nolar mentioned this pull request on Jan 31, 2021
nolar force-pushed the live-views branch 2 times, most recently from 8642449 to 8fbadd3 on February 6, 2021

nolar marked this pull request as ready for review on February 19, 2021

nolar force-pushed the live-views branch 3 times, most recently from 68035e1 to c9df0f0 on February 21, 2021
Signed-off-by: Sergey Vasilyev <[email protected]>
If the resource is non-indexable, there is no need to create the individual toggles, either for the resource kind or for each individual resource object: such resources do not block the operator's readiness (but can be blocked themselves).

In practice, even if created, those toggles are turned on and/or dropped as soon as the processing routine is entered and finds that there are no indexing handlers (nothing to do). This wastes computing power and memory on startup, which can affect operators in massive clusters, for no benefit.

Instead, only create the resource kind's & resource object's toggles when and if there are indexing handlers to be executed.

As a particular case, when an operator has no indices at all (assumed to be the default and the most common case), create no toggles, or at most one (for the global operator block, until all specialised per-resource-kind blockers are checked & skipped). Without this fix, there would be as many useless toggles as there are resource kinds (usually below ten) plus resource objects (which can number in the hundreds or thousands).

Signed-off-by: Sergey Vasilyev <[email protected]>
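
(An editorial sketch of the lazy-toggle idea described in this commit message; the class and function names below are hypothetical illustrations, not Kopf's actual internals:)

import asyncio

class ReadinessToggle:
    """A hypothetical on/off flag that the operator's readiness can wait on."""
    def __init__(self) -> None:
        self._event = asyncio.Event()
    def turn_on(self) -> None:
        self._event.set()
    async def wait(self) -> None:
        await self._event.wait()

def prepare_toggles(resource_kinds: dict[str, list]) -> dict[str, ReadinessToggle]:
    # Create a readiness-blocking toggle only for the kinds that actually
    # have indexing handlers; non-indexable kinds never block the readiness.
    return {kind: ReadinessToggle()
            for kind, indexing_handlers in resource_kinds.items()
            if indexing_handlers}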
nolar (Owner, Author) commented on Feb 24, 2021

So far, I gave it a couple of days to settle down and to check if there are any other afterthoughts. There are none. It can be merged.

The only concern left is that it could be beneficial in the future if other built-in types were specially interpreted, similar to the 1st-level dict now: e.g., accept a multi-level dict for multi-level indices, or merge a list/set into the indexed store instead of storing them as values. The current interpretation of only the 1st-level dict feels asymmetrical.

In that case, it would require backward-incompatible changes (now, such values are treated as values).
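
For illustration, the current behaviour as described above (a sketch; the handler name is a placeholder):

import kopf

@kopf.index('pods')
def pods_by_namespace(namespace, name, **_):
    # Only the 1st-level dict is interpreted: "namespace" becomes the index key.
    # The nested dict and the list are treated as opaque values, not as deeper
    # index levels or as items to be merged into the store.
    return {namespace: {'name': name, 'tags': ['a', 'b']}}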

One solution would be to wrap these types now in something like kopf.literal() to disable their interpretation, which is an overcomplication. Another solution would be to never interpret the built-ins, but to always expect an explicit return kopf.Index({key: val}) (or return kopf.Index({key: kopf.Store([...])}) for merging the items). This also looks overcomplicated from the API/DSL point of view, and it is wordy.

I believe this forward compatibility cannot be ensured now. It can be extended later with additional options to @kopf.index() or with additional wrappers as shown above, to hint how the structural results should be interpreted (indexed/stored/merged), if at all needed.

nolar merged commit 8a08e0a into master on Feb 24, 2021
nolar deleted the live-views branch on February 24, 2021