Skip to content

[Usage counters] Relocate counters to a dedicated index. Unify / simplify 'ui' and 'server' counters logic.#187064

Merged
gsoldevila merged 14 commits intoelastic:mainfrom
gsoldevila:kbn-dashboards-plus-plus
Jul 5, 2024
Merged

[Usage counters] Relocate counters to a dedicated index. Unify / simplify 'ui' and 'server' counters logic.#187064
gsoldevila merged 14 commits intoelastic:mainfrom
gsoldevila:kbn-dashboards-plus-plus

Conversation

@gsoldevila
Copy link
Contributor

Summary

Part of #186530.

This PR sets the basis for allowing namespaced usage counters.
It relocates usage-counters SO type from .kibana to a dedicated .kibana_usage_counters.

Furthermore, the original SO type is removed, and replaced by 2 separate types:

  • server-counters
  • ui-counters

Note that these 2 steps are necessary if we want to leverage namespaces property of the saved objects.
We can't currently update the namespaceType: 'agnostic' without causing a migration.
Thus, these two types will be defined as namespaceType: single.

Up until now, UI counters were stored under a special domainId: uiCounter.
This forced a workaround that consisted in storing appName:eventName in the counterName property of the SO.
Having a dedicated SO type for them allows to store appName as domainId, avoiding the need for a workaround.

@gsoldevila gsoldevila added Team:Core Platform Core services: plugins, logging, config, saved objects, http, ES client, i18n, etc t// Feature:Telemetry release_note:skip Skip the PR/issue when compiling release notes backport:skip This PR does not require backporting labels Jun 27, 2024
@elastic elastic deleted a comment from kibana-ci Jun 27, 2024
Comment on lines 168 to 161
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UsageCountersService should ensure that a certain domainId is not registered more than once (line 163)

Copy link
Member

@afharo afharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks like a great and much-needed refactor of our counters system. Moving them to a custom index will give us the opportunity to scale. Also, moving UI Counters to actually persist the properties in a 1:1 mapping instead of : splits is another great win.

However, I have some reservations with the shared API dealing with multiple SOs: Moving UI Counters to its own SO type is a great move to decouple UI and Server Counters' codebases. Except that it doesn't really decouple them: both SO types are registered from a common place.

We are storing them separately, but a consumer of the API can't create a counter with the same domainId in the server and the browser (which it's a pain since domainId tends to be the plugin ID in most usages, like dashboards).

I wonder if UI counters should prepend ui: to the domainId so that these clashes won't occur. This could also keep the SO type hidden from the public API.

Alternatively, we could specify source: 'ui' | 'server' to hide the persistence layer (SO type). And, if we need to specify the source during the usage counter getOrCreate, I wonder if it could just be one property in a common SO type, that we can filter by. Crazy use case to back it up:
Count views of dashboards coming from the UI (source: 'ui'), and the API (source: 'server').

WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧡

Comment on lines 95 to 103
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: why would it fail in a way that retries could help? AFAIK, there's no I/O here.

IMO, it's enough with

const counter = counters.getUsageCounterByDomainId(domainId) ?? 
    counters.createUsageCounter(domainId, UI_COUNTERS_SAVED_OBJECT_TYPE);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had Java and context switching in my head, you're right.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want the fallback here? If we fallback here, we're essentially making no distinction for the default space, aren't we?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end I'm not creating any agnostic SO types, so they all need to be stored in a specific namespace.

The idea is that whenever users of the API don't specify a namespace, the default one will be used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 mind to revert this when you're done testing :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops good catch!

@gsoldevila
Copy link
Contributor Author

gsoldevila commented Jun 28, 2024

Answering comment

After discussion, we've decided to use a single so type 'counters' and introduce a source: 'server' | 'ui' property in the events.

  • This way we can better filter for it through the SO client.
  • This also prevents name clashes in plugins that register both server-side and ui-side counters. A counter will now be "source-agnostic" per se, and the events themselves will specify the source.
  • Done in 267b263

@gsoldevila gsoldevila force-pushed the kbn-dashboards-plus-plus branch from f835bfa to fff9e1c Compare July 2, 2024 09:20
// Check for migration steps present in the logs
logs = await fs.readFile(logFilePath, 'utf-8');
expect(logs).not.toMatch('CREATE_NEW_TARGET');
expect(logs).not.toMatch('[.kibana] CREATE_NEW_TARGET');
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test now creates a new index for the new type.
I made the constraint more specific and ensure that a new .kibana is not being created.


expect(Array.isArray(typesMap['.kibana'])).toEqual(true);
expect(typesMap['.kibana'].length > 50).toEqual(true);
expect(typesMap['.kibana'].includes('action')).toEqual(true);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking that the map contains an exhaustive list of mappings is not necessary.
This test is broken everytime anyone adds a new SO type.
I've relaxed the constraint a bit.


// the deleteByNamespace performs an updateByQuery under the hood.
// It targets all SO indices that have namespace-aware types
const nsIndices = getIndicesWithNamespaceAwareTypes(setup.savedObjects.getTypeRegistry());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we were running a very costly operation, registering thousands of routes, when 99% of requests target .kibana solely.
The only case where multiple indices were specified in the HTTP request path is with namespace-aware indices.
The test was timing out due to the new SO index, which added a lot more permutations.
Hence, I used the opportunity to simplify it.
The test should now be noticeably ligher/faster.

/**
* Parameters to the `serializeCounterKey` method
* @internal used in kibana_usage_collectors
* @internal bused in kibana_usage_collectors
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

createMockSavedObjectDoc(
moment().subtract(6, 'days'),
'doc-id-3',
USAGE_COUNTERS_SAVED_OBJECT_TYPE
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove, useless

@kibana-ci
Copy link

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #8 / Actions and Triggers app Stack alerts page Loads the page Applies the correct quick filter

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
@kbn/core-saved-objects-server 133 134 +1
usageCollection 16 17 +1
total +2
Unknown metric groups

API count

id before after diff
@kbn/core-saved-objects-server 561 562 +1
usageCollection 55 57 +2
total +3

History

  • 💔 Build #218801 failed f835bfac0d21c0c7bb8184788f73a10b0e1ad641
  • 💔 Build #218708 failed f5a9287d69e8293a97ae2bf60111bc99ba1e1498
  • 💔 Build #218456 failed 267b263e7b1f58de006482c64e6e7ccf33e75b2a
  • 💔 Build #218230 failed a9576db5210c59d8a0d65112ba9888b1b044bb45

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@gsoldevila gsoldevila marked this pull request as ready for review July 2, 2024 14:10
@gsoldevila gsoldevila requested a review from a team as a code owner July 2, 2024 14:10
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

Copy link
Member

@afharo afharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Telemetry LGTM

return { dailyEvents };
}
const UI: UsageCounters.v1.CounterEventSource = 'ui';
const UI_COUNTERS_FILTER = `${USAGE_COUNTERS_SAVED_OBJECT_TYPE}.attributes.source: ${UI}`;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the .attributes bit? I thought the core logic introduced it automatically

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do. the core rewrite is {type}.attributes.{attrPath} => {type}.{attrPath} (as each type's attributes are stored under a property matching their name, for historical reasons)


return await Promise.all(
docsToDelete.map(({ id }) => savedObjectsClient.delete(USAGE_COUNTERS_SAVED_OBJECT_TYPE, id))
docsToDelete.map(({ id, type }) => savedObjectsClient.delete(type, id))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: do we need to specify the namespace when deleting it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q2: Should we take this opportunity to use bulkDelete?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed on slack: because we're using an unscoped repository, we can (and need to) pass the namespace when calling delete

(and yeah, using bulkDelete would be ideal)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update to take namespaces into account.
Unfortunately bulkDelete does not allow for a multi-namespace delete operation, and I believe it relies on the spaces extension.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed on b7a2d54

Copy link
Contributor

@pgayvallet pgayvallet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM.

Only mandatory change is adapting the delete calls (#187064 (comment))

"tokenType"
],
"core-usage-stats": [],
"counter": [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we shouldn't go with something slightly less generic than just counter for the type identifier? core-counters at least? Maybe with an additional modifier if we're planning on having multiple SO types for our counter needs?

WDYT?

Copy link
Contributor Author

@gsoldevila gsoldevila Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update it to usage-counter (it's not a core feature).
We finally have a single type and store both agnostic and namespace-aware under namespaceType: single, using default when no namespace is provided.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed on b7a2d54

namespaces: ['*'],
fields: ['count', 'counterName', 'counterType', 'domainId'],
filter,
perPage: 1000,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: (I doubt it will ever be relevant, but still) one of the upsidesof the PIT is to release memory preasure by having smaller batches of fetched objects, so we usually want to have smaller batches than 1k (like 100 or so).

dailyEvents.push(event);
}
} catch (_) {
// swallow error; allows sending successfully transformed objects.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: we don't even want a warning or debug level message here?

Comment on lines +92 to +95
acc[key] = {
...existingCounter,
...counter,
total: existingCounter.total + counter.total,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any scenario where the spreading properties of existingCounter and counter will differ?

My question is, could this simply be replaced by:

acc[key].total = acc[key].total + counter.total

? (just minor perf increase, avoiding 2 spreads + an object creation)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only property we want to systematically override is lastUpdatedAt with the most recent one.
I'll update the code with your perf. enhancement.

return { dailyEvents };
}
const UI: UsageCounters.v1.CounterEventSource = 'ui';
const UI_COUNTERS_FILTER = `${USAGE_COUNTERS_SAVED_OBJECT_TYPE}.attributes.source: ${UI}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do. the core rewrite is {type}.attributes.{attrPath} => {type}.{attrPath} (as each type's attributes are stored under a property matching their name, for historical reasons)


return await Promise.all(
docsToDelete.map(({ id }) => savedObjectsClient.delete(USAGE_COUNTERS_SAVED_OBJECT_TYPE, id))
docsToDelete.map(({ id, type }) => savedObjectsClient.delete(type, id))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed on slack: because we're using an unscoped repository, we can (and need to) pass the namespace when calling delete

(and yeah, using bulkDelete would be ideal)

* Creates and registers a usage counter to collect daily aggregated plugin counter events
*/
createUsageCounter: (type: string) => UsageCounter;
createUsageCounter: (type: string, source?: 'server' | 'ui') => UsageCounter;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: how much work would it represen to have this additional parameter be mandatory? How many calls to adapt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact this is wrong/outdated.
The counters will not be bound to a source anymore. The source will be specified in the events.

createUsageCounter only reserves a certain 'domainId', but it does not register / persist any counter on that domain. Perhaps it should be named registerUsageCounterDomain instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming would involve a bunch of places / owners. I'll remove the 'source' for now, and we can tackle that on a separate PR.

image

Comment on lines 8 to 9
import * as Rx from 'rxjs';
import * as rxOp from 'rxjs';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: unrelated to the PR. but 🙃

serializeCounterKey({ ...attributes, date: updatedAt, namespace: namespaces?.[0] })
)
.sort()
).toEqual(RECENT_COUNTERS);
Copy link
Contributor Author

@gsoldevila gsoldevila Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an integration test to ensure we delete old ones, across namespaces, but preserve recent ones (even when their keys match those of the old ones).

@pgayvallet
Copy link
Contributor

Took a look at the latest changes - restamping with LGTM

@gsoldevila gsoldevila enabled auto-merge (squash) July 5, 2024 09:44
@gsoldevila gsoldevila merged commit e524fb6 into elastic:main Jul 5, 2024
@elasticmachine
Copy link
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
@kbn/core-saved-objects-server 133 134 +1
Unknown metric groups

API count

id before after diff
@kbn/core-saved-objects-server 561 562 +1
usageCollection 55 56 +1
total +2

History

gsoldevila added a commit that referenced this pull request Jul 5, 2024
## Summary

Follow up to #187064

* Improves the `rollups.test.ts` integration test.
* Adds the `count` field to the mappings, so that it can be aggregated.
@gsoldevila gsoldevila changed the title Relocate usage-counters to dedicated index and split server Vs ui [Usage counters] Relocate counters to a dedicated index. Unify / simplify 'ui' and 'server' counters logic. Jul 9, 2024
gsoldevila added a commit that referenced this pull request Aug 5, 2024
## Summary

Part of #186530
Follow-up of #187064 

The goal of this PR is to provide the necessary means to allow
implementing the [Counting
views](https://docs.google.com/document/d/1W77qoweixcjrq0sEKh_LjIk3j33Xyy9umod9mG9BlOM/edit)
part of the _Dashboards++_ initiative.
We do this by extending the capabilities of the _usage counters_ APIs:
* We support custom retention periods. Currently data is only kept in SO
indices for 5 days. Having 90 days worth of counting was required for
Dashboards++.
* We expose a Search API that will allow retrieving persisted counters.

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport:skip This PR does not require backporting Feature:Telemetry release_note:skip Skip the PR/issue when compiling release notes Team:Core Platform Core services: plugins, logging, config, saved objects, http, ES client, i18n, etc t// v8.16.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants