Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eBPF/PSA: Implement caching for ActionSelector, LPM and ternary tables #3738

Merged
merged 2 commits into from
Nov 30, 2022

Conversation

tatry
Copy link
Contributor

@tatry tatry commented Nov 28, 2022

This PR introduces optimization for the PSA/eBPF backend called table caching which adds cache table with only exact match keys for time consuming look-ups. We identified three situations where this optimization can improve performance:

  • Table with ternary (and/or lpm, exact) key - skip slow TSS algorithm if the key was earlier matched.
  • Table with lpm (and/or exact) key - skip slow LPM_TRIE map (especially when there is many entries) if the key was earlier matched.
  • ActionSelector member selection from group - skip slow checksum calculation for selector key if it was earlier calculated.

This optimization may not improve performance in every case, so it must be explicitly enabled by compiler option --table-caching.

Limitations/to do:

  • DirectCounter and DirectMeter are not supported by this optimization (tables with these externs will not have enabled cache).
  • Cache size is a half of table size, but it would be more configurable or smart during compilation.
  • Updates to tables with cache or ActionSelector require cache invalidation (nikss library will remove all cached entries).

This commit introduces optimization for the PSA/eBPF backend called
`table caching` which adds cache table with only `exact` match keys for time
consuming look-ups. We identified three situations where this optimization
can improve performance:

- Table with `ternary` (and/or `lpm`, `exact`) key - skip slow TSS algorithm
  if the key was earlier matched.
- Table with `lpm` (and/or `exact`) key - skip slow `LPM_TRIE` map (especially
  when there is many entries) if the key was earlier matched.
- `ActionSelector` member selection from group - skip slow checksum
  calculation for `selector` key if it was earlier calculated.

This optimization may not improve performance in every case, so it must be
explicitly enabled by compiler option `--table-caching`.

Limitations/to do:

- `DirectCounter` and `DirectMeter` are not supported by this optimization
  (tables with these externs will not have enabled cache).
- Cache size is a half of table size, but it would be more configurable or
  smart during compilation.
- Updates to tables with cache or `ActionSelector` require cache invalidation
  (`nikss` library will remove all cached entries).

Co-authored-by: Tomasz Osiński <[email protected]>
- table with `lpm` (and/or exact) key - skip slow `LPM_TRIE` map (especially when there is many entries) if the key was matched earlier.
- `ActionSelector` member selection from group - skip slow checksum calculation for `selector` key if it was calculated earlier.

The fast exact-match map is added in front of each instance of a table that contains a `lpm`, `ternary` or `selector` match
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also expect that modifying the original table may invalidate the cache.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cache invalidation is up to control plane. I think it never occur from eBPF program side.

table entry. Tables with these externs will not have enabled cache optimization even when enabled by compiler option.
- When table cache optimization is enabled for a table, the number of cached entries is determined as a half of table size.
This would be more configurable or smart during compilation.
- Updates to tables or ActionSelector with enabled table cache optimization require cache invalidation. `nikss` library
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may be able to do better if you keep enough state to figure out what to invalidate.
Invalidating the whole cache may be very expensive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it might be expensive. I think that key and mask is enough to implement selective cache invalidation when updating table. Whole cache is invalidated because it is a simple algorithm and always work, but might be changed in the future.

}

void EBPFTablePSA::emitCacheLookup(CodeBuilder* builder, cstring key, cstring value) {
cstring cacheVal = "cached_value";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this behaves like a table could you have reused the code generation for a table by creating a fake table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can say cache behaves like a table but not in the same way. Both have similarly named methods, but internals of these are different. So table's method have to be overridden by these methods for cache (this code will not disappear). And the main part of table lookup is done by control block (not a table itself), so it is not easily accessible from table.

I think it will be rather hard to reuse code for table, so I would keep as is.


builder->emitIndent();
if (memcpy) {
builder->appendFormat("memcpy(&%s.%s, &", cacheKeyVar.c_str(), fieldName.c_str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so ebpf supports memcpy... or perhaps this is just an intrinsic that the compiler knows about

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably it maps to an intrinsic that the compiler knows about. I forgot to change it to __builtin_memcpy.

@tatry
Copy link
Contributor Author

tatry commented Nov 30, 2022

@mbudiu-vmw
Thank you for your review, I think that I addressed all of your questions.

@mihaibudiu mihaibudiu merged commit 9036ffc into p4lang:main Nov 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants