-
Notifications
You must be signed in to change notification settings - Fork 444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
eBPF/PSA: Implement caching for ActionSelector, LPM and ternary tables #3738
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -535,6 +535,23 @@ $ bpftool help | |
|
||
Refer to [the bpftool guide](https://manpages.ubuntu.com/manpages/focal/man8/bpftool-prog.8.html) for more examples how to use it. | ||
|
||
# Performance optimizations | ||
|
||
## Table caching | ||
|
||
Table caching optimizes P4 table lookups by adding a cache with all `exact` matches for time-consuming lookups including: | ||
- table with `ternary` (and/or lpm, exact) key - skip slow TSS algorithm if the key was matched earlier. | ||
- table with `lpm` (and/or exact) key - skip slow `LPM_TRIE` map (especially when there is many entries) if the key was matched earlier. | ||
- `ActionSelector` member selection from group - skip slow checksum calculation for `selector` key if it was calculated earlier. | ||
|
||
The fast exact-match map is added in front of each instance of a table that contains a `lpm`, `ternary` or `selector` match | ||
key. The table cache is implemented with `BPF_MAP_TYPE_LRU_HASH`, which shares its implementation with the BPF hash map. | ||
The LRU map provides a good lookup performance, but lower performance on map updates due to a maintenance process. Thus, | ||
this optimization fits into use cases, where a value of table key changes infrequently between packets. | ||
|
||
This optimization may not improve performance in every case, so it must be explicitly enabled by compiler option. To enable | ||
table caching pass `--table-caching` to the compiler. | ||
|
||
# TODO / Limitations | ||
|
||
We list the known bugs/limitations below. Refer to the Roadmap section for features planned in the near future. | ||
|
@@ -545,14 +562,20 @@ with some NICs. So far, we have verified the correct behavior with Intel 82599ES | |
- `lookahead()` with bit fields (e.g., `bit<16>`) doesn't work. | ||
- `@atomic` operation is not supported yet. | ||
- `psa_idle_timeout` is not supported yet. | ||
- DirectCounter and DirectMeter externs are not supported for P4 tables with implementation (ActionProfile). | ||
- DirectCounter and DirectMeter externs are not supported for P4 tables with implementation (ActionProfile or ActionSelector). | ||
- The `xdp2tc=head` mode works only for packets larger than 34 bytes (the size of Ethernet and IPv4 header). | ||
- `value_set` only supports the exact match type and can only match on a single field in the `select()` expression. | ||
- The number of entries in ternary tables are limited by the number of unique ternary masks. If a P4 program uses many ternary tables and the `--max-ternary-masks` (default: 128) is set | ||
to a high value, the P4 program may not load into the BPF subsystem due to the BPF complexity issue (the 1M instruction limit exceeded). This is the limitation of the current implementation of the TSS algorithm that | ||
requires iteration over BPF maps. Note that the recent kernel introduced the [bpf_for_each_map_elem()](https://lwn.net/Articles/846504/) helper that should simplify the iteration process and help to overcome the current limitation. | ||
- Setting a size of ternary tables does not currently work. | ||
- DirectMeter cannot be used if a table defines `ternary` match fields, as [BPF spinlocks are not allowed in inner maps of map-in-map](https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/). | ||
- Table cache optimization can't be enabled on tables with DirectCounter or DirectMeter due to two different states of a | ||
table entry. Tables with these externs will not have enabled cache optimization even when enabled by compiler option. | ||
- When table cache optimization is enabled for a table, the number of cached entries is determined as a half of table size. | ||
This would be more configurable or smart during compilation. | ||
- Updates to tables or ActionSelector with enabled table cache optimization require cache invalidation. `nikss` library | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You may be able to do better if you keep enough state to figure out what to invalidate. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree, it might be expensive. I think that key and mask is enough to implement selective cache invalidation when updating table. Whole cache is invalidated because it is a simple algorithm and always work, but might be changed in the future. |
||
will remove all cached entries if it detects cache. | ||
|
||
# Roadmap | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -225,6 +225,8 @@ EBPFTablePSA::EBPFTablePSA(const EBPFProgram* program, const IR::TableBlock* tab | |
initDirectCounters(); | ||
initDirectMeters(); | ||
initImplementation(); | ||
|
||
tryEnableTableCache(); | ||
} | ||
|
||
EBPFTablePSA::EBPFTablePSA(const EBPFProgram* program, CodeGenInspector* codeGen, cstring name) | ||
|
@@ -320,11 +322,13 @@ void EBPFTablePSA::emitInstance(CodeBuilder* builder) { | |
program->arrayIndexType, cstring("struct ") + valueTypeName, | ||
1); | ||
} | ||
|
||
emitCacheInstance(builder); | ||
} | ||
|
||
void EBPFTablePSA::emitTypes(CodeBuilder* builder) { | ||
EBPFTable::emitTypes(builder); | ||
// TODO: placeholder for handling PSA-specific types | ||
emitCacheTypes(builder); | ||
} | ||
|
||
/** | ||
|
@@ -899,4 +903,130 @@ cstring EBPFTablePSA::addPrefixFunc(bool trace) { | |
return addPrefixFunc; | ||
} | ||
|
||
void EBPFTablePSA::tryEnableTableCache() { | ||
if (!program->options.enableTableCache) return; | ||
if (!isLPMTable() && !isTernaryTable()) return; | ||
if (!counters.empty() || !meters.empty()) { | ||
::warning(ErrorType::WARN_UNSUPPORTED, | ||
"%1%: table cache can't be enabled due to direct extern(s)", | ||
table->container->name); | ||
return; | ||
} | ||
|
||
tableCacheEnabled = true; | ||
createCacheTypeNames(false, true); | ||
} | ||
|
||
void EBPFTablePSA::createCacheTypeNames(bool isCacheKeyType, bool isCacheValueType) { | ||
cacheTableName = instanceName + "_cache"; | ||
|
||
cacheKeyTypeName = keyTypeName; | ||
if (isCacheKeyType) cacheKeyTypeName = keyTypeName + "_cache"; | ||
|
||
cacheValueTypeName = valueTypeName; | ||
if (isCacheValueType) cacheValueTypeName = valueTypeName + "_cache"; | ||
} | ||
|
||
void EBPFTablePSA::emitCacheTypes(CodeBuilder* builder) { | ||
if (!tableCacheEnabled) return; | ||
|
||
builder->emitIndent(); | ||
builder->appendFormat("struct %s ", cacheValueTypeName.c_str()); | ||
builder->blockStart(); | ||
|
||
builder->emitIndent(); | ||
builder->appendFormat("struct %s value", valueTypeName.c_str()); | ||
builder->endOfStatement(true); | ||
|
||
// additional metadata fields add at the end of this structure. This allows | ||
// to simpler conversion cache value to value used by table | ||
|
||
builder->emitIndent(); | ||
builder->append("u8 hit"); | ||
builder->endOfStatement(true); | ||
|
||
builder->blockEnd(false); | ||
builder->endOfStatement(true); | ||
} | ||
|
||
void EBPFTablePSA::emitCacheInstance(CodeBuilder* builder) { | ||
if (!tableCacheEnabled) return; | ||
|
||
// TODO: make cache size calculation more smart. Consider using annotation or compiler option. | ||
size_t cacheSize = std::max((size_t)1, size / 2); | ||
builder->target->emitTableDecl(builder, cacheTableName, TableHashLRU, | ||
"struct " + cacheKeyTypeName, "struct " + cacheValueTypeName, | ||
cacheSize); | ||
} | ||
|
||
void EBPFTablePSA::emitCacheLookup(CodeBuilder* builder, cstring key, cstring value) { | ||
cstring cacheVal = "cached_value"; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if this behaves like a table could you have reused the code generation for a table by creating a fake table? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can say cache behaves like a table but not in the same way. Both have similarly named methods, but internals of these are different. So table's method have to be overridden by these methods for cache (this code will not disappear). And the main part of table lookup is done by control block (not a table itself), so it is not easily accessible from table. I think it will be rather hard to reuse code for table, so I would keep as is. |
||
|
||
builder->appendFormat("struct %s* %s = NULL", cacheValueTypeName.c_str(), cacheVal.c_str()); | ||
builder->endOfStatement(true); | ||
|
||
builder->target->emitTraceMessage(builder, "Control: trying table cache..."); | ||
|
||
builder->emitIndent(); | ||
builder->target->emitTableLookup(builder, cacheTableName, key, cacheVal); | ||
builder->endOfStatement(true); | ||
|
||
builder->emitIndent(); | ||
builder->appendFormat("if (%s != NULL) ", cacheVal.c_str()); | ||
builder->blockStart(); | ||
|
||
builder->target->emitTraceMessage(builder, | ||
"Control: table cache hit, skipping later lookup(s)"); | ||
builder->emitIndent(); | ||
builder->appendFormat("%s = &(%s->value)", value.c_str(), cacheVal.c_str()); | ||
builder->endOfStatement(true); | ||
builder->emitIndent(); | ||
builder->appendFormat("%s = %s->hit", program->control->hitVariable.c_str(), cacheVal.c_str()); | ||
builder->endOfStatement(true); | ||
|
||
builder->blockEnd(false); | ||
builder->append(" else "); | ||
builder->blockStart(); | ||
|
||
builder->target->emitTraceMessage(builder, "Control: table cache miss, nevermind"); | ||
builder->emitIndent(); | ||
|
||
// Do not end block here because we need lookup for (default) value | ||
// and set hit variable at this indent level which is done in the control block | ||
} | ||
|
||
void EBPFTablePSA::emitCacheUpdate(CodeBuilder* builder, cstring key, cstring value) { | ||
cstring cacheUpdateVarName = "cache_update"; | ||
|
||
builder->emitIndent(); | ||
builder->appendFormat("if (%s != NULL) ", value.c_str()); | ||
builder->blockStart(); | ||
|
||
builder->emitIndent(); | ||
builder->appendLine("/* update table cache */"); | ||
|
||
builder->emitIndent(); | ||
builder->appendFormat("struct %s %s = {0}", cacheValueTypeName.c_str(), | ||
cacheUpdateVarName.c_str()); | ||
builder->endOfStatement(true); | ||
|
||
builder->emitIndent(); | ||
builder->appendFormat("%s.hit = %s", cacheUpdateVarName.c_str(), | ||
program->control->hitVariable.c_str()); | ||
builder->endOfStatement(true); | ||
|
||
builder->emitIndent(); | ||
builder->appendFormat("__builtin_memcpy((void *) &(%s.value), (void *) %s, sizeof(struct %s))", | ||
cacheUpdateVarName.c_str(), value.c_str(), valueTypeName.c_str()); | ||
builder->endOfStatement(true); | ||
|
||
builder->emitIndent(); | ||
builder->target->emitTableUpdate(builder, cacheTableName, key, cacheUpdateVarName); | ||
builder->newline(); | ||
|
||
builder->target->emitTraceMessage(builder, "Control: table cache updated"); | ||
|
||
builder->blockEnd(true); | ||
} | ||
|
||
} // namespace EBPF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also expect that modifying the original table may invalidate the cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cache invalidation is up to control plane. I think it never occur from eBPF program side.