Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cbor): add Smithy RPCv2 CBOR protocol generator #1280

Merged
merged 3 commits into from
Aug 16, 2024
Merged

Conversation

kuhe
Copy link
Contributor

@kuhe kuhe commented May 14, 2024

Adds a protocol generator for the trait smithy.protocols#rpcv2Cbor

@@ -11,7 +11,7 @@
"clean": "rimraf ./dist-* && rimraf *.tsbuildinfo || exit 0",
"lint": "npx eslint -c ../../.eslintrc.js \"src/**/*.ts\" && node ./scripts/lint",
"format": "prettier --config ../../prettier.config.js --ignore-path ../.prettierignore --write \"**/*.{ts,md,json}\"",
"test": "yarn g:jest"
"test": "yarn g:jest --maxWorkers 1"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jest has problems passing the JSON data between workers when it includes BigInt's, apparently?

Copy link
Contributor

@trivikr trivikr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I didn't go through CBOR encode/decode source code, as it has unit tests.


ensureSpace(typeof input === "string" ? input.length * 4 : 64);

if (typeof input === "string") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it help to store the result of typeof into variable and reuse that rather than this many if/else statements that call it over and over? I also see typeof input === "string" above too.

I also think V8 would optimize a switch statement into jump statements. I doubt it would do that for these if/else statements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This testing shows if-statements to be faster https://jsperf.app/kuyaba/2.

typeof variable === "..." is likely optimized in a specific manner. Even storing typeof variable instead of calling it multiple times de-optimizes the code. The typeof operator is likely a property on the native object and not a function call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very weird!

packages/core/src/submodules/cbor/cbor-encode.ts Outdated Show resolved Hide resolved
packages/core/src/submodules/cbor/cbor-encode.ts Outdated Show resolved Hide resolved
packages/core/src/submodules/cbor/cbor-encode.ts Outdated Show resolved Hide resolved
packages/core/src/submodules/cbor/cbor-encode.ts Outdated Show resolved Hide resolved
packages/core/src/submodules/cbor/cbor-encode.ts Outdated Show resolved Hide resolved

export function resize(size: number) {
const old = data;
data = alloc(size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this some kind of global buffer that all cbor encoding writes to? Forgive my JS ignorance, but is that dangerous if someone uses something like workers? Is this does use global write state, I'm curious why we wouldn't use an isolated state for each encoding process and do something like buffer pooling to borrow/return up to N already allocated buffers.

Copy link
Contributor Author

@kuhe kuhe Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and the global encoding buffer is inaccessible to consumers, so it cannot be passed to workers and won't be subject to concurrent operations. Workers will each have a copy of that buffer.

Because the encoding function is synchronous and the runtime single-threaded, the buffer is in use by only one encoding process at a time. A copy is made when the result is handed over to the outgoing request. This copy is done very quickly and represents <1% of the time spent.

I believe using a pool of buffers would not improve the speed or memory usage, even if it could avoid that copy operation, due to the speed of that copy operation and the overhead of managing the buffer pool.

}

export function toUint8Array(): Uint8Array {
const out = alloc(cursor);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any way to use something like buffer pooling?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know of a specific Node.js API that maps to the term "buffer pooling". If you mean using for example creating a manager class that allocates and reclaims separate buffers, I don't believe it will improve the perf profile over using a single buffer, for reasons given below in the comment thread about the global buffer.

Initial development was done with individual buffers per encoding process, and iterative development showed that allocation/reclamation was costly and would unpredictably trigger GC. Even a well managed pool shouldn't beat the simplicity of a single buffer.

@kuhe kuhe merged commit 5865b65 into smithy-lang:main Aug 16, 2024
11 checks passed
@kuhe kuhe deleted the cbor branch August 16, 2024 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants