Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][fn] Add missing version field back to querystate API #21966

Merged
merged 6 commits into from
Jan 27, 2024

Conversation

jiangpengcheng
Copy link
Contributor

@jiangpengcheng jiangpengcheng commented Jan 25, 2024

Fixes #xyz

Main Issue: #xyz

PIP: #xyz

Motivation

#21597 introduce a change that always uses null for the version field for the querystate API. This may break users' behavior, need to add it back

Modifications

  1. Add a new class org.apache.pulsar.functions.api.state.StateValue, which contains value, version, and isNumber fields.
  2. Add two new methods to the org.apache.pulsar.functions.api.state.ByteBufferStateStore interface, both of which return a StateValue:
    1. default StateValue getStateValue(String key)
    2. default StateValue getStateValueAsync(String key)
  3. Use DefaultStateStore#getStateValue method instead of DefaultStateStore#get in org.apache.pulsar.functions.worker.rest.api.ComponentImpl to get the version field from the backend state store and return it to users.

Verifying this change

  • Make sure that the change passes the CI checks.

  • This change added tests and can be verified as follows:

    • Add unit tests for getStateValue and getStateValueAsync
    • Update integration test of PulsarStateTest#testPythonWordCountFunction, now it will check the version field too.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: jiangpengcheng#24

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jan 25, 2024
@jiangpengcheng jiangpengcheng force-pushed the add_missing_version_field branch from 1b256d9 to 6c0810f Compare January 25, 2024 03:25
@codelipenghui codelipenghui added the release/blocker Indicate the PR or issue that should block the release until it gets resolved label Jan 25, 2024
@codelipenghui codelipenghui added this to the 3.2.0 milestone Jan 25, 2024
@lhotari
Copy link
Member

lhotari commented Jan 25, 2024

Regarding https://apache-pulsar.slack.com/archives/C5ZSVEN4E/p1706147824703029?thread_ts=1705826989.958419&cid=C5ZSVEN4E ,

now the display changed to

{
  "key": "hello",
  "stringValue": "\u0000\u0000\u0000\u0000\u0000\u0000\u0000\n",
  "numberValue": 10
}

What caused the stringValue key to appear? Is there also a fix for that?

@jiangpengcheng
Copy link
Contributor Author

Regarding https://apache-pulsar.slack.com/archives/C5ZSVEN4E/p1706147824703029?thread_ts=1705826989.958419&cid=C5ZSVEN4E ,

now the display changed to

{
  "key": "hello",
  "stringValue": "\u0000\u0000\u0000\u0000\u0000\u0000\u0000\n",
  "numberValue": 10
}

What caused the stringValue key to appear? Is there also a fix for that?

It's partly resolved, if the backend state store can specify whether a state value is a number(BK supports this), it won't return stringValue and byteValue

But if a backend state store doesn't have such ability, we can't tell whether a state value is a number even its length is Long.BYTES, in this case, a stringValue or byteValue is returned with the numberValue

@lhotari
Copy link
Member

lhotari commented Jan 25, 2024

It's partly resolved, if the backend state store can specify whether a state value is a number(BK supports this), it won't return stringValue and byteValue

This happened with BK and 3.2.0 rc4. What does it mean that it's partly resolved?

@jiangpengcheng
Copy link
Contributor Author

jiangpengcheng commented Jan 26, 2024

It's partly resolved, if the backend state store can specify whether a state value is a number(BK supports this), it won't return stringValue and byteValue

This happened with BK and 3.2.0 rc4. What does it mean that it's partly resolved?

When using the BK state store, it won't return stringValue or byteValue if the value is a number, so it's fixed for BK.

But if users use another state store, this state store may not be able to specify whether a state value is a number or a string(such as the PulsarMetadataStateStoreImpl), then when the length of a state value's byte[] is equal to Long.BYTES, we can only return numberValue along with stringValue or byteValue because this state value can either be a number or string.

This is also the reason why the type of isNumber is Boolean, because not all state store has the ability to know whether a state value is a number or not.

I think this is fine, because before this PR and #21597, the querystate and putstate API were using the bookkeeper storage client directly, and users could not use other state stores, so it won't break user's behaviors since they are always using the BKStateStoreImpl.

@lhotari
Copy link
Member

lhotari commented Jan 26, 2024

/pulsarbot rerun-failure-checks

@Technoboy- Technoboy- merged commit 48e64b3 into apache:master Jan 27, 2024
52 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc-not-needed Your PR changes do not impact docs ready-to-test release/blocker Indicate the PR or issue that should block the release until it gets resolved
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants