-
Notifications
You must be signed in to change notification settings - Fork 193
Fix PostgreSQL IP address serialization error #3900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
0b4a445 to
3324a00
Compare
2a92418 to
f810b59
Compare
artem-shelkovnikov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it's really easy to add this to ftest and we should.
Ideally the table(s) in ftest should have all possible data types that require serialisation.
Also, I see their test suite here https://github.com/MagicStack/asyncpg/blob/9b2b027224058e4590fdb9e41b5750a64dab1fce/tests/test_codecs.py that has some more fun data types, do you wanna try them out too? Geospatial types, for example, or UUIDs?
|
Okay, the only argument for not having ftest coverage is that this is somewhat duplicating stuff, and also probably not consistent with the rest of the ftests which as far as I understand don't try to be thorough. On more types, my coding assistant did suggest adding that stuff - I decided to try to keep the change minimal but without good reason. I'll see what I do about both, will ping for a repeat look. 👍 |
…p-serialization' into am/fix-postgresql-ip-serialization
|
@artem-shelkovnikov whenever you have time, have another 👀 - added some basic ftest coverage and also extended support to UUID and geometric types - I assume you meant built-in postgres geometric types from the linked test codecs file and not geospatial. I had to make some decisions about how to stringify things, and I decided to make no mention of the shape itself in the generated string, as the column type is already known from the schema, so including 'POINT' or 'POLYGON' in the string would be redundant. |
seanstory
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
lorenabalan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
## Closes #3879 Problem: The PostgreSQL connector fails with serialization error because asyncpg returns PostgreSQL's INET and CIDR columns as Python ipaddress module objects (IPv4Address, IPv6Address, IPv4Network, etc.) rather than strings. Our serialization layer doesn't handle these types, causing the connector to crash when attempting to convert documents to JSON. Solution: The implementation converts these objects to strings before calling the parent serializer, which handles standard types. Recursive serialization handles nested structures (lists/dicts containing IP addresses). Followed the same pattern from the MongoDB connector. Only added plain tests for these changes, not ftest coverage. You, reviewer!, give me an opinion on this! <!--Provide a general description of the code changes in your pull request. If the change relates to a specific issue, include the link at the top. If this is an ad-hoc/trivial change and does not have a corresponding issue, please describe your changes in enough details, so that reviewers and other team members can understand the reasoning behind the pull request.--> ## Checklists <!--You can remove unrelated items from checklists below and/or add new items that may help during the review.--> #### Pre-Review Checklist - [x] this PR does NOT contain credentials of any kind, such as API keys or username/passwords (double check `config.yml.example`) - [x] this PR has a meaningful title - [x] this PR links to all relevant github issues that it fixes or partially addresses - [ ] if there is no GH issue, please create it. Each PR should have a link to an issue - [x] this PR has a thorough description - [x] Covered the changes with automated tests - [x] Tested the changes locally - [x] Added a label for each target release version (example: `v7.13.2`, `v7.14.0`, `v8.0.0`) - [x] For bugfixes: backport safely to all minor branches still receiving patch releases - [x] Considered corresponding documentation changes - [x] Contributed any configuration settings changes to the configuration reference - [x] if you added or changed Rich Configurable Fields for a Native Connector, you made a corresponding PR in [Kibana](https://github.com/elastic/kibana/blob/main/packages/kbn-search-connectors/types/native_connectors.ts) #### Changes Requiring Extra Attention <!--Please call out any changes that require special attention from the reviewers and/or increase the risk to availability or security of the system after deployment. Remove the ones that don't apply.--> - [ ] Security-related changes (encryption, TLS, SSRF, etc) - [ ] New external service dependencies added. ## Related Pull Requests <!--List any relevant PRs here or remove the section if this is a standalone PR. * https://github.com/elastic/.../pull/123--> ## Release Note <!--If you think this enhancement/fix should be included in the release notes, please write a concise user-facing description of the change here. You should also label the PR with `release_note` so the release notes author(s) can easily look it up.--> --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
💔 Failed to create backport PR(s)
Successful backport PRs will be merged automatically after passing CI. To backport manually run: |
Backports the following commits to 9.3: - Fix PostgreSQL IP address serialization error (#3900) --------- Co-authored-by: Apostolos Matsagkas <Apmats@users.noreply.github.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
|
Yeah, due to our defactoring you'll need to do backports manually here, @Apmats |
|
@artem-shelkovnikov yeap working on that now, I knew it was coming. 😁 |
Backport of PR #3900 to 9.2 PostgreSQL connector uses asyncpg which returns special Python objects for certain PostgreSQL data types. These need to be serialized to strings before indexing: - Network types (INET, CIDR) -> ipaddress module objects - UUID type -> uuid.UUID objects - Geometric types (POINT, LINE, POLYGON, etc.) -> asyncpg.types objects - BitString type (BIT, VARBIT) -> asyncpg.types.BitString objects
Backport of PR #3900 to 9.2 PostgreSQL connector uses asyncpg which returns special Python objects for certain PostgreSQL data types. These need to be serialized to strings before indexing: - Network types (INET, CIDR) -> ipaddress module objects - UUID type -> uuid.UUID objects - Geometric types (POINT, LINE, POLYGON, etc.) -> asyncpg.types objects - BitString type (BIT, VARBIT) -> asyncpg.types.BitString objects
Backport of PR #3900 to 9.2 PostgreSQL connector uses asyncpg which returns special Python objects for certain PostgreSQL data types. These need to be serialized to strings before indexing: - Network types (INET, CIDR) -> ipaddress module objects - UUID type -> uuid.UUID objects - Geometric types (POINT, LINE, POLYGON, etc.) -> asyncpg.types objects - BitString type (BIT, VARBIT) -> asyncpg.types.BitString objects
Backport of PR #3900 to 9.2 PostgreSQL connector uses asyncpg which returns special Python objects for certain PostgreSQL data types. These need to be serialized to strings before indexing: - Network types (INET, CIDR) -> ipaddress module objects - UUID type -> uuid.UUID objects - Geometric types (POINT, LINE, POLYGON, etc.) -> asyncpg.types objects - BitString type (BIT, VARBIT) -> asyncpg.types.BitString objects
Manually backporting the work from the following PR to 9.2: - Fix PostgreSQL IP address serialization error (#3900) I think auto backport should work for these versions and so I can avoid seeking reviews for the other branches. 😁 --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Manually backporting the work from the following PR to 9.1: - Fix PostgreSQL IP address serialization error (#3900) I thought doing #3904 would backport even when merging to something that's not main, but apparently I was wrong, so handling in separate PRs. --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Closes #3879
Problem:
The PostgreSQL connector fails with serialization error because asyncpg returns PostgreSQL's INET and CIDR columns as Python ipaddress module objects (IPv4Address, IPv6Address, IPv4Network, etc.) rather than strings. Our serialization layer doesn't handle these types, causing the connector to crash when attempting to convert documents to JSON.
Solution:
The implementation converts these objects to strings before calling the parent serializer, which handles standard types. Recursive serialization handles nested structures (lists/dicts containing IP addresses). Followed the same pattern from the MongoDB connector.
Only added plain tests for these changes, not ftest coverage. You, reviewer!, give me an opinion on this!
Checklists
Pre-Review Checklist
config.yml.example)v7.13.2,v7.14.0,v8.0.0)Changes Requiring Extra Attention
Related Pull Requests
Release Note