UUID inconsistency in different SDKs #9799

Open
va-kuznecov opened this issue Sep 26, 2024 · 6 comments

Comments

@va-kuznecov
Member

No description provided.

@va-kuznecov va-kuznecov changed the title UUID inconsistency in defferent SDK UUID inconsistency in defferent SDKs Sep 26, 2024
@SammyVimes
Collaborator

Table:

CREATE TABLE `/local/ydb_row_table` (
    id Uint64,
    val Uuid,
    PRIMARY KEY (id)
)

First we insert a row using the UI:

UPSERT INTO `/local/ydb_row_table`
    ( `id`, `val` )
VALUES (1, Uuid("123e4567-e89b-12d3-a456-426614174000"));

Let's now see how different SDKs handle this UUID:

Java

We can use one of three different UUID factory methods in Java:

List<Supplier<PrimitiveValue>> uuidGenerators = Arrays.asList(
          () -> PrimitiveValue.newUuid("123e4567-e89b-12d3-a456-426614174000"), // [1]
          () -> PrimitiveValue.newUuid(UUID.fromString("123e4567-e89b-12d3-a456-426614174000")), // [2]
          () -> {
              UUID uuid = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");

              return PrimitiveValue.newUuid(uuid.getMostSignificantBits(), uuid.getLeastSignificantBits()); // [3]
          }
      );

1 – from a string literal
2 – from java.util.UUID
3 – supplying MSB and LSB manually

"SELECT * FROM `/local/ydb_row_table` WHERE val = $val"

With this query, [1] and [2] work and [3] does not. In the Java SDK, creating a UUID from a string literal or from java.util.UUID performs some bit manipulation so that the value matches the YQL implementation used by the YDB server. Creating a UUID manually from the MSB and LSB skips that step and yields a different UUID. So on the server (and consequently in the UI or CLI), the same literal produces a 16-byte array that differs from the one we create manually from the MSB and LSB.
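To make the mismatch concrete, here is a minimal Python sketch rather than Java SDK code. It assumes the server layout is the one Python's `uuid` module exposes as `bytes_le` (first three fields little-endian); `msb` and `lsb` stand in for the values of `getMostSignificantBits()` and `getLeastSignificantBits()`, treated as unsigned. It only shows that the two halves of a standard UUID describe a different 16-byte sequence; it does not reproduce what the SDK does internally.

```python
import struct
import uuid

u = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")

# Stand-ins for java.util.UUID's most/least significant 64 bits (unsigned here).
msb = u.int >> 64
lsb = u.int & 0xFFFFFFFFFFFFFFFF

# Packing both halves big-endian gives the RFC byte order of the UUID.
rfc_bytes = struct.pack(">QQ", msb, lsb)

# Assumed server layout: time_low, time_mid and time_hi_version are little-endian.
server_bytes = u.bytes_le

print(rfc_bytes.hex())     # 123e4567e89b12d3a456426614174000
print(server_bytes.hex())  # 67453e129be8d312a456426614174000
```

Only the first eight bytes differ between the two layouts, which is why passing the raw halves without the swap ends up describing another UUID.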

Python

The situation is the same as in Java, but Python's uuid module already implements the UUID layout we use on the server, so there is no need to wrap the UUID as we do in Java.

import ydb

endpoint = "grpc://localhost:2136"
database = "/local"

with ydb.Driver(
        endpoint=endpoint,
        database=database,
) as driver:
    driver.wait(timeout=5, fail_fast=True)

    with ydb.QuerySessionPool(driver) as pool:
        result_sets = pool.execute_with_retries(
            "SELECT id, val FROM `/local/ydb_row_table` WHERE val = Uuid(\"123e4567-e89b-12d3-a456-426614174000\")"
        )
        print(result_sets)

This will yield the correct UUID, since internally we use the bytes_le constructor argument to build the UUID object from the byte array we receive from the server.
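The effect of `bytes_le` can be checked with the standard library alone. In this sketch `wire` is a hypothetical 16-byte payload, written by hand as the `bytes_le` layout of the UUID from the example; it is not captured from a real server response.

```python
import uuid

# Hypothetical 16 bytes as they would arrive in the little-endian field layout.
wire = bytes.fromhex("67453e129be8d312a456426614174000")

# bytes_le swaps the first three fields back, recovering the expected value.
good = uuid.UUID(bytes_le=wire)

# bytes takes the same payload as big-endian and yields a different UUID.
bad = uuid.UUID(bytes=wire)

print(good)  # 123e4567-e89b-12d3-a456-426614174000
print(bad)   # 67453e12-9be8-d312-a456-426614174000
```

The same 16 bytes decode to two different UUIDs depending only on which constructor argument is used.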

Go

In the Go SDK there is no wrapper for UUID; it is defined as follows:

// uuid.go
type UUID [16]byte

Using the github.com/google/uuid module and filling the uuid.UUID byte array directly, we face the same problem as with factory method [3] in the Java SDK.

	var (
		id  uint64
		val uuid.UUID
	)

	row, err := db.Query().QueryRow(ctx, "SELECT id, val FROM `/local/ydb_row_table` WHERE val = Uuid(\"123e4567-e89b-12d3-a456-426614174000\")")
	if err != nil {
		log.Printf("select failed: %v", err)
		return
	}

	// Scan the raw 16 bytes straight into uuid.UUID, with no byte swapping.
	if err = row.Scan(&id, (*[16]byte)(&val)); err != nil {
		log.Printf("scan failed: %v", err)
		return
	}

	uid, err := uuid.FromBytes(val[:])
	if err != nil {
		log.Printf("uuid.FromBytes failed: %v", err)
		return
	}

	log.Printf("id = %d, myStr = \"%s\"", id, uid.String())

The UUID value in this case will be "00401714-6642-56a4-12d3-e89b123e4567" and not "123e4567-e89b-12d3-a456-426614174000" as we expect.
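The two strings above are related by a simple byte permutation, which can be checked in Python. This sketch assumes the stored layout is the one Python calls `bytes_le`; it is an observation about the two values, not a description of the Go SDK's internals. The variable names are illustrative.

```python
import uuid

expected = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")

# Assumed stored layout: first three fields little-endian.
stored = expected.bytes_le

# Reversing all 16 stored bytes and reading them as a plain big-endian UUID
# gives exactly the value reported above.
observed = uuid.UUID(bytes=stored[::-1])
print(observed)  # 00401714-6642-56a4-12d3-e89b123e4567

# Under the same assumption, the expected value can be recovered by undoing
# the reversal and decoding with bytes_le.
repaired = uuid.UUID(bytes_le=observed.bytes[::-1])
print(repaired)  # 123e4567-e89b-12d3-a456-426614174000
```

This is the kind of manual manipulation a user would have to know about and apply themselves when no wrapper is provided.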

Other languages

I was unable to find wrappers similar to the Java or Python ones in the SDKs for other languages, so I assume they either don't support UUID at all (which seems to be the case with the Rust SDK) or handle it the same way Go does. In that case the necessary bit manipulation:
a) is unknown to a regular user
b) must be done manually

@alex268
Member

alex268 commented Oct 3, 2024

The main reason for the problem in the Java SDK is the similarity between the internal representations of YDB's UUID and Java's UUID. Both are 128 bits and can be represented as two Java longs. Two of the UUID creation methods take a Java UUID and a Java String, and they are public. The third method takes two longs; it is not intended for public use and exists precisely to create a value from a protobuf. It can confuse users, so it is probably better to hide that method from them (or deprecate it to begin with).

@va-kuznecov
Member Author

Together with @SammyVimes, @alex268 and @dcherednik we discussed this and came to a decision.

In every SDK and in YQL we will have two options:

  1. The built-in UUID, which swaps bytes the way it currently does in Java, C++ and YQL. This needs to be added to the Go SDK.
  2. A new method called FromBytes, which does not touch the bytes and stores them as-is. The Go SDK works this way now; it needs to be added to Java, C++ and YQL.

@asmyasnikov
Member

repro for future

CREATE TABLE example_table (
    id UUID PRIMARY KEY,
    name TEXT
);
INSERT INTO example_table (id, name)
VALUES ('7b66a2ca-91bb-48be-81b9-97d01a2e6692', 'John Doe');
SELECT * FROM example_table;

@rekby
Member

rekby commented Oct 8, 2024

Let's fix the Go SDK by adding a new method for working with a typed UUID that converts it internally to the server's UUID representation, and deprecate the current UUID method/type.

There is no need to add FromBytes to the other SDKs. It invites mistakes and confuses users: which method is better?
A "FromBytes" method would also be inconsistent with the server's YQL representation, which leads to errors in customers' systems and makes debugging harder.

@rekby
Member

rekby commented Oct 11, 2024

I found an additional unexpected sorting behaviour: #10328

@rekby rekby changed the title UUID inconsistency in defferent SDKs UUID inconsistency in different SDKs Oct 11, 2024