Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add blob scalar type #2091

Merged
merged 18 commits into from
Dec 7, 2023
Merged

Conversation

nasdf
Copy link
Member

@nasdf nasdf commented Nov 28, 2023

Relevant issue(s)

Resolves #2090

Description

This PR adds a blob scalar type to the schema system.

Tasks

  • I made sure the code is well commented, particularly hard-to-understand areas.
  • I made sure the repository-held documentation is changed accordingly.
  • I made sure the pull request title adheres to the conventional commit style (the subset used in the project can be found in tools/configs/chglog/config.yml).
  • I made sure to discuss its limitations such as threats to validity, vulnerability to mistake and misuse, robustness to invalidation of assumptions, resource requirements, ...

How has this been tested?

Added integration tests.

Specify the platform(s) on which this was tested:

  • MacOS

@nasdf nasdf added feature New feature or request area/schema Related to the schema system labels Nov 28, 2023
@nasdf nasdf added this to the DefraDB v0.9 milestone Nov 28, 2023
@nasdf nasdf self-assigned this Nov 28, 2023
@nasdf nasdf requested a review from a team November 28, 2023 20:02
@@ -14,6 +14,7 @@ import (
gql "github.com/sourcenetwork/graphql-go"

"github.com/sourcenetwork/defradb/client"
schemaTypes "github.com/sourcenetwork/defradb/request/graphql/schema/types"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: Seeing this make me wonder if the type should be added to the graphql-go package.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to that. Should we move the other types there as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me nervous, I think we added DateTime to that package, but graphql-go is supposed to be a general purpose gql package, not a defra package. And we may/do want to get rid of it/replace it at somepoint

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we added DateTime to that package

No, DateTime was already defined as a scalar in the graphql-go package, but there were some problems with it as it relates to Defra integration, there was PR that got merged on the graphql-go package related to the DateTime stuff, but it wasn't the implementation/definition of the DateTime scalar. (reference: #931 and sourcenetwork/graphql-go#8)

I do agree though that we shouldn't have to define the scalar in the graphql-go package, it exposes the necessary types/functions for us to define our own scalars outside that package, exactly as they're being used here. Moreover, the fact that we have our own defined custom scalars implies that w.e package we may replace the graphql-go package with, should implement the "base" scalars as defined in the GQL spec, and all we have to worry about is the specific "custom" scalars that are neatly organized in a single place.

Copy link

codecov bot commented Nov 28, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (51b20bb) 73.96% compared to head (438894b) 74.00%.

❗ Current head 438894b differs from pull request most recent head a46fad6. Consider uploading reports for the commit a46fad6 to get more accurate results

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2091      +/-   ##
===========================================
+ Coverage    73.96%   74.00%   +0.04%     
===========================================
  Files          248      249       +1     
  Lines        24820    24859      +39     
===========================================
+ Hits         18357    18396      +39     
  Misses        5206     5206              
  Partials      1257     1257              
Flag Coverage Δ
all-tests 74.00% <100.00%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
client/descriptions.go 76.47% <100.00%> (+0.57%) ⬆️
db/collection_update.go 73.44% <100.00%> (+0.14%) ⬆️
db/index.go 87.60% <100.00%> (+0.76%) ⬆️
request/graphql/schema/collection.go 91.38% <100.00%> (+0.08%) ⬆️
request/graphql/schema/manager.go 97.46% <100.00%> (+0.07%) ⬆️
request/graphql/schema/types/scalars.go 100.00% <100.00%> (ø)

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 51b20bb...a46fad6. Read the comment docs.

gql.String: client.FieldKind_STRING,
&gql.Object{}: client.FieldKind_FOREIGN_OBJECT,
&gql.List{}: client.FieldKind_FOREIGN_OBJECT_ARRAY,
schemaTypes.BytesScalarType: client.FieldKind_BYTES,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Related to the other comment from Fred, I think it should be made clear that these are custom scalars, as the comment on line 36 suggests ("More custom ones to come"). Just a quick comment separating base scalars from custom scalars for clarity 👍

@@ -204,6 +204,7 @@ var FieldKindStringToEnumMapping = map[string]FieldKind{
"String": FieldKind_STRING,
"[String]": FieldKind_NILLABLE_STRING_ARRAY,
"[String!]": FieldKind_STRING_ARRAY,
"Bytes": FieldKind_BYTES,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: I think this is a strange way of representing this, and that [Byte] is more sensible and fits much better with the rest of the types, and any potential future Byte, [Byte!] or Byte! types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to Blob as we discussed in the standup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Keenan :)

return nil
}
},
ParseValue: func(value any) any {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: Thank you for the reminder as to how much I dislike the graphql library and it's totally rubbish way of (not) handling errors...

func TestBytesScalarTypeParseValue(t *testing.T) {
input := "00ff"
output := []byte{0, 255}
invalid := "invalid"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Why is this invalid?

todo: Please document why this is invalid, if it remains invalid.

}
return data
default:
return nil
Copy link
Contributor

@AndrewSisley AndrewSisley Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Why does this return nil instead of []byte{}?

EDIT:
question: Is this a silent failure? Like the errors above:

if err != nil {
	return nil
}

todo: If so, please document

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just following how the other types work in the graphql-go library. It seems like nil is the return value when the type cannot be parsed.

}
return data
default:
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: Similar to another comment, please document this - it looks like a silent error, but I only know that by guessing given the:

if err != nil {
	return nil
}

stuff in other code blocks

@@ -64,30 +64,6 @@ func TestSchemaUpdatesAddFieldKind9(t *testing.T) {
testUtils.ExecuteTestCase(t, test)
}

func TestSchemaUpdatesAddFieldKind13(t *testing.T) {
Copy link
Contributor

@AndrewSisley AndrewSisley Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: Please add a tests (one by enum number, and one by String-name) in .../schema/updates/add/field/kind/foo.go that tests that this field works with PatchSchema


var BytesScalarType = graphql.NewScalar(graphql.ScalarConfig{
Name: "Bytes",
Description: "The `Bytes` scalar type represents an array of bytes.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: If it is an array of bytes, why does it only appear to accept string values?

todo: I'm not sure if it does only accept string values, could you add an integration test documenting the beheviour when provided an array of numbers:

testUtils.CreateDoc{
	Doc: `{
		"name": "John",
		"data": [0,1,2,3...]
	}`,
},

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is a bug in theclient.Document API. You can also do this to a string field:

testUtils.CreateDoc{
		Doc: `{
			"name":  12345
		}`,
}
testUtils.CreateDoc{
		Doc: `{
			"name":  false
		}`,
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite a bug, we just dont do any checking of the saved value. The read should fail though I think. Does the read of [0,1,2,3...] fail? Can we please document the behaviour, as it is a more likely question than assigning 1234 to a boolean :)

case *[]byte:
return hex.EncodeToString(*value)
default:
return nil
Copy link
Contributor

@AndrewSisley AndrewSisley Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: Please document why you return nil here

Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a handful of localized questions/todos that I would like resolving before merging, sorry about the hassle :)

Serialize: func(value any) any {
switch value := value.(type) {
case []byte:
return hex.EncodeToString(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Why are you encoding/decoding at all here (including in ParseValue)?

todo: The graphql library lacks any documentation here, but we can add some to our code to document why Serialize and ParseValue exist.

var BlobScalarType = graphql.NewScalar(graphql.ScalarConfig{
Name: "Blob",
Description: "The `Blob` scalar type represents a binary large object.",
// Serialize converts the value to the serialized hex representation
Copy link
Contributor

@AndrewSisley AndrewSisley Dec 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Why is it converted to hex?

todo: If we preserve the serialization/deserialization to hex, please document why it is converted.

Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks for this, and the extra tweaks Keenan - I'm approving now with one open/pending discussion RE why you are converting to hex and back that should be resolved before merge :)

@nasdf
Copy link
Member Author

nasdf commented Dec 4, 2023

Looks good to me, thanks for this, and the extra tweaks Keenan - I'm approving now with one open/pending discussion RE why you are converting to hex and back that should be resolved before merge :)

Thanks! I refactored the parsing logic so that the value will always be a hex string.

Serialize: coerceBlob,
// ParseValue converts the value to a hex string
ParseValue: coerceBlob,
// ParseLiteral converts the ast value a hex string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: "to a hex string"?

Copy link
Contributor

@islamaliev islamaliev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but one test is missing.

db/index.go Outdated
@@ -51,6 +51,8 @@ func getValidateIndexFieldFunc(kind client.FieldKind) func(any) bool {
return canConvertIndexFieldValue[float64]
case client.FieldKind_BOOL:
return canConvertIndexFieldValue[bool]
case client.FieldKind_BLOB:
return canConvertIndexFieldValue[string]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: please add test here TestNonUnique_StoringIndexedFieldValueOfDifferentTypes

@nasdf nasdf changed the title feat: Add bytes scalar type feat: Add blob scalar type Dec 6, 2023
Copy link
Contributor

@islamaliev islamaliev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good job, Keenan!

@nasdf nasdf merged commit 9084218 into sourcenetwork:develop Dec 7, 2023
27 of 28 checks passed
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Jan 22, 2024
## Relevant issue(s)

Resolves sourcenetwork#2090 

## Description

This PR adds a blob scalar type to the schema system.

## Tasks

- [x] I made sure the code is well commented, particularly
hard-to-understand areas.
- [x] I made sure the repository-held documentation is changed
accordingly.
- [x] I made sure the pull request title adheres to the conventional
commit style (the subset used in the project can be found in
[tools/configs/chglog/config.yml](tools/configs/chglog/config.yml)).
- [x] I made sure to discuss its limitations such as threats to
validity, vulnerability to mistake and misuse, robustness to
invalidation of assumptions, resource requirements, ...

## How has this been tested?

Added integration tests.

Specify the platform(s) on which this was tested:
- MacOS
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
## Relevant issue(s)

Resolves sourcenetwork#2090 

## Description

This PR adds a blob scalar type to the schema system.

## Tasks

- [x] I made sure the code is well commented, particularly
hard-to-understand areas.
- [x] I made sure the repository-held documentation is changed
accordingly.
- [x] I made sure the pull request title adheres to the conventional
commit style (the subset used in the project can be found in
[tools/configs/chglog/config.yml](tools/configs/chglog/config.yml)).
- [x] I made sure to discuss its limitations such as threats to
validity, vulnerability to mistake and misuse, robustness to
invalidation of assumptions, resource requirements, ...

## How has this been tested?

Added integration tests.

Specify the platform(s) on which this was tested:
- MacOS
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/schema Related to the schema system feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Blob scalar type
5 participants