-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace core.Key and ds.Key with strongly typed data structures #19
Replace core.Key and ds.Key with strongly typed data structures #19
Conversation
WithPriorityFlag is called within reg.getPriority and should not be the responsibility of the caller
Is not needed and complicates the code
WithValueFlag is called in writeObjectMarker
unit test this like crazyLines 85 to 90 in ecc6049
This comment was generated by todo based on a
|
Support multiple spansLines 115 to 118 in ecc6049
This comment was generated by todo based on a
|
, this is very lazyLines 226 to 229 in ecc6049
This comment was generated by todo based on a
|
Config logger param package wideLines 42 to 45 in ecc6049
This comment was generated by todo based on a
|
I think I have actually done something horrible, PR description benchmark were invalid as I forgot you need to re-run |
ill check the perf on my side and investigate. TBD |
I think I just got lucky running the queries from master the first few times. Performance seems comparable ~500us - 2ms to master. Added a couple of commits (including one fixup) whilst going through again today. I will rebase onto master and squash the fixup before merging after/if the PR is approved |
Is wasteful taking the whole struct (and initing one in most cases)
ef8c7aa
to
8cbf009
Compare
I think at the moment, I'd rather just implement the simpler version of the conversion, instead of the structured version. It kinda seems like the same problems plauge the implementation here, specifically in the Thoughts? |
// [CollectionId]/[PrimaryIndexId]/ | ||
// | ||
// Any properties before the above (assuming a '/' deliminator) are ignored | ||
func NewDataStoreKey(key string) DataStoreKey { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically these constructors is what im referring to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be very much in favour of merging lol :) But I can see where you are coming from - I spent quite a while trying to get rid of as many prod-code references to those constructors as possible.
Those constructors are only called in a small number of places (ignoring tests and the deprecated get.go, 3 for NewDataStoreKey, and once for the HeadStoreKey. And those are all on direct output from the datastore where the keys should be 100% complete. Everything else is manually built.
I think even if these were more widely used (which they shouldn't be), the review still solves the following:
- Code is no longer order dependent - e.g. if something in place x changes the structure by (e.g.) adding the dockey, then place y wont break in really weird ways because of the string parsing hacks it does to get the collection id based on the number of '/' present
- It is now much more obvious what is being used and where, instead of having to track back through the object's lifetime and navigate through the ds.Namespace()/Base/Split/whatever you can see that x is the CollectionId with IndexId
- The format of string output is much more visible and you can now see what is being written/read
- Removes the db/namespace from the vars being passed through the codebase, forcing the datastores to be responsible for managing that and reducing the 'stuff' the devs see when debugging. I also found that their inclusion within the string-key was partial and very dependent on where you were in the codebase, not all places had them and some broke when were given them.
It also keeps all the scary stuff in the one place, where it is very visible, and tested. There is a risk that we'll accidentally call these functions when we shouldn't (i.e. when there is no reason to, and we can manually set stuff), but it is better than having poorly documented structured-strings floating round all over the place :P
These are all pretty valid reasons :). And the benefits are fairly clear.
I haven't checked yet, but does this play well with the actual data stores?
When you call txn.headstore.get(/my-doc-key) the 'headstore' references a
"Wrapped data store" from the go-datastore package. Which automatically
handles prefix wrapping and stripping for the "/db/heads" prefix.
Same applies to /db/data, /db/blocks, and /db/system.
…On Mon, Nov 8, 2021, 9:17 AM AndrewSisley ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In core/key.go
<#19 (comment)>:
> +}
+
+// Creates a new DataStoreKey from a string as best as it can,
+// splitting the input using '/' and ':' as field deliminaters. It assumes
+// that the input string is in one of the following formats:
+//
+// [DocKey]
+// [DocKey]:[InstanceType]
+// [DocKey]/[FieldId]
+// [DocKey]/[FieldId]:[InstanceType]
+// Any of the above prefixed but with one of the below:
+// [PrimaryIndexId]/
+// [CollectionId]/[PrimaryIndexId]/
+//
+// Any properties before the above (assuming a '/' deliminator) are ignored
+func NewDataStoreKey(key string) DataStoreKey {
Would be very much in favour of merging lol :) But I can see where you are
coming from - I spent quite a while trying to get rid of as many prod-code
references to those constructors as possible.
Those constructors are only called in a small number of places (ignoring
tests and the deprecated get.go, 3 for NewDataStoreKey, and once for the
HeadStoreKey. And those are all on direct output from the datastore where
the keys should be 100% complete. Everything else is manually built.
I think even if these were more widely used (which they shouldn't be), the
review still solves the following:
1. Code is no longer order dependent - e.g. if something in place x
changes the structure by (e.g.) adding the dockey, then place y wont break
in really weird ways because of the string parsing hacks it does to get the
collection id based on the number of '/' present
2. It is now much more obvious what is being used and where, instead
of having to track back through the object's lifetime and navigate through
the ds.Namespace()/Base/Split/whatever you can see that x is the
CollectionId with IndexId
3. The format of string output is much more visible and you can now
see what is being written/read
4. Removes the db/namespace from the vars being passed through the
codebase, forcing the datastores to be responsible for managing that and
reducing the 'stuff' the devs see when debugging. I also found that their
inclusion within the string-key was partial and very dependent on where you
were in the codebase, not all places had them and some broke when were
given them.
It also keeps all the scary stuff in the one place, where it is very
visible, and tested. There is a risk that we'll accidentally call these
functions when we shouldn't (i.e. when there is no reason to, and we can
manually set stuff), but it is better than having poorly documented
structured-strings floating round all over the place :P
—
You are receiving this because your review was requested.
Reply to this email directly, view it on GitHub
<#19 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AA6UKNUVRBVTGSXW5AQHCILUK7LWVANCNFSM5G7VMXSQ>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
Code relies on the datastore wrapping - I paid quite a lot of attention to the db dumps during testing to make sure I wasn't losing that.
|
OK, I'm inclined to go with this approach, however with this much string manipulation/allocation in the hot path, I want to get at least a rudimentary benchmark suite in place to ensure everything is good without notable (or ideally any) perf degredation. |
Okay fair enough :) I'll probs get cracking on that (after I finish the group stuff) if you don't beat me to it :D |
Closing this PR as Github doesnt seem to allow me to change the target branch once a PR is open. Will open a new one against develop. |
Replace the string based core.Key and ds.Key types with strongly typed data structures.
Namespacing is now entirely handled by the datastores and should not be present on the keys (i.e. no leading '/db/data').
Fixes the factory_test.go tests, only failing tests are failing on master too (Test_Generator_buildTypesFromAST_SingleScalarField, TestMutationParse_Update_Simple_Array and TestMutationParse_Create_Error_Missing_Data).
Database dumps seem to pair up with master (manually checked a few of the db/tests, and my local file-based instance).
Performance doesn't seem to have taken a hit (checked manually firing in queries via UI client, might even be improved due to removal of a few string manipulations - query time was 250-680 micro seconds), but maybe have an eye out for anything silly I may have done in the planner in particular.
System will likely need further extension in the time-traveling branch, and when implementing the non-primary index stuff.