Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Rework client.DB to ensure interface contains only public types #277

Merged
merged 16 commits into from
Mar 11, 2022

Conversation

AndrewSisley
Copy link
Contributor

Part of #200

Currently the interfaces on client/core.go are publicly dependent on a number of types from 'internal' packages. This PR begins to sort that out - (apart from the base.XYZDescriptions) the client.DB interface no longer is dependent on those types.

Will likely have at least two more PRs on this topic before ticket can be considered 'done' - need to sort out dockey.go (can be done whilst this is in review, and is why I put this up now as that will require more discussion perhaps). Then document.go and the descriptions need to be sorted out. Will need/want to document the interfaces/types properly after that as they currently have no godocs, and also select a location for integration tests (similar to the db/tests, but for the abi instead of query).

Suggest reviewing commit-by-commit, the rational behind each change should be described in each commit body.

@AndrewSisley AndrewSisley added area/collections Related to the collections system area/db-system Related to the core system related components of the DB refactor This issue specific to or requires *notable* refactoring of existing codebases and components labels Mar 7, 2022
@AndrewSisley AndrewSisley self-assigned this Mar 7, 2022
Copy link
Member

@jsimnz jsimnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, moving things in the right direction. Agree with most of the changes.

Have some questions about a few, for the sake of clarification. Had a possible alternative approach for the DSReaderWriter stuff, and lastly, noteable comment on the base package utility funcs.

client/core.go Outdated Show resolved Hide resolved
db/blockstore.go Outdated Show resolved Hide resolved
client/core.go Show resolved Hide resolved
db/db.go Outdated Show resolved Hide resolved
client/store.go Outdated
@@ -8,7 +8,7 @@
// by the Apache License, Version 2.0, included in the file
// licenses/APL.txt.

package core
package client
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to leave the DSReaderWriter interface under core instead of client.

Alternatively, we can move it over to our own datastore package, and revamp that package to house our datastore interfaces, iterable txn stuff, and sub packages (like badger) for implementaton specific stuff.

client package is ideally designed for the consumer side of the interface flow., where as DSReaderWriter is very much an internal abstraction. Note: "internal" here doesn't explicitly mean private, but more about the internals of the DB.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolving this, as it is an extension of the comment/discussion in versioned.go (where I commented "A public interface is pointless in my opinion if the types it exposes aren't in it....")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im not sure the discussion from the versioned.go thread applies here for the DSReaderWriter interface. Especially since the core package is already used for the core.Key structured stuff.

Comment on lines 405 to 433
func getIndexDocKey(c *base.CollectionDescription, key core.DataStoreKey, indexID uint32) core.DataStoreKey {
return core.DataStoreKey{
CollectionId: c.IDString(),
IndexId: key.IndexId,
}.WithInstanceInfo(key)
}

func getPrimaryIndexDocKey(c *base.CollectionDescription, key core.DataStoreKey) core.DataStoreKey {
return getIndexDocKey(c, key, c.Indexes[0].ID)
}

func getFieldKey(c *base.CollectionDescription, key core.DataStoreKey, fieldName string) core.DataStoreKey {
if !c.Schema.IsEmpty() {
return key.WithFieldId(fmt.Sprint(c.Schema.GetFieldKey(fieldName)))
}
return key.WithFieldId(fieldName)
}

func getPrimaryIndexDocKeyForCRDT(c *base.CollectionDescription, ctype core.CType, key core.DataStoreKey, fieldName string) (core.DataStoreKey, error) {
switch ctype {
case core.COMPOSITE:
return getPrimaryIndexDocKey(c, key).WithFieldId(core.COMPOSITE_NAMESPACE), nil
case core.LWW_REGISTER:
fieldKey := getFieldKey(c, key, fieldName)
return getPrimaryIndexDocKey(c, fieldKey), nil
}
return core.DataStoreKey{}, errors.New("Invalid CRDT type")
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This screams like a bad idea to me. These are pretty generic utility functions, and have the high probability of being used in several locations.

I see the thought process here. But they really should be left in the base package.

If the problem is the influence it has to the client.Collection interface. they can be removed.

So instead of calling collection.GetPrimaryIndexDocKeyForCRDT we can just call collection.Description().GetPrimaryIndexDocKeyForCRDT.

In general the base package is highly implementation specfic, so its OK for it to leak implementation details.

I agree it should be removed from the client interfaces, but should be kept in base.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A public interface is pointless in my opinion if the types it exposes aren't in it. This is a large problem for me atm as it is impossible to tell what is public (and thus what can be changed, what needs integration tests, what needs solid documentation, and what is dead code vs pub feature). It is also ugly for consumers if they have to explicitly import several packages to consume the abi. Also leads to confusing stuff like the mutexes in document.go.

In later PRs I was very much planning on moving the Description structs over to the client - making the client package the only 'public' entry point for defra. It would then be very clear as to what is public and what is not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a dozen reasons why a given interface returns types not defined within that interface. Not sure what you mean by what is public and what isn't. Technically its all public (exported), think of client package is the primary entry point for consumers, not the only object to touch.

You will naturally have to import different packages to consume a given API. However, for the client.Collection interface. Although, it exports types like base.CollectionDescription you as a consumer don't have to import base explicitly. You can just use it freely ie: calling collection.Description() if needed.

Unless we specifically starting moving things under the internal/ package (go's way of protecting internal packages from external consumption).

Also not following the document mutexes, those are just there to stop the consumer from shooting themslves in the foot, since no where in the DB are documents used concurrently.

The whole client vs core interface decleration doesn't necessarily have to do with private vs public, but implementation interfaces vs consumer interfaces.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Im only referencing keeping this interface stuff, and leaving the base.GetXYZ calls in place. I don't even think it touches the interface since base only deals with concrete types.

The only place it does need to change is removing those same util funcs from the Collection, which im totally fine with.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, they could be defined in some util package, but im usually weary of generic util packages, which is why these funcs were defined in the base package and defined as methods instead of funcs

Copy link
Contributor Author

@AndrewSisley AndrewSisley Mar 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I have mis-communicated my primary point/concern, if we want to expose the function foo below:

fn Foo(bar Bar) FooBar

Then we absolutely have to make sure that any public members of Bar and FooBar are fully documented and fully tested. We also have to make sure that any changes to Bar and FooBar respect semantic versioning (e.g. no additions in a path release, no modification/removals in a minor).

If Bar and FooBar are not in a very very obvious location, both internal and and external developers will at some point fuck up and break things. Internal (Source) devs will make breaking changes when they shouldn't, and will be less stringent with testing and documentation. External developers will struggle to find stuff, and may find stuff with out-of-date documentation, they will be exposed to accidental breaking changes when updating between patch/minor versions.

This also applies to any Types of public members on Bar and FooBar - recursively.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Mar 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke about this in standup, base/description leakage can be addressed later, and leaky functions can be moved from Collection to CollectionDescription in the short-term.

private functions here (getPrimaryIndexDocKeyForCRDT etc will be moved somewhere (internally) shared.

  • Move getPrimaryIndexDocKeyForCRDT
  • Move leaky functions onto description types (make sure you check the resolved comments)

db/base/descriptions.go Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Mar 10, 2022

Codecov Report

Merging #277 (b81b076) into develop (0faf80f) will increase coverage by 0.00%.
The diff coverage is 74.31%.

❗ Current head b81b076 differs from pull request most recent head 958c2e7. Consider uploading reports for the commit 958c2e7 to get more accurate results

Impacted file tree graph

@@           Coverage Diff            @@
##           develop     #277   +/-   ##
========================================
  Coverage    63.30%   63.30%           
========================================
  Files           84       83    -1     
  Lines         9254     9230   -24     
========================================
- Hits          5858     5843   -15     
+ Misses        2779     2771    -8     
+ Partials       617      616    -1     
Impacted Files Coverage Δ
api/http/api.go 0.00% <0.00%> (ø)
datastore/badger/v3/compat_logger.go 0.00% <ø> (ø)
datastore/badger/v3/datastore.go 33.17% <ø> (ø)
datastore/badger/v3/iterator.go 51.57% <ø> (ø)
datastore/iterable/iterable_transaction_shim.go 64.58% <ø> (ø)
db/schema.go 34.14% <0.00%> (-3.07%) ⬇️
document/encoded.go 64.44% <ø> (-1.15%) ⬇️
merkle/crdt/merklecrdt.go 26.08% <ø> (ø)
query/graphql/planner/type_join.go 68.01% <25.00%> (+0.19%) ⬆️
db/collection_update.go 42.10% <42.85%> (ø)
... and 32 more

@AndrewSisley AndrewSisley force-pushed the sisley/refactor/I200-client-vs-core branch 3 times, most recently from b2d1841 to 12fb71f Compare March 10, 2022 16:47
@jsimnz
Copy link
Member

jsimnz commented Mar 10, 2022

Im good with everything in here except the DSReaderWriter flow.

If you feel strongly about leaving it where it is in the client package, im happy to support you. But wanted to give my 2 cents

Previous fn name was a bit ambigious, particularly due to the existance of GetCollectionBySchemaID, which has the same method signature.
Exposes a large amount of internal generator types and is not nice on a public interface
Pub fn is an oddity that might be removed, this is one of two refs to it
Removes the defra-only DagStore type/name from the db interface - blockstore concept appears to be larger than defra, and there seems no sensible way to remove it from the interfaces (p2p lib requires one).  If Cid and Block should be considered standard, then blockstore should be too.
Is a duplicate of the public PrintDump
None of these need to be public.  Leaves us with the (standard-typed, standard-concept) Root and Block stores only.
txn is dependent on store, and txn is on the public db+collection interfaces
Is not used, and creates extra dependency headaches when moving public items into client package.
@AndrewSisley AndrewSisley force-pushed the sisley/refactor/I200-client-vs-core branch from 12fb71f to 121e65a Compare March 10, 2022 22:17
Copy link
Member

@jsimnz jsimnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome work! Like the inclusion of Txn too, keeps everything logically together.

@AndrewSisley AndrewSisley merged commit 764be70 into develop Mar 11, 2022
@AndrewSisley AndrewSisley deleted the sisley/refactor/I200-client-vs-core branch March 11, 2022 02:31
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
…ypes (sourcenetwork#277)

* Rename var to collection name

Is clearer

* Rename GetCollection=>GetCollectionByName

Previous fn name was a bit ambigious, particularly due to the existance of GetCollectionBySchemaID, which has the same method signature.

* Remove schemaManager from DB interface

Exposes a large amount of internal generator types and is not nice on a public interface

* Use internal multistore instead of public fn

Pub fn is an oddity that might be removed, this is one of two refs to it

* Use blockstore over dagstore for public interfaces

Removes the defra-only DagStore type/name from the db interface - blockstore concept appears to be larger than defra, and there seems no sensible way to remove it from the interfaces (p2p lib requires one).  If Cid and Block should be considered standard, then blockstore should be too.

* Remove printDebugDb

Is a duplicate of the public PrintDump

* Remove public XYZStore accessors

None of these need to be public.  Leaves us with the (standard-typed, standard-concept) Root and Block stores only.

* Move store and txn interfaces into client

txn is dependent on store, and txn is on the public db+collection interfaces

* Reorder the DB interface

* Remove unused sequence interface

* Remove schema from document

Is not used, and creates extra dependency headaches when moving public items into client package.

* Remove blockstore.GetBlock

* Rename datastores to datastore

* Move store package to datastore dir

* Move store.go into datastore package

* Remove unwanted comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/collections Related to the collections system area/db-system Related to the core system related components of the DB refactor This issue specific to or requires *notable* refactoring of existing codebases and components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants