Skip to content

Conversation

@dobertRowneySr
Copy link
Collaborator

@dobertRowneySr dobertRowneySr commented Sep 22, 2021

as #2677.
Query node mappings are left to @Lezek123

@shamil-gadelshin shamil-gadelshin changed the base branch from storage_v2 to giza October 6, 2021 15:22
@mnaamani
Copy link
Member

mnaamani commented Oct 7, 2021

I think in the Sumer release we were a little over ambitious about what features in the new content directory we were going to implement.

The only type that was modified and is still in use after these changes is the struct Video {} type (or VideoRecord alias).

Two related changes must be done
1 . in the on_runtime_upgrade() clear the VideoById storage map which we are intending to do anyway as part of content migration step.
2. update the Video type definition in joystream/types package to reflect removal of the field in_series

Copy link
Member

@mnaamani mnaamani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only minor merge conflict needed, otherwise LGTM.
Left instructions on related changes needed. #2680 (comment)

@bedeho
Copy link
Member

bedeho commented Oct 7, 2021

in the on_runtime_upgrade() clear the VideoById storage map which we are intending to do anyway as part of content migration step.

First off, I think clearing is a linear time operation in number of mappings, so that may not work with the number of videos we have. It could also be one of those issues that work in a small staging network, but with 100 validator real network it possibly may not.

Second, could a simpler fix here be to just change the name of the map in the new content directory? There is actually a good case for going away from the concept of calling stuff in channels Video, but rather Content, allowing us to call the map ContentById, as suggested here #2739. This idea is perhaps a bit risky to do now, just putting it out there.

@mnaamani
Copy link
Member

mnaamani commented Oct 8, 2021

First off, I think clearing is a linear time operation in number of mappings, so that may not work with the number of videos we have. It could also be one of those issues that work in a small staging network, but with 100 validator real network it possibly may not.

Agreed. it is risky to do a time consuming operation like clearing the map entries explicitly.

Second, could a simpler fix here be to just change the name of the map in the new content directory? There is actually a good case for going away from the concept of calling stuff in channels Video, but rather Content, allowing us to call the map ContentById, as suggested here #2739. This idea is perhaps a bit risky to do now, just putting it out there.

Yes that would be very neat way, and we have used this approach before when updating from one version of our content data_directory to the next (when we changed fields of a DataObject). We would also have to remember to reset the next_video_id back to 0. This renaming trick is only practical to do once of course when we settle on the new name.

A more general approach, if we expect to want to clear content again in another upgrade is to add a new pointer into the map. Currently we have one pointer (to the end of the map) the next_video_id and the first video id is implicitly 0. If we introduce a new first_video_id = Option<VideoId> then when we upgrade and want to "clear" the map, we just set first_video_id to None, next_video_id remains unchanged. All extrinsics will need to add one extra check that the id being referenced is in the "correct" range.

One more simpler way is to just delete the first entry, and set next_video_id to 0 and just add checks the id is less than next_video_id.

For both approaches the impact on disk storage is the same. We can never really recover the on disk storage usage. But in the former approach the chain state size is reduced.

@mnaamani
Copy link
Member

mnaamani commented Oct 8, 2021

There is a problem actually with my suggested approach of just updating the "pointers". The entries in the map still exist, which means if runtime code does VideoById::exists() are not reliable, iteration on the map which we shouldn't do anyway are not safe, and over RPC getting all the keys or entries of a map will be inconsistent, so applications if reading state from the map would have to also be constrained to reading values in the correct range.

We should look to see if there is a built in api method on the map to clear it that is O(1), or just rename the map as you suggested.

@dobertRowneySr
Copy link
Collaborator Author

dobertRowneySr commented Oct 8, 2021

In order to solve the on_chain_upgrade problem I would proceed as follows:
Change from VideoById to ContentById. So the map clearing on on_chain_update should not be needed.
In a future release we (I) can update the names VideoRecord and all the video-extrinsics, and all the relevant fields. In order to realize the idea outlined in #2739.
Sounds ok @bedeho @mnaamani ?

@mnaamani
Copy link
Member

mnaamani commented Oct 12, 2021

In order to solve the on_chain_upgrade problem I would proceed as follows: Change from VideoById to ContentById. So the map clearing on on_chain_update should not be needed. In a future release we (I) can update the names VideoRecord and all the video-extrinsics, and all the relevant fields. In order to realize the idea outlined in #2739. Sounds ok @bedeho @mnaamani ?

I think that is acceptable.

On thing we can ask also is should we in on_runtime_update() reset NextVideoId value to 0? I don't think it is actually necessary for runtime code to work correctly, nor for client applications, as long as there is no code that assumes the first value in a map has id of 0. Keeping it unchanged would make it possible to do some basic static redirection or mapping from old videos to new ones in Atlas since we would not be reusing the same id values. reference: #2732 (comment)

@mnaamani
Copy link
Member

We should look to see if there is a built in api method on the map to clear it that is O(1), or just rename the map as you suggested.

Would remove_all() work? https://substrate.dev/rustdocs/latest/frame_support/storage/types/struct.StorageMap.html#method.remove_all

@bedeho
Copy link
Member

bedeho commented Oct 12, 2021

We should look to see if there is a built in api method on the map to clear it that is O(1), or just rename the map as you suggested.

I have explicitly asked about this, and Shawn@Parity said he thought that was impossible in principle. Now I think that is a bit strong, but that almost certainly means it's not possible in Substrate.

@mnaamani
Copy link
Member

I have explicitly asked about this, and Shawn@Parity said he thought that was impossible in principle. Now I think that is a bit strong, but that almost certainly means it's not possible in Substrate.

I got the same feedback. remove_all() is O(n).

It should be okay for deleting the ChannelById map since that is (at time of writing) only 500 items (compared with approx 8000 items in VideoById) on current network.

We will have to test on staging network. But it is still risky to assume we will get same results in production.

@bedeho
Copy link
Member

bedeho commented Oct 12, 2021

We will have to test on staging network. But it is still risky to assume we will get same results in production.

Exactly, can we really do this if we dont know it wont be a disaster? Perhaps we could ask someone whether 500 database writes is plausible on standard hardware?

@mnaamani
Copy link
Member

We will have to test on staging network. But it is still risky to assume we will get same results in production.

Exactly, can we really do this if we dont know it wont be a disaster? Perhaps we could ask someone whether 500 database writes is plausible on standard hardware?

I can ask.

One final approach can be to rename the entire module in the runtime implementation itself from Content to Content2 here thereby clearing the entire state of the module. This is not very pretty because it affects the api name space so javascript applications would have to change from api.query.content.nextChannelId() to api.query.content2.nextChannelId(), the query-node manifest would have to updated, code in Atlas. If this will be a repeated occurance, I'm sure we can refactor code to at least use a pattern like api.query[CONTENT_MODULE_NAME].nextChannelId() to make subsequent name change easier.

@dobertRowneySr dobertRowneySr changed the base branch from giza to giza_staging October 20, 2021 08:09
@dobertRowneySr
Copy link
Collaborator Author

Sorry for the confusion. Here's what I did:

  • Agreed with @mnaamani to rebase on giza-staging and change base branch on this PR
  • The map will remain VideoById on this PR and I will change it to ContenById on the next PR dealing with the map clearing
  • updated types have been generated

@mnaamani mnaamani marked this pull request as draft December 21, 2021 10:32
@dobertRowneySr
Copy link
Collaborator Author

Closing in favor of #3137

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants