Skip to content

Update linkset model with spec metadata#1900

Merged
sonalkr132 merged 1 commit into
rubygems:masterfrom
sonalkr132:metadata-linkset
Mar 12, 2019
Merged

Update linkset model with spec metadata#1900
sonalkr132 merged 1 commit into
rubygems:masterfrom
sonalkr132:metadata-linkset

Conversation

@sonalkr132
Copy link
Copy Markdown
Member

We fall back to linkset when spec metadata is absent
(version.metadata["documentation_uri"].presence || linkset&.docs&.presence).
It keeps aside links populated for gems which don't see version updates.
This also meant that gem releases which have not set these links in spec
metadata, will also fallback to old linkset data. Intended behaviour
for not setting these links (for gem releases henceforth) should be that
we do not show the unset/empty links.

Closes: #1899
cc: @kbrock

@sonalkr132 sonalkr132 added this to the Metadata milestone Feb 3, 2019
Copy link
Copy Markdown

@grosser grosser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks alright, thx!

@sonalkr132
Copy link
Copy Markdown
Member Author

I think we need more opinion/clarification on this. Users (as is the case with grosser) may not know that they need to use metadata for links to be set. Putting a flash message on the web page is not among the nicest of migration strategies.
As for #1899, we could create a web page just for deletion of the legacy linkset values.

@kbrock
Copy link
Copy Markdown
Contributor

kbrock commented Feb 7, 2019

Do we want to convert metadata to an all or nothing type thing?

I assume when people learn about the metadata, they add all metadata fields instead of selectively choosing one or two fields. This is total speculation, but possibly verifiable.

We could change all our handling to treat version.metadata.present? as a switch between the 2 metadata sources.

- version.metadata["documentation_uri"].presence || linkset&.docs&.presence
+ if version.metadata.present?
+   version.metadata["documentation_uri"].presence
+ else
+   linkset&.docs&.presence
+ end
  • I'm not sure if we need all the presence checks. It may be a little too cautious. It would read much better without it.
  • We may want to drop linkset records and use the gem's most recent version metadata as the fallback instead. This is assuming that

Copy link
Copy Markdown
Contributor

@kbrock kbrock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks nice as is.

Would be nice to not hardcode changelog in there.

Also, I guess I lean more towards getting rid of linkset rather than updating it.
But I haven't run the numbers yet to know adoption.

Comment thread app/models/linkset.rb Outdated
def update_attributes_from_gem_specification!(spec)
update!(home: spec.homepage)
Links::LINKS.except("changelog", "download").each do |short, long|
self.attributes = { short => spec.metadata[long] }

This comment was marked as outdated.

This comment was marked as outdated.

Comment thread app/models/linkset.rb Outdated
self.attributes = { short => spec.metadata[long] }
end

self.home = spec.homepage if spec.homepage

This comment was marked as outdated.

@kbrock
Copy link
Copy Markdown
Contributor

kbrock commented Feb 7, 2019

Sorry, I'm being verbose on this PR. Also, sorry for changing my mind.

The version.metadata will go stale over time. And I'm guessing we won't update those.
As @grosser saw, linkset goes stale over time. I'd prefer not to update that either.

At first pass, I really liked the idea of updating linkset, so it keeps the metadata up to date.
But now I'm leaning towards protecting us from stale linkset values with the goal of dropping the table.

Short term

only use linkset if version.metadata does not exist at a less granular level.
I reworded the code to make the if (and thus my intention) more obvious.

- # version.medata.field || linkset.field 
- version.metadata.field ? version.metadata.field : linkset.field
+ version.metadata ? version.metadata.field : linkset.field 

long term

When we do decide to drop linkset, I think using latest_version.metadata as fallback would be a good substitute.

- # version.medata.field || linkset.field 
- version.metadata.field ? version.metadata.field : linkset.field
+ (version.metadata || latest_version.metadata).field

The decision to drop linkset would come after a sufficient number of gems with linkset records also have version.metadata values in their most recent version record.

@sonalkr132
Copy link
Copy Markdown
Member Author

version.metadata ? version.metadata.field : linkset.field

A user setting spec.metadata doesn't mean they know about changes to link. metadata field has existed before we repurposed it for gem links. I guess we could be more specific and check if any of the *_uri keys are present. It still won't solve the issue of users wanting to remove an outdated link from their latest version page, without being force to migrate to metadata specification. To quote your own words:

Since the data previously populated is staying in the tables, nothing is going to break or require change. Only developers who want to make their gemspec more rich will need to add these fields. (source)

I would like to meet this expectation and not impose any expectation on users. We tend to consider versions and its data immutable once pushed to our server, showing legacy linkset values on older versions indefinitely is fine.
I proposed changes in this PR with goal of meeting above mentioned expectation. As you can tell, updating all columns of linkset will mean, users will end up deleting links unintentionally. Alternatively, we could expose an API/UI endpoint for linkset deletion or come up with special syntax (empty string, nil etc as value) or something more elegant.

@kbrock
Copy link
Copy Markdown
Contributor

kbrock commented Feb 11, 2019

metadata field has existed before we repurposed it for gem links.

Ugh. Very nice point.

I guess once we removed the edit form for links, I assumed we were on a path to drop the links table.

I like being conservative here. Most(?) gems are maintained by volunteers. So reducing work on them is preferable. But it is nice to have an end goal. Even if it is a multi year and possibly unrealistic plan.

Have you run numbers to see how many gems are at play here? Have that many gems transitioned over to metadata? Do that many gems still rely upon the links table for current version urls?

I'll have to download a recent copy of the database and get some numbers. It would probably add a better context to my aspirations.

Maybe it is time to send out PRs to repos to request adding urls to gemspec metadata. I think @andrew did this from a script a little while ago.

@sonalkr132
Copy link
Copy Markdown
Member Author

Have you run numbers to see how many gems are at play here? Have that many gems transitioned over to metadata?

5% of gems which were updated in last year has used this (source). We are due a blog explaining rational for this feature.

I assumed we were on a path to drop the links table.

linkset table has data we can't replace. I would argue our site is better for the fallback than using latest version. A good number of gems will never see latest version release just fixing links.

We are due a blog post explaining this feature.

@colszowka
Copy link
Copy Markdown

colszowka commented Feb 25, 2019

For Ruby Toolbox I use gem's references to their GitHub repos to make the link and display github stats accordingly, so the availability of a github reference on the gemspec / rubygems.org API is very important for Ruby Toolbox's data quality. Today I wrote a little guide for project maintainers from the Toolbox's perspective, and I tried to emphasize to maintainers that having this in place is very relevant. During my research for this I realized that URLs web UI is not available anymore and the metadata way is the way to go.

From what I've seen when speaking to maintainers in recent years I think that spec metadata links are still rather unknown (which seems to be backed up by the numbers mentioned by @sonalkr132 above) and rarely used, so it would be really great to have some blog post / additional guides to get the word out to gem maintainers that this is how it's supposed to be done now (even rails does not use them)

Because of this, please do not drop legacy links if there are no existing links in the metadata, it would break the Ruby Toolbox's github linking for many gems :) However I believe it's fair to assume that if a gem has links in the metadata the legacy data can be ignored, I can't really think of maintainers being aware of this metadata links feature and then deciding to keep some things here and some things there :)

@grosser
Copy link
Copy Markdown

grosser commented Feb 25, 2019

... another approach could be to only drop a link when it is explicitly set to mil

We are changing metadata links to all or nothing type. Having removed the
edit form, there was no way update sidebar links to remove values shown
from linkset. When user sets any of metadata uri attributes, it is
safe to assume they are migrating away from linkset.
@sonalkr132
Copy link
Copy Markdown
Member Author

sonalkr132 commented Mar 4, 2019

Do we want to convert metadata to an all or nothing type thing?

I believe it's fair to assume that if a gem has links in the metadata the legacy data can be ignored

I have updated my PR to do this. So yes, removal of metadata will take version release too and one would have to set at least one metadata uri attribute.

I also wrote this. @kbrock please feel free to suggest changes to it.

@sonalkr132 sonalkr132 merged commit 81c821f into rubygems:master Mar 12, 2019
@sonalkr132 sonalkr132 deleted the metadata-linkset branch March 12, 2019 17:58
@rubygems-deployer rubygems-deployer temporarily deployed to staging March 12, 2019 18:09 Inactive
@dwradcliffe dwradcliffe temporarily deployed to production March 14, 2019 01:35 Inactive
@sonalkr132
Copy link
Copy Markdown
Member Author

sonalkr132 commented Mar 25, 2019

removal of metadata will take version release too and one would have to set at least one metadata uri attribute.

This screws half the number of gems who adopted URI attributes and it includes quite popular gems. I guess we should exclude homepage_uri from the check? Does removing a legacy link need its own (awkward) guide now?

grosser added a commit to ambethia/recaptcha that referenced this pull request May 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants