Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Signposting on Dataverse 5.9 #8424

Closed
wants to merge 80 commits into from
Closed

Conversation

vicding-mi
Copy link
Contributor

@vicding-mi vicding-mi commented Feb 15, 2022

What this PR does / why we need it:
This PR adds signposting to Dataverse. Signposting enables easier machine access to datasets and hence increases accessibility.
Which issue(s) this PR closes:
N/A
Closes #5962
N/A
Special notes for your reviewer:

Suggestions on how to test this:
This PR does not change any human readable part of Dataverse.

curl -I http://linkto.dataverse/dataset to see the signposting L1 header

Link to L2 linkset can be found in L1 header and can be directly viewed in browser

Does this PR introduce a user interface change? If mockups are available, please link/include them here:
No
Is there a release notes update needed for this change?:
in folder doc/release-notes/8424-signposting.md
Additional documentation:

Vic Ding and others added 30 commits November 4, 2020 12:32
* uses external js
* adding signposting configuration setting
* signposting poc from eko
* error fix
* move the signposting location in code
* Signposting with MicroSettings cleanup v1 xhtml revert to vanilla
* clean up
* bug fix and add default file type; TODO: Bug on the baseurl (empty string)
* hide type when do not use default
* set default when no config is set for signposting
* modification according to reviews
* move long json string from code to bunddle
* allow empty config on the level 2 profile

Co-authored-by: Kevin Condon <[email protected]>
Co-authored-by: ekoi <[email protected]>
@vicding-mi vicding-mi requested a review from qqmyers February 22, 2022 14:56
@vicding-mi
Copy link
Contributor Author

hey guys, not sure how to carry on the process/discussion. Can @qqmyers please give some guidance? I changed according to the review some of the points, while for the other points I tried to answers the review questions and ask for suggestions. Many thanks!

@pdurbin pdurbin self-assigned this Mar 7, 2022
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.05%) to 18.816% when pulling 5fe6a8a on vicding-mi:signposting-5.9 into 0d853b7 on IQSS:develop.

Copy link
Member

@qqmyers qqmyers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to add some feedback but it looks like there are still some open questions/change requests too. I'll contact Herbert about which metadata file(s) should be referenced.
Most of the discussion is about what should be configurable and if it's worth refactoring some parts. Neither are things that I'd hold this up for so I think when the other issues are resolved.

"CC0": "https://creativecommons.org/licenses/cc0/"
},
"describedby": {
"doi": "https://doi.org/",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GlobalId class provides a toURL method that uses public strings and will assign the doi or handle version according the the ID type.
So I think you could just use dataset.getGlobalIdentifier().toUrl() where you need the url.

"defaultFileTypeValue": "https://schema.org/Dataset",
"maxItems": 5,
"maxAuthors": 5
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the :Signposting setting you're giving people a very easy way to change all of these parameters (or to put '{}' as the value - not sure if that just turns off signposting or breaks the dataset page).

In general, I'm OK with leaving in any parameters you think are needed. My concern is really that by exposing these here you're not just making it possible for Herbert to change them but everyone else as well. If I as a Dataverse installation admin change the type from application/vnd.citationstyles.csl+json or use a different filetypevalue than schema.org/Dataset, would I still be 'Signposting-compliant'? The ones that make the most sense to me are the maxItems/maxAuthor ones - a local admin might want to show 10 authors which would still be valid. The rest seem more like things that should be constants in the code, or only read from the properties file if there has to be a way to change them without recompiling code.

That might suggest that just having a :Signposting setting that lets you change maxItems/maxAuthors (defaulting to 5 if you can't parse) and doesn't otherwise let you mess-up/disable Signposting overall might be better.

As far as I'm concerned, it's up to you whether this makes sense - I doubt many people will even touch the setting since it works by default anyway.

* `maxAuthors` Same with `maxItems`, `maxAuthors` sets the max number of authors to be shown in `level 1` profile.
If amount of authors exceeds this value, no link of authors will be shown in `level 1` profile.

Note: Authors without author link will not be counted nor shown in any profile/linkset.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully @pdurbin has a suggestion as to where in the guides would be best.

One thing I'll note about the above is that it seems to mix things I can change (e.g. maxAuthors) with info about signposting and fields like AboutPage that I can't change. It might be worthwhile to just give an example level 1 header and then just talk about the parts you might want to change, or at least indicate what is changeable when you list fields.

} else {
//TODO: if a user has access to this file, they should be given the swift url
// perhaps even we could use this as the "private url"
fileDownloadUrl = swiftIO.getTemporarySwiftUrl();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think URLs like http://localhost:8080/api/access/datafile/15 would be best practice. I don't think you get that from the swiftIO.getTemporarySwiftUrl() call. That download URL - /api/access/datafile - comes from the api/Access.java class where I think you could add a header. (Similarly, I think you could respond to a HEAD instead of GET there as well).

Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vicding-mi thanks for this pull request! I'm excited for Signposting support in Dataverse. I haven't had a chance to do a proper review yet but I'm at least leaving a comment about the config that's been on my mind.

* The level 2 linkset can be fetched by visiting the dedicated linkset page for
that artifact. The link can be seen in level 1 links with key name `rel="linkset"`.

The configuration is stored as JSON string in the `Bundle.properties` file, key name is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the config in Bundle.properties? I see elsewhere that @qqmyers has suggested a database setting called :Signposting, which sounds like a better approach to me.

@vicding-mi
Copy link
Contributor Author

Hi @pdurbin and @qqmyers, many thanks for your comment.
bundle file is the solution I got from the proof of concept, so I sticked to it. Since db is a better place for storing config, I will change it and update you guys.

@qqmyers , thanks for your review, will revise accordingly.

@pdurbin
Copy link
Member

pdurbin commented Mar 10, 2022

@vicding-mi great! Thanks! Please feel free to create as many database settings as you need. I think the record is all the "PV" settings: :PVMinLength :PVMaxLength :PVNumberOfConsecutiveDigitsAllowed :PVCharacterRules :PVNumberOfCharacteristics :PVDictionaries :PVGoodStrength:PVCustomPasswordResetAlertMessage 😄

pdurbin added a commit that referenced this pull request Mar 10, 2022
Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments and some doc suggestions in the form of a pull request.

* The level 2 linkset can be fetched by visiting the dedicated linkset page for
that artifact. The link can be seen in level 1 links with key name `rel="linkset"`.

The configuration is stored as JSON string in the `Bundle.properties` file, key name is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the config in Bundle.properties? I see elsewhere that @qqmyers has suggested a database setting called :Signposting, which sounds like a better approach to me.


There are 2 Signposting profile levels, level 1 and level 2. In this implementation,
* level 1 links are shown in
HTTP header, which can be fetched by `curl -I https://domain/link-to-article`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would link-to-dataset make more sense than link-to-article?

* level 1 links are shown in
HTTP header, which can be fetched by `curl -I https://domain/link-to-article`.
* The level 2 linkset can be fetched by visiting the dedicated linkset page for
that artifact. The link can be seen in level 1 links with key name `rel="linkset"`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The term "artifact" might be confusing here. Are we talking about a dataset?

* `maxAuthors` Same with `maxItems`, `maxAuthors` sets the max number of authors to be shown in `level 1` profile.
If amount of authors exceeds this value, no link of authors will be shown in `level 1` profile.

Note: Authors without author link will not be counted nor shown in any profile/linkset.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no FAIR section but I just created vicding-mi#1 to suggest creating a "Discoverability" page in the Admin Guide. @vicding-mi if you like this, please feel free to merge it and add more docs. Thanks.

Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, great pull request. I'm excited to have Signposting support in Dataverse!

One thing I want to make clear from the start is that core developers can probably jump in and hack on the code and docs to get this over the finish line. Please just let us know.

As I mentioned at #8424 (comment) I made a pull request (into this one) to suggest where to document Signposting.

I also left some other feedback about a <null> I'm seeing, about tests, agreeing with Jim about the ok method, etc.

* @param bld
* @return HTTP OK response which contains the json structure of linkset
*/
protected Response okLinkset( JsonArrayBuilder bld ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree. Instead of a proliferation of "ok" methods like okLinkset, it would be better to delete it from this PR and use ok( JsonObjectBuilder bld ) method like this:

old:

return okLinkset(JsonPrinter.jsonLinkset(new SignpostingResources(systemConfig, dsv, signpostingConf)));

new:

return ok(Json.createObjectBuilder().add("linkset", JsonPrinter.jsonLinkset(new SignpostingResources(systemConfig, dsv, signpostingConf))));

}
}

logger.info(String.format("identifierSchema: %s", identifierSchema));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be lowered to logger.fine.

visibleAuthorCounter++;
// return empty if number of visible author more than max allowed
if (visibleAuthorCounter >= maxAuthors) return "";
singleAuthorString = "<" + authorURL + ">;rel=\"author\"";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why but I'm getting null for authorURL like this (I'm on 5fe6a8a):

Link: <null>;rel="author", <https://doi.org/...

This is just a local dataset in dev. Here's the curl command:

curl -s -i http://localhost:8080/dataset.xhtml?persistentId=doi:10.5072/FK2/KQRKB8

<c:if test="#{not empty DatasetPage.getSignpostingLinkHeader()}">
<f:event type="preRenderView" listener="#{facesContext.externalContext.response.setHeader('Link', DatasetPage.getSignpostingLinkHeader())}" />
</c:if>
<!-- End add Signposting-->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaving this comment here at the end because this is where we usually have tests. Can some tests be written for this code?

@pdurbin pdurbin assigned vicding-mi and unassigned pdurbin Mar 11, 2022
@pdurbin pdurbin mentioned this pull request Apr 5, 2022
qqmyers pushed a commit to GlobalDataverseCommunityConsortium/dataverse that referenced this pull request Aug 15, 2022
qqmyers pushed a commit to GlobalDataverseCommunityConsortium/dataverse that referenced this pull request Aug 15, 2022
@qqmyers qqmyers mentioned this pull request Sep 16, 2022
@qqmyers
Copy link
Member

qqmyers commented Sep 16, 2022

Closing in favor of #8981 which builds on the commits here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Signposting support in Dataverse
7 participants