Extract software metadata from the web (service endpoints and/or webpages) #92
Comments
It may be worth identifying whether there are already CLARIAH services and websites that make their tool metadata available in other harvestable ways (i.e. published by the web endpoint itself, not by some higher-order registry). An important current example is CLAM, which is widely used for WP3 webservices and outputs metadata in its own XML format; I will make it output an OpenAPI Info block too (proycon/clam#32). Please comment if you know which metadata descriptions CLARIAH partners are currently using.
Should the type of service instance be documented with the software and/or be derived from the service definition as it is retrieved over HTTP by the harvester? Example: the fact that software |
I am indeed hoping that the type of the service can be automatically extracted, and once extracted I want to represent these webservices using the pending WebAPI proposal (schemaorg/schemaorg#2635, schemaorg/schemaorg#1423). The type of instance would fit their I must also add describing web services is still relatively low on the priority list. Describing the From the perspective of the harvester and the metadata it produces, I see the source code metadata as the primary representation. This
…h-plus#92) WebAPI still needs to be worked out in more detail
The harvesting pipeline currently being implemented (#33) is set up so that the source code is always the most authoritative place for holding software metadata descriptions.
However, there is a distinction between the software source code and service instances of that software, and the latter may add some metadata that is not applicable to the source as such. Instances are hosted on a particular URL and may have particular access limitations. We want to make that distinction explicit.
In the tool source registry for the harvester, we therefore provide the link to the source code alongside the web endpoints. The harvester first queries the source code repositories and converts the metadata found there to schema.org/codemeta's @SoftwareSourceCode, then queries the web endpoints and enriches the metadata in the way proposed in codemeta/codemeta#271.
How can websites and webservices provide metadata? I want to support the following for the harvester pipeline:
- A <script type="application/ld+json"> block, with @type any subclass of schema:SoftwareApplication or any of the other ones proposed in "Linking source code to software applications (entrypoints and service endpoints aka SaaS) with regard for interface type" (codemeta/codemeta#271), including schema:WebAPI and schema:WebPage.
- meta tags in the HTML head
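To make the two sources listed above concrete, here is a minimal sketch of how a harvester could pull both out of a fetched page, using only the Python standard library. It is not the actual harvester implementation: the names `MetadataExtractor`, `extract_metadata` and `ACCEPTED_TYPES` are hypothetical, and the set of accepted types is a small hand-picked subset for illustration. A real check for "any subclass of schema:SoftwareApplication" would need the schema.org class hierarchy.

```python
import json
from html.parser import HTMLParser

# Illustrative subset only (assumption); a real harvester would resolve
# subclasses of schema:SoftwareApplication from the schema.org vocabulary.
ACCEPTED_TYPES = {"SoftwareApplication", "WebApplication", "WebAPI", "WebPage"}


def local_name(type_value):
    """Reduce 'schema:WebAPI' or 'https://schema.org/WebAPI' to 'WebAPI'."""
    return type_value.rsplit("/", 1)[-1].rsplit(":", 1)[-1]


class MetadataExtractor(HTMLParser):
    """Collects JSON-LD script blocks and meta tags found in the HTML head."""

    def __init__(self):
        super().__init__()
        self.jsonld = []   # parsed JSON-LD objects
        self.meta = {}     # meta name/property -> content
        self._in_head = False
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            self._in_head = True
        elif tag == "script" and (attrs.get("type") or "").lower() == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []
        elif tag == "meta" and self._in_head:
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content") is not None:
                self.meta[key] = attrs["content"]

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "head":
            self._in_head = False
        elif tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                doc = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # skip malformed blocks rather than fail the harvest
            self.jsonld.extend(doc if isinstance(doc, list) else [doc])


def extract_metadata(html):
    """Return (JSON-LD entries with an accepted @type, meta tag dict)."""
    parser = MetadataExtractor()
    parser.feed(html)
    entries = []
    for item in parser.jsonld:
        if not isinstance(item, dict):
            continue
        types = item.get("@type", [])
        if isinstance(types, str):
            types = [types]
        if any(local_name(t) in ACCEPTED_TYPES for t in types):
            entries.append(item)
    return entries, parser.meta
```

The harvester would call `extract_metadata` on the body it retrieves from each web endpoint and then use the result to enrich the metadata already obtained from the source code repository.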