Deploy ga4gh/data-repository-service-schemas to github.com/ga4gh/data-repository-service-schemas.git:gh-pages
traviscibot committed Sep 11, 2024
1 parent 54b36a7 commit 2883f45
Showing 5 changed files with 58 additions and 4 deletions.
3 changes: 3 additions & 0 deletions pages/more-background-on-compact-identifiers/openapi.yaml
@@ -27,3 +27,6 @@ tags:
- name: Example DRS Client Compact Identifier-Based URI Resolution Process - Registering a new Compact Identifier for Your DRS Server
description:
$ref: ./tags/ExampleRegisterIdentifier.md
- name: Example How To Handle Extra Metadata for DRS Objects
description:
$ref: ./tags/DRSPlusDataConnect.md
5 changes: 5 additions & 0 deletions pages/more-background-on-compact-identifiers/tags/DRSPlusDataConnect.md
@@ -0,0 +1,5 @@
## DRS and Data Connect

With DRS objects it may be necessary to attach additional metadata to your objects. We believe that a change to the API of DRS to include metadata is not in the spirit of the DRS spec and in general DRS should have no knowledge of the metadata associated with the objects. We have found that a good GA4GH standard to support this is Data Connect (https://github.com/ga4gh-discovery/data-connect). The general approach would be to have a Data Connect service on your platform and to include "tables" with the ID matching your DRS ID for the same object. This means that if you have metadata associated with an object ID `abcd` (e.g., additional information about Compound Objects) all you need to do is request the information from the Data Connect client at `/tables/abcd/info`. There are optional functionalities of Data Connect, such as querying of tables, but we do not explore them or give any recommendations here.
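
As a rough illustration of the lookup described above, the sketch below builds the table-info URL for a DRS ID and fetches it. The base URL `https://dataconnect.example.org`, the function names, and the `opener` parameter are assumptions made for this example; the route is copied from the paragraph above and should be checked against the Data Connect specification and your own deployment before use.

```python
import json
from urllib.parse import quote, urljoin
from urllib.request import urlopen


def table_info_url(base_url: str, drs_id: str,
                   template: str = "tables/{table}/info") -> str:
    """Build the Data Connect URL for the table named after a DRS ID.

    The default template mirrors the route quoted in the text; it is an
    assumption here, not a statement of the Data Connect specification.
    """
    # Percent-encode the ID so characters such as ':' or '/' stay in one path segment.
    path = template.format(table=quote(drs_id, safe=""))
    return urljoin(base_url.rstrip("/") + "/", path)


def fetch_table_info(base_url: str, drs_id: str, opener=urlopen) -> dict:
    """GET the table info document and decode it as JSON.

    `opener` is injectable so the function can be exercised without a network.
    """
    with opener(table_info_url(base_url, drs_id)) as response:
        return json.load(response)


# Hypothetical service address, used only to show the resulting URL.
print(table_info_url("https://dataconnect.example.org", "abcd"))
```

Keeping the route in a template parameter means a deployment that exposes table information under a different path only needs a different `template` value.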

Here is an example of using Data Connect with DRS in the fasp-scripts repository (https://github.com/ga4gh/fasp-scripts/blob/master/notebooks/drs/DRS%20File%20Data.ipynb). In this notebook we can see that Data Connect is used to get DRS IDs from a platform. Those DRS IDs are then used to gather additional information about the file that might be necessary for analysis. This is just one example of how DRS and Data Connect can interact with each other to gather information about data on a platform.
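
The two-step flow in that paragraph (Data Connect yields DRS IDs, DRS yields per-file details) might look roughly like the sketch below. This is not the notebook's code: the `drs_id` column name, the sample rows, and the helper names are hypothetical, the rows are assumed to be a list of column-to-value objects, and the DRS object path follows the `/ga4gh/drs/v1/objects/{id}` pattern used elsewhere in these docs.

```python
from urllib.parse import quote


def drs_ids_from_rows(rows, column="drs_id"):
    """Collect the non-empty values of `column` from Data Connect result rows."""
    return [row[column] for row in rows if row.get(column)]


def drs_object_url(drs_base_url, drs_id):
    """Build the DRS GET URL for one object ID."""
    # Percent-encode the ID so it stays within a single path segment.
    return f"{drs_base_url.rstrip('/')}/ga4gh/drs/v1/objects/{quote(drs_id, safe='')}"


# Hypothetical rows, assumed to be shaped like the rows of a Data Connect table response.
rows = [
    {"drs_id": "abcd", "sample": "S1"},
    {"drs_id": "efgh", "sample": "S2"},
    {"drs_id": None, "sample": "S3"},  # no DRS object recorded for this row
]

# Each URL printed here is what a client would GET to retrieve the DRS object.
for drs_id in drs_ids_from_rows(rows):
    print(drs_object_url("https://mydrs.server.org", drs_id))
```

Rows without a DRS ID are skipped rather than treated as errors, since metadata tables may describe items that have no DRS object.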
21 changes: 17 additions & 4 deletions preview/develop/docs/more-background-on-compact-identifiers.html

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions preview/develop/more-background-on-compact-identifiers.json
@@ -36,6 +36,10 @@
{
"name": "Example DRS Client Compact Identifier-Based URI Resolution Process - Registering a new Compact Identifier for Your DRS Server",
"description": "See the documentation on [n2t.net](https://n2t.net/e/compact_ids.html) and [identifiers.org](https://docs.identifiers.org/) for adding your own compact identifier type and registering your DRS server as a resolver. We document this in more detail in the [main specification document](./index.html).\n\nNow the question is how does a client resolve your newly registered compact identifier for your DRS server? *It turns out, whether specific to a DRS implementation or using existing compact identifiers like ARKs or DOIs, the DRS client resolution process for compact identifier-based URIs is exactly the same.* We briefly run through process below for a new compact identifier as an example but, again, a client will not need to do anything different from the resolution process documented in \"DRS Client Compact Identifier-Based URI Resolution Process - Existing Compact Identifier Provider\".\n\nNow we can issue DRS URI for our data objects like:\n\n```\ndrs://mydrsprefix:12345\n```\n\nThis is a little simpler than working with DOIs or other existing compact identifier issuers out there since we can create our own IDs and not have to allocate them through a third-party service (see \"Issuing Existing Compact Identifiers for Use with Your DRS Server\" below).\n\nWith a namespace of \"mydrsprefix\", the following GET request will return information about the namespace:\n\n```\nGET https://registry.api.identifiers.org/restApi/namespaces/search/findByPrefix?prefix=mydrsprefix\n```\n\n*Of course, this is a hypothetical example so the actual API call won’t work, but you can see the GET request is identical to \"DRS Client Compact Identifier-Based URI Resolution Process - Existing Compact Identifier Provider\".*\n\nThis information then points to resolvers for the \"mydrsprefix\" namespace. Hypothetically, this \"mydrsprefix\" namespace was assigned a namespace ID of 1829 by identifiers.org. 
This \"id\" has nothing to do with compact identifier accessions (which are used in the URL pattern as `{$id}` below) or DRS IDs. This namespace ID (1829 below) is purely an identifiers.org internal ID for use with their APIs:\n\n```\nGET https://registry.api.identifiers.org/restApi/resources/search/findAllByNamespaceId?id=1829\n```\n\n*Like the previous GET request this URL won’t work but you can see the GET request is identical to \"DRS Client Compact Identifier-Based URI Resolution Process - Existing Compact Identifier Provider\".*\n\nThis returns enough information to, ultimately, identify one or more resolvers and each have a URL pattern that, for DRS-supporting systems, provides a URL template for making a successful DRS GET request. For example, the \"mydrsprefix\" urlPattern is:\n\n```\nurlPattern: \"https://mydrs.server.org/ga4gh/drs/v1/objects/{$id}\"\n```\n\nAnd the `{$id}` here refers to the accession from the compact identifier (in this example the accession is `12345`). If applicable, a provider code can be supplied in the above requests to specify a particular mirror if there are multiple resolvers for this namespace.\n\nGiven this information you now know you can make a GET on the URL:\n\n```\nGET https://mydrs.server.org/ga4gh/drs/v1/objects/12345\n```\n\nSo, compared to using a third party service like DOIs and ARKs, this would be a direct pointer to a DRS server. 
However, just as with \"DRS Client Compact Identifier-Based URI Resolution Process - Existing Compact Identifier Provider\", the client should always be prepared to follow HTTPS redirects.\n\n*To summarize, a client resolving a custom compact identifier registered for a single DRS server is actually the same as resolving using a third-party compact identifier service like ARKs or DOIs with a DRS server, just make sure to follow redirects in all cases.*\n\n**Note: Issuing Existing Compact Identifiers for Use with Your DRS Server**\n\nSee the documentation on [n2t.net](https://n2t.net/e/compact_ids.html) and [identifiers.org](https://docs.identifiers.org/) for information about all the compact identifiers that are supported. You can choose to use an existing compact identifier provider for your DRS server, as we did in the example above using DOIs (\"DRS Client Compact Identifier-Based URI Resolution Process - Existing Compact Identifier Provider\"). Just keep in mind, each provider will have their own approach for generating compact identifiers and associating them with a DRS data object URL. Some compact identifier providers, like DOIs, provide a method whereby you can register in their network and get your own prefix, allowing you to mint your own accessions. Other services, like the University of California’s [EZID](https://ezid.cdlib.org/) service, provide accounts and a mechanism to mint accessions centrally for each of your data objects. For experimentation we recommend you take a look at the EZID website that allows you to create DOIs and ARKs and associate them with your data object URLs on your DRS server for testing purposes.\n"
},
{
"name": "Example How To Handle Extra Metadata for DRS Objects",
"description": "## DRS and Data Connect\n\nWith DRS objects it may be necessary to attach additional metadata to your objects. We believe that a change to the API of DRS to include metadata is not in the spirit of the DRS spec and in general DRS should have no knowledge of the metadata associated with the objects. We have found that a good GA4GH standard to support this is Data Connect (https://github.com/ga4gh-discovery/data-connect). The general approach would be to have a Data Connect service on your platform and to include \"tables\" with the ID matching your DRS ID for the same object. This means that if you have metadata associated with an object ID `abcd` (e.g., additional information about Compound Objects) all you need to do is request the information from the Data Connect client at `/tables/abcd/info`. There are optional functionalities of Data Connect, such as querying of tables, but we do not explore them or give any recommendations here.\n\nHere is an example of using Data Connect with DRS in the fasp-scripts repository (https://github.com/ga4gh/fasp-scripts/blob/master/notebooks/drs/DRS%20File%20Data.ipynb). In this notebook we can see that Data Connect is used to get DRS IDs from a platform. Those DRS IDs are then used to gather additional information about the file that might be necessary for analysis. This is just one example of how DRS and Data Connect can interact with each other to gather information about data on a platform."
}
],
"components": {}
29 changes: 29 additions & 0 deletions preview/develop/more-background-on-compact-identifiers.yaml
@@ -395,4 +395,33 @@ tags:
experimentation we recommend you take a look at the EZID website that
allows you to create DOIs and ARKs and associate them with your data
object URLs on your DRS server for testing purposes.
- name: Example How To Handle Extra Metadata for DRS Objects
description: >-
## DRS and Data Connect
With DRS objects it may be necessary to attach additional metadata to your
objects. We believe that a change to the API of DRS to include metadata is
not in the spirit of the DRS spec and in general DRS should have no
knowledge of the metadata associated with the objects. We have found that
a good GA4GH standard to support this is Data Connect
(https://github.com/ga4gh-discovery/data-connect). The general approach
would be to have a Data Connect service on your platform and to include
"tables" with the ID matching your DRS ID for the same object. This means
that if you have metadata associated with an object ID `abcd` (e.g.,
additional information about Compound Objects) all you need to do is
request the information from the Data Connect client at
`/tables/abcd/info`. There are optional functionalities of Data Connect,
such as querying of tables, but we do not explore them or give any
recommendations here.
Here is an example of using Data Connect with DRS in the fasp-scripts
repository
(https://github.com/ga4gh/fasp-scripts/blob/master/notebooks/drs/DRS%20File%20Data.ipynb).
In this notebook we can see that Data Connect is used to get DRS IDs from
a platform. Those DRS IDs are then used to gather additional information
about the file that might be necessary for analysis. This is just one
example of how DRS and Data Connect can interact with each other to gather
information about data on a platform.
components: {}
