UUID to Fedora Path does not work inside transactions #185

Closed
whikloj opened this issue Apr 12, 2016 · 18 comments
@whikloj
Member

whikloj commented Apr 12, 2016

The idea of using a UUID matched via the triplestore to a fedora path assumes that it is indexed. In the case of a transaction, nothing is done until the transaction is committed.

So...

> curl -i -XPOST http://localhost:8282/islandora/transaction
HTTP/1.1 201 Created
Date: Tue, 12 Apr 2016 15:48:44 GMT
Server: Apache/2.4.18 (Ubuntu)
Location: http://localhost:8080/fcrepo/rest/tx:9065bb33-803a-4b1c-8bb8-8238f83560c5
Expires: Tue, 12 Apr 2016 15:51:44 GMT
Cache-Control: private, must-revalidate
Content-Length: 0
Connection: close
Content-Type: text/html; charset=UTF-8

> curl -i -XPOST "http://localhost:8282/islandora/collection?tx=tx:9065bb33-803a-4b1c-8bb8-8238f83560c5"
HTTP/1.1 201 Created
Date: Tue, 12 Apr 2016 15:49:48 GMT
Server: Apache/2.4.18 (Ubuntu)
Cache-Control: must-revalidate, private
Location: http://localhost:8282/islandora/resource/05c0224b-4ace-4092-a8f6-603c94260d08
Content-Length: 77
Connection: close
Link: <http://localhost:8282/islandora/resource/05c0224b-4ace-4092-a8f6-603c94260d08/members>; rel="hub"
Content-Type: text/plain; charset=UTF-8

http://localhost:8282/islandora/resource/05c0224b-4ace-4092-a8f6-603c94260d08

> curl -i -XPOST "http://localhost:8282/islandora/collection/05c0224b-4ace-4092-a8f6-603c94260d08/member/5fa71ed6-f5f3-4831-8662-b1b132815a52?tx=tx:9065bb33-803a-4b1c-8bb8-8238f83560c5"
HTTP/1.1 404 Not Found
Date: Tue, 12 Apr 2016 15:51:27 GMT
Server: Apache/2.4.18 (Ubuntu)
Cache-Control: no-cache
Content-Length: 89
Content-Type: text/html; charset=UTF-8

Failed getting resource Path for "05c0224b-4ace-4092-a8f6-603c94260d08" from triple store

@Islandora-CLAW/7-x-2-x-committers : Ideas?

@whikloj
Member Author

whikloj commented Apr 12, 2016

I am trying to think of how we could resolve this. Do we use a SQLite db?

We would need to handle transactions as well as regular resources.

So
curl -XPOST http://localhost:8282/islandora/collection generates a UUID and POSTs to Fedora, then links the UUID <-> fedora path in SQLite. No problem.

In a transaction, the above returns a transaction-prefixed path, so we use that until the transaction is committed (or rolled back), and then we need to update the entry with the un-prefixed path.

How do we deal with abandoned transactions? Add a timestamp and update it for each action on a transaction, wipe them after N minutes of inactivity?
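A minimal sketch of that SQLite idea, in Python for brevity rather than the project's PHP. The table layout, the function names, and the rule for stripping the `/tx:<id>` path segment are all assumptions made for illustration; none of this is existing CLAW code. The 180-second idle limit used in the usage note mirrors the three-minute gap between the `Date` and `Expires` headers in the transaction response above.

```python
import re
import sqlite3

# Matches the "/tx:<id>" segment of a transaction-prefixed Fedora path (assumed layout).
TX_SEGMENT = re.compile(r"/tx:[^/]+")


def open_store():
    """Create an in-memory UUID <-> path table; tx_id is NULL outside transactions."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE uuid_map ("
        " uuid TEXT PRIMARY KEY,"
        " path TEXT NOT NULL,"
        " tx_id TEXT,"
        " last_touched REAL)"
    )
    return db


def link(db, uuid, path, tx_id=None, now=0.0):
    """Record (or overwrite) the path for a UUID."""
    db.execute(
        "INSERT OR REPLACE INTO uuid_map VALUES (?, ?, ?, ?)",
        (uuid, path, tx_id, now),
    )


def touch(db, tx_id, now):
    """Refresh the timestamp of every row that belongs to a transaction."""
    db.execute("UPDATE uuid_map SET last_touched = ? WHERE tx_id = ?", (now, tx_id))


def commit(db, tx_id):
    """On commit, swap each transaction-prefixed path for the un-prefixed one."""
    rows = db.execute(
        "SELECT uuid, path FROM uuid_map WHERE tx_id = ?", (tx_id,)
    ).fetchall()
    for uuid, path in rows:
        db.execute(
            "UPDATE uuid_map SET path = ?, tx_id = NULL WHERE uuid = ?",
            (TX_SEGMENT.sub("", path, count=1), uuid),
        )


def rollback(db, tx_id):
    """On rollback, the resources were never created, so drop their rows."""
    db.execute("DELETE FROM uuid_map WHERE tx_id = ?", (tx_id,))


def purge_stale(db, now, max_idle):
    """Wipe rows of transactions that have been idle for longer than max_idle seconds."""
    db.execute(
        "DELETE FROM uuid_map WHERE tx_id IS NOT NULL AND last_touched < ?",
        (now - max_idle,),
    )


def lookup(db, uuid):
    """Return the stored path for a UUID, or None."""
    row = db.execute("SELECT path FROM uuid_map WHERE uuid = ?", (uuid,)).fetchone()
    return row[0] if row else None
```

Every request that carries a transaction ID would call `touch`, and a periodic job would call something like `purge_stale(db, now, 180)` to clear abandoned transactions.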

@whikloj
Member Author

whikloj commented Apr 12, 2016

Or we could submit our own quad to the triplestore from the microservices.
<http://localhost:8080/fcrepo/rest/tx:1234abcd-1234-abcd-78gh-56op78lm/path/to/resource> nfo:uuid "05c0224b-4ace-4092-a8f6-603c94260d08"^^xsd:string

Once it is committed or rolled back we can delete it, as either:

  1. The transaction is committed and the normal triple is added to the triplestore
  2. The transaction is rolled back and the resource was not created.

Also we don't have to change the behaviour for normal actions, only items created in transactions.
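As a rough illustration, the microservice could build SPARQL 1.1 Update strings for adding and later removing that temporary statement. This sketch is in Python and makes several assumptions: the predicate IRI is the `nfo` prefix from later in this thread with `uuid` appended, the statement goes into the default graph (the comment says "quad", but no graph name is given, so none is invented here), and the helper names are hypothetical. Sending the update to the triplestore is left out.

```python
import uuid as uuidlib

# Assumed predicate: the nfo prefix used elsewhere in this thread plus "uuid".
NFO_UUID = "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo/v1.2/uuid"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def _statement(tx_path, uuid):
    """Build one N-Triples-style statement linking a transaction path to a UUID."""
    if any(ch in tx_path for ch in '<>" \n'):
        raise ValueError("path is not safe to embed in an IRI reference")
    # uuid.UUID raises ValueError on malformed input and normalises the text form.
    canonical = str(uuidlib.UUID(uuid))
    return f'<{tx_path}> <{NFO_UUID}> "{canonical}"^^<{XSD_STRING}> .'


def insert_update(tx_path, uuid):
    """SPARQL Update that adds the temporary statement."""
    return f"INSERT DATA {{ {_statement(tx_path, uuid)} }}"


def delete_update(tx_path, uuid):
    """SPARQL Update that removes it again after commit or rollback."""
    return f"DELETE DATA {{ {_statement(tx_path, uuid)} }}"
```

The insert would be issued right after the resource is created inside the transaction, and the delete when the transaction is committed or rolled back.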

@acoburn
Contributor

acoburn commented Apr 12, 2016

The advantage of using the triplestore for this seems mostly to be the fact that the triplestore already exists in the infrastructure. For this particular session-based interaction, I'd highly recommend using something like Redis that already supports key expiry. Also, all operations are entirely atomic, which means one less thing to worry about in a distributed context. The downside is that it's one more thing to install and keep running.

@whikloj
Member Author

whikloj commented Apr 12, 2016

Thanks @acoburn, I figured there would be something out there. Redis looks very simple and easy. I'll see about implementing it as a simple key (UUID) / value (Fedora path) pair for now.

@whikloj
Member Author

whikloj commented Apr 15, 2016

Okay, so I played around with this for a bit and I have two problems. I have solutions but I'm sure there are better ways. So let me know if I am crazy or if you have a suggestion.

  1. I need to link all the UUID -> Fedora URI pairs together with the transaction ID, because any action on a transaction ID should update the expiry time on ALL the UUID -> F4 URI pairs acted on within that transaction.

So I think it is probably easiest to generate a JSON list of values à la:

[
   { "UUID-1234" : "http://localhost:8080/fcrepo/rest/obj1" },
   { "UUID-5678" : "http://localhost:8080/fcrepo/rest/obj2" }
]

Then store that in Redis with the transaction ID as the key

> SET "tx:86dd0891-d975-42d8-8837-a24ad6041b59" "$json-array"

This would mean we would pull the entire object for each action on a transaction, but I think it is the easiest as we set a single expiry in Redis and update it if the transaction is acted upon.
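The read-modify-write cycle with a single refreshed expiry could look roughly like the Python sketch below. `ExpiringStore` is a tiny in-memory stand-in written for this example, not a Redis client; in Redis itself the same effect would come from `SET` with an expiry, or `EXPIRE` to refresh it. The function names and the 180-second TTL are assumptions, and the pairs are kept in one JSON object rather than a list of single-pair objects so that a lookup is a plain key access.

```python
import json
import time


class ExpiringStore:
    """In-memory stand-in for a key/value store with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Lazily drop entries whose expiry has passed.
            del self._data[key]
            return None
        return value


TX_TTL = 180  # seconds; an assumed value, not taken from any CLAW configuration


def remember(store, tx_id, uuid, uri, ttl=TX_TTL):
    """Add one UUID -> URI pair to the transaction's entry and reset its expiry."""
    raw = store.get(tx_id)
    mapping = json.loads(raw) if raw else {}
    mapping[uuid] = uri
    store.set(tx_id, json.dumps(mapping), ttl)


def resolve(store, tx_id, uuid, ttl=TX_TTL):
    """Look a UUID up inside a transaction, refreshing the expiry on access."""
    raw = store.get(tx_id)
    if raw is None:
        return None
    store.set(tx_id, raw, ttl)
    return json.loads(raw).get(uuid)
```

Because every call goes through the transaction key, one expiry covers all pairs in the transaction, at the cost of decoding the whole object each time.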

  2. When you PUT/POST to Fedora, we don't have the Fedora URI and UUID at the same time. After the PUT/POST action we need to take the URI from the Location: header and then get the UUID from the object.

So I am thinking about a super simple transform like:

@prefix nfo : <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo/v1.2/>
id      = . :: xsd:string ;
uuid  = nfo:uuid :: xsd:string ;

So after an object is created in Fedora we can get the transform

curl -i -H"Accept: application/ld+json" "http://localhost:8080/fcrepo/path/from/location/fcr:transform/uuidTransform"

This gives us the path and UUID in a simple JSON-LD object, which we add using the data structure in 1.
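Pulling the pair out of the transform response might look like this Python sketch. The response shape is an assumption: it is taken to be a JSON array holding one object whose `id` and `uuid` fields (the names defined in the transform above) are lists of values, and the helper tolerates bare values as well. The real output should be checked against a running Fedora before relying on this.

```python
import json


def extract_pair(body):
    """Return (uri, uuid) from a transform response body (assumed shape)."""
    doc = json.loads(body)
    # Accept either a one-element array or a bare object.
    record = doc[0] if isinstance(doc, list) else doc

    def first(value):
        # Field values are assumed to be lists; fall back to the bare value.
        return value[0] if isinstance(value, list) else value

    return first(record["id"]), first(record["uuid"])
```

The returned pair is what would then be stored under the transaction ID.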

Thoughts?

@ruebot
Member

ruebot commented Apr 16, 2016

That makes sense to me.

@DiegoPino
Contributor

Hi, sadly I'm still on the train, and I have a lot of thoughts on this. How will you manage a tx session that expired or was rolled back? How will that get cleaned out of Redis? If we are keeping track of the resources involved in a transaction, there are a lot of better/faster ways than Redis right now (in 2016), so could we maybe explore and discuss the options before going further with this? Also, since we are passing the TX around, and assuming all resources belonging to a common tx will be handled by the same client/server pair (meaning they come from the same source and use the same microservice; I hope we are not trying to resolve multi-service, same-tx right now), we can simply make use of cookies and headers, right? We can even use the Silex/Symfony caching options to avoid adding yet.another.service to maintain right now. Train moves a lot!

@ruebot ruebot added this to the Community Sprint - 06 milestone Apr 18, 2016
@whikloj
Member Author

whikloj commented Apr 18, 2016

So the idea is that Redis allows you to add an expiry to your entries.

So if nothing happens on them then they are automatically removed. I would also only use this for actions in a transaction and as a fail-over from the triplestore, because I would like to have the triplestore as the main source of this information.

I am dealing only with a single transaction, but if you had two clients using the same transaction ID, they could refer to objects the other generated, as they would both get back the same object from Redis. Changes that each makes might cause a problem, however.

I am abstracting this with an interface so we should be able to put any implementation behind it that you want. So if you've got a better one, we can happily make that the default. Redis seems like a nice easy solution for now.

@whikloj
Member Author

whikloj commented Apr 18, 2016

This is very early, but just to give an idea of what I was thinking.
https://github.com/Islandora-CLAW/chullo/compare/master...whikloj:issue-185?expand=1

@whikloj
Member Author

whikloj commented Apr 18, 2016

Also, that doesn't do anything special (hence the name KeyCache). It could be specialized to deal with the intricacies of the information internally.

For example: $service->get($txID, $uuid) could look the UUID up in whatever way you store the transaction information. So the information could be organized however works best for a specific use-case.
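To make the interface idea concrete, here is a small Python sketch of a transaction-aware cache contract with one throwaway in-memory implementation behind it. The class and method names are hypothetical and are not the ones in the linked branch; the point is only that callers see `get(tx_id, uuid)` while the storage layout stays private to the implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class TransactionKeyCache(ABC):
    """Contract: callers pass a transaction ID and a UUID, nothing else."""

    @abstractmethod
    def set(self, tx_id: str, uuid: str, uri: str) -> None:
        ...

    @abstractmethod
    def get(self, tx_id: str, uuid: str) -> Optional[str]:
        ...

    @abstractmethod
    def forget(self, tx_id: str) -> None:
        ...


class InMemoryKeyCache(TransactionKeyCache):
    """Simplest possible backend: one dict of UUID -> URI per transaction."""

    def __init__(self) -> None:
        self._by_tx: Dict[str, Dict[str, str]] = {}

    def set(self, tx_id: str, uuid: str, uri: str) -> None:
        self._by_tx.setdefault(tx_id, {})[uuid] = uri

    def get(self, tx_id: str, uuid: str) -> Optional[str]:
        return self._by_tx.get(tx_id, {}).get(uuid)

    def forget(self, tx_id: str) -> None:
        # Called on commit or rollback; unknown transactions are ignored.
        self._by_tx.pop(tx_id, None)
```

A Redis-backed or APC-backed class could implement the same three methods without any change to the callers.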

@whikloj
Member Author

whikloj commented May 16, 2016

@DiegoPino I have a problem. The place where this keyCache would be useful is inside idToUri: if the triplestore query returns 0 rows, then check the keyCache.

But I need access to the transaction ID.

We could pass the TX ID in each time, but is there another way to access the Request?
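The fallback described here can be sketched as follows, in Python. Both lookups are passed in as plain callables so the sketch stays independent of any real triplestore or cache client; `id_to_uri` and `tx_from_url` are hypothetical names, and reading the transaction ID from the `tx` query parameter follows the curl examples at the top of this issue.

```python
from urllib.parse import parse_qs, urlsplit


def tx_from_url(url):
    """Return the value of the tx query parameter, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("tx")
    return values[0] if values else None


def id_to_uri(uuid, tx_id, query_triplestore, cache_get):
    """Resolve a UUID to a Fedora URI, preferring the triplestore.

    query_triplestore(uuid) and cache_get(tx_id, uuid) each return a URI or None.
    """
    uri = query_triplestore(uuid)
    if uri is not None:
        return uri
    if tx_id is None:
        # Outside a transaction there is nothing else to consult.
        return None
    return cache_get(tx_id, uuid)
```

Keeping the triplestore first means behaviour outside transactions is unchanged; the cache is only consulted when the index has no answer and a transaction ID is present.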

@DiegoPino
Contributor

DiegoPino commented May 17, 2016

@whikloj, of course.
->convert always gets the Request object as its second argument. Just change the idToUri signature and add a Request-typed $request param (no need to change the call itself).
e.g.
$callback = function ($post, Request $request) {
    return new Post($request->attributes->get('slug'));
};
More fun here:
http://silex.sensiolabs.org/doc/master/usage.html

(Convert callbacks can also be services, but I suspect you don't like services so much! 👍 )

@whikloj
Member Author

whikloj commented Jun 21, 2016

Okay, so here are the 3 for this now.

The cache is super simple, but the idea is you could easily replace it with APC, Memcache, Redis, etc using the same (https://github.com/moust/silex-cache-service-provider) library.

@ruebot
Member

ruebot commented Jun 24, 2016

@ruebot ruebot closed this as completed Jun 24, 2016
@whikloj
Member Author

whikloj commented Jun 24, 2016

FINALLY!!!!!!!!

I feel like dancing...

@DiegoPino
Contributor

Also found an extra use for your cache, Jared... in the future we could even block a resource from being touched by another TX or a direct call if it's in the cache. Good work!

@whikloj
Member Author

whikloj commented Jun 24, 2016

We might want to refactor the cache and namespace the different caches then, e.g. have my UuidCache class prepend "uuidcache:" to the keys, to allow for multiple separate caches.
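A key-prefixing wrapper along those lines could be as small as this Python sketch. `NamespacedCache` is a hypothetical name, the backend is assumed to be anything dict-like, and the `lockcache:` prefix in the usage note is invented purely to show two caches sharing one backend.

```python
class NamespacedCache:
    """Wrap a shared dict-like backend and prefix every key with a namespace."""

    def __init__(self, backend, prefix):
        self._backend = backend
        self._prefix = prefix

    def _key(self, key):
        return f"{self._prefix}{key}"

    def set(self, key, value):
        self._backend[self._key(key)] = value

    def get(self, key, default=None):
        return self._backend.get(self._key(key), default)

    def delete(self, key):
        # Removing a missing key is not an error.
        self._backend.pop(self._key(key), None)
```

For example, `NamespacedCache(shared, "uuidcache:")` and `NamespacedCache(shared, "lockcache:")` can both store the key `"tx:1"` in the same backend without colliding.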

@DiegoPino
Contributor

cool, next sprint


4 participants