Create a desktop application for bulk ingest and editing of Islandora objects #1172
Comments
On behalf of iCamp in Switzerland:
|
I, too, am interested in this 🙋🏻‍♂️ |
This is essentially how WWU's Islandora (7) Batch Uploader (an Electron app) works. After it does a bunch of neat stuff with various web services, it builds the MODS XML, then SCPs the images and metadata to the Islandora server and triggers a Drush migration via SSH. We could do essentially the same: build a spreadsheet or collection of JSON documents in the app, then SCP them over and trigger commands via SSH. What I don't like about this approach is that it requires managing user accounts for the server (or, even worse, a shared account). Personally, I would prefer we stick to what we can trigger via REST (which should be pretty much everything except Drush commands, but those can be ported) so we can use the Drupal authentication instead. |
REST all the way. The only drawback would be pushing up files that exceed the server's file size limits.
|
I agree, REST all the way. I think we probably need a chunked file uploader so we can load files that exceed the server's limits: each file can be chunked and checksummed client-side, then uploaded. Something like PLUpload. Then we need to test the limits of that; hopefully we can support very large batches that way. |
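To make the "chunked and checksummed client-side" idea concrete, here is a minimal Python sketch. It is not PLUpload's API or anything Islandora provides; the function names `iter_chunks` and `file_checksum`, the 5 MB default chunk size, and the choice of SHA-256 are all assumptions made for illustration. It only shows the client half: splitting a file and computing per-chunk and whole-file digests that a server could verify.

```python
import hashlib
from typing import Iterator, Tuple

# Hypothetical default; a real uploader would pick this to stay under the
# server's request size limit.
DEFAULT_CHUNK_SIZE = 5 * 1024 * 1024


def iter_chunks(path: str, chunk_size: int = DEFAULT_CHUNK_SIZE) -> Iterator[Tuple[int, bytes, str]]:
    """Yield (index, data, sha256 hex digest) for each chunk of the file."""
    with open(path, "rb") as fh:
        index = 0
        while True:
            data = fh.read(chunk_size)
            if not data:
                break
            yield index, data, hashlib.sha256(data).hexdigest()
            index += 1


def file_checksum(path: str, chunk_size: int = DEFAULT_CHUNK_SIZE) -> str:
    """Return the SHA-256 hex digest of the whole file, read incrementally."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Each chunk would go up in its own request with its digest, and the whole-file digest lets the server confirm the reassembled file once the last chunk arrives.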
Ooooh... PLUpload looks interesting. Also, I know our staff would like a spreadsheet experience for doing batch metadata editing. I've been looking around and Handsontable looks like the most promising option (it has auto-fill!). While they are open source, they aren't free for commercial use, but they do have a static non-commercial license key we can use. |
I remember seeing the WWU batch uploader at iCamp San Diego, it was amazing. I would be 👍 to some Python with Electron on top. I'm sure we could pick @davidbasswwu's brain for some lessons learned. |
Happy to help if I can. The IBU is the first time I've used Electron/Node/Vue, so there was a steep learning curve, but using that environment has been great to work with, so I think it was worth it. I'll think about lessons learned a bit more and note them here if anything in particular comes to mind, but feel free to ask me anything. |
I just recently installed the Views Bulk Edit module on my local VM, so maybe I just haven't found it... but a search function to filter the objects to be "bulk edited" would be nice. |
Notes from today's call: @alxp suggested an Open Refine plugin |
@rangel35 I think you can do what you want in Views Bulk Edit by exposing the filters on the view, so the results can be altered on the fly. Then check "all" and have at it! |
thanks @manez, I'll give that a try |
The best approach (imo) would be creating a REST API. The Electron app can then use this API and it leaves the door open for other developers to implement their own input (e.g. command/mobile). |
@ibrahimab there are existing REST endpoints in Drupal that allow for the creation of nodes/media/files. We use those, and we have created...one (I think), but will keep making/exposing more as needed. In my mind this would be a separate program that would use those various REST API endpoints to do these ingest/edit/delete actions. So when you say creating a REST API, do you mean defining these endpoints as an API? My concern is that some of these endpoints are out of our control (i.e. Drupal controls them), so we would need to duplicate them, provide a wrapper in front of them, or have an API tied to a specific version of Drupal. Not that we can't do one of those, I'm just not sure which is best but also maintainable. |
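For anyone following along, a rough Python sketch of what a client call to Drupal's core node-creation endpoint could look like. It only builds the request and does not send it. The base URL, the credentials, the helper name `build_node_request`, and the `islandora_object` content type are assumptions for illustration, and Basic authentication only works if the site has it enabled for the REST resource.

```python
import base64
import json
import urllib.request


def build_node_request(base_url: str, username: str, password: str, title: str,
                       content_type: str = "islandora_object") -> urllib.request.Request:
    """Build (but do not send) a POST request that would create a node."""
    # Drupal's JSON serialization wraps each field value in a list of objects.
    payload = {
        "type": [{"target_id": content_type}],  # assumed content type machine name
        "title": [{"value": title}],
    }
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/node?_format=json",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen()` would perform the call. Keeping request construction separate from sending is one way a thin wrapper could isolate the Drupal-controlled details in a single place.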
Why a "desktop application" instead of a web application? |
@akuckartz I think because there is/will be further Drupal development which will fill the gaps for a web application, whereas (IMHO) this is something that staff whose primary job is to prepare and ingest data would use. You could certainly also make a web application that does this. Personally, I would want to understand what the web application would do that we can't do in Drupal. |
@akuckartz main advantage of a desktop app is that the binary files to be ingested can sit on the user's computer instead of having to be on the Islandora server's filesystem. |
I've been thinking of the advantages of having a command-line tool do the actual interaction with Islandora, with the GUI as an optional app that sits on top of this CLI tool. Apparently Electron is very happy to interact with Python scripts (the user only ever interacts with the GUI), and I've heard they bundle up nicely for distribution. I'd be happy to port https://github.com/mjordan/claw_rest_ingester to Python. If we have both a CLI and an accompanying GUI, we have the best of both worlds. |
How is that a significant advantage? (Still trying to understand this issue.) |
I guess it depends on workflows and tooling at specific sites, but at my institution, the staff that scan the content and prepare the metadata don't have direct access to the Islandora server, so to get the content up to the server to perform a batch load, we need to get someone from Systems involved. Skipping that last step would be a significant gain for us. Prior to adopting Islandora we were a CONTENTdm shop, and its desktop client is still described lovingly even after 3 years, not because it was a great piece of software (it was pretty horrible), but because getting content into CONTENTdm was sooo much easier than getting content into Islandora. |
@akuckartz, for us it is partly a matter of resource load distribution. We can have 10+ students or staff working with large collections of materials at one time, adding metadata to hundreds of items. That puts a lot of load on a single machine if they are all continuously sending updates and retrieving updated records. Sure, we could add more servers and place them behind a load balancer OR we can leverage the staff's desktop resources while they do their metadata creation and updates. When they are done with that project they can add their collections to a rate-limited queue. No one's work gets slowed down at busy times and pushes to the server can happen during slow times. Their machines could potentially also generate derivatives. |
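A minimal Python sketch of the rate-limited queue idea, assuming the simplest possible policy: at most one job per fixed interval. The class name `RateLimitedQueue` and the injectable `clock` and `sleep` parameters are hypothetical. A real tool might also schedule around busy hours, but this shows the basic shape.

```python
import time
from collections import deque
from typing import Any, Callable, Deque, List, Optional


class RateLimitedQueue:
    """Run queued jobs one at a time, at least `min_interval` seconds apart."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock    # injectable so the queue can be tested without waiting
        self._sleep = sleep
        self._jobs: Deque[Callable[[], Any]] = deque()
        self._last_started: Optional[float] = None

    def add(self, job: Callable[[], Any]) -> None:
        """Queue a zero-argument callable, e.g. one that pushes one object."""
        self._jobs.append(job)

    def drain(self) -> List[Any]:
        """Run every queued job in order, pausing between jobs as needed."""
        results = []
        while self._jobs:
            if self._last_started is not None:
                wait = self.min_interval - (self._clock() - self._last_started)
                if wait > 0:
                    self._sleep(wait)
            job = self._jobs.popleft()
            self._last_started = self._clock()
            results.append(job())
        return results
```

Each job here would be something like "POST one object to the server", so the desktop app can let staff keep editing locally while pushes trickle out at a pace the server can absorb.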
Ninja'd by @mjordan. |
@mjordan @seth-shaw-unlv Thanks, now I understand. I was not aware of the resource load distribution issue. |
I've ported the CLAW REST Ingester to Python so it can be wrapped in an Electron app: https://github.com/mjordan/islandora_workbench. |
@mjordan I'll take a swing at integrating that into Electron. Working on getting Claw up and running now... |
@davidbasswwu awesome! Workbench is a work in progress and I plan on adding a bunch more CRUD functionality, but what's there should be good enough to get you going. I'm happy to test on Windows and Linux if you want. |
@davidbasswwu and @mjordan, I'm happy to test OS X and Windows. |
I'm taking next week off, but I did make some good progress today, so I'll let you know when I have something ready. |
@davidbasswwu I've added delete, update, and add_media ability to workbench. If you pull in the latest version, you will need to add |
Over in mjordan/islandora_workbench#8, @seth-shaw-unlv has created a proof of concept implementation of jExcel that can be integrated into the Electron desktop app, providing a GUI CSV editor for end users. His implementation queries Islandora to get taxonomy terms (in the screenshot below, the vocabulary is "Access Terms") which are implemented in the GUI as spreadsheet pick lists: The disk icon in the upper-left corner saves the file to your local disk. I can imagine a workflow like:
THIS IS AWESOME. Sorry for shouting. To try this yourself, get the code at @seth-shaw-unlv's repo at https://github.com/seth-shaw-unlv/islandora-jexcel-scratchpad. |
@manez you like this? |
I could toss a few more emoji on there to be clear 😄 I'm imagining being able to add this to the Admin track at iCamps. |
Keep in mind that it is just a proof of concept so far. There is a lot of potential here. Validation can happen on a cell value change so you know there is a problem immediately. Also, you could probably build this directly into your Drupal as a mass editor. |
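To illustrate the "validate on cell change" idea, here is a small sketch of the check itself, written in Python only to show the logic (the proof of concept is JavaScript, and this is not its code). The function name `validate_cell`, the shape of the `vocabularies` mapping, and the `required` set are assumptions. The vocabulary terms would come from the taxonomy query the spreadsheet already makes.

```python
from typing import AbstractSet, Mapping, Optional


def validate_cell(column: str, value: str,
                  vocabularies: Mapping[str, AbstractSet[str]],
                  required: AbstractSet[str] = frozenset()) -> Optional[str]:
    """Return an error message for the edited cell, or None if it is valid."""
    value = value.strip()
    if not value:
        # Empty cells are only a problem in required columns.
        return f"{column} is required" if column in required else None
    allowed = vocabularies.get(column)
    if allowed is not None and value not in allowed:
        return f"'{value}' is not a term in the {column} vocabulary"
    return None
```

The spreadsheet's change handler would call this with the edited column and new value and flag the cell straight away if a message comes back.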
@seth-shaw-unlv yes, that is another option... interesting. |
@mjordan I believe what you meant to shout was |
Yeah... just wait till it can load rows from views results 🚀 |
I'm interested in seeing where this goes. Two projects that may be relevant:
|
@kayakr excellent pointers. tus certainly looks like it would be useful, and I've already noted it over at the workbench repo, |
@kayakr thanks for the pointers! Both of these look interesting. I will certainly take inspiration from DataCurator. It looks like DataCurator is pinned to an older version of the JavaScript spreadsheet library Handsontable (^3.0.0). I was initially planning on using Handsontable until they switched their licensing model as of version 7, and I wasn't confident about relying on forks of version 6. |
Hi @seth-shaw-unlv , I am interested in seeing this app but I don't have permissions... please can you give me a hint. |
Hi @mrtngrsbch - I realized that my app has some security vulnerabilities that are going to be very hard to fix, so I closed the repository and I'm now working on converting it to a web application. I plan to make that available once it's ready, but for now, I would recommend not using IBU. |
@mrtngrsbch - If you want to see what the IBU desktop app looks like, there is a video on https://mabel.wwu.edu/ibu (skip to 2:50). The web version should work and look just like the desktop version. |
OMG @davidbasswwu! Thanks so much David, you have impressed me with the progress of this tool... and I take the opportunity to give you back my current sandbox & proof of concept. In a few words: inject the metadata [ISAD(G), DC, VRA...] with ExifTool by Phil Harvey (or Adobe Bridge) and populate the contents with the EXIF module.
Next step: Extending Drupal's EXIF module to read new metadata presets [https://museosabiertos.org/node/85], for example:
Context: So we have decided (we are still deciding *) that the fastest method to have an online publication, with its standardized metadata, is to first help the museum community to gently climb a step up the ladder. Islandora would be paradise itself, but still far away.
Indeed, I have participated in this thread because I am interested in a bulk upload. best |
Thanks @mrtngrsbch! It sounds like you are working on some neat things too. In case you're not already familiar, check out https://github.com/Islandora-Collaboration-Group/ISLE which may help with implementation, and https://islandora.ca/community for some additional ways to get involved. 👍 |
Thanks @davidbasswwu, I think I still need to read and experiment on my own, before touching dad's toys. ;-) |
At iCampEU there was a great discussion of a desktop application for managing Islandora content. This wish has also been expressed in mjordan/claw_rest_ingester#8, and this Islandora Conference session will also generate discussion on this.
One of the use cases that came up at the Camp was the need for a repo manager to update a field value on large numbers of objects, large here being more than is convenient to do using Views Bulk Edit within a browser, for example several thousand objects.
@seth-shaw-unlv has expressed interest in working on this, and @ibrahimab, who was at the Camp, has also. Both of these community members mentioned that Electron would be an excellent environment to build this in so it can be used easily on Windows, Mac, and Linux desktops.
I suggest adding use cases for this tool so we can start refining the user interfaces and the interactions with Islandora 8.
Thanks EU Campers for the discussion!