Feature Request - flatten collection objects #4707

Closed
dustymc opened this issue May 23, 2022 · 44 comments
Labels
Enhancement: I think this would make Arctos even awesomer!
Priority - Wildfire Potential: ignore this at everyone's peril, may smolder for now ...
Priority-High (Needed for work): High because this is causing a delay in important collection work.

@dustymc
Contributor

dustymc commented May 23, 2022

EDIT: Here's a migration path spreadsheet summarizing the conversation below: https://docs.google.com/spreadsheets/d/1jfCdsMV4e4dDU-G1guYbyLBqTfuPSNZB_1zjDXol8NU

dependencies are

#6297
#6691

Original below


Is your feature request related to a problem? Please describe.

From #4706, the join to coll_object is surprisingly slow and hard to tune.

Describe what you're trying to accomplish

Go fast.

Describe the solution you'd like

There are now only two types of collection objects in Arctos, and I can't envision having more. They're very different shapes, and I don't think there's any real reason to normalize the core. (Both need to join to table coll_object, but not everything in there is appropriate to both.) I don't think there's any obvious drawback to flattening, and doing so should simplify the model and make things a bit faster. Specifically,

  1. Move to table specimen_part fields:
    • coll_object.entered_person_id (or not?)
    • coll_object.coll_object_entered_date (or not?)
    • coll_object.last_edited_person_id (or not?)
    • coll_object.last_edit_date (or not?)
    • coll_object.coll_obj_disposition
    • coll_object.lot_count
    • coll_object.condition
    • coll_object.flags
    • coll_object_remark.disposition_remarks
    • coll_object_remark.coll_object_remarks
  2. Move to table cataloged_item fields:
    • coll_object.entered_person_id
    • coll_object.coll_object_entered_date
    • coll_object.last_edited_person_id (or not? Flat edit history is about the same thing with more info)
    • coll_object.last_edit_date (or not? Flat edit history is about the same thing with more info)
    • coll_object_remark.coll_object_remarks

and rebuild everything to use that simplified structure
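
As a rough illustration of the part half of this proposal, here is a minimal sketch using Python's built-in sqlite3 as a stand-in database. Everything beyond the column names listed above is an assumption: the miniature schema, the `part_name` column, the `specimen_part_flat` table name, and the idea that each object has at most one remark row are all hypothetical. The `part_id` and `part_remark` renames follow the naming suggestions in this issue.

```python
import sqlite3

# Hypothetical miniature of the current model. Only the column names taken
# from this issue are real; part_name and the sample rows are invented.
SCHEMA = """
CREATE TABLE coll_object (
    collection_object_id INTEGER PRIMARY KEY,
    coll_obj_disposition TEXT,
    lot_count INTEGER,
    condition TEXT
);
CREATE TABLE coll_object_remark (
    collection_object_id INTEGER PRIMARY KEY,  -- assumes one remark row per object
    coll_object_remarks TEXT
);
CREATE TABLE specimen_part (
    collection_object_id INTEGER PRIMARY KEY,
    part_name TEXT
);
INSERT INTO coll_object VALUES (10, 'in collection', 1, 'good');
INSERT INTO coll_object VALUES (11, 'on loan', 3, 'fair');
INSERT INTO coll_object_remark VALUES (10, 'left wing damaged');
INSERT INTO specimen_part VALUES (10, 'skin');
INSERT INTO specimen_part VALUES (11, 'tissue');
"""


def flatten_parts(conn: sqlite3.Connection) -> None:
    """Copy the part-relevant coll_object columns onto one flat part table."""
    conn.executescript("""
        CREATE TABLE specimen_part_flat AS
        SELECT p.collection_object_id AS part_id,
               p.part_name,
               o.coll_obj_disposition,
               o.lot_count,
               o.condition,
               r.coll_object_remarks AS part_remark
        FROM specimen_part p
        JOIN coll_object o
          ON o.collection_object_id = p.collection_object_id
        -- LEFT JOIN keeps parts that have no remark row at all
        LEFT JOIN coll_object_remark r
          ON r.collection_object_id = p.collection_object_id;
    """)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
flatten_parts(conn)
flat_rows = conn.execute(
    "SELECT * FROM specimen_part_flat ORDER BY part_id"
).fetchall()
for row in flat_rows:
    print(row)
```

After a step like this, queries on parts would no longer need the join to coll_object, which is the performance motivation stated above.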

Describe alternatives you've considered

Keep on keeping on.

Additional context

  1. This would be a good opportunity to think about names - eg specimen_part.collection_object_id-->specimen_part.part_id, coll_object_remarks-->cataloged_item_remarks and part_remarks, etc.
  2. This would leave coll_object_remarks.associated_species needing a new home, but we've known about that for a long time.

Priority

Lowish, probably; might be higher if the potential performance impact was better understood.

@dustymc added the Enhancement label on May 23, 2022
@dustymc added this to the Needs Discussion milestone on May 23, 2022
@ewommack

specimen_part.collection_object_id

I think it would be a good idea to take out specimen from the name if we could.

@Jegelewicz
Member

coll_object.condition and ArctosDB/BackBurner#7

@dustymc said to add this here so...

Let's handle condition the same way we handle preservation.

There is no more "condition" field

Everything currently there gets migrated to the condition report attribute.

When entering data, the "condition" field still exists, but like preservation, it ends up as an attribute with the date stamp and determiner of the Arctos user who entered it. (Want more precision, then use the actual condition report attribute).

All of the "condition" attributes are grouped together to help us review the condition history of any given part using the "Metadata" (but I'll need instruction on how to do this properly!).

What did I forget?
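
To make the proposal above concrete, here is a minimal sketch in Python with the built-in sqlite3 module. The `part_attribute` table, its columns, and both function names are hypothetical and are not the actual Arctos schema. The sketch only shows the idea that each entered condition becomes a dated attribute with a determiner, and that a part's condition history is the ordered list of those attributes.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
# Hypothetical attribute table; names are assumptions for illustration only.
conn.execute("""
    CREATE TABLE part_attribute (
        attribute_id INTEGER PRIMARY KEY,
        part_id INTEGER NOT NULL,
        attribute_type TEXT NOT NULL,
        attribute_value TEXT NOT NULL,
        determined_date TEXT NOT NULL,
        determiner_id INTEGER NOT NULL
    )
""")


def record_condition(conn, part_id, value, user_id, on=None):
    """Store an entered condition as an attribute stamped with date and user."""
    stamp = (on or date.today()).isoformat()
    conn.execute(
        "INSERT INTO part_attribute "
        "(part_id, attribute_type, attribute_value, determined_date, determiner_id) "
        "VALUES (?, 'condition', ?, ?, ?)",
        (part_id, value, stamp, user_id),
    )


def condition_history(conn, part_id):
    """Return all condition assertions for one part, oldest first."""
    return conn.execute(
        "SELECT determined_date, attribute_value, determiner_id "
        "FROM part_attribute "
        "WHERE part_id = ? AND attribute_type = 'condition' "
        "ORDER BY determined_date, attribute_id",
        (part_id,),
    ).fetchall()


record_condition(conn, 10, "good", user_id=1, on=date(2022, 8, 30))
record_condition(conn, 10, "fair", user_id=2, on=date(2023, 9, 7))
history = condition_history(conn, 10)
no_history = condition_history(conn, 11)  # a part with no assertions at all
```

A part with no condition assertions simply returns an empty history, so nothing like "not recorded" has to be typed in.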

@AJLinn

AJLinn commented Aug 30, 2022

There is no more "condition" field

Everything currently there gets migrated to the condition report attribute.

Wait, what? Is that what's actually going to happen? Currently UAM:EH uses that field for an overall statement of the condition (Excellent, good, fair, poor) and then we do the detailed condition report as a part attribute. Is the plan to totally get rid of the condition field?
https://arctos.database.museum/guid/UAM:EH:UA2007-010-0111 for an example of this in detail.

@Jegelewicz
Member

This is just a proposal. Clearly we will need to discuss and decide how best to do this.

@AJLinn

AJLinn commented Aug 30, 2022

If it gets rid of some of the columns in the parts table I'd be okay with getting rid of it though. We can change as long as the info in there gets migrated into a condition report part attribute. It would save me from having to write "not recorded" in every single media condition field!

@campmlc

campmlc commented Aug 30, 2022

I like this idea in general - I think it could work, especially if @AJLinn can get what she needs!
@jldunnum

@dustymc
Contributor Author

dustymc commented Aug 31, 2022

I also like the idea in general - it's a simplification, one less way to specify condition.

save me from having to write

Yes, this would allow for any number of assertions, and zero is still a number.

@Jegelewicz
Member

I'm guessing this should go to the AWG now.

@Jegelewicz added the Priority-High (Needed for work) label on Aug 31, 2022
@Jegelewicz
Member

AWG asks that we be able to create new condition attribute(s) from the loan form.

@Jegelewicz modified the milestones: Next AWG Meeting, Next Task on Sep 1, 2022
@dustymc
Contributor Author

dustymc commented Sep 1, 2022

I still have questions before this can go to Next Task.

FOR PARTS:

    coll_object.entered_person_id (or not?)
    coll_object.coll_object_entered_date (or not?)
    coll_object.last_edited_person_id (or not?)
    coll_object.last_edit_date (or not?)

Do we need to record entered and edited for parts? (My vote: Nope, I don't think anyone's ever asked for that, I think that can be moved completely to cataloged items.)

coll_object.flags

Actually can we just get rid of this? Maybe move whatever's there to https://arctos.database.museum/info/ctDocumentation.cfm?table=ctattribute_type#processing_history? If anyone wants something similar for parts they can request a new part attribute.

coll_object.coll_obj_disposition
coll_object.lot_count
coll_object_remark.coll_object_remarks 

Is there any reason to do anything new and different with these? (I don't see one.)

coll_object_remark.disposition_remarks

I think this hasn't been exposed and holds nothing of value. I suggest finding a place (https://arctos.database.museum/info/ctDocumentation.cfm?table=ctspecpart_attribute_type#location perhaps) for whatever's in there, if anything, and dropping the concept.

coll_object_remark.coll_object_remarks

rename to 'part_remark' (unless someone has better ideas)


FOR CATALOGED ITEMS

    coll_object.entered_person_id
    coll_object.coll_object_entered_date
    coll_object.last_edited_person_id (or not? Flat edit history is about the same thing with more info)
    coll_object.last_edit_date (or not? Flat edit history is about the same thing with more info)

Do we still need that, do we need something different, ????

coll_object_remark.coll_object_remarks

rename to - uhhh - help! cataloged_item_remark?? record_remark? item_remark??


TODO

Figure out what's in condition history, make a lot of attributes or concatenate it with current condition into the one new attribute or ??? Need to pull data, then discuss.

(ditto disposition_remarks, see above)


CLARIFICATION

AWG asks that we be able to create new condition attribute(s) from the loan form.

I think it'll just be any new part attribute(s) - https://github.com/ArctosDB/BackBurner/issues/7 should use https://arctos.database.museum/info/ctDocumentation.cfm?table=ctspecpart_attribute_type#tissue_quality and/or https://arctos.database.museum/info/ctDocumentation.cfm?table=ctspecpart_attribute_type#storage_temperature instead of the generic descriptive option, for example.

@dustymc modified the milestones: Next Task, Needs Discussion on Sep 1, 2022
@Jegelewicz
Member

FOR PARTS:

coll_object.entered_person_id (or not?)
coll_object.coll_object_entered_date (or not?)
coll_object.last_edited_person_id (or not?)
coll_object.last_edit_date (or not?)

Do we need to record entered and edited for parts? (My vote: Nope, I don't think anyone's ever asked for that, I think that can be moved completely to cataloged items.)

HMMMM - I say we keep it because eventually someone is going to ask "who the heck added 500 parts to my collection and why?".

@Jegelewicz
Member

coll_object.flags

Actually can we just get rid of this? Maybe move whatever's there to https://arctos.database.museum/info/ctDocumentation.cfm?table=ctattribute_type#processing_history? If anyone wants something similar for parts they can request a new part attribute.

What's in there now?

@Jegelewicz
Member

coll_object.coll_obj_disposition
coll_object.lot_count
coll_object_remark.coll_object_remarks

Is there any reason to do anything new and different with these? (I don't see one.)

Agree - these are fine as they are.

@Jegelewicz
Member

coll_object_remark.disposition_remarks

I think this hasn't been exposed and holds nothing of value. I suggest finding a place (https://arctos.database.museum/info/ctDocumentation.cfm?table=ctspecpart_attribute_type#location perhaps) for whatever's in there, if anything, and dropping the concept.

What is in there and how did it get there?

@dustymc
Contributor Author

dustymc commented Sep 1, 2022

keep it because eventually

It's a fair bit of data for which we pay in performance and storage. Definitely a cost we can pay if we must, but I'd prefer not to get too theoretical here. The flat edit history mentioned in the initial comment is more than capable of getting at someone adding 500 parts.

flags

@Jegelewicz
Member

Jegelewicz commented Sep 1, 2022

coll_object_remark.coll_object_remarks

rename to 'part_remark' (unless someone has better ideas)

Yes, please. Do we actually have two things called coll_object_remark.coll_object_remarks? Should all the part stuff be renamed so that it is less confusing? (Maybe that's crazy though...)

@Jegelewicz
Member

FOR CATALOGED ITEMS

coll_object.entered_person_id
coll_object.coll_object_entered_date
coll_object.last_edited_person_id (or not? Flat edit history is about the same thing with more info)
coll_object.last_edit_date (or not? Flat edit history is about the same thing with more info)

Do we still need that, do we need something different, ????

So this stuff is about the ENTIRE record, correct? I am in favor of only looking at last edited in one way, so as long as everyone can figure out who edited stuff and when with flat edit history, then let's simplify?

@dustymc
Contributor Author

dustymc commented Sep 1, 2022

Do we actually have two things called coll_object_remark.coll_object_remarks?

No, but it's not flat. (So there's one, or there's at least one, or there's as many as you need, depending on your perspective.)

Should all the part stuff be renamed

That's what @ewommack suggested. It would add work, maybe quite a bit, but there will probably never be a better time.

@dustymc
Contributor Author

dustymc commented Aug 30, 2023

The obstacles are gone, so I'm proceeding with this. There's important stuff in the first comment.

@dustymc added the Priority - Wildfire Potential label on Sep 5, 2023
@dustymc
Contributor Author

dustymc commented Sep 7, 2023

AWG - no overwhelming support for doing anything clever with condition, move it straight across

@dustymc
Contributor Author

dustymc commented Sep 7, 2023

AWG: last edit isn't useful, cache and drop (and maybe we need a focus group to find some actually-useful 'edited').
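
One way to read "cache and drop" is sketched below in Python with the built-in sqlite3 module. This is an illustration under assumptions, not the migration that was actually run: the cache table name `temp_last_edit`, the sample rows, and the rebuild-and-rename approach are all hypothetical. Only the coll_object column names come from this issue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Miniature stand-in for coll_object with invented sample rows.
conn.executescript("""
    CREATE TABLE coll_object (
        collection_object_id INTEGER PRIMARY KEY,
        entered_person_id INTEGER,
        coll_object_entered_date TEXT,
        last_edited_person_id INTEGER,
        last_edit_date TEXT
    );
    INSERT INTO coll_object VALUES (20, 1, '2010-01-05', 2, '2021-03-14');
    INSERT INTO coll_object VALUES (21, 1, '2010-01-06', NULL, NULL);
""")


def cache_and_drop_last_edit(conn: sqlite3.Connection) -> None:
    """Copy last-edit data aside, then rebuild the table without it."""
    conn.executescript("""
        -- 1. Cache: keep whatever last-edit data exists in a side table.
        CREATE TABLE temp_last_edit AS
        SELECT collection_object_id, last_edited_person_id, last_edit_date
        FROM coll_object
        WHERE last_edit_date IS NOT NULL;

        -- 2. Drop: rebuild with only the columns being kept.
        --    (CREATE TABLE AS does not carry over keys or constraints;
        --    a real migration would need to recreate them.)
        CREATE TABLE coll_object_slim AS
        SELECT collection_object_id, entered_person_id, coll_object_entered_date
        FROM coll_object;

        DROP TABLE coll_object;
        ALTER TABLE coll_object_slim RENAME TO coll_object;
    """)


cache_and_drop_last_edit(conn)
remaining_columns = [row[1] for row in conn.execute("PRAGMA table_info(coll_object)")]
cached = conn.execute("SELECT * FROM temp_last_edit").fetchall()
```

The side table can then be exported and archived before it is removed as well.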

@dustymc
Contributor Author

dustymc commented Sep 21, 2023

I am splitting loan_item.collection_object_id into loan_item.part_id and loan_item.cataloged_item_id. Everything involving loan items should be checked.
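
A minimal sketch of what such a split could look like, using Python's built-in sqlite3 as a stand-in. The miniature tables, the `transaction_id` column on loan_item, the `loan_item_split` table name, and the sample rows are assumptions for illustration. The check at the end is one hypothetical way to find loan items that resolve to neither a part nor a cataloged item, or to both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented miniature schema; real tables have many more columns.
conn.executescript("""
    CREATE TABLE specimen_part (collection_object_id INTEGER PRIMARY KEY);
    CREATE TABLE cataloged_item (collection_object_id INTEGER PRIMARY KEY);
    CREATE TABLE loan_item (transaction_id INTEGER, collection_object_id INTEGER);
    INSERT INTO specimen_part VALUES (10);
    INSERT INTO cataloged_item VALUES (20);
    INSERT INTO loan_item VALUES (1, 10);  -- a part on loan
    INSERT INTO loan_item VALUES (1, 20);  -- a cataloged item on loan
    INSERT INTO loan_item VALUES (2, 99);  -- points at nothing known
""")


def split_loan_items(conn: sqlite3.Connection) -> None:
    """Resolve each generic collection_object_id into part_id or cataloged_item_id."""
    conn.executescript("""
        CREATE TABLE loan_item_split AS
        SELECT li.transaction_id,
               li.collection_object_id AS old_collection_object_id,
               p.collection_object_id  AS part_id,
               c.collection_object_id  AS cataloged_item_id
        FROM loan_item li
        LEFT JOIN specimen_part p
          ON p.collection_object_id = li.collection_object_id
        LEFT JOIN cataloged_item c
          ON c.collection_object_id = li.collection_object_id;
    """)


split_loan_items(conn)
split_rows = conn.execute(
    "SELECT * FROM loan_item_split "
    "ORDER BY transaction_id, old_collection_object_id"
).fetchall()

# Rows needing review: exactly one of the two new keys should be set.
unresolved = conn.execute(
    "SELECT transaction_id, old_collection_object_id FROM loan_item_split "
    "WHERE (part_id IS NULL) = (cataloged_item_id IS NULL)"
).fetchall()
```

Any row returned by the final query would need manual review before the old column is dropped.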

@dustymc
Contributor Author

dustymc commented Sep 22, 2023

Unless someone has immediate and compelling reasons not to, I think I'll rename some of the coll_obj.... things - starting with https://arctos.database.museum/info/ctDocumentation.cfm?table=ctcoll_obj_disp - while I'm rebuilding kinda everything anyway.

@campmlc

campmlc commented Sep 22, 2023

Is this related to the Parts View/Download issue, where the view version shows Part Remark but the download version calls the same content Collection Object Remarks?
Any clarification on these sounds like a good idea.

@dustymc
Contributor Author

dustymc commented Sep 25, 2023

Essentially everything - anything that touches or involves parts, catalog records, or loans - has been revised in test (https://arctos-test.tacc.utexas.edu) and should be tested.

@Jegelewicz
Member

What's the deal with the little box after the part name in a loan?

[image]

https://arctos-test.tacc.utexas.edu/loanItemReview.cfm?transaction_id=21112061

@dustymc
Contributor Author

dustymc commented Sep 26, 2023

[Screenshot 2023-09-26 at 07:46:14]

@Jegelewicz
Member

fair - possible to put some standard thing when there are no attributes so it doesn't look weird?

"no attributes"

OR just leave it off if there are none?

@dustymc
Contributor Author

dustymc commented Sep 26, 2023

Definitely not from here...

"Looks weird" is a feature - it should be obvious when there are no attributes vs. when they're not shown in this format.

@dustymc
Contributor Author

dustymc commented Oct 6, 2023

Per #4707 (comment) here's last edit data for cataloged items:

temp_last_edit_catitem.zip

@dustymc
Contributor Author

dustymc commented Oct 6, 2023

Per #4707 (comment) here's last edit data for parts:

Err, nope, it's too big and won't attach, it's on my drive and the test VM, ping me ASAP if anyone wants a copy. @mkoo maybe you have a place for a 30MB file?

@Jegelewicz
Member

place for a 30MB file

Upload it to TACC? Sounds weird, but isn't that what we have storage there for?

@dustymc
Contributor Author

dustymc commented Oct 6, 2023

Upload it to TACC?

I think it's all garbage (that's why we're unloading it!) and don't really want to put that in archival storage....

@dustymc
Contributor Author

dustymc commented Oct 6, 2023

disposition_remarks cache:

temp_part_dispn_remark.zip

5 participants