Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Determine the right logic around updating and merging annotations #375

Closed
Mr0grog opened this issue Sep 5, 2018 · 14 comments
Closed

Determine the right logic around updating and merging annotations #375

Mr0grog opened this issue Sep 5, 2018 · 14 comments

Comments

@Mr0grog
Copy link
Member

Mr0grog commented Sep 5, 2018

I’m separating out a conversation around the business logic of annotations here from #114 so we can merge the useful API fixes in that PR without it being held up by a discussion that was never concluded.

The principle questions here are:

  • Can annotations be updated? (Used to be no, now is yes, right answer still confused.)
  • How do annotations get merged together into the “current” or “total” annotation, based on what happens when they are updated (or not)?

Below is the discussion so far from #114

@Mr0grog
Copy link
Member Author

Mr0grog commented Sep 5, 2018

@Mr0grog:

When we first added annotations, the contract was that they couldn’t be edited. If a user annotated the same change twice, there would be two annotations (the Change.current_annotation property is all the existing annotations merged in chronological order, so not all annotations need have the same fields). We thought this might also be nice for auditing purposes.

When we started thinking about the UI, we thought it might be good not to need an explicit save button and instead save automatically whenever the user interacts with the annotation form. That would result in a huge number of annotations on any given change, though. To solve that, we switched to having only one annotation per user, where a creating an annotation replace’s the user’s previous annotation, if any (#48).

In the original design (append-only/non-editable) it didn’t make sense to accept an empty annotation with no data because that would have no effect. However, in fixing this bug, I realized POSTing a non-empty annotation does have a meaningful effect now—because it can replace an existing annotation, it would zero out your previous annotations. It also broke a nice contract we originally had, where an annotation is simply the set of changes you wish to make to the current annotation. Now a user (or the UI serving them) needs to be careful to include all the fields previously used in that user’s annotation alongside all those changes.

A good concrete example of the above is that the analysis process is currently two steps:

  1. Users fill out a bunch of checkboxes for various aspects of the change in step one, and
  2. In step two they do a more detailed look with tagging and written descriptions of the change.

If the same user did both rounds, the second-round code would have to include all the fields and data about the first round alongside the form data for the second round that the user is actually working with.

I’m feeling like that might be a real problem I overlooked before. Some approaches I’m thinking of:

  1. Still make POSTing an annotation edit the user’s existing annotation, but treat it as new data to merge in, rather than replacing the old annotation. This still has a subtle side effect, but maybe not a big deal—if the sequence of annotations is:

    // Annotation sequence:
    User A: {field1: 'value1-1', field2: 'value2-1'}
    User B: {field1: 'value1-2',                     field3: 'value3-1'}
    User A: {                                        field3: 'value3-2'}
    
    expect= {field1: 'value1-2', field2: 'value2-1', field3: 'value3-2'}
    actual= {field1: 'value1-1', field2: 'value2-1', field3: 'value3-2'}
                     ^^^^^^^^^^
    // Actual `current_annotation` has a different `field1` value because
    // editing makes User A’s annotation newer than User B’s
  2. Edit a user’s existing annotation only if it is the newest annotation (and do so by merging instead of replacing, as above), otherwise, if a different user has since added an annotation, create a new annotation. In the above example, you’d wind up with 3 annotations (instead of two) and the problem would be resolved.

  3. Just don’t worry about it ¯\_(ツ)_/¯

  4. Something else?

@Mr0grog
Copy link
Member Author

Mr0grog commented Sep 5, 2018

@Mr0grog:

Leaning towards (2) above. It's a little more complicated to reintroduce the idea of multiple annotations per user, but setting up that idea lets us split off annotations on other conditions in the future, e.g. tracking the user agent an annotation was made with.

@Mr0grog
Copy link
Member Author

Mr0grog commented Sep 5, 2018

@danielballan:

Can the load the current annotation data into its form so that any additional saves will represent changes based on the original state?

@Mr0grog
Copy link
Member Author

Mr0grog commented Sep 5, 2018

@Mr0grog:

@danielballan Are you asking if the UI can? Or saying that the DB API here should always treat a saved annotation as the submitted fields + the current merged state of all existing annotations, so in the example above, you’d have…

User A (posted):        {field1: 'value1-1', field2: 'value2-1'}
Result in DB:
    User A:             {field1: 'value1-1', field2: 'value2-1'}
    current_annotation: {field1: 'value1-1', field2: 'value2-1'}

User B (posted):        {field1: 'value1-2',                     field3: 'value3-1'}
Result in DB:
    User A:             {field1: 'value1-1', field2: 'value2-1'}
    User B:             {field1: 'value1-2', field2: 'value2-1', field3: 'value3-1'}
    current_annotation: {field1: 'value1-1', field2: 'value2-1'}

User A (posted):        {                                        field3: 'value3-2'}
Result in DB:
    User B:             {field1: 'value1-2', field2: 'value2-1', field3: 'value3-1'}
    User A:             {field1: 'value1-2', field2: 'value2-1', field3: 'value3-2'}
    current_annotation: {field1: 'value1-2', field2: 'value2-1', field3: 'value3-2'}

?

If the former, I think the UI already does this. I wasn’t comfortable placing a requirement on auto-analysis tools or anything else in future needing to load the Change record and its current_annotation before posting, though.

If the latter, that would work, but it feels kinda weird to me; we’d no longer really be accurately recording what a someone or some machine actually said about a change (which I kinda felt like was the main reason for keeping annotation records around instead of just modifying current_annotation).

I think it’s also unclear in the example whether User A had intended to say {field1: 'value1-2'} (did they see B’s annotation and assent to it, or are they just sending more in addition to what they already sent?), but we record them in the DB as having done so.

@Mr0grog
Copy link
Member Author

Mr0grog commented Sep 5, 2018

@danielballan:

I see where the confusion is. I imagined that each unique would start with a blank slate, with no annotation. Users could update their own annotations, but annotations from different users would be non-interfering.

Some admin view would present all the annotations.

The idea of a "current_annotation" has always seemed muddy to me, but I don't think it's possible in general to merge annotations together.

@Mr0grog
Copy link
Member Author

Mr0grog commented Sep 5, 2018

@Mr0grog:

Ha. I feel like current_annotation is the one that is primarily useful, while the full list of annotation is more of an audit tool.

e.g. Tom does a first-round analysis and checks off some boxes and makes some simple notes, then marks as 0.5 significance; Jane and Alice then each do a second-round deeper analysis (if it’s worth the deeper look, it might be worth two pairs of eyes)—they each add some tags, maybe one corrects an ealier checkbox that was mischecked, etc.

If you want to know what we as a team have to say about that page, you’d want to check the current_annotation. Each individual annotation only has a piece of the picture.

@Mr0grog
Copy link
Member Author

Mr0grog commented Sep 5, 2018

@danielballan:

Got it. I think my concept came from imagining (prematurely) the day when we could have a horde of semi-trusted volunteer contributors working independently, not a small stable of trusted volunteers coordinating closely.

@Mr0grog
Copy link
Member Author

Mr0grog commented Sep 5, 2018

@danielballan:

I don't see an obviously-correct approach here. I will think on this tonight.

@Mr0grog
Copy link
Member Author

Mr0grog commented Sep 5, 2018

------------------ (End of discussion so far) -------------------

@stale
Copy link

stale bot commented Sep 7, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in seven days if no further activity occurs. If it should not be closed, please comment! Thank you for your contributions.

@stale stale bot added the stale label Sep 7, 2019
@Mr0grog
Copy link
Member Author

Mr0grog commented Sep 10, 2019

@SYU15 is going to be bringing annotations back to the UI soon, so we will need to tackle these questions in the next month or two.

@Mr0grog
Copy link
Member Author

Mr0grog commented Oct 8, 2019

Additional note from @mrotondo’s work on edgi-govdata-archiving/web-monitoring-processing#487:

Should posting an annotation include a flag indicating whether to merge vs. replace the annotation, similar to how imports work? (I’m thinking merge would be the default.)

POST /api/v0/pages/{page_id}/changes/{change_id}/annotations?update=replace

I’m imagining “merge” as the default since multiple apps/services/clients might provide tooling to annotate different kinds of data, and we wouldn’t want them accidentally blowing away fields from each other that aren’t common.

You could post {"field_name": null} to explicitly delete a field when in merge mode:

Assuming all these POSTs are merges, not replacements...
User A POSTs:        {field1: 'value1', field2: 'value2'}
User A's annotation: {field1: 'value1', field2: 'value2'}

User A POSTs:        {                  field2: 'value3'}
User A's annotation: {field1: 'value1', field2: 'value3'}

User A POSTs:        {field1: null                      }
User A's annotation: {                  field2: 'value3'}

@mrotondo
Copy link

Per an offline conversation with Rob, I support the idea of keeping one annotation per user per change, and merging new values into from a user into their extant annotation (or creating one). Leaving aside questions of auditing, presentation and search (all of which seem to have reasonable answers even with merged annotations), this feels like the most semantically appropriate representation of the intentions of multiple users each annotating a single change.

@stale
Copy link

stale bot commented Apr 8, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in seven days if no further activity occurs. If it should not be closed, please comment! Thank you for your contributions.

@stale stale bot added the stale label Apr 8, 2020
@stale stale bot closed this as completed Apr 16, 2020
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants