Memory impact of span annotations should be improved: refactor data model #41

Closed
proycon opened this issue Aug 9, 2016 · 8 comments

Comments

@proycon
Owner

proycon commented Aug 9, 2016

Span annotations are duplicated per word, which causes unnecessary memory overhead. Ideally, some type of reference should be used instead.

@proycon
Owner Author

proycon commented Nov 18, 2016

Major refactoring in both foliadocserve and flat is needed to arrive at a better model. This is becoming more of a necessity for issue #3, subissues e and f.

@proycon changed the title from "Memory impact of span annotations can be improved" to "Memory impact of span annotations should be improved: refactor data model" Nov 18, 2016
@proycon
Owner Author

proycon commented Nov 18, 2016

This is a brainstorm comment for a better data model (subject to heavy editing):

CURRENT model:

  • annotations
    • $word_id
      • $tokenannotation
        • children
          • $higherorderannotation
      • $spanannotation
        • children
          • $higherorderannotation
      • self
        • type
        • parent
        • set
        • etc.

Span annotations are duplicated per target word, and children often contains more than it needs to as well.
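
To make the duplication concrete, here is a minimal Python sketch. It is not taken from the flat or foliadocserve code; the IDs and field names (`targets`, `entity/ner`, `entity.1`) are made up for illustration. It contrasts one full copy of a span annotation per target word with a single stored annotation that the words reference by ID.

```python
import copy

# Hypothetical span annotation covering three words (illustrative fields only).
entity = {"type": "entity", "set": "ner", "class": "person",
          "targets": ["w.1", "w.2", "w.3"]}

# CURRENT model: every target word holds its own full copy of the span annotation.
current = {
    word_id: {"self": {"type": "w"}, "entity/ner": copy.deepcopy(entity)}
    for word_id in entity["targets"]
}

# Reference-based alternative: the annotation is stored once, words keep only IDs.
annotations = {"entity.1": entity}
words = {
    word_id: {"type": "w", "annotations": ["entity.1"]}
    for word_id in entity["targets"]
}

# Count how many separate objects represent the same span annotation.
copies = [current[w]["entity/ner"] for w in entity["targets"]]
distinct_copies = len({id(c) for c in copies})
stored_once = len(annotations)
```

In the first layout the number of copies grows with the length of the span; in the second only a short list of IDs is repeated per word.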

NEW proposal

All structure elements will be first-level citizens under structure.
All token/span annotations will be first-level citizens under annotations.

  • structure

    • $structure_id (everything previously under self goes here directly)
      • structure => links to IDs for nested structure elements
      • annotations => links to extended IDs for token annotations and span annotations that refer to this element
      • spanlayers => links to IDs of span annotation elements that are embedded in layers at this level (disjoint from those in annotations!)
      • children
        • $higherorderannotation (and only this, the rest is in annotations)
  • annotations

    • $annotation_ext_id (the extended ID is the FoLiA ID if available, e.g. for span annotations; otherwise the $parentid/$type/$set conjunction, e.g. for token annotations)
      • annotations => links to IDs for nested annotations
      • children
        • $higherorderannotation (and only this, the rest is in annotations)
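
The proposal above could be sketched roughly as follows in Python. This is an illustration under assumptions, not the actual implementation: the helper names (`extended_id`, `add_structure`, `add_annotation`), the `targets` field, and the example IDs and sets are all hypothetical. Only the top-level keys (`structure`, `annotations`, `spanlayers`, `children`) and the extended-ID rule come from the description.

```python
structure = {}    # $structure_id -> structure element
annotations = {}  # $annotation_ext_id -> token or span annotation


def extended_id(annotation, parent_id):
    """FoLiA ID if the annotation has one, else $parentid/$type/$set."""
    if annotation.get("id"):
        return annotation["id"]
    return "/".join([parent_id, annotation["type"], annotation.get("set", "")])


def add_structure(struct_id, type_, parent=None):
    """Register a structure element and link it from its parent."""
    structure[struct_id] = {
        "type": type_, "parent": parent,
        "structure": [],    # IDs of nested structure elements
        "annotations": [],  # extended IDs of annotations referring to this element
        "spanlayers": [],   # IDs of span annotations embedded in layers here
        "children": [],     # higher-order annotations only
    }
    if parent is not None:
        structure[parent]["structure"].append(struct_id)


def add_annotation(annotation, targets, layer_parent=None):
    """Store an annotation once and link it from every target by ID."""
    ext_id = extended_id(annotation, targets[0])
    annotations[ext_id] = dict(annotation, targets=list(targets),
                               annotations=[], children=[])
    for target in targets:
        structure[target]["annotations"].append(ext_id)
    if layer_parent is not None:
        structure[layer_parent]["spanlayers"].append(ext_id)
    return ext_id


# Usage: one sentence with two words, a token annotation and a span annotation.
add_structure("s.1", "s")
for i in (1, 2):
    add_structure(f"s.1.w.{i}", "w", parent="s.1")
pos_id = add_annotation({"type": "pos", "set": "cgn", "class": "N"}, ["s.1.w.1"])
ent_id = add_annotation(
    {"id": "s.1.entity.1", "type": "entity", "set": "ner", "class": "per"},
    ["s.1.w.1", "s.1.w.2"], layer_parent="s.1")
```

The span annotation exists once in `annotations`; both words and the sentence-level layer only hold its ID.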

Correction Handling

Corrections are stored as part of annotations:

  • structural: Is the correction structural or not?
  • new/current/original: links to IDs in annotations or structure (depending on structural)
  • suggestions: lists/tuples with links ...

Annotations or structural elements that are part of a correction have their incorrection attribute set (also applies to corrections in corrections).
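
As a rough illustration of the correction handling described above, the following Python sketch stores a correction in annotations, links its branches by ID, and sets `incorrection` on the members. The helper names (`add_correction`, `resolve`) and the example IDs are assumptions for this sketch; suggestions are left out because the description of their layout is elided.

```python
structure = {"w.1": {"type": "w", "incorrection": None}}
annotations = {
    "pos.new": {"type": "pos", "class": "N", "incorrection": None},
    "pos.old": {"type": "pos", "class": "V", "incorrection": None},
}


def add_correction(corr_id, new=(), current=(), original=(), structural=False):
    """Store a correction and mark every member with its correction ID."""
    annotations[corr_id] = {
        "type": "correction",
        "structural": structural,
        "new": list(new), "current": list(current), "original": list(original),
        "incorrection": None,  # set if this correction is nested in another
    }
    # Members live in structure for structural corrections, else in annotations.
    store = structure if structural else annotations
    for member in [*new, *current, *original]:
        store[member]["incorrection"] = corr_id


def resolve(corr_id, branch):
    """Follow the ID links of one branch ('new', 'current' or 'original')."""
    correction = annotations[corr_id]
    store = structure if correction["structural"] else annotations
    return [store[member] for member in correction[branch]]


add_correction("corr.1", new=["pos.new"], original=["pos.old"])
original_classes = [a["class"] for a in resolve("corr.1", "original")]
```

Because a correction is itself an entry in `annotations`, passing its ID as a member of another correction sets its `incorrection` field in the same way, which covers corrections in corrections.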

@proycon proycon self-assigned this Nov 18, 2016
proycon added a commit that referenced this issue Nov 23, 2016
proycon added a commit that referenced this issue Nov 23, 2016
…editor.js later + removed old loadannotations()
@proycon
Owner Author

proycon commented Nov 24, 2016

Things to verify after refactoring is complete (list subject to further editing):

  • Rendering of insertions and deletions (show original text)
  • Rendering original text of corrections
  • Rendering of insertions
  • String handling (TICCL output)
  • Display of corrections (structural and non-structural)
  • Display of suggestions for correction
  • Break off of large documents (foliadocserve's bookkeeper)
  • Pagination

proycon added a commit that referenced this issue Nov 24, 2016
proycon added a commit that referenced this issue Nov 24, 2016
proycon added a commit that referenced this issue Nov 24, 2016
…e that was retrieved in the latest update round #41
proycon added a commit that referenced this issue Nov 24, 2016
proycon added a commit that referenced this issue Nov 25, 2016
…e been implicit before but got lost in refactoring #41
proycon added a commit that referenced this issue Nov 25, 2016
…... structural corrections broken still
proycon added a commit that referenced this issue Nov 25, 2016
@proycon
Owner Author

proycon commented Nov 25, 2016

Structural corrections are broken

@proycon
Owner Author

proycon commented Nov 25, 2016

"Show text prior to corrections" is broken

proycon added a commit that referenced this issue Nov 25, 2016
…Using this for visualisation of structural corrections as well. (#41)
@proycon
Owner Author

proycon commented Nov 25, 2016

Structural corrections display again.

@proycon
Owner Author

proycon commented Nov 25, 2016

  • "Show text prior to corrections" works again
  • Pagination and abortion (breaking off large documents) still work after the refactoring
  • Corrections and suggestions for correction work in straightforward cases but still need work in advanced cases (to be addressed in issue #91, "[v0.7 development] Fix advanced correction handling after major refactoring")
  • String handling (TICCL output) works partially; corrections in strings are broken

@proycon
Owner Author

proycon commented Nov 29, 2016

Refactoring is complete, but string handling is still broken; continuing in issue #92.

proycon added a commit that referenced this issue Dec 9, 2016
@proycon proycon closed this as completed Dec 29, 2016