coordinate transform utils have misleading mnemonics #157

Closed
bertsky opened this issue Aug 10, 2018 · 5 comments

Comments

@bertsky
Collaborator

bertsky commented Aug 10, 2018

The word points is used for very different ways of representing 8 coordinates for 4 corners:

  • points_from_xywh: a list of integer coordinates
  • points_from_x0y0x1y1, xywh_from_points and polygon_from_points: a string with space and comma as delimiters

Can we please rename them to be more precise and consistent? I suggest

  • points for the string (because it is also the name of PAGE's CoordType attribute),
  • x0y0x1y0x1y1x0y1 for a list of strings (because it is straightforward and analogous to x0y0x1y1),
  • coords or bbox for a list of integers.

Or maybe tltrbrbl instead of x0y0x1y0x1y1x0y1 (but also tlbr instead of x0y0x1y1).

@bertsky
Collaborator Author

bertsky commented Aug 10, 2018

Sorry, I got all confused :-)

@bertsky bertsky closed this as completed Aug 10, 2018
@kba
Member

kba commented Aug 14, 2018

It is confusing and I'm open to changing names to make it less confusing.

points is a string of space-separated coordinate pairs, with the x and y of each pair separated by a comma. It is usable in PAGE XML or hOCR attributes.

xywh is the top-left corner coordinate plus width and height, serialized as a dictionary with keys x, y, w, h. Used e.g. by kraken and, kind of, by tesseract (w and h in component images).

x0y0x1y1 is the representation of a bbox used in ocropy. tlbr instead of x0y0x1y1 is a good idea.
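
To make the three representations concrete, here is a minimal sketch of how they relate. The helper names (xywh_to_points, points_to_xywh, xywh_to_tlbr) are hypothetical and chosen only for this illustration; they are not the functions in ocrd_utils. The clockwise-from-top-left corner order is also an assumption.

```python
def xywh_to_points(box):
    """Build a points string for the four corners of an xywh dict.

    Corner order (clockwise from top-left) is an assumption of this sketch.
    """
    x, y, w, h = box["x"], box["y"], box["w"], box["h"]
    corners = [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]
    return " ".join(f"{cx},{cy}" for cx, cy in corners)


def points_to_xywh(points):
    """Reduce a points string to the xywh dict of its bounding box."""
    pairs = [pair.split(",") for pair in points.split()]
    xs = [int(px) for px, _ in pairs]
    ys = [int(py) for _, py in pairs]
    return {"x": min(xs), "y": min(ys),
            "w": max(xs) - min(xs), "h": max(ys) - min(ys)}


def xywh_to_tlbr(box):
    """Return the (x0, y0, x1, y1) corners, i.e. top-left and bottom-right."""
    return (box["x"], box["y"], box["x"] + box["w"], box["y"] + box["h"])


box = {"x": 100, "y": 50, "w": 200, "h": 80}
points = xywh_to_points(box)    # "100,50 300,50 300,130 100,130"
tlbr = xywh_to_tlbr(box)        # (100, 50, 300, 130)
```

For an axis-aligned rectangle, going from xywh to points and back returns the same dict; for an arbitrary polygon, points_to_xywh only keeps the bounding box.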

@kba kba reopened this Aug 14, 2018
@wrznr
Contributor

wrznr commented Jul 18, 2019

@bertsky Will this be solved with your new common functionalities?

@bertsky
Collaborator Author

bertsky commented Jul 18, 2019

It will become somewhat better, because all possible conversion options for all the different formats can be seen, and the docstrings are even more precise. But I have not touched the docstrings in ocrd_utils yet.

We now have the following mnemonics:

  • polygon – numeric polygon xy-pair list,
  • bbox – numeric bbox tuple,
  • x0y0x1y1 – string bbox tuple (this name is not ideal),
  • xywh – numeric bbox dict,
  • points – PAGE polygon xy-pair string

Routines can either convert (without loss of information) or construct (with loss of information or loss of detail).
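
As a rough illustration of the convert/construct distinction, the sketch below uses hypothetical helpers (polygon_to_points, points_to_polygon, polygon_to_bbox, bbox_to_polygon); these names and the (x0, y0, x1, y1) ordering of the bbox tuple are assumptions for this example, not the actual API. Going between polygon and points is a conversion, since the round trip is exact. Going from polygon to bbox is a construction, since the polygon's shape cannot be recovered afterwards.

```python
def polygon_to_points(polygon):
    """Convert a numeric xy-pair list to a PAGE-style points string."""
    return " ".join(f"{x},{y}" for x, y in polygon)


def points_to_polygon(points):
    """Convert a points string back to a numeric xy-pair list."""
    return [[int(v) for v in pair.split(",")] for pair in points.split()]


def polygon_to_bbox(polygon):
    """Construct the enclosing bbox tuple; the polygon's shape is lost."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def bbox_to_polygon(bbox):
    """Construct a 4-corner rectangle polygon from a bbox tuple."""
    x0, y0, x1, y1 = bbox
    return [[x0, y0], [x1, y0], [x1, y1], [x0, y1]]


triangle = [[100, 50], [300, 50], [200, 130]]

# Conversion: the round trip reproduces the input exactly.
roundtrip = points_to_polygon(polygon_to_points(triangle))

# Construction: the bbox only encloses the triangle, so turning it back
# into a polygon yields a rectangle, not the original shape.
rectangle = bbox_to_polygon(polygon_to_bbox(triangle))
```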

@kba
Member

kba commented Aug 23, 2019

common.py has been merged into core; #268 and #254 should fix this.

@kba kba closed this as completed Aug 23, 2019