Add unrestricted custom elements #19

Open
bertsky opened this issue Nov 22, 2019 · 3 comments


bertsky commented Nov 22, 2019

I don't know if this is related to #18, but generally people have been trying to add annotations to (earlier versions of) PAGE's PcGts, and – due to the lack of support for a free sub-namespace – have spawned their own namespace.

PAGE-XML does offer @custom and @comments attributes everywhere and the predefined UserDefinedType, but this is not nearly as powerful/expressive as an arbitrary XML subtree.

In comparison, ALTO-XML has XmlData for that purpose, and it uses:

<xsd:any namespace="##any" processContents="lax" maxOccurs="unbounded"/>

One example of where this could be useful is for holding an OCR hypotheses lattice without changing the namespace.
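For illustration, a minimal sketch of how such a wildcard could be wrapped in a container element. The names `XmlData` and `XmlDataType` are borrowed from ALTO as placeholders here; nothing of the sort exists in the PAGE schema, and the position in the content model is left open:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified">

  <!-- Hypothetical container type: not part of PAGE-XML. -->
  <xsd:complexType name="XmlDataType">
    <xsd:sequence>
      <!-- Any number of child elements from any namespace.
           "lax" means a child is validated only if a declaration
           for it can be found; otherwise it just has to be
           well-formed. -->
      <xsd:any namespace="##any" processContents="lax"
               minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:element name="XmlData" type="XmlDataType"/>

</xsd:schema>
```

With `namespace="##other"` instead of `##any`, the wildcard would accept only elements from outside the schema's own target namespace, which might be a more conservative choice.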


bertsky commented Jul 1, 2020

@chris1010010 would you care for a PR?

@chris1010010

I think the majority vote was not to use 'any' for the time being, but I'll raise this again with the others.
How about 'anyAttribute' for selected elements? Would that be useful?
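As a rough sketch of what the `anyAttribute` alternative would look like in XSD (the type name `ExampleRegionType` and the target namespace are made up for illustration and do not come from the PAGE schema):

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="urn:example:page-sketch"
            elementFormDefault="qualified">

  <!-- Hypothetical type: stands in for whichever elements are selected. -->
  <xsd:complexType name="ExampleRegionType">
    <xsd:attribute name="id" type="xsd:ID" use="required"/>
    <!-- Accept additional attributes from any foreign namespace.
         Attribute wildcards only admit flat string values,
         not nested XML subtrees. -->
    <xsd:anyAttribute namespace="##other" processContents="lax"/>
  </xsd:complexType>

</xsd:schema>
```

An instance could then carry something like `ext:confidence="0.93"` on such an element, but could not hold structured content the way an element wildcard can.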


bertsky commented Jul 2, 2020

> How about 'anyAttribute' for selected elements? Would that be useful?

What do you mean?

What ALTO does is allow (any number of) XmlData elements under TagType (i.e. /alto/Tags/LayoutTag|StructureTag|RoleTag|NamedEntityTag|OtherTag/XmlData), which can then have arbitrary child elements from an arbitrary namespace.

For PAGE, we would still have to decide under which path such free-content elements make the most sense. Maybe Labels (which is available under Metadata, under Page, and under all regions)?
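To make that concrete, here is a sketch of what an instance might look like if a free-content element were allowed under Labels. Everything below `Labels` other than `Label` is an assumption: the `XmlData` element does not exist in PAGE, and the `lat:` namespace and its elements are invented purely to illustrate an OCR hypotheses lattice.

```xml
<Labels xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Label value="ocr-lattice" type="annotation"/>
  <!-- Hypothetical free-content container (not in the PAGE schema). -->
  <XmlData>
    <!-- Invented foreign-namespace payload: a tiny hypotheses lattice. -->
    <lat:lattice xmlns:lat="urn:example:ocr-lattice">
      <lat:arc from="0" to="1" text="rn" weight="0.6"/>
      <lat:arc from="0" to="1" text="m" weight="0.4"/>
    </lat:lattice>
  </XmlData>
</Labels>
```

Because the payload keeps its own namespace, the PAGE namespace itself would not need to change.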
