More configurable TOML serialization #254

Open
yawkat opened this issue Apr 9, 2021 · 11 comments
Labels
TOML Issue related to TOML format backend

Comments

@yawkat
Member

yawkat commented Apr 9, 2021

Right now, the TOML serializer generates only top-level properties, and inline tables where necessary (i.e. where arrays are used):

abc.foo = 1
abc.bar = 2
abc.xyz = [{foo = 1, bar = 2}]

There are special cases where we could generate normal TOML tables. The example above can be expressed as:

[abc]
foo = 1
bar = 2

[[abc.xyz]]
foo = 1
bar = 2

This format will in many cases be more readable. However, there are problems with non-inline tables that prevent us from emitting them in a streaming generator:

  • Once a table is started, it is impossible to go back to the parent or root table. If an object has scalar properties that we only see after we started a subtable, we have a problem.
  • Array tables only work with arrays of objects. If an array is heterogeneous, but we've already started emitting it as an array table, we have a problem.

These problems can only be solved with knowledge of the tree being serialized, and perhaps even by influencing property order so that scalars always come first. However, because I expect serialization to TOML to be fairly niche – it is a configuration format meant to be written by humans, after all – it is not worth adding machinery to databind to implement this.
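
Both problems above go away once the whole tree is available before anything is written. A minimal sketch, assuming the data has already been buffered as nested `Map`/`List` values with bare keys: it writes each table's scalars and inline arrays first, then the subtable headers, and falls back to an inline array when a list is empty or heterogeneous. `TomlTreeWriterSketch` and its methods are hypothetical and not part of jackson-dataformat-toml.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not part of jackson-dataformat-toml. */
final class TomlTreeWriterSketch {

    /** Renders a fully buffered tree, assuming all keys are valid bare TOML keys. */
    static String render(Map<String, Object> root) {
        StringBuilder out = new StringBuilder();
        writeTable("", root, out);
        return out.toString();
    }

    private static void writeTable(String path, Map<String, Object> table, StringBuilder out) {
        // Pass 1: values that must be written while this table is still the current one.
        for (Map.Entry<String, Object> e : table.entrySet()) {
            Object v = e.getValue();
            if (!(v instanceof Map) && !isArrayOfTables(v)) {
                out.append(e.getKey()).append(" = ").append(inline(v)).append('\n');
            }
        }
        // Pass 2: headers, after which TOML offers no way back to this table.
        for (Map.Entry<String, Object> e : table.entrySet()) {
            String childPath = path.isEmpty() ? e.getKey() : path + "." + e.getKey();
            Object v = e.getValue();
            if (v instanceof Map) {
                header("[" + childPath + "]", out);
                writeTable(childPath, asMap(v), out);
            } else if (isArrayOfTables(v)) {
                for (Object item : (List<?>) v) {
                    header("[[" + childPath + "]]", out);
                    writeTable(childPath, asMap(item), out);
                }
            }
        }
    }

    private static void header(String text, StringBuilder out) {
        if (out.length() > 0) {
            out.append('\n'); // blank line between sections
        }
        out.append(text).append('\n');
    }

    /** Array tables need a non-empty list whose elements are all objects. */
    private static boolean isArrayOfTables(Object v) {
        if (!(v instanceof List) || ((List<?>) v).isEmpty()) {
            return false;
        }
        for (Object item : (List<?>) v) {
            if (!(item instanceof Map)) {
                return false;
            }
        }
        return true;
    }

    /** Inline form, used for scalars and for arrays that cannot become array tables. */
    private static String inline(Object v) {
        if (v instanceof Map) {
            StringBuilder sb = new StringBuilder("{");
            for (Map.Entry<String, Object> e : asMap(v).entrySet()) {
                if (sb.length() > 1) sb.append(", ");
                sb.append(e.getKey()).append(" = ").append(inline(e.getValue()));
            }
            return sb.append('}').toString();
        }
        if (v instanceof List) {
            StringBuilder sb = new StringBuilder("[");
            for (Object item : (List<?>) v) {
                if (sb.length() > 1) sb.append(", ");
                sb.append(inline(item));
            }
            return sb.append(']').toString();
        }
        if (v instanceof String) {
            // Simplified escaping: only backslash and double quote are handled.
            return "\"" + ((String) v).replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        }
        return String.valueOf(v); // numbers and booleans
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object v) {
        return (Map<String, Object>) v;
    }
}
```

Because scalars are written in a separate first pass, the insertion order of the map no longer matters: an `abc` object whose `xyz` array comes before `foo` and `bar` still renders as the `[abc]` / `[[abc.xyz]]` form shown earlier. The cost is that nothing can be streamed until the whole tree is known.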

Another question to consider: What even is the best representation of an object? A short inline object may be more readable than starting a new table that includes the entire path to that object.

A possible solution to this problem would be to add API to TomlGenerator to explicitly start a table. One approach would be to create overloads for writeStartObject and writeStartArray that allow forcing generation as a table. If data is then passed that cannot be represented as a table (see the bullet points above), we would throw an error. While we're at it, we could also add methods for emitting comments.

We could also inspect annotations on the forValue passed to writeStartObject, though this seems like misuse of that parameter, so not a good idea.

yawkat added the TOML label (Issue related to TOML format backend) on Apr 9, 2021
@cowtowncoder
Member

Good questions. A streaming writer can still buffer all of the content and only emit it at the end (the Properties backend does this), but whether that makes sense is an open question.

I suspect that users will be asking for this in one particular case, for what that is worth: when modifying an existing TOML document.

I agree that while annotations might be a nice way to do this, they cannot be accessed by the streaming backend and would need to be passed along by databind somehow. I have thought about this a bit wrt YAML output: there are various styles for textual content (no fewer than five variations), but I have no good ideas yet on how those should be passed. In the case of YAML, a custom String serializer could be defined, but then the issue becomes one of annotation handling. The only module that really supports custom annotations extensively is the XML module, and it is a bit problematic.

One possibility, I think, would be an option that forces use of tables but also assumes strict ordering, such that if a closed table is "re-written", it would throw an exception. This could work in a read-modify-write cycle where ordering is preserved (e.g. via JsonNode), or at least statically forced (POJOs).
That might not be a bad option, I think, with appropriate warnings on the feature used to enable it?
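
The strict-ordering idea could look roughly like this. A minimal sketch, assuming a hypothetical event-style writer (`startTable`, `writeScalar`, `endTable`, none of which exist in TomlGenerator today): it emits real table headers as it goes and throws as soon as a scalar arrives for a table that already has a subtable header written.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch; none of these names exist in jackson-dataformat-toml. */
final class StrictTableWriterSketch {

    /** One open table: its dotted path and whether a subtable header was already written. */
    private static final class Frame {
        final String path;
        boolean childStarted;

        Frame(String path) {
            this.path = path;
        }
    }

    private final StringBuilder out = new StringBuilder();
    private final Deque<Frame> open = new ArrayDeque<>();

    StrictTableWriterSketch() {
        open.push(new Frame("")); // the implicit root table
    }

    void writeScalar(String key, Object value) {
        Frame current = open.peek();
        if (current.childStarted) {
            // A subtable header has been emitted, so there is no way back to this table.
            throw new IllegalStateException(
                    "Scalar '" + key + "' arrived after a subtable of '" + current.path + "' was started");
        }
        out.append(key).append(" = ").append(format(value)).append('\n');
    }

    void startTable(String name) {
        Frame parent = open.peek();
        parent.childStarted = true;
        String path = parent.path.isEmpty() ? name : parent.path + "." + name;
        if (out.length() > 0) {
            out.append('\n'); // blank line between sections
        }
        out.append('[').append(path).append("]\n");
        open.push(new Frame(path));
    }

    void endTable() {
        if (open.size() == 1) {
            throw new IllegalStateException("No open table to end");
        }
        open.pop();
    }

    String result() {
        return out.toString();
    }

    /** Simplified value formatting: strings are quoted, everything else uses toString. */
    private static String format(Object v) {
        if (v instanceof String) {
            return "\"" + ((String) v).replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        }
        return String.valueOf(v);
    }
}
```

With this shape, input whose scalars precede nested objects streams straight through, while out-of-order input fails fast instead of producing TOML that assigns keys to the wrong table.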

@qiyuey

qiyuey commented Aug 14, 2022

Hope tables can be supported.

@cowtowncoder
Member

PRs welcome!

@sysmat

sysmat commented Nov 22, 2022

Tables and nested tables are one of the basics.

@cowtowncoder
Member

@sysmat Yes, as I said, PRs welcome. Arguing about the usefulness of something does little to implement said feature.

@ebresie

ebresie commented Feb 26, 2023

What is the status of this?

I started trying to use Jackson's TOML support, and when I write, it currently seems to output things in a format similar to the one mentioned above, like:

abc.foo = 1
abc.bar = 2
abc.xyz = [{foo = 1, bar = 2}]

But I was expecting the latter:

[abc]
foo = 1
bar = 2

[[abc.xyz]]
foo = 1
bar = 2

So is the ability to add a "table" (or whatever the TOML nomenclature is; I'm still new to it) not available in Jackson, and is that what this issue is about?

@ebresie

ebresie commented Feb 26, 2023

Would adding some sort of annotation be a way forward? Say a @Table(header="abc") which could be applied to a given Java class?

Although I suppose if multiple tables were needed, the annotation might have to be applied to each attribute so that a given object can group related items into the same or different sections.

@cowtowncoder
Member

The challenge with format-specific annotations is that they cannot be supported by jackson-databind, which guides the mapping from properties (Java object) to format events.
So typically annotations need to have more general applicability. There are some exceptions (the XML module has a few that operate at a low enough level to change token streams), so annotation support for format-specific things needs to work at a level beyond databinding.

But it is also possible that no annotations are needed and it is just a question of making use of existing naming conventions to reconstruct the output. This is what the "Properties" backend does.

So I think it may be just that the output side was left at a minimum support level, not due to specific limitations.
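
One reading of "making use of existing naming conventions" is to regroup the dotted keys the generator already produces into table sections. A minimal sketch under that assumption: `DottedKeyRegrouperSketch` is a hypothetical name, values are taken as already-rendered TOML text, and keys are assumed to be bare (no quoted keys containing dots). It is not a description of how the Properties backend is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not part of any Jackson module. */
final class DottedKeyRegrouperSketch {

    /** Groups "a.b.c = value" style entries under a [a.b] header, root keys first. */
    static String regroup(Map<String, String> flat) {
        Map<String, List<String>> linesByTable = new LinkedHashMap<>();
        linesByTable.put("", new ArrayList<>()); // root keys must precede any header
        for (Map.Entry<String, String> e : flat.entrySet()) {
            String key = e.getKey();
            int dot = key.lastIndexOf('.');
            String table = dot < 0 ? "" : key.substring(0, dot);
            String leaf = key.substring(dot + 1);
            linesByTable.computeIfAbsent(table, t -> new ArrayList<>())
                    .add(leaf + " = " + e.getValue());
        }

        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, List<String>> e : linesByTable.entrySet()) {
            if (e.getValue().isEmpty()) {
                continue; // only the root group can be empty
            }
            if (!e.getKey().isEmpty()) {
                if (out.length() > 0) {
                    out.append('\n'); // blank line between sections
                }
                out.append('[').append(e.getKey()).append("]\n");
            }
            for (String line : e.getValue()) {
                out.append(line).append('\n');
            }
        }
        return out.toString();
    }
}
```

Like any buffering approach, this needs all keys before it can write anything, and it leaves arrays in inline form rather than converting them to array tables.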

@yawkat
Member Author

yawkat commented Feb 26, 2023

A good first step would be to support it in the generator, if anyone is interested in writing a PR. It's not on my roadmap atm.

@ebresie

ebresie commented Feb 27, 2023

> The challenge with format-specific annotations is that they cannot be supported by jackson-databind, which guides the mapping from properties (Java object) to format events.
>
> So typically annotations need to have more general applicability. There are some exceptions (the XML module has a few that operate at a low enough level to change token streams), so annotation support for format-specific things needs to work at a level beyond databinding.

When working with JPA, there are annotations to identify tables, columns, ids, etc.

With XML annotations in JAXB, there are annotations like XML root, XML elements, and XML attributes.

Would either of these be examples to build off of for possible annotation development here?

@cowtowncoder
Member

What I am trying to say, as is @yawkat, is that the output side of TOML needs work even before considering the need for new annotations.

As to JAXB, it is XML-specific, so not really (although Jackson has some compatibility support); JPA is DB-specific, so I don't think so.

But the original description of the issue is relevant: first things first, output formatting basically does not exist wrt sections. It would be possible to add that with default logic and then, if necessary, consider other annotations.
