Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow configuring the newLine used on Gson.toJson #776

Closed
ricardojoaoreis opened this issue Jan 26, 2016 · 8 comments
Closed

Allow configuring the newLine used on Gson.toJson #776

ricardojoaoreis opened this issue Jan 26, 2016 · 8 comments

Comments

@ricardojoaoreis
Copy link

Currently (version 2.5) there's no way to use Windows new lines on Gson, being able to configure this would be nice.

Also some old software doesn't like files that don't have a dangling new line at the end of the JSON, being able to configure this would be nice as well.

@inder123
Copy link
Collaborator

Ideally, GsonBuilder should have a setFormattingStyles(String style) method that should determine the entire formatting. But we have never had time to research and find a good format to specify this. Suggestions welcome.

Until then, you can use Gson.toJsonTree() and format the output yourself.

@mihnita
Copy link
Member

mihnita commented Nov 18, 2022

Originally posted by @eamonnmcmanus in #2231 (comment)


Hi, just back from vacation. :-)

My initial reactions:

  1. Rather than GsonBuilder.setNewlineStyle which is only relevant if GsonBuilder.setPrettyPrinting() has been called, I think it might be better to overload GsonBuilder.setPrettyPrinting with a version that takes a parameter describing the pretty-printing style. We could imagine that this might later include things like indentation (currently hardcoded to two spaces).
  2. I'm not really convinced that the NewlineStyle class is justified. If we had a FormattingStyle class then we might imagine it looking like this initially:
public class FormattingStyle {
  private final String newline;

  private FormattingStyle(String newline) {
    this.newline = newline;
  }

  public static final FormattingStyle DEFAULT = new FormattingStyle("\n");

  public FormattingStyle withNewline(String newline) {
    return new FormattingStyle(newline);
  }
}

which would evolve in the expected way if we add other parameters like indent. I don't think it is really necessary to have constants for the kinds of newline style. We can certainly document the two prevailing styles, and perhaps require that the newline string match [\r\n]+. If people need a CRLF constant they can trivially define one themselves.

I'll also note, playing Devil's Advocate a bit, that it's pretty easy to do gson.toJson(...).replace("\n", "\r\n").

@mihnita
Copy link
Member

mihnita commented Nov 18, 2022

Originally posted by @mihnita in #2231 (comment)


I'll also note, playing Devil's Advocate a bit, that it's pretty easy to do gson.toJson(...).replace("\n", "\r\n").

That's true, and it is something I did for years (and it annoyed me for years :-).
It feels "unclean" and it renders all the toJson methods that take a JsonWriter useless, and probably doubles the memory consumption (probably not an issue).
One can also bring the same "devil's advocate" argument about the toJson + JsonWriter: you can easily toJson to a String and write it yourself in a writer.
To apply the workaround you have to either do it every time you invoke toJson, or wrap the whole gson in a custom class that does that.

So, there are workarounds, but at the core this makes the people who need Windows EOL feel like second class citizens.

When the JSON spec (and ECMAScript spec) does not specify a certain EOL as preferred, they accept all the valid combinations.

In fact there is an effort to update the spec to also allow for U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR), to bring JSON closer to ECMAScript:
https://github.com/tc39/proposal-json-superset/blob/master/README.md
Apparently "It is already implemented in Chrome 66 and Safari TP49" (https://bugzilla.mozilla.org/show_bug.cgi?id=1435828). And in progress (or done?) in Firefox.
So being able to specify that as EOL would be nice. If it's valid / standard, why not?


Anyway...
About the implementation, if you are open to accept a contribution,
I would be happy to make this change follow the direction you suggest.

Maybe move this discussion into an issue, so that you can review the direction before I implement something debatable?

I would probably go with a more "fluent" FormattingStyle?
So that in time one can do style.newline(CRLF).indent("\t").bracketStyle(KandR).someOtherOption(opt)?
(TBD if newline would be an enum or just a string)

Thank you,
Mihai

@eamonnmcmanus
Copy link
Member

Thanks for those further details!

I'm not reading the business about LINE SEPARATOR and PARAGRAPH SEPARATOR the same as you. My understanding is that JavaScript previously did not allow these characters unescaped inside string literals, while JSON always has. So "aU+2028b" was legal in JSON but not in JavaScript, even though JSON is supposed to be a subset of JavaScript. The proposal was then to change JavaScript so it does allow U+2028 and U+2029 in string literals, and that seems to have been incorporated into the ECMAScript 2019 spec. So any well-formed JSON is indeed well-formed JavaScript. I don't see any suggestion of allowing U+2028 outside string literals in JSON. The allowed whitespace characters remain exactly \t, \n, \r, and \u0020 and it would not be legal to use U+2028 to separate lines when pretty-printing.

For the FormattingStyle API, my sketch suggested this (for example): FormattingStyle.DEFAULT.withNewline("\r\n").withIndent("\t"). That's basically the same as what you wrote except for the with prefix and the need to start with DEFAULT. (I'm not sure what style is above?) I think withX names may make it clearer that the calls don't modify the object they are called on but instead return a new object (if necessary). An alternative would be to have a builder, but that seems like overkill.

I wouldn't be averse to having public static final String CRLF = "\r\n"; in FormattingStyle, since there doesn't seem to be such a constant in the standard Java API. But I don't think we really need an enum here. (And maybe some people might find it useful to use "\n\n" or "\r\n\r\n" as a line separator.)

@mihnita
Copy link
Member

mihnita commented Nov 18, 2022

Thanks, I'll go ahead with FormattingStyle.DEFAULT and withX, and I will implement withIndent and withNewline taking a String.

It is your call if it is handy to define some constants (CR / CRLF / LF / OS_DEFAULT / WINDOWS / MACOS / LINUX / ...)
Based on experience (and an informal "survey") many devs don't know what the mapping is between CR, "\r", U+000D, or what kind of end of line is used by each OS. Or about System.lineSeparator :-)
The fact that "\n" is OS dependent in C/C++ (and others), but fixed in Java (and others) only adds to the confusion (for people writing cross platform code in both Java and C/C++).

I'm not sure what style is above

That was my attempt to separate the creation part discussion (FormattingStyle.DEFAULT vs constructor vs builder) from the fluent style and method names.


I might need to properly read that LINE SEPARATOR / PARAGRAPH SEPARATOR proposal, maybe do some tests.
Not really relevant for the "main shape" of the implementation.
If withNewline takes a string I might have to allow those 2 characters in the newline string, or not.
TBD.
Will jump of that bridge when we come to it :-)

@BruceLin2000
Copy link

And then how to igore newlines?

@Marcono1234
Copy link
Collaborator

Marcono1234 commented Aug 20, 2023

And then how to igore newlines?

@BruceLin2000, what do you mean by that? Could you please show a small example of want the JSON data to look like?
In current Gson versions, unless you call GsonBuilder.setPrettyPrinting() or JsonWriter.setIndent(...), the JSON data will by default be compact, without any newlines.

Note that the changes for this feature here introduced by pull request #2231 have not been released yet.

@Zumisha
Copy link

Zumisha commented Dec 25, 2023

When will version with this feature be released?
Almost a year has passed since it was merged to the main branch.

tibor-universe pushed a commit to getuniverse/gson that referenced this issue Sep 14, 2024
…#2231)

* Add settings for kind of newline to use

* Fix amp in javadoc

* Fixing link in javadoc

* Doc: use JSON instead of Json

Co-authored-by: Marcono1234 <[email protected]>

* PR Feedback: Objects.requireNonNull

Co-authored-by: Marcono1234 <[email protected]>

* PR Feedback: $next-version$, don't hardcode

Co-authored-by: Marcono1234 <[email protected]>

* s/testNewlineLF/testNewlineLf/

Co-authored-by: Marcono1234 <[email protected]>

* Implement PR feedback

* Round two of review

* Restore copyright year, no reason to update

* Rename OS named enum values to CR and LF

* Add javadoc to NewlineStyle.getValue()

* Implement PR feedback, round 2

* Fix typo

Co-authored-by: Marcono1234 <[email protected]>

* No need for line break

Co-authored-by: Marcono1234 <[email protected]>

* Shorter, cleaner doc

Co-authored-by: Marcono1234 <[email protected]>

* Using a FormattingStyle for pretty print

* Fix Junit4 and Truth after merge from master

* Implement review feedback

* Double backslash in message

---------

Co-authored-by: Marcono1234 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants