Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to make sure a string or string literal is a valid JSON? #458

Closed
ghost opened this issue Feb 16, 2017 · 59 comments
Closed

How to make sure a string or string literal is a valid JSON? #458

ghost opened this issue Feb 16, 2017 · 59 comments
Assignees
Labels
kind: enhancement/improvement solution: proposed fix a fix for the issue has been proposed and waits for confirmation
Milestone

Comments

@ghost
Copy link

ghost commented Feb 16, 2017

I'm gonna be sending strings to the parser that might or might not be valid JSONs and I need a way to check they are. I'm programming for a Nintendo 3DS, but using try to catch an exception does not work and the application hangs anyway if it finds an error while parsing. What can I do?

@jmazzola
Copy link

it'd be nice to have a method perhaps a boolean to "validate" json. Similar to how this website works:

http://codebeautify.org/jsonviewer#

@gregmarr
Copy link
Contributor

There is a library that just does validation and supports this parser:
https://github.com/tristanpenman/valijson

@ghost
Copy link
Author

ghost commented Feb 17, 2017

That doesn't do what I need, I asked the author. If I provide a non-valid JSON it'll still hang :/

@gregmarr
Copy link
Contributor

Really, a hang? That's very surprising. Can you share a sample of the json that causes a hang?

@ghost
Copy link
Author

ghost commented Feb 17, 2017

I doubt it's got something to do with the JSON, it's because there is no good way that I know of handling exceptions when programming for a Nintendo 3DS so the program will just freeze.

@gregmarr
Copy link
Contributor

JSON for Modern C++ Version 2.1.0
@nlohmann nlohmann released this 19 days ago · 56 commits to develop since this release

Release date: 2017-01-28

✨ Exceptions can now be switched off. This can be done by defining the preprocessor symbol JSON_NOEXCEPTION or by passing -fno-exceptions to your compiler. In case the code would usually thrown an exception, abort() is now called.

@gregmarr
Copy link
Contributor

Oh, that probably doesn't help much either, as it calls abort.

@gregmarr
Copy link
Contributor

So what you really need is for the library to return an error code instead of throwing or aborting when calling parse().

@ghost
Copy link
Author

ghost commented Feb 17, 2017

Yes, pretty much.

@nlohmann
Copy link
Owner

I understand the issue. Inside parse_internal() all exceptions are thrown by expect()/unexpect(). (https://github.com/nlohmann/json/blob/develop/src/json.hpp#L11455) We may use a different replacement for JSON_THROW there, but still would have the problem of leaving the function immediately in case of an error.

@ghost
Copy link
Author

ghost commented Feb 17, 2017

Would it be possible to return an empty json object instead?

@nlohmann
Copy link
Owner

If replacing the throw call is possible, returning a discarded object would be better. You would check it with is_discarded().

@ghost
Copy link
Author

ghost commented Feb 18, 2017

Hey, commenting the JSON_THROW part on except() and unexcept() and then using an "if (json::parse(string) != NULL)" seems to work!

@nlohmann
Copy link
Owner

To properly provide a solution for this issue, I think it would be best to have a dedicated check function that returns a boolean and does not create any basic_json values at all.

@ghost
Copy link
Author

ghost commented Feb 19, 2017

Yeah, a well-programmed function would be a better idea, but for now this trick works for me.

@nlohmann nlohmann added this to the Release 3.0.1 milestone Feb 19, 2017
@ronaaron
Copy link

ronaaron commented Mar 23, 2017

I'd definitely add my vote for a validation function. I'm having trouble where the exceptions aren't caught (in a multi-thread app, clang++ on linux) and it's a critical bit of infrastructure.

Update: I was able to finally trap exceptions. For anyone else having a similar issue, you may need to use '-stdlib=libc++' along with some other nonsense.

@nlohmann nlohmann modified the milestones: Release 3.0.0, Release 3.0.1 Mar 23, 2017
@nlohmann
Copy link
Owner

Such a validation function should be added to 3.0.0.

@ronaaron
Copy link

Thank you!

@user1095108
Copy link

the function has to have a noexcept specifier.

@nlohmann
Copy link
Owner

This is not possible, because then we would need noexpect functions to read from streams.

@user1095108
Copy link

? streams don't necessarily throw, actually they don't by default.

@nlohmann
Copy link
Owner

But their functions are not nothrow.

@theodelrieu
Copy link
Contributor

Even if we don't have std::optional nor std::expected right now, maybe a json::parse_optional method would be great. There is the same kind of method in Boost.PropertyTree for example.

It could return a std::unique_ptr, which is not ideal, but gets the job done, what do you think?

@user1095108
Copy link

the functions are nothrow, but if you can disable/restore the throwing at runtime and in that case noexcept is OK.

@nlohmann
Copy link
Owner

I meant noexcept. I cannot declare a validator function noexcept if it calls functions that itself are not marked noexcept.

@nlohmann
Copy link
Owner

nlohmann commented Jun 17, 2017

Then I misunderstood what was meant by "validation function". Therefore, my comment #458 (comment).

@nlohmann
Copy link
Owner

I shall think of a solution. I do not really like the approach with a pointer though.

@1Hyena
Copy link

1Hyena commented Jun 17, 2017

@nlohmann sure it was just an example of the concept. You can pass by reference if you want to. I'm just more a pointer-person.

@1Hyena
Copy link

1Hyena commented Jun 17, 2017

@nlohmann but there is one advantage to my proposed pointer concept. It allows you to pass nullptr as an argument which would instruct the parse function to not produce the nlohmann::json object but only validate the string and return either true or false. if you pass by reference then you can't pass a nullptr by reference you know :P

@1Hyena
Copy link

1Hyena commented Jun 17, 2017

Example usecase

nlohmann::json obj;
if (!nlohmann::parse(json_str, &obj) printf("Invalid json string: %s", json_str.c_str());
if (!nlohmann::parse(json_str, nullptr) printf("Invalid json string: %s", json_str.c_str());

@nlohmann
Copy link
Owner

The problem is that the current implementation of parse already has a last parameter of type parser_callback_t with a default parameter. For this function, I am not sure whether overloading is the way to go. Furthermore, we support six different inputs to read from (string, char pointer, ...) - I do not really like this explosion in parse-related functions.

One way (and I am not convinced it is the best) would be to expose the parser class to the user who then can decide which function to call. This would avoid implementing six versions of all possible parse function. Even better, we may use the same class to cope with MessagePack and CBOR.

Hypothetical example:

json::parser p(my_input);
json j;

bool x = p.parse(j);
j = p.parse(my_callback);

If we add an operator json() to this parser class, we may still write json j = json::parse(my_input).

What do you think?

@nlohmann
Copy link
Owner

Opened #623 as place to discuss the structure of the parsing functions.

@CodeMasterYi
Copy link

Well. We agree we need a validation function.

This is what i need! Aha~ #640

@nlohmann
Copy link
Owner

FYI: It is now possible to call static bool accept(i) with any type i from which an input adapter can be constructed. Basically, i can have the same types as static basic_json parse(i). This function returns true iff the input contains a valid serialized JSON value. No exceptions are thrown in case of an error. The function is not noexcept, because reading from an input (e.g., an std::ifstream) may throw.

Next thing I would work on is a way to configure parse not to throw an exception in case of a parse error.

nlohmann added a commit that referenced this issue Jul 27, 2017
You can now pass a boolean "allow_exceptions" to the parse functions. If it is false, no exceptions are thrown in case of a parse error. Instead, parsing is stopped at the first error and a JSON value of type "discarded" (check with is_discarded()) is returned.
@nlohmann nlohmann removed the state: please discuss please discuss the issue or vote for your favorite option label Jul 27, 2017
@nlohmann
Copy link
Owner

You can now pass a boolean allow_exceptions to the parse functions. If it is false, no exceptions are thrown in case of a parse error. Instead, parsing is stopped at the first error and a JSON value of type "discarded" (check with is_discarded()) is returned.

@nlohmann
Copy link
Owner

Done.

@houglum
Copy link

houglum commented Oct 4, 2018

You can now pass a boolean allow_exceptions to the parse functions. If it is false, no exceptions are thrown in case of a parse error. Instead, parsing is stopped at the first error and a JSON value of type "discarded" (check with is_discarded()) is returned.

Looks like that's correct, given a conversation I had today to test this :) However, the docs indicate that this can't happen:

This function will always be false for JSON values after parsing. That is, discarded values can only occur during parsing, but will be removed when inside a structured value or replaced by null in other cases.

@nlohmann
Copy link
Owner

nlohmann commented Oct 4, 2018

@houglum Is there an issue?

@houglum
Copy link

houglum commented Oct 4, 2018

Just that (I believe) the docs need to be updated. They say that is_discarded() will always be false after parsing, but that's only the case if allow_exceptions was set to true, IIUC. The docs should be updated to reflect the fact that if allow_exceptions is set to false, in the case of a parse error, calling is_discarded() on the value returned from parse() will return true.

@mohammadrasekh68
Copy link

mohammadrasekh68 commented Aug 30, 2019

Im using Accept method to check if my file is a valid Json or not,
the problem is if i do as follow :
`
ifstream Valid(jsonFile);
if(Valid.is_open()){
if(!json::accept(Valid))
{

			DLTLOG( DLT_LOG_ERROR,jsonFile," is not a valid Json file");
			return -1;
		}
		JsonObject=json::parse(Valid); //error happens here
		Valid.close();
	}

I will get segmentation error (core dump) so im force to do this :
ifstream Valid(jsonFile);
if(Valid.is_open())
{
if(!json::accept(Valid))
{

			DLTLOG( DLT_LOG_ERROR,jsonFile," is not a valid Json file");
			return -1;
		}
		Valid.close();
	}
	else
	{
		DLTLOG(DLT_LOG_ERROR,"JsonFile : ",jsonFile," Not founded");
		return -1;
	}
	ifstream ifs(jsonFile);

	DLTLOG(DLT_LOG_INFO,"Converting ",jsonFile," JsonFile To JsonObject");
	JsonObject=json::parse(ifs);
	ifs.close();

`
does Accept method, delete my ifstream object?

@gregmarr
Copy link
Contributor

No, but it does read all of the content. You need to reset it to the beginning before you can read from it again.
https://en.cppreference.com/w/cpp/io/basic_istream/seekg

@tawmoto
Copy link

tawmoto commented Oct 16, 2020

Does anyone knows if there is there a performance difference between accept and parse() + is_discarded() ?
Thank you

@1Hyena
Copy link

1Hyena commented Oct 16, 2020

@tawmoto I personally threw out all uses of nlohmann::json from my project because it is bloatware. Not only does it expect you to use exceptions on a regular basis (anti-pattern as the use of exceptions should be an exception itself), but its most fundamental use-case of turning a string into a valid JSON representation can be done in 150 lines of C++ (includes JSON escaping, UTF-8 parsing and does not rely on exceptions).

If speed is important to you, I highly suggest not using nlohmann::json. Serialized JSON is such a trivial format that constructing it "manually" makes much more sense than relying on some header-only library. If it is acceptable for you to concatenate strings in order to make up SQL statements, then it should be equally acceptable to do the same for JSON serialization. Only if you need to deserialize JSON you might want to use a library, but then you might as well as consider the amazing Linux tool known as jq.

@tawmoto
Copy link

tawmoto commented Oct 16, 2020

@1Hyena Thank you for the answer, speed is important but not mandatory. I have a curl connection which takes 20-30 times more than the parsing, so it's ok.
I don't agree with "parsing" manually, many teams uses this approach and the cloud is overcrowded with a lot of garbage, I preffer to use a well-tested library. At least my jsons are not trivial at all, I have vectors in vectors of structures, which I am reading/writing in 1 line using this great parser. :-)

@1Hyena
Copy link

1Hyena commented Oct 16, 2020

@tawmoto
In case you're using the command line version of curl, you should really take a look at what jq has to offer. It's such a standard tool these days and in combination with GNU parallel it allows you to build horizontally scalable applications in a breeze.

@nlohmann
Copy link
Owner

Does anyone knows if there is there a performance difference between accept and parse() + is_discarded() ?

Accept is not building a DOM representation and is therefore much faster.

@nlohmann
Copy link
Owner

@tawmoto I personally threw out all uses of nlohmann::json from my project because it is bloatware.

The library offers more than just parsing. If you do not need this, than this library may not be for you.

Not only does it expect you to use exceptions on a regular basis (anti-pattern as the use of exceptions should be an exception itself), but its most fundamental use-case of turning a string into a valid JSON representation can be done in 150 lines of C++ (includes JSON escaping, UTF-8 parsing and does not rely on exceptions).

You can in fact switch off exceptions and still use the library.

I would be really interested in seeing the 150 lines of C++ code.

@1Hyena
Copy link

1Hyena commented Oct 16, 2020

I would share the code but it's proprietary. The trick is to use lambdas for reporting valid and invalid byte sequences back to the caller. UTF8 validation is 80 lines. JSON escaping is 100 lines. Line width limit is 80 symbols. The escape_json function calls the parse_utf8 function and does the JSON escaping in the report_valid lambda that is called on valid UTF8 sequences by the parse_utf8 function.

The declarations of the said functions are given below.

size_t escape_json(
    const char *str,
    std::function<bool(const char *, size_t)> report_valid,
    std::function<bool(const char *, size_t)> report_invalid,
    size_t max_symbols, size_t max_bytes
);

size_t parse_utf8(
    const char *str,
    std::function<bool(const char *, size_t)> report_valid,
    std::function<bool(const char *, size_t)> report_invalid,
    size_t max_symbols, size_t max_bytes
);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: enhancement/improvement solution: proposed fix a fix for the issue has been proposed and waits for confirmation
Projects
None yet
Development

No branches or pull requests