Enhancement: Ability to listen to startObject/endObject tokens #111

Open
itsjustbrian opened this issue Feb 27, 2024 · 2 comments

itsjustbrian commented Feb 27, 2024

I'll start with my use case since I may be missing a simpler way to do it.

I'm parsing a non-blocking stream of a newline-delimited JSON file where each object has a massive array.
I have a path listener that emits on every object in the array and saves it to a db.
However, I also need to include some values from the top-level object in each of the array objects.
I made use of context here and bound a listener for each top-level key I needed and saved it on the context.
Then, once an object has finished being parsed, I take the values from the context and update the previously saved objects in the db.

The problem is detecting when the current object ends so I can clear the context in preparation for the next object.
Right now I'm relying on key order: if the listener fires for what I know is the last key, I know I can clear the context. But this is flimsy/hacky since I have no control over the key order. I've also tried listening for the path "$", but that appears to load each whole object into memory, which is not viable for the size of objects I'm dealing with.

It would be nice to be able to hook into context callbacks like startObject/endObject by binding a new type of listener that would fire for these tokens, along with the current depth (though I suppose this could be derived from currentPosition).

Maybe something like:

    // Proposed API: bindTokenListener and TokenListener are the requested additions
    surfer.configBuilder()
            .bindTokenListener(new TokenListener() {
                @Override
                public void onStartObject(ParsingContext context) {
                    System.out.println(context.depth); // 0...n
                }

                @Override
                public void onEndObject(ParsingContext context) {
                    System.out.println(context.depth); // 0...n
                }
            })

This would also be applicable to any token handled by SurfingContext.
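
To make the requested semantics concrete, here is a minimal sketch that does not use JsonSurfer at all. It assumes a hypothetical `TokenListener` interface with the two callbacks proposed above and a hypothetical `ObjectBoundaryScanner` class that reads a `Reader` one character at a time, so no whole object is held in memory. Depth here counts enclosing objects only (arrays are ignored), and braces inside string literals are skipped; a real implementation inside the library would presumably hook into its existing tokenizer instead.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical sketch, not part of JsonSurfer. */
final class ObjectBoundaryScanner {

    /** Hypothetical listener mirroring the callbacks proposed in this issue. */
    interface TokenListener {
        void onStartObject(int depth);
        void onEndObject(int depth);
    }

    /** Streams characters and reports every object start/end with its object-nesting depth. */
    static void scan(Reader in, TokenListener listener) throws IOException {
        int depth = 0;            // number of objects currently open
        boolean inString = false; // true while inside a JSON string literal
        boolean escaped = false;  // true right after a backslash inside a string
        int c;
        while ((c = in.read()) != -1) {
            if (inString) {
                if (escaped) {
                    escaped = false;      // the escaped character is consumed as-is
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                listener.onStartObject(depth);
                depth++;
            } else if (c == '}') {
                depth--;
                listener.onEndObject(depth);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Two newline-delimited records; the first contains a '}' inside a string.
        String ndjson = "{\"id\":1,\"items\":[{\"n\":\"}\"}]}\n{\"id\":2,\"items\":[]}\n";
        int[] records = {0};
        scan(new StringReader(ndjson), new TokenListener() {
            @Override
            public void onStartObject(int depth) { }

            @Override
            public void onEndObject(int depth) {
                if (depth == 0) {
                    records[0]++; // a top-level object just ended: safe point to clear per-record state
                }
            }
        });
        System.out.println("records: " + records[0]); // prints "records: 2"
    }
}
```

With callbacks like these, the per-record context could be cleared in `onEndObject` when the depth is 0, without relying on key order or on array indices. For large files, wrapping the input in a `BufferedReader` avoids unbuffered single-character reads.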

Awesome library by the way!

wanglingsong (Owner) commented

Maybe you can check getCurrentArrayIndex in the ParsingContext to detect when it starts parsing a new array element.

itsjustbrian commented Feb 28, 2024

@wanglingsong
That would work if the file I'm dealing with were a top-level array, e.g.:

[
  {...},
  {...},
  {...}
]

But I'm dealing with newline-delimited objects, e.g.:

{...}
{...}
{...}

So getCurrentArrayIndex is always -1.
