-
Notifications
You must be signed in to change notification settings - Fork 482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
parsers: lines: support "rules" for multiple sets of regexes #407
Conversation
rmilecki
commented
Sep 25, 2022
Sometimes companies use more than 1 format for line-parseable data. They may e.g. randomly add some extra columns that are used occasionally. This commit adds "rules" field support to the "lines" parser. It allows defining multiple sets or regexes ("start", "end", "line" & friends) for a single field. Usage of "rules" is optional. Backward compatibility wiht existing templates is preserved. Signed-off-by: Rafał Miłecki <[email protected]>
This doesn't really change anything and isn't necessary. It however allows testing "rules" implementation in the "lines" parser. Signed-off-by: Rafał Miłecki <[email protected]>
599fe79
to
bd9e12d
Compare
@bosd: I think this implementation is a bit simpler than the one suggested in the #377 . I hope my changes are rather simple to understand & review. An advantage of this approach I see is that If we want to support new features in the old syntax I believe that code should go into plugins |
To start off: I'm all in, for cleaner and easier to understand code. edit: added txt input for easier testing
Repeating the functional test from #378 (comment) re-written the template to the syntax of this pr:
Result:
Conclusion, lines output is in the wrong order. |
For completeness, Here is the desired outcome of the test:
|
@rmilecki What to do with this pr / functionality? |
I need to rework this. Describe better, provide use case, test, probably avoid modifying Amazon YAML as there is no strong reason for this. Converted into draft for now. I think meanwhile we can focus on #423 |
@rmilecki No worries, You'll have some time for this. As we've merged #417 , |
That particular case ended up being discussed in the #428. It seems we can already support such invoices with current code. There may be more than 1 way of handling such complex lines - depending on expected output. As for changes from this pull request I should rewrite them and add custom test. I'll open another pull request for that when I get it ready. |
One more update: Mekro invoices can be parsed the way @bosd expected since #417. It can be done with something like:
Above template fragment results in parsing invoice provided by @bosd into:
(which seems to match what was expected). As for coolblue invoices those are more tricky, it's even hard to agree on ideal expected output. That it being discussed in the #428. |