-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: parse triple backtick strings, discarding the info header #1162
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Reviewed everything up to 8748913 in 59 seconds
More details
- Looked at
820
lines of code in10
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. engine/baml-lib/schema-ast/src/ast/expression.rs:4
- Draft comment:
Thededent
function is duplicated here and inengine/bstd/src/dedent.rs
. Consider using thededent
function frombstd
to avoid code duplication. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment seems to be addressing an issue that has already been resolved in the diff. Thededent
function is no longer duplicated because it has been removed from this file and is now being imported frombstd
. Therefore, the comment is outdated and not useful.
I might be missing some context about whether thededent
function inbstd
is exactly the same as the one that was removed. However, the fact that the function was removed and replaced with an import strongly suggests that the duplication issue is resolved.
Even if there are slight differences, the intention to remove duplication is clear from the changes made. The comment is not needed because the action to resolve the issue has already been taken.
The comment is outdated because the code duplication issue it addresses has already been resolved in the diff. Therefore, the comment should be deleted.
Workflow ID: wflow_iUAafNMCertTBOPr
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
FieldType::class("Test"), | ||
{ | ||
"type": "code", | ||
"code": "```\nconst { execSync } = require('child_process');\n```", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i thought we said to trim + dedent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is triple backticks inside a string, that's why the newlines get preserved!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dedent? everything else looks good!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on da7415d in 53 seconds
More details
- Looked at
26
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser/json_parse_state.rs:502
- Draft comment:
The logic for checking triple backticks is flawed due to incorrect use ofpeek
. Consider refactoring to correctly advance the iterator when checking for triple backticks. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_rNbUceArEiS7dId6
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
// TODO: this logic is busted. peekable.peek() does not | ||
// advance the iterator (this is easily verified with | ||
// a unit test), but to fix this we need to do a bit of | ||
// refactoring, so for now we'll live with it. | ||
let is_triple_quoted = match next.peek() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for checking triple quotes is flawed due to incorrect use of peek
. Consider refactoring to correctly advance the iterator when checking for triple quotes.
da7415d
to
289e1d9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 289e1d9 in 50 seconds
More details
- Looked at
26
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser/json_parse_state.rs:405
- Draft comment:
The logic for checking triple quotes usingpeek()
is incorrect becausepeek()
does not advance the iterator. This can lead to incorrect parsing of triple-quoted strings. Consider refactoring to correctly handle iterator advancement. - Reason this comment was not posted:
Comment was on unchanged code.
2. engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser/json_parse_state.rs:502
- Draft comment:
The logic for checking triple backticks usingpeek()
is incorrect becausepeek()
does not advance the iterator. This can lead to incorrect parsing of triple-backtick strings. Consider refactoring to correctly handle iterator advancement. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_ghT08XKDv0v7keJ0
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 8c56484 in 37 seconds
More details
- Looked at
107
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. engine/bstd/src/dedent.rs:62
- Draft comment:
Consider handling the case where the input string is empty to avoid potential errors or unexpected behavior. - Reason this comment was not posted:
Comment was on unchanged code.
Workflow ID: wflow_HIdEVDcFUhmRY8em
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on 3d5cd23 in 41 seconds
More details
- Looked at
162
lines of code in2
files - Skipped
0
files when reviewing. - Skipped posting
0
drafted comments based on config settings.
Workflow ID: wflow_8OAj2ihYvGmbGSUj
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
let (_, tail) = l.split_at(prefix.len()); | ||
tail | ||
} else { | ||
"" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lines that are entirely whitespace should not be replaced with an empty string. This can lead to incorrect results when dedenting, as seen in the test_empty_lines
test case. Consider preserving such lines as they are.
"" | |
l |
This is a more LLM-reliable way to get codegen outputs
We should add a documentation guide page for this (possibly a great community contribution candidate here), but in the meantime I've drafted up a blog post
Important
Add support for parsing triple backtick strings in JSON parser, including dedenting and discarding info headers.
TripleBacktickString
variant toJsonCollection
injson_collection.rs
to handle triple backtick strings.process_token()
injson_parse_state.rs
to parse triple backtick strings, discarding the info header and dedenting content.dedent
function todedent.rs
inbstd
and update its usage inexpression.rs
andjson_collection.rs
.test_code.rs
to verify parsing of triple backtick strings, including edge cases like nested backticks and dedenting.This description was created by for 3d5cd23. It will automatically update as commits are pushed.