Validate fieldnames and types when using pydantic codegen (#1189)
Python and pydantic do not allow arbitrary identifiers to be used as fields in classes. This PR adds checks to the BAML grammar, which run conditionally when the user includes a python/pydantic code generator block:

- field names must not be Python keywords.
- field names must not be lexicographically equal to the field type, or the base of an optional type.

E.g. rule 1:

```python
# Not ok
class Foo(BaseModel):
  if: str
```

E.g. rule 2:

```python
class ETA(BaseModel):
  time: str

# Not ok
class Foo(BaseModel):
  ETA: ETA
```

These rules are now checked during validation of the syntax tree, prior to construction of the IR. If they are violated, we push an error to `Diagnostics`.

Bonus: There are a few changes in the PR not related to the issue. They are small cleanups to reduce the number of unnecessary `rustc` warnings.

----

> [!IMPORTANT]
> Add validation for field names in BAML classes to prevent Python keyword and type name conflicts when using Pydantic code generation.
>
> - **Validation**:
>   - Add `assert_no_field_name_collisions()` in `classes.rs` to check field names against Python keywords and type names when using Pydantic.
>   - Use `reserved_names()` to map keywords to target languages.
> - **Diagnostics**:
>   - Update `new_field_validation_error()` in `error.rs` to accept `String` for error messages.
> - **Miscellaneous**:
>   - Remove unused code and features in `lib.rs` and `build.rs` to reduce rustc warnings.
>   - Add tests `generator_keywords1.baml` and `generator_keywords2.baml` to validate new rules.
>
> This description was created by [Ellipsis](https://www.ellipsis.dev?ref=BoundaryML%2Fbaml&utm_source=github&utm_medium=referral) for 49d31fb. It will automatically update as commits are pushed.
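The two rules can be sketched in a few lines of Python. This is only an illustration of the checks described above, not the Rust code in `classes.rs`: the helper name `field_name_errors`, the `(name, type)` tuple representation, and the use of Python's standard `keyword` module in place of `reserved_names()` are all assumptions made for the example.

```python
import keyword


def field_name_errors(fields):
    """Return error strings for (field_name, field_type) pairs.

    Hypothetical helper: BAML types are passed as plain strings, and an
    optional type is written with a trailing "?" (for example "ETA?").
    """
    errors = []
    for name, type_expr in fields:
        # Rule 1: the field name must not be a Python keyword.
        if keyword.iskeyword(name):
            errors.append(f"{name}: reserved word")
        # Rule 2: the field name must not equal the type name, or the
        # base of an optional type.
        base_type = type_expr[:-1] if type_expr.endswith("?") else type_expr
        if name == base_type:
            errors.append(f"{name}: equal to type name")
    return errors


# Fields of `Foo` from the test files, plus the renamed `eta` field.
foo_fields = [("if", "string"), ("ETA", "ETA?"), ("eta", "ETA?")]
errors = field_name_errors(foo_fields)
```

Only the first two fields produce errors. The renamed `eta` field passes because the comparison is an exact, case-sensitive string match.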
1 parent 8c3d536, commit 93b393d
Showing 11 changed files with 190 additions and 29 deletions.
29 changes: 29 additions & 0 deletions
engine/baml-lib/baml/tests/validation_files/class/generator_keywords1.baml
@@ -0,0 +1,29 @@
generator lang_python {
  output_type python/pydantic
  output_dir "../python"
  version "0.68.0"
}

class ETA {
  thing string
}

class Foo {
  if string
  ETA ETA?
}

// error: Error validating field `if` in class `if`: Field name is a reserved word in generated python/pydantic clients.
//   -->  class/generator_keywords1.baml:12
//    |
//  11 | class Foo {
//  12 |   if string
//  13 |   ETA ETA?
//    |
// error: Error validating field `ETA` in class `ETA`: When using the python/pydantic generator, a field name must not be exactly equal to the type name. Consider changing the field name and using an alias.
//   -->  class/generator_keywords1.baml:13
//    |
//  12 |   if string
//  13 |   ETA ETA?
//  14 | }
//    |
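To see why the first diagnostic is useful, here is a rough sketch of what happens if a keyword field name reaches Python unchecked. The source strings are hypothetical, simplified stand-ins for generated code (no `BaseModel`, no imports), and the `compiles` helper and the `if_` rename exist only for this example.

```python
def compiles(source):
    """Return True if `source` parses as Python, False on a SyntaxError."""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


# Hypothetical output for `Foo` with a field literally named `if`.
keyword_field_ok = compiles("class Foo:\n    if: str\n")

# The same class once the field is renamed to a non-keyword.
renamed_field_ok = compiles("class Foo:\n    if_: str\n")
```

The first string is rejected by the Python parser, so catching the problem during BAML validation reports it at the field definition rather than when the generated client is imported.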
17 changes: 17 additions & 0 deletions
engine/baml-lib/baml/tests/validation_files/class/generator_keywords2.baml
@@ -0,0 +1,17 @@
// This file is just like generator_keywords1.baml, except that the fieldname
// has been changed in order to not collide with the field type `ETA`, and an
// alias is used to render that field as `ETA` in prompts.

generator lang_python {
  output_type python/pydantic
  output_dir "../python"
  version "0.68.0"
}

class ETA {
  thing string
}

class Foo {
  eta ETA? @alias("ETA")
}
@@ -1,5 +1,2 @@
 fn main() {
-    // If you have an existing build.rs file, just add this line to it.
-    #[cfg(feature = "use-pyo3")]
-    pyo3_build_config::use_pyo3_cfgs();
 }
This file was deleted.