Add constraints to test blocks. (#1185)
This PR extends the syntax for test blocks with constraints (checks and asserts):

```baml
test SomeTest {
  functions [Succ]
  args { x 1 }
  @@assert( {{ this == 2 }} )
}
```

Testing done:

- [x] Unit tests for parsing and interpreting test-level constraints.
- [x] Integ tests still pass, but none were added for this feature.
- [x] Manual testing of checks & asserts in the VS Code extension.

## Details

`test` blocks in BAML code may now contain checks and asserts. A slightly different set of variables is available in the jinja expressions a user can write in a test, compared to the constraints a user would place on types:

- The `_` variable contains fields `result`, `checks` and `latency_ms`.
- The `this` variable refers to the value computed by the test.
- In a given constraint, `_.checks.NAME` can refer to the NAME of any earlier check that was run in the same test block.

The UI has been updated to reflect the results of test-level constraints. Failing asserts result in a test error, and failing checks result in a message indicating that user intervention is required to assess the response.

## Example screenshots

<img width="902" alt="Screenshot 2024-11-20 at 3 23 46 PM" src="https://github.com/user-attachments/assets/952751bc-1b7f-4978-ad06-0639c4269ba0">

This example shows three checks that use different builtin variables in their predicate functions, and an assert that refers to the previous checks.

---

<img width="932" alt="Screenshot 2024-11-20 at 3 29 32 PM" src="https://github.com/user-attachments/assets/beb5d296-fe10-4227-a0c0-20f3f5ff6f92">

This example shows how a failing assert is rendered. The failing assert is the result of asserting the status of a prior check, named `fast`, which failed.

---

<img width="534" alt="Screenshot 2024-11-20 at 3 31 07 PM" src="https://github.com/user-attachments/assets/364d79dc-78f7-4f6b-8cc2-741b14b2b659">

Jinja expressions that try to reference nonexistent checks, or checks that are defined later in the test, raise compiler warnings.

---

<img width="407" alt="Screenshot 2024-11-20 at 3 46 02 PM" src="https://github.com/user-attachments/assets/5b06e88c-8348-4e17-8ca1-4c41ba587f5f">

Function arguments are available in jinja expressions, and are visible to the static analyzer, so that warnings can be raised when attempting to use a nonexistent argument.

---

<img width="909" alt="Screenshot 2024-11-22 at 4 17 36 PM" src="https://github.com/user-attachments/assets/d110d6a6-0b75-4663-a744-d8cd82de59ad">

This screenshot shows the provisional UI for when some subset of the constraints fails.

- When at least one constraint fails, all checks and their status are rendered.
- Any time tests are run, the status of the whole test suite is indicated with an icon (in this case, the yellow warning sign).

<!-- ELLIPSIS_HIDDEN -->

----

> [!IMPORTANT]
> This PR adds constraints to BAML test blocks, updates the runtime for constraint evaluation, and enhances the UI to display results.
>
> - **Behavior**:
>   - Adds constraints (`check` and `assert`) to BAML `test` blocks.
>   - Constraints use variables like `this`, `_.result`, and `_.checks`.
>   - Failing asserts cause test errors; failing checks require user intervention.
> - **Validation**:
>   - Adds validation for constraints in `tests.rs`.
>   - Ensures constraints are correctly parsed and validated in `constraint.rs`.
> - **Runtime**:
>   - Implements `evaluate_test_constraints` in `constraints.rs` to process constraints.
>   - Updates `orchestrate` functions to handle constraint evaluation.
> - **UI**:
>   - Updates test status handling in `testHooks.ts` and `test_result.tsx` to include `constraints_failed` status.
>   - Adds UI elements to display constraint evaluation results.
> - **Misc**:
>   - Updates `Cargo.toml` files to include necessary dependencies.
>   - Adds tests for constraint evaluation in `constraints.rs`.
>
> <sup>This description was created by </sup>[<img alt="Ellipsis" src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=BoundaryML%2Fbaml&utm_source=github&utm_medium=referral)<sup> for 49e312d. It will automatically update as commits are pushed.</sup>

<!-- ELLIPSIS_HIDDEN -->
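The outcome rules described above (a failing assert is a test error, a failing check only flags the response for review, and later constraints can read earlier check results) can be sketched in Rust. This is a minimal model, not the PR's `evaluate_test_constraints`: the `Level`, `Constraint`, `TestStatus` and `evaluate` names are hypothetical, predicates are plain function pointers standing in for jinja expressions, and stopping at the first failing assert is an assumption.

```rust
use std::collections::HashMap;

/// Severity of a test-level constraint (hypothetical stand-in for the check/assert split).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Level {
    Check,
    Assert,
}

/// Hypothetical constraint: the predicate replaces the jinja expression and receives
/// the test result (`this` / `_.result`), `_.latency_ms`, and the earlier `_.checks`.
struct Constraint {
    label: Option<&'static str>,
    level: Level,
    predicate: fn(result: i64, latency_ms: u64, checks: &HashMap<&'static str, bool>) -> bool,
}

#[derive(Debug, PartialEq)]
enum TestStatus {
    Passed,
    /// At least one check failed: a human should assess the response.
    ConstraintsFailed,
    /// An assert failed: the test is reported as an error.
    AssertFailed,
}

/// Evaluate constraints in declaration order. Each predicate only sees the
/// results of named checks that ran before it.
fn evaluate(result: i64, latency_ms: u64, constraints: &[Constraint]) -> TestStatus {
    let mut checks: HashMap<&'static str, bool> = HashMap::new();
    let mut any_check_failed = false;
    for c in constraints {
        let ok = (c.predicate)(result, latency_ms, &checks);
        match c.level {
            // Assumption: the first failing assert ends evaluation.
            Level::Assert if !ok => return TestStatus::AssertFailed,
            Level::Assert => {}
            Level::Check => {
                any_check_failed |= !ok;
                if let Some(name) = c.label {
                    checks.insert(name, ok);
                }
            }
        }
    }
    if any_check_failed {
        TestStatus::ConstraintsFailed
    } else {
        TestStatus::Passed
    }
}
```

With a `fast` check on latency followed by an assert on `checks["fast"]`, a slow response makes the check fail and the assert then turns that into a test error, which mirrors the second screenshot.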
1 parent 93b393d, commit cafd2ea

Showing 28 changed files with 1,035 additions and 95 deletions.
engine/baml-lib/baml-core/src/validate/validation_pipeline/validations/tests.rs (94 changes: 94 additions & 0 deletions)
`@@ -0,0 +1,94 @@`

```rust
use baml_types::{Constraint, ConstraintLevel};
use internal_baml_diagnostics::{DatamodelError, DatamodelWarning, Span};
use internal_baml_jinja_types::{validate_expression, JinjaContext, PredefinedTypes, Type};

use crate::validate::validation_pipeline::context::Context;

pub(super) fn validate(ctx: &mut Context<'_>) {
    let tests = ctx.db.walk_test_cases().collect::<Vec<_>>();
    tests.iter().for_each(|walker| {
        let constraints = &walker.test_case().constraints;
        let args = &walker.test_case().args;
        let mut check_names: Vec<String> = Vec::new();
        for (
            Constraint {
                label,
                level,
                expression,
            },
            constraint_span,
            expr_span,
        ) in constraints.iter()
        {
            let mut defined_types = PredefinedTypes::default(JinjaContext::Parsing);
            defined_types.add_variable("this", Type::Unknown);
            defined_types.add_class(
                "Checks",
                check_names
                    .iter()
                    .map(|check_name| (check_name.clone(), Type::Unknown))
                    .collect(),
            );
            defined_types.add_class(
                "_",
                vec![
                    ("checks".to_string(), Type::ClassRef("Checks".to_string())),
                    ("result".to_string(), Type::Unknown),
                    ("latency_ms".to_string(), Type::Number),
                ]
                .into_iter()
                .collect(),
            );
            defined_types.add_variable("_", Type::ClassRef("_".to_string()));
            args.keys()
                .for_each(|arg_name| defined_types.add_variable(arg_name, Type::Unknown));
            match (level, label) {
                (ConstraintLevel::Check, Some(check_name)) => {
                    check_names.push(check_name.to_string());
                }
                _ => {}
            }
            match validate_expression(expression.0.as_str(), &mut defined_types) {
                Ok(_) => {}
                Err(e) => {
                    if let Some(e) = e.parsing_errors {
                        let range = match e.range() {
                            Some(range) => range,
                            None => {
                                ctx.push_error(DatamodelError::new_validation_error(
                                    &format!("Error parsing jinja template: {}", e),
                                    expr_span.clone(),
                                ));
                                continue;
                            }
                        };

                        let start_offset = expr_span.start + range.start;
                        let end_offset = expr_span.start + range.end;

                        let span = Span::new(
                            expr_span.file.clone(),
                            start_offset as usize,
                            end_offset as usize,
                        );

                        ctx.push_error(DatamodelError::new_validation_error(
                            &format!("Error parsing jinja template: {}", e),
                            span,
                        ))
                    } else {
                        e.errors.iter().for_each(|t| {
                            let tspan = t.span();
                            let span = Span::new(
                                expr_span.file.clone(),
                                expr_span.start + tspan.start_offset as usize,
                                expr_span.start + tspan.end_offset as usize,
                            );
                            ctx.push_warning(DatamodelWarning::new(t.message().to_string(), span))
                        })
                    }
                }
            }
        }
    });
}
```
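Two ideas in this validation pass may be easier to see in isolation. The `Checks` class is built from `check_names` before the current constraint's own label is pushed, so an expression only sees checks declared earlier. Error ranges reported relative to the expression are also shifted by `expr_span.start` to produce file offsets. The sketch below is a simplified, standalone model of both: `FileSpan`, `to_file_span` and `unknown_check_refs` are hypothetical names, and the list of referenced check names is assumed to be extracted already rather than parsed from jinja.

```rust
/// Byte range within a file (hypothetical simplification of the diagnostics `Span`).
#[derive(Debug, PartialEq)]
struct FileSpan {
    start: usize,
    end: usize,
}

/// Shift a range that is relative to the start of an expression into file offsets.
fn to_file_span(expr_start: usize, rel_start: usize, rel_end: usize) -> FileSpan {
    FileSpan {
        start: expr_start + rel_start,
        end: expr_start + rel_end,
    }
}

/// Each input entry is (label of the constraint if it is a named check,
/// check names it references via `_.checks.NAME`).
/// Returns, per constraint, the referenced names that are not visible yet.
fn unknown_check_refs<'a>(
    constraints: &[(Option<&'a str>, Vec<&'a str>)],
) -> Vec<Vec<&'a str>> {
    let mut visible: Vec<&'a str> = Vec::new();
    let mut out: Vec<Vec<&'a str>> = Vec::new();
    for (label, refs) in constraints {
        // Resolve references against the checks declared so far ...
        out.push(
            refs.iter()
                .copied()
                .filter(|r| !visible.contains(r))
                .collect(),
        );
        // ... and only then make this constraint's own name visible to later ones.
        if let Some(name) = *label {
            visible.push(name);
        }
    }
    out
}
```

Under this model a check that refers to itself, or to a check declared later in the block, is reported as unknown, which corresponds to the compiler warnings described in the PR text.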