Fixed conformance reporting #3348

Razican · 2023-10-03T19:09:17Z

We received reports in our Discord saying that our conformance representation was wrong.

We count a test that needs to be run both in strict and non-strict mode as 2 tests, and if it passes in both modes, we count 2 tests as passing, while if only one of them passes, we count one as passed, one as failed.

The correct interpretation is that a test is only considered passing if both strict and no-strict modes pass (tc39/test262#3871).

It changes the following:

TestResult now contains 2 fields: strict and no_strict. strict is serialized as s, and no_strict is serialized as r (from "result")
We don't store result strings in the TestResult structure, since it was being skipped and never used
We no longer store feature information, since we were not using it, and the idea behind it was to get conformance per feature, not to have a feature list.
Minified output by using test / suite names as keys
Properly calculate conformance & comparisons

This requires a change in the conformance page to adhere to the new result format, and we need to migrate the current conformance results. I'll open a new PR for that.

This PR will fail when doing the result comparison, thoguh, so we might need to first merge the new result format, then this.

codecov · 2023-10-03T19:25:33Z

Codecov Report

Attention: Patch coverage is 0% with 51 lines in your changes are missing coverage. Please review.

Project coverage is 45.42%. Comparing base (6ddc2b4) to head (b68e588).
Report is 43 commits behind head on main.

❗ Current head b68e588 differs from pull request most recent head dca72d4. Consider uploading reports for the commit dca72d4 to get more accurate results

Files	Patch %	Lines
boa_tester/src/exec/mod.rs	0.00%	25 Missing ⚠️
boa_tester/src/results.rs	0.00%	25 Missing ⚠️
boa_tester/src/main.rs	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3348      +/-   ##
==========================================
- Coverage   47.24%   45.42%   -1.82%     
==========================================
  Files         476      483       +7     
  Lines       46892    49768    +2876     
==========================================
+ Hits        22154    22609     +455     
- Misses      24738    27159    +2421

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

jedel1043 · 2023-10-05T17:27:41Z

boa_tester/src/main.rs

+    strict: Option<TestOutcomeResult>,
+    #[serde(rename = "r", default)]
+    no_strict: Option<TestOutcomeResult>,


We could simplify this by making TestOutcomeResult something like:

#[serde(tag = "t", content = "d") enum TestOutcomeResult { #[serde(rename = "O")] Passed, #[serde(rename = "I")] Ignored, #[serde(rename = "F")] Failed { strict: bool, reason: String }, #[serde(rename = "P")] Panic { strict: bool, reason: String, } }

Then, we won't need separate fields for strict and no_strict, just a single result.

I was thinking about that, and also about the file size & JS performance in the browser. I think that having dictionaries with test/suite names as keys would decrease the size of the JSON around a 15%.

Booleans occupy a bit more. Maybe adding more variants, such as "StrictFailed" and so on could be more performant.

Maybe it's time to transition to a binary serialization format such as MessagePack?

Another idea I had was to use Postcard and a small wasm glue code to deserialize the postcard and serialize it to JSON, then we would call that function from the conformance page.

I'll play a bit with those

I've given this a thought, and it could be a nice enhancement in the future, but I would for now just use JSON until the new website is up and running. What do you think?

Yep, we can revisit this when we finish the site redesign.

Putting that aside, I still think it would be better to have FailedStrict and PanicStrict for simplicity, if we do use that info. If we don't, maybe we can obviate the need for those variants and just have a single result field?

Razican · 2023-10-24T20:42:34Z

I converted it back to draft, because I'm still missing:

ensure that the output result is the same as the new result in the website
ensure that "short" results only contain the last value (except for the main branch)

jedel1043 · 2024-02-15T01:05:26Z

Since we will probably do a release before this is finished, and we need to fix our tester before then, I'll open a PR with a simpler version of this that just fixes the bug without changing the data representation too much. We can apply the changes of this PR after that.

Razican · 2024-02-24T20:03:45Z

This has been rebased. It's still missing:

ensure that the output result is the same as the new result in the website
ensure that "short" results only contain the last value (except for the main branch)

Razican added the test Issues and PRs related to the tests. label Oct 3, 2023

Razican added this to the v0.18.0 milestone Oct 3, 2023

Razican requested a review from a team October 3, 2023 19:09

jedel1043 mentioned this pull request Oct 3, 2023

Bump test262 #3349

Merged

jedel1043 reviewed Oct 5, 2023

View reviewed changes

Razican added a commit that referenced this pull request Oct 8, 2023

Fixed latest.json result files for the new format (#3348)

9fe5f48

Razican mentioned this pull request Oct 8, 2023

Use the new format for the test262 representation #3364

Closed

6 tasks

Razican added a commit that referenced this pull request Oct 8, 2023

Fixed latest.json result files for the new format (#3348)

34615ec

Razican marked this pull request as draft October 24, 2023 20:41

Razican force-pushed the real_conformance branch from 391416e to b68e588 Compare October 27, 2023 16:08

Razican added a commit that referenced this pull request Oct 27, 2023

Fixed latest.json result files for the new format (#3348)

46fbe14

jedel1043 added the waiting-on-author Waiting on PR changes from the author label Nov 29, 2023

Fixed conformance reporting

3a8360b

Razican force-pushed the real_conformance branch from 0ba1929 to 3a8360b Compare February 24, 2024 19:56

Small optimization

dca72d4

Razican added a commit that referenced this pull request Feb 24, 2024

Fixed latest.json result files for the new format (#3348)

4ed8aff

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixed conformance reporting #3348

Fixed conformance reporting #3348

Razican commented Oct 3, 2023 •

edited

Loading

codecov bot commented Oct 3, 2023 •

edited

Loading

jedel1043 Oct 5, 2023 •

edited

Loading

Razican Oct 5, 2023

jedel1043 Oct 5, 2023

Razican Oct 5, 2023

Razican Oct 8, 2023

jedel1043 Oct 8, 2023 •

edited

Loading

Razican commented Oct 24, 2023

jedel1043 commented Feb 15, 2024

Razican commented Feb 24, 2024

Fixed conformance reporting #3348

Are you sure you want to change the base?

Fixed conformance reporting #3348

Conversation

Razican commented Oct 3, 2023 • edited Loading

codecov bot commented Oct 3, 2023 • edited Loading

Codecov Report

jedel1043 Oct 5, 2023 • edited Loading

Choose a reason for hiding this comment

Razican Oct 5, 2023

Choose a reason for hiding this comment

jedel1043 Oct 5, 2023

Choose a reason for hiding this comment

Razican Oct 5, 2023

Choose a reason for hiding this comment

Razican Oct 8, 2023

Choose a reason for hiding this comment

jedel1043 Oct 8, 2023 • edited Loading

Choose a reason for hiding this comment

Razican commented Oct 24, 2023

jedel1043 commented Feb 15, 2024

Razican commented Feb 24, 2024

Razican commented Oct 3, 2023 •

edited

Loading

codecov bot commented Oct 3, 2023 •

edited

Loading

jedel1043 Oct 5, 2023 •

edited

Loading

jedel1043 Oct 8, 2023 •

edited

Loading