[Question] What does threshold mean in a method toMatchSnapshot #10219

Closed
KirProkopchik opened this issue Nov 10, 2021 · 13 comments

Comments

@KirProkopchik

Hi all,
Now I am using jest as a test runner with jest-playwright-preset and jest-image-snapshot for image comparisons.
Versions:
jest: 27.0.4
jest-playwright-preset: 1.7.0
jest-image-snapshot: 4.5.1
playwright: 1.16.2

Screenshot comparisons look like this:

const customConfig = { threshold: 0.5 };
expect(image).toMatchImageSnapshot({
  customDiffConfig: customConfig,
});

Here, threshold is the ratio of differing pixels to the total number of pixels, or the maximum number of differing pixels if threshold is > 1.

When I started migrating to the native Playwright runner, I ran into an image comparison issue. The expected and actual screenshots are not equal, but the difference is only a few pixels in an image with a resolution of 1920x1080. I tried setting the threshold to 0.5 or even more, without any positive results.

I started looking at the pixelmatch source and found out that threshold determines how much each pixel can differ in color, not the number of different pixels.
https://github.com/mapbox/pixelmatch/blob/b9261a447515f5aff37a15cfab9f4a491868f720/index.js#L45

Further, the number of different pixels is simply compared with zero in the Playwright matcher.

return count > 0 ? { diff: PNG.sync.write(diff) } : null;

In my case, this approach leads to failures even when only a few pixels differ and the threshold is fairly high.
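
The distinction described above can be illustrated with a minimal sketch. It assumes a simplified color distance (largest channel difference divided by 255) rather than the metric pixelmatch actually uses, and the names Rgb, colorDistance, countDifferentPixels and imagesMatch are hypothetical. It only shows why a per-pixel threshold combined with a `count > 0` check fails as soon as a handful of pixels differ strongly:

```typescript
// Simplified stand-in for a per-pixel comparison; NOT pixelmatch's actual metric.
type Rgb = [number, number, number];

// Normalized color distance in [0, 1]: largest channel difference divided by 255.
function colorDistance(a: Rgb, b: Rgb): number {
  return Math.max(
    Math.abs(a[0] - b[0]),
    Math.abs(a[1] - b[1]),
    Math.abs(a[2] - b[2]),
  ) / 255;
}

// Counts pixels whose color distance exceeds the per-pixel threshold.
function countDifferentPixels(expected: Rgb[], actual: Rgb[], threshold: number): number {
  if (expected.length !== actual.length)
    throw new Error('Image sizes differ; they cannot be diffed pixel by pixel');
  let count = 0;
  for (let i = 0; i < expected.length; i++) {
    if (colorDistance(expected[i], actual[i]) > threshold)
      count++;
  }
  return count;
}

// Mirrors the `count > 0` check quoted above: a single differing pixel is a failure.
function imagesMatch(expected: Rgb[], actual: Rgb[], threshold: number): boolean {
  return countDifferentPixels(expected, actual, threshold) === 0;
}

// 1000 white pixels, three of which turn black in the second image.
const expected: Rgb[] = Array.from({ length: 1000 }, (): Rgb => [255, 255, 255]);
const actual: Rgb[] = expected.map((p, i): Rgb => (i < 3 ? [0, 0, 0] : p));

const differing = countDifferentPixels(expected, actual, 0.5);
const matches = imagesMatch(expected, actual, 0.5);
console.log(differing, matches);
```

In this sketch, raising the threshold only changes which pixels are counted as different; it never allows a non-zero count to pass.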

@aslushnikov aslushnikov self-assigned this Nov 10, 2021
@aslushnikov
Collaborator

@KirProkopchik Thank you for the input! Pixelmatch threshold is not very well explained in their docs, so we could've been confused. Let me look into this.

@GitHubby23

GitHubby23 commented Nov 23, 2021

I observe the same behavior. Subsequent runs can take new snapshots that differ by 1 pixel in resolution. toMatchSnapshot fails regardless of threshold value, even when bumped up to 1.

@p01
Contributor

p01 commented Dec 9, 2021

In Pixelmatch, which toMatchSnapshot(...) currently uses, the threshold is the allowed color difference, as a fraction, for one individual pixel.

If a single pixel exceeds that threshold, the images are considered different.
Images are also considered different if they have different resolution.

@dimkin-eu

dimkin-eu commented Dec 23, 2021

There is an older ticket where pictures with big and small diffs are treated the same at the same threshold:
#9444

Maybe it's worth investigating some alternative?

@frkj600

frkj600 commented Feb 2, 2022

Hi Team,
I noticed the same behavior while running test cases in BrowserStack, with threshold set to 1:
"@playwright/test": "^1.17.2",
"playwright": "^1.17.2",

image

@p01
Contributor

p01 commented Feb 2, 2022

@dimkin-eu I developed a pixel-buffer-diff library that takes care of partial pixel differences to reduce false positives. Additionally, it is 4x faster than pixelmatch (the image comparison lib used in Playwright). You can see PR #10823, which tried to bump to an early version of the library. The current version, 1.3.2, addresses some issues found around that time.

Could you try it on your project and let us know if it helps?

@fanjum66 the screenshots in your example have different resolutions and therefore could not be diffed.

@aslushnikov aslushnikov added v1.20 and removed v1.19 labels Feb 2, 2022
@frkj600

frkj600 commented Feb 3, 2022

@p01 - Thanks for your quick response. I think that even with a different library, the line below would still throw an error, because the PNG library is used to compare the sizes: https://github.com/p01/playwright/blob/5148c1dbf62ae2bc3c36083fc9d9fa54c9405e0d/packages/playwright-test/src/matchers/golden.ts#L63

It would be better if a flag could be used to ignore the size difference and focus only on the data change.

Also, is there a Playwright version I can use to test the mentioned fix?

@p01
Contributor

p01 commented Feb 3, 2022

@fanjum66 In your example, one image is 1px taller than the other. How can the pixel buffer diffing library make sense of that? Is the whole page/UI really 1px taller? Is everything shifted 1px towards the bottom? Towards the top? Any way you look at it, these snapshots won't match and need to be reviewed. It could be a true positive where your page/UI really did shrink by 1px, but it is likely a real issue that needs to be investigated, to figure out why the screenshots have different resolutions, fix the tests, and make sure they are deterministic.

To try pixel-buffer-diff in a recent version of Playwright, one could fork Playwright and cherry-pick / redo the commit #9f913d94791e3e11dff98461609c729069fffc15 with the latest version of pixel-buffer-diff. If you have access to GitHub Codespaces, this should be fairly quick. You could also pull PR #11838 and use the env variable PW_USE_PIXEL_BUFFER_DIFF.

@p01
Contributor

p01 commented Feb 3, 2022

@aslushnikov @pavelfeldman @dgozman, it seems that Playwright uses a threshold of 0.2, that is, a 20% difference per pixel. Was this threshold inspired by other test runners?

const thresholdOptions = { threshold: 0.2, ...options };

If we look at the color black, #000000, this 0.2 threshold means that the furthest color Playwright will treat as the same is #252525. This is a very significant difference.

In our tests, we use 0.03, which brings the furthest possible color from black to #050505.

@p01
Contributor

p01 commented Feb 3, 2022

Let's open a PR: #11838 "Add pixel-buffer-diff behind env variable opt-in", so anyone who wants to can try it without disrupting existing tests.

@aslushnikov
Collaborator

@p01 Did our defaults result in some tests passing erroneously? With the newly suggested pixelCount option (with a non-zero default value), it might be possible to tighten the threshold back to some extent.

aslushnikov added a commit to aslushnikov/playwright that referenced this issue Feb 17, 2022
This patch adds additional options to `toMatchSnapshot` method:
- `pixelCount` - acceptable number of pixels that differ to still
  consider images equal.
- `pixelRatio` - acceptable percentage of pixels that differ to still
  consider images equal.

Since some anti-aliasing artifacts can still creep in, we default
`pixelCount` to some arbitrary small number - `17` - to improve
tolerance.

Fixes microsoft#12167, microsoft#10219
aslushnikov added a commit that referenced this issue Feb 17, 2022
…12169)

This patch adds additional options to `toMatchSnapshot` method:
- `pixelCount` - acceptable number of pixels that differ to still
  consider images equal. Unset by default.
- `pixelRatio` - acceptable ratio of all image pixels (from 0 to 1) that differ to still
  consider images equal. Unset by default.

Fixes #12167, #10219
@aslushnikov
Collaborator

Everybody: We just landed two new options to toMatchSnapshot: pixelCount and pixelRatio.
With these, you can have the following options to configure snapshot matching:

  • threshold <[float]> an acceptable perceived color difference in the YIQ color space between pixels in compared images, between zero (strict) and one (lax). Defaults to 0.2.
  • pixelCount <[int]> an acceptable number of pixels that could be different, unset by default.
  • pixelRatio <[float]> an acceptable ratio of pixels that are different to the total number of pixels, between 0 and 1, unset by default.

You can give them a try using npm i @playwright/test@next. Let us know how it goes!

@turnkey-commerce

@aslushnikov I gave them a try on a scenario and they work well for me, thanks very much. I found that the options are actually named:

  • maxDiffPixelRatio
  • maxDiffPixels
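
As a rough illustration of how such limits could turn a count of differing pixels into a pass/fail decision, here is a minimal sketch. Only the option names maxDiffPixels and maxDiffPixelRatio come from this thread. SnapshotOptions, allowedDiffPixels and snapshotPasses are hypothetical names, and applying the stricter limit when both options are set is an assumption, not confirmed Playwright behavior:

```typescript
// Hypothetical helper types and functions; only the option names come from the thread.
interface SnapshotOptions {
  maxDiffPixels?: number;      // absolute number of pixels allowed to differ
  maxDiffPixelRatio?: number;  // fraction (0 to 1) of all pixels allowed to differ
}

// Converts the options into a single limit on the number of differing pixels.
function allowedDiffPixels(totalPixels: number, options: SnapshotOptions): number {
  const limits: number[] = [];
  if (options.maxDiffPixels !== undefined)
    limits.push(options.maxDiffPixels);
  if (options.maxDiffPixelRatio !== undefined)
    limits.push(options.maxDiffPixelRatio * totalPixels);
  // With neither option set, no differing pixel is tolerated.
  // Assumption: when both are set, the stricter limit applies.
  return limits.length === 0 ? 0 : Math.min(...limits);
}

// Decides whether a comparison passes, given the number of differing pixels.
function snapshotPasses(diffCount: number, totalPixels: number, options: SnapshotOptions = {}): boolean {
  return diffCount <= allowedDiffPixels(totalPixels, options);
}

// A 1920x1080 screenshot, as in the original report.
const totalPixels = 1920 * 1080;
console.log(snapshotPasses(5, totalPixels));
console.log(snapshotPasses(5, totalPixels, { maxDiffPixels: 17 }));
console.log(snapshotPasses(5000, totalPixels, { maxDiffPixelRatio: 0.001 }));
```

Under these assumptions, a few stray pixels in a 1920x1080 screenshot fail with no options set but pass once a small maxDiffPixels value is allowed, while the per-pixel threshold keeps its separate meaning.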
