
[Salvo] Add documentation 'Measure Envoy's Performance Change with an A/B Testing'#127

Merged
mum4k merged 6 commits into envoyproxy:main from gyohuangxin:envoy_development
Mar 17, 2022

Conversation

@gyohuangxin
Member

Fixed: #122

Signed-off-by: Huang Xin xin1.huang@intel.com

Contributor

@mum4k left a comment


Thank you for starting this @gyohuangxin. We have a good introduction of the user story and a set of commands to run Salvo.

What we should invest more time in is connecting the story to the conclusion. Salvo currently runs only one example test scenario (the one pytest target). That target wasn't developed with an exact goal in mind; it serves as an example of how to use the testing framework. It is unlikely that this test will show useful results when it comes to comparing performance.

Additionally, the users of Salvo are likely unaware of how to interpret the pytest and Nighthawk outputs, which is another roadblock we should try to overcome.

What we could do is connect the dots from the user story down to the very end of interpreting the results. How would you feel about making the story very concrete, i.e., comparing two commits where one introduces a clear performance degradation in Envoy? We can then develop a specific test that exercises this new hot path and show how to read the performance results, i.e., prove exactly how we can identify a performance degradation using Salvo and Nighthawk.

Happy to talk more on Slack or in a meeting to hash out the details.

@gyohuangxin
Member Author

@mum4k Thanks for your review and suggestions; they will greatly improve the clarity of this doc.

> What we should invest more time in is connecting the story to the conclusion. Salvo currently runs only one example test scenario (the one pytest target). That target wasn't developed with an exact goal in mind; it serves as an example of how to use the testing framework. It is unlikely that this test will show useful results when it comes to comparing performance.
>
> Additionally, the users of Salvo are likely unaware of how to interpret the pytest and Nighthawk outputs, which is another roadblock we should try to overcome.

By "the one example test scenario", do you mean the test_discovery.py file from https://github.com/envoyproxy/nighthawk/tree/main/benchmarks/test? I can understand that it "may not show useful results when it comes to comparing performance", because it's a common example file. Maybe I can find a real PR as an example and write a specific test file; I can ask my team members what they care about and which test cases we should write.

And regarding the Nighthawk outputs: yes, they are not easy to understand, and explaining every item would be a huge job. We can take two or three items as examples. Have you or other Nighthawk developers already written any doc about the Nighthawk outputs that we can link to?
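Until such a document exists, here is a minimal sketch of what "reading the output" could look like. The JSON shape below is a deliberately simplified assumption loosely modeled on Nighthawk's JSON output; the field names (`results`, `statistics`, `percentiles`, `duration_us`) are placeholders for illustration, not the exact schema:

```python
import json

# Hypothetical, simplified excerpt of a Nighthawk result. The real output
# is much richer; the field names here are assumptions for illustration.
RAW = """
{
  "results": [{
    "statistics": [{
      "id": "benchmark_http_client.request_to_response",
      "percentiles": [
        {"percentile": 0.5, "duration_us": 310},
        {"percentile": 0.9, "duration_us": 560},
        {"percentile": 0.99, "duration_us": 1450}
      ]
    }]
  }]
}
"""

def latency_percentiles(raw: str, stat_id: str) -> dict:
    """Extract {percentile: duration_us} for one statistic id."""
    doc = json.loads(raw)
    for result in doc["results"]:
        for stat in result["statistics"]:
            if stat["id"] == stat_id:
                return {p["percentile"]: p["duration_us"]
                        for p in stat["percentiles"]}
    raise KeyError(stat_id)

pcts = latency_percentiles(RAW, "benchmark_http_client.request_to_response")
print(pcts[0.99])  # tail (p99) latency in microseconds
```

The point of the sketch is that a reader mostly cares about a handful of latency percentiles per statistic, which is a small slice of the full output.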

> What we could do is connect the dots from the user story down to the very end of interpreting the results. How would you feel about making the story very concrete, i.e., comparing two commits where one introduces a clear performance degradation in Envoy? We can then develop a specific test that exercises this new hot path and show how to read the performance results, i.e., prove exactly how we can identify a performance degradation using Salvo and Nighthawk.

It's a good point; I was thinking about this while writing the documentation too. What I have in mind for the future organization/tree of the user-story documentation is:

├── salvo
│   ├── docs
│   │   ├── ENVOY_DEVELOP_WORKFLOW.md # the doc of user story "Envoy Developer to perform an A/B testing"
│   │   ├── ENVOY_CI_WORKFLOW.md # the doc of user story "Integration Salvo with Envoy CI system"
│   │   ├── ENVOY_USERS_WORKFLOW.md # the doc of user story "Envoy Users to perform an Envoy performance test on their hardware"
│   │   ├── CULPRIT_FINDING_WORKFLOW.md
│   │   ├── ....... 
│   ├── README.md # we need to add a link list of above workflows

What do you think about it?

@mum4k
Contributor

mum4k commented Feb 15, 2022

@gyohuangxin for test cases, we could start by developing a generic Nighthawk-based load test case that would measure the latency at a set QPS (open-loop mode), or measure the achieved QPS (closed-loop mode). I would be happy to share what we learned about using Nighthawk for testing load balancers to help design such a test case. We can set up another meeting or communicate over Slack, whichever you prefer. Such a test case probably wouldn't PASS/FAIL on a specific threshold, but would report the measured values for comparative purposes.
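A comparative (rather than pass/fail) report of that kind could be sketched as follows. The metric names and measured values here are hypothetical, and `compare_runs` is an illustrative helper, not part of Salvo or Nighthawk:

```python
def compare_runs(baseline: dict, experiment: dict) -> dict:
    """Report the relative change of each metric, in percent.

    Both inputs map metric name -> measured value, e.g. achieved QPS in
    closed-loop mode or a latency percentile in open-loop mode. Positive
    means the experiment measured higher than the baseline. This is a
    comparative report, not a pass/fail gate.
    """
    report = {}
    for metric, base in baseline.items():
        exp = experiment[metric]
        report[metric] = (exp - base) / base * 100.0  # percent change
    return report

# Hypothetical measurements from two Salvo runs (baseline vs. candidate commit).
baseline = {"achieved_qps": 14200.0, "p99_latency_us": 1450.0}
candidate = {"achieved_qps": 12800.0, "p99_latency_us": 1900.0}
for metric, delta in compare_runs(baseline, candidate).items():
    print(f"{metric}: {delta:+.1f}%")
```

A drop in achieved QPS together with a rise in tail latency, as in the hypothetical numbers above, is the signature of the kind of regression the A/B story is trying to surface.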

As far as I know, we don't have a document that explains how to read the Nighthawk outputs yet, but I agree that we need one to make the Salvo document useful. If you do feel like undertaking this as part of the Salvo improvements you are working on, we should probably add such a document to the Nighthawk repository itself. We can then link to it from here. As with the test development, I am happy to share what we know to help bootstrap such a document.

@gyohuangxin
Member Author

gyohuangxin commented Feb 15, 2022

@mum4k Thank you. I would prefer a meeting, schedules permitting; your knowledge of Nighthawk will help me a lot in understanding what we should do next.

@gyohuangxin
Member Author

@mum4k Thanks for the offline meeting. I updated the documentation as we discussed: I added details of the test cases to explain the composition of the test case file, and an image of the report to show the result output intuitively.

As we mentioned on Slack, there's still much for us to improve and investigate about the test cases and outputs. Therefore, it's difficult to define everything now; maybe we can improve it continuously in future PRs. What do you think?

@mum4k
Contributor

mum4k commented Mar 3, 2022

@gyohuangxin improving this iteratively in multiple PRs sounds like a good plan. Please update this PR from the main branch so that it is in a merge-ready state, and let me know.

I will take a few days to review this, as I am planning to run it according to the instructions to verify them.

Huang Xin added 3 commits March 3, 2022 13:34
… A/B Testing'

Signed-off-by: Huang Xin <xin1.huang@intel.com>
…typos in README.md

Signed-off-by: Huang Xin <xin1.huang@intel.com>
…_WORKFLOW.md

Signed-off-by: Huang Xin <xin1.huang@intel.com>
@gyohuangxin
Member Author

@mum4k Thanks for the reminder, updated.


- `_run_benchmark` function:

At first, it defines a function named [`_run_benchmark`](https://github.com/envoyproxy/nighthawk/blob/main/benchmarks/test/test_discovery.py#L20) to run the specific PyTest fixture, which will define the behavior of Envoy and the [Nighthawk test server](https://github.com/envoyproxy/nighthawk/blob/main/source/server/README.md) to be tested; you can find the fixture definitions in these two files:
Contributor


The portion saying "Nighthawk test server to be tested" makes it sound like we are testing the Nighthawk test server.

Can we rephrase this to avoid confusion? Also, similarly to the discussion we had offline, it might be good to add a paragraph explaining the architecture of the test. I.e. we have Nighthawk, Envoy, and the test server; we should explain their roles. (optional) We could even add a diagram.

Member Author


It makes sense; an explanation of their roles and a diagram will help understanding. I will rephrase it.

Huang Xin added 3 commits March 11, 2022 11:32
Signed-off-by: Huang Xin <xin1.huang@intel.com>
Signed-off-by: Huang Xin <xin1.huang@intel.com>
…VELOP_WORKFLOW.md

Signed-off-by: Huang Xin <xin1.huang@intel.com>
@gyohuangxin
Member Author

@mum4k The documentation has been updated based on your comments. Could you review it again? There are two TODOs:

  1. Try the real commits you mentioned and convert the fake commits to real ones.
  2. Add a separate file named test_architecture.md that introduces the architecture of Salvo, and link to it from README.md and this doc.

The above two things will take time to research and understand, so I will create issues and raise PRs for them.

Contributor

@mum4k left a comment


Thank you @gyohuangxin for this contribution. This document makes Salvo more accessible and easier to understand.

mum4k merged commit 6127e3a into envoyproxy:main on Mar 17, 2022


Development

Successfully merging this pull request may close these issues.

Can you please give a more detailed readme of how to use salvo?
