
[Salvo] Add documentation 'Measure Envoy's Performance Change with an A/B Testing'#127

Merged
mum4k merged 6 commits into envoyproxy:main from gyohuangxin:envoy_development
Mar 17, 2022

Conversation

@gyohuangxin
Member

Fixed: #122

Signed-off-by: Huang Xin xin1.huang@intel.com

Contributor

@mum4k left a comment


Thank you for starting this @gyohuangxin. We have a good introduction of the user story and a set of commands to run Salvo.

What we should invest more time in is connecting the story to the conclusion. Salvo currently runs only one example test scenario (the one pytest target). That target wasn't developed with an exact goal in mind; it serves as an example of how to use the testing framework. It is unlikely that this test will show useful results when it comes to comparing performance.

Additionally, the users of Salvo are likely unaware of how to interpret the pytest and Nighthawk outputs, which is another roadblock we should try to overcome.

What we could do is connect the dots from the user story down to the very end of interpreting the results. How would you feel about making the story very concrete, i.e., comparing two commits where one introduces a clear performance degradation in Envoy? We can then develop a specific test that exercises this new hot path and show how to read the performance results, i.e., prove exactly how we can identify a performance degradation using Salvo and Nighthawk.

Happy to talk more on Slack or in a meeting to hash out the details.

@gyohuangxin
Member Author

@mum4k Thanks for your review and suggestions; they will greatly improve the clarity of this doc.

> What we should invest more time in is connecting the story to the conclusion. Salvo currently runs only one example test scenario (the one pytest target). That target wasn't developed with an exact goal in mind; it serves as an example of how to use the testing framework. It is unlikely that this test will show useful results when it comes to comparing performance.
>
> Additionally, the users of Salvo are likely unaware of how to interpret the pytest and Nighthawk outputs, which is another roadblock we should try to overcome.

By "the one example test scenario", do you mean the test_discovery.py file from https://github.com/envoyproxy/nighthawk/tree/main/benchmarks/test? I can understand that it "may not show useful results when it comes to comparing performance", because it's a common example file. Maybe I can find a real PR as an example and write a specific test file; I can ask my team members what they care about and which test cases we should write.

And regarding the Nighthawk outputs: yes, they are not easy to understand, and explaining every item would be a huge job. We can take two or three items as examples. Have you or other Nighthawk developers already written any doc about the Nighthawk outputs that we can link to?
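Until such a document exists, here is a minimal sketch of what "reading the output" could look like. The JSON shape below is a deliberately simplified assumption loosely modeled on Nighthawk's JSON output; the field names (`results`, `statistics`, `percentiles`, `duration_us`) are placeholders for illustration, not the exact schema:

```python
import json

# Hypothetical, simplified excerpt of a Nighthawk result. The real output
# is much richer; the field names here are assumptions for illustration.
RAW = """
{
  "results": [{
    "statistics": [{
      "id": "benchmark_http_client.request_to_response",
      "percentiles": [
        {"percentile": 0.5, "duration_us": 310},
        {"percentile": 0.9, "duration_us": 560},
        {"percentile": 0.99, "duration_us": 1450}
      ]
    }]
  }]
}
"""

def latency_percentiles(raw: str, stat_id: str) -> dict:
    """Extract {percentile: duration_us} for one statistic id."""
    doc = json.loads(raw)
    for result in doc["results"]:
        for stat in result["statistics"]:
            if stat["id"] == stat_id:
                return {p["percentile"]: p["duration_us"]
                        for p in stat["percentiles"]}
    raise KeyError(stat_id)

pcts = latency_percentiles(RAW, "benchmark_http_client.request_to_response")
print(pcts[0.99])  # tail (p99) latency in microseconds
```

The point of the sketch is that a reader mostly cares about a handful of latency percentiles per statistic, which is a small slice of the full output.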

> What we could do is connect the dots from the user story down to the very end of interpreting the results. How would you feel about making the story very concrete, i.e., comparing two commits where one introduces a clear performance degradation in Envoy? We can then develop a specific test that exercises this new hot path and show how to read the performance results, i.e., prove exactly how we can identify a performance degradation using Salvo and Nighthawk.

It's a good point; I was thinking about this while writing the documentation too. What I have in mind for the future organization/tree of the user-story documentation is:

├── salvo
│   ├── docs
│   │   ├── ENVOY_DEVELOP_WORKFLOW.md # the doc of user story "Envoy Developer to perform an A/B testing"
│   │   ├── ENVOY_CI_WORKFLOW.md # the doc of user story "Integration Salvo with Envoy CI system"
│   │   ├── ENVOY_USERS_WORKFLOW.md # the doc of user story "Envoy Users to perform an Envoy performance test on their hardware"
│   │   ├── CULPRIT_FINDING_WORKFLOW.md
│   │   ├── ....... 
│   ├── README.md # we need to add a link list of above workflows

What do you think about it?

@mum4k
Contributor

mum4k commented Feb 15, 2022

@gyohuangxin for test cases, we could start by developing a generic Nighthawk-based load test case that would measure the latency at a set QPS (open-loop mode), or measure the achieved QPS (closed-loop mode). I would be happy to share what we learned about using Nighthawk for testing load balancers to help design such a test case. We can set up another meeting or communicate over Slack, whichever you prefer. Such a test case probably wouldn't PASS/FAIL on a specific threshold, but would report the measured values for comparative purposes.
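A comparative (rather than pass/fail) report of that kind could be sketched as follows. The metric names and measured values here are hypothetical, and `compare_runs` is an illustrative helper, not part of Salvo or Nighthawk:

```python
def compare_runs(baseline: dict, experiment: dict) -> dict:
    """Report the relative change of each metric, in percent.

    Both inputs map metric name -> measured value, e.g. achieved QPS in
    closed-loop mode or a latency percentile in open-loop mode. Positive
    means the experiment measured higher than the baseline. This is a
    comparative report, not a pass/fail gate.
    """
    report = {}
    for metric, base in baseline.items():
        exp = experiment[metric]
        report[metric] = (exp - base) / base * 100.0  # percent change
    return report

# Hypothetical measurements from two Salvo runs (baseline vs. candidate commit).
baseline = {"achieved_qps": 14200.0, "p99_latency_us": 1450.0}
candidate = {"achieved_qps": 12800.0, "p99_latency_us": 1900.0}
for metric, delta in compare_runs(baseline, candidate).items():
    print(f"{metric}: {delta:+.1f}%")
```

A drop in achieved QPS together with a rise in tail latency, as in the hypothetical numbers above, is the signature of the kind of regression the A/B story is trying to surface.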

As far as I know, we don't have a document that explains how to read the Nighthawk outputs yet, but I agree that we need one to make the Salvo document useful. If you do feel like undertaking this as part of the Salvo improvements you are working on, we should probably add such a document to the Nighthawk repository itself. We can then link to it from here. As with the test development, I am happy to share what we know to help bootstrap such a document.

@gyohuangxin
Member Author

gyohuangxin commented Feb 15, 2022

@mum4k Thank you. I would prefer a meeting, schedules permitting; your knowledge of Nighthawk will help me a lot in understanding what we should do next.

@gyohuangxin
Member Author

@mum4k Thanks for the offline meeting. I updated the documentation as we discussed: I added details of the test cases to explain the composition of the test case file, and an image of the report to show the result output intuitively.

As we mentioned on Slack, there's still much for us to improve and investigate about the test cases and outputs. Therefore, it's difficult to define everything now; maybe we can improve it continuously in future PRs. What do you think?

@mum4k
Contributor

mum4k commented Mar 3, 2022

@gyohuangxin improving this iteratively in multiple PRs sounds like a good plan. Please update this PR from the main branch so that it is in a merge-ready state, and let me know.

I will take a few days to review this, as I am planning to run it according to the instructions to verify them.

Huang Xin added 3 commits March 3, 2022 13:34
… A/B Testing'

Signed-off-by: Huang Xin <xin1.huang@intel.com>
…typos in README.md

Signed-off-by: Huang Xin <xin1.huang@intel.com>
…_WORKFLOW.md

Signed-off-by: Huang Xin <xin1.huang@intel.com>
@gyohuangxin
Member Author

@mum4k Thanks for the reminder, updated.


- `_run_benchmark` function:

At first, it defines a function named [`_run_benchmark`](https://github.com/envoyproxy/nighthawk/blob/main/benchmarks/test/test_discovery.py#L20) to run the specific PyTest fixture, which will define the behavior of Envoy and the [Nighthawk test server](https://github.com/envoyproxy/nighthawk/blob/main/source/server/README.md) to be tested; you can find the fixture definitions in these two files:
Contributor


The portion saying "Nighthawk test server to be tested" makes it sound like we are testing the Nighthawk test server.

Can we rephrase this to avoid confusion? Also, similarly to the discussion we had offline, it might be good to add a paragraph explaining the architecture of the test. I.e. we have Nighthawk, Envoy, and the test server; we should explain their roles. (optional) We could even add a diagram.

Member Author


It makes sense; an explanation of their roles and a diagram will help understanding. I will rephrase it.

Huang Xin added 3 commits March 11, 2022 11:32
Signed-off-by: Huang Xin <xin1.huang@intel.com>
Signed-off-by: Huang Xin <xin1.huang@intel.com>
…VELOP_WORKFLOW.md

Signed-off-by: Huang Xin <xin1.huang@intel.com>
@gyohuangxin
Member Author

@mum4k The documentation has been updated based on your comments. Could you review it again? There are two TODOs:

  1. Try the real commits you mentioned and convert the fake commits to real ones.
  2. Add a separate file named test_architecture.md that introduces the architecture of Salvo, and link to it from README.md and this doc.

The above two things will take time to research and understand, so I will create issues and raise PRs for them.

Contributor

@mum4k left a comment


Thank you @gyohuangxin for this contribution. This document makes Salvo more accessible and easier to understand.

mum4k merged commit 6127e3a into envoyproxy:main on Mar 17, 2022


Development

Successfully merging this pull request may close these issues.

Can you please give a more detailed readme of how to use salvo?
