[Salvo] Add documentation 'Measure Envoy's Performance Change with an A/B Testing' #127
Conversation
mum4k left a comment
Thank you for starting this @gyohuangxin. We have a good introduction of the user story and a set of commands to run Salvo.
What we should invest more time in is connecting the story to the conclusion. Salvo currently runs just one example test scenario (the one pytest target). That target wasn't developed with an exact goal in mind; it serves as an example of how to use the testing framework. It is unlikely that this test will show useful results when it comes to comparing performance.
Additionally, the users of Salvo are likely unaware of how to interpret the pytest and Nighthawk outputs, which is another roadblock we should try to overcome.
What we could do is connect the dots from the user story all the way down to interpreting the results. How would you feel about making the story very concrete? I.e., compare two commits where one introduces a clear performance degradation in Envoy. We can then develop a specific test that exercises this new hot path and show how to read the performance results, i.e., prove exactly how we can identify a performance degradation using Salvo and Nighthawk.
Happy to talk more on Slack or in a meeting to hash out the details.
@mum4k Thanks for your review and suggestions; they will greatly help readers interpret this doc.
Is "the one example test scenario" you mentioned the one pytest target? And regarding the Nighthawk outputs: yes, they are not easy to understand, and explaining every item would be a huge job. We could take two or three items as examples. Have you or other Nighthawk developers already written any doc about the Nighthawk outputs that we can link to?
It's a good point; I was thinking about it when I wrote the documentation too. Here is what I have in mind for the future organization/tree of the stories documentation: What do you think about it?
@gyohuangxin for test cases, we could start by developing a generic Nighthawk-based load test case that would measure the latency at a set QPS (open-loop mode), or the achieved QPS (closed-loop mode). I would be happy to share what we learned about using Nighthawk for testing load balancers to help design such a test case. We can set up another meeting or communicate over Slack, whichever you prefer. Such a test case probably wouldn't PASS/FAIL on a specific threshold, but would report the measured values for comparative purposes. As far as I know, we don't yet have a document that explains how to read the Nighthawk outputs, but I agree that we need one to make the Salvo document useful. If you do feel like undertaking this as part of the Salvo improvements you are working on, we should probably add such a document to the Nighthawk repository itself and link to it from here. As with the test development, I am happy to share what we know to help bootstrap such a document.
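Since such a test case would report measured values rather than PASS/FAIL, the A/B comparison happens as a separate step over the two runs' outputs. A minimal sketch of that comparison step in Python — the metric names and the 5% noise threshold are illustrative assumptions, not part of Salvo or Nighthawk:

```python
def compare_runs(baseline: dict, candidate: dict, noise_threshold: float = 0.05) -> dict:
    """Report the relative change of each shared metric between two runs.

    A change smaller than `noise_threshold` is flagged as likely noise; the
    function only reports values, it never declares PASS/FAIL on its own.
    """
    report = {}
    for metric in baseline.keys() & candidate.keys():
        base, cand = baseline[metric], candidate[metric]
        change = (cand - base) / base
        report[metric] = {
            "baseline": base,
            "candidate": cand,
            "relative_change": round(change, 4),
            "within_noise": abs(change) < noise_threshold,
        }
    return report

# Hypothetical measured values from two Salvo runs (A = baseline commit,
# B = candidate commit); a real workflow would parse these out of the
# Nighthawk output rather than hard-code them.
run_a = {"achieved_qps": 10000.0, "p99_latency_ms": 2.0}
run_b = {"achieved_qps": 9300.0, "p99_latency_ms": 2.6}
print(compare_runs(run_a, run_b))
```

With these example numbers, both the 7% QPS drop and the 30% p99 latency increase land outside the assumed noise band, which is the kind of signal the concrete degradation story would surface.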
@mum4k Thank you. I would prefer a meeting, if the timing works for both of us; your knowledge of Nighthawk will help me a lot in understanding what we should do next.
Force-pushed from 8ba8c25 to 77a43dd
@mum4k Thanks for the offline meeting. I updated the documentation as we discussed: I added details of the test cases to explain the composition of the test case file, and an image of the report to show the result output intuitively. As we mentioned on Slack, there's still much for us to improve and investigate about the test cases and outputs, so it's difficult for me to define everything now; maybe we can improve it continuously in future PRs. What do you think?
@gyohuangxin improving this iteratively in multiple PRs sounds like a good plan. Please update this PR against the main branch so that it is in a merge-ready state, and let me know. It will take me a few days to review, as I am planning to run it according to the instructions to verify them.
… A/B Testing' Signed-off-by: Huang Xin <xin1.huang@intel.com>
…typos in README.md Signed-off-by: Huang Xin <xin1.huang@intel.com>
…_WORKFLOW.md Signed-off-by: Huang Xin <xin1.huang@intel.com>
Force-pushed from 77a43dd to 6df5bcf
@mum4k Thanks for the reminder, updated.
salvo/ENVOY_DEVELOP_WORKFLOW.md (Outdated)
> - `_run_benchmark` function:
>
> At first, it defined a function named [`_run_benchmark`](https://github.com/envoyproxy/nighthawk/blob/main/benchmarks/test/test_discovery.py#L20) to run the specific PyTest fixture, which will define the behavior of Envoy and the [Nighthawk test server](https://github.com/envoyproxy/nighthawk/blob/main/source/server/README.md) to be tested. You can find the fixture definitions in these two files:
The portion saying "Nighthawk test server to be tested" makes it sound like we are testing the Nighthawk test server.
Can we rephrase this to avoid confusion? Also, similar to the discussion we had offline, it might be good to add a paragraph explaining the architecture of the test: we have the Nighthawk client, Envoy, and the test server, and we should explain their roles. (Optional) We could even add a diagram.
That makes sense; an explanation of their roles and a diagram will help understanding. I will rephrase it.
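The roles discussed above can be sketched in miniature: the Nighthawk client generates load, Envoy is the system under test in the middle, and the Nighthawk test server is the backend origin that returns canned responses so only Envoy's overhead varies between runs. The classes below are hypothetical stand-ins to illustrate the topology, not the real Salvo or Nighthawk API:

```python
# Illustrative model of the benchmark topology; all names here are
# hypothetical, not actual Salvo/Nighthawk classes.

class NighthawkTestServer:
    """Backend origin: serves fixed responses behind the proxy."""
    def handle(self, request: str) -> str:
        return f"response-to-{request}"

class Envoy:
    """System under test: proxies traffic from the client to the test server."""
    def __init__(self, upstream: NighthawkTestServer):
        self.upstream = upstream

    def proxy(self, request: str) -> str:
        # Real Envoy would run its listener/filter chain here; that is
        # the hot path whose cost the benchmark is meant to measure.
        return self.upstream.handle(request)

class NighthawkClient:
    """Load generator: drives requests through Envoy and records results."""
    def __init__(self, target: Envoy):
        self.target = target

    def run(self, num_requests: int) -> dict:
        ok = sum(
            1
            for i in range(num_requests)
            if self.target.proxy(f"req-{i}").startswith("response")
        )
        return {"requests_sent": num_requests, "responses_ok": ok}

server = NighthawkTestServer()
envoy = Envoy(upstream=server)
client = NighthawkClient(target=envoy)
print(client.run(100))  # {'requests_sent': 100, 'responses_ok': 100}
```

A diagram in the doc itself would show the same chain: Nighthawk client → Envoy → Nighthawk test server, with measurements taken at the client.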
Signed-off-by: Huang Xin <xin1.huang@intel.com>
Signed-off-by: Huang Xin <xin1.huang@intel.com>
…VELOP_WORKFLOW.md Signed-off-by: Huang Xin <xin1.huang@intel.com>
Force-pushed from 0d25e72 to e53c361
@mum4k The documentation has been updated based on your comments. Can you review again? There are two TODOs:
mum4k left a comment
Thank you @gyohuangxin for this contribution. This document makes Salvo more accessible and easier to understand.
Fixed: #122
Signed-off-by: Huang Xin xin1.huang@intel.com