-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stats and UI Support for Multiple Hosts #2394
Conversation
Hmm... This is quite a significant change for a very rare use case.. I'd like to wait until after we rework the UI. The html template files are unreadable enough as it is, and more parametrization makes them even uglier. |
I understand the hesitation 👍 I tried to keep the changes as minimal as possible but we need to pass the I think the changes are quite safe, as the tests pass and I have added further tests for the cases with multiple hosts. We could omit the changes to the template files for the moment, but it probably makes more sense to wait so that we can keep the UI consistent with the downloaded reports Although this is a very rare use case, I think it is something that we should support, considering that we do support setting multiple hosts, and considering that returning I will focus on the UI updates in that case and then hopefully we can re-visit this |
cbaa7f0
to
0a0e576
Compare
@cyberw I have updated this now so that the changes to the UI will only affect the modern-ui, and once again, no changes will occur unless multiple hosts are present. Now that I have some more flexibility with the UI, I have also proposed using separated tables in the UI and CSV files for better visibility of the data. Currently the CLI still uses one table, with an additional column for indicating the host. I think in the CLI it becomes a bit too difficult to read if we have several tables per report, but let me know if you prefer this to be changed |
Cool stuff! (I'll remember to squash the changes this time :) Is this change backwards compatible? I noticed that it isn't compatible between different versions on master-worker, but that is... not terrible. |
I am perhaps not familiar enough with the release process, what do you mean that this wouldn't be compatible between different versions on master-worker? |
it changes the protocol between master and worker, so a worker without this change cannot send stats to a master with this change. There is a version check between worker and master when they connect, but usually they just log a warning if they are not the same minor version (only major versions are considered breaking by default) |
Ah interesting, are there really cases where users would want to update the master worker but not the workers themselves? I think a work-around could be possible, where we ensure the Otherwise we save this for a major version change? |
Its not really a supported use case across minor versions, but people often forget to upgrade one or the other :) Its completely fine to adjust that code to refuse connections from workers with version <=2.17.0 |
@cyberw I performed some tests using worker versions that did not have this change, reporting to a master worker that would have this change. I was able to then make this backwards compatible with minimal changes When no host can be found in the stats, the UI shows as it would for when only a single host is present (tested all tabs and all reports) Finally, I also realized the "Aggregated" column was not quite placed correctly if we are separating the tables per host. I have proposed showing the aggregated values as their own table and updated the PR description accordingly |
sorry, that was a bug in the example... |
My locust file looks like this:
|
Hmm... I'm not sure this will work. Users frequently access other domains/hosts than the one in
output:
|
@cyberw So what would you prefer as an output in that case? To separate the tables per host as in the UI / CSV? I was thinking this might make it a bit difficult to read but I think it could be possible with minimal changes. Or we could change the |
Just to be clear, the problem isnt the presentation being unclear, the problem is that the output is a lie: no requests have been made to http://0.0.0.0:10/another_host I guess what we'd have to do is determine what host was actually used instead of relying on the |
Ah my mistake, I think that's actually a very easy fix:
Output:
And because of |
👍 Would it be possible/good to parse this on the "receiving" end instead? I mean in or maybe in |
@cyberw To me, I think it makes the most sense that the HTTPSession (or FastHttpSession) be the one that interprets the host. The class already has context that absolute urls can exist and everything related to the request is stored and passed via the We also wouldn't be saving ourselves from passing variables by handling this logic in the |
I agree, having it in the User/client makes sense when I think about it. What I’m not so sure about is whether it is worth the added complexity in the stats library. And perhaps there is a use case when you want requests against two different hosts to be grouped as one metric? Perhaps a rare use case, but one made impossible by this change I think? Perhaps it could be done more easily with a switch on the User that changes the name to use the whole URL instead of just the path? That way the name is still all you need to know/control in order to do grouping, and reports are not made extra complicated but your use case is still supported. |
To be honest, I'm not even sure I understand the use case where someone would want to both create a request to an absolute url and track the metrics of this request. For example, I can see use cases existing where you need to query some 3rd party to get some data to pass into the next request to your own client. But then the metrics for this 3rd party host wouldn't really be your concern. I do understand what you mean, however, that we may prefer to group stats by the User, rather than the Host. However I'm not sure how easily this could be done again without adding too much complexity to the stats library. I imagine for this to be done, each StatsEntry would need to track which User the request came from rather than the Host. Changing the name on the User wouldn't really solve the problem, because when the list of Stats gets sent to the frontend, we still have no context as to which Stat belongs to which user |
We do receive the The one issue is that this would mean we start to group stats by user in the UI, regardless if they are using different hosts or not. Either this or we would have to track by both host and user |
That's not what I meant, I meant an option to do something like this: class MyUser(HttpUser): ... in order to make the name of the request be http://somewhere.com/foo, instead of just /foo. |
You mean Is there not a use case where it would be useful to allow filtering stats / failures by user? |
I havent forgotten about this, but I cant really decide whether this is a good idea or not, or put too much time into making an alternative. Rewriting names to be full URLs ("log_full_url") would be a much more low-impact change that doesnt complicate statistics tracking and presentation. So I kind of prefer that. |
Yes but in my opinion it doesn't really solve the issues (and as you showed, with a long host name, it doesn't really solve the problem of presentation either). The issue is that we are grouping stats together when they should not be grouped. I think it doesn't make much sense that the user then has to have the knowledge to set this "log_full_url" in order for the stats to be "properly" grouped (just as you showed above, if the user class has a host property but uses an absolute URL, we wouldn't expect the user to have to use "log_full_url" in order for absolute URLs to be tracked correctly right?) |
I have a new proposal for this. I think it can simplify things a lot. Let me know what you think:
From there:
|
Hmm... If it simplifies the code then that sounds reasonable! To me, showing the host name does make sense in the web ui (maybe making it optional, like you showed in the screenshot) and splitting but not showing the host column makes sense in command line (if we want to be really ambitious we can show it if the terminal width exceeds a certain value, or add a setting for it). |
26c1029
to
35f63fc
Compare
Ok so I've simplified the code as best as I can:
I was thinking there could be a way to make this change without having to change so many tests, but unfortunately it doesn't really work. The idea was to change the way we record the stats, but any changes to recording the stats would equally change the way we look up the stats (and thus we would need to update the same amount of tests). The nice part about this change is that the "group_by" property is built to be generic, so in the future we could quite easily add more group options if we wished :) I actually wanted to have the option to group by user, and technically we're already quite close to being able to do it. The
The problem is
But to me it doesn't feel super clean and I understand that the decision to use the dunder was performance based and I didn't want to be responsible for reverting the performance increase :) so for now we can group by host and if it's interesting to you perhaps we can look into other grouping options in a future ticket |
35f63fc
to
557d580
Compare
This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 10 days. |
Hey @cyberw, I am still of the opinion that two hosts being recorded as the same stat is a bug and should be fixed. I understand there is some hesitation with how many tests this change affects, but I am of the opinion that these tests should be updated. Since #2410, the Is there anything that could be changed for you to consider this? :) |
Fixes Issue #2389
Proposal
group_stats_by
andgroup_failures_by
. This will allow the tables in the HTML report to be grouped as specifiedgroup_stats_by
andgroup_failures_by
. This will allow the tables in the CSV report to be grouped as specifiedMultiple Hosts
Single Host
(no change)