Add GeoIP db fixtures for filebeat tests#8447
Add GeoIP db fixtures for filebeat tests#8447jsoriano wants to merge 1 commit intoelastic:masterfrom
Conversation
9cd8d0c to
f5cec58
Compare
f5cec58 to
d688c4c
Compare
|
Added another city to the fixtures, and networks for all ips we are currently using. |
|
On the one hand this is a really cool solution and would definitively solve our problem. On the other hand I think we should know when Geoip fields change, especially new fields added or removed. Also it makes adding new log files harder, meaning the geoip database file also has to be updated. The other part that I think we shouldn't do is change our testing environment as at least I personally often use this setup also for personal testing, building dashboards etc. Having a non standard geoip library in there is not optimal. When would we want our test to fail? Assuming an IP change it's coordinates/address a bit would not be relevant for our tests. New fields added or missing fields seems more important to me and I would expect tests to fail. We had this in the past but it was because geoip added new fields. How often does this happen? Is it possible that we know when and how ES updates the geoip lib? |
|
Thanks @ruflin for your comments.
I don't think GeoIP fields change so much, #8204 happened after changes in geoip plugin, not changes in the GeoIP DB, I agree that this case should require an action on our side, because for the same data the plugin will give a different set of values. For changes in the schema of GeoIP DBs maybe we could have an additional explicit test, that downloads the Lite DB and checks for existing fields.
We could try to do the override optional. Having tests depending on data we don't control is also not optimal 🙂
On #8401 tests failed both because some coordinates changed, but also because an IP changed its location completely, I don't think we should worry about IP movements. I agree that new fields added by the geoip plugin should be detected and I think that with this solution they will.
I have seen it happening twice this month, not sure if it happened before, maybe ES started to update geoip data with more frequency.
This is included in the docker image, I guess that it is downloaded when the image is built, but not sure, I'll check. In any case while using snapshots in builds we will always have builds failing before we update the expected values, even if we know when the DB has been updated. Maybe the root problem is to use snapshot images, tests shouldn't depend on moving parts. We could use fixed versions on tests, and update them as part of the release process. |
I would still prefer to keep using snapshot images, ES move fast and I would prefer to know early if we break something. |
|
++ on keep using the snapshots. For how often it happens: I didn't see it for a long time, then around 6 months ago and now twice in the last week. Does that mean now it will disappear again for a long time? Not so sure. If we do not care too much about the exact values of the geoip fields I would say it's best that we check the top level entry is there but not the exact content. If we generate it, we still generate all the fields. The nice part about this is we see that things changed when run GENERATE but having the changes will not fail the tests on old builds. |
|
-- on using snapshots (at least by default 🙂) I think that tests should be repeatable and its results reproducible, by using unversioned snapshots we cannot guarantee any of these things. There have been very few problems with geoip but I see them now as symptoms. We had also tests failing by other reasons, as snapshot versions being removed from repositories, and the thing is that when these issues appear they affect to all builds of integration branches, generating additional noise in all active PRs to these branches. The errors happening when images are removed are fixed by using a new snapshot version, this indicates that at least during some periods of time we are not even testing the latest snapshot till it is manually updated, so even the advantage of testing with latest builds mitigates. In my opinion tests should depend on fixed versions whenever possible (ideally on released versions, so they are better tested and have more chances to be available during more time), I see the advantages of using snapshot images to detect issues as soon as possible in integration branches, but I am not sure this justifies breaking all builds, specially when it affects a growing number of people. We could test on fixed versions by default, and have something like daily builds using snapshot versions, this way we wouldn't be affecting PRs and we'd still detect issues. We could add a process to increase the fixed versions at least on every release. |
|
I am going to close this by now, let's continue the discussion on snapshots next week offline 🙂 |
Add a fake GeoIP DB so filebeat tests don't depend on the DB provided in elasticsearch images.
This protects us from changes in GeoIP, but we'd still detect changes in the fields sent by the plugin.
Fixes #8400