Review mapping templates of beats #2238
Comments
Thanks @jpountz for taking the time to review the fields. First, the good news for int / long: we automatically convert all integers to long in the templates we generate: https://github.com/elastic/beats/blob/master/metricbeat/metricbeat.template.json#L137 Because of this, all integer values sent to Elasticsearch are mapped as long. We didn't see any disadvantage in this, so please tell us if it isn't a good idea. That doesn't mean we should not update the templates, but we saw it as lower priority. I agree that we should do a full review of the mappings of the existing Metricsets before beta1 to make sure we use the correct types. I think all the points you raised about redis, for example, are inconsistencies that have to be fixed, and I'm quite sure there are more. Currently our focus is on the system module. An additional task when reviewing all event documents is to check for consistency in naming: https://www.elastic.co/guide/en/beats/libbeat/master/event-conventions.html I'm mentioning this here as I think it could be done in one go.
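To illustrate the integer-to-long conversion described in this comment, here is a minimal sketch. It is not the actual Beats template script: the function name `integers_to_long` and the sample mapping are hypothetical, and the only assumption is that the template is a nested dict in which a field declares its type under a `type` key.

```python
import json


def integers_to_long(mapping: dict) -> dict:
    """Return a copy of a mapping in which every "integer" type becomes "long".

    Hypothetical helper for illustration; not the Beats generator itself.
    """
    converted = {}
    for key, value in mapping.items():
        if isinstance(value, dict):
            # Recurse into nested objects such as "properties".
            converted[key] = integers_to_long(value)
        elif key == "type" and value == "integer":
            converted[key] = "long"
        else:
            converted[key] = value
    return converted


# Made-up fragment of a template, only to exercise the helper.
sample = {
    "properties": {
        "total_accesses": {"type": "integer"},
        "requests_per_sec": {"type": "half_float"},
    }
}

print(json.dumps(integers_to_long(sample), indent=2))
```

With a conversion like this in place, an `integer` declared in a field definition never reaches Elasticsearch as a 32-bit type, which is why the template review was treated as lower priority.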
The benefit of integer vs long is indeed quite small, so that would be fine. Thanks for explaining!
I reviewed all the half_float, scaled_float and float mappings and opened #2430 with the changes. Here are some observations:
Conclusion: scaled_floats are awesome and seem to be the perfect fit for our use case. As default we currently use
This might not be an issue, but note that this will make scaled floats use more disk than floats if the amplitude of the values in a field grows beyond
This would increase disk usage by about 2.3 bits per value.
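The trade-off above can be made concrete with a rough sketch. A scaled_float is stored as an integer, namely the value multiplied by the scaling factor and rounded. The sketch assumes a simplified storage model in which each value costs about as many bits as the range of scaled integers requires. The helper names and the sample numbers are hypothetical, and real on-disk sizes depend on how Lucene packs the values.

```python
def scaled_value(value: float, scaling_factor: int) -> int:
    """Integer that a scaled_float stores: value * scaling_factor, rounded."""
    return round(value * scaling_factor)


def bits_for_range(min_value: float, max_value: float, scaling_factor: int) -> int:
    """Approximate bits per value for a field spanning [min_value, max_value].

    Assumption: values are bit-packed relative to the minimum, so the cost
    is the bit length of the largest scaled offset.
    """
    span = scaled_value(max_value, scaling_factor) - scaled_value(min_value, scaling_factor)
    return max(1, span.bit_length())


# Illustrative numbers only: a percentage-like field and a wider one.
for low, high, factor in [(0.0, 1.0, 1000), (0.0, 100.0, 100), (0.0, 5_000_000.0, 1000)]:
    print(low, high, factor, bits_for_range(low, high, factor))
```

Under this model a narrow field stays well below the 32 bits of a float, while a field whose amplitude times the scaling factor exceeds 2^32 would need more than 32 bits per value.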
* The mappings for float, half_float and scaled_float were reviewed and adjusted
* All integers were converted to long, as this was already happening automatically in the script
* JSON examples updated
Closes elastic#2238
I was looking at the templates after the switch to scaled/half floats and found a couple of potential issues. Maybe we should do a review of these templates before the 5.0 release. For instance:
* apache.status.total_accesses and apache.status.total_kbytes are currently integers, but I think a high-traffic instance could overflow them? Should they be long?
* apache.status.requests_per_sec is a half float, which means that integer values will become fuzzy after 2048 and be rounded to +Infinity after 65504. These are values that could be reached on a high-traffic server that serves small static files. Maybe it should be an integer instead? Or a scaled float, if we really want to be able to have some decimal places.
There are also some inconsistencies. For instance, most CPU usage stats seem to have moved from half floats to scaled floats, but the CPU stats of redis are still using half floats. Is that intentional? Similarly, the system load averages are using scaled floats while the apache load averages are using half floats.