Update gosigar package #28909
@@ -112,8 +115,10 @@ func filterFileSystemList(fsList []sigar.FileSystem) []sigar.FileSystem {
// GetFileSystemStat retrieves stats for a single filesystem
func GetFileSystemStat(fs sigar.FileSystem) (*FSStat, error) {
	stat := sigar.FileSystemUsage{}
I still wonder if we should log this somewhere. Maybe in the main filesystem.go?
My worry is flooding the logs with this warning/error message, since folks will collect these metrics at short intervals.
I would avoid it for now and keep it in the documentation. If we get user feedback about confusion there, we might need to reconsider.
stat := sigar.FileSystemUsage{}
if err := stat.Get(fs.DirName); err != nil {
	return nil, err
if fs.SysTypeName != UnavailableDisk {
Do you have a sample event when this happens? Are stats populated with zero values?
There is no stats JSON object; the example is right in this ticket's description.
Oh ok, I see. They are all populated with zeroes:
"filesystem": {
"available": 0,
"free": 0,
"used": {
"pct": 0,
"bytes": 0
},
"type": "unavailable",
"device_name": "D:\\",
"mount_point": "D:\\",
"total": 0
}
This may be confusing. For example, when preparing alerts you may need to use other fields to check whether the available space is 0 because the filesystem is actually full or because the metric was not available. I think that in these cases these metrics shouldn't be reported:
"filesystem": {
"type": "unavailable",
"device_name": "D:\\",
"mount_point": "D:\\"
}
The event may still be useful for inventory reasons.
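To illustrate the ambiguity described above from the consumer's side, here is a minimal sketch in Go. The `fsEvent` type and the `spaceExhausted` helper are hypothetical and not part of Metricbeat; the only assumption is that an omitted field can be modelled as a nil pointer, while a reported zero is a real value.

```go
package main

import "fmt"

// fsEvent is a hypothetical, simplified view of the "filesystem" object.
// A nil Available means the field was omitted from the event.
type fsEvent struct {
	Type      string
	Available *uint64
}

// spaceExhausted reports whether the filesystem is full. The second
// result is false when the metric was not reported, so a caller can
// tell "full" apart from "unknown".
func spaceExhausted(e fsEvent) (full bool, known bool) {
	if e.Available == nil {
		return false, false
	}
	return *e.Available == 0, true
}

func main() {
	zero := uint64(0)

	// A reported zero: the disk really is full.
	full, known := spaceExhausted(fsEvent{Type: "ntfs", Available: &zero})
	fmt.Println("ntfs:", full, known)

	// Field omitted: nothing can be said about free space.
	full, known = spaceExhausted(fsEvent{Type: "unavailable"})
	fmt.Println("unavailable:", full, known)
}
```

If the zeroes were always reported, the same check would have to inspect `Type` as well, which is the extra step the comment above warns about.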
Somehow I thought the stats would not be there.
I had a look at the function retrieving the stats and it is reused by the fsstat metricset. The fsstat metricset logs the exception as a debug message and continues to the next disk. I tried to do the same for the filesystem metricset, for consistency.
So the debug logs will show:
{"log.level":"debug","@timestamp":"2021-11-15T16:26:13.465+0100","log.logger":"system.fsstat","log.origin":{"file.name":"fsstat/fsstat.go","file.line":85},"message":"error fetching filesystem stats for 'D:\\': GetDiskFreeSpaceEx failed: The device is not ready.","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2021-11-15T16:26:13.465+0100","log.logger":"system.filesystem","log.origin":{"file.name":"filesystem/filesystem.go","file.line":85},"message":"error fetching filesystem stats for 'D:\\': GetDiskFreeSpaceEx failed: The device is not ready.","service.name":"metricbeat","ecs.version":"1.6.0"}
Nothing changed for fsstat, but for filesystem the events will look like this:
{
"_index" : "metricbeat-8.1.0-2021.11.15-000001",
"_type" : "_doc",
"_id" : "mis1JH0BLuBcqFZ9zBxN",
"_score" : null,
"_source" : {
"@timestamp" : "2021-11-15T15:27:24.244Z",
"system" : {
"filesystem" : {
"type" : "ntfs",
"device_name" : """C:\""",
"mount_point" : """C:\""",
"total" : 1004205502464,
"available" : 477790535680,
"free" : 477790535680,
"used" : {
"bytes" : 526414966784,
"pct" : 0.5242
}
}
},
"ecs" : {
"version" : "8.0.0"
},
"host" : {
"name" : "DESKTOP-K76UDQL"
},
"agent" : {
...
},
"metricset" : {
"name" : "filesystem",
"period" : 10000
},
"event" : {
"dataset" : "system.filesystem",
"module" : "system",
"duration" : 12886200
},
"service" : {
"type" : "system"
}
},
"sort" : [
1636990044244
]
},
{
"_index" : "metricbeat-8.1.0-2021.11.15-000001",
"_type" : "_doc",
"_id" : "nCs1JH0BLuBcqFZ9zBxN",
"_score" : null,
"_source" : {
"@timestamp" : "2021-11-15T15:27:24.244Z",
"agent" : {
...
},
"event" : {
"dataset" : "system.filesystem",
"module" : "system",
"duration" : 21079100
},
"metricset" : {
"name" : "filesystem",
"period" : 10000
},
"service" : {
"type" : "system"
},
"system" : {
"filesystem" : {
"type" : "unavailable",
"device_name" : """D:\""",
"mount_point" : """D:\"""
}
},
"ecs" : {
"version" : "8.0.0"
},
"host" : {
"name" : "DESKTOP-K76UDQL"
}
},
"sort" : [
1636990044244
]
}
@fearful-symmetry, @jsoriano wdyt?
Agreeing with @jsoriano, we shouldn't report "null zero" metrics. The fields should be omitted where possible, and I think having "bare" events with no/few metrics in this case isn't terrible.
if fs.SysTypeName != UnavailableDisk {
	if err := stat.Get(fs.DirName); err != nil {
		return nil, err
	}
}
I wonder if we should just ignore this error if we want to always report some event even if these stats are not available.
- if fs.SysTypeName != UnavailableDisk {
- 	if err := stat.Get(fs.DirName); err != nil {
- 		return nil, err
- 	}
- }
+ if fs.SysTypeName != UnavailableDisk {
+ 	if err := stat.Get(fs.DirName); err != nil {
+ 		log.Debugf("....: %v", err)
+ 	}
+ }
Same here, the behavior would be more consistent.
jsoriano left a comment
LGTM, please wait for @fearful-symmetry's opinion.
"free": fsStat.Free,
"used": common.MapStr{
}
if addStats == true {
Nit. Redundant boolean comparison.
- if addStats == true {
+ if addStats {
fearful-symmetry left a comment
LGTM. I think the bool check, which keeps us from reporting a bunch of null metrics, is a reasonable solution.
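As a rough illustration of the bool check agreed on here, the sketch below builds an event map and adds the metric fields only when `addStats` is true. The `fsStat` struct and `toEvent` function are hypothetical, and a plain `map[string]interface{}` stands in for `common.MapStr`; the field names are taken from the sample events earlier in the conversation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fsStat is a hypothetical stand-in for the stats struct used by the metricset.
type fsStat struct {
	DevName, Mount, SysTypeName string
	Total, Free, Avail, Used    uint64
	UsedPercent                 float64
}

// toEvent always emits the identifying fields and adds the metrics
// only when addStats is true, so unavailable disks produce no zeroes.
func toEvent(s fsStat, addStats bool) map[string]interface{} {
	ev := map[string]interface{}{
		"type":        s.SysTypeName,
		"device_name": s.DevName,
		"mount_point": s.Mount,
	}
	if addStats {
		ev["total"] = s.Total
		ev["free"] = s.Free
		ev["available"] = s.Avail
		ev["used"] = map[string]interface{}{
			"bytes": s.Used,
			"pct":   s.UsedPercent,
		}
	}
	return ev
}

func main() {
	d := fsStat{DevName: `D:\`, Mount: `D:\`, SysTypeName: "unavailable"}
	b, err := json.Marshal(toEvent(d, d.SysTypeName != "unavailable"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

For the unavailable disk, the marshalled event carries only `type`, `device_name` and `mount_point`, matching the bare event shown above.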
* update gosigar
* text changes
* check for unavailable stats
* new changes
(cherry picked from commit 08642d0)
# Conflicts:
# metricbeat/module/system/fields.go

* update gosigar
* text changes
* check for unavailable stats
* new changes
(cherry picked from commit 08642d0)

* update gosigar
* text changes
* check for unavailable stats
* new changes

* Update gosigar package (#28909)
* update gosigar
* text changes
* check for unavailable stats
* new changes
(cherry picked from commit 08642d0)
# Conflicts:
# metricbeat/module/system/fields.go
* Update gosigar package (#28909)
* update gosigar
* text changes
* check for unavailable stats
* new changes
Co-authored-by: Mariana Dima <mariana@elastic.co>
What does this PR do?
Updates the gosigar library, which includes a fix for the filesystem error on Windows.
Updates the docs to describe the limitation on external disks.
Stats are also unavailable in these situations.
Why is it important?
The updated gosigar library includes the fix for the filesystem error on Windows.