Implement IOCountersWithContext for AIX with CGO #2016

Merged
shirou merged 4 commits into shirou:master from pgimalac:pgimalac/aix-disk-io-counters
Mar 8, 2026

Contributor

@pgimalac pgimalac commented Mar 6, 2026

Implement IOCountersWithContext for AIX with CGO.

Uses the disk info from github.com/power-devops/perfstat (already used by other functions).
https://github.com/power-devops/perfstat/blob/main/types_disk.go

@pgimalac pgimalac marked this pull request as ready for review March 6, 2026 14:32
Owner

@shirou shirou left a comment

Thanks for the implementation. I checked the perfstat source and have a few questions:

  1. Unit of d.Time: The comment says "Time is already in milliseconds", but the perfstat Disk.Time comment only says "amount of time disk is active" without specifying the unit. Have you confirmed that dk_time is in milliseconds from the IBM perfstat_disk_t documentation? It seems inconsistent that Rserv/Wserv are in nanoseconds while Time is in milliseconds.
  2. Unit of Rserv / Wserv / WqTime: These are converted as nanoseconds, but could you confirm this from the IBM documentation? They might be in microseconds instead.
  3. QDepth: The perfstat.Disk struct has a QDepth field ("instantaneous service queue depth"). Could you map it to IopsInProgress?
    IopsInProgress: uint64(d.QDepth),

Contributor Author

pgimalac commented Mar 7, 2026

To be honest, the documentation is somewhat lacking, so the units come from experimentation...

  1. On my host, the value was only credible when interpreted as milliseconds, which is why I assumed that.
    However, since you asked, I did some more checking and I now think it's actually in "kernel ticks" (10ms on my host).

I ran a program which writes to a file (without buffering) for 10 seconds. It also gets the perfstat info for this disk at the start and at the end, and the time delta was 467.

I also ran iostat at the same time as the program ran, it showed an average use of 46.6% over the 10 second window.
If 467 was in milliseconds, the average use would be 4.7%, not 47%, so the time unit must be 10ms, which matches kernel ticks. On my host, CLK_TCK is 100, which means the tick is 1s/100 = 10ms.
I tried this a couple of times. Some runs were a few percent off due to timing, but overall it matches.

I updated the PR. See _SC_CLK_TCK in https://www.ibm.com/docs/en/aix/7.2.0?topic=s-sysconf-subroutine.
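
The cross-check above can be sketched in a few lines of Go. This is only an illustration of the arithmetic, not the code from the PR: `ticksToMillis` and `utilizationPercent` are hypothetical helper names, and the tick rate of 100 is the value reported for this one host (real code would query `sysconf(_SC_CLK_TCK)` instead of hard-coding it).

```go
package main

import "fmt"

// ticksToMillis converts a counter expressed in kernel ticks to milliseconds,
// given the tick rate in ticks per second (CLK_TCK).
func ticksToMillis(ticks, clkTck uint64) uint64 {
	return ticks * 1000 / clkTck
}

// utilizationPercent returns the share of the window during which the disk was busy.
func utilizationPercent(busyMillis, windowMillis uint64) float64 {
	return 100 * float64(busyMillis) / float64(windowMillis)
}

func main() {
	// Assumption: 100 ticks per second, as observed on the contributor's host.
	const clkTck = 100

	// 467 is the Time delta measured over the 10 second write test.
	busy := ticksToMillis(467, clkTck)
	fmt.Printf("busy=%dms util=%.1f%%\n", busy, utilizationPercent(busy, 10_000))
}
```

With a 10ms tick, the 467-tick delta becomes 4670ms of busy time in a 10000ms window, which lands on the same utilization that iostat reported.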

  2. I did a similar check for those fields in case I also got it wrong initially 😅

Two measures with a 10s sleep in between:

xfers=18472325 xrate=1274564
rserv=811487705067 wserv=1548373062628 min_wserv=0 max_wserv=0
wq_time=126732003901643

xfers=18508321 xrate=1274564
rserv=811487705067 wserv=1551492423966 min_wserv=40070 max_wserv=273794
wq_time=126732062753540

The deltas are

xfers=35996 xrate=0
rserv=0 wserv=3119361338
wq_time=58851897

Running iostat during the same window shows wps 3700, avgserv 0.2ms, minserv=0.1ms, maxserv=0.3~0.5ms, and queue avgtime 0.0ms (NB: units are not explicit but from this doc serv is in milliseconds https://docs.oracle.com/cd/E19455-01/805-7229/6j6q8svh7/index.html).

There are 35996 transfers (all writes), so the average time per write is 3119361338/35996=86658. That is about 0.087ms if the unit is nanoseconds, which is the only interpretation roughly consistent with iostat (0.2ms).
Similarly, min_wserv and max_wserv only match iostat when interpreted as nanoseconds.
And finally, wq_time per transfer is 58851897/35996=1635, which only matches 0.0ms if interpreted as nanoseconds.
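
For anyone who wants to redo the division, here is a small Go sketch using the two samples quoted above. The `sample` type and `avgPerTransfer` helper are made up for this illustration and are not part of perfstat or gopsutil; the nanosecond unit is the hypothesis being tested, not something the code proves.

```go
package main

import "fmt"

// sample holds the raw counters from one measurement (hypothetical stand-in type).
type sample struct {
	xfers, wserv, wqTime uint64
}

// avgPerTransfer divides a counter delta by the number of transfers in the
// same window, returning 0 when no transfer happened.
func avgPerTransfer(counterDelta, xfersDelta uint64) float64 {
	if xfersDelta == 0 {
		return 0
	}
	return float64(counterDelta) / float64(xfersDelta)
}

func main() {
	// The two measurements taken 10 seconds apart.
	before := sample{18472325, 1548373062628, 126732003901643}
	after := sample{18508321, 1551492423966, 126732062753540}

	dx := after.xfers - before.xfers
	avgServ := avgPerTransfer(after.wserv-before.wserv, dx)
	avgWq := avgPerTransfer(after.wqTime-before.wqTime, dx)

	// Assuming nanoseconds, divide by 1e6 to compare with iostat's milliseconds.
	fmt.Printf("transfers=%d\n", dx)
	fmt.Printf("avg wserv: %.0f raw units = %.3f ms if ns\n", avgServ, avgServ/1e6)
	fmt.Printf("avg wq_time: %.0f raw units = %.4f ms if ns\n", avgWq, avgWq/1e6)
}
```

If the raw units were microseconds instead, the same averages would come out around 87ms per write, which is far from what iostat showed.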

Again, I'm sorry I don't have much stronger arguments than that. The documentation is unfortunately not explicit, so running things and comparing values is the best I can do.

  3. Yes, for sure! I think that's correct.

Owner

@shirou shirou left a comment

Thank you for the thorough investigation! The experimental data convincingly confirms:

  • Time is in kernel ticks (using sysconf(_SC_CLK_TCK) to convert is the right approach)
  • Rserv/Wserv/WqTime are in nanoseconds
  • QDepth mapping to IopsInProgress added
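
Put together, the three conclusions could look roughly like the Go sketch below. It is not the merged code: `rawDisk` and `ioCounters` are local stand-in types that only borrow the field names mentioned in this thread, the mapping of Rserv/Wserv to `ReadTime`/`WriteTime` and of Time to `IoTime` is an assumption, and the millisecond output unit is assumed as well. The tick rate is passed in as a parameter where real code would read it from `sysconf(_SC_CLK_TCK)`.

```go
package main

import "fmt"

// rawDisk is a hypothetical stand-in for the perfstat disk fields discussed here.
type rawDisk struct {
	Time   uint64 // time the disk was active, in kernel ticks
	Rserv  uint64 // read service time, in nanoseconds
	Wserv  uint64 // write service time, in nanoseconds
	QDepth uint64 // instantaneous service queue depth
}

// ioCounters is a stand-in for a subset of the output counters.
// Assumption: all time fields are reported in milliseconds.
type ioCounters struct {
	ReadTime       uint64
	WriteTime      uint64
	IoTime         uint64
	IopsInProgress uint64
}

// convert applies the unit conversions agreed on in the review.
func convert(d rawDisk, clkTck uint64) ioCounters {
	const nsPerMs = 1_000_000
	return ioCounters{
		ReadTime:       d.Rserv / nsPerMs,
		WriteTime:      d.Wserv / nsPerMs,
		IoTime:         d.Time * 1000 / clkTck,
		IopsInProgress: d.QDepth,
	}
}

func main() {
	// Rserv and Wserv are taken from the second sample in the thread;
	// Time and QDepth are illustrative inputs only.
	d := rawDisk{Time: 467, Rserv: 811487705067, Wserv: 1551492423966, QDepth: 0}
	fmt.Printf("%+v\n", convert(d, 100))
}
```

Passing the tick rate explicitly keeps the conversion testable on non-AIX hosts, at the cost of requiring the caller to look it up once.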
