Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fio: Add flags to capture latency, bandwidth and IOPS logs #1205

Merged
merged 1 commit into from
Nov 21, 2016
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 32 additions & 0 deletions perfkitbenchmarker/linux_benchmarks/fio_benchmark.py
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@
from perfkitbenchmarker import vm_util
from perfkitbenchmarker.linux_packages import fio

PKB_FIO_LOG_FILE_NAME = 'pkb_fio_avg'
LOCAL_JOB_FILE_NAME = 'fio.job' # used with vm_util.PrependTempDir()
REMOTE_JOB_FILE_PATH = posixpath.join(vm_util.VM_TMP_DIR, 'fio.job')
DEFAULT_TEMP_FILE_NAME = 'fio-temp-file'
Expand Down Expand Up @@ -135,6 +136,19 @@
flags.DEFINE_list('fio_parameters', [],
'Parameters to apply to all PKB generated fio jobs. Each '
'member of the list should be of the form "param=value".')
flags.DEFINE_boolean('fio_lat_log', False,
'Whether to collect a latency log of the fio jobs.')
flags.DEFINE_boolean('fio_bw_log', False,
'Whether to collect a bandwidth log of the fio jobs.')
flags.DEFINE_boolean('fio_iops_log', False,
'Whether to collect an IOPS log of the fio jobs.')
flags.DEFINE_integer('fio_log_avg_msec', 1000,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fio's --log_avg_msec defaults to 0. Why do we default to 1000?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default, it will log data for every IO completion which can create a huge file. I thought 1 second, was a sensible default, so it aggregates the metrics per 1 second interval.

If you feel, we should use the same as fio as to not "surprise" the user, I can make the change.

'By default, this will average each log entry in the '
'fio latency, bandwidth, and iops logs over the specified '
'period of time in milliseconds. If set to 0, fio will '
'log an entry for every IO that completes, this can grow '
'very quickly in size and can cause performance overhead.',
lower_bound=0)


FLAGS_IGNORED_FOR_CUSTOM_JOBFILE = {
Expand Down Expand Up @@ -373,6 +387,17 @@ def GetConfig(user_config):
return config


def GetLogFlags():
collect_logs = FLAGS.fio_lat_log or FLAGS.fio_bw_log or FLAGS.fio_iops_log
fio_log_flags = [(FLAGS.fio_lat_log, '--write_lat_log=%(filename)s',),
(FLAGS.fio_bw_log, '--write_bw_log=%(filename)s',),
(FLAGS.fio_iops_log, '--write_iops_log=%(filename)s',),
(collect_logs, '--log_avg_msec=%(interval)d',)]
fio_command_flags = ' '.join([flag for given, flag in fio_log_flags if given])
return fio_command_flags % {'filename': PKB_FIO_LOG_FILE_NAME,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want all these logs written to the same file? FIO can have concurrent jobs, so does FIO ensure that writes to any one of these logs is the only write to these logs occurring at any given time? (atomic)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see below that these are supposed to be different file names. After staring at this code for several minutes it still looks like they all get the same file name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the argument we pass to the --write_(iops|lat|bw)_log flags, it's used as a prefix for the log name.

For each of those, fio will create the log files using this format:

<name>_<log_type>.<job_number>.log

Some examples:

pkb_fio_bw.1.log - Bandwidth log for job 1
pkb_fio_clat.1.log - Completed IO latency log for job 1
pkb_fio_slat.1.log - Submitted IO latency log for job 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok thanks for clarifying

'interval': FLAGS.fio_log_avg_msec}


def CheckPrerequisites():
"""Perform flag checks."""
WarnOnBadFlags()
Expand Down Expand Up @@ -452,6 +477,10 @@ def Run(benchmark_spec):
fio_command = 'sudo %s --output-format=json --directory=%s %s' % (
fio.FIO_PATH, mount_point, REMOTE_JOB_FILE_PATH)

collect_logs = any([FLAGS.fio_lat_log, FLAGS.fio_bw_log, FLAGS.fio_iops_log])
if collect_logs:
fio_command = ' '.join([fio_command, GetLogFlags()])

# TODO(user): This only gives results at the end of a job run
# so the program pauses here with no feedback to the user.
# This is a pretty lousy experience.
Expand All @@ -460,6 +489,9 @@ def Run(benchmark_spec):
stdout, stderr = vm.RobustRemoteCommand(fio_command, should_log=True)
samples = fio.ParseResults(job_file_string, json.loads(stdout))

if collect_logs:
vm.PullFile(vm_util.GetTempDir(), '%s_*.log' % PKB_FIO_LOG_FILE_NAME)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little weird to be getting the tmp dir from two different sources. Suggest using vm_util.VM_TMP_DIR as above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I'll make the change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While going over your suggestion, I remembered now why I didn't use vm_util.VM_TMP_DIR. I want to store the collected logs under /tmp/pkb/runs/<run_uri>/ so they are grouped with the results for that run, vm_util.VM_TMP_DIR is the path to just /tmp/pkb/ directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything passed to fio telling to put the logs there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Currently, it defaults to the working directory of where fio is called from. Then vm.PullFile(...) essentially does an scp for files ending with _*.log from fio's working directory to the local directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh I see.


return samples


Expand Down