Custom commands have high CPU load / RAM usage #687

@nicolas-rdgs

Describe the bug

I'm tagging this as a bug, but it's mainly a topic for discussion.

We have several custom commands in our environment and we've noticed a few issues:

  • Significant CPU and RAM usage
  • Preview mode not working

After profiling, we found that our command takes only a few seconds to execute, but the Splunk search takes 5-10 minutes on average to complete.

We have observed that this occurs mainly on a search head (SH) cluster and not on a standalone system, though this remains to be confirmed.

To Reproduce

This is a little complicated to reproduce, but if you have a custom command that processes more than 150,000 Python objects (or even 1 million for the test), you may encounter this problem, mainly on an SH cluster.

Expected behavior

Fast search time, with or without preview, and reasonable CPU/RAM consumption.

Logs or Screenshots

For example, our custom command processed 75k events. The command itself took 3 seconds (measured before and after the yield), but Splunk measured a Python script execution time of 460 seconds:

Image
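
For anyone who wants to take a comparable measurement, here is a minimal sketch of one way to time only the work done inside the command's generator, separately from whatever the consumer does with each record. The `timed` wrapper and the `stats` dictionary are hypothetical helpers written for this illustration; they are not part of the Splunk SDK, and this is not necessarily how our own measurement was taken.

```python
import time


def timed(records, stats):
    """Wrap an iterable of records and accumulate the time spent producing them.

    Only the time inside next() is counted, so time spent by the consumer
    between two records is excluded.
    """
    iterator = iter(records)
    while True:
        start = time.perf_counter()
        try:
            record = next(iterator)
        except StopIteration:
            # Count the final call that discovers the iterator is exhausted.
            stats["producer_seconds"] += time.perf_counter() - start
            return
        stats["producer_seconds"] += time.perf_counter() - start
        stats["count"] += 1
        yield record


# Usage: wrap the generator and read the totals once it is exhausted.
stats = {"producer_seconds": 0.0, "count": 0}
results = list(timed(({"n": i} for i in range(5)), stats))
```

If `producer_seconds` stays small while the total script time reported by Splunk is large, the extra time is being spent outside the command's own logic.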

Splunk (please complete the following information):

  • Version: 9.3.1
  • OS: RedHat 9
  • Deployment: Search Head Cluster

SDK (please complete the following information):

  • Version: 2.1.1
  • Language Runtime Version: 3.9
  • OS: RedHat 9

Additional context

Our question is: why does the Splunk SDK call list(records) instead of sending the data to the output in batches?

https://github.com/splunk/splunk-sdk-python/blob/develop/splunklib/searchcommands/internals.py#L554

We believe that this is the source of the problem. Instead of sending the data in batches to the standard output for splunkd, Python loads all processed objects into memory, so the yield in our custom command no longer has any effect.

Given this, it now seems logical to us that preview mode does not work with custom commands and that they use a lot of memory.
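
To illustrate the difference we have in mind, here is a minimal, SDK-independent sketch. It is not the SDK's actual code: `process`, `write_all_at_once`, `write_in_batches`, the `emit` callback and the batch size of 1000 are all hypothetical names and values chosen for this example.

```python
from itertools import islice


def process(events):
    """Hypothetical stand-in for a custom command that yields records lazily."""
    for event in events:
        yield {"n": event, "squared": event * event}


def write_all_at_once(records, emit):
    """Materialize every record before emitting anything.

    Returns the largest number of records held in memory at one time.
    """
    materialized = list(records)  # the generator is fully drained here
    emit(materialized)
    return len(materialized)


def write_in_batches(records, emit, batch_size=1000):
    """Emit records in fixed-size batches as they are produced.

    Returns the largest number of records held in memory at one time.
    """
    iterator = iter(records)
    peak = 0
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        peak = max(peak, len(batch))
        emit(batch)  # each batch can be flushed before the next is built
    return peak


# Usage: record the size of every chunk handed to the output.
sizes_once, sizes_batched = [], []
peak_once = write_all_at_once(process(range(2500)), lambda b: sizes_once.append(len(b)))
peak_batched = write_in_batches(process(range(2500)), lambda b: sizes_batched.append(len(b)))
```

In the first variant nothing reaches the output until the last record has been produced, and the peak equals the full result count. In the second, the peak is bounded by the batch size and the first results are available early, which is what a preview would need.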

What do you think?
