
Writing Tests for Streams Operators

hildrum edited this page Oct 25, 2015 · 2 revisions

Testing your Streams operators

This page is intended to provide guidance to those of you wanting to write tests for the Streams operators you contribute to GitHub. It describes a useful pattern for building an individual test, using two examples from IBMStreams repositories. Then it points to two examples of test suites in IBMStreams that you could modify and use for your own testing needs.

Note that there's no official support for anything described here; this page merely surveys what is in use.

Self-checking Streams applications

One very useful pattern for writing tests is to build self-checking Streams applications. In this pattern, your Streams application invokes the operator to be tested, and then checks the result of that invocation in SPL itself. This self-checking approach means you can do sophisticated checks. It also means that running the test is pretty easy.

There are two styles of self-checking applications.

Using abort()

One self-checking test style is to use the return code from a standalone run to indicate success or failure. Normal exit of a standalone application gives an exit value of 0, but when abort() causes the exit, the exit code is non-zero. The assert() function can also be used to give a non-zero exit code, but if the application is compiled with -a, asserts are not executed, which can result in a test unexpectedly passing.

For concreteness, let's consider an example. Recently, a bug was discovered in InetSource. InetSource is used to access URLs, and the problem was that if the web page it was retrieving did not end in a newline, no output tuple was produced for that last line. As part of fixing this defect, we added a test.

The URL passed as input gives back the status of the Atlanta airport, which begins with <AirportStatus> and ends with </AirportStatus>. The InetSource operator was not emitting the tuple corresponding to the closing tag. So in our test, we read the tuples from InetSource, looking specifically for the <AirportStatus> and </AirportStatus> tags. If we ever see two more opening tags than closing tags, we call abort() on line 33. If we have one opening tag and one closing tag, we exit with success on line 36. (In many cases you don't need an explicit shutdown, because the application shuts down once its sources are done; InetSource, however, never stops on its own.)
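
The gist of that check, re-expressed as a Python sketch (the real test is written in SPL; the tag names come from the example above, everything else here is illustrative):

```python
def check_tags(lines):
    """Mimic the SPL test's check over a stream of line tuples:
    abort as soon as two more opening tags than closing tags have
    been seen, and succeed once exactly one of each has arrived."""
    opens, closes = 0, 0
    for line in lines:
        if "<AirportStatus>" in line:
            opens += 1
        if "</AirportStatus>" in line:
            closes += 1
        if opens - closes >= 2:
            return "abort"    # a closing tag was dropped: the bug
        if opens == 1 and closes == 1:
            return "success"  # saw the complete document once
    return "still waiting"
```

A healthy run yields "success"; the buggy behavior, where the closing tag's tuple never arrives and a second document begins, yields "abort".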

To run this test:

  • compile as a standalone application
  • run it
  • if the exit code is 0, the test passed; anything else means it failed.
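
A driver for this pattern can be as small as: run the standalone binary and map its exit code to a verdict. A Python sketch (the default binary path is an assumption based on where sc typically puts standalone output; adjust it for your project layout):

```python
import subprocess

def run_standalone(executable="output/bin/standalone"):
    """Run a compiled standalone Streams application and map its
    exit code to a test verdict: 0 is a pass (normal exit),
    anything else (e.g., an abort() call) is a fail."""
    result = subprocess.run([executable])
    return "pass" if result.returncode == 0 else "fail"
```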

This pattern is simple to use, but it can't easily be applied when distributed mode is required (e.g., for consistent cut tests), since distributed jobs don't finish on their own.

Print result

Another style of self-checking test has the test logic print the result to standard out or write it to a file. This style can be used in distributed mode. For an example, we go to ElasticLoadBalance's noLostTuples test. The purpose of this test is to ensure that the ElasticLoadBalance operator doesn't lose any tuples. It has three operators: a Beacon generating tuples, the ElasticLoadBalance operator being tested, and a Custom operator named Snk checking the results. The Custom operator counts the tuples it receives. When it gets a punctuation, it prints "pass" if the number of tuples is as expected, and "fail" otherwise.

If run in standalone mode, the "pass" or "fail" is printed to the console; if run in distributed mode, you have to check the logs for the result.
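
The Snk logic boils down to count-on-tuple, compare-on-punctuation. A Python rendering of that check (the real operator is an SPL Custom; the class and method names here are illustrative):

```python
class CountingSink:
    """Mimic the Snk Custom operator: count tuples, and on final
    punctuation print "pass" if the count matches the expected
    total, "fail" otherwise."""

    def __init__(self, expected):
        self.expected = expected
        self.count = 0

    def on_tuple(self, _tup):
        self.count += 1

    def on_punct(self):
        verdict = "pass" if self.count == self.expected else "fail"
        print(verdict)
        return verdict
```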

Running many tests

IBMStreams does not have an official way of running tests like these, but several people have written their own. I'm going to describe some models you could follow in this section.

Makefile-based approach

The http tests in streamsx.inet and the kafka tests in streamsx.messaging run their tests via a Makefile with the help of two scripts: one for cases where success is expected, and another for cases where a compile failure is expected.

The Makefile for the http tests in streamsx.inet has a list of tests to run. For tests whose names end in FailMain, it expects compilation to fail; for tests whose names end in TestMain, it compiles a standalone application and expects the run to exit with return code 0. The scripts referenced by the Makefile are in the scripts directory.
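
In sketch form, the naming convention is what drives which script the Makefile invokes. Re-expressed in Python (the FailMain/TestMain suffixes come from the http tests; the function name and return values are illustrative):

```python
def expected_outcome(test_name):
    """Map a test name to what the Makefile expects of it,
    following the http tests' naming convention."""
    if test_name.endswith("FailMain"):
        return "compile-failure"    # the compiler should reject this source
    if test_name.endswith("TestMain"):
        return "standalone-exit-0"  # compile standalone, run, expect rc 0
    return "unknown"
```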

Python-based test runner from streamsx.plumbing

The ElasticLoadBalance tests take a different approach. Each test has its own subdirectory under tests/ElasticLoadBalance. In each directory is a scenario.py file that actually runs the test. Functionality common to the scenario.py files lives in testharness.py. While using a Python script as a test descriptor gives a powerful way to check that a test succeeded, the tests still follow the self-checking pattern, leaving scenario.py to check that the test printed "pass" when it's supposed to pass, and to check the error message when it's supposed to fail. A runTest.py script runs all the tests.
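
A minimal check in the scenario.py style might just scan the job's output for the expected verdict or error message. A sketch (the real scenario.py files in streamsx.plumbing also handle submitting and canceling jobs; this only shows the verdict logic):

```python
def scenario_passed(output_text, expect_pass=True, expected_error=None):
    """Decide whether a self-checking test run succeeded: for a
    positive test, look for a "pass" line in the output; for a
    negative test, look for the expected error message."""
    if expect_pass:
        return "pass" in output_text.splitlines()
    return expected_error is not None and expected_error in output_text
```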