Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Criterion integrationv2 python changes #3534

Merged
merged 18 commits into from
Jan 10, 2023
Merged
Show file tree
Hide file tree
Changes from 15 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 35 additions & 0 deletions tests/integrationv2/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -157,3 +157,38 @@ An example of how to test that the server and the client can send and receive ap
An error similar to this is caused by a runtime error in a test. In `tox.ini` change `-n8` to `-n0` to
see the actual error causing the OSError.


# Criterion

### Why

We wanted to use the rust criterion project to benchmark s2n-tls, without re-writing all the integration tests.
To accomplish this, we created some criterion benchmarks that use s2nc and s2nd, and a new provider, CriterionS2N, in the python testing framework.

### Prerequisites

Normally, you'd run criterion with `cargo bench --bench <name>`, but cargo does some checks to see if it needs to rebuild
the benchmark and other housekeeping that slows things down a bit. Running `cargo bench --no-run` is the benchmark equivalent to `cargo build` and will produce a binary executable.

The CI will run `make install` and `make -C ./bindings/rust` to create and install the s2nc/d binaries in a system-wide location, then build the cargo criterion binary handlers for s2nc and s2nd.


### Running locally

The Criterion CodeBuild scripts can be used to run these locally, by setting LOCAL_TESTING the s3/github interactions are disabled. Tooling needed includes python3.9, rust, and write permissions to `/usr/local/bin|lib` (or use sudo) - in addition to the traditional C build tooling.

```
export LOCAL_TESTING=true
INTEGV2_TEST=test_well_known_endpoints ./codebuild/bin/criterion_baseline.sh
INTEGV2_TEST=test_well_known_endpoints ./codebuild/bin/criterion_delta.sh
```

The resulting reports are viewable via `tests/integrationv2/target/criterion/report/index.html`



## Troubleshooting CriterionS2N

The most direct trouble-shooting is done using the [interactive troubleshooting CodeBuild](https://docs.aws.amazon.com/codebuild/latest/userguide/session-manager.html#ssm-pause-build) `codebuild-break` line in the buildspec. Put the break right before the main build step and run interactivly.

As mentioned above, in order to get more output from the tests, set the `-n` or `XDIST_WORKER` environment variable to 0 and add a -v to the pytest command line in tox.ini.
5 changes: 4 additions & 1 deletion tests/integrationv2/conftest.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
from global_flags import set_flag, S2N_PROVIDER_VERSION, S2N_FIPS_MODE, S2N_NO_PQ
from global_flags import set_flag, S2N_PROVIDER_VERSION, S2N_FIPS_MODE, S2N_NO_PQ, S2N_USE_CRITERION


def pytest_addoption(parser):
Expand All @@ -8,6 +8,8 @@ def pytest_addoption(parser):
default=False, type=int, help="S2N is running in FIPS mode")
parser.addoption("--no-pq", action="store", dest="no-pq",
default=False, type=int, help="Turn off PQ support")
parser.addoption("--provider-criterion", action="store", dest="provider-criterion",
default=False, type=str, help="Use Criterion provider in one of 3 modes: [off,delta,baseline]")


def pytest_configure(config):
Expand All @@ -27,6 +29,7 @@ def pytest_configure(config):
set_flag(S2N_FIPS_MODE, True)

set_flag(S2N_PROVIDER_VERSION, config.getoption('provider-version', None))
set_flag(S2N_USE_CRITERION, config.getoption('provider-criterion', None))


def pytest_collection_modifyitems(config, items):
Expand Down
9 changes: 8 additions & 1 deletion tests/integrationv2/fixtures.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,10 @@
import pytest
import subprocess

from global_flags import get_flag, S2N_USE_CRITERION
from processes import ManagedProcess
from providers import Provider
from providers import Provider, CriterionS2N, S2N

from common import ProviderOptions


Expand All @@ -21,6 +23,11 @@ def managed_process():

def _fn(provider_class: Provider, options: ProviderOptions, timeout=5, send_marker=None, close_marker=None,
expect_stderr=None, kill_marker=None, send_with_newline=None):
if provider_class == S2N and get_flag(S2N_USE_CRITERION) != "off":
provider_class = CriterionS2N
# This comes from the number of iterations specific in the rust benchmark handler(s).
# currently set at 10 iterations, so give us 10x more time.
timeout = timeout * 10
provider = provider_class(options)
cmd_line = provider.get_cmd_line()
# The process will default to send markers in the providers.py file
Expand Down
3 changes: 3 additions & 0 deletions tests/integrationv2/global_flags.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,9 @@
# (set from the S2N_LIBCRYPTO env var, which is how the original integration test works)
S2N_PROVIDER_VERSION = 's2n_provider_version'

# From S2N_USE_CRITERION env var.
S2N_USE_CRITERION = 's2n_use_criterion'

_flags = {}


Expand Down
115 changes: 114 additions & 1 deletion tests/integrationv2/providers.py
Original file line number Diff line number Diff line change
@@ -1,8 +1,11 @@
import os
import pytest
import threading

from common import ProviderOptions, Ciphers, Curves, Protocols, Signatures
from global_flags import get_flag, S2N_PROVIDER_VERSION, S2N_FIPS_MODE
from global_flags import S2N_USE_CRITERION
from stat import S_IMODE


TLS_13_LIBCRYPTOS = {
Expand Down Expand Up @@ -308,8 +311,92 @@ def setup_server(self):
return cmd_line


class OpenSSL(Provider):
class CriterionS2N(S2N):
"""
Wrap the S2N provider in criterion-rust
The CriterionS2N provider modifies the test command being run by pytest to be the criterion benchmark binary
and moves the s2nc/d command line arguments into an environment variable. The binary created by
`cargo bench --no-run` has a unique name and must be searched for, which the CriterionS2N provider finds
and replaces in the final testing command. The arguments to s2nc/d are moved to the environment variables
`S2NC_ARGS` or `S2ND_ARGS`, along with the test name, which are read by the rust benchmark when spawning
s2nc/d as subprocesses.
"""
criterion_off = 'off'
criterion_delta = 'delta'
criterion_baseline = 'baseline'
# Figure out what mode to run in: baseline or delta
criterion_mode = get_flag(S2N_USE_CRITERION)

def _find_s2n_benchmark(self, pattern):
# Use executable bit to find the correct file.
result = find_files(pattern, root_dir=self.cargo_root, modes=['0o775', '0o755'])
if len(result) != 1:
raise FileNotFoundError(
f"Exactly one s2n criterion benchmark not found. Found {result}.")
else:
return result[0]

def _find_cargo(self):
return os.path.abspath("../../bindings/rust")

def _cargo_bench(self):
"""
Find the handlers
"""
self.s2nc_bench = self._find_s2n_benchmark("s2nc-")
self.s2nd_bench = self._find_s2n_benchmark("s2nd-")

def __init__(self, options: ProviderOptions):
super().__init__(options)
# Set cargo root
self.cargo_root = self._find_cargo()

# Find the criterion binaries
self._cargo_bench()

# strip off the s2nc/d at the front because criterion
if 's2nc' in self.cmd_line[0] or 's2nd' in self.cmd_line[0]:
self.cmd_line = self.cmd_line[1:]

Check warning

Code scanning / CodeQL

Overwriting attribute in super-class or sub-class

Assignment overwrites attribute cmd_line, which was previously defined in superclass [Provider](1). Assignment overwrites attribute cmd_line, which was previously defined in superclass [Provider](2).

# strip off the s2nc -e argument, criterion handler isn't sending any STDIN characters,
# and makes it look like s2nc is hanging.
if '-e' in self.cmd_line:
self.cmd_line.remove('-e')
print(f"***** cmd_line is now {self.cmd_line}")

# Copy the command arguments to an environment variable for the harness to read.
if self.options.mode == Provider.ServerMode:
self.options.env_overrides.update(
{'S2ND_ARGS': ' '.join(self.cmd_line),
'S2ND_TEST_NAME': f"{self.options.cipher}_{self.options.host}"})
self.capture_server_args()
elif self.options.mode == Provider.ClientMode:
self.options.env_overrides.update(
{'S2NC_ARGS': ' '.join(self.cmd_line),
'S2NC_TEST_NAME': f"{self.options.cipher}_{self.options.host}"})
if self.criterion_mode == CriterionS2N.criterion_baseline:
self.capture_client_args_baseline()
if self.criterion_mode == CriterionS2N.criterion_delta:
self.capture_client_args_delta()

def capture_server_args(self):
self.cmd_line = [self.s2nd_bench, "--bench",
"s2nd", "--save-baseline", "main"]

# Saves baseline data with the tag "main"
# see https://bheisler.github.io/criterion.rs/book/user_guide/command_line_options.html
def capture_client_args_baseline(self):
self.cmd_line = [self.s2nc_bench, "--bench",
"s2nc", "--save-baseline", "main"]

# "By default, Criterion.rs will compare the measurements against the previous run"
# This run is stored with the tag "new"
# https://bheisler.github.io/criterion.rs/book/user_guide/command_line_options.html
def capture_client_args_delta(self):
self.cmd_line = [self.s2nc_bench, "--bench", "s2nc", "--plotting-backend", "plotters", "--baseline", "main"]


class OpenSSL(Provider):
_version = get_flag(S2N_PROVIDER_VERSION)

def __init__(self, options: ProviderOptions):
Expand Down Expand Up @@ -807,3 +894,29 @@ def supports_cipher(cls, cipher, with_curve=None):
@classmethod
def supports_signature(cls, signature):
return GnuTLS.sigalg_to_priority_str(signature) is not None


def find_files(file_glob, root_dir=".", modes=[]):
"""
find util in python form.
file_glob: a snippet of the filename, e.g. ".py"
root_dir: starting point for search
mode is an octal representation of owner/group/other, e.g.: '0o644'
"""
result = []
for root, dirs, files in os.walk(root_dir):
for file in files:
if file_glob in file:
full_name = os.path.abspath(os.path.join(root, file))
if len(modes) != 0:
try:
stat = oct(S_IMODE(os.stat(full_name).st_mode))
if stat in modes:
result.append(full_name)
except FileNotFoundError:
# symlinks
pass
else:
result.append(full_name)

return result
6 changes: 5 additions & 1 deletion tests/integrationv2/test_well_known_endpoints.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
from configuration import PROTOCOLS
from common import ProviderOptions, Ciphers, pq_enabled
from fixtures import managed_process # lgtm [py/unused-import]
from global_flags import get_flag, S2N_FIPS_MODE
from global_flags import get_flag, S2N_FIPS_MODE, S2N_USE_CRITERION
from providers import Provider, S2N
from utils import invalid_test_parameters, get_parameter_name, to_bytes

Expand Down Expand Up @@ -105,6 +105,10 @@ def test_well_known_endpoints(managed_process, protocol, endpoint, provider, cip
if get_flag(S2N_FIPS_MODE) is True:
client_options.trust_store = TRUST_STORE_TRUSTED_BUNDLE

# TODO: Understand the failure with criterion and this endpoint.
if get_flag(S2N_USE_CRITERION) != "off" and 'www.netflix.com' in endpoint:
Copy link
Contributor

@lrstewart lrstewart Jan 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little paranoid that if we get the disabling / default value of this flag wrong, we would always skip netflix. Like if you change "off" to "disable" but forget to update this line. Hardcoded strings are dangerous, particularly if you're only checking "!=".

Can you put CriterionS2N's criterion_off somewhere accessible to this logic, as well as everywhere else you have "off" hardcoded? So they're all guaranteed to use the same value? Or maybe add a "is_criterion_enabled" type method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a choices option to the pytest arguments to cause pytest to fail if the criterion flag is unexpected and moved the "is it off" logic into a function in global_flags.

pytest.skip()

# expect_stderr=True because S2N sometimes receives OCSP responses:
# https://github.com/aws/s2n-tls/blob/14ed186a13c1ffae7fbb036ed5d2849ce7c17403/bin/echo.c#L180-L184
client = managed_process(provider, client_options,
Expand Down
7 changes: 5 additions & 2 deletions tests/integrationv2/tox.ini
Original file line number Diff line number Diff line change
Expand Up @@ -5,16 +5,19 @@ skipsdist = True
[testenv]
# install pytest in the virtualenv where commands will be executed
setenv = S2N_INTEG_TEST = 1
passenv = DYLD_LIBRARY_PATH LD_LIBRARY_PATH OQS_OPENSSL_1_1_1_INSTALL_DIR
passenv = DYLD_LIBRARY_PATH LD_LIBRARY_PATH OQS_OPENSSL_1_1_1_INSTALL_DIR HOME TOX_TEST_NAME
ignore_errors=False
deps =
pep8
pytest==5.3.5
pytest-xdist==1.34.0
sslyze==5.0.2
pytest-rerunfailures
commands =
pytest -n 2 --maxfail=1 --reruns=2 --cache-clear -rpfsq \
pytest -x -n={env:XDIST_WORKERS:"2"} --maxfail=1 --reruns=2 --cache-clear -rpfsq \
-o log_cli=true --log-cli-level=INFO \
--provider-version={env:S2N_LIBCRYPTO} \
Comment on lines +18 to 19
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add log_cli?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was helpful for early development, with only one xdist worker and cli-level increased

--provider-criterion={env:S2N_USE_CRITERION:"off"} \
--fips-mode={env:S2N_TEST_IN_FIPS_MODE:"0"} \
--no-pq={env:S2N_NO_PQ:"0"} \
{env:TOX_TEST_NAME:""}