Skip to content
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
209 changes: 209 additions & 0 deletions rfcs/2022-10-31-13691-tooling-revamp.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,209 @@
# RFC 15056 - 2022-10-31 - Tooling revamp

-----

This RFC discusses improving Vector's developer tooling in order to ease maintenance and expand testing possibilities.

**Table of Contents**

- [Scope](#scope)
- [In scope](#in-scope)
- [Out of scope](#out-of-scope)
- [Pain](#pain)
- [Proposal](#proposal)
- [User Experience](#user-experience)
- [Implementation](#implementation)
- [Environment management](#environment-management)
- [Configuration](#configuration)
- [Interface](#interface)
- [Testing container](#testing-container)
- [CI](#ci)
- [Drawbacks](#drawbacks)
- [Prior Art](#prior-art)
- [Alternatives](#alternatives)
- [Status quo](#status-quo)
- [Python](#python)
- [Outstanding Questions](#outstanding-questions)
- [Plan Of Attack](#plan-of-attack)
- [Future Improvements](#future-improvements)

## Scope

### In scope

- Begin the standardization of interfacing with the repository (building, testing, releasing, etc.)
- Define a version support matrix for each integration
- Only run an integration's tests on PRs for changes
- Make developer UX consistent across platforms

### Out of scope

- Non-integration tests

## Pain

Comment thread
ofek marked this conversation as resolved.
- Test failures are difficult to debug, esp. since [this change](https://github.com/vectordotdev/vector/pull/13128)
- Windows is essentially unsupported since Make takes a great deal of effort to install and most [scripts](https://github.com/vectordotdev/vector/tree/v0.24.2/scripts) require Bash and utilities like `find`
- Adding tests for new integrations to CI ([here](https://github.com/vectordotdev/vector/blob/v0.24.2/Makefile#L333-L341) and [here](https://github.com/vectordotdev/vector/blob/v0.24.2/.github/workflows/integration-test.yml#L58-L91)) is a manual and occasionally forgotten step (see [outstanding questions](#outstanding-questions))

## Proposal

### User Experience

Interaction with the repository will be done with a new tool called `vdev`, which will eventually replace most Bash and all Ruby scripts. (PoC in [#14990](https://github.com/vectordotdev/vector/pull/14990))

```text
Vector's unified dev tool

Usage: vdev [OPTIONS] <COMMAND>

Commands:
build Build Vector
config Manage the vdev config file
exec Execute a command within the repository
int Manage integrations
meta Collection of useful utilities
status Show information about the current environment

Options:
-v, --verbose... More output per occurrence
-q, --quiet... Less output per occurrence
-h, --help Print help information
-V, --version Print version information
```

### Implementation

The [integration test directory](https://github.com/vectordotdev/vector/tree/v0.24.2/scripts/integration) will become nested, with each integration having its own directory, which will be a Rust project.
Comment thread
ofek marked this conversation as resolved.
Outdated

#### Environment management

Each project will expose a CLI binary with 2 commands:

- `start` - will set up the environment, create mock test data (like hitting some endpoint), wait until it's "ready", etc.
- `stop` - will tear down the environment

Both commands receive a single argument representing the JSON configuration of the environment generated by the `matrix` (see [below](#configuration)).

Note that while most environments will continue to be Docker-based, this unlocks the ability to use Kubernetes, Terraform, or do arbitrary things.

#### Configuration

Each directory will have a `test.toml` with the following options:
Comment thread
ofek marked this conversation as resolved.
Outdated

- `on` - array of paths relative to the root that will trigger tests if changed
Comment thread
neuronull marked this conversation as resolved.
Outdated
- `args` - default array of arguments to pass to the test command
- `env` - table of environment variables that will be set during tests
- `matrix` - array of tables used to generate the environments

For example, the [Elasticsearch setup](https://github.com/vectordotdev/vector/blob/v0.24.2/scripts/integration/docker-compose.elasticsearch.yml) might be:

```toml
on = [
"src/sinks/elasticsearch",
]
args = [
"--features",
"es-integration-tests",
"--lib",
"::elasticsearch::integration_tests::",
]

[env]
AWS_ACCESS_KEY_ID = "dummy"
AWS_SECRET_ACCESS_KEY = "dummy"
ELASTICSEARCH_AWS_ADDRESS = "http://localstack:4571"
ELASTICSEARCH_HTTP_ADDRESS = "http://elasticsearch:9200"
ELASTICSEARCH_HTTPS_ADDRESS = "https://elasticsearch-secure:9200"

[[matrix]]
version = ["7.13.1"]
```

We could then extend the support matrices:

```toml
[[matrix]]
version = ["7.13.1", "8.4.3"]
type = ["classic"]

[[matrix]]
version = ["1.3.6", "2.3.0"]
type = ["opensearch"]
```

That would indicate the following unique environments:

```text
7.13.1-classic
8.4.3-classic
1.3.6-opensearch
2.3.0-opensearch
```

If we were testing the `8.4.3-classic` environment the [management CLI](#environment-management) would get the following payload:

```json
{
"type": "classic",
"version": "8.4.3"
}
```

#### Interface

The `vdev int` command would have the following sub-commands:

- `show` - accepts an integration name and shows the available/running environments
- `start` - calls the equivalently named command of the [management CLI](#environment-management). requires an integration name and environment name. the integration names refer to the names of the integration test directories
Comment thread
tobz marked this conversation as resolved.
- `stop` - ^^
- `test` - executes the tests and will terminate with the same exit code. accepts an integration name, environment name, and arbitrary extra arguments to pass to the test command. if no environment name is provided then all will be tested and any not already started will be torn down

The [testing container](#testing-container) will be created first always.

#### Testing container

There will be a single container per integration for running tests. The entry point will be `/bin/sleep infinity` and will be attached to the host network.
Comment thread
ofek marked this conversation as resolved.
Outdated

#### CI

The test matrix on PRs will be computed ([example](https://github.com/vectordotdev/vector/blob/v0.24.2/.github/workflows/soak.yml#L137-L169)) by `vdev` based on what has changed. The will save [somewhere between](https://github.com/vectordotdev/vector/actions/workflows/integration-test.yml) 6.5-8 hours of billable CI time per commit.
Comment thread
ofek marked this conversation as resolved.
Outdated

## Drawbacks

- Slight change in process for all developers which would require communication and docs

## Prior Art

- [ddev](https://datadoghq.dev/integrations-core/ddev/cli/) is a tool specific to Datadog Agent integrations
- [Hatch](https://github.com/pypa/hatch) provides environment management and the matrix syntax we borrowed

## Alternatives

### Status quo

- Could ignore all pain and add some more Docker compose test files

### Python

- Could call environments' `start`/`stop` methods through Hatch to avoid maintaining matrix logic
- Could write all tooling in Python to theoretically increase development velocity

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely 👍 on the dedicated CLI, but I did have some concern about using Rust for it since I was concerned that might decrease developer productivity when working within the CLI, which has less of a concern on correctness and does a lot of subprocess execution. It sounds like you've found it plenty productive though, given your experience working on a similar tool in Python? It does have the advantage that everyone on the team is experienced in Rust.

As another data point, @tobz how would you have felt writing https://github.com/vectordotdev/vector/blob/master/scripts/generate-components-docs.rb in Rust vs. Ruby?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also thought of this yesterday - if this is to power CI, how/when would it be built for CI?

If we need to compiler during the the CI run that adds time (hopefully not much because the code should be small...) or built only on changes to the tool and pull from those artifacts.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As another data point, @tobz how would you have felt writing generate-components-docs.rb in Rust vs. Ruby?

At the time, it probably would have sucked more to slog through writing it in Rust from scratch. Doing that now... probably wouldn't be terrible because it'd be easy to know when I reached feature parity by comparing the outputs directly, and having a good mental model of how it's structured.

🤷🏻


Since Rust is already a requirement, it's easiest to run a single Cargo command to install the tooling. Introducing a dependency on Python at any point would not just require installing the right version of Python but also installing tools inside of a dedicated virtual environment (to avoid dependency conflicts) and adding its `bin` (`Scripts` on Windows) directory to PATH. [pipx](https://github.com/pypa/pipx) could help with that.

Python development on macOS can be particularly painful.

## Outstanding Questions

- Is there a way to remove the `on` config i.e. go from changed "integration" file -> integration test. If not then there's no way to test new integrations in CI automatically

## Plan Of Attack

- [ ] Merge the [PoC](https://github.com/vectordotdev/vector/pull/14990) of `vdev`
- [ ] Implement the testing logic and add step to CI
- [ ] For every integration ported remove it from the Makefile
- [ ] When complete, enable integration tests on all PRs (even for contributors)

## Future Improvements

- Command to generate the scaffolding for new integrations