
continuous testing - bot-like traffic to stress test reliability and availability #59

Closed
InoMurko opened this issue Jun 2, 2020 · 32 comments
Assignees: ayrat555
Labels: L (Large amount of work), testing

Comments

@InoMurko
Contributor

InoMurko commented Jun 2, 2020

We would like to stress test our services (childchain + vault + postgres specifically) and see how they behave under continuous traffic.

The testing kit should do the following:

  • act like a true accounting backoffice
  • deposit, transact, exit
  • configurable TPS and actors (Ethereum addresses)
  • configurable TPS and actors over a time span (5 TPS over 5 h, 3 TPS over 604800 s)

The backoffice testing kit will do accounting-like operations (a sketch of the balance check follows the list):

  • deposit - gas = omg network balance
  • deposit - gas - (gets debited) = omg network balance
  • deposit - gas + (gets credited) = omg network balance
  • deposit - gas + (gets credited) + SE = ethereum balance
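
A rough sketch of the kind of balance assertion the kit would run after each operation (illustrative only; the module and helper names are hypothetical, and the actual balance would come from the watcher/childchain APIs):

```elixir
defmodule LoadTest.BalanceInvariant do
  @moduledoc false

  # Expected OMG Network balance per actor, following the invariants above:
  # deposits minus gas, minus debits, plus credits.
  def expected_omg_balance(deposits, gas_spent, debits, credits) do
    Enum.sum(deposits) - gas_spent - Enum.sum(debits) + Enum.sum(credits)
  end

  # `actual` is the balance reported by the network (hypothetical source).
  def check!(actual, deposits, gas_spent, debits, credits) do
    expected = expected_omg_balance(deposits, gas_spent, debits, credits)

    if actual != expected do
      raise "balance mismatch: expected #{expected}, got #{actual}"
    end

    :ok
  end
end
```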

Update. Task list based on #59 (comment) and #59 (comment) :

@achiurizo added the testing and L (Large amount of work) labels Jun 17, 2020
@achiurizo
Contributor

Follow up: look into:

  • merge and/or delete elixir-omg perf sub app
  • merge and/or delete omg-js integration tests (figure out if we can absorb this)

@ayrat555 ayrat555 self-assigned this Aug 7, 2020
@ayrat555
Contributor

ayrat555 commented Aug 7, 2020

@InoMurko @achiurizo
My plan of action:

  1. move the omg_performance application into the perf project. Currently it's not straightforward because omg_performance uses child_chain as a dependency ({:omg_child_chain, in_umbrella: true, only: [:test], runtime: false}); maybe it can be solved with a common behaviour or something.
  2. move perf into the specs repo
  3. make perf work for both elixir-omg and childchain
  4. look into omg-js integration tests

Is that OK?

@InoMurko
Contributor Author

InoMurko commented Aug 7, 2020

  1. So there are going to be a couple of issues with your approach, because omg_performance is rather big (there are byzantine_events tests and perf exit tests as well). The proper way would be to implement these 1:1 in cabbage and in performance (depending on what the test does; not all are performance related). You'll need to look at them case by case.

Before you do 2 and 3, we need to look at what specs is becoming in terms of testing and how it's used. Having multiple repositories is notoriously hard to maintain and keep in sync.
Imagine the following scenario:
A new feature requires you to add cabbage and/or performance tests to the specs repo. How do you test that in CircleCI within your repository? cc @achiurizo
I don't have an answer.

@ayrat555
Contributor

ayrat555 commented Aug 7, 2020

Versioning the tests repo or adding feature tags to tests

@InoMurko
Contributor Author

InoMurko commented Aug 7, 2020

can you elaborate on that? 🤔

@ayrat555
Contributor

ayrat555 commented Aug 7, 2020

  1. We can add feature tags to tests (@moduletag :my_new_feature) and run those tests only in repos where the feature is available (mix test --include my_new_feature), see the sketch after this list
  2. the specs repo is added as a submodule so we can use different branches/tags/commit_hashes in different repos
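
A minimal sketch of option 1, using plain ExUnit tags (the feature name :v2_block_get is only an example, not an existing tag):

```elixir
# In the shared specs repo: tag the module with the feature it covers.
defmodule Specs.V2.BlockGetTest do
  use ExUnit.Case, async: true

  @moduletag :v2_block_get

  test "block.get returns a block by hash" do
    # feature-specific assertions would go here
    assert true
  end
end
```

A repo that does not ship the feature would exclude the tag in its test_helper.exs (ExUnit.configure(exclude: [:v2_block_get])), while a repo that does ship it runs mix test --include v2_block_get.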

@InoMurko
Contributor Author

InoMurko commented Aug 7, 2020

regarding

  1. this can quickly become a mess and I imagine it's hard to maintain
  2. I don't see how this solves the problem of multiple features being developed in parallel. Do they all branch from each other? How do you get the master tests back into the branches and tags?

@ayrat555
Contributor

ayrat555 commented Aug 7, 2020

  1. will we have to maintain 2 childchains indefinitely? Shouldn't only one childchain be left in the end?
  2. common tests will be stored in the master branch, specific tests in feature branches. When master is updated, child branches will have to pull the changes. The blockscout block explorer uses this approach: it maintains several instances of blockscout for different sidechains.

@InoMurko
Contributor Author

InoMurko commented Aug 7, 2020

  1. The idea is that we'll be able to drop childchain1, but that's anywhere between 6 months and a year from now.
  2. Specific tests will be stored in the branches. So if I have multiple features, do I reference multiple branches and submodule them?
    For example:
    Ayrat and Arthur both work on childchain2:

Ayrat works on /v2/block.get and adds tests into cabbage branch ayrat/v2_block_get
Arthur works on /v2/alarms.get and adds tests into cabbage branch arthur/v2_alarms_get

If you merge your PR into childchain2, how does Arthur in his PR reference your newly added tests once they get into childchain2 master?

So... what I'm trying to avoid is having to maintain the specs repo like it's a new project/product, because maintaining it means dedicating engineering resources to merging PRs, resolving conflicts and managing branches. But since we're spawning a new repo, I don't think there's any other way.

As long as every stakeholder (@achiurizo @T-Dnzt) is aware of the associated cost and risk.

@boolafish

Maintaining a syncing master-v2 branch like elixir-omg does is probably the most practical idea so far? 😅

Your v2/block.get and v2/alarms.get tests would be merged to master-v2, and the auto-sync ensures any conflict gets resolved ASAP.

@achiurizo
Contributor

So... what I'm trying to avoid is having to maintain the specs repo like it's a new project/product, because maintaining it means dedicating engineering resources to merging PRs, resolving conflicts and managing branches. But since we're spawning a new repo, I don't think there's any other way.

I think for the interim, this will probably be the case until we decommission childchain1. It's not ideal, but the specs repository seems the cleanest option for the current setup (public vs. private / watcher vs. childchain vs. elixir-omg).

Maintaining a syncing master-v2 branch like elixir-omg does is probably the most practical idea so far? 😅

It probably really has to come down to some form of branch management like this. Ignoring the API example above, contract changes alone (like the v2 contracts being worked on) will need some tracking branch on the other repositories to test the changes (so that would now be childchain, specs, elixir-omg/watcher). I'm open to other suggestions for us to consider, but for now the cost and risk are acknowledged.

I'll also mention that we are hiring for a QA role 🔜. Hopefully the specs project can get support from this position as well.

@InoMurko
Contributor Author

InoMurko commented Sep 4, 2020

@ayrat555 could we get a review of the current specs and performance testing? Can we do what this ticket is requesting with the existing tooling?

@boolafish

We have watcher_info tests in perf: https://github.com/omgnetwork/elixir-omg/blob/master/priv/perf/apps/load_test/test/load_tests/runner/watcher_info_test.exs

The main logic is in this file: https://github.com/omgnetwork/elixir-omg/blob/master/priv/perf/apps/load_test/lib/scenario/account_transactions.ex

This one simulates a user using the watcher info APIs to create and send transactions (and randomly hit other watcher info APIs). A while ago, the script was verified to run for a full day (note: at the time hackney retry was turned on). It might be a good one to use for simulating production load (plus this is closer to how our users actually hit our service).

@ayrat555
Contributor

ayrat555 commented Sep 17, 2020

Follow up: look into:

  • merge and/or delete elixir-omg perf sub app
  • merge and/or delete omg-js integration tests (figure out if we can absorb this)
omg-js integration tests (org-mode format)

** addTokenTest

*** add token that the chain had

*** add token that the chain did not have

** amountTypesTest

*** validates types that approveToken accepts

** challengeExitTest

*** challenge a dishonest exit

** challengeInFlightExitInputSpentTest

*** challenge an in-flight exit as non canonical and challenge an invalid input piggyback

** challengeInFlightExitOutputSpentTest

*** challenge output from a canonical invalid output piggyback

** createSubmitTypedTransactionTest

*** create and submit a single currency transaction using submitTyped

** createTransactionTest

*** create and submit a single currency transaction

*** create a single currency transaction

*** create and submit a mixed currency transaction

*** create an incomplete transaction

*** split a utxo into 4

** deleteNonPiggybackedInFlightExitTest

*** delete an ife if not piggybacked after the first phase and return ife bond

** depositTest

*** deposit ETH to the Plasma contract

*** deposit ERC20 tokens to the Plasma contract

** exitWithoutWatcherTest

*** creates required exit data

** getExitQueueTest

*** processes exits and returns an empty queue

** inFlightExitChallengeResponseTest

*** invalid IFE challenge response

** inFlightExitChallengeTest

*** challenge an in-flight exit by providing a competitor

*** challenge an in-flight exit as non canonical

** inFlightExitTest

*** exit a ChildChain transaction

*** exit a ChildChain transaction that is not included

** mergeUtxoTest

*** merge utxos to a single utxo

** metadataTest

*** checks if childchain adds different types of metadata

** simpleStartStandardExitTest

*** start standard exit for a ChildChain transaction

** standardExitTest

*** exit a deposit

*** exit a ChildChain transaction

*** exit ERC20 token

** transferTest

*** transfer ETH and ERC20 tokens on the childchain

*** transfer funds on the childchain with eth currency

*** transfer ERC20 tokens on the childchain

*** transfer ETH on the childchain

*** send a transaction with 4 inputs and 4 outputs

merge and/or delete omg-js integration tests (figure out if we can absorb this)

Do we really need to delete anything from the omg-js integration tests? I think their main purpose is to test the correctness of omg-js, not elixir-omg.
Also, I think the omg-js integration tests cover more cases than specs; specs is missing some challenge cases, add-token tests, and different kinds of transaction submit cases.

merge and/or delete elixir-omg perf sub app

I think it's possible to use perf but it will have to be extended because:

  1. It doesn't have cases described in the issue (deposit, transact, exit)
  2. I think it's meant only to measure execution time since it doesn't check if balances are correct

The chaperon library used for perf has Loop, which can be used to run a test repeatedly for a given duration (see the sketch below).
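
As an illustration, a rough sketch of such a case, assuming chaperon's Chaperon.Scenario / Session.loop API (the deposit action and config keys are hypothetical placeholders):

```elixir
defmodule LoadTest.Scenario.DepositLoop do
  use Chaperon.Scenario

  # Repeat the deposit action for the configured duration; TPS would be
  # approximated by the number of concurrent sessions running this scenario.
  def run(session) do
    session
    |> loop(:deposit, session.config.duration)
  end

  def deposit(session) do
    # hypothetical: submit a deposit transaction and verify the resulting balances
    session
  end
end
```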

So my suggestion is to create separate cases that cover the points in the issue.

@kevsul

kevsul commented Sep 18, 2020

Do we really need to delete anything from omg-js integration tests?

No, I don't think we need to delete anything here - apart from testing the omg-js code itself, they also provide good example code for JS developers.

  1. It doesn't have cases described in the issue (deposit, transact, exit)

There's already deposit and transact code in there, and PR omgnetwork/elixir-omg#1732 (review needed 😄 ) introduces a test for starting standard exits. Exit processing, IFEs, challenges, etc. still need to be added.

  2. I think it's meant only to measure execution time since it doesn't check if balances are correct

Yeah, the perf tests are just that - to measure perf/load/stress. It's assumed that correctness is already tested elsewhere.

One thing I find cumbersome about Chaperon tests is how they're configured: if you want to change e.g. how many iterations, you have to edit the config file, recompile and run. It would be great to be able to pass params on the command line. There are lots of configuration params though, so ideally we could set defaults in a file and override them on the command line. I don't know what's the best way of doing that in Elixir though...
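
One simple way to get that in Elixir (just a sketch; the :load_test app name and keys are made up) is to keep the defaults in the config files and let environment variables override them:

```elixir
defmodule LoadTest.Config do
  @moduledoc false

  # Defaults live in config/config.exs; any of them can be overridden from the
  # shell, e.g. `ITERATIONS=500 TPS=10 mix test`, without touching the config file.
  def get(key, env_var) do
    case System.get_env(env_var) do
      nil -> Application.fetch_env!(:load_test, key)
      value -> String.to_integer(value)
    end
  end
end

# usage inside a scenario/test:
# iterations = LoadTest.Config.get(:iterations, "ITERATIONS")
```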

@InoMurko
Contributor Author

So if we agree that omg-js and perf/ are two separate projects, not supposed to cover the same thing... I propose we do the following:

Let's make perf usable all around by:

  • package perf into a release
  • interact with perf via a simple frontend (perhaps LiveView?) or some frontend machinery that allows us to quickly build forms
  • create RESTful APIs so that we can interact with it from our CI/CD tooling
  • do not forget the ability to point the tests towards different environments
  • set configuration via website and REST API

Of course, the question is why we would build a frontend if we have omg-js. I think we will see that the amount of pressure an Elixir node can create (in terms of IO and concurrency) far exceeds any other tooling.

As this is considered developer tooling and testing, we will specify our requirements on the fly.
@achiurizo can we get your approval on this?

@boolafish

can you elaborate a bit more on why the frontend is needed?

@boolafish

boolafish commented Sep 18, 2020

To elaborate more on what's needed for the release pipeline: I think either it would be run directly on the release pipeline's agent and report back whether the test succeeded, or perf would need to call a webhook back to the release pipeline to continue or abort the automated promotion.

@InoMurko
Contributor Author

InoMurko commented Sep 18, 2020

In terms of usability, APIs are great for machine-to-machine collaboration.

APIs are less suitable for observability and introspection, where humans are needed. And as we turn these tests into stateful services that might run for more than 24 hours, this kind of tooling will become useful. Imagine Datadog being just APIs...

@boolafish

I think I kind of get it, but I'm still not following well. I get that a UI is better for humans; I'm curious what you expect it to show. From the description comment above, it seems to provide the ability to set config via a form UI?

For the other part, if we need any inspection of the running perf test, should we just get observability in Datadog? Or do you want to build a fancy UI showing how the load is hitting 😱😱

@ayrat555
Contributor

I was playing with LiveView over the weekend. It seems pretty neat because you can write interactive web pages almost without writing any JS code. But I have the same question as @boolafish: what kind of real-time information will be shown there (errors, RPS, current test, progress, etc.)?

@InoMurko
Contributor Author

Will provide details by EOD!

@InoMurko
Contributor Author

Because these tests are going to be stateful, we would display (a rough LiveView sketch follows below):

  • pick available tests and set test configuration
  • review and export test data
  • view the currently running test, its progress, configuration (what parameters were used to kick off the test), and statistics (the success rate of the currently running test, what TPS it's using, account balances, accounts in use, average response time, ...). For certain things it will definitely be easier to integrate with Datadog! Look at the screenshot below.
  • provide hooks to kick off tests from accounts that are in MetaMask for example, or provide private keys,...
  • https://github.com/zorbash/observer_live useful to see the impact of the test on the node

For example, data like this definitely belongs in DD:
[image: Datadog metrics dashboard screenshot]
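
As a sketch of what the "currently running test" view could look like with LiveView (all module, PubSub, and assign names are hypothetical; the stats would be broadcast by the test runner):

```elixir
defmodule LoadTestWeb.TestRunLive do
  use Phoenix.LiveView

  # Subscribe to stats broadcast by the (hypothetical) test runner and re-render.
  def mount(_params, _session, socket) do
    if connected?(socket), do: Phoenix.PubSub.subscribe(LoadTest.PubSub, "test_run:stats")
    {:ok, assign(socket, stats: %{tps: 0, success_rate: 0.0, avg_response_ms: 0})}
  end

  def handle_info({:stats, stats}, socket) do
    {:noreply, assign(socket, :stats, stats)}
  end

  def render(assigns) do
    ~L"""
    <h2>Current test run</h2>
    <p>TPS: <%= @stats.tps %></p>
    <p>Success rate: <%= @stats.success_rate %>%</p>
    <p>Average response: <%= @stats.avg_response_ms %> ms</p>
    """
  end
end
```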

@achiurizo
Contributor

As this is considered developer tooling and testing, we will specify our requirements on the fly.
@achiurizo can we get your approval on this?

I think this is great. Thanks for looking into this! I approve. 👍

@achiurizo
Contributor

pick available tests and set test configuration
review and export test data
view the currently running test, its progress, configuration (what parameters were used to kick off the test), and statistics (the success rate of the currently running test, what TPS it's using, account balances, accounts in use, average response time, ...). For certain things it will definitely be easier to integrate with Datadog! Look at the screenshot below.
provide hooks to kick off tests from accounts that are in MetaMask for example, or provide private keys,...
https://github.com/zorbash/observer_live useful to see the impact of the test on the node

This is just a suggestion: while the form and UI do sound nice to work with, Datadog also provides synthetic testing, which allows us to create custom API/browser tests within Datadog for a service. A few pros and cons:

  • PRO: You can do this all within Datadog and record the response. I believe you can also save the tests.
  • PRO: You can specify a lot of configuration like location, alarms, assertions (did you get a 200?) and test frequency (do this every 1 hr).
  • CON: You are stuck in Datadog.
  • CON: We may need to do some rework to show all the information above (stats and TPS may need to be grabbed from another graph/visualization, not from the API synthetic test).

@ayrat555
Contributor

What we can do now as a proof of concept:

  1. Create a testing module that will trigger deposits and check balances on the root chain and the child chain (a rough sketch follows this list). The test will have configurable:
  • from and to addresses
  • tps
  • period
  • geth node url
  2. Add a frontend which will allow:
  • configuring values described above for the test
  • starting the test
  • monitoring the test ( status, configuration, response speed )
  • monitoring the chain state ( balances )
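
A rough sketch of point 1 (everything here is hypothetical: the module name, config shape, and the deposit/balance helpers stand in for whatever perf already provides):

```elixir
defmodule LoadTest.DepositRunner do
  @moduledoc false

  # Fire deposits at roughly the target rate for the given period, then compare balances.
  # e.g. LoadTest.DepositRunner.run(%{tps: 5, period_s: 3600, from: from_addr, to: to_addr, geth_url: url})
  def run(%{tps: tps, period_s: period_s} = config) do
    interval_ms = div(1_000, tps)
    deadline = System.monotonic_time(:millisecond) + period_s * 1_000
    loop(config, interval_ms, deadline)
  end

  defp loop(config, interval_ms, deadline) do
    if System.monotonic_time(:millisecond) < deadline do
      Task.start(fn -> deposit(config) end)
      Process.sleep(interval_ms)
      loop(config, interval_ms, deadline)
    else
      check_balances(config)
    end
  end

  # hypothetical: send a deposit transaction via the configured geth node
  defp deposit(_config), do: :ok

  # hypothetical: compare root-chain and childchain balances for the configured addresses
  defp check_balances(_config), do: :ok
end
```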

@boolafish

boolafish commented Sep 23, 2020

FYI, sharing some knowledge:

  1. It is not really possible to trigger a specific "tps" accurately. Most frameworks, including chaperon, let you configure concurrent users/sessions instead. One needs to "translate" concurrent sessions into TPS, and it will not be super accurate. For instance, in past experience, if we configure a target of 100 TPS from my Mac, it usually results in an average TPS of 9x. If we want to assert on the test result, I guess we would have to, for instance, set the target TPS to 110 or 120 and check that the end result is > 100.

  2. To check the balance, we probably need something like "feat: add metrics: balance per currency, number of unspent outputs, number of addresses" elixir-omg#1666, which submits a childchain balance metric to Datadog. Otherwise I am not sure how you could get the total balance of the childchain.

  3. The shortest path to make perf useful now without manual action, so that we can hook it into the current release pipeline automation, is to add assertions in the test against Datadog metrics and take every var that needs configuring from env vars, I guess (e.g. TPS, period, addresses...), and let the pipeline agent run the perf tests directly. The pipeline can set its own env vars to run the test.

@ayrat555
Contributor

@boolafish I have a question about the third point. Do you suggest adding Datadog metrics to the current tests? Will it be useful?

@boolafish

boolafish commented Sep 23, 2020

Ah... it was inspired by this: https://k6.io/blog/datadog-alerts-fail-load-test

But something not going as far as 👆 above would be getting an API latency metric (e.g. P90/P99) and asserting on that. It's probably not that great if we use the metric from perf, as that would be the round-trip latency instead. For TPS, we can probably get it from perf itself. And for the API error rate, I guess we can put a metric counter at the level of a Tesla middleware to collect it (?) 🤔 (see the sketch below)
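
For the error-rate idea, a sketch of what such a counter could look like as a Tesla middleware (the telemetry event name is made up; it would be forwarded to Datadog by whatever reporter the app already uses):

```elixir
defmodule LoadTest.Middleware.ErrorCounter do
  @moduledoc false
  @behaviour Tesla.Middleware

  # Count failed API calls so an error-rate assertion can be made after the run.
  @impl true
  def call(env, next, _opts) do
    case Tesla.run(env, next) do
      {:ok, %Tesla.Env{status: status}} = ok when status >= 400 ->
        :telemetry.execute([:load_test, :api, :error], %{count: 1}, %{status: status})
        ok

      {:error, reason} = error ->
        :telemetry.execute([:load_test, :api, :error], %{count: 1}, %{reason: inspect(reason)})
        error

      ok ->
        ok
    end
  end
end
```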

I guess the above might be some basic metrics we can assert against (latency, TPS, error rate) for performance tests. Then we can have automated tests saying: run 100 TPS for 30 minutes with P99 latency not exceeding X ms and error rate < Y%. Currently... lol, it really depends on the release manager to decide what to look at after running the load test. (We run a load test on sandbox when there is a release, but there is no good interpretation atm.)

For something fancier like memory leaks or CPU rate, we might apply the idea from the article.

@boolafish

Link to some pipeline requirement and random thought: #141 (comment)

@boolafish

An update on the release tooling: we have decided to go forward with Spinnaker, which has out-of-the-box support for triggering/monitoring Kubernetes jobs. As a result, the way to hook perf into the release pipeline will be to trigger perf as a Kubernetes job.

@InoMurko
Contributor Author

Done as part of the performance and Spinnaker work. Thanks @ayrat555 @boolafish
