How to create test data

John Rusk [MSFT] edited this page Nov 5, 2019 · 8 revisions

Sometimes you need some dummy data for testing - e.g. to see what throughput you can get.

For uploads, you should usually just use AzCopy's benchmark mode (since that will create the randomized test data on the fly automatically). But in version 10.3.x, benchmark mode doesn't directly support testing of account-to-account copies and downloads. In those cases, you can follow the procedure outlined below to create and use some random test blobs.

  1. Decide which storage container to use. If you will be testing with a large number of blobs, it's best to create a new blank container just for the testing, because that makes it easier to delete them all at the end. If you'll have just a few hundred blobs or fewer, you can use an existing container, since deleting that many is not a problem.
  2. Run AzCopy's benchmark mode, targeting the container, but tell it not to delete the test data. You're not really using the benchmark results here; you're just using benchmark mode as a data generator. The key point is to set --delete-test-data to false, so that the benchmark run leaves the data files in the storage container. E.g. ./azcopy bench "containerSASURL" --file-count 100 --size-per-file 20M --delete-test-data=false
  3. The benchmark run will create a virtual directory in the blob container. The virtual directory will be named like "benchmark_jobid", where jobid is the id of the AzCopy benchmark job. The directory will contain the specified number of files, of the specified size. Each file contains nearly-random data. (To allow quick data generation, it's not quite perfectly random, but it's close enough for testing purposes.)
  4. Now you can use those files for testing.
    • To test an account-to-account copy, use those files as the source. Your source URL should contain the virtual directory name, like this: https://account.blob.core.windows.net/container/benchmark_jobid?SAS. To construct that URL, just make a container-level SAS URL, then edit it to add the directory name between the container name and the "?" that starts the SAS.
    • To test a download, use the same process to construct the source URL. Also consider specifying the destination as NUL (on Windows) or /dev/null (on Linux). That means "download the data, but throw it away without saving it to disk". That's handy for testing, since it means you're testing pure Storage and network throughput, unaffected by local disk.
  5. When you've finished testing, delete the test files. If you put them into an empty container, the quickest way is to just delete the whole container. If you have other data in the container that you want to keep, you need to delete just the virtual directory that has the test data. Storage Explorer is a good tool for this.
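
The URL editing described in step 4 is easy to get wrong by hand, so here is a minimal sketch of it as a shell function. It is not part of AzCopy: the function name add_dir_to_sas_url is made up for illustration, and the account, container, SAS and job id in the example are placeholders. It assumes the container-level SAS URL has the usual form of a base URL followed by "?" and the SAS token.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not an AzCopy feature): insert a virtual directory
# name into a container-level SAS URL, between the container path and the
# "?" that starts the SAS token.
add_dir_to_sas_url() {
  local container_sas_url="$1"   # e.g. https://account.blob.core.windows.net/container?SAS
  local dir_name="$2"            # e.g. benchmark_jobid

  # Assumption: a SAS URL always contains a "?" before the token.
  case "$container_sas_url" in
    *\?*) ;;
    *) echo "error: URL has no '?' so it does not look like a SAS URL" >&2
       return 1 ;;
  esac

  local base="${container_sas_url%%\?*}"   # everything before the first "?"
  local sas="${container_sas_url#*\?}"     # everything after the first "?"
  base="${base%/}"                         # drop a trailing "/" if present

  printf '%s/%s?%s\n' "$base" "$dir_name" "$sas"
}

# Example with placeholder values (not a real account, SAS or job id):
add_dir_to_sas_url "https://account.blob.core.windows.net/container?SAS" "benchmark_jobid"
# prints: https://account.blob.core.windows.net/container/benchmark_jobid?SAS
```

The resulting URL can then be used as the source of the copy or download you want to test, for example with ./azcopy copy and its --recursive flag so that every file in the virtual directory is included. Remember to quote the URL on the command line, since SAS tokens contain "&" characters.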