Skip to content

Conversation

@DaveTeng0
Copy link
Contributor

@DaveTeng0 DaveTeng0 commented Sep 14, 2022

What changes were proposed in this pull request?

Measure r/w performance when there is a very large of amount of metadata in rocksDB, & different amount of working sets whose size are larger than cache available.

  • Pure read
  • Pure write
  • Mixed workload, Read + Write

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7199

How was this patch tested?

Robot tests, manual tests in cluster.

@DaveTeng0
Copy link
Contributor Author

DaveTeng0 commented Sep 14, 2022

@DaveTeng0 DaveTeng0 changed the title HDDS 7199: Implement new mix workload Read/Write Freon command which meets specific test requirements HDDS-7199: Implement new mix workload Read/Write Freon command which meets specific test requirements Sep 14, 2022
@sodonnel sodonnel changed the title HDDS-7199: Implement new mix workload Read/Write Freon command which meets specific test requirements HDDS-7199. Implement new mix workload Read/Write Freon command which meets specific test requirements Sep 15, 2022
@kerneltime
Copy link
Contributor

cc @duongkame

HDDS_BLOCK_TOKEN_ENABLED_DEFAULT);
// this.grpcBlockTokenEnabled = conf.getBoolean(HDDS_BLOCK_TOKEN_ENABLED,
// HDDS_BLOCK_TOKEN_ENABLED_DEFAULT);
this.grpcBlockTokenEnabled = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please ensure this doesn't get in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes!! let me change it back!

} else {
byte[] data = new byte[objectSizeInBytes];
try (OzoneInputStream inputStream = ozoneBucket.readKey(keyName)) {
inputStream.read(data);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kerneltime hey Ritesh, if I don't need to access the result from read method, is there a way we could suppress the build error thrown from git workflow by this line (because the build throws an error that we ignore the result of read method)? Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add a @SuppressWarnings({"unused"}) to tell findbugs make an exception here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!!


@CommandLine.Option(names = {"-r", "--range"},
description = "index range of read/write operations.",
defaultValue = "0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to establish if --range is optional? It could read contiguously till it hit not found for a key and limit the range to that point. Or it could always be a required option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put it as required for now! We could think about to make it become optional as an improvement later!

public String getKeyName(int clientIndex) {
int start, end;
// separate tasks evenly to each client
if (range < clientsCount) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please validate the expected behavior when client >> range. It might make sense to cap the client to be <= range, so in the case client configured is less than range, we only create enough clients such that each client reads only one object. Or to hammer the same client from multiple clients, if range < clients, each client gets the entire range.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I update the logic a little bit!
Each thread has its own client which holds tcp connection to OM.

We set the range here which would become the range of each thread/client could read/write on.

For example, If we set range = 10, start_index (index of key) = 20, threadNo. = 30, and we run the Freon test 5 times.
Then only 5 clients would process the read/write operation from key index 20 to 29 (range=10).

If we set range = 10, start_index (index of key) = 20, threadNo. = 30 (clients = 30), and we run the Freon test 40 times.
Then first 30 tests would be assign to 30 client to read/write from key index 20 to 29 (range=10).
The rest of 10 tests would be picked up by any 10 clients whoever finish their tasks first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we utilize multiple nodes to separate the total range of keys the user want to test on.

@DaveTeng0
Copy link
Contributor Author

Yes! The command will potentially read some key which doesn't exist in the cluster.
Currently the freon command would stop running & report failure, but let me think more about this how to make it better!!

@kerneltime
Copy link
Contributor

Yes! The command will potentially read some key which doesn't exist in the cluster.
Currently the freon command would stop running & report failure, but let me think more about this how to make it better!!

Code can continue testing and add a metric at the end for successful reads vs. not found read. It is a valid test to look at the performance of OM to report a key does not exist. Object Stores can be bombarded with nonexistent keys, and the performance of the underlying storage to report keys that don't exist is important. LSM tree-based storage has to scan all levels before reporting a key as not found and in some ways represents the worst case performance at scale.

@kerneltime kerneltime merged commit e45f9b8 into apache:master Oct 19, 2022
@DaveTeng0
Copy link
Contributor Author

Yes! The command will potentially read some key which doesn't exist in the cluster.
Currently the freon command would stop running & report failure, but let me think more about this how to make it better!!

Code can continue testing and add a metric at the end for successful reads vs. not found read. It is a valid test to look at the performance of OM to report a key does not exist. Object Stores can be bombarded with nonexistent keys, and the performance of the underlying storage to report keys that don't exist is important. LSM tree-based storage has to scan all levels before reporting a key as not found and in some ways represents the worst case performance at scale.

Thanks Ritesh for the context!! I'll create a separate jira regarding to this! This definitely make sense!!

@duongkame
Copy link
Contributor

duongkame commented Oct 19, 2022

Code can continue testing and add a metric at the end for successful reads vs. not found read. It is a valid test to look at the performance of OM to report a key does not exist. Object Stores can be bombarded with nonexistent keys, and the performance of the underlying storage to report keys that don't exist is important. LSM tree-based storage has to scan all levels before reporting a key as not found and in some ways represents the worst case performance at scale.

agree, reading nonexistent keys is a valid test case and the tool should support it deterministically. To do that, it has to know (on its own) which key exists and which doesn't. Warp does that by having a pre-test phase in which warp creates a set of keys (10K or so) and keeps the created keys in memory for the real read test.

We can also do the same for this tool, by maintaining a set of known keys that can be initialized by a pretest phase and grows with the write test.

@kaijchen
Copy link
Member

There are failures in CI, please make sure CI pass before merge @kerneltime.

adoroszlai added a commit that referenced this pull request Oct 20, 2022
…d which meets specific test requirements (#3754)"

This reverts commit e45f9b8.
@adoroszlai
Copy link
Contributor

Sorry, I had to revert this, because checkstyle and findbugs failures affect all other PRs. Please fix the failures and open new PR.

@DaveTeng0
Copy link
Contributor Author

Code can continue testing and add a metric at the end for successful reads vs. not found read. It is a valid test to look at the performance of OM to report a key does not exist. Object Stores can be bombarded with nonexistent keys, and the performance of the underlying storage to report keys that don't exist is important. LSM tree-based storage has to scan all levels before reporting a key as not found and in some ways represents the worst case performance at scale.

agree, reading nonexistent keys is a valid test case and the tool should support it deterministically. To do that, it has to know (on its own) which key exists and which doesn't. Warp does that by having a pre-test phase in which warp creates a set of keys (10K or so) and keeps the created keys in memory for the real read test.

We can also do the same for this tool, by maintaining a set of known keys that can be initialized by a pretest phase and grows with the write test.

Sure!! I'll take a look how Warp do it and create a jira for it!

@DaveTeng0
Copy link
Contributor Author

There are failures in CI, please make sure CI pass before merge @kerneltime.

Sorry!! I'll take a look!!

@DaveTeng0
Copy link
Contributor Author

Sorry, I had to revert this, because checkstyle and findbugs failures affect all other PRs. Please fix the failures and open new PR.

Sorry!! I'll take a look!! thanks Attila!

@kerneltime
Copy link
Contributor

Sorry, I had to revert this, because checkstyle and findbugs failures affect all other PRs. Please fix the failures and open new PR.

Thank you @adoroszlai!
I should have checked

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants