HDDS-3622. Implement rocksdb tool to parse scm db #945

sadanand48 · 2020-05-19T17:59:27Z

What changes were proposed in this pull request?

This tool parses the RocksDB file for SCM and dumps the data of a specified table. It also provides a list command that parses any DB file and lists its column families (tables).

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-3622

How was this patch tested?

Tested manually.

Here are the commands:
1. bin/ozone debug ldb --db=/tmp/ozone/data/metadata/scm.db/ list_column_families
default
validCerts
deletedBlocks
pipelines
revokedCerts
containers

2. bin/ozone debug ldb --db=/tmp/ozone/data/metadata/scm.db/ scan --column_family=containers

hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/ListTables.java

hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/SCMDBParser.java

hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/RDBParser.java

avijayanhwx · 2020-05-19T18:31:49Z

@sadanand48 Thanks for working on this useful tool.

There is an Ozone component called 'Recon' which is supposed to act as the "management" HQ for Ozone. Currently, it has access to SCM and OM metadata in the same format as they are in the respective components. The vision for Recon is to know what is happening in Ozone and understand what decisions are being taken by SCM, OM & DN. Please take a look at HDDS-1084 and HDDS-1996 for more information.

Since Recon already has the SCM DB, the same tool should also work against that. Can we add a flag to this CLI like --recon which allows the tool to read from the Recon server as well? That will be useful in understanding Recon's state.

swagle · 2020-05-19T18:40:43Z

@sadanand48 Thanks for working on this useful tool.

There is an Ozone component called 'Recon' which is supposed to act as the "management" HQ for Ozone. Currently, it has access to SCM and OM metadata in the same format as they are in the respective components. The vision for Recon is to know what is happening in Ozone and understand what decisions are being taken by SCM, OM & DN. Please take a look at HDDS-1084 and HDDS-1996 for more information.

Since Recon already has the SCM DB, the same tool should also work against that. Can we add a flag to this CLI like --recon which allows the tool to read from the Recon server as well? That will be useful in understanding Recon's state.

Why not add the code to a package in Recon?

elek · 2020-05-21T13:23:49Z

Thank you @sadanand48, I am very happy to have this feature.

Can we add a flag to this CLI like --recon which allows the tool to read from the Recon server as well?

I think it's a more generic question. How can I choose the appropriate db definition based on a given database / table? There can be multiple options:

If the column families are unique, we can try to use just the name
If not, we can try to identify the required db definition based on the name of the directory (scm.db, ...)
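The two options above could be sketched roughly as follows. This is only an illustration, not the code in this patch: the class `DbDefinitionResolver`, its registry map, and the `om.db` table names are hypothetical. The `scm.db` column families are the ones shown in the `list_column_families` output in the PR description. The sketch tries the column family name first and falls back to the directory name when the name is unknown or ambiguous.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: pick a DB definition for a RocksDB directory. */
public class DbDefinitionResolver {

  // Hypothetical registry: definition name -> column families it declares.
  // The om.db entries are placeholders invented for this example.
  private static final Map<String, List<String>> DEFINITIONS =
      new LinkedHashMap<>();

  static {
    DEFINITIONS.put("scm.db", Arrays.asList(
        "validCerts", "deletedBlocks", "pipelines", "revokedCerts",
        "containers"));
    DEFINITIONS.put("om.db", Arrays.asList(
        "exampleTableA", "exampleTableB"));
  }

  /** Option 1: the column family name identifies exactly one definition. */
  public static Optional<String> byColumnFamily(String columnFamily) {
    String match = null;
    for (Map.Entry<String, List<String>> entry : DEFINITIONS.entrySet()) {
      if (entry.getValue().contains(columnFamily)) {
        if (match != null) {
          return Optional.empty(); // declared by several definitions
        }
        match = entry.getKey();
      }
    }
    return Optional.ofNullable(match);
  }

  /** Option 2: use the last path element of the db directory (scm.db, ...). */
  public static Optional<String> byDirectoryName(String dbPath) {
    Path fileName = Paths.get(dbPath).getFileName();
    if (fileName == null) {
      return Optional.empty(); // e.g. the root directory has no name
    }
    String name = fileName.toString();
    return DEFINITIONS.containsKey(name)
        ? Optional.of(name) : Optional.empty();
  }

  /** Try the column family name first, then fall back to the directory. */
  public static Optional<String> resolve(String dbPath, String columnFamily) {
    Optional<String> byName = byColumnFamily(columnFamily);
    return byName.isPresent() ? byName : byDirectoryName(dbPath);
  }

  public static void main(String[] args) {
    System.out.println(
        resolve("/tmp/ozone/data/metadata/scm.db/", "containers"));
  }
}
```

Note that the `default` column family exists in every RocksDB instance, so it can never identify a definition by name alone. That is one reason a directory-name fallback is useful.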

elek

Thanks again for working on this. I really like it.

One more usability comment:

This patch introduces two subcommands:

ozone rdbparser scmdbparser and ozone rdbparser list.

As I wrote earlier, I would prefer a single command that detects the required db schema based on the name. I think we can follow the parameter / argument convention of ldb.

What do you think about this:

ozone ldb --db=/data/.../scm.db --column_family=pipeline scan
ozone ldb --db=/data/..../scm.db list_column_families

To me this is more natural (for example, "parser" seems redundant in the current name).

But to be honest, none of my comments are that important. I am happy to commit it in this form if we improve it in the next few commits.


elek · 2020-05-21T13:32:46Z

hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/ListTables.java

+    return null;
+  }
+
+  private static List<byte[]> getColumnFamilyList(String dbPath)


Nit1: I am not sure we need a separate method just to call this function.

List<byte[]> columnFamilies = RocksDB.listColumnFamilies(new Options(), parent.getDbPath());
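Since the call returns the names as `List<byte[]>`, they still have to be decoded before printing. A minimal sketch of that step, using only the JDK: the class name `ColumnFamilyNames` is hypothetical, and the sketch assumes the names are UTF-8 encoded. The sample bytes in `main` stand in for the result of the RocksDB call.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: turn raw column family names into strings. */
public class ColumnFamilyNames {

  /** Decodes each raw name, assuming UTF-8 (an assumption of this sketch). */
  public static List<String> decode(List<byte[]> rawNames) {
    List<String> names = new ArrayList<>(rawNames.size());
    for (byte[] raw : rawNames) {
      names.add(new String(raw, StandardCharsets.UTF_8));
    }
    return names;
  }

  public static void main(String[] args) {
    // Stand-in for the list returned by RocksDB.listColumnFamilies(...).
    List<byte[]> raw = Arrays.asList(
        "default".getBytes(StandardCharsets.UTF_8),
        "containers".getBytes(StandardCharsets.UTF_8));
    for (String name : decode(raw)) {
      System.out.println(name);
    }
  }
}
```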

Nit2: I tend to agree with Uncle Bob, that we don't need to include the type of a variable in the name of the variable:

Hungarian Notation was considered to be pretty important back in the Windows C API, when everything was an integer handle or a long pointer or a void pointer, or one of several implementations of “string” (with different uses and attributes). The compiler did not check types in those days, so the programmers needed a crutch to help them remember the types.

In modern languages we have much richer type systems, and the compilers remember and enforce the types. What’s more, there is a trend toward smaller classes and shorter functions so that people can usually see the point of declaration of each variable they’re using.

Java programmers don’t need type encoding. Objects are strongly typed, and editing environments have advanced such that they detect a type error long before you can run a compile! So nowadays HN and other forms of type encoding are simply impediments. They make it harder to change the name or type of a variable, function, or class. They make it harder to read the code. And they create the possibility that the encoding system will mislead the reader.

(In: Clean Code / Chapter 2. Meaningful Names)


elek · 2020-05-21T13:37:16Z

hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/SCMDBParser.java

+)
+public class SCMDBParser implements Callable<Void> {
+
+  @CommandLine.Option(names = {"-table"},


Nit: Can you please use --table? It's more standard (or use -t with --table as an alias).

sadanand48 · 2020-05-29T15:22:04Z

Thanks @elek for the comments. I have made the changes you suggested. Also, I am now choosing the DBDefinition according to the directory name. In this patch, I have done it only for SCM. I will extend it to OM in a follow-up JIRA.

If not, we can try to identify the required db definition based on the name of the directory (scm.db, ...)

sadanand48 · 2020-06-08T14:25:48Z

@elek Can you please review this? I have made the changes you suggested.

elek

Also, I am now choosing the DBDefinition according to the directory name. In this patch, I have done it only for SCM. I will extend it to OM in a follow-up JIRA.

Sounds like a great plan.

Thanks for the update. (And sorry for the late re-check. I had a few "code writing" days.) Will merge it soon.

I think it will be a very useful tool to debug problems.

HDDS-3622.Implement rocksdb tool to parse scm db

1f13010

mukul1987 requested changes May 19, 2020


addressed review comments

0de7702

elek changed the title ~~HDDS-3622.Implement rocksdb tool to parse scm db~~ HDDS-3622. Implement rocksdb tool to parse scm db May 21, 2020

elek reviewed May 21, 2020


Reading DBDefinition from db file name

41790d7

nandakumar131 assigned sadanand48 Jun 4, 2020

elek approved these changes Jun 10, 2020


elek merged commit ef24b11 into apache:master Jun 10, 2020

elek mentioned this pull request Jun 10, 2020

HDDS-3405. Tool for Listing keys from the OpenKeyTable #864

Closed

sadanand48 mentioned this pull request Jun 15, 2020

HDDS-3773. Add OMDBDefinition to define structure of om.db. #1076

Merged
