-
Notifications
You must be signed in to change notification settings - Fork 589
HDDS-3622. Implement rocksdb tool to parse scm db #945
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/ListTables.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/ListTables.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/SCMDBParser.java
Show resolved
Hide resolved
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/RDBParser.java
Outdated
Show resolved
Hide resolved
|
@sadanand48 Thanks for working on this useful tool. There is an Ozone component called 'Recon' which is supposed to act as the "management" HQ for Ozone. Currently, it has access to SCM and OM metadata in the same format as they are in the respective components. The vision for Recon is to know what is happening in Ozone and understand what decisions are being taken by SCM, OM & DN. Please take a look at HDDS-1084 and HDDS-1996 for more information. Since Recon already has the SCM DB, the same tool should also work against that. Can we add a flag to this CLI like --recon which allows the tool to read from the Recon server as well? That will be useful in understanding Recon's state. |
Why not add the code to a package in Recon? |
|
Thank you @sadanand48, I am very happy to have this feature.
I think it's a more generic question. How can I choose the appropriate db definition based on a given database / table? There can be multiple options:
|
elek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again to work on this. I really like it.
One more usability comment:
This patch introduces two subcommands:
ozone rdbparser scmdbparser and ozone rdbparser list.
As I wrote it earlier, I would prefer to have just one single command which can detect the required db schema based on name. I think we can follow the parameter / arguments convention of ldb.
What do you think about this:
ozone ldb --db=/data/.../scm.db --column_family=pipeline scan
ozone ldb --db=/data/..../scm.db list_column_families
For me it's more natural (for example parser seems to be redundant in the current name)
But to be honest: none of my comments are so important: I am happy to commit it in this form if we will improve it in the next few commits...
`ozone
| return null; | ||
| } | ||
|
|
||
| private static List<byte[]> getColumnFamilyList(String dbPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit1: I am not sure if we need a separated method just to call this function.
List<byte[]> columnFamilies = RocksDB.listColumnFamilies(new Options(), parent.getDbPath);Nit2: I tend to agree with Uncle Bob, that we don't need to include the type of a variable in the name of the variable:
Hungarian Notation was considered to be pretty important back in the Windows C API, when everything was an integer handle or a long pointer or a void pointer, or one of several implementations of “string” (with different uses and attributes). The compiler did not check types in those days, so the programmers needed a crutch to help them remember the types.
In modern languages we have much richer type systems, and the compilers remember and enforce the types. What’s more, there is a trend toward smaller classes and shorter functions so that people can usually see the point of declaration of each variable they’re using.
Java programmers don’t need type encoding. Objects are strongly typed, and editing environments have advanced such that they detect a type error long before you can run a compile! So nowadays HN and other forms of type encoding are simply impediments. They make it harder to change the name or type of a variable, function, or class. They make it harder to read the code. And they create the possibility that the encoding system will mislead the reader.
(In: Clean Code / Chapter 2. Meaningful name)
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/SCMDBParser.java
Outdated
Show resolved
Hide resolved
| ) | ||
| public class SCMDBParser implements Callable<Void> { | ||
|
|
||
| @CommandLine.Option(names = {"-table"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Can you please use --table. It's more standard (or use -t and --table as an alias)
|
Thanks @elek for the comments. I have made the changes you suggested. Also, i am now choosing the DBDefinition according to the directory name. In this patch , i have done it only for scm. Will extend it for om in a follow up jira.
|
|
@elek Can you please review this ? I have made the changes you suggested. |
elek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, i am now choosing the DBDefinition according to the directory name. In this patch , i have done it only for scm. Will extend it for om in a follow up jira.
Sounds like a great plan.
Thanks for the update. (And sorry for the late re-check . I had a few "code writing" days) Will merge it soon.
I think it will be a very useful tool to debug problems.
What changes were proposed in this pull request?
This tool parses rocksdb file for SCM and dumps specified table data. Also there is a list command which parses any db file and lists the columnFamilies(tables).
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-3622
How was this patch tested?
Manually Tested.
Here are the commands:
1.bin/ozone debug ldb --db=/tmp/ozone/data/metadata/scm.db/ list_column_families
default
validCerts
deletedBlocks
pipelines
revokedCerts
containers
2.bin/ozone debug ldb --db=/tmp/ozone/data/metadata/scm.db/ scan --column_family=containers