Skip to content

Conversation

@smengcl
Copy link
Contributor

@smengcl smengcl commented Mar 18, 2023

What changes were proposed in this pull request?

  1. Print proper JSON object ({ }) if --with-keys=true.
  2. Print proper JSON array ([ ]) if --with-keys=false.
  3. --with-keys now defaults to true. Tweak some option names. Improve error messages. Should have no compatiblity concern what-so-ever as this is intended to be a debug tool.
  4. Rewritten and parameterized TestLDBCli. New test cases are added.
  5. Refactor DBScanner for readability and maintainability.
  6. Change the SchemaV3 separator to the one actually used in DatanodeConfiguration for DB key (currently |)
  7. Exit code 0 on success, 1 on error.

Note

Regarding the core serialization logic in DBScanner, I've chosen to stick to the current Gson approach that serializes each entry and immediately printing it. It should consume less memory than gathering all entries then serializing and printing it (could OOM if batch limit is too high like a few billion entries, while the current approach should work just fine).

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-6064

How was this patch tested?

  • Rewritten TestLDBCli and added new test cases.
    • Fully migrated to JUnit 5.
    • Fully parameterized tests. Able to easily add more params combination in the future.
      • Use Named.of() to describe each parameter for maintainability.
      • Pls feel free to contribute more test cases now that adding a new case is just a matter of a few lines. :)
    • Rewritten keyTable tests, which is heavily inspired by @adoroszlai 's change in HDDS-6064. ozone debug ldb should print proper JSON with --with-keys=true #2917 (in which testOMDB is split into multiple test cases).
    • Datanode DB block_data table schema V3 and V2 tests are completely rewritten.
    • This took longer for me than fixing and refactoring DBScanner :D

IntelliJ

Future potential improvement

  • For SchemaV3 block_data table, we could further nest entries inside another map. With the outer-most layer being containerId.

e.g. Currently (with the separator change in this PR), the JSON key for an entry inside V3 block_data is containerId|blockId:

{ "2|3": {
  "blockID": {
    "containerBlockID": {
      "containerID": 2,
      "localID": 3
    },
    "blockCommitSequenceId": 0
  },
  "metadata": {},
  "size": 0
}, "2|4": {
  "blockID": {
    "containerBlockID": {
      "containerID": 2,
      "localID": 4
    },
    "blockCommitSequenceId": 0
  },
  "metadata": {},
  "size": 0
} }

With another layer, the output could become even cleaner, making it easier to be filtered using jq:

{
  "2": {
    "3": {
      "blockID": {
        "containerBlockID": {
          "containerID": 2,
          "localID": 3
        },
        "blockCommitSequenceId": 0
      },
      "metadata": {},
      "size": 0
    },
    "4": {
      "blockID": {
        "containerBlockID": {
          "containerID": 2,
          "localID": 4
        },
        "blockCommitSequenceId": 0
      },
      "metadata": {},
      "size": 0
    }
  }
}

Example output with this PR

Outputs are mostly from integration test's stdout with dummy data for demo purpose. Command lines are reconstructed from test parameters. Actual CLI output can differ.

ozone debug ldb --db=/data/metadata/om.db scan --column-family=keyTable --limit=1

{ "key1": {
  "volumeName": "vol1",
  "bucketName": "buck1",
  "keyName": "key1",
  "dataSize": 1000,
  "keyLocationVersions": [
    {
      "version": 0,
      "locationVersionMap": {
        "0": []
      },
      "isMultipartKey": false
    }
  ],
  "creationTime": 1679105144793,
  "modificationTime": 1679105144793,
  "replicationConfig": {
    "replicationFactor": "ONE"
  },
  "isFile": false,
  "fileName": "key1",
  "acls": [],
  "parentObjectID": 0,
  "objectID": 0,
  "updateID": 0,
  "metadata": {}
} }

ozone debug ldb --db=/data/metadata/om.db scan --column-family=keyTable

{ "key1": {
  "volumeName": "vol1",
  "bucketName": "buck1",
  "keyName": "key1",
  "dataSize": 1000,
  "keyLocationVersions": [
    {
      "version": 0,
      "locationVersionMap": {
        "0": []
      },
      "isMultipartKey": false
    }
  ],
  "creationTime": 1679102602165,
  "modificationTime": 1679102602166,
  "replicationConfig": {
    "replicationFactor": "ONE"
  },
  "isFile": false,
  "fileName": "key1",
  "acls": [],
  "parentObjectID": 0,
  "objectID": 0,
  "updateID": 0,
  "metadata": {}
}, "key2": {
  "volumeName": "vol1",
  "bucketName": "buck1",
  "keyName": "key2",
  "dataSize": 1000,
  "keyLocationVersions": [
    {
      "version": 0,
      "locationVersionMap": {
        "0": []
      },
      "isMultipartKey": false
    }
  ],
  "creationTime": 1679102602279,
  "modificationTime": 1679102602279,
  "replicationConfig": {
    "replicationFactor": "ONE"
  },
  "isFile": false,
  "fileName": "key2",
  "acls": [],
  "parentObjectID": 0,
  "objectID": 0,
  "updateID": 0,
  "metadata": {}
}, "key3": {
  "volumeName": "vol1",
  "bucketName": "buck1",
  "keyName": "key3",
  "dataSize": 1000,
  "keyLocationVersions": [
    {
      "version": 0,
      "locationVersionMap": {
        "0": []
      },
      "isMultipartKey": false
    }
  ],
  "creationTime": 1679102602282,
  "modificationTime": 1679102602282,
  "replicationConfig": {
    "replicationFactor": "ONE"
  },
  "isFile": false,
  "fileName": "key3",
  "acls": [],
  "parentObjectID": 0,
  "objectID": 0,
  "updateID": 0,
  "metadata": {}
}, "key4": {
  "volumeName": "vol1",
  "bucketName": "buck1",
  "keyName": "key4",
  "dataSize": 1000,
  "keyLocationVersions": [
    {
      "version": 0,
      "locationVersionMap": {
        "0": []
      },
      "isMultipartKey": false
    }
  ],
  "creationTime": 1679102602284,
  "modificationTime": 1679102602284,
  "replicationConfig": {
    "replicationFactor": "ONE"
  },
  "isFile": false,
  "fileName": "key4",
  "acls": [],
  "parentObjectID": 0,
  "objectID": 0,
  "updateID": 0,
  "metadata": {}
}, "key5": {
  "volumeName": "vol1",
  "bucketName": "buck1",
  "keyName": "key5",
  "dataSize": 1000,
  "keyLocationVersions": [
    {
      "version": 0,
      "locationVersionMap": {
        "0": []
      },
      "isMultipartKey": false
    }
  ],
  "creationTime": 1679102602286,
  "modificationTime": 1679102602286,
  "replicationConfig": {
    "replicationFactor": "ONE"
  },
  "isFile": false,
  "fileName": "key5",
  "acls": [],
  "parentObjectID": 0,
  "objectID": 0,
  "updateID": 0,
  "metadata": {}
} }

ozone debug ldb --db=/data/metadata/om.db scan --column-family=keyTable --limit=2 --with-keys=false

[ {
  "volumeName": "vol1",
  "bucketName": "buck1",
  "keyName": "key1",
  "dataSize": 1000,
  "keyLocationVersions": [
    {
      "version": 0,
      "locationVersionMap": {
        "0": []
      },
      "isMultipartKey": false
    }
  ],
  "creationTime": 1679102602165,
  "modificationTime": 1679102602166,
  "replicationConfig": {
    "replicationFactor": "ONE"
  },
  "isFile": false,
  "fileName": "key1",
  "acls": [],
  "parentObjectID": 0,
  "objectID": 0,
  "updateID": 0,
  "metadata": {}
}, {
  "volumeName": "vol1",
  "bucketName": "buck1",
  "keyName": "key2",
  "dataSize": 1000,
  "keyLocationVersions": [
    {
      "version": 0,
      "locationVersionMap": {
        "0": []
      },
      "isMultipartKey": false
    }
  ],
  "creationTime": 1679102602279,
  "modificationTime": 1679102602279,
  "replicationConfig": {
    "replicationFactor": "ONE"
  },
  "isFile": false,
  "fileName": "key2",
  "acls": [],
  "parentObjectID": 0,
  "objectID": 0,
  "updateID": 0,
  "metadata": {}
} ]

ozone debug ldb --db=/data/hdds/hdds/CID-UUID1/DS-UUID2/container.db scan --column-family=block_data --dn-schema=V3 --container-id=2 --limit=2

{ "2|3": {
  "blockID": {
    "containerBlockID": {
      "containerID": 2,
      "localID": 3
    },
    "blockCommitSequenceId": 0
  },
  "metadata": {},
  "size": 0
}, "2|4": {
  "blockID": {
    "containerBlockID": {
      "containerID": 2,
      "localID": 4
    },
    "blockCommitSequenceId": 0
  },
  "metadata": {},
  "size": 0
} }

ozone debug ldb --db=/data/hdds/hdds/CID-UUID1/DS-UUID2/container.db scan --column-family=block_data --dn-schema=V2 --limit=4

{ "1": {
  "blockID": {
    "containerBlockID": {
      "containerID": 1,
      "localID": 1
    },
    "blockCommitSequenceId": 0
  },
  "metadata": {},
  "size": 0
}, "2": {
  "blockID": {
    "containerBlockID": {
      "containerID": 1,
      "localID": 2
    },
    "blockCommitSequenceId": 0
  },
  "metadata": {},
  "size": 0
}, "3": {
  "blockID": {
    "containerBlockID": {
      "containerID": 2,
      "localID": 3
    },
    "blockCommitSequenceId": 0
  },
  "metadata": {},
  "size": 0
}, "4": {
  "blockID": {
    "containerBlockID": {
      "containerID": 2,
      "localID": 4
    },
    "blockCommitSequenceId": 0
  },
  "metadata": {},
  "size": 0
} }

smengcl added 24 commits March 17, 2023 03:22
…ys=true`

Change-Id: I088ab07543c75cfa0cda22b0dd348827938fbc90
Change-Id: I54fd4f5440933824d152b8abf5e64bf303027b00
Change-Id: I65f79b8f858b2010c53a69def485db01291b785f
Change-Id: Iee2567ea81bb6559215ac320db14cc9356b68228
Change-Id: I684e555c8b0daf7f1973eea741e69d54dc4d158d
Change-Id: Ic0a52a9787c5ea93d061ad7bcbbc7dd30a060ac1
Change-Id: I65611172095301c1bfda418d0e977a8c0a40a204
Change-Id: Ifebbe6ad81c657813a77f89534f7613f1e7debe1
Change-Id: I8adf7aeef4d9f263bd7675f3bc52fd7a2173c2d5
Change-Id: I282a732fa9de5959f121b9cff24c42a9f7990228
…le()` which should not have been there)

Change-Id: I27915374e08cba3b858c564f8aaf87e2ddfc3eaa
Change-Id: Idab9a301d48e67ee00a9b668ebdc2b847e72e8ea
Change-Id: I439b0a9dad511073ee60b1c6f6f7309b65026971
Change-Id: I74656163dde8e004b08f839e4769e79687abb65a
Change-Id: Ie574e1ca4c6b38aa8277a301454f6373f753a5ad
Change-Id: I92f8c6a2d969d3bdc35818080dae6d2668a55edc
Change-Id: I22ed04646f742e4ac6913eb5430511688383d313
Change-Id: Icd0f3235f3da50f795d1a595abb1afe025cece30
Change-Id: Ia1a1b59ad2f103fe1905ee93988d8336ad4a4822
Change-Id: I0c753f4f68b4ae0696b65767039bc7dfbbdb44ec
Change-Id: If590813c4131f75dfcb3fb45d30a367e9ad11f53
Change-Id: Ie8ef793a28758596887be7be9598f48f568392c5
Change-Id: Id71b963cd1e9c1d5fe3f81875c1f752f75322b32
Change-Id: I3020e5b02a0f7360a038b4955963c4fb652008e9
Change-Id: I31e074c618b852e8c73023a312f17e66eb9cd434
columnFamilyDefinition.getTableName().getBytes(UTF_8),
columnFamilyHandleList);
if (columnFamilyHandle == null) {
throw new IllegalStateException("columnFamilyHandle is null");
Copy link
Contributor Author

@smengcl smengcl Mar 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi, rationale behind the switch from IllegalArgumentException to IllegalStateException:

#4376 (comment)

@kerneltime kerneltime requested a review from errose28 March 20, 2023 16:16
@kerneltime
Copy link
Contributor

@swamirishi @tanvipenumudy @DaveTeng0 can you please take a look

Copy link
Contributor

@hemantk-12 hemantk-12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @smengcl for the patch and upgrading to Junit-5.

Overall patch looks good. Left some minor comments.

Change-Id: Ie64c5f9ca93e9c6049d1b5fb92e245ef3f68845e
Copy link
Contributor

@hemantk-12 hemantk-12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1.

Thanks @smengcl for working on this.

Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for refactoring and improvements @smengcl. I just have one comment on the key format.

smengcl added 3 commits March 23, 2023 15:16
Conflicts:
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/DBScanner.java

Change-Id: Ia0526717f5996e76c1aba681cb2212162208d2a4
Change-Id: I567ebeca3cc65955e5435c59169a695b6bd895ab
Change-Id: I887f7b9b8c1cbbcfb159ae315d644588b82393b4
@errose28
Copy link
Contributor

Thanks for the improvement @smengcl

@errose28 errose28 merged commit 7357f75 into apache:master Mar 25, 2023
@smengcl
Copy link
Contributor Author

smengcl commented Mar 28, 2023

Thanks @hemantk-12 and @errose28 for reviewing this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants