@yuangu002

What changes were proposed in this pull request?

Table schema:

objectId --> class NSSummary
The new class NSSummary should include at least these fields:

  1. Total number of files directly[1] under this directory,
  2. Total size of files directly under this directory,
  3. File Size Buckets of files directly under this directory.

Note: [1] "Directly" means that files under any subdirectories are not counted, for now. Propagating the statistics upwards through every ancestor directory would make reads faster at the cost of more writes, so keeping the stats one level deep avoids potential write amplification for now. The stats could be extended to arbitrary depth later.
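For illustration, a minimal sketch of what a per-directory summary with these three fields could look like. The class name NSSummarySketch, the field names, the bucket count, and the bucketing rule (1 KB upper bound, doubling per bucket) are all assumptions made for this example, not the code merged in this PR.

```java
import java.util.Arrays;

/**
 * Sketch of a per-directory summary. Names, bucket count, and bucket
 * boundaries are assumptions for illustration only.
 */
final class NSSummarySketch {
  // Hypothetical number of file-size buckets.
  static final int NUM_BUCKETS = 8;

  private int numOfFiles;
  private long sizeOfFiles;
  private final int[] fileSizeBucket = new int[NUM_BUCKETS];

  /** Records one file that sits directly under this directory. */
  void addFile(long sizeInBytes) {
    numOfFiles++;
    sizeOfFiles += sizeInBytes;
    fileSizeBucket[bucketIndex(sizeInBytes)]++;
  }

  /**
   * Hypothetical bucketing rule: bucket 0 holds files up to 1 KB, each
   * following bucket doubles the upper bound, and the last bucket holds
   * everything larger.
   */
  static int bucketIndex(long sizeInBytes) {
    long upperBound = 1024L;
    int index = 0;
    while (sizeInBytes > upperBound && index < NUM_BUCKETS - 1) {
      upperBound *= 2;
      index++;
    }
    return index;
  }

  int getNumOfFiles() {
    return numOfFiles;
  }

  long getSizeOfFiles() {
    return sizeOfFiles;
  }

  /** Returns a copy so callers cannot modify the internal array. */
  int[] getFileSizeBucket() {
    return Arrays.copyOf(fileSizeBucket, fileSizeBucket.length);
  }
}
```

Because only direct children are counted, adding a file touches exactly one summary (its parent directory) rather than one per ancestor.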

[06/21] Renamed ReconContainerDBProvider to ReconDBProvider
[06/21] Expanded the scope of this issue by designing a service provider interface for namespace summary, which wraps up all operations on the new schema.

[06/23] Merge with HDDS-5379
Because we intend to add a new column family (CF) to the Recon container DB, the existing ContainerDBServiceProvider must be refactored. It currently operates directly on the DB rather than on individual CFs/tables, which would conflict with a new service provider sharing the same RDB.
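The idea of giving each service provider its own table on a shared DB can be sketched roughly as follows. Everything here is hypothetical: SharedDbSketch, NamespaceSummaryStore, the table name "namespaceSummary", and the use of in-memory maps and String values in place of RocksDB column families and encoded NSSummary objects are assumptions for illustration, not the Recon classes from this PR.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the shared DB: hands out one map per table name. */
final class SharedDbSketch {
  private final Map<String, Map<Long, String>> tables = new HashMap<>();

  /** Returns the table with the given name, creating it on first use. */
  Map<Long, String> getTable(String name) {
    return tables.computeIfAbsent(name, n -> new HashMap<>());
  }
}

/** Hypothetical service provider interface wrapping all access to one table. */
interface NamespaceSummaryStore {
  void storeSummary(long objectId, String encodedSummary);

  String getSummary(long objectId);
}

/** Operates only on its own table, never on the DB as a whole. */
final class NamespaceSummaryStoreImpl implements NamespaceSummaryStore {
  // Assumed table name for this sketch.
  static final String TABLE_NAME = "namespaceSummary";

  private final Map<Long, String> table;

  NamespaceSummaryStoreImpl(SharedDbSketch db) {
    this.table = db.getTable(TABLE_NAME);
  }

  @Override
  public void storeSummary(long objectId, String encodedSummary) {
    table.put(objectId, encodedSummary);
  }

  @Override
  public String getSummary(long objectId) {
    return table.get(objectId);
  }
}
```

With this split, a second provider built on the same SharedDbSketch can read and write its own table without either provider needing to know about the other.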

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-5332

How was this patch tested?

Unit test: TestReconNamespaceSummaryManagerImpl

@avijayanhwx avijayanhwx left a comment

Thanks for working on this @yuangu002, great work on this refactoring. A few comments inline

@smengcl
smengcl commented Jun 24, 2021

pls check the findbugs warning:

M V EI: org.apache.hadoop.ozone.recon.api.types.NSSummary.getFileSizeBucket() may expose internal representation by returning NSSummary.fileSizeBucket  At NSSummary.java:[line 57]
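This warning (FindBugs pattern EI_EXPOSE_REP) is raised when a getter returns a reference to a mutable internal object such as an array. One common way to address it is to return a defensive copy, sketched below. The class name BucketHolder is hypothetical, and this is not necessarily how the PR resolved the warning.

```java
import java.util.Arrays;

/** Hypothetical holder showing defensive copies of an internal array. */
final class BucketHolder {
  private final int[] fileSizeBucket;

  BucketHolder(int[] fileSizeBucket) {
    // Copy on the way in so later changes by the caller do not leak in.
    this.fileSizeBucket = Arrays.copyOf(fileSizeBucket, fileSizeBucket.length);
  }

  int[] getFileSizeBucket() {
    // Copy on the way out so callers cannot modify internal state.
    return Arrays.copyOf(fileSizeBucket, fileSizeBucket.length);
  }
}
```

The copy costs one small allocation per call, which is usually acceptable for a short, fixed-size bucket array.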

@smengcl smengcl left a comment

Thanks @yuangu002. Overall looks good. Pls address the comments. And I'll take another look after that.

@smengcl smengcl left a comment

+1. @avijayanhwx would you take another look?

@avijayanhwx avijayanhwx left a comment

Changes look good to me. NSSummary classes can be refined in follow up patches.

@avijayanhwx avijayanhwx merged commit b9a86c3 into apache:master Jul 6, 2021
errose28 added a commit that referenced this pull request Jul 6, 2021
…ing-upgrade

* upstream/master:
  HDDS-5332. Add a new column family and a service provider in Recon DB for Namespace Summaries (#2366)
  HDDS-5405. Refactor pom files for HadoopRpc and Grpc/Ratis compilation properties. (#2386)
  HDDS-5406. add proto version to all the proto files. (#2385)
errose28 added a commit to errose28/ozone that referenced this pull request Jul 7, 2021
* master: (28 commits)
  HDDS-5332. Add a new column family and a service provider in Recon DB for Namespace Summaries (apache#2366)
  HDDS-5405. Refactor pom files for HadoopRpc and Grpc/Ratis compilation properties. (apache#2386)
  HDDS-5406. add proto version to all the proto files. (apache#2385)
  HDDS-5398. Avoid object creation in ReplicationManger debug log statements (apache#2379)
  HDDS-5396. Fix negligence issue conditional expressions in MockCRLStore.java (apache#2380)
  HDDS-5395. Avoid unnecessary numKeyOps.incr() call in OMMetrics (apache#2374)
  HDDS-5389. Include ozoneserviceid in fs.defaultFS when configuring o3fs (apache#2370)
  HDDS-5383. Eliminate expensive string creation in debug log messages (apache#2372)
  HDDS-5380. Get more accurate space info for DedicatedDiskSpaceUsage. (apache#2365)
  HDDS-5341. Container report processing is single threaded (apache#2338)
  HDDS-5387. ProfileServlet to move the default output location to an ozone specific directory (apache#2368)
  HDDS-5289. Update container's deleteTransactionId on creation of the transaction in SCM. (apache#2361)
  HDDS-5369. Cleanup unused configuration related to SCM HA (apache#2359)
  HDDS-5381. SCM terminated with exit status 1: null. (apache#2362)
  HDDS-5353. Avoid unnecessary executeBatch call in insertAudits (apache#2342)
  HDDS-5350 :  Add allocate block support in MockOmTransport (apache#2341). Contributed by Uma Maheswara Rao G.
  HDDS-4926. Support start/stop for container balancer via command line (apache#2278)
  HDDS-5269. Datandoe with low ratis log volume space should not be considered for new pipeline allocation. (apache#2344)
  HDDS-5367. Update modification time when updating quota/storageType/versioning (apache#2355)
  HDDS-5352. java.lang.ClassNotFoundException: org/eclipse/jetty/alpn/ALPN (apache#2347)
  ...