Skip to content

Conversation

@smengcl
Copy link
Contributor

@smengcl smengcl commented Aug 3, 2020

What changes were proposed in this pull request?

This Jira is to use deleteObjects in OFS delete now that HDDS-3286 is committed.

Currently when ozone.om.enable.filesystem.paths is enabled it normalizes the path, so using deleteKey for delete directory will fail.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-4040

How was this patch tested?

Added testFileDelete - mostly copied from o3fs test.

Change-Id: I263cfbad1f552b3c505f5745975173f120e036e3
@smengcl smengcl added bug Something isn't working client labels Aug 3, 2020
@smengcl smengcl self-assigned this Aug 3, 2020
Change-Id: I5213fc1267f8cdd3c40817f5913c565caa202c8e
@arp7 arp7 requested a review from bharatviswa504 August 3, 2020 20:20
…_PATHS.

Change-Id: I3752de9a7f2167c9b7687971c8426db9368796b2
@smengcl
Copy link
Contributor Author

smengcl commented Aug 5, 2020

It seems adding parameterized test for TestRootedOzoneFileSystem tripped the time out.
Before the parameterization it is already taking 40min to finish the integration test suite:
https://github.com/apache/hadoop-ozone/runs/941577903

[INFO] Reactor Summary for Apache Hadoop Ozone Integration Tests 0.6.0-SNAPSHOT:
[INFO] 
[INFO] Apache Hadoop Ozone Integration Tests .............. SUCCESS [40:01 min]
[INFO] Apache Hadoop Ozone Mini Ozone Chaos Tests ......... SUCCESS [  1.758 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS

After the parameterization commit it is timing out at over 45min:
https://github.com/apache/hadoop-ozone/pull/1286/checks?check_run_id=946974224

[INFO] Reactor Summary for Apache Hadoop Ozone Integration Tests 0.6.0-SNAPSHOT:
[INFO] 
[INFO] Apache Hadoop Ozone Integration Tests .............. FAILURE [45:44 min]
[INFO] Apache Hadoop Ozone Mini Ozone Chaos Tests ......... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  45:45 min
[INFO] Finished at: 2020-08-04T23:33:52Z

@bharatviswa504
Copy link
Contributor

bharatviswa504 commented Aug 5, 2020

It seems adding parameterized test for TestRootedOzoneFileSystem tripped the time out.

Do we need to increase time out in test?

…to 1000s.

Not sure if this would have side effects. Have a try.
@smengcl
Copy link
Contributor Author

smengcl commented Aug 6, 2020

It seems adding parameterized test for TestRootedOzoneFileSystem tripped the time out.

Do we need to increase time out in test?

Yep. But the timeout in this case seems to be imposed at a higher level (maybe forkedProcessTimeoutInSeconds) not at the test class level (300s for each test method).

Change-Id: Ic8ed3460d7cee5cc3c732175643123bfc84ae7a9
@smengcl
Copy link
Contributor Author

smengcl commented Aug 6, 2020

I ran TestRootedOzoneFileSystem locally again. It seems the problem is it is taking significant longer (double the time spent) to run the test with the parameter (14m 52s total for both, about 7m 30s for each) than without the parameter (7m 47s). Good new is all tests pass.

Another observation. On the GH workflow run, without parameterization we were spending around 9.5min for TestRootedOzoneFileSystem without parameterization:
https://github.com/apache/hadoop-ozone/runs/941577903

[INFO] Running org.apache.hadoop.fs.ozone.TestOzoneFileSystem
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 251.496 s - in org.apache.hadoop.fs.ozone.TestOzoneFileSystem
...
[INFO] Running org.apache.hadoop.fs.ozone.TestRootedOzoneFileSystem
[INFO] Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 566.177 s - in org.apache.hadoop.fs.ozone.TestRootedOzoneFileSystem
...
[INFO] Running org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 381.674 s - in org.apache.hadoop.fs.ozone.TestOzoneFileInterfaces

With parameterization we can expect it to be spending over 1100s (18min) on this one test suite.. which is not good.

We will need to reduce the run time of the test suite. Probably should do this in another jira, with something like HDDS-2833 by reducing the overhead of starting up & tearing down the mini cluster.
Also I haven't figured out where the timeout is imposed as of now. @adoroszlai Do you know where this 45m-46m timeout is imposed?

@bharatviswa504 Now that we can prove the tests work with both ozone.om.enable.filesystem.paths options. Should we proceed with fdbc71c ?

@adoroszlai
Copy link
Contributor

Do you know where this 45m-46m timeout is imposed?

Looks like TestRootedOzoneFileSystem hits surefire.fork.timeout of 1000 seconds, as indicated by the error message There was a timeout or other error in the fork.

smengcl added 2 commits August 6, 2020 11:00
This reverts commit 5865c2f.

Change-Id: I96c1777aa753be44bd57564e9af1a660589152d1
Change-Id: I4eb115689bb9b99d44438b2b79e17a2cc93b4488
@smengcl
Copy link
Contributor Author

smengcl commented Aug 6, 2020

Do you know where this 45m-46m timeout is imposed?

Looks like TestRootedOzoneFileSystem hits surefire.fork.timeout of 1000 seconds, as indicated by the error message There was a timeout or other error in the fork.

Thanks! I'm increasing it further to check.

@bharatviswa504
Copy link
Contributor

@bharatviswa504 Now that we can prove the tests work with both ozone.om.enable.filesystem.paths options. Should we >proceed with fdbc71c ?

Can we use @BeforeClass for starting MiniOzoneCluster instead for each test, that would save time.
I would like to keep this parameterized, if anything is broken it would be easily caught, as default is false. (because of this, there were some cases missed)

@smengcl
Copy link
Contributor Author

smengcl commented Aug 6, 2020

@bharatviswa504 Now that we can prove the tests work with both ozone.om.enable.filesystem.paths options. Should we >proceed with fdbc71c ?

Can we use @BeforeClass for starting MiniOzoneCluster instead for each test, that would save time.
I would like to keep this parameterized, if anything is broken it would be easily caught, as default is false. (because of this, there were some cases missed)

Good idea. Just have to make everything static. Let me try.

Copy link
Contributor

@bharatviswa504 bharatviswa504 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
I have one minor comment, and also test change to make use of @BeforeClass.

Change-Id: I0c6215ad65973de3ee6099618be830e2e65549c3
@ChenSammi ChenSammi added the 0.6.0 label Aug 7, 2020
Copy link
Contributor

@xiaoyuyao xiaoyuyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, a few minor comments added inline.

@smengcl smengcl changed the title HDDS-4040. OFS should use deleteObjects when deleting directory HDDS-4040. [OFS] BasicRootedOzoneFileSystem to support batchDelete Aug 7, 2020
smengcl added 5 commits August 7, 2020 11:21
…ting the operation is successful.

Change-Id: I6d9ddc92299776e9f2aa07ab57c37b0f5a6af7a1
Change-Id: I3e92d159c664eeea4c54db06a4bb8a8ce7959306
Change-Id: I9e28debbd9be021b2e4dac46ab9d811ad41376b8
Change-Id: I94531e4e57f11bd2cf3e8f9c956725f708c14ec4
Change-Id: I7174e0841c280ea8d9f41e1cdc6b67f76b0e0ad1
@xiaoyuyao
Copy link
Contributor

Thanks @smengcl for the update. Let's merge it by EOD today if there is no additional comments.

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @smengcl for implementing this. The change works fine:

TRACE ozone.BasicRootedOzoneFileSystem: Iterating path: ofs://om/fstest1/bucket1-ofs/test/ofs/dir
TRACE ozone.BasicRootedOzoneFileSystem: Iterating directory: fstest1/bucket1-ofs/test/ofs/dir/
TRACE ozone.BasicRootedOzoneFileSystem: iterating key path: fstest1/bucket1-ofs/test/ofs/dir/
TRACE ozone.BasicRootedOzoneFileSystem: iterating key path: fstest1/bucket1-ofs/test/ofs/dir/MOVED.TXT
TRACE ozone.BasicRootedOzoneFileSystem: iterating key path: fstest1/bucket1-ofs/test/ofs/dir/PUTFILE.txt
TRACE ozone.BasicRootedOzoneFileSystem: iterating key path: fstest1/bucket1-ofs/test/ofs/dir/TOUCHFILE-ofs.txt
TRACE ozone.BasicRootedOzoneFileSystem: Deleting keys: [fstest1/bucket1-ofs/test/ofs/dir/, fstest1/bucket1-ofs/test/ofs/dir/MOVED.TXT, fstest1/bucket1-ofs/test/ofs/dir/PUTFILE.txt, fstest1/bucket1-ofs/test/ofs/dir/TOUCHFILE-ofs.txt]

Comment on lines 305 to 307
@Deprecated
protected void incrementCounter(Statistic objectsRead) {
//noop: Use OzoneClientAdapterImpl which supports statistics.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@Deprecated
protected void incrementCounter(Statistic objectsRead) {
//noop: Use OzoneClientAdapterImpl which supports statistics.
protected void incrementCounter(Statistic objectsRead) {
incrementCounter(objectsRead, 1);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed the old method from all Impls. Only kept it in BasicOzoneFileSystem/BasicRootedOzoneFileSystem.

Change-Id: I3cd47aed4e4c281bbbfddb3a865826d59fdd1449
2. Overload incrementCounter only in BasicOzoneFileSystem / BasicRootedOzoneFileSystem, where the old method calls the new one, minimizing the code change.

Change-Id: I0606951cac16b16118ff775fbda05c36f5f4155b
@smengcl
Copy link
Contributor Author

smengcl commented Aug 12, 2020

Thanks for reviewing the patch @adoroszlai . I have made some changes accordingly. Please take another look. Thanks!

@adoroszlai
Copy link
Contributor

Thanks @smengcl for updating the patch.

@adoroszlai adoroszlai merged commit b186b90 into apache:master Aug 12, 2020
@adoroszlai
Copy link
Contributor

Thanks @smengcl for the contribution, @bharatviswa504 and @xiaoyuyao for the reviews.

errose28 added a commit to errose28/ozone that referenced this pull request Aug 12, 2020
* master: (28 commits)
  HDDS-4037. Incorrect container numberOfKeys and usedBytes in SCM after key deletion (apache#1295)
  HDDS-3232. Include the byteman scripts in the distribution tar file (apache#1309)
  HDDS-4095. Byteman script to debug HCFS performance (apache#1311)
  HDDS-4057. Failed acceptance test missing from bundle (apache#1283)
  HDDS-4040. [OFS] BasicRootedOzoneFileSystem to support batchDelete (apache#1286)
  HDDS-4061. Pending delete blocks are not always included in #BLOCKCOUNT metadata (apache#1288)
  HDDS-4067. Implement toString for OMTransactionInfo (apache#1300)
  HDDS-3878. Make OMHA serviceID optional if one (but only one) is defined in the config (apache#1149)
  HDDS-3833. Use Pipeline choose policy to choose pipeline from exist pipeline list (apache#1096)
  HDDS-3979. Make bufferSize configurable for stream copy (apache#1212)
  HDDS-4048. Show more information while SCM version info mismatch (apache#1278)
  HDDS-4078. Use HDDS InterfaceAudience/Stability annotations (apache#1302)
  HDDS-4034. Add Unit Test for HadoopNestedDirGenerator. (apache#1266)
  HDDS-4076. Translate CSI.md into Chinese (apache#1299)
  HDDS-4046. Extensible subcommands for CLI applications (apache#1276)
  HDDS-4051. Remove whitelist/blacklist terminology from Ozone (apache#1306)
  HDDS-4055. Cleanup GitHub workflow (apache#1282)
  HDDS-4042. Update documentation for the GA release (apache#1269)
  HDDS-4066. Add core-site.xml to intellij configuration (apache#1292)
  HDDS-4073. Remove leftover robot.robot (apache#1297)
  ...
ChenSammi pushed a commit that referenced this pull request Aug 12, 2020
rakeshadr pushed a commit to rakeshadr/hadoop-ozone that referenced this pull request Sep 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working client

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants