@steveloughran
Contributor

Calls to Syncable.hflush() on S3ABlockOutputStream instances are logged at debug and the statistics counter is updated; they are no longer warned about or rejected.

How was this patch tested?

  • modify existing tests for new behaviour
  • s3a tests -Dparallel-tests -DtestsThreadCount=9

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 53s | | Docker mode activated. |
| | | | _ Prechecks _ | |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | _ trunk Compile Tests _ | |
| +1 💚 | mvninstall | 41m 47s | | trunk passed |
| +1 💚 | compile | 0m 45s | | trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | compile | 0m 34s | | trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | checkstyle | 0m 31s | | trunk passed |
| +1 💚 | mvnsite | 0m 43s | | trunk passed |
| +1 💚 | javadoc | 0m 42s | | trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javadoc | 0m 34s | | trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | spotbugs | 1m 10s | | trunk passed |
| +1 💚 | shadedclient | 39m 53s | | branch has no errors when building and testing our client artifacts. |
| | | | _ Patch Compile Tests _ | |
| +1 💚 | mvninstall | 0m 31s | | the patch passed |
| +1 💚 | compile | 0m 38s | | the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javac | 0m 38s | | the patch passed |
| +1 💚 | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | javac | 0m 27s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 21s | | the patch passed |
| +1 💚 | mvnsite | 0m 34s | | the patch passed |
| +1 💚 | javadoc | 0m 29s | | the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | spotbugs | 1m 10s | | the patch passed |
| +1 💚 | shadedclient | 40m 15s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ | |
| +1 💚 | unit | 3m 33s | | hadoop-aws in the patch passed. |
| +1 💚 | asflicense | 0m 36s | | The patch does not generate ASF License warnings. |
| | | 137m 42s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.49 ServerAPI=1.49 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7662/1/artifact/out/Dockerfile |
| GITHUB PR | #7662 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux f3d94f735602 5.15.0-131-generic #141-Ubuntu SMP Fri Jan 10 21:18:28 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 429bbdd |
| Default Java | Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7662/1/testReport/ |
| Max. process+thread count | 529 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7662/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

statistics.hflushInvoked();
handleSyncableInvocation();
// do not reject these, but downgrade to a no-op
LOG.debug("Hflush invoked");
Contributor

@steveloughran is Parquet the only caller of hflush()? I think this changes behaviour for everyone. Is this something we need to care about?

Contributor Author

Look at the FS spec: we say "don't use the API" and highlight the inconsistent outcomes.

The semantics of hflush() say "visible to all" but no persistence, so this isn't changing any durability semantics. Are we changing the visibility semantics? We're certainly not meeting them.

I remember having a long talk with others about hflush(), as in "what does it do?"; the answer is "nothing you can rely on".

When exceptions are downgraded (the default), all that happens is that the warning log message is removed, which reduces confusion.

When calls are rejected rather than downgraded, the hflush() failure goes away. The one I want to fail here is hsync(), and that still holds. AFAIK nobody runs in that mode except for some of our test setups.


Syncable.hflush()

Flush out the data in client's user buffer. After the return of
this call, new readers will see the data. The hflush() operation
does not contain any guarantees as to the durability of the data, only
its visibility.

Thus implementations may cache the written data in memory
—visible to all, but not yet persisted.
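
To make the quoted contract concrete, here is a toy Java model of it. Everything in it (`ToySyncableStream`, its three buffers, the `readAsNewReader()` and `readAfterCrash()` helpers) is hypothetical and written only for illustration; it is not Hadoop code. The `hsync()` half reflects my reading of the spec beyond the text quoted above, namely that hsync() adds durability on top of what hflush() offers. S3A meets neither guarantee, which is the point being made in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Toy model of the hflush()/hsync() contract. Hypothetical; not a Hadoop class. */
class ToySyncableStream {
  // Data the client has written but not yet flushed anywhere.
  private final ByteArrayOutputStream clientBuffer = new ByteArrayOutputStream();
  // Data new readers can see (may live only in memory).
  private final ByteArrayOutputStream visible = new ByteArrayOutputStream();
  // Data that would survive a crash.
  private final ByteArrayOutputStream durable = new ByteArrayOutputStream();

  void write(String s) {
    byte[] b = s.getBytes(StandardCharsets.UTF_8);
    clientBuffer.write(b, 0, b.length);
  }

  /** Visibility only: move the client buffer to where new readers look. */
  void hflush() {
    byte[] pending = clientBuffer.toByteArray();
    visible.write(pending, 0, pending.length);
    clientBuffer.reset();
  }

  /** Assumed hsync() semantics: everything hflush() does, plus persistence. */
  void hsync() {
    hflush();
    byte[] all = visible.toByteArray();
    int alreadyPersisted = durable.size();
    durable.write(all, alreadyPersisted, all.length - alreadyPersisted);
  }

  String readAsNewReader() {
    return new String(visible.toByteArray(), StandardCharsets.UTF_8);
  }

  String readAfterCrash() {
    return new String(durable.toByteArray(), StandardCharsets.UTF_8);
  }
}

class ToySyncableDemo {
  public static void main(String[] args) {
    ToySyncableStream out = new ToySyncableStream();
    out.write("a");
    out.hflush();
    System.out.println("new reader sees: '" + out.readAsNewReader() + "'");
    System.out.println("after a crash:   '" + out.readAfterCrash() + "'");
    out.hsync();
    System.out.println("after hsync:     '" + out.readAfterCrash() + "'");
  }
}
```

In this model a caller that needs data to survive a failure cannot get that from hflush() alone, which is why downgrading hflush() to a no-op does not weaken any durability promise.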

Contributor

@ahmarsuhail ahmarsuhail left a comment

+1, LGTM

@steveloughran steveloughran merged commit b949ca6 into apache:trunk Apr 30, 2025
4 checks passed
steveloughran added a commit to steveloughran/hadoop that referenced this pull request Apr 30, 2025
…alls (apache#7662)

S3A output streams no longer log warnings on use of hflush()
or, if fs.s3a.downgrade.syncable.exceptions = false,
raise an UnsupportedOperationException.

hsync() is still reported with a warning or rejected.
That method is absolutely unsupported when writing to S3.

Contributed by Steve Loughran
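
The behaviour described in this commit message can be sketched as a small self-contained Java model. This is not the real S3ABlockOutputStream: the class `SyncableSketch`, its constructor flag, and its counter field are hypothetical stand-ins, `java.util.logging` replaces the project's logger (with `fine` standing in for debug), and the exception text is invented. The flag mirrors what the message says about fs.s3a.downgrade.syncable.exceptions: when true, hsync() only warns; when false, it is rejected.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

/** Hypothetical model of the post-patch Syncable handling; not Hadoop code. */
class SyncableSketch {
  private static final Logger LOG = Logger.getLogger("SyncableSketch");

  // Stand-in for fs.s3a.downgrade.syncable.exceptions.
  private final boolean downgradeSyncableExceptions;

  // Stand-in for the stream statistics counter updated on each hflush().
  final AtomicLong hflushInvoked = new AtomicLong();

  SyncableSketch(boolean downgradeSyncableExceptions) {
    this.downgradeSyncableExceptions = downgradeSyncableExceptions;
  }

  /** Never rejected: count the call, log quietly, do nothing else. */
  void hflush() {
    hflushInvoked.incrementAndGet();
    LOG.fine("Hflush invoked");
  }

  /** Still unsupported: warn when downgrading, otherwise reject. */
  void hsync() {
    if (downgradeSyncableExceptions) {
      LOG.warning("hsync() is not supported when writing to S3");
    } else {
      throw new UnsupportedOperationException(
          "hsync() is not supported when writing to S3");
    }
  }
}

class SyncableSketchDemo {
  public static void main(String[] args) {
    SyncableSketch strict = new SyncableSketch(false);
    strict.hflush(); // accepted even in strict mode
    try {
      strict.hsync();
    } catch (UnsupportedOperationException e) {
      System.out.println("hsync rejected: " + e.getMessage());
    }
    System.out.println("hflush calls counted: " + strict.hflushInvoked.get());
  }
}
```

The design point the sketch tries to capture is that only hsync() consults the flag after this change; hflush() takes the same quiet path in both modes.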
steveloughran added a commit that referenced this pull request May 1, 2025
…calls (#7662)
