@z-bb z-bb commented May 12, 2023

What changes were proposed in this pull request?

Hadoop client writes are slow when streaming is enabled.

Stack trace captured during the slow write:

	at org.apache.hadoop.hdds.scm.storage.BlockDataStreamOutput.doFlushIfNeeded(BlockDataStreamOutput.java:321)
	at org.apache.hadoop.hdds.scm.storage.BlockDataStreamOutput.write(BlockDataStreamOutput.java:260)
	at org.apache.hadoop.ozone.client.io.BlockDataStreamOutputEntry.write(BlockDataStreamOutputEntry.java:108)
	at org.apache.hadoop.ozone.client.io.KeyDataStreamOutput.writeToDataStreamOutput(KeyDataStreamOutput.java:201)
	at org.apache.hadoop.ozone.client.io.KeyDataStreamOutput.handleWrite(KeyDataStreamOutput.java:179)
	at org.apache.hadoop.ozone.client.io.KeyDataStreamOutput.write(KeyDataStreamOutput.java:159)
	at org.apache.hadoop.hdds.scm.storage.ByteBufferStreamOutput.write(ByteBufferStreamOutput.java:37)
	at org.apache.hadoop.fs.ozone.OzoneFSDataStreamOutput.write(OzoneFSDataStreamOutput.java:72)
	at java.io.OutputStream.write(OutputStream.java:116)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	- locked <0x0000000080670800> (a org.apache.hadoop.fs.FSDataOutputStream)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:88)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:60)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:120)
	at org.apache.hadoop.fs.shell.CommandWithDestination$TargetFileSystem.writeStreamToFile(CommandWithDestination.java:466)
	at org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:391)
	at org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:328)
	at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:263)
	at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:248)
	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
	at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
	at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:243)
	at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
	at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
	at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:220)
	at org.apache.hadoop.fs.shell.CopyCommands$Put.processArguments(CopyCommands.java:267)
	at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
	at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
	at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)

Because OzoneFSDataStreamOutput does not override OutputStream's write(byte[], int, int) method,
it falls back to the inherited implementation, which writes the buffer out one byte at a time:

OutputStream.class

public void write(byte b[], int off, int len) throws IOException {
    Objects.checkFromIndexSize(off, len, b.length);
    // len == 0 condition implicitly handled by loop bounds
    for (int i = 0 ; i < len ; i++) {
        write(b[off + i]);
    }
}
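
To illustrate the effect, here is a minimal, self-contained sketch. It is not the patch itself. `CountingSink`, `PerByteStream`, `BulkStream` and `BulkWriteDemo` are hypothetical names made up for this example, and the 4096-byte buffer size is arbitrary. The sketch only compares how many calls reach the underlying sink when the bulk `write` is inherited and when it is overridden.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

/** Hypothetical stand-in for a ByteBuffer-based stream; it only counts calls. */
class CountingSink {
    int calls = 0;
    long bytes = 0;

    void write(ByteBuffer buf) {
        calls++;
        bytes += buf.remaining();
        buf.position(buf.limit()); // mark the buffer as consumed
    }
}

/** Inherits OutputStream.write(byte[], int, int), which loops over write(int). */
class PerByteStream extends OutputStream {
    protected final CountingSink sink;

    PerByteStream(CountingSink sink) {
        this.sink = sink;
    }

    @Override
    public void write(int b) throws IOException {
        sink.write(ByteBuffer.wrap(new byte[] {(byte) b}));
    }
}

/** Overrides the bulk method so the whole range is forwarded in one call. */
class BulkStream extends PerByteStream {
    BulkStream(CountingSink sink) {
        super(sink);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        sink.write(ByteBuffer.wrap(b, off, len));
    }
}

class BulkWriteDemo {
    /** Issues one write(byte[], int, int) of n bytes and returns the number of sink calls. */
    static int sinkCalls(boolean bulk, int n) {
        CountingSink sink = new CountingSink();
        OutputStream out = bulk ? new BulkStream(sink) : new PerByteStream(sink);
        try {
            out.write(new byte[n], 0, n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.calls;
    }

    public static void main(String[] args) {
        System.out.println("per-byte: " + sinkCalls(false, 4096) + " sink calls");
        System.out.println("bulk:     " + sinkCalls(true, 4096) + " sink calls");
    }
}
```

With the inherited method, a single 4096-byte write turns into 4096 sink calls. With the override it is one call, which matches the per-byte path visible in the stack trace above (OutputStream.write calling OzoneFSDataStreamOutput.write).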

What is the link to the Apache JIRA

https://issues.apache.org/jira/projects/HDDS/issues/HDDS-8584

How was this patch tested?

# before fix (interrupted with ^C after about 13 minutes):
$ time ~/hadoop-2.7.2-5504-ozone-client/bin/hadoop fs -put file_1.5g /vol1/buk1/key_test11
^C
real	13m4.849s
user	3m43.157s
sys	0m9.274s

# after fix:
$ time ~/hadoop-2.7.2-5504-ozone-client/bin/hadoop fs -put file_1.5g /vol1/buk1/key_test22

real	0m12.368s
user	0m36.840s
sys	0m10.346s

@adoroszlai adoroszlai requested review from sadanand48 and szetszwo May 12, 2023 08:02

@szetszwo szetszwo left a comment

+1 the change looks good.

@szetszwo szetszwo merged commit 6d90022 into apache:master May 12, 2023
@z-bb z-bb deleted the HDDS-8584 branch May 17, 2023 03:20
errose28 added a commit to errose28/ozone that referenced this pull request May 17, 2023
* master: (78 commits)
  HDDS-8575. Intermittent failure in TestCloseContainerEventHandler.testCloseContainerWithDelayByLeaseManager (apache#4688)
  HDDS-7241. EC: Reconstruction could fail with orphan blocks. (apache#4718)
  HDDS-8577. [Snapshot] Disable compaction log when loading metadata for snapshot (apache#4697)
  HDDS-7080. EC: Offline reconstruction needs better logging (apache#4719)
  HDDS-8626. Config thread pool in ReplicationServer (apache#4715)
  HDDS-8616. Underreplication not fixed if all replicas start decommissioning (apache#4711)
  HDDS-8254. Close containers when volume reaches utilisation threshold (apache#4583)
  HDDS-8615. Explicitly show EC block type in 'ozone debug chunkinfo' command output (apache#4706)
  HDDS-8623. Delete duplicate getBucketInfo in OMKeyCommitRequest (apache#4712)
  HDDS-8339. Recon Show the number of keys marked for Deletion in Recon UI. (apache#4519)
  HDDS-8572. Support CodecBuffer for protobuf v3 codecs. (apache#4693)
  HDDS-8010. Improve DN warning message when getBlock does not find the block. (apache#4698)
  HDDS-8621. IOException is never thrown in SCMRatisServer.getRatisRoles(). (apache#4710)
  HDDS-8463. S3 key uniqueness in deletedTable (apache#4660)
  HDDS-8584. Hadoop client write slowly when stream enabled (apache#4703)
  HDDS-7732. EC: Verify block deletion from missing EC containers (apache#4705)
  HDDS-8581. Avoid random ports in integration tests (apache#4699)
  HDDS-8504. ReplicationManager: Pass used and excluded node separately for Under and Mis-Replication (apache#4694)
  HDDS-8576. Close RocksDB instance in RDBStore if RDBStore's initialization fails after RocksDB instance creation (apache#4692)
  ...
@z-bb z-bb restored the HDDS-8584 branch May 18, 2023 08:53