[JENKINS-73090] Handle CR from LineTransformationOutputStream #9219


jglick
Member

@jglick jglick commented May 1, 2024

See JENKINS-73090 for background.

Testing done

Without the fix, the cr test with the count set to a billion failed with

java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
	at java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:120)
	at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)
	at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:137)
	at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:56)
	at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:75)
	at java.base/java.io.OutputStream.write(OutputStream.java:122)
	at hudson.console.LineTransformationOutputStreamTest.cr(LineTransformationOutputStreamTest.java:45)

(I am not leaving this count that high in the committed test since when it passes it takes over two minutes of high CPU on my laptop.)

I also tried installing the Timestamper plugin, enabling the global decorator (auto timestamps in all Pipeline builds), and running the Windows git clone pipeline mentioned in Jira. This produced timestamped output, but the output seemed to arrive all at once. As far as I could tell from the jenkins.log file in the workspace temp dir, that is an aspect of Git rather than Jenkins, so I also tried

node('linux') {
    sh 'set +x; for i in `seq 00 99`; do printf "$i\r"; sleep 1; done'
}

which did produce live timestamped output, though the classic console does not reliably show it; the Pipeline Graph View plugin does. (Running tail -f …/builds/…/log from a terminal actually updates the line in place.)

Proposed changelog entries

  • Treat lines of text (mainly in build logs) as completed by a single carriage return in addition to a newline or carriage return plus newline, avoiding an out of memory error if a large number of such lines are printed in sequence.
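
To illustrate the rule in the changelog entry above, here is a minimal sketch of a stream that ends a line on NL, on CR followed by NL, or on a lone CR. It is not the code from this PR: `CrAwareLineStream`, `split`, and the `lines` list are hypothetical names chosen for the example, and the real class hands each line to an `eol` callback instead of collecting strings.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a line-oriented stream; NOT the Jenkins class. */
class CrAwareLineStream extends OutputStream {
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();
    private final List<String> lines = new ArrayList<>();
    private boolean sawCr; // true when the last buffered byte was a CR

    @Override
    public void write(int b) {
        if (sawCr && b != '\n') {
            // The previous CR was not followed by NL, so it ended a line on its own.
            eol();
        }
        buf.write(b);
        if (b == '\n') {
            eol(); // NL, or the NL of a CR NL pair
        } else {
            sawCr = (b == '\r');
        }
    }

    @Override
    public void close() {
        if (buf.size() > 0) {
            eol(); // emit a trailing partial line
        }
    }

    /** Emits the buffered bytes (terminator included) as one line and resets the buffer. */
    private void eol() {
        lines.add(new String(buf.toByteArray(), StandardCharsets.UTF_8));
        buf.reset();
        sawCr = false;
    }

    /** Convenience helper for the example: splits a whole string in one go. */
    static List<String> split(String text) {
        CrAwareLineStream s = new CrAwareLineStream();
        try {
            s.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        s.close();
        return s.lines;
    }

    public static void main(String[] args) {
        // "a\r", "b\n", "c\r\n" and "d" are four separate lines.
        System.out.println(split("a\rb\nc\r\nd").size()); // prints 4
    }
}
```

Because the buffer is reset at every lone CR, a long run of CR-separated progress output is emitted line by line rather than accumulating in memory until the next NL.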

Proposed upgrade guidelines

N/A

@jglick
Member Author

jglick commented May 2, 2024

#3580 👀

@NotMyFault NotMyFault requested a review from a team May 4, 2024 08:11
@jglick jglick requested a review from dwnusbaum May 9, 2024 13:56
Member

@dwnusbaum dwnusbaum left a comment

I think this makes sense. It is quite awkward right now that LineTransformationOutputStream has different behavior to most other Java line-based APIs like BufferedReader.readLine and Files.readAllLines (makes it difficult to use these APIs together for things like reading log files).

I do wonder about potential performance impact. It might also be worth running the PCT to get an idea in advance of how many tests might be affected, if any.
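
For reference, the `BufferedReader.readLine` behavior mentioned above can be checked with a few lines of standard-library code. `ReadLineDemo` and `readLines` are names made up for this sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReadLineDemo {
    /** Reads all lines using BufferedReader.readLine, which strips the terminators. */
    static List<String> readLines(String text) {
        List<String> out = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // A lone CR, a lone NL, and CR NL each terminate a line.
        System.out.println(readLines("a\rb\nc\r\nd")); // prints [a, b, c, d]
    }
}
```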

@jglick
Member Author

jglick commented May 9, 2024

I do wonder about potential performance impact.

Any particular concerns?

It might also be worth running the PCT

Good point, I can do that in bom and/or @cloudbees.

@dwnusbaum
Member

I do wonder about potential performance impact.

Any particular concerns?

My worry is just that this is inner loop code that looks at every byte that gets written to a build log, so it seems possible that even the small changes here which add a field lookup and some branches could have measurable impact at scale. We don't have benchmarks to measure against though, and maybe JIT compilation, branch prediction, and out-of-order execution will make it a nonissue. Either way, unless the performance impact is massive (which I doubt), I think the change is probably worth it.

@jglick
Member Author

jglick commented May 10, 2024

this is inner loop code that looks at every byte

Indeed it is. On the other hand it is already calling ByteArrayOutputStream.write(byte) on every byte, even when write(byte[], int, int) is called, which seems like the most obvious target for optimization. I would expect the added cost to be negligible compared to the overhead of receiving input on the one side (typically from a TLS-encrypted Remoting pipe) and forwarding content in Delegating on the other side (typically to a file handle ultimately), plus various other processing such as masking secrets.

In the particular case being fixed here, that of a large block of text separated by CR, this is not just a performance improvement but potentially necessary to avoid OOME. Presumably that case is unusual and most text is NL-delimited or occasionally CRNL.
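
As a rough sketch of the optimization target mentioned above (not something this PR implements), a block write could scan the incoming array for terminators and copy each run into the buffer with one `ByteArrayOutputStream.write(byte[], int, int)` call instead of one call per byte. `BulkLineBuffer`, `finish`, and the `lines` list are hypothetical names used only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: scans a block for line ends and buffers each run in bulk. */
class BulkLineBuffer {
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();
    private final List<String> lines = new ArrayList<>();
    private boolean sawCr; // true when the last byte seen was a CR; survives across calls

    void write(byte[] b, int off, int len) {
        int start = off;          // first byte not yet copied into buf
        int end = off + len;
        for (int i = off; i < end; i++) {
            byte c = b[i];
            if (sawCr && c != '\n') {
                // A lone CR ended the previous line: copy everything before i, then emit.
                buf.write(b, start, i - start);
                eol();
                start = i;
            }
            if (c == '\n') {
                buf.write(b, start, i + 1 - start); // one bulk copy including the NL
                eol();
                start = i + 1;
            } else {
                sawCr = (c == '\r');
            }
        }
        buf.write(b, start, end - start); // keep the unterminated tail for the next call
    }

    /** Emits any trailing partial line and returns everything collected so far. */
    List<String> finish() {
        if (buf.size() > 0) {
            eol();
        }
        return lines;
    }

    private void eol() {
        lines.add(new String(buf.toByteArray(), StandardCharsets.UTF_8));
        buf.reset();
        sawCr = false;
    }
}
```

The `sawCr` flag is kept between calls so that a CR at the end of one block and an NL at the start of the next are still treated as a single CRNL terminator.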

@MarkEWaite MarkEWaite added the bug For changelog: Minor bug. Will be listed after features label May 17, 2024
@MarkEWaite
Contributor

This PR is now ready for merge. We will merge it after approximately 24 hours if there is no negative feedback.

/label ready-for-merge

@comment-ops-bot comment-ops-bot bot added the ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback label May 17, 2024
@MarkEWaite MarkEWaite self-assigned this May 17, 2024
@MarkEWaite MarkEWaite merged commit aaf4b64 into jenkinsci:master May 18, 2024
17 checks passed
@jglick jglick deleted the LineTransformationOutputStream-JENKINS-73090 branch May 20, 2024 18:58