fix(periodic): config for maxage/maxsize to prevent recording upload timeouts due to large filesize #96
Conversation
Force-pushed from c690b39 to d7f394d.
Force-pushed from 1c49e60 to 4aab4d7.
I'll take a look.
I assume the maxAge cannot be smaller than the period? I tried testing this by setting the …
Yea, there are certain minimums that the JFR system itself within the JVM will enforce, like the minimum size of a chunk and therefore the minimum number of events that will be included. If that minimum threshold isn't met then the maxage/maxsize policies won't be applied. In order to get a short maxage like that to actually apply, you'd need to be recording on a target application that is generating a lot more events per second.
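For reference, the maxage/maxsize policies being discussed correspond to the standard JDK Flight Recorder disk policies. A minimal sketch using the plain jdk.jfr API (this is the JDK API, not this PR's code) shows where the limits are set:

```java
import java.time.Duration;

import jdk.jfr.Recording;

public class MaxAgePolicyDemo {
    public static void main(String[] args) {
        try (Recording recording = new Recording()) {
            recording.setToDisk(true);
            recording.setMaxAge(Duration.ofSeconds(30)); // prune data older than 30s
            recording.setMaxSize(50L * 1024 * 1024);     // cap retained data at 50 MiB
            recording.start();
            // The JVM prunes whole chunks, so on a quiet application the
            // retained data can exceed these limits, as described above.
            System.out.println(recording.getMaxAge()); // PT30S
        }
    }
}
```

Because pruning happens at chunk granularity, a very short maxage on a low-traffic application may have no visible effect, matching the behaviour described in the comment above.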
Force-pushed from 4aab4d7 to 2ed06ef.
One more question: I've set the harvesting period to 10min and the …
I'll have to give this more thought. I think this reveals a deeper bug about the Agent lifecycle and how the discovery ping is handled with re-registration. The Agent currently handles this by going through deregistration and re-registration internally, but the deregistration is what interrupts the periodic upload schedule and I think is also breaking the onexit upload.
Force-pushed from 2ed06ef to 6c389fa.
Everything looks good now!
Signed-off-by: Andrew Azores <[email protected]>
Force-pushed from 6c389fa to 4b27d45.
Rebased with no changes for commit signing.
Fixes #95
Depends on #99
Adds two new config parameters for the maxage/maxsize JFR properties. These two properties could already be controlled for recordings that are uploaded when the host JVM is exiting, but periodically pushed recordings would always push the entire available JFR repository contents. This likely results in overlapping recording chunks being pushed to the server frequently, wasting network bandwidth and storage capacity. The new config parameters can be tuned to minimize this waste. The maxsize is not applied by default, but the maxage is: the default data age is taken as 1.5x the harvester period. This will still result in some overlap of recording chunks on each push, but probably much less than before for common harvester periods and JFR repository sizes.
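The 1.5x default described above can be sketched as follows (a hypothetical helper with illustrative names, not the PR's actual code):

```java
public class HarvesterDefaults {
    // Hypothetical helper: derive the effective maxage from config,
    // falling back to 1.5x the harvester period when unset.
    static long effectiveMaxAgeMs(long configuredMaxAgeMs, long periodMs) {
        // treat a non-positive configured value as "unset"
        return configuredMaxAgeMs > 0 ? configuredMaxAgeMs : (long) (periodMs * 1.5);
    }

    public static void main(String[] args) {
        // a 5-minute harvester period yields a 7.5-minute default data age
        System.out.println(effectiveMaxAgeMs(0, 300_000)); // 450000
    }
}
```

Retaining slightly more than one period's worth of data guarantees each push has complete coverage of the interval since the last push, at the cost of some chunk overlap.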
Also included are some fixes to ensure that one thread is responsible for managing the harvester and the state that it controls, and to ensure that the harvester does not get into a bad spinning state that I've seen when uploads fail. I have also seen uploads fail when the server sends the registration refresh POST signal, which would cause the spinning behaviour. In this PR single uploads that overlap(?) with handling the registration signal may still fail, but they are handled gracefully and the periodic push resumes as normal on the next scheduled attempt.
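The retry-instead-of-spin behaviour can be sketched roughly like this (illustrative only, not the agent's actual implementation; in the real agent the state would be owned by a single-threaded scheduled executor so only one thread ever mutates it):

```java
import java.util.function.BooleanSupplier;

public class HarvesterLoop {
    private final BooleanSupplier uploader; // returns true if the upload succeeded
    private int consecutiveFailures = 0;    // mutated only by the worker thread

    HarvesterLoop(BooleanSupplier uploader) {
        this.uploader = uploader;
    }

    // Invoked on each scheduled tick: a failed upload is recorded and simply
    // retried on the next tick, rather than retried immediately in a tight
    // loop ("spinning").
    void harvestOnce() {
        boolean ok;
        try {
            ok = uploader.getAsBoolean();
        } catch (RuntimeException e) {
            ok = false; // treat an exception the same as a failed upload
        }
        consecutiveFailures = ok ? 0 : consecutiveFailures + 1;
    }

    int consecutiveFailures() {
        return consecutiveFailures;
    }

    public static void main(String[] args) {
        HarvesterLoop loop = new HarvesterLoop(() -> false);
        loop.harvestOnce();
        System.out.println(loop.consecutiveFailures());
    }
}
```

The key design point is that failure handling returns control to the scheduler instead of re-entering the upload path, which is what allows the periodic push to resume normally on the next attempt.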
"Spinning" fix testing
Use the following Cryostat smoketest.sh invocation: CRYOSTAT_DISCOVERY_PING_PERIOD=30000 sh smoketest.sh. This sets the discovery callback POST ping signal to occur every 30 seconds. Every 15 seconds the agent should try to push a harvested JFR file to the server. Before this PR, the agent will get into a bad "spinning" state quite easily whenever it needs to reregister itself after the POST signal. After the PR the same root failure can still be observed, but it results in the agent simply trying again later and succeeding.
maxage/maxsize testing
TODO: determine a quicker way to test this. I have observed some issues with the agent trying to push very large files and timing out when leaving the standard smoketest.sh setup running for long periods of time, but I'm not sure exactly how long this takes or if there are other extenuating circumstances that also contribute to the problems I have seen.
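One possible way to speed this up is to shorten the harvester period and set an explicit maxage/maxsize while running the smoketest. A sketch of such an invocation is below; the exact property names are assumptions for illustration and are not confirmed by this PR text:

```shell
# Hypothetical invocation: short harvester period plus explicit retention
# limits (property names assumed, check the agent's config documentation).
CRYOSTAT_DISCOVERY_PING_PERIOD=30000 \
JAVA_OPTS_APPEND="-Dcryostat.agent.harvester.period-ms=15000 \
  -Dcryostat.agent.harvester.max-age-ms=30000 \
  -Dcryostat.agent.harvester.max-size-b=5242880" \
sh smoketest.sh
```

With a small max-size (5 MiB here), file growth and the resulting upload timeouts should surface much sooner than with the default long-running setup.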