Add Instrumentation to Collect Missing HTTP Parameters #567

dhaval24 · 2018-02-21T21:46:38Z

This PR adds instrumentation to capture the missing HTTP parameters such as Target, Path etc and fixes the structure of reporting HTTP dependencies to match .NET SDK.

Fix #533

For significant contributions please make sure you have completed the following items:

Design discussion issue #
Changes in public surface reviewed
CHANGELOG.md updated

…dependencies

lmolkova · 2018-02-21T22:00:09Z

...ava/com/microsoft/applicationinsights/agent/internal/coresync/AgentNotificationsHandler.java

@@ -62,8 +65,23 @@
     * @param result - The result
     * @param delta - Time it took to do the call
     */
+    @Deprecated


what is the motivation to mark it as deprecated, why don't just remove it? Could someone call it externally?

@lmolkova this is a publically exposed method in an interface and that means any one can directly call it, the likelihood of someone doing it with this method is relatively very very low but still in theory this is a breaking change. So better API releasing norms say to first deprecate and then clean :- )

indeed, I thought that the interface is private. deprecation makes perfect sense now

lmolkova · 2018-02-21T22:02:58Z

...ain/java/com/microsoft/applicationinsights/internal/agent/CoreAgentNotificationsHandler.java

+        if (!LocalStringsUtils.isNullOrEmpty(uri) && (uri.startsWith("https://dc.services.visualstudio.com") || uri.startsWith("https://rt.services.visualstudio.com"))) {
+            return;
+        }
+        long deltaInMS = nanoToMilliseconds(delta);


is it needed for logging only?

@lmolkova great catch. No actually this should be passed while creating Remote Dependency Telemetry object, otherwise numbers would mess up. Will fix this.

lmolkova · 2018-02-21T22:04:36Z

...ain/java/com/microsoft/applicationinsights/internal/agent/CoreAgentNotificationsHandler.java

+        }
+        long deltaInMS = nanoToMilliseconds(delta);
+        String name = method + " " + path;
+        RemoteDependencyTelemetry telemetry = new RemoteDependencyTelemetry(name, uri, new Duration(delta), true);


what about start time? It looks like now dependency would start Now and finish in future. It would affect trace viewer and provide inaccurate experience.

About this one I don't think numbers would mess up. The logic to calculate the difference is in the Instrumented code. We calculare the difference there and the Remote Dependency Telemetry Object ultimately stores the execution time taken by dependency. https://github.com/Microsoft/ApplicationInsights-Java/blob/0cf699fca09c27b6092bdbc5dcf41ddf732b4e3f/agent/src/main/java/com/microsoft/applicationinsights/agent/internal/agent/HttpClientMethodVisitor.java#L143 Here is where LSUB instruction is used to calculate the difference. This should make it clear.

Well, there is no code that sets telemetry.timestamp in this method. If it's not set here, it will be set in TelemetryClient.track to Now

if (telemetry.getTimestamp() == null) { telemetry.setTimestamp(new Date()); }

i.e. duration is correct, but start time is not. I have not tried it though, it would be great if you can confirm it's not the case.

…passed

lmolkova · 2018-02-21T23:30:57Z

...ain/java/com/microsoft/applicationinsights/internal/agent/CoreAgentNotificationsHandler.java

@@ -170,7 +170,7 @@ public void httpMethodFinishedWithPath(String identifier, String method, String
        }
        long deltaInMS = nanoToMilliseconds(delta);
        String name = method + " " + path;
-        RemoteDependencyTelemetry telemetry = new RemoteDependencyTelemetry(name, uri, new Duration(delta), true);
+        RemoteDependencyTelemetry telemetry = new RemoteDependencyTelemetry(name, uri, new Duration(deltaInMS), true);


repeating the comment, it was collapsed by the commit
Well, there is no code that sets telemetry.timestamp in this method. If it's not set here, it will be set in TelemetryClient.track to Now

if (telemetry.getTimestamp() == null) { telemetry.setTimestamp(new Date()); }

i.e. duration is correct, but start time is not. I have not tried it though, it would be great if you can confirm it's not the case.

@lmolkova yes it's correct that there is no place where this is set explicitly and that means that TelemetryClient will set it to now which is actually after the completion of the dependency execution. The comment above the method though says that "time at which the telemetry is recorded" - I thought this means when the telemetry object is created because I do not see this any where being reported apart from that of RequestTelemetry object. Thoughts?

If we really want to record the start time, one thing that can be done is that when we create the DependencyTelemetry basically subtract the duration from the timeNow. Still it would have some nano seconds of error but that is the solution I have in my mind. Though it would not be related to this PR then. There are multiple places to fix that in whole CoreAgentNotificationHandler class. If you have better solution let me know.

timestamp is definitely a moment when dependency call was started, not when dependency telemetry was created (i.e. call finished).

I'm fine if you just subtract delta from Now,
It could also be done by storing start time in httpRequestStart method, but it's much more complicated.

we had this problem before in .NET. Here is how it looks like: microsoft/ApplicationInsights-dotnet-server#670,

It's confusing for users and breaks our portal experience: dependency starts at the end of request and finishes after request is completed. It's hard to understand what causes request to took so much time, etc.

So it would be great if you made sure start time is set correctly in this PR.

What are other places? You mean other kinds of dependencies? Let's fix http since we are here already and it's most important and create an issue to revisit other dependency tracking calls

sure. yeah, instrumenting it would be a difficult to maintain thing. I would love to instrument as less as possible unless it warrants the criticalness. I would subtract it for now as best alternative and relatively less loss of precision.

And yes, I was referring to other dependencies.

@lmolkova This is fixed. Attached pictures from UI indicating the same.

Before fix:

After Fix:

lmolkova · 2018-02-21T23:35:18Z

I think this is not related to this particular PR, but rather a general comment.
Why don't we have unit tests to verify actual agent behavior? It would be easy to implement and set up. It's also extremely easy to break this code. Without tests, it makes every change extremely dangerous and leads to issues like #506

dhaval24 · 2018-02-21T23:40:00Z

@lmolkova I think it's always a good idea to write test cases and when we took this project we thought of having test cases for agent but again, testing against the byte code is not as easy as it appears on the surface. One thing that could be done is to write test cases against the behavior of the agent, that means write test cases against CoreAgentNotificationHandler. That would then essentially mean that we put all the required dependencies in our code base as testCompile and then use that to make actual calls and test behavior. However, testing the IL code still remains a question and I do not have a clear solution on how to do that. Ideas ?

lmolkova · 2018-02-21T23:58:32Z

@dhaval24 tests may just call httpclient and with a fake channel and then verify that dependency was tracked and has all correct fields.

that's what we do in .NET SDK.
the only problem is to run tests with -javaagent VM prop, but I believe it could be solved, do you know any reasons why it could be hard?

dhaval24 · 2018-02-22T00:02:07Z

@lmolkova I thought you were mentioning to test the actual IL code added using ASM framework. That is hard. What you are referring here is to test the behavior (more of func tests). We already have the framework in place. See SMOKE TEST branch. It's where all the dockerized tests will be there and are running in containers. We are migrating / adding more tests to the new framework. Currently that work is being done by @littleaj. Essentially the mechanism is same of using the fake breeze and verifying the payload. But work is in progress there, framework is ready and tests are being added and ported on it.

lmolkova · 2018-02-22T00:06:39Z

@dhaval24 great, thanks for explanation!

…he file

dhaval24 · 2018-02-22T05:54:55Z

@lmolkova I added more commits to fix the start time of all the dependencies we collect in the CoreAgentNotificationHandler File. Here is how final output looks like. Take a look if needed and thanks for pointing out this issue.

dhaval24 · 2018-02-24T19:46:10Z

@SergeyKanzhelev @Dmitry-Matveev and @grlima @littleaj could you guys please review this PR so it could be merged.

grlima · 2018-02-27T00:45:21Z

...ain/java/com/microsoft/applicationinsights/agent/internal/agent/HttpClientMethodVisitor.java

+
+        //Load HttpUriRequest instance into Local Array. Contains information of Path and Host
+        mv.visitVarInsn(ALOAD, 2);
+        mv.visitMethodInsn(INVOKEINTERFACE, "org/apache/http/client/methods/HttpUriRequest", "getURI", "()Ljava/net/URI;", true);


@dhaval24 - have you tested this? You're relying on org.apache.http.client.methods.HttpUriRequest being present at index 2, but the method this is intrumenting has org.apache.http.HttpRequest in the signature. I'm not sure if it is a problem, but it might be. In addition, the URI was already being collected (in line 104), so you can just use that in the CoreAgentNotificationHandler to extract the host and path (instead of changing this visitor). That would be safer and cleaner.

@grlima yes I have tested this out. You can see the test results in the screen shot of this PR. Now coming to your point on that instrumenting method has signature HttpRequest I think instead instrumenting method has signature as follows public CloseableHttpResponse execute(HttpUriRequest request) throws IOException, ClientProtocolException. Here is the link to docs : https://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/impl/client/CloseableHttpClient.html#execute(org.apache.http.client.methods.HttpUriRequest)

CloseableHttpClient client = HttpClients.createDefault(); HttpGet get = new HttpGet("https://www.abc.com"); try { HttpResponse response = client.execute(get); System.out.println(response.getStatusLine().getStatusCode()); client.close(); }

In the above execute method calls the method mentioned above. So in that case I think I am instrumenting the right method with right signature.

There is already a url variable which can be reused?
A. I think answer here is no. Because getURI() returns a string and doesn't have required components like Absolute Path and target (aka host). here is the reference to the docs (http://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/client/methods/HttpUriRequest.html#getURI())

Now coming to the question that we should instead parse the URI string to get the desired target, relative path etc. I think I would refrain from doing these because of following reason:
String manipulation with regex is expensive operation, also it is susceptible to errors. Further URLs may not be that simple as that appears : for example = https:///www.xyz.com:8888/name=Dhaval?....... so on a long one. In that case coming up with a exact regex which will efficiently exactly the required host and target is challenging.

I think instrumentation here is pretty straight forward and I do not see any possibility of exceptions which could arrise. For a matter of fact I have tried this on SpringBoot, Spring and J2EE applications and things seemed fine. Let me know if you have further questions on this.

What about java.net.URL? It's supposed to handle absolute and relative URLs and provide you an ability to get any portion of URL back as a string, e.g. hostname, query string and etc..
Having one standard parsing instead of byte-code instrumentation sounds like a safer way.

@Dmitry-Matveev as I mentioned above flow goes as follows...
client.execute(get) -> public CloseableHttpResponse execute(HttpUriRequest request, HttpContext context). It is this class of ApacheHttpClient which holds the java.net.URI class. Inside the URI you can get almost anything that you want, as a string (hostname, query, path, raw url etc.). So ideally we only need to instrument the code to capture this URI object which sits inside the HttpUriRequest and that is what my code does. I did not purposefully touch the other URL string coming from org.apache.http.RequestLine to avoid introducing any possible bugs, but that refactoring is very well possible and I do not see any harm in triming that out.

Also we could use the same class HttpUriRequest to get the method (GET/ POST/PUT) instead of instrumenting RequestLine for consistency. Does this make sense? I can explain more if needed.

I meant that if you have a string of absolute or relative URL already (like @grlima suggested in his review), then you can avoid additional byte-code instrumentation and just create new URL object from that string to get host and path.

@Dmitry-Matveev I think we can keep single instrumentation here. I will remove the instrumentation for RequestLine method which got only String URL but instead keep the instrumentation for java.net.URL here which can provide us all the required information. I will make those refactoring.

grlima

Take a look at comment in visitor code. I'm not sure adding more bytecode is the right approach since we're already collecting URI there. I'd prefer just parsing that in the handler.

…nd get all details from java.net.URI

dhaval24 · 2018-03-06T07:29:55Z

@Dmitry-Matveev @grlima I have refactored the code to remove the instrumentation of RequestLine class to getURL and getMethod. Instead now I am only relying on instrumenting HttpUriRequest and java.net.URI to get all the details. This should address all the concerns as now there is only single instrumentation.

dhaval24 · 2018-03-07T16:57:07Z

@littleaj @grlima thanks for approving. I am merging this PR. If in some case we need to roll back we can do that.

dhaval24 added 2 commits February 21, 2018 13:34

Adding instrumentation to capture target and path in for apache http …

512f39f

…dependencies

Update changelog.md

0cf699f

dhaval24 self-assigned this Feb 21, 2018

dhaval24 requested review from littleaj and grlima February 21, 2018 21:47

dhaval24 added this to the 2.0.1 milestone Feb 21, 2018

dhaval24 mentioned this pull request Feb 21, 2018

Http DependencyTelemetry data structure is inconsistent with .NET SDKs #533

Closed

dhaval24 requested review from SergeyKanzhelev and Dmitry-Matveev February 21, 2018 21:50

dhaval24 added the Bug label Feb 21, 2018

lmolkova reviewed Feb 21, 2018

View reviewed changes

fixing incorrect dependency execution time because of wrong variable …

9b07a41

…passed

lmolkova reviewed Feb 21, 2018

View reviewed changes

Fixing the incorrect start time of HTTP dependency

1ac1f1e

lmolkova approved these changes Feb 22, 2018

View reviewed changes

dhaval24 added 2 commits February 21, 2018 18:37

Fixing dependency start time for general methodFinished

dbc22de

Fix dependency start time for all the dependencies being tracked in t…

b98d045

…he file

Updating changelog for recent changes in PR

24f6459

grlima reviewed Feb 27, 2018

View reviewed changes

grlima suggested changes Feb 27, 2018

View reviewed changes

littleaj removed this from the 2.0.1 milestone Mar 2, 2018

littleaj added this to the 2.0.2 milestone Mar 2, 2018

dhaval24 mentioned this pull request Mar 3, 2018

Native Java HTTP calls are not captured #427

Closed

Refactor instrumentation code to remove RequestLine instrumentation a…

36ed01c

…nd get all details from java.net.URI

littleaj approved these changes Mar 6, 2018

View reviewed changes

grlima approved these changes Mar 6, 2018

View reviewed changes

dhaval24 merged commit 9655e99 into master Mar 7, 2018

grlima deleted the FixHttpDependencyParameterCollection branch March 15, 2018 22:41

grlima mentioned this pull request Apr 5, 2018

org.apache.httpcomponents:httpclient dependency data incorrect (class name instead of host) #637

Closed

timja mentioned this pull request Apr 9, 2018

Upgrade appinsights version hmcts/java-logging#57

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Instrumentation to Collect Missing HTTP Parameters #567

Add Instrumentation to Collect Missing HTTP Parameters #567

dhaval24 commented Feb 21, 2018 •

edited

Loading

lmolkova Feb 21, 2018

dhaval24 Feb 21, 2018

lmolkova Feb 21, 2018

lmolkova Feb 21, 2018

dhaval24 Feb 21, 2018 •

edited

Loading

lmolkova Feb 21, 2018

dhaval24 Feb 21, 2018 •

edited

Loading

lmolkova Feb 21, 2018

lmolkova Feb 21, 2018

dhaval24 Feb 21, 2018

dhaval24 Feb 21, 2018

lmolkova Feb 22, 2018

dhaval24 Feb 22, 2018

dhaval24 Feb 22, 2018 •

edited

Loading

lmolkova commented Feb 21, 2018

dhaval24 commented Feb 21, 2018

lmolkova commented Feb 21, 2018

dhaval24 commented Feb 22, 2018

lmolkova commented Feb 22, 2018

dhaval24 commented Feb 22, 2018

dhaval24 commented Feb 24, 2018

grlima Feb 27, 2018

dhaval24 Feb 27, 2018

Dmitry-Matveev Mar 3, 2018

dhaval24 Mar 3, 2018

Dmitry-Matveev Mar 6, 2018

dhaval24 Mar 6, 2018

grlima left a comment

dhaval24 commented Mar 6, 2018

dhaval24 commented Mar 7, 2018

Add Instrumentation to Collect Missing HTTP Parameters #567

Add Instrumentation to Collect Missing HTTP Parameters #567

Conversation

dhaval24 commented Feb 21, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dhaval24 Feb 21, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dhaval24 Feb 21, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dhaval24 Feb 22, 2018 • edited Loading

Choose a reason for hiding this comment

lmolkova commented Feb 21, 2018

dhaval24 commented Feb 21, 2018

lmolkova commented Feb 21, 2018

dhaval24 commented Feb 22, 2018

lmolkova commented Feb 22, 2018

dhaval24 commented Feb 22, 2018

dhaval24 commented Feb 24, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

grlima left a comment

Choose a reason for hiding this comment

dhaval24 commented Mar 6, 2018

dhaval24 commented Mar 7, 2018

dhaval24 commented Feb 21, 2018 •

edited

Loading

dhaval24 Feb 21, 2018 •

edited

Loading

dhaval24 Feb 21, 2018 •

edited

Loading

dhaval24 Feb 22, 2018 •

edited

Loading