fix: fix test of root span to match what is being set #2494
bbrowning merged 1 commit into llamastack:main
Even with this change, we'll still sometimes insert

I believe this change looks reasonable for the purposes of properly testing if we're recording a root span or not, but we may also need to adjust the Trace class in llama_stack/apis/telemetry/telemetry.py to make

Or, if it's actually required, we'll have to do the work to figure out what our root_span_id actually is for every trace request and never insert None there, right?
My assumption (which may well be flawed) is that there is little value in a trace entry that does not reference a span. In what circumstances would a trace be inserted with no root_span_id, and how would that entry then be used?
If we're part of a larger distributed trace started outside of Llama Stack by the calling client, that root span id won't be our span_id but may be our parent_span_id (or our parent's parent, or parent's parent's parent, etc.). I don't know if Llama Stack itself has visibility into the outermost root_span_id in those cases, although you could argue that, for the purposes of this work, it would probably be enough to traverse our parent lineage as far back as we can and consider that the root_span_id in most or all cases?
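The "traverse our parent lineage as far back as we can" idea could be sketched roughly as below. This is only an illustration: the `furthest_known_ancestor` helper and the `parent_of` mapping are hypothetical and are not taken from the Llama Stack code base; the sketch assumes we have a mapping from each locally recorded span id to its parent span id.

```python
from typing import Dict, Optional


def furthest_known_ancestor(span_id: str, parent_of: Dict[str, Optional[str]]) -> str:
    """Follow parent links while the parent is a locally recorded span.

    `parent_of` is a hypothetical mapping: locally recorded span id -> parent id.
    A parent id that is not a key of the mapping was started elsewhere
    (for example by the calling client), so the walk stops before it.
    """
    seen = set()
    current = span_id
    while True:
        seen.add(current)
        parent = parent_of.get(current)
        # Stop when there is no parent, the parent is not known locally,
        # or the links loop back on themselves (purely defensive).
        if parent is None or parent not in parent_of or parent in seen:
            return current
        current = parent


# "c" -> "b" -> "a" -> "remote-1", where "remote-1" belongs to the client.
parent_of = {"a": "remote-1", "b": "a", "c": "b"}
print(furthest_known_ancestor("c", parent_of))  # prints "a"
```

Under this assumption the span returned is the outermost span we actually recorded, which may differ from the true root of the distributed trace.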
And, maybe I'm reading too much into this column in sqlite being named
Even with propagated trace context though, there doesn't seem to be any value in inserting a record into the traces table where the 'root_span_id' is null as it then has no link to any span information. |
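To illustrate why a null root_span_id leaves a trace row unusable, here is a small self-contained sketch using an in-memory sqlite database. The two-table schema is a simplified assumption for illustration only and is not the actual schema used by the sqlite span processor.

```python
import sqlite3

# Hypothetical, simplified schema: only the columns needed for the illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE traces (trace_id TEXT PRIMARY KEY, root_span_id TEXT);
    CREATE TABLE spans (span_id TEXT PRIMARY KEY, trace_id TEXT, name TEXT);
    INSERT INTO spans VALUES ('s1', 't1', 'inference'), ('s2', 't2', 'inference');
    INSERT INTO traces VALUES ('t1', 's1'), ('t2', NULL);
    """
)

# Traces that can be resolved to a span through root_span_id.
linked = conn.execute(
    "SELECT t.trace_id, s.name FROM traces t "
    "JOIN spans s ON s.span_id = t.root_span_id ORDER BY t.trace_id"
).fetchall()

# Traces with no span reference at all.
orphaned = conn.execute(
    "SELECT trace_id FROM traces WHERE root_span_id IS NULL"
).fetchall()

print(linked)    # [('t1', 'inference')]
print(orphaned)  # [('t2',)]
```

Trace `t2` has a span in the spans table, but nothing in its trace row points to it, so a query that starts from the trace cannot reach it.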
llama_stack/providers/inline/telemetry/meta_reference/sqlite_span_processor.py
@bbrowning I've made a small further change here to correctly handle the case where a traceparent context is passed in by the client. I have introduced a separate marker for the 'local' root, so that inserting a span id for each trace is not dependent on the
bbrowning left a comment
This seems like a reasonable approach, ensuring we log the span_id we know about even if for some reason it's not the ultimate root span because we're part of a larger context.
I left a non-blocking comment around maybe considering a code comment or constant just to help us remember the purpose of this.
Thanks!
Using the telemetry API to query the sqlite store of traces and spans requires that each trace record references the first span recorded locally in the 'spans' table. This fix ensures such a reference is always in place whether or not a traceparent is passed in.

Signed-off-by: Gordon Sim <gsim@redhat.com>
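One way to picture the separate 'local' root marker is the sketch below. All names here (`LOCAL_ROOT_MARKER`, `ROOT_MARKER`, `ParentContext`, `start_span_attributes`) are hypothetical and chosen for illustration; the actual attribute names and logic in the merged change may differ. The assumption is that a span is the local root when it has no parent, or when its parent came from a client-supplied traceparent.

```python
from dataclasses import dataclass
from typing import Dict, Optional

ROOT_MARKER = "__root__"              # hypothetical name: true root of the whole trace
LOCAL_ROOT_MARKER = "__local_root__"  # hypothetical name: first span recorded locally


@dataclass
class ParentContext:
    span_id: str
    is_remote: bool  # True when the parent came from a client-supplied traceparent


def start_span_attributes(parent: Optional[ParentContext]) -> Dict[str, bool]:
    """Return the marker attributes to set on a span that is being started."""
    attrs: Dict[str, bool] = {}
    if parent is None:
        # No parent at all: this span is the root of the whole trace.
        attrs[ROOT_MARKER] = True
    if parent is None or parent.is_remote:
        # No locally recorded parent: this is the first span we know about.
        attrs[LOCAL_ROOT_MARKER] = True
    return attrs


print(start_span_attributes(None))
print(start_span_attributes(ParentContext("abc", is_remote=True)))
print(start_span_attributes(ParentContext("def", is_remote=False)))
```

With a marker like this, the trace row can always be written when the local root is seen, regardless of whether the trace was started by the client.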
I commented out the pytest skip for the only test in

Here's how I ran that test locally, after commenting out the pytest skip marker on it:

I don't know if this was the only reason that test was marked as skipped. If you're willing, it might be worth digging into if this makes that test stable now or if it was also failing for other reasons in CI. But, either way, great job on this one - merging!
What does this PR do?
I get errors when trying to query spans. This appears to be caused by traces being inserted with no root_span_id, which causes a pydantic validation error when loading the data for a query response (and in any case, a trace that references no span undermines the purpose of the trace). The root cause, as far as I can see, is an invalid test in the code that inserts the trace: it compares against the string "true", while the attribute is set to the Python value True.
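The mismatch described above can be reproduced in isolation. The attribute name `__root__` and the two helper functions are assumptions made for this sketch, not a copy of the code in the repository.

```python
# Hypothetical attribute name; the value is the Python boolean True,
# as described in the PR text.
attributes = {"__root__": True}


def is_root_buggy(attrs: dict) -> bool:
    # Compares a boolean against a string, so this is never true
    # when the attribute was set to True.
    return attrs.get("__root__") == "true"


def is_root_fixed(attrs: dict) -> bool:
    # Tests for the value that is actually being set.
    return attrs.get("__root__") is True


print(is_root_buggy(attributes))  # False
print(is_root_fixed(attributes))  # True
```

Because the buggy check never succeeds, the branch that records the root span id is never taken, which matches the symptom of trace rows with no root_span_id.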
Closes #2493
Test Plan
With this change I can query spans.