feat(weave): Refactor otel parsing to standardize fields #4143

zbirenbaum · 2025-04-16T03:31:52Z

Description

Fixes WB-24513
Fixes WB-24514

This refactors the OTel parsing logic so that docs are simple to generate and keep in line with the parser code. The parser now creates a set of standard fields for each of input, output, summary, and attributes. Each field is assigned a list of candidate keys, which are checked in order until a valid entry is found or the candidates are exhausted.
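The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FIELD_KEYS`, `resolve_field`, and `parse_fields` are hypothetical names, only the two `system` keys come from the diff under review, and treating `None` and the empty string as "not valid" is an assumption.

```python
from typing import Any, Optional

# Hypothetical mapping: each standard field lists candidate span-attribute
# keys in priority order. Only the two "system" keys appear in this PR's diff.
FIELD_KEYS: dict[str, list[str]] = {
    "system": ["gen_ai.system", "llm.system"],
}


def resolve_field(attributes: dict[str, Any], keys: list[str]) -> Optional[Any]:
    """Return the first non-empty value found under any of the candidate keys."""
    for key in keys:
        value = attributes.get(key)
        # Assumption: None and "" count as invalid entries.
        if value is not None and value != "":
            return value
    return None  # all candidate keys exhausted


def parse_fields(attributes: dict[str, Any]) -> dict[str, Any]:
    """Build the standard fields, skipping any whose keys are all missing."""
    parsed = {}
    for field, keys in FIELD_KEYS.items():
        value = resolve_field(attributes, keys)
        if value is not None:
            parsed[field] = value
    return parsed
```

Keeping the mapping as plain data is what would make it straightforward to generate documentation from the same table the parser reads.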

Also includes a cost calculation fix.

PRs into master which should be reviewed and merged before reviewing this:

fix(weave): Fix trace export error when op_name is too long #4177

PRs into this branch which should be reviewed and merged before reviewing this:

feat(weave): use otel attribute keys as input and output subfields #4181

circle-job-mirror · 2025-04-16T03:36:40Z

Preview this PR with FeatureBee: https://beta.wandb.ai/?betaVersion=c61c2e8ae492c3eaa051af5426c61b96c1c8c70c

parambharat · 2025-04-17T07:29:13Z

weave/trace_server/opentelemetry/python_spans.py


-        # Options: set
        start_call = tsi.StartedCallSchemaForInsert(
            project_id=project_id,


@zacharyblasczyk : I came across this PR while implementing the pydantic_ai otel integration. One quick request: would it be possible to set the display_name: Optional[str] = None in StartedCallSchemaForInsert? We could map it to an attribute like weave.display_name from the attributes dict of the span. That would give the calls nice display names in the UI.

zacharyblasczyk · 2025-04-17

I assume you meant @zbirenbaum.

zbirenbaum · 2025-04-17

@parambharat

That's possible, but I think the better route would be to set the display name from the attribute and fall back to the field we currently use to determine the name when the attribute isn't there. Otherwise, traces that have a perfectly fine name to use would end up unnamed whenever that attribute isn't set.

~~Would that satisfy your use case?~~

I reviewed it again and saw what you mean; I had op_name and display_name confused. Display name isn't being set at all right now. I'll modify it to try to load the name from attributes. Nice catch.
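A minimal sketch of loading the display name from span attributes, under stated assumptions: the `weave.display_name` key is the one suggested in this thread, `get_display_name` is a hypothetical helper, and the change that actually landed may differ.

```python
from typing import Any, Optional

# Assumed attribute key, taken from the suggestion in this thread.
DISPLAY_NAME_KEY = "weave.display_name"


def get_display_name(attributes: dict[str, Any]) -> Optional[str]:
    """Return the span's display name if it is a non-empty string, else None."""
    value = attributes.get(DISPLAY_NAME_KEY)
    if isinstance(value, str) and value.strip():
        return value
    return None
```

The result could then be passed as `display_name` when building the call, with `None` leaving the call without a custom display name while op_name is still derived as before.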


ayulockin · 2025-04-21T13:05:20Z

Hey @zbirenbaum and @tssweeney, when are we anticipating merging this PR and deploying the trace server with the updates?

zbirenbaum · 2025-04-21T15:31:38Z

> Hey @zbirenbaum and @tssweeney, when are we anticipating merging this PR and deploying the trace server with the updates?

Ideally it would have been merged on Friday, but it got stuck in CI because of a change in the dspy integration tests. It should be merged today.

vanpelt · 2025-04-21T16:43:29Z

weave/trace_server/opentelemetry/constants.py

+    # System prompt/instructions
+    "system": [
+        "gen_ai.system",  # OpenTelemetry AI
+        "llm.system",  # OpenInference


Likely not a big deal, but in our Input / Output keys OpenInference comes before OT AI.

zbirenbaum · 2025-04-21

I'll switch it real quick for consistency, but it shouldn't actually make a difference here: the same information is recorded under each key, so it will work properly even if a span has both.

vanpelt · 2025-04-21T16:47:32Z

weave/trace_server/opentelemetry/helpers.py

+            for k in value.__dataclass_fields__
+        }
+    else:
+        raise ValueError(f"Unsupported type for JSON serialization: {type(value)}")


Not sure how common any of these would be, but I went ahead and asked ChatGPT for other common Python native types we might encounter when attempting to serialize. I know that in our wandb SDK, things like Infinity or NaN would bite us from time to time.

```python
import base64
import math
from datetime import date, datetime, time, timedelta
from decimal import Decimal
from enum import Enum
from typing import Any
from uuid import UUID


def to_json_serializable(value: Any) -> Any:
    """
    Transform common data types into JSON-serializable values.
    """
    if value is None:
        return None
    elif isinstance(value, (str, bool)):
        return value
    elif isinstance(value, int):
        return value
    elif isinstance(value, float):
        # Handle special floats: NaN, inf, -inf
        if math.isnan(value) or math.isinf(value):
            return str(value)
        return value
    elif isinstance(value, (list, tuple)):
        return [to_json_serializable(item) for item in value]
    elif isinstance(value, dict):
        return {str(k): to_json_serializable(v) for k, v in value.items()}
    elif isinstance(value, datetime):
        return value.isoformat()
    elif isinstance(value, date):
        # date without time
        return value.isoformat()
    elif isinstance(value, time):
        # time without date
        return value.isoformat()
    elif isinstance(value, timedelta):
        return value.total_seconds()
    elif isinstance(value, UUID):
        return str(value)
    elif isinstance(value, Enum):
        return value.value
    elif isinstance(value, Decimal):
        # Convert Decimal to float or str, depending on requirements.
        return float(value)
    elif isinstance(value, (set, frozenset)):
        return [to_json_serializable(item) for item in value]
    elif isinstance(value, complex):
        return {"real": value.real, "imag": value.imag}
    elif isinstance(value, (bytes, bytearray)):
        return base64.b64encode(value).decode('ascii')
    elif hasattr(value, "__dataclass_fields__"):
        return {
            k: to_json_serializable(getattr(value, k))
            for k in value.__dataclass_fields__
        }
    else:
        raise ValueError(f"Unsupported type for JSON serialization: {type(value)}")
```

zbirenbaum · 2025-04-21

```python
elif isinstance(value, (bytes, bytearray)):
    return base64.b64encode(value).decode('ascii')
```

Thanks for pointing those out; this should make it a bit more robust. One question: is this the way we want to handle bytes values?
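To make the two concerns in this thread concrete, here is a small standard-library sketch showing why non-finite floats need special handling and what the base64 choice for bytes implies. `encode_special` is a hypothetical helper that mirrors only those two branches of the suggestion above; it is not the function in `helpers.py`.

```python
import base64
import json
import math


def encode_special(value):
    """Stringify non-finite floats and base64-encode bytes; pass other values through."""
    if isinstance(value, float) and (math.isnan(value) or math.isinf(value)):
        return str(value)
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(value).decode("ascii")
    return value


# Strict JSON has no NaN literal, so json.dumps rejects it when allow_nan=False.
try:
    json.dumps(float("nan"), allow_nan=False)
    nan_rejected = False
except ValueError:
    nan_rejected = True

# After encoding, both values serialize as ordinary JSON strings.
payload = json.dumps(
    {"score": encode_special(float("inf")), "blob": encode_special(b"hi")}
)
```

One trade-off of the base64 route is that the encoded value is indistinguishable from any other string, so a consumer has to know out of band that the field holds base64 data before calling `base64.b64decode` on it.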

zbirenbaum · 2025-04-21T19:54:19Z

/cherry-pick server-release-0.68.0

zbirenbaum · 2025-04-21T19:57:36Z

/cherry-pick server-release-0.68.x

zbirenbaum · 2025-04-21T20:16:33Z

/cherry-pick server-release-0.68.x

zbirenbaum added 11 commits April 11, 2025 23:45

remove custom processing of openinference

e2d88fb

add convention attribute for frontend rendering

e0998ba

revert adding convention

2c96577

Load json strings and parse dict to array

2263491

revert trace server changes

158d927

fix tests to work with new parsing

a4d6895

working except attributes

ad11e73

iterative parsing working

caaa6da

use actual keys

89d178b

feat(weave): simplify and standardize otel parsing to specific fields

37f69d0

lint

abf476b

zbirenbaum requested a review from a team as a code owner April 16, 2025 03:31

remove prints

8e05459

zbirenbaum and others added 6 commits April 15, 2025 21:05

Merge branch 'master' into zach/otel-parsing-rewrite

728e231

remove incomple implementations

b62ff60

lint

0bdc352

fix cost missing in traces

6573028

chore(weave): Try load all strings starting with valid chars as json

0a113ea

move key mapping to seperate file for clarity

8b09ebc

parambharat reviewed Apr 17, 2025


ayulockin mentioned this pull request Apr 17, 2025

Add W&B Weave tracing NVIDIA/NeMo-Agent-Toolkit#135

Closed

2 tasks

zbirenbaum added 4 commits April 17, 2025 17:54

fix(weave): Fix trace export error when op_name is too long

33428d8

lint

7aecd23

merge zach/fix-long-op-name

8f61e94

lint

fa90b6f

zbirenbaum mentioned this pull request Apr 18, 2025

feat(weave): use otel attribute keys as input and output subfields #4181

Merged

zbirenbaum added 2 commits April 18, 2025 02:12

fix bug in shorten_name

92734cb

Merge branch 'zach/fix-long-op-name' into zach/otel-parsing-rewrite

e62646c

zbirenbaum mentioned this pull request Apr 18, 2025

feat(weave): load json strings and parse dict to array #4134

Closed

zbirenbaum and others added 10 commits April 18, 2025 15:33

add more detailed comment and examples

903bb61

merge zach/fix-long-op-name

7138285

fix failing testcase

ba479ee

lint

769b953

Merge branch 'zach/fix-long-op-name' into zach/otel-parsing-rewrite

09f0dc9

lint

a70a1f9

add additional protection for parse numeric handler

57641e0

reorganize utility functions to helpers.py and place mappings in constants.py

79f032b

group handlers with key, small edge case fixes and formatting

67b9e8a

Merge branch 'master' into zach/otel-parsing-rewrite

cff296f

zbirenbaum added 2 commits April 21, 2025 08:31

Merge branch 'master' into zach/otel-parsing-rewrite

cae6c9c

Merge branch 'master' into zach/otel-parsing-rewrite

04aac9a

vanpelt reviewed Apr 21, 2025


zbirenbaum and others added 8 commits April 21, 2025 09:47

prefer otel keys

bf37597

additional serialization handling

6fab565

resolve merge conflicts

39be707

lint

4c1c6a1

lint and add tests for new to_json_serializable types

d624fbb

Merge branch 'master' into zach/otel-parsing-rewrite

562093e

remove numpy and lint

e4d7fb1

remove numpy test

e9cf402

zbirenbaum merged commit c5774b2 into master Apr 21, 2025
140 checks passed

zbirenbaum deleted the zach/otel-parsing-rewrite branch April 21, 2025 19:43

github-actions bot locked and limited conversation to collaborators Apr 21, 2025


7 participants
