
fix(langchain): spans dictionary memory leak#3216

Merged
nirga merged 6 commits into main from langchain-memory-leak on Aug 3, 2025

Conversation

@nirga (Member) commented Aug 2, 2025

Fixes #2790

  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • (If applicable) I have updated the documentation accordingly.

Important

Fixes memory leak in TraceloopCallbackHandler by deleting spans after use and updates tests for API changes.

  • Bug Fixes:
    • Fix memory leak in TraceloopCallbackHandler by deleting spans after use in _end_span().
    • Ensure duration metrics are recorded before ending spans in on_llm_end().
  • Tests:
    • Mark test_batch_metadata_in_span_attributes and test_async_batch_metadata_in_span_attributes as skipped due to VCR issues in CI.
    • Update test recordings to reflect newer API responses and client versions.
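
The two bug-fix bullets above can be sketched roughly as follows. This is a minimal illustration, not the actual callback_handler.py source: `FakeSpan`, `SpanEntry`, and `HandlerSketch` are assumed stand-ins for the real `TraceloopCallbackHandler` internals, but the shape of the fix (guard child-span lookups, then delete the entry) follows the PR description.

```python
class FakeSpan:
    """Stand-in for an OpenTelemetry span (assumption, not the real API)."""
    def __init__(self):
        self.ended = False

    def end(self):
        self.ended = True


class SpanEntry:
    """Stand-in for the handler's per-run record: a span plus child run ids."""
    def __init__(self, span, children=()):
        self.span = span
        self.children = list(children)


class HandlerSketch:
    def __init__(self):
        self.spans = {}  # run_id -> SpanEntry; this dict was the leak

    def _end_span(self, span, run_id):
        entry = self.spans[run_id]
        for child_id in entry.children:
            # Guard: a child may already have been ended and removed,
            # so check membership before access to avoid a KeyError.
            if child_id in self.spans:
                self.spans[child_id].span.end()
        span.end()
        # The core of the fix: release the entry after use so the
        # dictionary no longer grows for the lifetime of the process.
        del self.spans[run_id]
```

In the sketch, each run's entry is dropped as soon as its span ends, so long-running processes (e.g. langgraph nodes issuing many HTTP calls) no longer accumulate dead entries.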

This description was created by Ellipsis for e21167b.


Summary by CodeRabbit

  • Bug Fixes

    • Improved error handling to prevent potential issues when ending child spans during tracing operations.
    • Ensured metrics are recorded before ending spans for more accurate telemetry data.
  • Tests

    • Temporarily disabled two batch metadata tests due to issues with test recording in continuous integration.
    • Removed related test data files for these tests.

coderabbitai (Bot) commented Aug 2, 2025

Caution: Review failed. The pull request is closed.

Walkthrough

The _end_span method in the LangChain OpenTelemetry instrumentation was updated to prevent key errors by checking for child span existence before access, and to explicitly delete span entries after use. The timing of span ending in on_llm_end was adjusted to ensure metrics are recorded first. Two test cassette YAML files were deleted, and two related tests were skipped due to VCR issues in CI.

Changes

  • Span Management Logic (packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py): improved child span cleanup in _end_span to avoid key errors and ensure memory release; adjusted span ending timing in on_llm_end.
  • Test Cassette Deletions (packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_batch_metadata/test_async_batch_metadata_in_span_attributes.yaml and test_batch_metadata_in_span_attributes.yaml): removed VCR cassette files recording HTTP interactions for batch metadata tests.
  • Test Skips (packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py): marked two batch metadata tests with pytest skip due to VCR malfunction in CI.
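
The test-skip change can be illustrated with the standard pytest marker. This is a sketch, not the contents of the real tests/test_batch_metadata.py: the reason string is paraphrased from the PR, and the test bodies are elided.

```python
import pytest


# Hypothetical stand-ins for the two skipped tests from the PR;
# the skip reason is paraphrased, not copied from the real file.
@pytest.mark.skip(reason="VCR malfunction in CI")
def test_batch_metadata_in_span_attributes():
    ...


@pytest.mark.skip(reason="VCR malfunction in CI")
async def test_async_batch_metadata_in_span_attributes():
    ...
```

A skipped test is collected but never executed, so coverage is preserved in the report while the flaky VCR interaction is avoided until the CI issue is resolved.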

Sequence Diagram(s)

sequenceDiagram
    participant LangChain
    participant CallbackHandler
    participant OpenTelemetry

    LangChain->>CallbackHandler: on_llm_end()
    CallbackHandler->>CallbackHandler: Record duration metric
    CallbackHandler->>CallbackHandler: _end_span(run_id)
    CallbackHandler->>OpenTelemetry: End all child spans (if exist)
    CallbackHandler->>OpenTelemetry: End parent span
    CallbackHandler->>CallbackHandler: Delete span entry for run_id
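
The ordering shown in the diagram (record the duration metric before `_end_span(run_id)`) matters precisely because `_end_span` now deletes the dictionary entry. A minimal sketch, with assumed names (`FakeSpan`, `FakeHistogram`, and the tuple layout of `self.spans` are illustrative, not the real handler's types):

```python
import time


class FakeSpan:
    """Stand-in for an OpenTelemetry span (assumption)."""
    def __init__(self):
        self.ended = False

    def end(self):
        self.ended = True


class FakeHistogram:
    """Stand-in for a duration histogram metric instrument (assumption)."""
    def __init__(self):
        self.values = []

    def record(self, value):
        self.values.append(value)


class HandlerSketch:
    def __init__(self, duration_histogram):
        self.spans = {}  # run_id -> (span, start_time)
        self.duration_histogram = duration_histogram

    def on_llm_end(self, run_id):
        span, start_time = self.spans[run_id]
        # Record the duration FIRST, while self.spans[run_id] still exists...
        self.duration_histogram.record(time.time() - start_time)
        # ...because _end_span deletes the entry as part of the leak fix;
        # reading it afterwards would raise a KeyError.
        self._end_span(span, run_id)

    def _end_span(self, span, run_id):
        span.end()
        del self.spans[run_id]
```

If the two calls were swapped, the metric read would happen after deletion and fail, which is why the PR moves metric recording ahead of span ending.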

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~7 minutes

Assessment against linked issues

  • Prevent memory leak in opentelemetry-instrumentation-langchain by ensuring proper cleanup of spans (#2790)
  • Ensure no memory leak occurs when using traceloop-sdk with HTTP calls inside langgraph nodes (#2790)

Assessment against linked issues: Out-of-scope changes

  • Deletion of VCR cassette files for batch metadata tests (tests/cassettes/test_batch_metadata/*.yaml): removing test cassettes is unrelated to memory leak fixes and not mentioned in the linked issue objectives.
  • Skipping batch metadata tests due to VCR issues in CI (tests/test_batch_metadata.py): marking tests as skipped is unrelated to the memory leak issue and not referenced in the linked issue.

Possibly related PRs

Suggested reviewers

  • doronkopit5

Poem

In the warren where spans would grow,
A memory leak began to show.
With careful checks and cleanup neat,
The handler now admits defeat—
No more leaks, the code runs light,
Tests skipped for now, but soon set right.
🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9dc249c and e21167b.

📒 Files selected for processing (3)
  • packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_batch_metadata/test_async_batch_metadata_in_span_attributes.yaml (0 hunks)
  • packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_batch_metadata/test_batch_metadata_in_span_attributes.yaml (0 hunks)
  • packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py (2 hunks)


@nirga nirga mentioned this pull request Aug 2, 2025
4 tasks
ellipsis-dev (Contributor, Bot) left a comment


Important

Looks good to me! 👍

Reviewed everything up to 8528eb0 in 38 seconds.
  • Reviewed 44 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:190
  • Draft comment:
    Good cleanup: Deleting the span from self.spans here prevents the memory leak. Additionally, checking 'if child_id in self.spans' before ending child spans avoids potential KeyErrors if they've already been removed.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:543
  • Draft comment:
    Reordering in on_llm_end is correct: recording the duration before calling _end_span ensures that self.spans[run_id] is still available for reading, preventing a KeyError after deletion.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_YsoyESAqboNuigIF

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

ellipsis-dev (Contributor, Bot) left a comment


Important

Looks good to me! 👍

Reviewed dba192a in 1 minute and 40 seconds.
  • Reviewed 31 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:180
  • Draft comment:
    Minor formatting cleanup: removed trailing whitespace in the child span check. Since this method is responsible for ending spans, please verify that child spans are also cleaned up (i.e. removed from the spans dict) to avoid lingering references and potential memory leaks.
  • Reason this comment was not posted:
    Comment was on unchanged code.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:550
  • Draft comment:
    Removed an unnecessary blank line before calling _end_span in the LLM end handler. This is a pure formatting change intended to improve readability.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, describing a formatting change that doesn't affect functionality. It doesn't provide a suggestion or raise a concern about the code.

Workflow ID: wflow_BxWPst7SVZjVZK4Y

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

ellipsis-dev (Contributor, Bot) left a comment


Important

Looks good to me! 👍

Reviewed 5c1ce20 in 39 seconds.
  • Reviewed 397 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_async_batch_metadata/test_async_batch_metadata_in_span_attributes.yaml:19
  • Draft comment:
    The updated traceparent (and associated header values, e.g. timestamps and CF-RAY) reflect the changes in span handling. Ensure that any dynamically generated fields are normalized or handled in tests to avoid brittleness in future runs.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
2. packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_batch_metadata/test_batch_metadata_in_span_attributes.yaml:3
  • Draft comment:
    In this cassette, the request and response bodies (including binary encoded span attributes) and several header timings have been updated. These changes are consistent with improved error handling and recording duration metrics before ending spans. Confirm that these cassette updates fully capture the resolved memory leak behavior.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_GE4XfyAiVX86ZFoo

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@nirga force-pushed the langchain-memory-leak branch from 5c1ce20 to 9dc249c on August 3, 2025 at 20:34
ellipsis-dev (Contributor, Bot) left a comment


Important

Looks good to me! 👍

Reviewed 9dc249c in 2 minutes and 56 seconds.
  • Reviewed 8163 lines of code in 18 files
  • Skipped 0 files when reviewing.
  • Skipped posting 8 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py:1005
  • Draft comment:
    In the helper function assert_message_in_logs, you convert log.log_record.body to a dict without checking its type. Consider adding a type check or comment so that future maintainers know what type of object to expect.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py:24
  • Draft comment:
    The tests make good use of pipelines (using the | operator) to compose the chain. Consider adding a brief inline comment explaining that instrument_legacy, instrument_with_content, and instrument_with_no_content are fixtures that toggle the telemetry content to avoid confusion for future readers.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py:44
  • Draft comment:
    When asserting the set of span names, consider using a more explicit approach (e.g., sorted lists) for better diagnostic output in case of mismatch. Consistency between tests that use set equality and those that rely on list equality (e.g. test_invoke) might be beneficial.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py:813
  • Draft comment:
    In test_lcel_with_datetime, the test compares a datetime converted to an ISO string. This is clear, but be aware that any change in the serialization format may break the test. Consider adding a comment that this test validates the expected ISO 8601 format.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py:192
  • Draft comment:
    The expected AI choice event in test_simple_lcel_with_events_with_content is hardcoded with tool call arguments. Ensure that these expected values remain in sync with the actual function conversion behavior in convert_pydantic_to_openai_function.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py:140
  • Draft comment:
    For asynchronous tests using @pytest.mark.asyncio and await invocations, the structure and parent-child span assertions are clear. Consider adding brief comments to note the difference between synchronous and asynchronous pipelines to aid future maintainers.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_batch_metadata/test_batch_metadata_in_span_attributes.yaml:182
  • Draft comment:
    Typo: The 'Transfer-Encoding:' header appears to be missing its value or list item. Please add the appropriate value or remove the header if unintended.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_lcel/test_stream_with_events_with_no_content.yaml:155
  • Draft comment:
    There appears to be an odd concatenation in the streamed text around the tokens 'flaming', 'or', 'os', 'tacos'. When merged, it reads as "flaming or os tacos," which seems unintended. Could you verify if the intended word is "flamingos" (or another phrase)?
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 10% vs. threshold = 50%). The comment suggests an issue with token concatenation, but in context the streamed text is intentional: it describes quirky sock-pattern options, and "flamingos or tacos" reads naturally once the tokens are merged. This is a test cassette file recording actual API responses; the token streaming pattern matches OpenAI's normal behavior and the content is coherent. The apparent "odd concatenation" is simply the intended text, so the draft comment should be deleted.

Workflow ID: wflow_IXbt41kovuf3lGSU

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@nirga force-pushed the langchain-memory-leak branch from 9dc249c to e21167b on August 3, 2025 at 20:57
ellipsis-dev (Contributor, Bot) left a comment


Important

Looks good to me! 👍

Reviewed e21167b in 1 minute and 0 seconds.
  • Reviewed 481 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py:7
  • Draft comment:
    Skip marker: The test is deliberately skipped due to VCR issues in CI. Ensure that this is re-enabled once VCR is fixed to maintain proper test coverage.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None
2. packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py:57
  • Draft comment:
    Skip marker for async test: The async test is also skipped due to VCR issues in CI; make sure to re-enable it once the CI environment is fixed.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None
3. packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py:41
  • Draft comment:
    Consider asserting for a consistent key format for metadata attributes instead of using an OR condition. This dual check may mask naming issues in the span attributes.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
4. packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py:59
  • Draft comment:
    Minor: The docstring in the async test refers to 'abatch calls'. Consider renaming it to 'async batch' for clarity.
  • Reason this comment was not posted:
    Confidence changes required: 20% <= threshold 50% None
5. packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py:59
  • Draft comment:
    Typo found: 'abatch' in the docstring might be a typo. Consider replacing 'abatch' with 'batch'.
  • Reason this comment was not posted:
    Comment was on unchanged code.

Workflow ID: wflow_UvzbLX0IEsD74pRR

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@nirga merged commit 87d8a50 into main on Aug 3, 2025 (8 of 9 checks passed).
@nirga deleted the langchain-memory-leak branch on August 3, 2025 at 21:00.
nina-kollman pushed a commit that referenced this pull request Aug 11, 2025


Development

Successfully merging this pull request may close these issues.

🐛 Bug Report: Memory Leak When Enabled opentelemetry-instrumentation-langchain

2 participants