
fix(langchain): spans dictionary memory leak#3216

Merged
nirga merged 6 commits into main from langchain-memory-leak on Aug 3, 2025

Conversation

@nirga (Member) commented Aug 2, 2025

Fixes #2790

  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • (If applicable) I have updated the documentation accordingly.

Important

Fixes memory leak in TraceloopCallbackHandler by deleting spans after use and updates tests for API changes.

  • Bug Fixes:
    • Fix memory leak in TraceloopCallbackHandler by deleting spans after use in _end_span().
    • Ensure duration metrics are recorded before ending spans in on_llm_end().
  • Tests:
    • Mark test_batch_metadata_in_span_attributes and test_async_batch_metadata_in_span_attributes as skipped due to VCR issues in CI.
    • Update test recordings to reflect newer API responses and client versions.
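
The two bug-fix bullets above can be sketched roughly as follows. This is a minimal illustration, not the actual callback_handler.py source: `FakeSpan`, `SpanEntry`, and `HandlerSketch` are assumed stand-ins for the real `TraceloopCallbackHandler` internals, but the shape of the fix (guard child-span lookups, then delete the entry) follows the PR description.

```python
class FakeSpan:
    """Stand-in for an OpenTelemetry span (assumption, not the real API)."""
    def __init__(self):
        self.ended = False

    def end(self):
        self.ended = True


class SpanEntry:
    """Stand-in for the handler's per-run record: a span plus child run ids."""
    def __init__(self, span, children=()):
        self.span = span
        self.children = list(children)


class HandlerSketch:
    def __init__(self):
        self.spans = {}  # run_id -> SpanEntry; this dict was the leak

    def _end_span(self, span, run_id):
        entry = self.spans[run_id]
        for child_id in entry.children:
            # Guard: a child may already have been ended and removed,
            # so check membership before access to avoid a KeyError.
            if child_id in self.spans:
                self.spans[child_id].span.end()
        span.end()
        # The core of the fix: release the entry after use so the
        # dictionary no longer grows for the lifetime of the process.
        del self.spans[run_id]
```

In the sketch, each run's entry is dropped as soon as its span ends, so long-running processes (e.g. langgraph nodes issuing many HTTP calls) no longer accumulate dead entries.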

This description was created by Ellipsis for e21167b.


Summary by CodeRabbit

  • Bug Fixes

    • Improved error handling to prevent potential issues when ending child spans during tracing operations.
    • Ensured metrics are recorded before ending spans for more accurate telemetry data.
  • Tests

    • Temporarily disabled two batch metadata tests due to issues with test recording in continuous integration.
    • Removed related test data files for these tests.

coderabbitai (Bot) commented Aug 2, 2025

Caution: Review failed. The pull request is closed.

Walkthrough

The _end_span method in the LangChain OpenTelemetry instrumentation was updated to prevent key errors by checking for child span existence before access, and to explicitly delete span entries after use. The timing of span ending in on_llm_end was adjusted to ensure metrics are recorded first. Two test cassette YAML files were deleted, and two related tests were skipped due to VCR issues in CI.

Changes

  • Span Management Logic (packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py): improved child span cleanup in _end_span to avoid key errors and ensure memory release; adjusted span ending timing in on_llm_end.
  • Test Cassette Deletions (packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_batch_metadata/test_async_batch_metadata_in_span_attributes.yaml and test_batch_metadata_in_span_attributes.yaml): removed VCR cassette files recording HTTP interactions for batch metadata tests.
  • Test Skips (packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py): marked two batch metadata tests with pytest skip due to VCR malfunction in CI.
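
The test-skip change can be illustrated with the standard pytest marker. This is a sketch, not the contents of the real tests/test_batch_metadata.py: the reason string is paraphrased from the PR, and the test bodies are elided.

```python
import pytest


# Hypothetical stand-ins for the two skipped tests from the PR;
# the skip reason is paraphrased, not copied from the real file.
@pytest.mark.skip(reason="VCR malfunction in CI")
def test_batch_metadata_in_span_attributes():
    ...


@pytest.mark.skip(reason="VCR malfunction in CI")
async def test_async_batch_metadata_in_span_attributes():
    ...
```

A skipped test is collected but never executed, so coverage is preserved in the report while the flaky VCR interaction is avoided until the CI issue is resolved.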

Sequence Diagram(s)

sequenceDiagram
    participant LangChain
    participant CallbackHandler
    participant OpenTelemetry

    LangChain->>CallbackHandler: on_llm_end()
    CallbackHandler->>CallbackHandler: Record duration metric
    CallbackHandler->>CallbackHandler: _end_span(run_id)
    CallbackHandler->>OpenTelemetry: End all child spans (if exist)
    CallbackHandler->>OpenTelemetry: End parent span
    CallbackHandler->>CallbackHandler: Delete span entry for run_id
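
The ordering shown in the diagram (record the duration metric before `_end_span(run_id)`) matters precisely because `_end_span` now deletes the dictionary entry. A minimal sketch, with assumed names (`FakeSpan`, `FakeHistogram`, and the tuple layout of `self.spans` are illustrative, not the real handler's types):

```python
import time


class FakeSpan:
    """Stand-in for an OpenTelemetry span (assumption)."""
    def __init__(self):
        self.ended = False

    def end(self):
        self.ended = True


class FakeHistogram:
    """Stand-in for a duration histogram metric instrument (assumption)."""
    def __init__(self):
        self.values = []

    def record(self, value):
        self.values.append(value)


class HandlerSketch:
    def __init__(self, duration_histogram):
        self.spans = {}  # run_id -> (span, start_time)
        self.duration_histogram = duration_histogram

    def on_llm_end(self, run_id):
        span, start_time = self.spans[run_id]
        # Record the duration FIRST, while self.spans[run_id] still exists...
        self.duration_histogram.record(time.time() - start_time)
        # ...because _end_span deletes the entry as part of the leak fix;
        # reading it afterwards would raise a KeyError.
        self._end_span(span, run_id)

    def _end_span(self, span, run_id):
        span.end()
        del self.spans[run_id]
```

If the two calls were swapped, the metric read would happen after deletion and fail, which is why the PR moves metric recording ahead of span ending.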

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~7 minutes

Assessment against linked issues

  • Prevent memory leak in opentelemetry-instrumentation-langchain by ensuring proper cleanup of spans (#2790)
  • Ensure no memory leak occurs when using traceloop-sdk with HTTP calls inside langgraph nodes (#2790)

Assessment against linked issues: Out-of-scope changes

  • Deletion of VCR cassette files for batch metadata tests (tests/cassettes/test_batch_metadata/*.yaml): removing test cassettes is unrelated to memory leak fixes and not mentioned in the linked issue objectives.
  • Skipping batch metadata tests due to VCR issues in CI (tests/test_batch_metadata.py): marking tests as skipped is unrelated to the memory leak issue and not referenced in the linked issue.

Possibly related PRs

Suggested reviewers

  • doronkopit5

Poem

In the warren where spans would grow,
A memory leak began to show.
With careful checks and cleanup neat,
The handler now admits defeat—
No more leaks, the code runs light,
Tests skipped for now, but soon set right.
🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9dc249c and e21167b.

📒 Files selected for processing (3)
  • packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_batch_metadata/test_async_batch_metadata_in_span_attributes.yaml (0 hunks)
  • packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_batch_metadata/test_batch_metadata_in_span_attributes.yaml (0 hunks)
  • packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py (2 hunks)


@nirga nirga mentioned this pull request Aug 2, 2025
4 tasks
ellipsis-dev (Contributor, Bot) left a comment


Important

Looks good to me! 👍

Reviewed everything up to 8528eb0 in 38 seconds.
  • Reviewed 44 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:190
  • Draft comment:
    Good cleanup: Deleting the span from self.spans here prevents the memory leak. Additionally, checking 'if child_id in self.spans' before ending child spans avoids potential KeyErrors if they've already been removed.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:543
  • Draft comment:
    Reordering in on_llm_end is correct: recording the duration before calling _end_span ensures that self.spans[run_id] is still available for reading, preventing a KeyError after deletion.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_YsoyESAqboNuigIF

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

ellipsis-dev (Contributor, Bot) left a comment


Important

Looks good to me! 👍

Reviewed dba192a in 1 minute and 40 seconds.
  • Reviewed 31 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:180
  • Draft comment:
    Minor formatting cleanup: removed trailing whitespace in the child span check. Since this method is responsible for ending spans, please verify that child spans are also cleaned up (i.e. removed from the spans dict) to avoid lingering references and potential memory leaks.
  • Reason this comment was not posted:
    Comment was on unchanged code.
2. packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py:550
  • Draft comment:
    Removed an unnecessary blank line before calling _end_span in the LLM end handler. This is a pure formatting change intended to improve readability.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, describing a formatting change that doesn't affect functionality. It doesn't provide a suggestion or raise a concern about the code.

Workflow ID: wflow_BxWPst7SVZjVZK4Y

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

ellipsis-dev (Contributor, Bot) left a comment


Important

Looks good to me! 👍

Reviewed 5c1ce20 in 39 seconds.
  • Reviewed 397 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_async_batch_metadata/test_async_batch_metadata_in_span_attributes.yaml:19
  • Draft comment:
    The updated traceparent (and associated header values, e.g. timestamps and CF-RAY) reflect the changes in span handling. Ensure that any dynamically generated fields are normalized or handled in tests to avoid brittleness in future runs.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
2. packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_batch_metadata/test_batch_metadata_in_span_attributes.yaml:3
  • Draft comment:
    In this cassette, the request and response bodies (including binary encoded span attributes) and several header timings have been updated. These changes are consistent with improved error handling and recording duration metrics before ending spans. Confirm that these cassette updates fully capture the resolved memory leak behavior.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_GE4XfyAiVX86ZFoo

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@nirga force-pushed the langchain-memory-leak branch from 5c1ce20 to 9dc249c on August 3, 2025 at 20:34
ellipsis-dev (Contributor, Bot) left a comment


Important

Looks good to me! 👍

Reviewed 9dc249c in 2 minutes and 56 seconds.
  • Reviewed 8163 lines of code in 18 files
  • Skipped 0 files when reviewing.
  • Skipped posting 8 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py:1005
  • Draft comment:
    In the helper function assert_message_in_logs, you convert log.log_record.body to a dict without checking its type. Consider adding a type check or comment so that future maintainers know what type of object to expect.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py:24
  • Draft comment:
    The tests make good use of pipelines (using the | operator) to compose the chain. Consider adding a brief inline comment explaining that instrument_legacy, instrument_with_content, and instrument_with_no_content are fixtures that toggle the telemetry content to avoid confusion for future readers.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py:44
  • Draft comment:
    When asserting the set of span names, consider using a more explicit approach (e.g., sorted lists) for better diagnostic output in case of mismatch. Consistency between tests that use set equality and those that rely on list equality (e.g. test_invoke) might be beneficial.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py:813
  • Draft comment:
    In test_lcel_with_datetime, the test compares a datetime converted to an ISO string. This is clear, but be aware that any change in the serialization format may break the test. Consider adding a comment that this test validates the expected ISO 8601 format.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py:192
  • Draft comment:
    The expected AI choice event in test_simple_lcel_with_events_with_content is hardcoded with tool call arguments. Ensure that these expected values remain in sync with the actual function conversion behavior in convert_pydantic_to_openai_function.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py:140
  • Draft comment:
    For asynchronous tests using @pytest.mark.asyncio and await invocations, the structure and parent-child span assertions are clear. Consider adding brief comments to note the difference between synchronous and asynchronous pipelines to aid future maintainers.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_batch_metadata/test_batch_metadata_in_span_attributes.yaml:182
  • Draft comment:
    Typo: The 'Transfer-Encoding:' header appears to be missing its value or list item. Please add the appropriate value or remove the header if unintended.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. packages/opentelemetry-instrumentation-langchain/tests/cassettes/test_lcel/test_stream_with_events_with_no_content.yaml:155
  • Draft comment:
    There appears to be an odd concatenation in the streamed text around the tokens 'flaming', 'or', 'os', 'tacos'. When merged, it reads as "flaming or os tacos," which seems unintended. Could you verify if the intended word is "flamingos" (or another phrase)?
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 10% vs. threshold = 50%). The comment suggests an issue with token concatenation, but in context the streamed text is intentional: it describes quirky sock-pattern options, and "flamingos or tacos" reads naturally once the tokens are merged. This is a test cassette file recording actual API responses; the token streaming pattern matches OpenAI's normal behavior and the content is coherent. The apparent "odd concatenation" is simply the intended text, so the draft comment should be deleted.

Workflow ID: wflow_IXbt41kovuf3lGSU

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@nirga force-pushed the langchain-memory-leak branch from 9dc249c to e21167b on August 3, 2025 at 20:57
ellipsis-dev (Contributor, Bot) left a comment


Important

Looks good to me! 👍

Reviewed e21167b in 1 minute and 0 seconds.
  • Reviewed 481 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py:7
  • Draft comment:
    Skip marker: The test is deliberately skipped due to VCR issues in CI. Ensure that this is re-enabled once VCR is fixed to maintain proper test coverage.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None
2. packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py:57
  • Draft comment:
    Skip marker for async test: The async test is also skipped due to VCR issues in CI; make sure to re-enable it once the CI environment is fixed.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None
3. packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py:41
  • Draft comment:
    Consider asserting for a consistent key format for metadata attributes instead of using an OR condition. This dual check may mask naming issues in the span attributes.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
4. packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py:59
  • Draft comment:
    Minor: The docstring in the async test refers to 'abatch calls'. Consider renaming it to 'async batch' for clarity.
  • Reason this comment was not posted:
    Confidence changes required: 20% <= threshold 50% None
5. packages/opentelemetry-instrumentation-langchain/tests/test_batch_metadata.py:59
  • Draft comment:
    Typo found: 'abatch' in the docstring might be a typo. Consider replacing 'abatch' with 'batch'.
  • Reason this comment was not posted:
    Comment was on unchanged code.

Workflow ID: wflow_UvzbLX0IEsD74pRR

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@nirga merged commit 87d8a50 into main on Aug 3, 2025 (8 of 9 checks passed).
@nirga deleted the langchain-memory-leak branch on August 3, 2025 at 21:00.
nina-kollman pushed a commit that referenced this pull request Aug 11, 2025


Development

Successfully merging this pull request may close these issues.

🐛 Bug Report: Memory Leak When Enabled opentelemetry-instrumentation-langchain

2 participants