Skip to content

Conversation

navin772
Copy link
Member

@navin772 navin772 commented Oct 15, 2025

User description

🔗 Related Issues

Addresses remaining methods for response handlers for python - #13993

💥 What does this PR do?

  1. Adds network response handlers:
    • add_response_handler
    • remove_response_handler
    • clear_response_handlers
  2. Adds continue_response method.
  3. Adds network data collector support
    • add_data_collector
    • remove_data_collector
    • get_data

🔧 Implementation Notes

Initially I started out implementing the data collector feature only, but during testing response handlers were required, thus this PR grew bigger.

💡 Additional Considerations

🔄 Types of changes

  • New feature (non-breaking change which adds functionality and tests!)

PR Type

Enhancement, Tests


Description

  • Adds network response handlers (add_response_handler, remove_response_handler, clear_response_handlers) for BiDi protocol

  • Implements continue_response method for modifying intercepted responses

  • Adds network data collector support (add_data_collector, remove_data_collector, get_data)

  • Introduces BytesValue, StringValue, and Base64Value classes for handling network data encoding

  • Adds comprehensive test coverage for response handlers, data collectors, and their integration


Diagram Walkthrough

flowchart LR
  A["Network class"] --> B["Response handlers"]
  A --> C["Data collectors"]
  B --> D["Response class"]
  D --> E["continue_response()"]
  C --> F["BytesValue encoding"]
  F --> G["StringValue/Base64Value"]
Loading

File Walkthrough

Relevant files
Enhancement
network.py
Add response handlers, data collectors, and encoding classes

py/selenium/webdriver/common/bidi/network.py

  • Adds BytesValue, StringValue, and Base64Value classes for network data
    encoding
  • Implements response handler methods (add_response_handler,
    remove_response_handler, clear_response_handlers)
  • Adds data collector methods (add_data_collector,
    remove_data_collector, get_data)
  • Introduces Response class with continue_response method for response
    interception
  • Refactors clear_request_handlers to properly clean up subscriptions
+398/-6 
Tests
bidi_network_tests.py
Add comprehensive tests for response handlers and data collectors

py/test/selenium/webdriver/common/bidi_network_tests.py

  • Adds tests for data collector lifecycle (add, remove)
  • Adds tests for response handler lifecycle and event capture
  • Adds integration tests for response handlers with data collection
  • Tests response interception with continue_response method
  • Tests URL pattern filtering for response handlers
+200/-1 

@selenium-ci selenium-ci added C-py Python Bindings B-devtools Includes everything BiDi or Chrome DevTools related labels Oct 15, 2025
Copy link
Contributor

qodo-merge-pro bot commented Oct 15, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Potential DoS via memory

Description: The get_data method returns raw network body bytes without enforcing size limits or
content-type checks, risking excessive memory usage if a large response is fetched or
unexpectedly large data is stored.
network.py [574-589]

Referred Code
if disown and collector_id is None:
    raise ValueError("Cannot disown data without specifying a collector")

params = {
    "dataType": data_type,
    "request": request_id,
    "disown": disown,
}

if collector_id is not None:
    params["collector"] = collector_id

cmd = command_builder("network.getData", params)
result = self.conn.execute(cmd)
return BytesValue.from_dict(result["bytes"])
Sensitive data exposure

Description: Tests assert on collected response data that may contain HTML content; if logs or reports
persist raw data this could expose sensitive content, suggesting need to sanitize or limit
stored data in testing outputs.
bidi_network_tests.py [196-233]

Referred Code
    time.sleep(2)

    assert len(responses) > 0

    response = responses[0]
    assert response.request_id is not None
    assert "formPage.html" in response.url
    assert response.status_code is not None

    driver.network.remove_response_handler("response_completed", callback_id)


# Integrated Tests: Response Handlers + Data Collection
def test_data_collection_with_response_handler(driver, pages):
    captured_responses = []
    collected_data = []

    # Add a data collector
    collector_id = driver.network.add_data_collector(data_types=["response"], max_encoded_data_size=50000)

    def response_callback(response: Response):


 ... (clipped 17 lines)
Ticket Compliance
🟡
🎫 #13993
🟢 Implement response handlers: addResponseHandler, removeResponseHandler,
clearResponseHandlers.
Support continue response functionality to modify intercepted responses.
Consider returning IDs from add methods to ease removal.
Provide consistent API patterns (Python uses snake_case).
Expose high-level BiDi network methods from driver.network across bindings.
Maintain request/authentication handlers parity already implemented in other languages.
🟡
🎫 #1234
🟡
🎫 #5678
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
No custom compliance provided

Follow the guide to enable custom compliance check.

  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

Copy link
Contributor

qodo-merge-pro bot commented Oct 15, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix incorrect dictionary key usage

Remove the redundant and incorrect code in _on_response that corrupts the
self.callbacks dictionary by using event_name instead of callback_id as the key.

py/selenium/webdriver/common/bidi/network.py [451-454]

-if event_name in self.callbacks:
-    self.callbacks[event_name].append(callback_id)
-else:
-    self.callbacks[event_name] = [callback_id]
+# This block should be removed.
  • Apply / Chat
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a critical bug where self.callbacks is corrupted, which would break functionality in other methods like remove_response_handler.

High
Unsubscribe from events when clearing handlers

In clear_request_handlers, send a session.unsubscribe command for each event
after removing its local handlers to prevent resource leaks from the browser.

py/selenium/webdriver/common/bidi/network.py [305-312]

-for event_name in self.subscriptions:
+events_to_unsubscribe = list(self.subscriptions.keys())
+for event_name in events_to_unsubscribe:
     for callback_id in self.subscriptions[event_name].copy():
         net_event = NetworkEvent(event_name)
         self.conn.remove_callback(net_event, callback_id)
         if callback_id in self.callbacks:
             self._remove_intercept(self.callbacks[callback_id])
             del self.callbacks[callback_id]
-self.subscriptions = {}
+    self.conn.execute(command_builder("session.unsubscribe", {"events": [event_name]}))
+self.subscriptions.clear()
  • Apply / Chat
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly points out that the refactored clear_request_handlers method fails to unsubscribe from browser events, which is a regression that causes resource leaks.

Medium
Unsubscribe from events when removing handlers

In remove_response_handler, send a session.unsubscribe command when the last
callback for an event is removed to prevent the browser from sending unnecessary
data.

py/selenium/webdriver/common/bidi/network.py [388-391]

 if event_name in self.subscriptions:
     self.subscriptions[event_name].remove(callback_id)
     if len(self.subscriptions[event_name]) == 0:
         del self.subscriptions[event_name]
+        self.conn.execute(command_builder("session.unsubscribe", {"events": [event_name]}))
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies a resource leak where the client does not unsubscribe from a browser event after the last handler is removed, improving resource management.

Medium
High-level
Simplify the data retrieval API

The get_data method should be simplified to return raw bytes instead of the
complex BytesValue object. This change would create a more intuitive API by
hiding the protocol's internal encoding details from the user.

Examples:

py/selenium/webdriver/common/bidi/network.py [554-588]
    def get_data(self, data_type: str, request_id: str, collector_id: str = None, disown: bool = False):
        """Retrieve network data for a request.

        Parameters:
        ----------
            data_type (str): The type of data to retrieve (e.g., "response").
            request_id (str): The request id to get data for.
            collector_id (str, optional): The collector id to use.
                Default is None.
            disown (bool, optional): Whether to disown the data from the collector.

 ... (clipped 25 lines)
py/test/selenium/webdriver/common/bidi_network_tests.py [218-231]
        data = driver.network.get_data("response", response.request_id, collector_id)
        collected_data.append({"request_id": response.request_id, "url": response.url, "data": data})

    # Add response handler
    handler_id = driver.network.add_response_handler("response_completed", response_callback)

    url = pages.url("formPage.html")
    driver.browsing_context.navigate(context=driver.current_window_handle, url=url, wait=ReadinessState.COMPLETE)

    # Wait for responses

 ... (clipped 4 lines)

Solution Walkthrough:

Before:

# In network.py
class Network:
    def get_data(...) -> BytesValue:
        # ...
        result = self.conn.execute(...)
        return BytesValue.from_dict(result["bytes"])

# In user code (e.g., tests)
def response_callback(response: Response):
    # User receives a complex BytesValue object
    data_object = driver.network.get_data(...)
    
    # User must know how to extract the actual data
    body_string = data_object.value
    body_bytes = data_object.to_bytes()
    assert "<title>We Leave From Here</title>" in body_string

After:

# In network.py
class Network:
    def get_data(...) -> bytes:
        # ...
        result = self.conn.execute(...)
        # The conversion happens inside the method
        bytes_value = BytesValue.from_dict(result["bytes"])
        return bytes_value.to_bytes()

# In user code (e.g., tests)
def response_callback(response: Response):
    # User receives raw bytes directly
    body_bytes = driver.network.get_data(...)
    
    # API is simpler and more intuitive
    assert b"<title>We Leave From Here</title>" in body_bytes
Suggestion importance[1-10]: 8

__

Why: This is a strong API design suggestion that improves usability by abstracting away underlying protocol details, which is a key goal for a high-level library.

Medium
Learned
best practice
Add safe removal guards

Guard removals by checking membership and normalize error types. Use membership
checks before list.remove and raise ValueError for unknown events.

py/selenium/webdriver/common/bidi/network.py [366-391]

 def remove_response_handler(self, event, callback_id):
-    """Remove a response handler from the network.
-
-    Parameters:
-    ----------
-        event (str): The event to unsubscribe from.
-        callback_id (int): The callback id to remove.
-    """
     try:
         event_name = self.EVENTS[event]
     except KeyError:
-        raise Exception(f"Event {event} not found")
+        raise ValueError(f"Unknown event: {event}")
 
     net_event = NetworkEvent(event_name)
-
     self.conn.remove_callback(net_event, callback_id)
 
-    # Remove intercept if it was created for this handler
-    if callback_id in self.callbacks and self.callbacks[callback_id] is not None:
-        self._remove_intercept(self.callbacks[callback_id])
-        del self.callbacks[callback_id]
+    intercept_id = self.callbacks.pop(callback_id, None)
+    if intercept_id:
+        self._remove_intercept(intercept_id)
 
-    if event_name in self.subscriptions:
+    if event_name in self.subscriptions and callback_id in self.subscriptions[event_name]:
         self.subscriptions[event_name].remove(callback_id)
-        if len(self.subscriptions[event_name]) == 0:
+        if not self.subscriptions[event_name]:
             del self.subscriptions[event_name]
  • Apply / Chat
Suggestion importance[1-10]: 6

__

Why:
Relevant best practice - Validate inputs and states early with precise checks to prevent logic errors, including guarding removal operations and mapping lookups.

Low
Add try/finally for cleanup

Ensure handlers and collectors are cleaned up even if assertions fail by
wrapping registration in try/finally. This prevents leaks affecting subsequent
tests.

py/test/selenium/webdriver/common/bidi_network_tests.py [209-234]

 def test_data_collection_with_response_handler(driver, pages):
     captured_responses = []
     collected_data = []
 
-    # Add a data collector
     collector_id = driver.network.add_data_collector(data_types=["response"], max_encoded_data_size=50000)
 
     def response_callback(response: Response):
         captured_responses.append(response)
         data = driver.network.get_data("response", response.request_id, collector_id)
         collected_data.append({"request_id": response.request_id, "url": response.url, "data": data})
 
-    # Add response handler
     handler_id = driver.network.add_response_handler("response_completed", response_callback)
+    try:
+        url = pages.url("formPage.html")
+        driver.browsing_context.navigate(context=driver.current_window_handle, url=url, wait=ReadinessState.COMPLETE)
+        time.sleep(2)
 
-    url = pages.url("formPage.html")
-    driver.browsing_context.navigate(context=driver.current_window_handle, url=url, wait=ReadinessState.COMPLETE)
+        assert len(captured_responses) > 0, "No responses captured"
+        assert "<title>We Leave From Here</title>" in collected_data[0]["data"].value
+    finally:
+        driver.network.remove_response_handler("response_completed", handler_id)
+        driver.network.remove_data_collector(collector_id)
 
-    # Wait for responses
-    time.sleep(2)
-
-    assert len(captured_responses) > 0, "No responses captured"
-    assert "<title>We Leave From Here</title>" in collected_data[0]["data"].value
-
-    driver.network.remove_response_handler("response_completed", handler_id)
-    driver.network.remove_data_collector(collector_id)
-
  • Apply / Chat
Suggestion importance[1-10]: 5

__

Why:
Relevant best practice - Enforce deterministic and safe resource handling in tests by ensuring cleanup of handlers and collectors even on failure using try/finally.

Low
  • Update

@navin772 navin772 requested review from cgoldberg and shbenzer and removed request for cgoldberg October 16, 2025 10:47
@titusfortner
Copy link
Member

I think data collectors should be managed for the user with the handler registration, let's discuss more.

@nvborisenko
Copy link
Member

It is low-level api, we should treat the spec as is.

@titusfortner
Copy link
Member

In general we'll want a high level network class accessible directly from the driver, and a network module implementation class, in a bidi namespace, likely(?) private API. I think my issue here is that this class includes implementation details that need to be one level deeper. Everything calling out to sockets or converting things to spec syntax should be one level lower.

First we must agree on the high level API, then we can figure out how to split things.

Copy link
Contributor

@shbenzer shbenzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @navin772

request_id = request_data.get("request")

# Create a Response object with the response data and request ID
response = Response(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need/want any of the other fields from the Bidi spec for Response object:

network.ResponseData = {
url: text,
protocol: text,
status: js-uint,
statusText: text,
fromCache: bool,
headers: [network.Header],
mimeType: text,
bytesReceived: js-uint,
headersSize: js-uint / null,
bodySize: js-uint / null,
content: network.ResponseContent,
?authChallenges: [
network.AuthChallenge],
}

https://www.w3.org/TR/webdriver-bidi/#type-network-ResponseData

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want every field. Is there any reason we wouldn't?

@shbenzer
Copy link
Contributor

Looks like this is your test failure

bidi_network_tests.py::test_data_collection_with_response_handler[edge] FAILED [ 85%]
_______________ test_data_collection_with_response_handler[edge] _______________

driver = <selenium.webdriver.edge.webdriver.WebDriver (session="f4c7eb792fcbb27cbdd7711f32e6a65d")>
pages = <conftest.pages.<locals>.Pages object at 0x7fa8d1d30ca0>

    def test_data_collection_with_response_handler(driver, pages):
        captured_responses = []
        collected_data = []
    
        # Add a data collector
        collector_id = driver.network.add_data_collector(data_types=["response"], max_encoded_data_size=50000)
    
        def response_callback(response: Response):
            captured_responses.append(response)
            data = driver.network.get_data("response", response.request_id, collector_id)
            collected_data.append({"request_id": response.request_id, "url": response.url, "data": data})
    
        # Add response handler
        handler_id = driver.network.add_response_handler("response_completed", response_callback)
    
        url = pages.url("formPage.html")
        driver.browsing_context.navigate(context=driver.current_window_handle, url=url, wait=ReadinessState.COMPLETE)
    
        # Wait for responses
        time.sleep(2)
    
        assert len(captured_responses) > 0, "No responses captured"
>       assert "<title>We Leave From Here</title>" in collected_data[0]["data"].value
E       AssertionError: assert '<title>We Leave From Here</title>' in ''
E        +  where '' = <selenium.webdriver.common.bidi.network.BytesValue object at 0x7fa8d1d32290>.value

py/test/selenium/webdriver/common/bidi_network_tests.py:258: AssertionError

@cgoldberg
Copy link
Member

@navin772 this is great work. I'll review it more closely soon. I am planning on building functionality for capturing HAR files, and this will be a key piece to enable that.

http://www.softwareishard.com/blog/har-12-spec

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

B-devtools Includes everything BiDi or Chrome DevTools related C-py Python Bindings Review effort 3/5

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants