[🐛 Bug]: Port collision on linux causing chromedriver to crash #10974

Closed
h-arlt opened this issue Aug 25, 2022 · 9 comments

@h-arlt
Contributor

h-arlt commented Aug 25, 2022

What happened?

In Selenium v3, port probing chose from a safe range of ports that explicitly excluded the well-known ranges of ephemeral ports used by the OS for port allocation. See also the JavaDoc of PortProber.java in Selenium v3.141.49:

  /*
   * Returns a port that is within a probable free range. <p/> Based on the ports in
   * http://en.wikipedia.org/wiki/Ephemeral_ports, this method stays away from all well-known
   * ephemeral port ranges, since they can arbitrarily race with the operating system in
   * allocations. Due to the port-greedy nature of selenium this happens fairly frequently.
   * Staying within the known safe range increases the probability tests will run green quite
   * significantly.
   */ 

Since Selenium v4, more precisely since commit 294d1c9, this behavior has changed: the range of ports to choose from is now the well-known range of ephemeral ports used by the OS. IMO this is not correct, as the well-known range of ephemeral ports should be used for short-lived connections only; the connection to the browser driver server is far from short-lived.
In any case, this change increases the probability of choosing a port that is already in use, e.g. by a connection established by some other browser process.
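
To make the overlap concrete, here is a minimal sketch of how one could read the OS ephemeral range on Linux and test whether a candidate port falls inside it. The class and method names (`EphemeralRange`, `parse`, `readFromProc`, `contains`) are hypothetical and are not Selenium's API. The sample value `32768 60999` is the usual Linux kernel default and is only an assumption about the affected host.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Selenium.
final class EphemeralRange {
  final int lowest;
  final int highest;

  EphemeralRange(int lowest, int highest) {
    this.lowest = lowest;
    this.highest = highest;
  }

  // Parses the two whitespace-separated numbers found in
  // /proc/sys/net/ipv4/ip_local_port_range, e.g. "32768\t60999".
  static EphemeralRange parse(String content) {
    String[] parts = content.trim().split("\\s+");
    if (parts.length != 2) {
      throw new IllegalArgumentException("Expected two port numbers: " + content);
    }
    return new EphemeralRange(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
  }

  // Reads the range the kernel uses for automatically assigned local ports.
  static EphemeralRange readFromProc() throws IOException {
    Path file = Paths.get("/proc/sys/net/ipv4/ip_local_port_range");
    return parse(new String(Files.readAllBytes(file), StandardCharsets.US_ASCII));
  }

  // True if the OS may hand out this port for outgoing connections.
  boolean contains(int port) {
    return port >= lowest && port <= highest;
  }
}
```

Assuming the default range, `EphemeralRange.parse("32768\t60999").contains(34672)` is true: port 34672, the one chromedriver was started on in the log below, sits inside the range the OS allocates from.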

Moreover, the port is not detected as already being in use, since it is bound to an interface other than the loopback device that PortProber checks (method checkPortIsFree uses localhost, which resolves to IP 127.0.0.1, the IP of the loopback device).
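
A stricter probe could try to bind the candidate port on the wildcard address as well as on loopback. The following is only a sketch under that assumption; `PortCheck`, `canBind` and `isFreeEverywhere` are hypothetical names, not the existing checkPortIsFree, and such a check narrows the window for a collision but cannot close it, since the OS may still hand out the port between the probe and the driver start.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of Selenium.
final class PortCheck {

  // Tries to bind a listening socket to address:port and releases it again.
  // A null address means the wildcard address (all interfaces).
  static boolean canBind(InetAddress address, int port) {
    try (ServerSocket socket = new ServerSocket()) {
      // Be strict: do not let SO_REUSEADDR hide sockets lingering on this port.
      socket.setReuseAddress(false);
      socket.bind(new InetSocketAddress(address, port));
      return true;
    } catch (IOException e) {
      return false;
    }
  }

  // The port counts as free only if it can be bound on loopback
  // and on the wildcard address.
  static boolean isFreeEverywhere(int port) {
    return canBind(InetAddress.getLoopbackAddress(), port) && canBind(null, port);
  }
}
```

The wildcard bind is the part that matters here: a port that is in use only on 10.24.1.7 would pass a loopback-only check but should fail the second bind.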

You might say that the probability is quite low, so what. Well, we are using Selenium/WebDriver in our self-hosted monitoring solution that executes real-browser tests regularly. One of our customers executes tests every minute, and since we upgraded to Selenium v4 we have been seeing chromedriver crashes at least once a day due to port collisions.

How can we reproduce the issue?

This issue is hard to reproduce, as port probing makes use of randomization and so does port allocation by the OS.

Relevant log output

This is the console log of our self-hosted monitoring solution that makes use of Selenium/WebDriver to run real-browser tests.


-- Print all active connections --
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 127.0.0.1:40555         0.0.0.0:*               LISTEN      1442657/extension_a 
tcp        0      0 0.0.0.0:10001           0.0.0.0:*               LISTEN      1391/java           
tcp        0      0 0.0.0.0:9080            0.0.0.0:*               LISTEN      1392/java           
tcp        0      0 0.0.0.0:43481           0.0.0.0:*               LISTEN      1442599/java        
tcp        0      0 10.24.1.7:9080          10.24.1.1:56772         TIME_WAIT   -                   
tcp        0      0 10.24.1.7:58000         35.234.95.141:443       TIME_WAIT   -                   
tcp        0      0 127.0.0.1:42771         127.0.0.1:33804         TIME_WAIT   -                   
tcp        0      0 10.24.1.7:34570         10.85.11.131:8443       TIME_WAIT   -                   
tcp        0      0 127.0.0.1:42771         127.0.0.1:33820         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:59250         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:59248         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59250         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59846         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59246         ESTABLISHED 1391/java           
tcp        0      0 10.24.1.7:34750         10.85.11.131:8443       TIME_WAIT   -                   
tcp        0      0 127.0.0.1:35129         127.0.0.1:39996         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:39988         127.0.0.1:35129         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:10001         127.0.0.1:59230         ESTABLISHED 1391/java           
tcp        0      0 10.24.1.7:45736         10.85.5.95:8086         TIME_WAIT   -                   
tcp        0      0 10.24.1.7:9080          10.24.1.1:56914         TIME_WAIT   -                   
tcp       32      0 10.24.1.7:53974         35.188.184.241:8443     CLOSE_WAIT  1442702/.com.google 
tcp        0      0 127.0.0.1:10001         127.0.0.1:59266         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:59264         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59260         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:50564         127.0.0.1:40555         ESTABLISHED 1442646/chromedrive 
tcp        0      0 10.24.1.7:37800         142.250.185.237:443     TIME_WAIT   -                   
tcp        0      0 127.0.0.1:33222         127.0.0.1:43481         ESTABLISHED 1442702/.com.google 
tcp        0      0 127.0.0.1:59246         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:59230         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:59236         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:59262         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59740         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59232         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59252         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:59256         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59544         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:59266         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59628         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:50548         127.0.0.1:40555         ESTABLISHED 1442646/chromedrive 
tcp       32      0 10.24.1.7:53970         35.188.184.241:8443     CLOSE_WAIT  1442702/.com.google 
tcp        0      0 127.0.0.1:59254         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:59260         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 10.24.1.7:57822         35.234.95.141:443       TIME_WAIT   -                   
tcp        0      0 127.0.0.1:10001         127.0.0.1:59848         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59864         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:50556         127.0.0.1:40555         ESTABLISHED 1442646/chromedrive 
tcp        0      0 127.0.0.1:40555         127.0.0.1:50564         ESTABLISHED 1442657/extension_a 
tcp        0      0 127.0.0.1:54708         127.0.0.1:44609         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:42771         127.0.0.1:33810         TIME_WAIT   -                   
tcp        0      0 10.24.1.7:40164         172.217.23.109:443      ESTABLISHED 1442702/.com.google 
tcp        0      0 127.0.0.1:35129         127.0.0.1:39992         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:59740         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:50560         127.0.0.1:40555         ESTABLISHED 1442599/java        
tcp        0      0 127.0.0.1:41819         127.0.0.1:55932         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:59738         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:50316         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:35129         127.0.0.1:40000         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:10001         127.0.0.1:59738         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:50558         127.0.0.1:40555         ESTABLISHED 1442599/java        
tcp        0      0 127.0.0.1:45218         127.0.0.1:60376         ESTABLISHED 1442599/java        
tcp        0      0 127.0.0.1:59252         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 10.24.1.7:57260         172.217.18.10:443       ESTABLISHED 1442702/.com.google 
tcp        0      0 127.0.0.1:10001         127.0.0.1:59256         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:59864         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59254         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:59846         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59264         ESTABLISHED 1391/java           
tcp        0     32 10.24.1.7:41116         95.216.114.221:443      FIN_WAIT1   -                   
tcp        0      0 127.0.0.1:10001         127.0.0.1:59234         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59640         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59844         ESTABLISHED 1391/java           
tcp       32      0 10.24.1.7:53968         35.188.184.241:8443     CLOSE_WAIT  1442702/.com.google 
tcp        0      0 10.24.1.7:57334         172.217.18.10:443       TIME_WAIT   -                   
tcp        0      0 127.0.0.1:59844         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 10.24.1.7:34576         10.85.11.131:8443       TIME_WAIT   -                   
tcp        0      0 10.24.1.7:37582         142.250.185.237:443     TIME_WAIT   -                   
tcp        0      0 10.24.1.7:53976         35.188.184.241:8443     ESTABLISHED 1442702/.com.google 
tcp        0      0 127.0.0.1:10001         127.0.0.1:59258         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59744         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:35129         127.0.0.1:39986         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:40555         127.0.0.1:50560         ESTABLISHED 1442657/extension_a 
tcp        0      0 127.0.0.1:35129         127.0.0.1:39994         TIME_WAIT   -                   
tcp        0      0 10.24.1.7:57158         172.217.18.10:443       TIME_WAIT   -                   
tcp        0      0 127.0.0.1:10001         127.0.0.1:59248         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:45212         127.0.0.1:60376         TIME_WAIT   -                   
tcp        0      0 10.24.1.7:9080          10.24.1.1:56866         TIME_WAIT   -                   
tcp        0      0 10.24.1.7:9080          10.24.1.1:56912         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:40555         127.0.0.1:50548         ESTABLISHED 1442657/extension_a 
tcp        0      0 127.0.0.1:59258         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:40555         127.0.0.1:50558         ESTABLISHED 1442657/extension_a 
tcp        0      0 10.24.1.7:34672         10.85.11.131:8443       TIME_WAIT   -                   
tcp        0      0 127.0.0.1:42771         127.0.0.1:33814         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:50552         127.0.0.1:40555         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:41961         127.0.0.1:48762         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:59744         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp       32      0 10.24.1.7:53972         35.188.184.241:8443     CLOSE_WAIT  1442702/.com.google 
tcp        0      0 127.0.0.1:10001         127.0.0.1:59244         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:43481         127.0.0.1:33222         ESTABLISHED 1442599/java        
tcp        0      0 10.24.1.7:34662         10.85.11.131:8443       TIME_WAIT   -                   
tcp        0      0 127.0.0.1:40555         127.0.0.1:50556         ESTABLISHED 1442657/extension_a 
tcp        0      0 127.0.0.1:10001         127.0.0.1:59236         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:42771         127.0.0.1:33812         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:59244         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:59544         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:59628         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:59848         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:10001         127.0.0.1:59262         ESTABLISHED 1391/java           
tcp        0      0 10.24.1.7:9080          10.24.1.1:56846         TIME_WAIT   -                   
tcp        0      0 10.24.1.7:9080          10.24.1.1:56844         TIME_WAIT   -                   
tcp       32      0 10.24.1.7:53978         35.188.184.241:8443     CLOSE_WAIT  1442702/.com.google 
tcp        0      0 10.24.1.7:34748         10.85.11.131:8443       TIME_WAIT   -                   
tcp        0      0 127.0.0.1:59232         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 127.0.0.1:59234         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp        0      0 10.24.1.7:9080          10.24.1.1:56770         TIME_WAIT   -                   
tcp        0      0 10.24.1.7:9080          10.24.1.1:56792         TIME_WAIT   -                   
tcp        0      0 127.0.0.1:10001         127.0.0.1:50316         ESTABLISHED 1391/java           
tcp        0      0 127.0.0.1:59640         127.0.0.1:10001         ESTABLISHED 1392/java           
tcp6       0      0 :::60376                :::*                    LISTEN      1442646/chromedrive 
tcp6       0      0 127.0.0.1:39629         127.0.0.1:34368         TIME_WAIT   -                   
tcp6       0      0 127.0.0.1:39629         127.0.0.1:34276         TIME_WAIT   -                   
tcp6       0      0 127.0.0.1:44609         127.0.0.1:54710         TIME_WAIT   -                   
tcp6       0      0 127.0.0.1:60376         127.0.0.1:45218         ESTABLISHED 1442646/chromedrive 
tcp6       0      0 127.0.0.1:44609         127.0.0.1:54758         TIME_WAIT   -                   
udp        0      0 224.0.0.251:5353        0.0.0.0:*                           1442657/extension_a 

-- Compile and execute test --
...
[INFO] --- maven-surefire-plugin:2.22.2:test (default-test) @ xxx ---
[INFO] 
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running tests.HomePageTest
[14:13:53,201] INFO  [main] - Trying to load property file 'file:///xxxx/5c7659d72512f90016f0bd5e/config/default.properties'.
[14:13:53,207] INFO  [main] - Loading properties from file: file:///xxxx/5c7659d72512f90016f0bd5e/config/default.properties
[14:13:53,209] INFO  [main] - Trying to load property file 'file:///xxxx/5c7659d72512f90016f0bd5e/config/project.properties'.
[14:13:53,209] INFO  [main] - Loading properties from file: file:///xxxx/5c7659d72512f90016f0bd5e/config/project.properties
[14:13:53,283] INFO  [HomePageTest-0] - ####### Test 'tests.HomePageTest' started
[14:13:54,852] INFO  [HomePageTest-0] - Started new Xvfb process using display 41
Starting ChromeDriver 103.0.5060.134 (8ec6fce403b3feb0869b0732eda8bd95011d333c-refs/branch-heads/5060@{#1262}) on port 34672
Remote connections are allowed by an allowlist (127.0.0.1).
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
IPv6 port not available. Exiting...
[1661350435.525][SEVERE]: bind() failed: Address already in use (98)
[14:13:55,567] INFO  [HomePageTest-0] - ####### Test 'tests.HomePageTest' finished after 2283 ms
[14:13:55,581] INFO  [HomePageTest-0] - Cleaning up ...
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.676 s <<< FAILURE! - in tests.HomePageTest
[ERROR] testVisitingHomepage(tests.HomePageTest)  Time elapsed: 2.753 s  <<< ERROR!
org.openqa.selenium.SessionNotCreatedException: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure.
Caused by: org.openqa.selenium.WebDriverException: 
Driver server process died prematurely.0
Build info: version: '4.1.1', revision: 'e8fcc2cecf'
System info: host: 'some-host-name', ip: '10.24.1.7', os.name: 'Linux', os.arch: 'amd64', os.version: '5.10.109+', java.version: '11.0.16'
Driver info: driver.version: ChromeDriver


### Operating System

Ubuntu 20.04 and compatible (e.g. Linux Mint)

### Selenium version

Java 4.1.1

### What are the browser(s) and version(s) where you see this issue?

Chrome, Chromium 103.0.5060.134

### What are the browser driver(s) and version(s) where you see this issue?

chromedriver v103.0.5060.134

### Are you using Selenium Grid?

_No response_
@github-actions

@h-arlt, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then the I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

@asolntsev
Contributor

@h-arlt Do you have an idea of how to fix this issue?
Is it a good idea to just revert commit 294d1c9?

I see one strange thing about this commit:

  • according to the commit message, the goal was to fix a problem with a too narrow range of ports (in particular, on old Windows systems),
  • but in reality this commit changed the behaviour to the opposite: before, PortProber picked a port OUTSIDE of the ephemeral port range; now it picks a port INSIDE of the ephemeral port range. Which behaviour is correct?

@h-arlt
Contributor Author

h-arlt commented Aug 26, 2022

@asolntsev I completely agree with your findings/notes about this commit; it mixes up several things w/o any further explanation why those changes were made.
Thus, I'd revert this commit partially: restore the port prober to the behavior of Selenium 3, where the ephemeral port range was excluded on Linux, BUT keep the original intention of the commit (fix the problem with a too narrow range of ports on Windows). Staying away from the well-known range of ephemeral ports on Linux was done for a very good reason, and AFAIK there aren't any cons in doing so.

We never experienced any port collision issues in Selenium 3, but we are faced with such issues in Selenium 4 at least once a day.

@asolntsev
Contributor

Hi @tflori !
Can you please comment on commit 294d1c9?
Was it an intentional change that PortProber previously picked a port OUTSIDE of the ephemeral port range, and now picks a port INSIDE of the ephemeral port range?

@titusfortner
Member

@oomelianchuk ooc, is neodymium active? have you added it to our list? https://docs.google.com/forms/d/e/1FAIpQLSdr21sc1j8a4yqq-TQnc6ATC4r7she2CuSSfZylvC_YOX3JFA/viewform
Thanks!

@diemol
Member

diemol commented Aug 30, 2022

@asolntsev @h-arlt would you like to send a PR for this? We can review that together with @shs96c.

h-arlt pushed a commit to h-arlt/selenium that referenced this issue Aug 31, 2022
…crash

- let port prober choose from a safe range of ephemeral ports that explicitly exclude the well-known range of ephemeral ports used by the OS for port allocation (as was done in Selenium v3)
- keep fallback to IANA port range in case range of ephemeral ports is too low (less than 5k)
h-arlt pushed a commit to h-arlt/selenium that referenced this issue Aug 31, 2022
Let port prober choose from a safe range of ephemeral ports that
explicitly exclude the well-known range of ephemeral ports used by the
OS for port allocation as was done in Selenium v3.

Keep fallback to IANA port range in case range of ephemeral ports
is too low (less than 5k).

Fixes SeleniumHQ#10974
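
One possible reading of the approach described in the commit message, sketched with hypothetical names (`SafePortRange`, `outside`, `randomPort`); this is not the code from the referenced commit. It assumes the "safe" range is the larger gap below or above the OS ephemeral range, and that "too low" means the resulting range holds fewer than 5000 ports, in which case it falls back to the IANA dynamic range 49152-65535.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch, not the code from the referenced commit.
final class SafePortRange {
  static final int START_OF_USER_PORTS = 1024;
  static final int HIGHEST_PORT = 65535;
  static final int IANA_FIRST_PORT = 49152; // start of the IANA dynamic port range
  static final int MIN_RANGE_SIZE = 5000;   // assumed meaning of "less than 5k"

  final int first;
  final int last;

  private SafePortRange(int first, int last) {
    this.first = first;
    this.last = last;
  }

  // Picks the larger gap outside [lowestEphemeral, highestEphemeral].
  static SafePortRange outside(int lowestEphemeral, int highestEphemeral) {
    int freeBelow = Math.max(0, lowestEphemeral - START_OF_USER_PORTS);
    int freeAbove = Math.max(0, HIGHEST_PORT - highestEphemeral);
    int first;
    int last;
    if (freeAbove > freeBelow) {
      first = highestEphemeral + 1;
      last = HIGHEST_PORT;
    } else {
      first = START_OF_USER_PORTS;
      last = lowestEphemeral - 1;
    }
    // Fall back to the IANA range if the safe range is too narrow.
    if (last - first + 1 < MIN_RANGE_SIZE) {
      first = IANA_FIRST_PORT;
      last = HIGHEST_PORT;
    }
    return new SafePortRange(first, last);
  }

  // Returns a random candidate port within the range, bounds inclusive.
  int randomPort() {
    return ThreadLocalRandom.current().nextInt(first, last + 1);
  }
}
```

With the usual Linux default of 32768-60999 (an assumption about the host), `SafePortRange.outside(32768, 60999)` yields 1024-32767, so a candidate such as 34672 would never be drawn. A candidate still has to be probed before use, because ports in that range can be taken by regular services.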
@h-arlt
Contributor Author

h-arlt commented Aug 31, 2022

@diemol Feel free to take a look and adapt the code as necessary. As stated in the PR, I wasn't able to build Selenium Java and run the existing tests as the build process ate all of my free disk space 🤷

@diemol
Member

diemol commented Aug 31, 2022

@h-arlt you can also use GitPod and try out the build there
https://github.com/SeleniumHQ/selenium#contribute-with-gitpod

@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Nov 13, 2022