[new release] zmq (4 packages) (5.3.0) #25513

Merged Apr 8, 2024 (2 commits)

andersfugmann
Contributor

OCaml bindings for ZeroMQ 4.x

CHANGES:

@andersfugmann andersfugmann marked this pull request as draft March 16, 2024 19:03
@andersfugmann andersfugmann marked this pull request as ready for review March 16, 2024 19:10
@mseri
Member

mseri commented Mar 19, 2024

I don't remember this issue on opensuse in the past, do you know what may cause it?

#=== ERROR while compiling zmq-async.5.3.0 ====================================#
# context              2.2.0~beta2~dev | linux/x86_64 | ocaml-base-compiler.4.14.1 | pinned(https://github.com/issuu/ocaml-zmq/releases/download/5.3.0/zmq-5.3.0.tbz)
# path                 ~/.opam/4.14/.opam-switch/build/zmq-async.5.3.0
# command              ~/.opam/opam-init/hooks/sandbox.sh build dune build -p zmq-async -j 255 @install @runtest
# exit-code            1
# env-file             ~/.opam/log/zmq-async-7-db8161.env
# output-file          ~/.opam/log/zmq-async-7-db8161.out
### output ###
# (cd _build/default/zmq-async/test && ./test.exe)
# .......
# Ran: 7 tests in: 1.08 seconds.
# OK
# File "zmq/test/dune", line 5, characters 0-76:
#  5 | (rule
#  6 |  (alias runtest)
#  7 |  (deps
#  8 |   (:test test.exe))
#  9 |  (action
# 10 |   (run %{test})))
# (cd _build/default/zmq/test && ./test.exe)
# ......F......
# ==============================================================================
# Failure: Zmq:0:zmq test:6:monitor
# 
# Wrong event received on m1
# expected: Monitor_stopped: ipc://monitor_socket but got: No event received
# ------------------------------------------------------------------------------
# Ran: 13 tests in: 2.13 seconds.
# FAILED: Cases: 13 Tried: 13 Errors: 0 Failures: 1 Skip:  0 Todo: 0 Timeouts: 0.
# Error: Context not closed before finalization
# Error: Socket not closed before finalization
# Error: Socket not closed before finalization
# Error: Socket not closed before finalization
# Warning: Invalid_argument: Unknown event type

"dune" {>= "2.7"}
"ocaml" {>= "4.04.1"}
"zmq" {= version}
"eio" {>= "0.10"}
Contributor

Should this be `{>= "0.10" & < "1.0"}`, in case of breaking changes?

Member

@mseri mseri Mar 21, 2024

We try to avoid preventive upper bounds: if we see a breakage when eio 1.0 is released, we will add the upper bound manually.

In fact, this has already been built and tested by the CI with eio 1.0 without any problems.

@mseri
Member

mseri commented Mar 21, 2024

Other surprising failures on linux:


#=== ERROR while compiling zmq-async.5.1.5 ====================================#
# context              2.2.0~beta2~dev | linux/x86_64 | ocaml-base-compiler.4.14.1 | file:///home/opam/opam-repository
# path                 ~/.opam/4.14/.opam-switch/build/zmq-async.5.1.5
# command              ~/.opam/opam-init/hooks/sandbox.sh build dune build -p zmq-async -j 31 @install @runtest
# exit-code            1
# env-file             ~/.opam/log/zmq-async-7-1a91c0.env
# output-file          ~/.opam/log/zmq-async-7-1a91c0.out
### output ###
# File "zmq/test/dune", line 5, characters 0-76:
#  5 | (rule
#  6 |  (alias runtest)
#  7 |  (deps
#  8 |   (:test test.exe))
#  9 |  (action
# 10 |   (run %{test})))
# (cd _build/default/zmq/test && ./test.exe)
# ......F......
# ==============================================================================
# Failure: Zmq:0:zmq test:6:monitor
# 
# Wrong event received on m1
# expected: Accepted: ipc://monitor_socket but got: No event received
# ------------------------------------------------------------------------------
# Ran: 13 tests in: 2.13 seconds.
# FAILED: Cases: 13 Tried: 13 Errors: 0 Failures: 1 Skip:  0 Todo: 0 Timeouts: 0.
# Thread 1 killed on uncaught exception Zmq.ZMQ_exception(2, "Context was terminated")
# Raised at Zmq.zmq_raise in file "zmq/src/zmq.ml", line 772, characters 2-11
# Called from Zmq.Proxy.create in file "zmq/src/zmq.ml", line 558, characters 14-41
# Called from Zmq.Proxy.create in file "zmq/src/zmq.ml", line 558, characters 14-41
# Called from Dune__exe__Zmq_test.test_proxy.proxy in file "zmq/test/zmq_test.ml", line 151, characters 6-31
# Called from Thread.create.(fun) in file "thread.ml", line 49, characters 8-14
# Error: Context not closed before finalization
# Error: Socket not closed before finalization
# Error: Socket not closed before finalization
# Error: Socket not closed before finalization
# Error: Socket not closed before finalization
# (cd _build/default/zmq-async/test && ./test.exe)
# .......
# Ran: 7 tests in: 0.36 seconds.
# OK
#=== ERROR while compiling zmq-async.5.1.0 ====================================#
# context              2.2.0~beta2~dev | linux/x86_64 | ocaml-base-compiler.4.14.1 | file:///home/opam/opam-repository
# path                 ~/.opam/4.14/.opam-switch/build/zmq-async.5.1.0
# command              ~/.opam/opam-init/hooks/sandbox.sh build dune runtest -p zmq-async -j 31
# exit-code            1
# env-file             ~/.opam/log/zmq-async-7-755f86.env
# output-file          ~/.opam/log/zmq-async-7-755f86.out
### output ###
# File "zmq/test/dune", line 5, characters 0-72:
# 5 | (alias
# 6 |  (name runtest)
# 7 |  (deps (:test test.exe))
# 8 |  (action (run %{test})))
# (cd _build/default/zmq/test && ./test.exe)
# ......Thread 1 killed on uncaught exception Zmq.ZMQ_exception(2, "Context was terminated")
# Raised at Zmq.zmq_raise in file "zmq/src/zmq.ml", line 750, characters 2-11
# Called from Zmq.Proxy.create in file "zmq/src/zmq.ml", line 554, characters 14-41
# Called from Zmq.Proxy.create in file "zmq/src/zmq.ml", line 554, characters 14-41
# Called from Zmq_test.test_proxy.proxy in file "zmq/test/zmq_test.ml", line 147, characters 6-31
# Called from Thread.create.(fun) in file "thread.ml", line 49, characters 8-14
# double free or corruption (out)
# (cd _build/default/zmq-async/test && ./test.exe)
# .......
# Ran: 7 tests in: 0.38 seconds.
# OK

They seem similar to the opensuse one. I don't remember seeing these in the past, can you have a look?

@raphael-proust
Contributor

These errors seem like scheduling errors. I've seen those happen in other projects: some test scenario is written to check the behaviour of some small program, but the test is often overly specialised to the dev machine and/or the project's CI machine, and then it fails when running on the opam-repository CI because the machines are specced very differently.

To reproduce locally, I'd suggest running the tests on more exotic machines, or on machines that are kept busy by some high-CPU-usage process running concurrently.

Note that often, the issue is just with the test (it's too specific in its expectations) rather than the code.
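
One common way to make such a test less sensitive to scheduling is to poll for the expected condition a bounded number of times instead of checking once. The following is only a minimal sketch: `poll_until`, its parameters, and the stand-in event source are hypothetical names introduced here, not part of ocaml-zmq or its test suite.

```ocaml
(* Hypothetical helper, not part of ocaml-zmq: call [check] until it returns
   [Some v] or [attempts] tries have been used, calling [pause] between
   tries. Returns [None] if the condition never held. *)
let rec poll_until ~attempts ~pause check =
  if attempts <= 0 then None
  else
    match check () with
    | Some _ as result -> result
    | None ->
      pause ();
      poll_until ~attempts:(attempts - 1) ~pause check

(* Example with a stand-in event source that only answers on the third poll,
   mimicking an event that arrives late on a heavily loaded machine. *)
let () =
  let polls = ref 0 in
  let check () =
    incr polls;
    if !polls >= 3 then Some "Accepted" else None
  in
  (* In a real test, [pause] would sleep briefly, e.g. with [Unix.sleepf]. *)
  match poll_until ~attempts:10 ~pause:(fun () -> ()) check with
  | Some ev -> Printf.printf "got %s after %d polls\n" ev !polls
  | None -> print_endline "no event received"
```

Passing the pause function in as an argument keeps the helper free of dependencies and lets a test choose how long it is willing to wait in total.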

@mseri
Member

mseri commented Apr 3, 2024

Ping @andersfugmann
Do you plan to look into those scheduling issues, or do you think they are safe to ignore?

@andersfugmann
Contributor Author

Sorry for the delay.
The tests are somewhat fragile and sometimes fail, but I've not been able to work out under which conditions.
I believe the test failures are safe to ignore, as the released changes do not touch the core logic; they only add the eio bindings.

@andersfugmann
Contributor Author

> I don't remember this issue on opensuse in the past, do you know what may cause it?


I have not been able to track down why this happens. It seems that opensuse distributes a non-standard version of libzmq, as it sometimes delivers unknown (undocumented) events.

I think the opensuse test failures are safe to ignore for now.

@andersfugmann
Contributor Author

> Other surprising failures on linux:



> They seem similar to the opensuse one. I don't remember seeing these in the past, can you have a look?

They do indeed. I think the problem is that the events sent to the zmq socket monitor change with each release of zmq. I'm close to just rewriting the whole monitor test so that it only verifies that we get some events at all, which would better tolerate different versions of zmq.

I think we can safely ignore the errors for now. The challenge is of course when testing and verifying updates to any dependencies.

Is there a way to detect whether tests are run as part of opam CI (e.g. an environment variable) that we can depend on to disable the specific monitor test? I could create a new release with a much simpler version of the monitor test to have less flaky tests.
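
The difference between the current strict expectation and the looser check described above can be sketched as follows. This is an illustration under assumptions: the `event` type is a hypothetical stand-in whose constructor names are borrowed from the test output, not the type defined by the bindings, and `strict_check`, `got_any_event`, and `got_event` are made-up names.

```ocaml
(* Hypothetical stand-in for monitor events; the constructor names come from
   the test output, but this is not the type defined by the bindings. *)
type event =
  | Accepted of string
  | Monitor_stopped of string
  | Unknown of int

(* Strict check: the exact sequence must match, so it breaks as soon as a
   libzmq release adds, drops, or reorders an event. *)
let strict_check ~expected received = received = expected

(* Tolerant checks: some event arrived at all, or one specific event is
   present somewhere in the stream. *)
let got_any_event received = received <> []
let got_event required received = List.mem required received

let () =
  let endpoint = "ipc://monitor_socket" in
  (* An extra, undocumented event (hypothetical value) precedes the known ones. *)
  let received =
    [ Unknown 4096; Accepted endpoint; Monitor_stopped endpoint ]
  in
  let expected = [ Accepted endpoint; Monitor_stopped endpoint ] in
  assert (not (strict_check ~expected received));
  assert (got_any_event received);
  assert (got_event (Accepted endpoint) received)
```

The tolerant checks give up some coverage, since they no longer detect a missing or misordered event, in exchange for surviving differences between libzmq versions.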

@mseri
Member

mseri commented Apr 8, 2024

> Is there a way to detect whether tests are run as part of opam CI (e.g. an environment variable) that we can depend on to disable the specific monitor test? I could create a new release with a much simpler version of the monitor test to have less flaky tests.

The following is set in the environment:

OPAM_REPO_CI="true"
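
A test could consult that variable roughly like this. It is a sketch only: `running_in_opam_repo_ci` and `run_unless_ci` are hypothetical names, the snippet is not taken from ocaml-zmq, and `Sys.getenv_opt` assumes OCaml 4.05 or later.

```ocaml
(* Hypothetical helper: true only when OPAM_REPO_CI is set to "true",
   the value reported for the opam-repository CI environment. *)
let running_in_opam_repo_ci () =
  match Sys.getenv_opt "OPAM_REPO_CI" with
  | Some "true" -> true
  | Some _ | None -> false

(* Hypothetical wrapper: run [test] unless we are on the opam-repository CI,
   in which case only report that it was skipped. *)
let run_unless_ci ~name test =
  if running_in_opam_repo_ci () then
    Printf.printf "skipping %s (OPAM_REPO_CI is set)\n" name
  else test ()

let () =
  run_unless_ci ~name:"monitor" (fun () ->
      print_endline "running monitor test")
```

Skipping only the one fragile test keeps the rest of the suite running on the CI machines.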

@mseri mseri merged commit 8ad29f4 into ocaml:master Apr 8, 2024
1 of 2 checks passed
@mseri
Member

mseri commented Apr 8, 2024

Thanks!

@andersfugmann andersfugmann deleted the release-zmq-5.3.0 branch April 8, 2024 08:45
4 participants