[Hotfix][Testing] Wait for RPCServer to be established #9150

Merged 1 commit into apache:main on Sep 30, 2021

junrushao (Member) commented Sep 29, 2021

In the unit tests, we set up a "fake" RPC tracker/runner locally, but we forgot to wait until the server process is fully set up, which causes flakiness on mainline.

https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/1815/pipeline

Thanks @vinx13 for reporting! CC @zxybazh
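
The race described above can be sketched without TVM at all. The snippet below is a minimal simulation using only the standard library: `FakeTracker` and `start_fake_server` are hypothetical stand-ins invented for illustration, not TVM APIs, and the 50 ms handshake delay is an assumed value. It shows why checking the tracker right after launching the server can fail, and how a fixed wait hides the race.

```python
import threading
import time


class FakeTracker:
    """Hypothetical stand-in for an RPC tracker: records registered server keys."""

    def __init__(self):
        self._lock = threading.Lock()
        self._keys = set()

    def register(self, key):
        with self._lock:
            self._keys.add(key)

    def has(self, key):
        with self._lock:
            return key in self._keys


def start_fake_server(tracker, key, handshake_delay=0.05):
    """Start a simulated server that registers with the tracker asynchronously."""

    def _run():
        time.sleep(handshake_delay)  # assumed connection + handshake time
        tracker.register(key)

    thread = threading.Thread(target=_run, daemon=True)
    thread.start()
    return thread


tracker = FakeTracker()
start_fake_server(tracker, key="test-key")

# Checking immediately races the background registration and usually loses.
registered_immediately = tracker.has("test-key")

# A fixed wait, in the spirit of the hotfix, gives the server time to register.
time.sleep(0.5)
registered_after_wait = tracker.has("test-key")
```

A fixed sleep is the simplest possible fix and is only safe as long as the wait comfortably exceeds the real registration time.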

shingjan (Contributor) left a comment

overall LGTM. Would 0.5s be enough?

junrushao (Member, Author) commented

@shingjan Yes, it's far more than enough. I experimented with this together with @zxybazh a few weeks ago, but while upstreaming the codebase I did some refactoring that accidentally dropped this line...

zxybazh (Member) commented Sep 29, 2021

LGTM. Thanks for the fix.

tqchen (Member) commented Sep 29, 2021

Interesting. Under the popen implementation, we should already be waiting until we get the related fields (at which point the socket is already bound): https://github.com/apache/tvm/blob/main/python/tvm/rpc/tracker.py#L450

The same holds for the server, so I wonder why the wait is still needed.

tqchen (Member) commented Sep 29, 2021

OK, answering my own question: this might be needed for the server to connect to the tracker.

junrushao (Member, Author) commented

@tqchen Right, the server needs some time to talk to the tracker.
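
Since the point here is that a bound socket does not imply a completed registration, one alternative to a fixed sleep is to poll for the registration with a deadline. This is not what the PR does (it adds a sleep); the sketch below is only an illustration, and `wait_until` is a hypothetical helper. The "server" is simulated with a `threading.Timer`; in a real test the predicate would have to ask the tracker whether the server's key is present, which this thread does not cover.

```python
import threading
import time


def wait_until(predicate, timeout=5.0, interval=0.01):
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Returns True if the predicate succeeded in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return bool(predicate())  # one last check at the deadline


# Simulated server: "registers" with the tracker 50 ms after start (assumed delay).
registered = threading.Event()
timer = threading.Timer(0.05, registered.set)
timer.daemon = True
timer.start()

# Returns as soon as registration is observed, well before the 2 s timeout.
ok = wait_until(registered.is_set, timeout=2.0)

# A condition that never becomes true yields False once the deadline passes.
timed_out = wait_until(lambda: False, timeout=0.05)
```

Compared with a fixed sleep, polling returns as soon as the condition holds and fails loudly on timeout, at the cost of needing a reliable way to observe the registration.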

@junrushao junrushao merged commit 677f2d4 into apache:main Sep 30, 2021
AndrewZhaoLuo added a commit to AndrewZhaoLuo/tvm that referenced this pull request Sep 30, 2021
* main: (80 commits)
  Introduce centralised name transformation functions (apache#9088)
  [OpenCL] Add vectorization to cuda conv2d_nhwc schedule (apache#8636)
  [6/6] Arm(R) Ethos(TM)-U NPU codegen integration with `tvmc` (apache#8854)
  [microTVM] Add wrapper for creating project using a MLF (apache#9090)
  Fix typo (apache#9156)
  [Hotfix][Testing] Wait for RPCServer to be established (apache#9150)
  Update find cublas so it search default path if needed. (apache#9149)
  [TIR][LowerMatchBuffer] Fix lowering strides when source region has higher dimension than the buffer (apache#9145)
  Fix flaky NMS test by making sure scores are unique (apache#9140)
  [Relay] Merge analysis/context_analysis.cc and transforms/device_annotation.cc (apache#9038)
  [LLVM] Make changes needed for opaque pointers (apache#9138)
  Arm(R) Ethos(TM)-U NPU codegen integration (apache#8849)
  [CI] Split Integration tests out of first phase of pipeline (apache#9128)
  [Meta Schedule][M3b] Runner (apache#9111)
  Fix Google Mock differences between Ubuntu 18.04 and 16.04 (apache#9141)
  [TIR] add loop partition hint pragma (apache#9121)
  fix things (apache#9146)
  [Meta Schedule][M3a] SearchStrategy (apache#9132)
  [Frontend][PyTorch] support for quantized conv_transpose2d op (apache#9133)
  [UnitTest] Parametrized test_conv2d_int8_intrinsics (apache#9143)
  ...
areusch (Contributor) commented Oct 4, 2021

It would be great if, in the future, fixes like this came with a comment explaining why we're adding the sleep :).
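
As a rough illustration of this suggestion, a documented wait might look like the sketch below. The helper name and constant are hypothetical and not taken from the TVM codebase; the 0.5 s value and the rationale in the docstring are drawn from the discussion in this thread.

```python
import time

# Hypothetical constant; the thread only says that 0.5 s was "far more than enough".
RPC_REGISTRATION_WAIT_SEC = 0.5


def wait_for_server_registration(seconds=RPC_REGISTRATION_WAIT_SEC):
    """Pause so a freshly started RPC server can register with the tracker.

    Per the discussion, the tracker and server are launched once their
    sockets are bound, but the server still needs some time to connect to
    the tracker afterwards. A test that uses the tracker right away can
    race that registration and fail intermittently.
    """
    time.sleep(seconds)
```

Keeping the explanation next to the sleep makes it less likely that a later refactoring drops the line as dead weight.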

6 participants