tools: refactor protoxform to distinct transform/pretty-print stages.#10585
tools: refactor protoxform to distinct transform/pretty-print stages.#10585htuch merged 5 commits intoenvoyproxy:masterfrom
Conversation
This patch moves protoxform to a protoc plugin that is solely in the business of transforming FileDescriptorProtos to FileDescriptorProtos, taking into account package status and performing vN -> v(N+1) migrations. A distinct tool, protoprint.py, which has the guts of the former protoxform.py (basically no changes), is then invoked by proto_sync.py during source tree copy-back. The rationale for this is split is that proto_sync.py needs to be able to combine both the v2 next major version candidate and the hand edited v3 protos when generating the v3 shadow, once v3 becomes canonical. This is easiest to do when we operate on FileDescriptorProtos. Risk level: Low Testing: proto_format.sh (no changes). Signed-off-by: Harvey Tuch <htuch@google.com>
| label, '.next_major_version_candidate.envoy_internal.proto' | ||
| if shadow else '.next_major_version_candidate.proto'), dst_dir) | ||
| if shadow else '.next_major_version_candidate.proto')) | ||
| with mp.Pool() as p: |
There was a problem hiding this comment.
I had to resort to multiprocessing here to get reasonable performance. It isn't much slower than before on my workstation, but it might be slower on less powerful machines. Ultimately, when we move this all to Bazel native, this performance delta won't matter as we won't be rebuilding unchanged files, but that's post 1.14.0.
There was a problem hiding this comment.
I think this does affect performance on CI though?
There was a problem hiding this comment.
CI for format in this run was 12m 52s vs. 8m 59s for a recent PR involving proto changes (https://dev.azure.com/cncf/envoy/_build/results?buildId=36485&view=logs&j=9fd74226-df5b-5f29-39da-76fc341cc763).
It's a regression, but arguably a small fraction of total CI time. Not sure if there is a better way of tackling this without Bazelifying stuff, which is a major project that should happen after the v2 freeze. We need to be able to slice-and-dice the descriptors for the hand edited v3 and combine with v2 upgrade generated shadows somehow.. suggestions welcome :)
There was a problem hiding this comment.
Given issues we have had with non-deterministic behavior with multi-threading are you confident this is thread safe / deterministic?
Signed-off-by: Harvey Tuch <htuch@google.com>
| label, '.next_major_version_candidate.envoy_internal.proto' | ||
| if shadow else '.next_major_version_candidate.proto'), dst_dir) | ||
| if shadow else '.next_major_version_candidate.proto')) | ||
| with mp.Pool() as p: |
There was a problem hiding this comment.
Given issues we have had with non-deterministic behavior with multi-threading are you confident this is thread safe / deterministic?
Signed-off-by: Harvey Tuch <htuch@google.com>
|
@mattklein123 reasonably confident in the multiprocessing safety here, the reasoning is that each independent process is sourcing from an immutable location (the Bazel cache) and writing to an independent file in the temp directory. I moved the creation of the temp directory structure outside the multiprocessing to avoid potential conflict between |
@htuch Can you elaborate this a bit more? I can bazelify what protoprint does in this PR but not sure that works for the end goal. |
|
@lizan see what IMHO, we should ship this for freeze and work on Bazelifying after this. It's a lot easier to Bazelify if we have a PoC that shows how things should work. |
mattklein123
left a comment
There was a problem hiding this comment.
Agreed let's ship and iterate. Thank you!
This patch moves protoxform to a protoc plugin that is solely in the
business of transforming FileDescriptorProtos to FileDescriptorProtos,
taking into account package status and performing vN -> v(N+1)
migrations.
A distinct tool, protoprint.py, which has the guts of the former
protoxform.py (basically no changes), is then invoked by proto_sync.py
during source tree copy-back.
The rationale for this split is that proto_sync.py needs to be able
to combine both the v2 next major version candidate and the hand edited
v3 protos when generating the v3 shadow, once v3 becomes canonical. This
is easiest to do when we operate on FileDescriptorProtos.
Risk level: Low
Testing: proto_format.sh (no changes).
Signed-off-by: Harvey Tuch htuch@google.com