Skip to content
3 changes: 2 additions & 1 deletion DEVELOPER.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,9 +30,10 @@ Below is a list of additional documentation to aid the development process:

- [Performance testing Envoy with `tcmalloc`/`pprof`](https://github.com/envoyproxy/envoy/blob/master/bazel/PPROF.md)

- [Porting Envoy and common porting issues](https://github.com/envoyproxy/envoy/blob/master/PORTING.md)

And some documents on components of Envoy architecture:

- [Envoy flow control](https://github.com/envoyproxy/envoy/blob/master/source/docs/flow_control.md)

- [Envoy's subset load balancer](https://github.com/envoyproxy/envoy/blob/master/source/docs/subset_load_balancer.md)

131 changes: 131 additions & 0 deletions PORTING.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
# Envoy Porting

* Linux (the original/baseline implementation)
* macOS port
* Windows Win32 API port

## Troubleshooting Common Porting Issues
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there some best practices or advice for developers on how now to break Windows? See for example our other thread on autoconf deps.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although it is largely secret-sauce, much of this goes back to K&R 1st edition. I've already added two specific links in the docs, one which covers the vast array of compilation differences between different compilers and the other which Microsoft themselves calls out the most commonly encountered quirks of their cl.exe compiler. We should add all such related and helpful links (rather than overburden what will become a long-enough document, as-is.)

Things like working around GNU autoconf-managed projects needs to be in an altogether different section than linkage compatibility, bazel compatibility etc.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And on the subject of 'examples' I do think that adding links to specific lines of commits where the specific code was fixed once upon a time so readers can see a before-and-after picture would be helpful to each of the particular problems. But getting this doc into everyone's hands as we kick off #11107 takes higher priority.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One worry I have is that the document is long, which you point out, and most folks won't have time to read. Ideally we have the takeaway bullet points early on or in some easy to scan for manner, e.g. call-out boxes, so that the relevant info can be ingested in a TL;DR world.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect it is a fraction of how long it may end up... I'm looking at this from github's document.md rendering for the first time, and we will absolutely format it for clarity and "skim-ability".


Between the use of gcc, clang, the optional use of libstdc++ vs different flavors of libc++,
and other platform specific compilers including SDK compilers for mobile, there are a number
of disagreements over what constitutes "correct" stdc++14 (the current baseline.) For a very
thorough reference, see;

* [Compiler stdc++ feature support chart](https://en.cppreference.com/w/cpp/compiler_support)

It is often useful to comparatively compile code fragments, and most developers don't have access
to a wide array of different compilers at hand. This tool helps porters evaluate how different
compilers (and even different revisions of the same compiler) compile the offending code
down to the assembly level;

* [Compiler Explorer](https://godbolt.org/)

Note that each port has its own development channel on (envoyproxy slack)[https://envoyproxy.slack.com/].
For example there are champions at `#envoy-windows-dev`, `#envoy-osx-dev` etc. who are happy to answer
quick questions and support contributors who are encountering architecture-specific CI failures. There are
also tags for issues and PR's such as `area/windows` which can help to raise issues to specific maintainers
of these ports.

### General Porting Issues with respect to Microsoft Visual C++ (cl.exe)

The LLVM clang compiler mirrors many of the gcc behaviors and quirks, resulting in two very
similar compilation results. This is not true of MSVC cl.exe, which has a number of unique quirks,
and deviations from non-standard clang/gcc quirks. Here are some links documenting these deviations
from ISO stdc++;

* [MSVC non-standard behaviors](https://docs.microsoft.com/en-us/cpp/cpp/nonstandard-behavior?view=vs-2019)

Microsoft documents their conformance fixes by Virtual Studio toolchain revision (the CI attempts
to keep up with the most recent release);

* [MSVC conformance improvements](https://docs.microsoft.com/en-us/cpp/overview/cpp-conformance-improvements?view=vs-2019)

Note that Envoy project has elected to distribute a monolithic envoy-static.exe including
all components compiled into the single executable. This includes the libcmt (C Runtime)
and cpp libraries; all code is compiled with /MT (or /MTd for debug runtime). Do not inject
the /nodefaultlib flag to the linker; this masks errors in the compilation phase and also
mixes dynamic and static runtime implementation macros in the resulting binary.

## Specific Coding Issues

Most porting issues can be summarized in a handful of assumptions to be avoided.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think would make sense to add some subheaders here and split into these broad categories:
e.g.

  1. C++ standards
  2. Word size
  3. C Runtime
  4. System calls

The scenario that I am thinking is that a contributor might want to skim-through the doc while having a specific issue and some subheaders might make it more easier to find the relevant paragraph or section to their problem

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I believe we are going to need further sub-headings. What the logical groupings are, I haven't worked out and welcome patches and suggestions.


The size of an 'int' or 'long' integer on Windows is 32 bits, while pointers are 64 bits
(we refer to this as a 64P architecture.) On Linux, OS/X and other *nix variants, the 'int',
'long' and pointer types are all 64 bits (64ILP architectures).
This can prove frustrating to contributors, as the Envoy project prefers explicit-size types
such as int64_t, uint32_t etc. but most coders default to using 'int' and similar in loops
and other local variables.

The Standard C++ library interfaces must conform to the actual size of the platform's 'int',
'long', 'ptrdiff_t' and similarly declared types. The use of 'auto' is often helpful but cannot
resolve this transition between C++ definitions and Envoy explicit-sized types.

The Windows MSVC compiler began but never completed a proper POSIX implementation, and this
has not been enhanced in over two decades. Many commonly used POSIX data types and functions
aren't available, or differ in name or implementation details. A prime example is the absence
of the POSIX, non-ISO/C++ type 'ssize_t' on Windows. The ISO/C++ 'ptrdiff_t' type is available,
and part of the language standard. Use 'ptrdiff_t' in place of 'ssize_t', and do not presume
that this value fits in a 32-bit `int`.

The Microsoft compiler performs K&R preprocessor line evaluation, so it is not possible to
embed `#if`/`#endif` and other preprocessor evaluation within the argument list of a macro
expression.

The Microsoft compiler has a limit of 2k characters in any single literal string expression
such as R"" multi-line expressions. However, this applies only to each individual token, so
string concatenation needs to be used to merge multiple string expressions into very large
string constants.

Alternative operator representations such as 'not', 'and', 'compl' etc. which are alternatives
to '!', '&&', '~' etc. which had been ISO standard workarounds for very limited character sets and
locale-specific keyboards are still supported by gcc and clang, but were never supported by the
Microsoft compiler and are now deprecated by the language spec.

Whenever platform-specific header files are needed, include the envoy/common/platform.h file to pick
up the most common inclusions. This file makes a number of adjustments to define common constructs
that may be missing on a particular environment. This file also adjusts for system file misbehavior.
(For example, the Windows.h header file #define's many common words that break even the simplest C++
source code; platform.h reverts some of this damage).

The portable implementations of FilesystemImpl, IoSocketImpl and low-level OsSysCallsImpl using
os_fd_t descriptors and the error constants defined in `platform.h` should be used in favor of raw
POSIX APIs when authoring tests.

## Dependency behavioral differences

The libraries that Envoy depends on will exhibit different behaviors when built for Windows
versus Linux versus BSD architectures. In some cases, Envoy has patched the library to either
correct misbehavior, or to make the library follow expected linux conventions more closely,
and every attempt is made to push such changes as optional features to upstream. Some notable
discrepancies include;

The googletest library has a number of Windows-specific quirks. At the time of this writing,
this library is incompatible with the LLVM clang-cl. It uses the POSIX re (regular expression)
system library by default. There is an option to use the PCRE library instead, which the
project has not adopted. On Windows, it uses a massively simplified expression parser which
is missing support for most common Re expressions. (The solution to this, substituting the
Google Re library, is proposed but has not been adopted by the googletest project yet).

The opentracing library is missing a Windows build schema (it relies entirely on automake),
so all tracing-related Envoy extensions are disabled on Windows.

## Bazel behavior differences

There are several circumstances that bazel will end up misbehaving in one environment or another.
For example, during RBE CI builds, the environment, paths, mounts etc. may have longer paths or
otherwise hit limits that aren't observed on the contributor's environment.

Similarly, when building on Windows the tools themselves have different path or command line length
restrictions than on the typical Linux/clang tool chain. These limits include;

* The individual path elements for -Inclusion (this restricts the total path name lengths available to the project).
* The length of the embedded command line stored in the debug info record of the COFF object file (48kb on Windows).
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can developers action on this to prevent running into this situation?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We haven't worked out that solution yet. It's noted here in case Windows developers are observing strange debugging behavior, it isn't something we expect to see in CI. The way that Bazel assembles works with insane lists of inclusions suggests we won't be able to work around this. We can probably pull the comment if we don't see issues in CI or from developers, since there isn't anything that Envoy itself can realistically do.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self to inject bazel advice to Windows builders here, as mentioned in #12098


Note that the rules_foreign_cc macros for bazel helps to ensure compilation is consistent across
architectures. For the greatest portability, we rely on native bazel BUILD rules, or CMakeLists.txt
input to CMake which is handled somewhat painlessly by rules_foreign_cc.

Many tests rely on command line scripting or tool invocation. Bazel and Envoy rely heavily on bash scripts, executing
on Windows via msys2. Inherent discrepancies between the msys2 execution environment and a typical bash shell can
cause confusing errors. Be sure to use cmd.exe on Windows in any test scripts that intend to create symlinks.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a separate section for testing and add there:

  1. cmd for symlinks
  2. googletest regex
  3. googletest nice mocks on base classes

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sub-sections, yes as noted above.