FYI: A couple of weird bugs in flex-compatibility mode #137
Comments
Thank you for the feedback. Let's get this sorted out.
I have never seen this problem on the test platforms (macOS Intel, RHEL, Android, Cygwin, RPi Jessie). Did you check the versioning, so that a version-matching libreflexmin.a is used? The libs are built as default C++ (non-gnu, non-11 etc). I am still waiting on my new MacBook Pro M1, so I cannot try this out to see if I can replicate the problem, assuming it's related somehow to that platform. I do recall a recent minor change was made to fix a related problem, so you may want to use the latest version of RE/flex.
Interesting observation. Perhaps run valgrind and/or other memory tools? I tested with memory sanitizers some time ago. Perhaps I'll do that again, just in case. My 30+ years of C and C++ experience before writing RE/flex came in handy to structure the code such that dynamic allocation is not difficult to verify as correct. The buffering mechanism with realloc is at the lowest level and thus the least transparent. Other than that, there is not sufficient info here to draw any conclusions.
By the way, to bypass the realloc calls, you may want to set a large buffer size, which gives a large window on the input. The buffer should be large enough to hold the longest matching pattern. It does not need to be as large as the input files. The buffer is 64K by default. To change the buffer size, compile all source code with -DBUFSZ=nnn, using an appropriate nnn (>4096). BUFSZ is used in absmatcher.h.
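A compile-time override of this kind is commonly wired up as in the sketch below. This is a generic illustration of the `-D` pattern, not a copy of absmatcher.h: the name `kBufferSize` is hypothetical, and the 64K default and the >4096 lower bound are simply taken from the comment above.

```cpp
#include <cstddef>

// Default used when the compiler command line carries no -DBUFSZ=nnn.
#ifndef BUFSZ
#define BUFSZ (64 * 1024)
#endif

// Hypothetical constant; the real library may name and store this differently.
constexpr std::size_t kBufferSize = BUFSZ;

// Reject values at or below the 4096 lower bound at compile time.
static_assert(kBufferSize > 4096, "BUFSZ must be larger than 4096");
```

With this pattern, the same `-DBUFSZ=nnn` value has to be passed to every translation unit, the library sources included, so that all of them agree on the buffer size.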
I was able to do a quick check on my older Intel Mac mini (macOS Monterey 12.4):
I have more questions than answers, since there isn't much here to offer any clues. Questions like these:
Here are a few additional data points. Using a macOS BigSur 11.6.3 VM, with Homebrew downloaded reflex v3.2.7 & the bdwgc library v8.0.6
Using a Linux Ubuntu 22.04 LTS VM, bdwgc library v8.0.6-1.1build1 from the Ubuntu repository, reflex v3.2.7 compiled locally with the g++11 compiler (not in the Ubuntu repository)
There is something odd with the BDWGC library code in macOS.
Responding to: I have more questions than answers (above) ...
4.1. is the issue affected by the choice of libreflex or libreflexmin?
4.2. Or building from source without libraries (reflex/lib files)?
5.1. what does valgrind report? Or...
6.1. try clang with -fsanitize=memory (s.b. =address) and/or -fsanitize=leak
Thanks for the details.
Looks like some internal error in the sanitizer. See also google/sanitizers#1468. I don't really see any issues here, except when BDWGC is used.
Looks like it. I don't know what else can be done.
FYI: my last debugging efforts (later this week):
I'll report back anything interesting.
I'm delinquent in reporting back on my debugging status with respect to these 2 'weird bugs'. Note: these 'bugs' are not specific to flex-compatibility mode. 'Weird bug 1' remains a mystery: in some cases, the macOS Apple and (Homebrew) LLVM C++ compilers just do not generate (mangled) name references that are compatible with those generated by the macOS (Homebrew) GNU C++ compilers for the RE/flex library. I don't see the same problem with the Linux LLVM and GNU C++ compilers and the RE/flex library installed from the Ubuntu repository. On the other hand, 'weird bug 2' is a self-inflicted wound: my original approach to sliding the BDWGC garbage collector underneath Ox and its test cases worked with Flex, since there is no Flex-specific library to be linked into the executables. However, I didn't originally notice that the RE/flex library member 'matcher.o' contains external references to both the C memory allocation functions and the C++ new/delete operators, so in particular, the object 'AbstractMatcher::buf_' is being freed by the BDWGC 'GC_free' function but allocated by the C 'posix_memalign' function. I cannot explicitly state why the segfault only occurs with optimized code, but I'm not exactly surprised, as I've seen analogous segfault situations in the past. I've replaced my preprocessor approach with a replacement library for the C allocation functions, which are just shells around calls to the corresponding BDWGC functions.
Can we close this?
Sure
I'm running portability tests for an upcoming release of Ox (https://sourceforge.net/projects/ox-attribute-grammar-compiler/), and I've run into a couple of extremely weird bugs adding reflex as an option for several scanners using the '--yy' option for flex-compatibility mode. I'm using release 3.2.7.
To slip the BDWGC library underneath everything, preprocessor macros replace calls to the functions 'free', 'malloc', 'realloc' and 'posix_memalign' in the reflex library with calls to the corresponding BDWGC functions 'GC_free', 'GC_malloc', 'GC_realloc' and 'GC_posix_memalign'.
In an attempt to debug 2 above, I linked with a version of the BDWGC library configured with assertions enabled.
Under lldb, for the segfault case (reflex libraries compiled with '-O2'), 'GC_posix_memalign' is never called, and I get an assertion failure in the BDWGC 'GC_free' function, saying that the object 'AbstractMatcher::buf_' being freed (absmatcher.h, line 335) wasn't allocated by a BDWGC allocation function. For the 'works just fine' case, 'GC_posix_memalign' is called.
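The assertion failure described above comes down to an ownership mismatch: a pointer obtained from one allocator is released through another. The following is a minimal sketch of that situation, not BDWGC or RE/flex code. `TrackingAllocator`, `alloc`, `owns`, `release`, and `library_alloc_buffer` are hypothetical names, and a plain `std::set` stands in for the collector's bookkeeping. It assumes a POSIX system, where `posix_memalign` is declared in `<stdlib.h>`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <set>
#include <stdlib.h>  // posix_memalign (POSIX, assumed available)

// Hypothetical stand-in for a collector that remembers what it allocated.
class TrackingAllocator {
 public:
  // Allocate n bytes and record the pointer as owned by this allocator.
  void* alloc(std::size_t n) {
    assert(n > 0);
    void* p = std::malloc(n);
    if (p != nullptr) owned_.insert(p);
    return p;
  }

  // True if p was returned by alloc() and has not been released yet.
  bool owns(void* p) const { return owned_.count(p) != 0; }

  // Release p only if this allocator owns it; return false for a foreign
  // pointer instead of freeing it (an assertion-enabled build would abort).
  bool release(void* p) {
    if (owned_.erase(p) == 0) return false;
    std::free(p);
    return true;
  }

 private:
  std::set<void*> owned_;
};

// Simulates code in a precompiled library that the replacement macros never
// reached, so it still calls the C allocator directly.
void* library_alloc_buffer(std::size_t n) {
  void* p = nullptr;
  // posix_memalign returns 0 on success; 4096 is a valid alignment.
  if (posix_memalign(&p, 4096, n) != 0) return nullptr;
  return p;
}
```

Passing a buffer from `library_alloc_buffer` to `release` returns `false`, because the tracker never saw that allocation. This mirrors, in simplified form, the check that fires inside 'GC_free' when it is handed a buffer that came from 'posix_memalign'.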
If I can figure out how to modify the code in absmatcher.h to sidestep the GNU g++-11 optimizer bug, I'll let you know. A simple change to assigning the result of the call to 'posix_memalign' to a local 'volatile int' doesn't make a difference.