Stack overflow in the garbage collector #21
It's obvious why the stack overflow occurs. The Cheney-style copying garbage collectors I've seen in Bezanson's projects have always suffered from the foolish bug that the relocate() function calls itself recursively when it relocates the car part of a Lisp pair, and also when it relocates the final cdr of the pair at the end of a list. This has always been an unfortunate blemish on all the otherwise very elegant and intelligent code he has so graciously published over the years.

static value_t relocate(value_t v)
How many projects are we talking about here?
Jeff wrote:
The main FemtoLisp project and all its so-called tiny variants, e.g., "lisp.c" and "lisp2.c"
Ok, so one project.
I would be happy to see a pull request fixing this. I do think there are trade-offs though --- we would need to use more heap space, and it might be slower, and car-deep structures are pretty rare in real code. Of course it could be a build-time option if necessary.
Jeff wrote:
Your GC code aspires to be a simple copying collector, which is a perfectly fine choice, in particular for the tiny variants. You have just misunderstood how such a simple copying GC works, so you have conflated copying and reference-updating. A correct implementation is at least as simple and elegant as what you have, and it won't even take an hour's work to rewrite your code. For convenience, here is a link: https://en.wikipedia.org/wiki/Cheney's_algorithm Please note that this is not a different algorithm. This is what your code has always aspired to implement: a simple copying garbage collector. Your existing code just suffers from a misunderstanding about how it's done.
It seems to me the difference is breadth-first vs. depth-first order. Yes, I see breadth-first order might be better.
Jeff wrote:
It's not just a question of being better, it's a question of being correct. When it's done the right way, there is no recursion. Put differently, there is no drain on stack space, so stack overflow is not an issue. The "might" in your sentence indicates that you haven't fully understood the crucial difference yet. When you've had the time to let it sink in, I'm sure you'll realize how easily and quickly you can fix it, both for the benefit of the users and for the algorithmic beauty, which is one of the rewards for (us) programmers. Your projects are so good that they surely deserve a correctly implemented and elegant garbage collector!
Indeed I assumed that Cheney's algorithm would be used, as it is very much compatible with the "simple, interesting & fast" philosophy of Femtolisp. I was going to just implement it myself and make a PR, but then I realized that the heap objects carry no headers.
nilern wrote:
Ah, the missing object headers are indeed a nuisance, but off the top of my head I'd say it's still doable in an easy manner. All it takes is that you ensure all objects on the heap are 8-byte aligned. That buys you the extra tag value needed to signal that an object in the old heap contains a forwarding pointer after it has been relocated.

nilern wrote:
FemtoLisp is a small and elegant Lisp. It cannot be - and is not meant to be - particularly efficient, so this is an unimportant detail.
Brian wrote:
I have only looked at the very small FemtoLisp versions (I cannot remember if there is a larger one, more geared toward production use), and here an extra tag value for a forwarding pointer isn't even necessary. The only objects on the heap are cons cells. This means that an untagged pointer (as if it had the tag "TAG_NUM") can be used to signal that an object in the old heap contains a forwarding pointer. A pleasant side effect is that this makes the handling of forwarding pointers as efficient as it possibly can be.
It's not particularly inefficient either. Just thought to mention that the breadth-first algorithm is not strictly superior.
Indeed there is a main production version and it has vectors and C interop values and whatnot.
Forwarding pointers are already implemented more or less like that. The problem is finding the start of the next object in the tospace queue. If there were just cons cells we could just bump a value_t pointer by two. When there are many types their sizes have to be stored on the heap in some uniform manner and the start of an object has to be distinguishable from alignment holes if some objects are more than word-aligned. Of course there is also the Big Bag of Pages arrangement (i.e. group the objects by type so headers are not needed) but I really don't think we want to go there.
nilern wrote:
It's not a problem to find the start of the next object, and you don't need to store object sizes for all object types. It's easy and efficient to use a lookup table indexed by object type, reserving the return value "length = 0" to represent "variable-length object type". All variable-sized objects on the heap contain their length, or they contain enough information to compute the length on the fly, including alignment. Strictly speaking, variable-sized objects don't even need to store their lengths in a uniform manner; uniformity is more a matter of convenience for the programmer.
Yes, if we have the type at hand. But the Cheney scan pointer cannot be a tagged pointer, so we don't have the tag bits of the current object, so there is nothing to index with; the type is unknown. Also if there are uninitialized alignment holes we can't get the alignment of the next object because we don't know where it starts so we can't hop over the hole to its start to get the alignment...
"Uniform" was a poor choice of words. What I meant is that the heap contains enough "reflection" information for the GC to traverse it in order of addresses instead of following pointers. The specific term is "heap parsability". Actual uniformity does also have some performance advantage but I don't know how significant that is, especially in this context.
nilern wrote:
That's not true. You do have the tag bits of the current object. An object copied into the new heap is complete, but its pointers to other objects are only updated when the object is scanned. There is nothing difficult in implementing a Lisp heap and using Cheney's algorithm for the garbage collection. I'm sure the necessary minor modifications can be accomplished in a day's work or so.

nilern wrote:
Alignment is no problem. There are no gaps. It's the responsibility of variable-sized objects to always allocate memory up to alignment, if necessary. If the implementation doesn't do that already, then it's your job to complete the task. If it makes things easier and more efficient, then add a field with the total length to the variable-sized objects. These objects are large enough that an extra field doesn't matter.

nilern wrote:
Uniformity in itself doesn't matter for performance, but it may make things easier for the programmer, and if there is one efficient method that works well for all object types, then there is no reason to do it differently. But strictly speaking, since there are individual pieces of code for scanning each object type anyway, object type A is welcome to use one method for computing or storing its length, while object type B uses a different strategy.
Brian wrote:
Ah, you're right, nilern. I woke up this morning finally understanding what you say. Your initial description of the problem sent me in the wrong direction by saying that the problem is the omission of object header fields. It's perfectly all right to omit an explicit object header field, also for a copying garbage collector like Cheney's algorithm. All it takes is, of course, that the heap can still be parsed, as you also point out.

To ensure that the heap can be parsed, the program should store the object type in the form of tag bits in the first field of the object. This is what I - very naturally - assumed the program was doing, because that's necessary in order to make things work with a copying garbage collector. This is the background for everything I have written until now. There is room for the type tag in the low bits of the first field of each object, and in effect this works as a minimal object header, without adding to the memory footprint.

So what you're pointing out is that FemtoLisp has made the design mistake of tagging not the objects, but the pointers to objects. That's a near-fatal mistake when the intention is to use a copying garbage collector. As I see it, there are two choices to turn FemtoLisp into a usable (i.e., non-crashing) Lisp implementation:

1. Change the tagging so it tags objects instead of the pointers to the objects, thereby making it a clean and straightforward design. This modification is probably not as complicated as it may sound. The "is-object-type?" functions and the "extract-pointer" functions are already isolated in macros. It's funny: this operation confirms the old saying that there is nothing in programming that cannot be solved by adding an extra layer of indirection :-)

2. While garbage-collecting, use a vector with one byte per heap Lisp cell to store the type of the relocated object starting at that cell, if any. The vector is not needed anymore after the garbage collection.
Maybe this is the easiest and safest way to make the program work without changing anything else in the program, but according to my taste it's the less attractive alternative because of its lack of elegance.
The tag bits of the first field are already used for the type of the field value. The object type does not need to be stored in the heap to use a copying collector. All that is needed is a way to convert a heap object into a "broken heart" which stores a forwarding pointer and can be identified as such by the GC. Femtolisp does as much and implements copying collection just fine (except for the stack overflow).
I would not call it a mistake. It saves space and makes type tests faster. But it does severely limit the choice of GC algorithms.
Nevertheless it is quite a fundamental change. These things are also subtle and tedious to debug.
That feels like a kludge.
It would also be possible to retain the current algorithm by removing the recursion and using a heap-allocated stack instead of the rather small C stack. At least for mark and sweep there is also the pointer-reversal trick that stores the mark stack in the objects themselves (the "Deutsch-Schorr-Waite algorithm").
nilern wrote:
Yes, that's how it is now, but just for the record, there was nothing wrong in what I said: I suggested doing it differently, i.e., storing the type tags in the objects and not the pointers.

nilern wrote:
That's right, and to me that sounds like the best choice, given the current architecture.

nilern wrote:
I also considered that alternative briefly, but I would have to think more about it before I know for sure whether it can be done without running into the problem of the missing types in the objects. Upon returning to a pointer slot, quite a lot of information is required to know how to proceed.
Brian wrote:
A heap-allocated stack is not the right choice, but the good news is that considering the stack as one of the few good possibilities provides the killer argument against choosing anything but the natural solution. The memory footprint of the stack is theoretically exactly the same as if a header field were added to each object in the heap. This means that there is no argument against adding such a header field. That's a big relief, because there is no need to feel bad about spending memory on the header field. The header field contains the obvious pieces of information for the copying garbage collector:

A. The object size, including alignment, in the upper bits, thereby providing the size in a uniform and efficient manner, so the copying garbage collector knows where the next object begins.

B. The object type in the lower bits, so the garbage collector knows what to do with the current object.

It's a win-win situation. It's the easiest way to implement the copying garbage collector, and it's efficient. The added memory footprint is of no concern. All the existing code runs unmodified, with its types stored in pointers. The only changes to the source code are a few added lines in the object constructor functions; they build the header fields for their respective object types. So the bottom line is that there is no reason to look for alternatives to the natural solution.
When running

julia --lisp

(so using Julia's version of Femtolisp) with this Scheme file: the result is a SIGSEGV due to a stack overflow (confirmed by Valgrind). I suspect this is in the garbage collector, since replacing car with cdr (and set-car! with set-cdr!) takes much longer before segfaulting (possibly due to a use-after-free error, most likely due to integer overflow).