Generated POSIX/C bindings + FreeBSD support #2442
TODO:
Awesome!! I won't be able to look at this these days, because I'm finishing implementing #2390 and it's a huge change. I have a few comments and questions :-)
e526025 to 5c272bf
@asterite here is a bench. On the left is the
Green, eventually :D BTW: encoding specs are failing on FreeBSD because both GB2312 and UCS-2LE aren't supported, either in encoding or decoding (or so it seems). musl-libc has fewer failures, since only encoding (or was it decoding?) to GB2312 isn't supported.
The difference is more noticeable when compiling Crystal: posix branch on the left, master branch on the right, each one compiling its own branch. 40MB difference in memory usage isn't negligible. Otherwise compilation time is roughly on par.
Yeah, a 40MB difference is a lot, though maybe it can be reduced by changing the internal structure of an ASTNode (right now the base type occupies about 72 bytes, a fun definition about 144 bytes), and by reusing some nodes (for example the same type is mentioned over and over in a lib), but I'm not sure.
I think some added constants and functions may be welcome in the near future (eg: sockets, fcntl, unistd) to expand the stdlib, some will stay useless (eg: limits?, obscure functions from stdio, string...), while some others are trickier: will we need the pthread types (attr, barrier) or process scheduling (sched, pthread sched params)? Or are those so far-fetched that it's useless to include them now?
With @waj we always had the idea of removing as many C bindings as possible from the standard library in order to make it more portable and independent. That's why I don't think it's a good idea to include a lot of C bindings, especially if they aren't going to be used right now.
8c5d50d to 675f6b4
I merged/squashed from master and reduced the generated bindings. There are still a few more constants and C functions than strictly required, but most are related to constants we use now, or some missing stdlib features (eg: For example I junked
Simplified a bit more, by removing bindings for
A Crystal binary for FreeBSD (that can compile this branch) is also available:
@asterite I now don't notice a difference in memory usage with the master branch anymore!
@ysbaddaden Great! One question: why is CRYSTAL_PATH modified in

I think this PR really cleans things up and organizes the code (C code is separate from Crystal code). However, I would still like to keep C bindings to a minimum and not add extra bindings that aren't needed right now. Ideally, the compiler should auto-generate these bindings as needed. What we had in mind is that you'd write something like:

lib LibC
  @[Import]
  fun write
end

or something like that, and the compiler would complete the function with the correct argument types, maybe even declaring the necessary structs (although it would be better to have to import them explicitly too, because renaming things from C to Crystal is hard). The compiler would use libclang for this, so the compiler would always depend on it. This way we don't have to maintain sources for each platform that we support, and we only import functions/structs that are really needed.

Some of this is already done by crystal_lib; integrating it nicely into the compiler is the hard/tricky part. (crystal_lib allows importing many functions at once with a prefix, but I think that's an anti-feature because of name conflicts; it's better if we are explicit about which functions/structs are imported.) The compiler could cache the generated bindings so it doesn't have to generate them every time, unless the source file that declares them changes.

The only problem with this is that you won't be able to cross-compile to a new platform, because a compiler for that platform doesn't exist yet. Maybe in that case we could allow specifying a path in which to look up C headers, and when cross-compiling to a new platform you'd specify that.

Do you think that all of this makes sense and is possible to implement? I really wouldn't mind depending on libclang if writing C bindings becomes easier and less source code is needed. Also note that this will be useful for any C library in any shard. I really think this is the ideal solution for this problem.
I see your posix project generates these bindings from a bunch of YAML files. If I'm not mistaken, what we would need to do is to generate these bindings from lib declarations + some attributes in crystal's source code. With a first pass we could gather all of these and then pass them to posix (which would be inside the compiler's source code now... we can later think how to extract it into a separate project and make the compiler depend on it, if we want to). So the whole thing would consist of:
Makes sense? Possible? Any reason not to go in this direction?
So, a syntax for how to specify what and how to import can be:

# This attribute tells the compiler to generate the body of the given lib.
@[Import({
  includes: "dlfcn",
  libraries: {
    darwin: "dl",
    gnu: "dl",
  },
  constants: %w(RTLD_LAZY RTLD_NOW RTLD_GLOBAL RTLD_LOCAL),
  structs: "Dl_info",
  functions: %w(dlclose dlerror dlopen dlsym dladdr),
})]
lib LibC
end

That's more or less what you have in your YAML, but translated as an attribute value. The compiler can be permissive about some things, like allowing a single string literal instead of an array of string literals when expecting an array, or internally doing

I think this way (a single attribute) is better than

And of course this can be used for other C libraries:

@[Import({
  includes: "gmp",
  libraries: "gmp",
  structs: %w(MPZ MPQ MPF),
  functions: %w(__gmpz_init __gmpz_init2 __gmpz_add),
})]
lib LibGMP
end
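For comparison, here is a minimal hand-written sketch of roughly what such an attribute would have to expand to for the dlfcn example. Everything in it is an assumption for illustration: the lib name LibDLExample is hypothetical (chosen so it doesn't clash with the LibC that ships with Crystal), the constant values are the ones found in glibc's headers on x86_64 Linux and differ on other platforms (which is why per-target generation matters), and the struct follows the Dl_info layout documented for dladdr.

```crystal
# Hypothetical hand-written equivalent of the generated bindings.
# Constant values are assumed from glibc on x86_64 Linux; other targets differ.
@[Link("dl")]
lib LibDLExample
  RTLD_LAZY   = 0x001
  RTLD_NOW    = 0x002
  RTLD_GLOBAL = 0x100
  RTLD_LOCAL  = 0x000

  # Mirrors C's Dl_info, renamed to follow Crystal's naming rules.
  struct DlInfo
    dli_fname : UInt8*
    dli_fbase : Void*
    dli_sname : UInt8*
    dli_saddr : Void*
  end

  fun dlclose(handle : Void*) : Int32
  fun dlerror : UInt8*
  fun dlopen(file : UInt8*, mode : Int32) : Void*
  fun dlsym(handle : Void*, name : UInt8*) : Void*
  fun dladdr(address : Void*, info : DlInfo*) : Int32
end

# Passing nil as the file name asks for a handle to the main program.
handle = LibDLExample.dlopen(nil, LibDLExample::RTLD_LAZY)
if handle.null?
  puts "dlopen failed: #{String.new(LibDLExample.dlerror)}"
else
  puts "dlopen succeeded"
  LibDLExample.dlclose(handle)
end
```

Writing this by hand for every target means looking up each constant and struct field in that platform's headers, which is the work the attribute (or the posix generator) is meant to take over.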
This PR is a first step towards a nicer LibC integration: an overall cleanup (separate C and CR) and a simpler way to target new platforms, namely BSD, Solaris, Cygwin and Android (NDK) or iOS. Maybe we don't need every target to be included in the source code. Maybe we could generate them for the target only when distributing Crystal?

Always generating the bindings for the platform we're compiling on would be great. In particular, it would avoid having static bindings in the code and having to update them for each target whenever we add a single fun or type, which is boring. The static bindings are still better than the existing situation, though, since we don't have to dig manually through C headers to find every constant, struct definition, ... which is time-consuming and prone to stupid human errors.

Always generating bindings may make cross compilation harder, though, or maybe not if we use a cache folder? It means targeting a new platform may be harder, as may compiling to Android (NDK) or iOS. We thus still need a way to have many targets in parallel, so I guess the CrystalPath changes would still be required.

BTW: the
Ideally all code should be written in Crystal except things that would be very hard to do like that. Specifically: syscalls. That's why I don't think it's a good idea to bring so many C functions and constants into LibC. For example there's no point in using

The worry we have about cross-compiling to a new platform: how did you generate C bindings for FreeBSD without a compiler for it? Or did you first create a compiler and then run
Sorry, I should have explained that: I "cross-compiled" the FreeBSD bindings, as well as most of the bindings! I copy-pasted
And libclang took the provided headers for FreeBSD, not the system ones. Then I could cross-compile a bootstrap compiler that I linked on the FreeBSD install (linking took 10 minutes!) and then it could compile itself! Well, it took a bunch of cross compilations to achieve it, most of them because of libgc: I struggled to understand I needed the

It only failed to build correct bindings for

Look at the bindings now, I dropped most of

I kept the naming of headers so we have a clear mapping of CR -> C. It would also allow POSIX to become a
@ysbaddaden Could you rebase one more time? I'll try to discuss this with @waj today and see if we merge. I think even without the smart bindings that we want to have this is still a very good change as it separates C bindings from other code, and of course it makes it work with FreeBSD. So it seems that for cross-compiling to a new platform we could still use the smart, on-the-fly, bindings by specifying CPATH and C_INCLUDE_DIR, right? :-)
Yes, I cleaned a bunch more definitions from bindings, and they should now be down to just what's required plus one or two extras like

I didn't update the PR yet. Please tell me what you and @waj think and then I'll rebase (or not, sob)!
@ysbaddaden I just talked with @waj. We concluded that:
So, yes, please rebase :-) Then I'll test this locally, check if there are no problems with

Thank you! ❤️
TODO: revert after the next release of Crystal.
@asterite rebased & tidied up!
@ysbaddaden Awesome! Travis is green, so I'm merging this. Should we keep the commits or should I squash them into a single one?
I think I'll merge, the commits are separated really nicely :-) Thank you!! ❤️
@myfreeweb how much time did it take? For me it takes about 15~20 seconds. So yes, I guess that's fast :-)
@asterite yeah, about 15 seconds. I'm used to small Rust projects taking much longer!
@myfreeweb let's hope MIR improves things for Rust :-)
I wrote a project to generate POSIX/C bindings for Crystal automatically given a set of C headers: posix. It started out using CrystalLib, but eventually evolved to use only the CrystalLib parser, in order to clean up the bindings as much as possible: for example, avoiding many useless and ugly definitions such as alias X__SizeT = SizeT, or being able to specify which types and structs to generate so that LibC can be split across many files instead of just one.

I eventually tried to use the generated headers with Crystal, cleaning up the core/stdlib source code along the way. Fortunately it only required a few tweaks for each platform. Even integrating a new UNIX target (FreeBSD x86_64, e854c2d) was done in a few hours, most of them spent on libgc, waiting for the cross-compiled compiler to be linked (~10 minutes in my VM), or dealing with port issues, rather than grepping include files for each constant or struct definition, which is very prone to human error.

This pull request doesn't integrate the posix project, but rather adds the generated headers for each target. This takes a non-negligible amount of space for mostly redundant code that could be generated cleanly for each OS. Yet the posix project is a pile of hacks on top of CrystalLib, and I don't think it would fit properly here; it would also add clang as a permanent requirement.

The con is that the target LibC folder must be injected into CRYSTAL_PATH. This is done automatically in Crystal::CrystalPath for known targets (see 8138ce7), but obviously the current compiler doesn't do that, so I modified bin/crystal to do the job until the next release of Crystal (b13e5f1).

Crystal only requires a subset of all the headers the posix project generates. I limited the generated headers here, but I think we could limit them more. For example, the headers below are only there because POSIX says they should be included automatically by headers Crystal uses. Maybe they'll come in handy, or not? Tell me.

limits.h
locale.h
sched.h
math.h (I found out Crystal uses LLVM math)
stdint.h
Notes:
closes #1653 #1413
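
A rough sketch of the CRYSTAL_PATH injection mentioned above, assuming each entry simply gains a target-specific LibC folder. The helper name paths_with_target, the lib_c folder name and the target triple are all hypothetical; this is not the actual Crystal::CrystalPath code from 8138ce7.

```crystal
# Hypothetical helper, NOT the real Crystal::CrystalPath implementation.
# For every CRYSTAL_PATH entry, also search "<entry>/lib_c/<target>".
def paths_with_target(crystal_path : String, target : String) : Array(String)
  entries = crystal_path.split(':').reject(&.empty?)
  entries.flat_map { |entry| [entry, File.join(entry, "lib_c", target)] }
end

# Assumed example values: a two-entry path and a FreeBSD target triple.
puts paths_with_target("src:libs", "x86_64-unknown-freebsd").join(':')
```

Keeping the plain entry ahead of its target folder means platform-independent code is found first, and only the require for a C header binding falls through to the per-target directory.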