Add C source information to `Hs` and `SHs` ASTs #316
Comments
One idea is to change
data Namespace =
    NsOrdinary
  | NsStructTag
  | NsUnionTag
  | NsEnumTag

newtype CName (ns :: Namespace) = CName { getCName :: Text }
...
This type does not include member namespaces, which are separate per structure/union. When I discussed this with @edsko yesterday, he suggested that we may want to model member namespaces as well, perhaps using a GADT. C identifier namespace reference: |
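A minimal sketch of how a namespace-indexed name could be used, and of one possible way to model member namespaces with a GADT. It uses `String` in place of `Text` to stay within `base`. The `RenderNs` class, `renderCName`, `ScopedName`, and `describe` are hypothetical names introduced only for illustration; they are not taken from the project.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

-- C identifier namespaces, as proposed in the comment above.
data Namespace
  = NsOrdinary
  | NsStructTag
  | NsUnionTag
  | NsEnumTag

-- A C name tagged with its namespace at the type level.
-- Assumption: String is used here instead of Text.
newtype CName (ns :: Namespace) = CName { getCName :: String }

-- Hypothetical class: the keyword C requires in front of a name.
class RenderNs (ns :: Namespace) where
  nsKeyword :: CName ns -> String

instance RenderNs 'NsOrdinary  where nsKeyword _ = ""
instance RenderNs 'NsStructTag where nsKeyword _ = "struct "
instance RenderNs 'NsUnionTag  where nsKeyword _ = "union "
instance RenderNs 'NsEnumTag   where nsKeyword _ = "enum "

-- Write a name the way C source code must reference it.
renderCName :: RenderNs ns => CName ns -> String
renderCName n = nsKeyword n ++ getCName n

-- Hypothetical GADT: a member name only makes sense together with
-- the structure or union that owns it.
data ScopedName where
  TopLevel     :: RenderNs ns => CName ns -> ScopedName
  StructMember :: CName 'NsStructTag -> String -> ScopedName
  UnionMember  :: CName 'NsUnionTag -> String -> ScopedName

describe :: ScopedName -> String
describe (TopLevel n)           = renderCName n
describe (StructMember owner m) = "member " ++ m ++ " of " ++ renderCName owner
describe (UnionMember owner m)  = "member " ++ m ++ " of " ++ renderCName owner
```

With these definitions, `renderCName (CName "foo" :: CName 'NsStructTag)` evaluates to `"struct foo"`. The sketch only covers members of tagged structures and unions, so it does not address anonymous nested declarations.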
it doesn't work in general. Counterexample:
struct foo {
  struct { int x; int y; } bar;
  int z;
};
we will create
More generally, what you propose is essentially delaying name mangling. |
Thanks for the example! I think that we should track how parts of our generated ASTs are created, so in this case we could include information for
I do not mean to suggest that name mangling should be delayed. A
Regarding documentation, we could generate documentation like the following:
|
which is essentially preserving exactly the information (to be) passed to the name mangling machinery. (And the source location, but that is a purely informative bit.) |
Indeed. Perhaps another way to put it is that name mangling is not invertible. When generating tests, name
Documentation like the above example could help users understand/confirm which Haskell maps to which C. When a user sees name |
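To illustrate what "not invertible" means here, the following sketch uses a deliberately simple, hypothetical mangler (it is not the project's name mangling). Two distinct C names map to the same Haskell name, so the C name cannot be recovered from the result unless it is stored alongside it. `mangle`, `HsName`, and `mangleWithOrigin` are made-up names.

```haskell
import Data.Char (toUpper)

-- Hypothetical mangler, NOT the project's: drops underscores and
-- capitalizes the first letter and every letter following an underscore.
mangle :: String -> String
mangle = go True
  where
    go _  []         = []
    go _  ('_' : cs) = go True cs
    go up (c : cs)   = (if up then toUpper c else c) : go False cs

-- Keeping the original C name next to the mangled name avoids having
-- to invert the mangling later.
data HsName = HsName
  { hsName   :: String  -- mangled Haskell name
  , hsOrigin :: String  -- C name it was created from
  }

mangleWithOrigin :: String -> HsName
mangleWithOrigin c = HsName (mangle c) c
```

Under this mangler both `mangle "foo_bar"` and `mangle "fooBar"` evaluate to `"FooBar"`, while `hsOrigin` still distinguishes the two.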
I think that the easiest solution to go forward is to add a field with
That string is not |
Thank you very much for the suggestion. I will give that a try. |
This commit adds a type spelling field to the following ASTs:

* `C` phase: `Struct`, `Enu`, `Typedef`
    * Type: `Text`
* `Hs` phase: `Struct`, `Newtype`
    * Type: `Maybe Text`
* `SHs` phase: `Record`, `Newtype`
    * Type: `Maybe Text`

As suggested in #316, the string is not parsed. This commit simply uses `Text`, but we could implement a `newtype` wrapper if desired.

The goal is to make it easier to generate tests for structures, enumerations, and `typedef`s of structures/enumerations. Note that this is also required for unions, but those are not implemented yet.

This is the minimal change required to do this; the `Maybe` is needed because it does *not* track the type spelling for macros.

(Cherry-picked from `source-info` for experimentation)
After discussing this with @TravisCardwell, the idea to annotate the Haskell tree with some kind of
is useful for both test generation and documentation generation; @phadej's objection that the proposal as originally stated doesn't quite work ("which doesn't correspond to any type we could reference in C.") points to especially important cases to consider, both for tests and for documentation generation; we agreed that Travis will submit a draft PR with an initial attempt at this so that we have something concrete to discuss and refine. |
While adding source locations to the C AST, I ran into an issue with forward declarations. When translating example I am not sure how to resolve this. When the lookup in |
@TravisCardwell EDIT: I guess you did update. |
@phadej Thank you very much for the assistance. I indeed rebased on the current
By "the lookup in
processTypeDecl unit ty sloc = do
    dtraceIO "processTypeDecl" ty
    s <- get
    case OMap.lookup ty (typeDeclarations s) of
        Nothing -> processTypeDecl' PathTop unit ty sloc
        Just (TypeDecl t _) -> dtraceIO "processTypeDecl*" () >> return t
        Just (TypeDeclAlias t) -> return t
        Just (TypeDeclProcessing t') -> liftIO $ fail $ "Incomplete type declaration: " ++ show t'
My reasoning is that when we process the "primary" declaration we have already processed it via the forward declaration and that
Instead of just |
Isn't "primary" source location the one returned by |
I didn't rename But it's the |
Using the cursor returned from
With the code refactored, I ran into issues with
From a user perspective, I would like source locations to be relative to the package directory, while sources outside of the package directory should be absolute. This would resolve the issue for the fixtures that we currently have; we would get paths like
Thoughts? Would it be acceptable to transform (these) location paths so that they are relative to the package directory?
EDIT: I implemented this in commit "Make typedef source locations relative" of the |
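The path rule described above (relative inside the package directory, absolute otherwise) can be sketched as follows. This is not the code from the commit mentioned; `relativizeTo` and `splitDirs` are hypothetical helpers, the example paths are made up, and the sketch assumes POSIX-style `/` separators and absolute input paths. The `filepath` package's `makeRelative` offers similar behaviour and would likely be the better choice in real code.

```haskell
import Data.List (dropWhileEnd, intercalate, stripPrefix)

-- Split a POSIX-style path into its components. A leading '/' yields
-- an empty first component, which keeps absolute paths distinguishable.
splitDirs :: FilePath -> [String]
splitDirs p = case break (== '/') p of
  (c, [])       -> [c]
  (c, _ : rest) -> c : splitDirs rest

-- Make a path relative to the package directory when it lies inside it;
-- leave it unchanged (absolute) otherwise.
relativizeTo :: FilePath -> FilePath -> FilePath
relativizeTo pkgDir path =
  case stripPrefix (dropWhileEnd null (splitDirs pkgDir)) (splitDirs path) of
    Just rest@(_ : _) -> intercalate "/" rest
    _                 -> path
```

For example, `relativizeTo "/home/user/pkg" "/home/user/pkg/examples/foo.h"` gives `"examples/foo.h"`, while a system header such as `"/usr/include/stdint.h"` is returned unchanged. Comparing whole components rather than raw string prefixes means that a sibling directory like `/home/user/pkgother` is not mistaken for a subdirectory.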
After a fair amount of experimentation, I realized that there is a simple solution that may be sufficient.

C phase: The C AST must contain the C names and source locations. It generally represents the actual structure of the parsed C declarations, with some differences:

Any transformation that we need to distinguish in documentation or test generation must be recorded in the AST. Currently, this is simply done by using different data structures.

Haskell phase: The Haskell AST should record why each declaration is created, using
data NewtypeOrigin =
    NewtypeOriginEnum C.Enu
  | NewtypeOriginTypedef C.Typedef
  | NewtypeOriginMacro C.Macro
  deriving Show
Some other parts of the AST track the origin in the same way, to support documentation and test generation. Specifically, we currently have a

Simplified Haskell phase: The simplified Haskell AST records origin information in the same way as the Haskell AST. At this point, there are no differences, so the Haskell AST |
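A rough sketch of how origin information like this might be consumed by documentation and test generation. The `Enu`, `Typedef`, and `Macro` types below are simplified stand-ins for the project's `C.Enu`, `C.Typedef`, and `C.Macro`, and their fields, the `Newtype` record, `originDoc`, and `cTypeForTest` are all hypothetical.

```haskell
-- Simplified stand-ins for C.Enu, C.Typedef and C.Macro.
-- Assumption: each carries just a name; the real types hold more.
newtype Enu     = Enu     { enuTag      :: String }
newtype Typedef = Typedef { typedefName :: String }
newtype Macro   = Macro   { macroName   :: String }

data NewtypeOrigin
  = NewtypeOriginEnum Enu
  | NewtypeOriginTypedef Typedef
  | NewtypeOriginMacro Macro

-- Hypothetical Haskell-phase newtype that remembers why it exists.
data Newtype = Newtype
  { newtypeName   :: String
  , newtypeOrigin :: NewtypeOrigin
  }

-- A documentation line naming the C entity behind a Haskell newtype.
originDoc :: Newtype -> String
originDoc nt = case newtypeOrigin nt of
  NewtypeOriginEnum e    -> "C enumeration: enum " ++ enuTag e
  NewtypeOriginTypedef t -> "C typedef: " ++ typedefName t
  NewtypeOriginMacro m   -> "C macro: " ++ macroName m

-- The C type expression a generated test could refer to, if any.
-- A macro does not name a C type, so no test type is produced for it.
cTypeForTest :: Newtype -> Maybe String
cTypeForTest nt = case newtypeOrigin nt of
  NewtypeOriginEnum e    -> Just ("enum " ++ enuTag e)
  NewtypeOriginTypedef t -> Just (typedefName t)
  NewtypeOriginMacro _   -> Nothing
```

Because the origin is a sum type, the compiler can warn about a missing case when a new kind of origin is added, which is one reason to prefer this over matching on a type spelling string.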
As mentioned in #316 (comment), we have to know the namespace of C names in order to generate C code. For example:
The idea mentioned in #316 (comment) requires many changes across the codebase, so for now I implemented a simple/minimal solution: adding a
-- | Declaration name
data DeclName
  = -- No name specified (anonymous)
    DeclNameNone
  | -- Structure/union tag specified
    DeclNameTag CName
  | -- Typedef name specified
    DeclNameTypedef CName
  deriving stock (Eq, Generic, Show)
  deriving anyclass (PrettyVal)
|
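As a sketch of why this distinction matters for C code generation, the function below turns a `DeclName` into the text that C source would use to reference the type. It is not the project's implementation: `CName` is simplified to a `String` alias, the deriving clauses are reduced to `base` classes, and `TagKind`, `cTypeRef`, and `sizeofExpr` are hypothetical.

```haskell
-- Assumption: CName is simplified to a plain String for this sketch.
type CName = String

data DeclName
  = DeclNameNone            -- no name specified (anonymous)
  | DeclNameTag CName       -- structure/union tag specified
  | DeclNameTypedef CName   -- typedef name specified
  deriving (Eq, Show)

-- Hypothetical: whether the declaration is a structure or a union.
data TagKind = TagStruct | TagUnion

-- How C code must refer to the declared type. An anonymous declaration
-- has no name that C code could use, so the result is Nothing.
cTypeRef :: TagKind -> DeclName -> Maybe String
cTypeRef _         DeclNameNone        = Nothing
cTypeRef TagStruct (DeclNameTag n)     = Just ("struct " ++ n)
cTypeRef TagUnion  (DeclNameTag n)     = Just ("union " ++ n)
cTypeRef _         (DeclNameTypedef n) = Just n

-- Example use: the C expression a size test could evaluate.
sizeofExpr :: TagKind -> DeclName -> Maybe String
sizeofExpr kind name = fmap (\t -> "sizeof(" ++ t ++ ")") (cTypeRef kind name)
```

For instance, `sizeofExpr TagStruct (DeclNameTag "foo")` yields `Just "sizeof(struct foo)"`, whereas `DeclNameTypedef "foo"` yields `Just "sizeof(foo)"`.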
I ported the current test generation code to work with Haskell AST that includes origin information. IMHO, it is much better than the previous implementation that used the C type spelling to "join" generated Haskell AST with the C AST because there are far fewer error cases. The test generation is an initial implementation with very broad strokes. Many things need to be improved/implemented. For example, the current implementation only supports primitive types. It does not yet support nested structures, pointers, The test generation AST is very high-level, as (I think that) we will always generate pretty-printed code. Using TH is not an option with C source generation. |
Done in #338. |
We would like to track how various parts of our generated ASTs are created, referencing the C source.

One motivation for this is test generation (#22), which requires generating both C and Haskell code for testing. Generating the C test code from a C `Header` is not a good option because we would need to reimplement a lot of the logic for translating from C to Haskell. If the Haskell AST includes C source information, we could traverse the Haskell AST and determine exactly what test functions are required, perhaps referencing the C `Header` to get C details.

For example, for a given `data` declaration (called `Struct` in `Hs` and `Record` in `SHs`), it is useful to know the name of the corresponding C type, including the C namespace. When generating C code, the namespace determines how an identifier such as `foo` is written:

* `struct` tag namespace: `struct foo`
* `union` tag namespace: `union foo`
* `enum` namespace: `enum foo`

C source information can also be used to improve the generated Haddock documentation (#26). For example, we could output corresponding C names to help users understand/confirm which Haskell maps to which C.

We should include source locations, which may optionally be output in `LINE` pragmas (#74). We could even consider including source location information in generated Haddock documentation.

Related to #23 (which is for the high-level API)
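As an illustration of the `LINE` pragma idea, the sketch below renders a source location as a GHC `LINE` pragma that could precede a generated declaration. The `SourceLoc` record and `linePragma` function are hypothetical and the file name is made up; only the pragma syntax itself comes from GHC.

```haskell
-- Hypothetical source location: file path and 1-based line number.
data SourceLoc = SourceLoc
  { slFile :: FilePath
  , slLine :: Int
  }

-- Render a GHC LINE pragma for a location. Using 'show' on the path
-- adds the surrounding quotes and escapes any special characters.
linePragma :: SourceLoc -> String
linePragma loc =
  "{-# LINE " ++ show (slLine loc) ++ " " ++ show (slFile loc) ++ " #-}"
```

For example, `linePragma (SourceLoc "foo.h" 42)` produces `{-# LINE 42 "foo.h" #-}`, which makes GHC report subsequent errors and warnings as coming from that line of the header.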