Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not generate names for typedefed anonymous structs and enums #427

Closed
jethrogb opened this issue Jan 24, 2017 · 13 comments
Closed

Do not generate names for typedefed anonymous structs and enums #427

jethrogb opened this issue Jan 24, 2017 · 13 comments

Comments

@jethrogb
Copy link
Contributor

jethrogb commented Jan 24, 2017

A common pattern in a lot of C code is to do

typedef struct /* no name here */ {
    // ...
} my_struct;

bindgen 0.20 translates this using a made-up type name and a re-export. This makes the rustdoc documentation look awful (and potentially error messages as well). These bogus type names were not introduced in bindgen 0.19. I think there is no reason not to detect this common pattern and avoid them.

@emilio
Copy link
Contributor

emilio commented Jan 24, 2017

The thing is that detecting this common pattern is not that easy in the case of C++. Old bindgen did this looking at the AST of libclang, and got it wrong a lot of times, over all in presence of anonymous structs in classes (which I believe are also legal in C11).

I'm not really fond in supporting this. What we could do, is using a type alias instead of a re-export. The re-exports are done for convenience when using enums (#396), but we can definitely make it configurable.

@emilio
Copy link
Contributor

emilio commented Jan 24, 2017

BTW, thanks for filling all these issues, they're really helpful.

@jethrogb
Copy link
Contributor Author

I don't think type aliases would fix the documentation or the error messages.

pub struct bindgen_ty {
    pub a_field: u32
}
pub type Typed = bindgen_ty;
pub use bindgen_ty as Used;

fn main() {
    let t: Typed = Typed{a_field:2};
    t.not_a_field;
    let u: Used = Used{a_field:2};
    u.not_a_field;
}
error: no field `not_a_field` on type `bindgen_ty`
 --> <anon>:9:7
error: no field `not_a_field` on type `bindgen_ty`
  --> <anon>:11:7

Besides, type aliases are not the same as re-exports. Crucially, tuple-struct constructors are not available through a type alias.

@jethrogb
Copy link
Contributor Author

What's hard about this? Is such a definition split into two AST nodes with an anonymous struct definition and a related typedef?

@jethrogb
Copy link
Contributor Author

BTW, thanks for filling all these issues, they're really helpful.

Thanks for being so receptive of my issues! Otherwise I'm pretty happy with the new features such as string macros, 128-bit integer support, probably more...

@emilio
Copy link
Contributor

emilio commented Jan 26, 2017

What's hard about this? Is such a definition split into two AST nodes with an anonymous struct definition and a related typedef?

Yes, I remember it being specially tricky to handle correctly under some circumstances in C++, that's why I used this approach.

@emilio
Copy link
Contributor

emilio commented Jan 26, 2017

TL;DR: This is how the simplest case looks like:

(kind: StructDecl, spelling: , type: Record
)
(kind: TypedefDecl, spelling: Foo, type: Typedef
	(kind: StructDecl, spelling: , type: Record
	)
)

That is, first we find the anonymous struct, then we find a typedef declaration that happens to contain a node pointing to that same struct. I guess I could try to handle it somewhat correctly given the definition of the anonymous struct (I expect) points to the inner item, but it's tricky.

@emilio
Copy link
Contributor

emilio commented Jan 26, 2017

I hadn't look into this for a while, actually doesn't look that terrible if what I say above is true, we could try to special-case this. But need to see how more complex cases are handled.

@emilio emilio added E-less-easy and removed E-hard labels Feb 2, 2017
@kornelski
Copy link
Contributor

This is a huge problem for me. In C the typedef struct pattern is very very common. There are C libraries which don't even use named structs, and rely on the typedef pattern exclusively. This means that all generated code is hard to understand gibberish of _bindgen_ty_35.

@kornelski
Copy link
Contributor

kornelski commented Feb 15, 2017

I've tried to implement this, but there are too many things called Item and Kind for me to follow, and name() is sometimes from a trait and requires context and sometimes it doesn't :/

Given this header:

typedef struct {
} TYPEDEFED_NAME;

In bindgen items there is an item Item { id: ItemId(3) … kind: Type(Type { name: Some("struct TYPEDEFED_NAME"), so bindgen knows that there's a struct with a name from a typedef, but I don't know how to get to that field when generating canonical_name.

@emilio
Copy link
Contributor

emilio commented Feb 15, 2017

So the way to implement this is not in the code generation phase, but on the type parsing phase. Concretely, in the CXType_Record branch in src/ir/ty.rs, you'd do something check whether the current cursor spelling is empty.

If so, you'd have to walk to its next sibling (this would be hacky, but you'd have to re-visit the cursors I think), and see if the exact next cursor is a CXCursor_Typedef with a CXCursor_RecordDecl pointing to the same type. If so, you can just use the CXCursor_Typedef name instead of the record name (which is empty).

An alternative approach, which may work better, is looking at the type definition, and see if it's inside a typedef. And if so, do the same.

@emilio
Copy link
Contributor

emilio commented Feb 15, 2017

That is to say, you don't want to look at the Items, you need to do this work while constructing the item, that is, looking at the clang AST.

You can look at the clang ast with bindgen --emit-clang-ast foo.h.

@kornelski
Copy link
Contributor

kornelski commented Feb 15, 2017

In from_clang_ty, CXType_Record's spelling is the name of the typedef. It looks like CompInfo::from_ty isn't even trying to save it.

edit: oh!

let mut name = cursor.spelling();

changing to

let mut name = ty.spelling();

makes it save the struct name, although I see that ty.spelling() is quite freestyle and can contain "struct (anonymous at test.h:1:9)".

bors-servo pushed a commit that referenced this issue Feb 16, 2017
typedef struct {} name

Fixes #427

It looks like clang is doing the hard work of getting the right name from the typedef, but it falls back to arbitrary pretty-printed descriptions if it can't find a typedef. I couldn't find an API to check whether the name comes from a typedef, so I just filter out non-ident-like spellings.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants