Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eui64_t type needs a cleanup #14690

Closed
maribu opened this issue Aug 4, 2020 · 22 comments
Closed

eui64_t type needs a cleanup #14690

maribu opened this issue Aug 4, 2020 · 22 comments
Assignees
Labels
Area: network Area: Networking Discussion: needs consensus Marks a discussion that needs a (not necessarily moderated) consensus State: don't stale State: Tell state-bot to ignore this issue

Comments

@maribu
Copy link
Member

maribu commented Aug 4, 2020

Description

There was some backlash on consistently using the eui64_t type for EUI-64 identifiers, mostly due to the type being messy.

Steps to reproduce the issue

Check the definition of the type.

Expected results

Something reasonable.

Actual results

I replaced all network_uint<NUM>_t with an inline union network_uint<NUM> definition, so that the full depth of the definition is easier to see.

union {
    union network_uint64 {
        uint64_t u64;
        uint32_t u32[2];
        uint16_t u16[4];
        uint8_t u8[8];
        union be_uint16 {
             uint16_t u16;
             uint8_t u8[2];
        } b16[4];
        union be_uint32 {
             uint32_t u32;
             uint16_t u16[2];
             uint8_t u8[4];
             union be_uint16 {
                 uint16_t u16;
                 uint8_t u8[2];
             } b16[2];
        } b32[2];
    } uint64;
    union network_uint32 {
         uint32_t u32;
         uint16_t u16[2];
         uint8_t u8[4];
         union be_uint16 {
             uint16_t u16;
             uint8_t u8[2];
         } b16[2];
    } uint32[2];
    union network_uint16 {
         uint16_t u16;
         uint8_t u8[2];
    } uint16[4];
    uint8_t uint8[8];
};

Versions

Current master

@maribu
Copy link
Member Author

maribu commented Aug 4, 2020

Option 1 for a cleanup (vote with thumbs up / down):

typedef network_uint64_t eui64_t

@maribu
Copy link
Member Author

maribu commented Aug 4, 2020

Option 2 for a cleanup (vote with thumbs up / down):

typedef union {
    uint8_t u8[8]; /**< EUI-64 as byte array. */
    uint64_t u64;  /**< EUI-64 as number. Must be **big** **endian**!!!11eleven */
} eui64_t

@maribu
Copy link
Member Author

maribu commented Aug 4, 2020

Option 3 for a cleanup (vote with thumbs up / down):

typedef struct {
    uint8_t u8[8]; /**< EUI-64 as byte array.  Must be **big** **endian**!!!11eleven */
} eui64_t

Update: Don't using union, use struct here.

@maribu
Copy link
Member Author

maribu commented Aug 4, 2020

Two arguments against Option 3:

  • The EUI-64 is actually defined as 64 bit unsigned integer by the IEEE, and all references to e.g. the bit used to distinguish locally vs globally unique identifiers are in regard to this uint64_t representation.
  • The code gets less clean, as e.g. comparing for equality is now memcmp() instead of ==, and conversion between endianess (we have both hardware working in little and big endian) can no longer use byteorder_htonll() and friends for conversion.

@maribu maribu added Discussion: needs consensus Marks a discussion that needs a (not necessarily moderated) consensus Area: network Area: Networking labels Aug 4, 2020
@kaspar030
Copy link
Contributor

Two arguments against Option 3:

* The EUI-64 is actually defined as 64 bit unsigned integer by the IEEE, and all references to e.g. the bit used to distinguish locally vs globally unique identifiers are in regard to this `uint64_t` representation.

* The code gets less clean, as e.g. comparing for equality is now `memcmp()` instead of `==`, and conversion between endianess (we have both hardware working in little and big endian) can no longer use `byteorder_htonll()` and friends for conversion.

I think these can be mitigated by wrapping in functions, e.g., "eui64_equal(*a, *b)", "eui64_is_global(*a)", "eui64_from_u64()".

When is conversion between endianness necessary other than for displaying to the user?

Anyhow, I'd try to compare endianness bugs (which show up right away when not working) to alignment bugs (which show up in the field, when somehow the memory layout is different than in the test vectors). Alignment errors are a different class of bugs, which IMO should have priority to avoid or fix.

Also, I'm impressed that having a 20 line type definition is even considered to be preferable to a simple byte array. :)

@maribu
Copy link
Member Author

maribu commented Aug 4, 2020

When is conversion between endianness necessary other than for displaying to the user?

The network stack can set the L2 address and will always do so in big endian. The driver has to convert, if little endian is used by the hardware.

@maribu
Copy link
Member Author

maribu commented Aug 4, 2020

Anyhow, I'd try to compare endianness bugs (which show up right away when not working) to alignment bugs (which show up in the field, when somehow the memory layout is different than in the test vectors). Alignment errors are a different class of bugs, which IMO should have priority to avoid or fix.

This is generally true and I agree.

However, we still managed to miss an endianess bug in the at86rf2xx driver. Using a magic type will certainly not make this kind of bugs going away, even though a member name like be_u64 would help against confusion. But that alone is no guarantee that this bug does not pop up.

It would be nice to have a generic test for this, which would work on top of any IEEE 802.15.4 driver. Sadly, if the bug is applied consistently on both setting and getting the L2 address, this is not trivially to detect. I think we need a second device sniffing for frames send by this device. If the second device is using a known-good driver, this should catch any endianess issues.

@kfessel
Copy link
Contributor

kfessel commented Aug 5, 2020

The following is just a thought that came up and for some reason i wanted to share it (please season it to your taste).

Since this data has so many accessors and i thought C++ has a solution (private internals and some getter and setter), but this is C how can we get around.

How about creating a structure that has its internals not known to the user but has its content only accessed through static functions, which the compiler optimization will take care of. One could even put something like a scrambler define in (#define u8 abzgt) and undef at the end into the header that will let the compiler scream if someone accesses the internals of that struct.

@maribu
Copy link
Member Author

maribu commented Aug 5, 2020

One could even put something like a scrambler define in

My experience with trying to mimic features like private missing in C is that it hurts rightful access by making code more complex and harder to read, but cannot prevent abuse. These approaches are kind of similar to the security by obscurity technique, and IMO are flawed for the same reasons.

I would prefer proper documentation (so that offenders at least are aware of their wrongdoing) and proper review.

@kfessel
Copy link
Contributor

kfessel commented Aug 5, 2020

yes the accessors function would clutter the code but a the moment in most case u need to use a accessor anyway hton the only difference is its generality (the scrambler define is a second thought to make the non "legit" use scream at compiletime) and ofcause it may be switched of for debuging (like one would put the famous #define privat public).

@kaspar030
Copy link
Contributor

The network stack can set the L2 address and will always do so in big endian.

why is that? If the network stack uses eui64_t, it needs to get it somewhere (user input, from the network, freshly created). In all cases, eui64_from/to_<type>() would make sense, and could convert on the fly to our preferred byte order.

@maribu
Copy link
Member Author

maribu commented Aug 5, 2020

In all cases, eui64_from/to_<type>() would make sense, and could convert on the fly to our preferred byte order.

I agree on adding the helper functions. But please let's keep the driver in charge of doing any conversion, if needed. Otherwise the network code would look like this (pseudo-code):

upper_layer_set_l2_addr(dev_t *dev, eui64_t addr)
{
    if (dev->get_preferred_byte_order() == BIG_ENDIAN) {
        dev->set_l2_addr(addr);
    }
    else {
        dev->set_l2_addr(eui64_to_lsb(addr));
    }
}

And I really think that the network stack should not care about the internal representation of an L2 addr in the device it is running on.

@kaspar030
Copy link
Contributor

I agree on adding the helper functions. But please let's keep the driver in charge of doing any conversion, if needed.

+1

@kfessel
Copy link
Contributor

kfessel commented Aug 5, 2020

what is our preferred byte order? is it IEEE or MCU byteorder or Driver byteorder.
I think the one that defines the order has to store the value and provide the conversion functions.

@miri64
Copy link
Member

miri64 commented Aug 5, 2020

what is our preferred byte order? is it IEEE or MCU byteorder or Driver byteorder.
I think the one that defines the order has to store the value and provide the conversion functions.

At least for netopt it is defined to be in network byte order.

@jia200x
Copy link
Member

jia200x commented Aug 6, 2020

I agree on adding the helper functions. But please let's keep the driver in charge of doing any conversion, if needed. Otherwise the network code would look like this (pseudo-code):

I agree this snippet is not something we would like, but note that in most cases such an "upper_layer_set_l2_addr" might not be needed, since that dev interface is already an L2 interface.

In practice, this function would be the specific netif_set implementation for a given technology. They will differ between different MAC/network devices.

E.g setting LoRaWAN EUIs (little endian) works like this:

static int _set(gnrc_netif_t *netif, const gnrc_netapi_opt_t *opt)
{
    ...
        case NETOPT_ADDRESS_LONG:
            assert(opt->data_len == LORAMAC_DEVEUI_LEN);
            _memcpy_reversed(netif->lorawan.deveui, opt->data, LORAMAC_DEVEUI_LEN);
            break;
        case NETOPT_LORAWAN_APPEUI:
            assert(opt->data_len == LORAMAC_APPEUI_LEN);
            _memcpy_reversed(netif->lorawan.appeui, opt->data, LORAMAC_APPEUI_LEN);
            break;
    ...

GNRC LoRaWAN expects EUIs in litle endian, but NETOPTs are in network byte order. Still, it would be possible to bypass this memcpy_reversed if an application needs it.

@maribu
Copy link
Member Author

maribu commented Aug 10, 2020

I just noticed that network_uint64_t is marked with __attribute__((packed)). So it does not have alignment issues. However, things like calling ntohll() on it will be terribly expensive.

With this attribute, the type basically behaves exactly like Option 3 already; this is just well hidden under layers of complexity in the type definition.

@maribu
Copy link
Member Author

maribu commented Aug 10, 2020

IMO, the other byteoder types should just be cleaned up as well: #14737

@kaspar030
Copy link
Contributor

the type basically behaves exactly like Option 3 already; this is just well hidden under layers of complexity in the type definition.

Which, IMO, is the kind of complexity that we should avoid by using the simplest type possible. principle of least surprise. Don't say you were not surprised when you spotted the "packed". 😕

@maribu
Copy link
Member Author

maribu commented Aug 10, 2020

Maybe we should discuss #14737 first. As Option 1 will reuse be_uint64_t, it might makes sense to first discuss if it should be simplified.

@maribu
Copy link
Member Author

maribu commented Aug 14, 2020

I previously said that I do not care what the type of eui64_t is, as long as it is used consistently. But I personally reached the conclusion that all those fancy types cause more pain than benefit, especially when working with external code. Thus, I now also voted for the simple byte array.

@stale
Copy link

stale bot commented Mar 19, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added the State: stale State: The issue / PR has no activity for >185 days label Mar 19, 2021
@stale stale bot closed this as completed Jun 3, 2021
@jeandudey jeandudey added State: don't stale State: Tell state-bot to ignore this issue and removed State: stale State: The issue / PR has no activity for >185 days labels Jun 3, 2021
@MrKevinWeiss MrKevinWeiss added this to the Release 2021.07 milestone Jul 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: network Area: Networking Discussion: needs consensus Marks a discussion that needs a (not necessarily moderated) consensus State: don't stale State: Tell state-bot to ignore this issue
Projects
None yet
Development

No branches or pull requests

8 participants