Commit 2b30f82

vladimiroltean authored and davem330 committed
net: ethtool: add support for MAC Merge layer

The MAC merge sublayer (IEEE 802.3-2018 clause 99) is one of 2 specifications (the other being Frame Preemption; IEEE 802.1Q-2018 clause 6.7.2), which work together to minimize latency caused by frame interference at TX.

The overall goal of TSN is for normal traffic and traffic with a bounded deadline to be able to cohabitate on the same L2 network and not bother each other too much. The standards achieve this (partly) by introducing the concept of preemptible traffic, i.e. Ethernet frames that have a custom value for the Start-of-Frame-Delimiter (SFD), and these frames can be fragmented and reassembled at L2 on a link-local basis. The non-preemptible frames are called express traffic, they are transmitted using a normal SFD, and they can preempt preemptible frames, therefore having lower latency, which can matter at lower (100 Mbps) link speeds, or at high MTUs (jumbo frames around 9K). Preemption is not recursive, i.e. a P frame cannot preempt another P frame. Preemption also does not depend upon priority, or otherwise said, an E frame with prio 0 will still preempt a P frame with prio 7.

In terms of implementation, the standards talk about the presence of an express MAC (eMAC) which handles express traffic, and a preemptible MAC (pMAC) which handles preemptible traffic, and these MACs are multiplexed on the same MII by a MAC merge layer.

To support frame preemption, the definition of the SFD was generalized to SMD (Start-of-mPacket-Delimiter), where an mPacket is essentially an Ethernet frame fragment, or a complete frame. Stations unaware of an SMD value different from the standard SFD will treat P frames as error frames. To prevent that from happening, a negotiation process is defined.

On RX, packets are dispatched to the eMAC or pMAC after being filtered by their SMD. On TX, the eMAC/pMAC classification decision is taken by the 802.1Q spec, based on packet priority (each of the 8 user priority values may have an admin-status of preemptible or express).

The MAC Merge layer and the Frame Preemption parameters have some degree of independence in terms of how software stacks are supposed to deal with them. The activation of the MM layer is supposed to be controlled by an LLDP daemon (after it has been communicated that the link partner also supports it), after which a (hardware-based or not) verification handshake takes place, before actually enabling the feature. So the process is intended to be relatively plug-and-play. Whereas FP settings are supposed to be coordinated across a network using something approximating NETCONF.

The support contained here is exclusively for the 802.3 (MAC Merge) portions and not for the 802.1Q (Frame Preemption) parts. This API is sufficient for an LLDP daemon to do its job. The FP adminStatus variable from 802.1Q is outside the scope of an LLDP daemon.

I have taken a few creative licenses and augmented the Linux kernel UAPI compared to the standard managed objects recommended by IEEE 802.3. These are:

- ETHTOOL_A_MM_PMAC_ENABLED: According to Figure 99-6: Receive Processing state diagram, a MAC Merge layer is always supposed to be able to receive P frames. However, this implies keeping the pMAC powered on, which will consume needless power in applications where FP will never be used. If LLDP is used, the reception of an Additional Ethernet Capabilities TLV from the link partner is sufficient indication that the pMAC should be enabled. So my proposal is that in Linux, we keep the pMAC turned off by default and that user space turns it on when needed.

- ETHTOOL_A_MM_VERIFY_ENABLED: The IEEE managed object is called aMACMergeVerifyDisableTx. I opted for consistency (positive logic) in the boolean netlink attributes offered, so this is also positive here. Other than the meaning being reversed, they correspond to the same thing.

- ETHTOOL_A_MM_MAX_VERIFY_TIME: I found it most reasonable for a LLDP daemon to maximize the verifyTime variable (delay between SMD-V transmissions), to maximize its chances that the LP replies. IEEE says that the verifyTime can range between 1 and 128 ms, but the NXP ENETC stupidly keeps this variable in a 7 bit register, so the maximum supported value is 127 ms. I could have chosen to hardcode this in the LLDP daemon to a lower value, but why not let the kernel expose its supported range directly.

- ETHTOOL_A_MM_TX_MIN_FRAG_SIZE: the standard managed object is called aMACMergeAddFragSize, and expresses the "additional" fragment size (on top of ETH_ZLEN), whereas this expresses the absolute value of the fragment size.

- ETHTOOL_A_MM_RX_MIN_FRAG_SIZE: there doesn't appear to exist a managed object mandated by the standard, but user space clearly needs to know what is the minimum supported fragment size of our local receiver, since LLDP must advertise a value no lower than that.

Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
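
The verify-time and fragment-size points above can be illustrated with a small user-space sketch. It is not code from this commit: the helper names `mm_pick_verify_time` and `mm_add_frag_size_to_min` and the macro `MM_IEEE_MAX_VERIFY_TIME_MS` are hypothetical. The sketch assumes a daemon that has already read the value of ETHTOOL_A_MM_MAX_VERIFY_TIME from the kernel, and it uses the translation formula quoted in the kernel-doc for @tx_min_frag_size in this commit, 64 * (1 + addFragSize) - 4.

```c
#include <stdint.h>

/* Hypothetical constant: upper bound for verifyTime per the commit message. */
#define MM_IEEE_MAX_VERIFY_TIME_MS 128u

/*
 * Hypothetical helper: pick the largest verify time (in ms) accepted by both
 * the IEEE range and the device limit reported as max_verify_time.
 */
static uint32_t mm_pick_verify_time(uint32_t max_verify_time)
{
	return max_verify_time < MM_IEEE_MAX_VERIFY_TIME_MS ?
	       max_verify_time : MM_IEEE_MAX_VERIFY_TIME_MS;
}

/*
 * Hypothetical helper: translate aMACMergeAddFragSize (0..3) into the
 * absolute minimum non-final fragment size in octets.
 * Returns 0 if add_frag_size is outside the 0..3 range.
 */
static uint32_t mm_add_frag_size_to_min(uint32_t add_frag_size)
{
	if (add_frag_size > 3)
		return 0;

	return 64 * (1 + add_frag_size) - 4;
}
```

With the 127 ms limit mentioned for the NXP ENETC, `mm_pick_verify_time(127)` yields 127, and an addFragSize of 0 maps to 60 octets, which matches ETH_ZLEN.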
1 parent 40e0b09 commit 2b30f82

7 files changed: +451 -2 lines changed

include/linux/ethtool.h

Lines changed: 99 additions & 0 deletions
@@ -477,6 +477,98 @@ struct ethtool_module_power_mode_params {
 	enum ethtool_module_power_mode mode;
 };
 
+/**
+ * struct ethtool_mm_state - 802.3 MAC merge layer state
+ * @verify_time:
+ *	wait time between verification attempts in ms (according to clause
+ *	30.14.1.6 aMACMergeVerifyTime)
+ * @max_verify_time:
+ *	maximum accepted value for the @verify_time variable in set requests
+ * @verify_status:
+ *	state of the verification state machine of the MM layer (according to
+ *	clause 30.14.1.2 aMACMergeStatusVerify)
+ * @tx_enabled:
+ *	set if the MM layer is administratively enabled in the TX direction
+ *	(according to clause 30.14.1.3 aMACMergeEnableTx)
+ * @tx_active:
+ *	set if the MM layer is enabled in the TX direction, which makes FP
+ *	possible (according to 30.14.1.5 aMACMergeStatusTx). This should be
+ *	true if MM is enabled, and the verification status is either verified,
+ *	or disabled.
+ * @pmac_enabled:
+ *	set if the preemptible MAC is powered on and is able to receive
+ *	preemptible packets and respond to verification frames.
+ * @verify_enabled:
+ *	set if the Verify function of the MM layer (which sends SMD-V
+ *	verification requests) is administratively enabled (regardless of
+ *	whether it is currently in the ETHTOOL_MM_VERIFY_STATUS_DISABLED state
+ *	or not), according to clause 30.14.1.4 aMACMergeVerifyDisableTx (but
+ *	using positive rather than negative logic). The device should always
+ *	respond to received SMD-V requests as long as @pmac_enabled is set.
+ * @tx_min_frag_size:
+ *	the minimum size of non-final mPacket fragments that the link partner
+ *	supports receiving, expressed in octets. Compared to the definition
+ *	from clause 30.14.1.7 aMACMergeAddFragSize which is expressed in the
+ *	range 0 to 3 (requiring a translation to the size in octets according
+ *	to the formula 64 * (1 + addFragSize) - 4), a value in a continuous and
+ *	unbounded range can be specified here.
+ * @rx_min_frag_size:
+ *	the minimum size of non-final mPacket fragments that this device
+ *	supports receiving, expressed in octets.
+ */
+struct ethtool_mm_state {
+	u32 verify_time;
+	u32 max_verify_time;
+	enum ethtool_mm_verify_status verify_status;
+	bool tx_enabled;
+	bool tx_active;
+	bool pmac_enabled;
+	bool verify_enabled;
+	u32 tx_min_frag_size;
+	u32 rx_min_frag_size;
+};
+
+/**
+ * struct ethtool_mm_cfg - 802.3 MAC merge layer configuration
+ * @verify_time: see struct ethtool_mm_state
+ * @verify_enabled: see struct ethtool_mm_state
+ * @tx_enabled: see struct ethtool_mm_state
+ * @pmac_enabled: see struct ethtool_mm_state
+ * @tx_min_frag_size: see struct ethtool_mm_state
+ */
+struct ethtool_mm_cfg {
+	u32 verify_time;
+	bool verify_enabled;
+	bool tx_enabled;
+	bool pmac_enabled;
+	u32 tx_min_frag_size;
+};
+
+/**
+ * struct ethtool_mm_stats - 802.3 MAC merge layer statistics
+ * @MACMergeFrameAssErrorCount:
+ *	received MAC frames with reassembly errors
+ * @MACMergeFrameSmdErrorCount:
+ *	received MAC frames/fragments rejected due to unknown or incorrect SMD
+ * @MACMergeFrameAssOkCount:
+ *	received MAC frames that were successfully reassembled and passed up
+ * @MACMergeFragCountRx:
+ *	number of additional correct SMD-C mPackets received due to preemption
+ * @MACMergeFragCountTx:
+ *	number of additional mPackets sent due to preemption
+ * @MACMergeHoldCount:
+ *	number of times the MM layer entered the HOLD state, which blocks
+ *	transmission of preemptible traffic
+ */
+struct ethtool_mm_stats {
+	u64 MACMergeFrameAssErrorCount;
+	u64 MACMergeFrameSmdErrorCount;
+	u64 MACMergeFrameAssOkCount;
+	u64 MACMergeFragCountRx;
+	u64 MACMergeFragCountTx;
+	u64 MACMergeHoldCount;
+};
+
 /**
  * struct ethtool_ops - optional netdev operations
  * @cap_link_lanes_supported: indicates if the driver supports lanes
@@ -649,6 +741,9 @@ struct ethtool_module_power_mode_params {
  *	plugged-in.
  * @set_module_power_mode: Set the power mode policy for the plug-in module
  *	used by the network device.
+ * @get_mm: Query the 802.3 MAC Merge layer state.
+ * @set_mm: Set the 802.3 MAC Merge layer parameters.
+ * @get_mm_stats: Query the 802.3 MAC Merge layer statistics.
  *
  * All operations are optional (i.e. the function pointer may be set
  * to %NULL) and callers must take this into account. Callers must
@@ -787,6 +882,10 @@ struct ethtool_ops {
 	int (*set_module_power_mode)(struct net_device *dev,
 				     const struct ethtool_module_power_mode_params *params,
 				     struct netlink_ext_ack *extack);
+	int (*get_mm)(struct net_device *dev, struct ethtool_mm_state *state);
+	int (*set_mm)(struct net_device *dev, struct ethtool_mm_cfg *cfg,
+		      struct netlink_ext_ack *extack);
+	void (*get_mm_stats)(struct net_device *dev, struct ethtool_mm_stats *stats);
 };
 
 int ethtool_check_ops(const struct ethtool_ops *ops);
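
The kernel-doc for @tx_active says it should be true if MM is enabled and the verification status is either verified or disabled. A minimal sketch of that rule follows, assuming a driver that has to derive the value in software rather than read it from hardware. The enum here is a local mirror of enum ethtool_mm_verify_status so the snippet builds outside the kernel tree, and `mm_tx_active` is a hypothetical helper name, not something this commit adds.

```c
#include <stdbool.h>

/* Local mirror of enum ethtool_mm_verify_status, for illustration only. */
enum mm_verify_status {
	MM_VERIFY_STATUS_UNKNOWN,
	MM_VERIFY_STATUS_INITIAL,
	MM_VERIFY_STATUS_VERIFYING,
	MM_VERIFY_STATUS_SUCCEEDED,
	MM_VERIFY_STATUS_FAILED,
	MM_VERIFY_STATUS_DISABLED,
};

/*
 * Hypothetical helper: TX is active only when it is administratively
 * enabled and verification has either succeeded or is disabled.
 */
static bool mm_tx_active(bool tx_enabled, enum mm_verify_status status)
{
	if (!tx_enabled)
		return false;

	return status == MM_VERIFY_STATUS_SUCCEEDED ||
	       status == MM_VERIFY_STATUS_DISABLED;
}
```

A get_mm implementation along these lines could fill in `state->tx_active` from `state->tx_enabled` and `state->verify_status`; hardware that reports aMACMergeStatusTx directly would not need it.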

include/uapi/linux/ethtool.h

Lines changed: 25 additions & 0 deletions
@@ -779,6 +779,31 @@ enum ethtool_podl_pse_pw_d_status {
 	ETHTOOL_PODL_PSE_PW_D_STATUS_ERROR,
 };
 
+/**
+ * enum ethtool_mm_verify_status - status of MAC Merge Verify function
+ * @ETHTOOL_MM_VERIFY_STATUS_UNKNOWN:
+ *	verification status is unknown
+ * @ETHTOOL_MM_VERIFY_STATUS_INITIAL:
+ *	the 802.3 Verify State diagram is in the state INIT_VERIFICATION
+ * @ETHTOOL_MM_VERIFY_STATUS_VERIFYING:
+ *	the Verify State diagram is in the state VERIFICATION_IDLE,
+ *	SEND_VERIFY or WAIT_FOR_RESPONSE
+ * @ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED:
+ *	indicates that the Verify State diagram is in the state VERIFIED
+ * @ETHTOOL_MM_VERIFY_STATUS_FAILED:
+ *	the Verify State diagram is in the state VERIFY_FAIL
+ * @ETHTOOL_MM_VERIFY_STATUS_DISABLED:
+ *	verification of preemption operation is disabled
+ */
+enum ethtool_mm_verify_status {
+	ETHTOOL_MM_VERIFY_STATUS_UNKNOWN,
+	ETHTOOL_MM_VERIFY_STATUS_INITIAL,
+	ETHTOOL_MM_VERIFY_STATUS_VERIFYING,
+	ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED,
+	ETHTOOL_MM_VERIFY_STATUS_FAILED,
+	ETHTOOL_MM_VERIFY_STATUS_DISABLED,
+};
+
 /**
  * struct ethtool_gstrings - string set for data tagging
  * @cmd: Command number = %ETHTOOL_GSTRINGS
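
The enum documentation above maps states of the 802.3 Verify State diagram onto status values. As a rough sketch of that mapping, assuming a driver or tool that knows the state by its name from the standard, it could look like the following. The enum is a local mirror of enum ethtool_mm_verify_status and `mm_status_from_state` is a hypothetical helper; neither is part of this commit. The DISABLED status has no state name in the documentation, so the sketch does not map any string to it.

```c
#include <string.h>

/* Local mirror of enum ethtool_mm_verify_status, for illustration only. */
enum mm_verify_status {
	MM_VERIFY_STATUS_UNKNOWN,
	MM_VERIFY_STATUS_INITIAL,
	MM_VERIFY_STATUS_VERIFYING,
	MM_VERIFY_STATUS_SUCCEEDED,
	MM_VERIFY_STATUS_FAILED,
	MM_VERIFY_STATUS_DISABLED,
};

/*
 * Hypothetical helper: map a Verify State diagram state name to the status
 * value described in the enum documentation. Unrecognized names map to
 * MM_VERIFY_STATUS_UNKNOWN.
 */
static enum mm_verify_status mm_status_from_state(const char *state)
{
	if (!strcmp(state, "INIT_VERIFICATION"))
		return MM_VERIFY_STATUS_INITIAL;

	/* Three states of the diagram collapse into a single status. */
	if (!strcmp(state, "VERIFICATION_IDLE") ||
	    !strcmp(state, "SEND_VERIFY") ||
	    !strcmp(state, "WAIT_FOR_RESPONSE"))
		return MM_VERIFY_STATUS_VERIFYING;

	if (!strcmp(state, "VERIFIED"))
		return MM_VERIFY_STATUS_SUCCEEDED;

	if (!strcmp(state, "VERIFY_FAIL"))
		return MM_VERIFY_STATUS_FAILED;

	return MM_VERIFY_STATUS_UNKNOWN;
}
```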

include/uapi/linux/ethtool_netlink.h

Lines changed: 47 additions & 0 deletions
@@ -55,6 +55,8 @@ enum {
 	ETHTOOL_MSG_PLCA_GET_CFG,
 	ETHTOOL_MSG_PLCA_SET_CFG,
 	ETHTOOL_MSG_PLCA_GET_STATUS,
+	ETHTOOL_MSG_MM_GET,
+	ETHTOOL_MSG_MM_SET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
@@ -105,6 +107,8 @@ enum {
 	ETHTOOL_MSG_PLCA_GET_CFG_REPLY,
 	ETHTOOL_MSG_PLCA_GET_STATUS_REPLY,
 	ETHTOOL_MSG_PLCA_NTF,
+	ETHTOOL_MSG_MM_GET_REPLY,
+	ETHTOOL_MSG_MM_NTF,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
@@ -922,6 +926,49 @@ enum {
 	ETHTOOL_A_PLCA_MAX = (__ETHTOOL_A_PLCA_CNT - 1)
 };
 
+/* MAC Merge (802.3) */
+
+enum {
+	ETHTOOL_A_MM_STAT_UNSPEC,
+	ETHTOOL_A_MM_STAT_PAD,
+
+	/* aMACMergeFrameAssErrorCount */
+	ETHTOOL_A_MM_STAT_REASSEMBLY_ERRORS,	/* u64 */
+	/* aMACMergeFrameSmdErrorCount */
+	ETHTOOL_A_MM_STAT_SMD_ERRORS,		/* u64 */
+	/* aMACMergeFrameAssOkCount */
+	ETHTOOL_A_MM_STAT_REASSEMBLY_OK,	/* u64 */
+	/* aMACMergeFragCountRx */
+	ETHTOOL_A_MM_STAT_RX_FRAG_COUNT,	/* u64 */
+	/* aMACMergeFragCountTx */
+	ETHTOOL_A_MM_STAT_TX_FRAG_COUNT,	/* u64 */
+	/* aMACMergeHoldCount */
+	ETHTOOL_A_MM_STAT_HOLD_COUNT,		/* u64 */
+
+	/* add new constants above here */
+	__ETHTOOL_A_MM_STAT_CNT,
+	ETHTOOL_A_MM_STAT_MAX = (__ETHTOOL_A_MM_STAT_CNT - 1)
+};
+
+enum {
+	ETHTOOL_A_MM_UNSPEC,
+	ETHTOOL_A_MM_HEADER,			/* nest - _A_HEADER_* */
+	ETHTOOL_A_MM_PMAC_ENABLED,		/* u8 */
+	ETHTOOL_A_MM_TX_ENABLED,		/* u8 */
+	ETHTOOL_A_MM_TX_ACTIVE,			/* u8 */
+	ETHTOOL_A_MM_TX_MIN_FRAG_SIZE,		/* u32 */
+	ETHTOOL_A_MM_RX_MIN_FRAG_SIZE,		/* u32 */
+	ETHTOOL_A_MM_VERIFY_ENABLED,		/* u8 */
+	ETHTOOL_A_MM_VERIFY_STATUS,		/* u8 */
+	ETHTOOL_A_MM_VERIFY_TIME,		/* u32 */
+	ETHTOOL_A_MM_MAX_VERIFY_TIME,		/* u32 */
+	ETHTOOL_A_MM_STATS,			/* nest - _A_MM_STAT_* */
+
+	/* add new constants above here */
+	__ETHTOOL_A_MM_CNT,
+	ETHTOOL_A_MM_MAX = (__ETHTOOL_A_MM_CNT - 1)
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1

net/ethtool/Makefile

Lines changed: 2 additions & 2 deletions
@@ -7,5 +7,5 @@ obj-$(CONFIG_ETHTOOL_NETLINK) += ethtool_nl.o
 ethtool_nl-y := netlink.o bitset.o strset.o linkinfo.o linkmodes.o rss.o \
 		linkstate.o debug.o wol.o features.o privflags.o rings.o \
 		channels.o coalesce.o pause.o eee.o tsinfo.o cabletest.o \
-		tunnels.o fec.o eeprom.o stats.o phc_vclocks.o module.o \
-		pse-pd.o plca.o
+		tunnels.o fec.o eeprom.o stats.o phc_vclocks.o mm.o \
+		module.o pse-pd.o plca.o
