[autoneg] add support for remote speed advertisement and clarify the expected autoneg behaviors #924
@Junchao-Mellanox, @keboliu please review this HLD and be the reviewer of the code once it is available.
4. If autoneg is enabled and adv_speeds is specified, SAI must advertise only the speeds that are supported by the switch silicon. Hence the operational advertisement could be a subset of the speeds specified in adv_speeds. SAI should return an error if none of the desired advertised speeds is valid.
5. If autoneg is enabled and adv_interface_types is not configured or empty, SAI must advertise all supported interface types.
6. If autoneg is enabled and adv_interface_types is specified, SAI must advertise only the interface types that are valid for the attached transceiver and supported by the switch silicon. Hence the operational advertisement could be a subset of the interface types specified in adv_interface_types. SAI should return an error if none of the desired advertised interface types is valid.
7. If autoneg is enabled, the administrative port speed updates via SAI_PORT_ATTR_SPEED should disable the autoneg.
Could you elaborate this requirement? I suppose user can set autoneg=off and set port speed to enforce the admin speed currently.
This will be revised as per the earlier community meeting. The new text is as follows:
7. If autoneg is enabled, the administrative port speed updates via SAI_PORT_ATTR_SPEED should not alter the autoneg config and operational states. The sonic-utilities should also block speed configuration when autoneg is enabled.
8. If autoneg is enabled, the interface type updates via SAI_PORT_ATTR_INTERFACE_TYPE should restart the autoneg. These are requests from pmon#xcvrd triggered by the transceiver insertion when the per-port interface type is specified in the platform-specific media_settings.json.
I don't agree with point 8. It should have consistency with speed and FEC. If autoneg is enabled and interface type is set it should be blocked at CLI since interface is derived from auto negotiation.
As discussed please don't add any checks on FEC when AN is enabled for backward compatibility.
Sure, and it's done
| value | mode | Description |
|:-----:|:----:|:---------------------------------------------------------:|
| 2 | auto | Enable autoneg if applicable to the transceiver attached |
SONiC is using a bool attribute SAI_PORT_ATTR_AUTO_NEG_MODE to set autoneg mode. Do you plan to use a new SAI attribute? What is the SAI attribute?
Thanks for the comments, but the 'auto' mode will be removed as per the discussion in the earlier meeting
@@ -406,7 +447,7 @@ The port yang model needs to update according to DB schema change. The yang mode | |||
``` | |||
leaf autoneg { | |||
type string { | |||
pattern "0|1"; | |||
pattern "0|1|2"; |
should be on|off|auto
Copy that, thanks
updatePortOperStatus(port, status)
    if status == SAI_PORT_OPER_STATUS_UP:
        updateDbPortOperSpeed(port, speed)
        updateDbPortOperRemoteAdvSpeeds(port)
should we remove the remote adv speeds when oper status is down?
Yes, exactly
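A minimal sketch of the behaviour agreed in this thread: publish the remote advertisement when the port comes up and drop it when the port goes down. The `state_db` dictionary, the function signature and the status constants are hypothetical stand-ins for STATE_DB and the orchagent handler; only the `rmt_adv_speeds` field name comes from the HLD.

```python
# Placeholder status values, not the real SAI enum.
SAI_PORT_OPER_STATUS_UP = "up"
SAI_PORT_OPER_STATUS_DOWN = "down"


def update_port_oper_status(state_db, port, status, speed=None, rmt_adv_speeds=None):
    """Update a dict-based stand-in for the STATE_DB PORT_TABLE entry of a port."""
    entry = state_db.setdefault(port, {})
    if status == SAI_PORT_OPER_STATUS_UP:
        if speed is not None:
            entry["speed"] = str(speed)
        # Publish what the remote peer advertised as a comma-separated list.
        entry["rmt_adv_speeds"] = ",".join(str(s) for s in (rmt_adv_speeds or []))
    else:
        # Link is down: the last remote advertisement is stale, so remove it.
        entry.pop("rmt_adv_speeds", None)
    return entry
```

Removing the field on link down, rather than leaving the last value in place, keeps a stale advertisement from being mistaken for the current state of the peer.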
@@ -616,6 +685,10 @@ xcvrd just reads the configuration from media_settings.json and set value to APP

However, it is worth mentioning that if users have port attributes configured in both CONFIG_DB and media_settings.json, the value in media_settings.json will override the value in CONFIG_DB after a reboot, a pmon restart, or a cable re-insertion. Based on that, users who choose media_settings.json should probably not use the CLI or CONFIG_DB, to avoid configuration loss after a reboot, a pmon restart, or a cable re-insertion.

In the case of SFPs, link training always takes place after the speed is negotiated, and link training shall not be enabled on certain transceivers, for example a chip-to-module transceiver. It is therefore better to have pmon#xcvrd provide a hint to swss#orchagent for the advanced auto negotiation and link training controls. Hence, **xcvr_capabilities** is introduced into APPL_DB. swss#orchagent should request syncd to enable port auto negotiation only when **autoneg=1** or (**autoneg=2** and **AN is specified in xcvr_capabilities**).
where is xcvr_capabilities from, media_settings.json or EEPROM? Some vendors do not use media_settings.json. And a few questions here:
- What if media_settings.json is not available?
- Do you plan to have a CLI to configure this?
- media_settings.json is optional, when it's not specified the xcvrd will automatically derive the capabilities from transceiver information, the details will be added into HLD
- No CLI config is necessary
We should have a knob to disable this flow (May be in media_settings.json). By default we shouldn't have xcvr capabilities published and have all the error logs, checks. It should be enabled per platform and I think the knob can be present in a platform specific file.
As discussed earlier in the meeting, all the xcvr capability checkers are now removed
@@ -616,6 +685,10 @@ xcvrd just reads the configuration from media_settings.json and set value to APP
In the case of SFPs, the valid speed list varies from transceiver to transceiver, and a user may specify a speed in CONFIG_DB that is beyond the capabilities of the attached transceiver. Hence **xcvr_speeds** is introduced into APPL_DB, and pmon#xcvrd should dynamically update this field upon both transceiver insertion and dynamic port breakout operations. swss#orchagent should request syncd to update the port speed advertisement only when the speed is available in both **xcvr_speeds of APPL_DB** and **adv_speeds of CONFIG_DB**.
There is a CLI command `config interface advertised-speeds`; should it be extended to use "xcvr_speeds" to validate user input?
yes, will do
@@ -603,7 +672,7 @@ I choose this solution because:

Dynamic port breakout feature also introduces a hwsku.json file to describe the port capability. It defines the default dynamic breakout mode for now. As we won't automatically set auto negotiation attributes for a port after port breakout, it is not necessary to change the hwsku.json in this feature.

#### xcvrd Consideration
#### PMON xcvrd Consideration
Could you provide an example of media_settings.json for the new attributes?
These capabilities are directly derived from the transceiver info, and yes, we could provide the custom overrides in the media_settings.json, will update the HLD
@@ -616,6 +685,10 @@ xcvrd just reads the configuration from media_settings.json and set value to APP
In the case of SFPs, link training always takes place after the speed is negotiated, and link training shall not be enabled on certain transceivers, for example a chip-to-module transceiver. It is therefore better to have pmon#xcvrd provide a hint to swss#orchagent for the advanced auto negotiation and link training controls. Hence, **xcvr_capabilities** is introduced into APPL_DB. swss#orchagent should request syncd to enable port auto negotiation only when **autoneg=1** or (**autoneg=2** and **AN is specified in xcvr_capabilities**).
Can you list down the criteria used to classify transceiver as autoneg capable vs not? If there is a list please capture it in the HLD
As of now, only CR transceivers will be flagged with 'AN,LT' support, and the details will be added into HLD.
| value | mode | Description |
|:-----:|:----:|:---------------------------------------------------------:|
| 2 | auto | Enable autoneg if applicable to the transceiver attached |
Remove auto mode as discussed in the HLD review.
Done
@@ -616,6 +685,10 @@ xcvrd just reads the configuration from media_settings.json and set value to APP
In the case of SFPs, the valid speed list varies from transceiver to transceiver, and a user may specify a speed in CONFIG_DB that is beyond the capabilities of the attached transceiver. Hence **xcvr_speeds** is introduced into APPL_DB, and pmon#xcvrd should dynamically update this field upon both transceiver insertion and dynamic port breakout operations. swss#orchagent should request syncd to update the port speed advertisement only when the speed is available in both **xcvr_speeds of APPL_DB** and **adv_speeds of CONFIG_DB**.
As discussed if there is mismatch between xcvr_speeds and adv_speeds the ideal solution is to log error. Overriding the user configuration is not the right approach.
Copy that, and thanks
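A rough sketch of the "log an error, do not override" approach requested above, as it stood at this point in the review. The function name, logger name and the use of plain lists for `adv_speeds` and `xcvr_speeds` are assumptions for illustration; the actual swss or sonic-utilities code may look quite different.

```python
import logging

logger = logging.getLogger("autoneg_sketch")  # hypothetical logger name


def find_unsupported_adv_speeds(port, adv_speeds, xcvr_speeds):
    """Return the configured speeds that the transceiver does not list.

    The user configuration (adv_speeds) is never modified; a mismatch is
    only reported through the error log.
    """
    allowed = set(xcvr_speeds)
    unsupported = [s for s in adv_speeds if s not in allowed]
    if unsupported:
        logger.error("%s: adv_speeds %s not in xcvr_speeds %s",
                     port, unsupported, sorted(allowed))
    return unsupported
```

Returning the mismatching speeds instead of a filtered list leaves the decision about what to advertise with the caller, which matches the reviewer's point that the user configuration should not be silently rewritten.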
4. If autoneg is enabled and adv_speeds is specified, SAI must advertise only the speeds that are supported by the switch silicon. Hence the operational advertisement could be a subset of the speeds specified in adv_speeds. SAI should return an error if none of the desired advertised speeds is valid.
5. If autoneg is enabled and adv_interface_types is not configured or empty, SAI must advertise all supported interface types.
6. If autoneg is enabled and adv_interface_types is specified, SAI must advertise only the interface types that are valid for the attached transceiver and supported by the switch silicon. Hence the operational advertisement could be a subset of the interface types specified in adv_interface_types. SAI should return an error if none of the desired advertised interface types is valid.
7. If autoneg is enabled, the administrative port speed updates via SAI_PORT_ATTR_SPEED should disable the autoneg.
As discussed in the review remove the point 7. It should be replaced by a check in sonic-utilities to block speed configuration when autoneg is enabled.
This check should also be done for other attributes that are exchanged via autoneg. Some of them I can name are FEC, interface type.
Copy that, the HLD will be updated as follows:
7. If autoneg is enabled, the administrative port speed updates (e.g. SAI_PORT_ATTR_SPEED) should not alter the autoneg config and operational states. The sonic-utilities should also block speed configuration when autoneg is enabled.
8. If autoneg is enabled, the administrative port FEC updates (e.g. SAI_PORT_ATTR_FEC_MODE, SAI_PORT_ATTR_FEC_MODE_EXTENDED) should not alter the autoneg config and operational states. The sonic-utilities should also block FEC configuration when autoneg is enabled.
7. If autoneg is enabled, the administrative port speed updates via SAI_PORT_ATTR_SPEED should disable the autoneg.
8. If autoneg is enabled, the interface type updates via SAI_PORT_ATTR_INTERFACE_TYPE should be ignored. These are requests from xcvrd triggered by the transceiver insertion if the per-port interface type is specified in the platform-specific media_settings.json.
9. If autoneg is enabled on an SFP port, SAI should also activate link training to dynamically tune the TX FIR (i.e. Clause 73).
10. If autoneg is disabled, the port speed, interface type and TX FIR should be restored to the original values before autoneg activation.
What are the original values before autoneg activation? What if user has not configured these.
if autoneg is disabled then if speed/fec/if_type were configured they should be set, if not configured – default values should be set explicitly. This is to ensure the hardware is not programmed with residual values of the autoneg
Thanks for the comments, this will be updated as follows:
11. If autoneg is disabled, the port speed, FEC, interface type and TX FIR should be restored with "speed", "fec", "interface_type" and ("preemphasis", "idriver", "ipredriver", "pre1", "pre2", "pre3", "main", "post1", "post2", "post3", "attn") values in the APPL_DB. If the corresponding configuration is not specified in the APPL_DB, vendor-specific SAI driver defaults should be applied.
"vendor-specific SAI driver defaults should be applied" - This should not be the case. It should be SAI API defaults and not any vendor specific defaults to have a uniformity of the behavior.
Done, the HLD is now updated, please check it out and see if works
- rmt_adv_speeds
Valid value is the same as **adv_speeds**, except that it's reported by the remote peer.
- xcvr_capabilities:
Transceiver capabilities provided by pmon#xcvrd, as a comma-separated set of capabilities, where `LT` stands for "link training" and `AN` for "auto negotiation". For example: "AN,LT". See the detailed description in section [PMON xcvrd Consideration](#pmon-xcvrd-consideration).
As discussed during HLD review when enabling AN for a transceiver that doesn't support capability we should throw error message during configuration and pass it only if force option is used. Please update the HLD to reflect the same.
We may also need to cover the scenario where pmon is not up or pmon comes up later.
Thanks, and it's done
Any plan to add a test case to sonic-mgmt? There was a test case at https://github.com/Azure/sonic-mgmt/blob/master/tests/platform_tests/test_auto_negotiation.py
4. If autoneg is enabled and adv_interface_types is not configured or empty, SAI must advertise it with all supported interface types.
5. If autoneg is disabled and interface_type is not configured, SAI must use SAI_PORT_INTERFACE_TYPE_NONE.
4. If autoneg is enabled and adv_interface_types is not configured or empty, SA
I must advertise it with all supported interface types.
Think an unintended newline. Please correct
Done
5. If autoneg is disabled and interface_type is not configured, SAI must use SAI_PORT_INTERFACE_TYPE_NONE.
4. If autoneg is enabled and adv_interface_types is not configured or empty, SA
I must advertise it with all supported interface types.
5. If autoneg is disabled and interface_type is not configured, SAI must use SA
Same. This unintended newline.
Done
8. If autoneg is enabled on an SFP/QSFP/QSFPDD port, SAI should also activate link training to dynamically tune the TX FIR.
9. If autoneg is transitioned from enabled to disabled, the port speed, FEC, interface type and TX FIR should be restored with the corresponding values in the APPL_DB (e.g. "speed", "fec", "interface_type" and ("preemphasis", "idriver", "ipredriver", "pre1", "pre2", "pre3", "main", "post1", "post2", "post3", "attn")). If the corresponding configuration is not available in the APPL_DB, the following SAI driver defaults should be applied to achieve the best backward compatibility.

|Lane Count | Default Speed | Default FEC | Default Medium | Default Interface |
I am again concerned with the defaults set here. SAI interface type if not configured should be set to none. FEC should also be set to None.
Done, The defaults are SAI_PORT_FEC_MODE_NONE and SAI_PORT_INTERFACE_TYPE_NONE
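The restore rule discussed in this thread could be sketched roughly as follows: when autoneg goes from enabled to disabled, program the configured value if one exists, otherwise program the default explicitly. The function name and the dict-based stand-in for an APPL_DB port entry are hypothetical, and the default strings are placeholders for the SAI enum values named above. A real implementation would map configured values such as "rs" to SAI enums rather than mix both forms as this sketch does.

```python
# Defaults named in this thread; the strings stand in for the SAI enum values.
DEFAULTS = {
    "fec": "SAI_PORT_FEC_MODE_NONE",
    "interface_type": "SAI_PORT_INTERFACE_TYPE_NONE",
}


def settings_after_autoneg_disable(appl_db_entry):
    """Return the fec/interface_type values to program when autoneg is turned off."""
    restored = {}
    for field, default in DEFAULTS.items():
        # Prefer the configured value; otherwise set the default explicitly so
        # that no residual autoneg result is left programmed in hardware.
        restored[field] = appl_db_entry.get(field, default)
    return restored
```

Speed and the TX FIR fields are left out here because the thread does not state defaults for them.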
| 10G | 1 | None |
| 1G | 1 | None |

11. If autoneg is enabled, the administrative port FEC mode updates (i.e. "fec=rs|fc|none" in CONFIG_DB) should be able to alter the FEC mode of the autoneg advertisement. If the configured FEC is not supported by the hardware, it should fall back as follows
I don't think this is the intended behavior. What currently we have in SONiC is if FEC is configured along with autoneg enabled, it overrides the auto negotiated value but doesn't change the advertisement.
Agree, this is removed
Add support for remote speed advertisement, such that users could easily identify the connection issues when autoneg is enabled.
HLD: sonic-net/SONiC#924
- What I did
Add support for remote speed advertisement
- How I did it
Implementation is done according to the AutoNeg HLD
Signed-off-by: Dante Su <[email protected]>

…s per transceiver attached
- Why I did this?
1. Add support for the operational states of AutoNeg
2. Further clarify the AutoNeg behaviors for the SAI implementation
Signed-off-by: Dante Su <[email protected]>

…sement
HLD: sonic-net/SONiC#924
- What I did
Add support for remote speed advertisement
Add support for capability checker
- How I did it
Implementation is done according to the AutoNeg HLD
Signed-off-by: Dante Su <[email protected]>
@@ -169,14 +229,16 @@ Configuring advertised speeds takes effect only if auto negotiation is enabled. | |||
|
|||
``` | |||
Format: | |||
config interface advertised-speeds <interface_name> <speed_list> | |||
config interface advertised-speeds <interface_name> <speed_list> [-f] |
Can you please elaborate the need for -f?
It's already removed, and it was for the xcvr capability checkers
; Defines information for port state
key = PORT_TABLE:port_name ; state of the port
; field = value
...
supported_speeds = STRING ; supported speed list
speed = STRING ; operational speed
rmt_adv_speeds = STRING ; advertised speed list of the remote
The remote advertised speed might impact output of existing CLI. This usecase was not discussed in the meetings. Do we need this field?
It's okay not to show the remote advertised speed in the CLI; it's only a hint to help users identify the cause of a link failure.
Just for your reference, while it's unexpected, the link could come up for some time when the local unit has AN=ON and the remote is at a fixed speed. For example, a 100M link could be up when local is with AN=OFF and remote is with AN=ON (most likely because 100M is somehow the default speed upon AN failure).
Actually, we did hit this issue a few days ago, and having the remote advertisement displayed in CLI could help users identify this issue, if the link is up when AN=ON, and the remote advertisement is NONE/EMPTY, it's this case.
e.g. The 100M link is somehow coming up unexpectedly.
DUT A(100M, AN=OFF) + DUT B (AN=ON, Adv.Speed=1000,100,10)
Unfortunately, although the link is up unexpectedly, the duplex mode is wrong, and caused traffic issues
i.e.
DUT A is at Full-Duplex, while DUT B is at Half-Duplex.
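The check described above (link up, AN=ON, empty remote advertisement) can be sketched as a small helper. This is only an illustration: `diagnose_link` and its string arguments are hypothetical names, standing in for values a tool would read from the DB, and they are not an existing SONiC API.

```python
def diagnose_link(autoneg, oper_status, rmt_adv_speeds):
    """Return a hint string for a port, or None if nothing looks suspicious.

    All parameters are hypothetical stand-ins for values read from the DB:
      autoneg        -- "on" or "off" (local autoneg mode)
      oper_status    -- "up" or "down"
      rmt_adv_speeds -- comma-separated remote advertisement; "" or "N/A" if none
    """
    # Normalize the remote advertisement into a list of speed strings.
    remote = [s for s in rmt_adv_speeds.split(",") if s and s != "N/A"]

    if autoneg == "on" and oper_status == "up" and not remote:
        # The case from this thread: the link came up although the peer never
        # advertised anything, so the peer is probably at a fixed speed.
        return "link is up without a remote advertisement: peer may be at a fixed speed, check duplex"
    if autoneg == "on" and oper_status == "down" and remote:
        return "peer advertises " + ",".join(remote) + ": check for a common speed"
    return None
```

For the DUT B side of the example, `diagnose_link("on", "up", "")` returns the fixed-speed hint, while a port whose peer does advertise speeds and whose link is up returns `None`.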
@Junchao-Mellanox Can you please provide feedback on the impact of modifying the existing CLI?
Just for your reference, the CLI output is updated as follows; its width increases from 90 to 108 columns.
Original
  Interface    Auto-Neg Mode    Speed    Adv Speeds    Type    Adv Types    Oper    Admin
-----------  ---------------  -------  ------------  ------  -----------  ------  -------
  Ethernet0          enabled      25G       10G,50G     CR4      CR4,CR2    down       up
 Ethernet32         disabled      40G           all     N/A          all      up       up
Ethernet112              N/A      40G           N/A     N/A          N/A      up       up
Ethernet116              N/A      40G           N/A     N/A          N/A      up       up
Ethernet120              N/A      40G           N/A     N/A          N/A      up       up
Ethernet124              N/A      40G           N/A     N/A          N/A      up       up
Modified
  Interface    Auto-Neg Mode    Speed    Adv Speeds    Rmt Adv Speeds    Type    Adv Types    Oper    Admin
-----------  ---------------  -------  ------------  ----------------  ------  -----------  ------  -------
  Ethernet0          enabled      25G       10G,50G               40G     CR4      CR4,CR2    down       up
 Ethernet32         disabled      40G           all               N/A     N/A          all      up       up
Ethernet112              N/A      40G           N/A               N/A     N/A          N/A      up       up
Ethernet116              N/A      40G           N/A               N/A     N/A          N/A      up       up
Ethernet120              N/A      40G           N/A               N/A     N/A          N/A      up       up
Ethernet124              N/A      40G           N/A               N/A     N/A          N/A      up       up
If a user has port attributes configured in both CONFIG_DB and media_settings.json, the value in media_settings.json will override the value in CONFIG_DB after a reboot, a pmon restart, or a cable re-insertion. Based on that, users who choose media_settings.json should probably not use the CLI or CONFIG_DB, to avoid configuration loss after a reboot, a pmon restart, or a cable re-insertion.
How is it ensured that media_settings.json takes precedence over CONFIG_DB? In fact, the user configuration should take priority over any auto-generated values.
There are no checkers/blockers in the current implementation; it all comes down to the timing of when the corresponding APPL_DB fields are updated. The OA cannot check the source of an update, and the media_settings framework always sends these requests directly into APPL_DB. An improvement to the media_settings framework should be discussed in a separate, dedicated PR against Media-based-Port-settings.md.
Hence this statement is here to help users understand:
- The interface type should be configured with either CONFIG_DB or media_settings; only one of the two approaches should be active at a time. Otherwise the two conflict, which may lead to unwanted results depending on the timing and the operation sequence.
- The pre-emphasis and interface type defined in media_settings are TX FIR parameters for SFP/QSFP/QSFPDD ports; they are not applicable to native copper ports. The IEEE standards expect link-training to be activated following negotiation, so link-training is part of autoneg, and because the TX FIR is dynamically updated at runtime, the static parameters in media_settings.json are never applied while autoneg is active. However, the OA does need to make sure the requested static TX FIR is restored when autoneg transitions from ON to OFF.
- The case where link-training is conditionally disabled during autoneg is, as far as I know, not IEEE standard and is not supported by all ASICs.
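To illustrate why the outcome depends purely on timing, here is a minimal sketch that models an APPL_DB port entry as a plain dictionary. Nothing in it is the real SONiC API: `appl_db`, `write_from_config_db` and `write_from_media_settings` are hypothetical names, and the only point is that the consumer sees the last write, whatever its source.

```python
# Hypothetical model: one APPL_DB port entry per key; the writer is not recorded.
appl_db = {}

def write_from_config_db(port, interface_type):
    # Stands in for the user configuration path (CLI/CONFIG_DB -> APPL_DB).
    appl_db.setdefault(port, {})["interface_type"] = interface_type

def write_from_media_settings(port, interface_type):
    # Stands in for xcvrd pushing media_settings.json values on start-up
    # or on cable insertion.
    appl_db.setdefault(port, {})["interface_type"] = interface_type

# The user configures CR4, then the cable is re-inserted and xcvrd writes CR2.
write_from_config_db("Ethernet0", "CR4")
write_from_media_settings("Ethernet0", "CR2")
result_after_reinsert = appl_db["Ethernet0"]["interface_type"]

# Reverse order: the user value is written last and wins instead.
write_from_media_settings("Ethernet0", "CR2")
write_from_config_db("Ethernet0", "CR4")
result_after_cli = appl_db["Ethernet0"]["interface_type"]
```

Because the entry carries no information about who wrote it, a consumer reading `appl_db` cannot prefer one source over the other, which is why only one of the two approaches should be used.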
Can you please update the HLD accordingly? I believe that if the user has both a configuration and information defined in media_settings.json, it isn't possible to predict which takes precedence, as there is no logic giving one priority over the other. Hence the user should be made aware not to configure the interface type if it is already defined in media_settings.json.
Sure, I'll emphasize this unwanted behavior in the HLD
@@ -511,6 +538,52 @@ else if autoneg == false:
setInterfaceType(port, interface_type)
```

##### Getting Negotiated Results
Please do not add code snippets to the HLD. The HLD is to be used by non-programmers as well. To explain the flows, please use flow diagrams.
Sure, will do
Signed-off-by: Dante Su <[email protected]>
8498931 to 8837dc2
…sement HLD: sonic-net/SONiC#924 - What I did Add support for remote speed advertisement Add support for capability checker - How I did it Implementation is done according to the AutoNeg HLD Signed-off-by: Dante Su <[email protected]>
Add support for remote speed advertisement, such that users could easily identify the connection issues when autoneg is enabled. HLD: sonic-net/SONiC#924 - What I did Add support for remote speed advertisement - How I did it Implementation is done according to the AutoNeg HLD Signed-off-by: Dante Su <[email protected]>
@zhangyanzhao @yxieca pls help merge
@@ -511,6 +538,14 @@ else if autoneg == false:
setInterfaceType(port, interface_type)
```

##### Getting Remote Advertisement

A new periodic timer task will be introduced into PortsOrch. It periodically loops through the physical ports and updates the per-port remote advertisement if autoneg is enabled and the link is down.
Why is the link being DOWN a requirement?
Once the link comes up, it's not necessary to keep polling the remote advertisement, as a link-down event is expected upon an advertisement update or a cable replacement.
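One tick of the timer task described above could look roughly like the following sketch. It is not the PortsOrch implementation: the `Port` class, `poll_remote_advertisement`, the `query_remote_adv` callback (standing in for the SAI query) and the "N/A" placeholder for an empty advertisement are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Port:
    # Hypothetical per-port state kept by the polling task.
    name: str
    autoneg: bool
    oper_up: bool
    rmt_adv_speeds: str = "N/A"

def poll_remote_advertisement(ports, query_remote_adv):
    """Run one timer tick and return the names of the ports that were polled.

    query_remote_adv is a hypothetical callback: it takes a port name and
    returns the list of speeds (in Mbps) advertised by the remote side.
    """
    polled = []
    for port in ports:
        # Skip ports without autoneg, and ports whose link is already up:
        # a link-down event is expected before the advertisement can change.
        if not port.autoneg or port.oper_up:
            continue
        speeds = query_remote_adv(port.name)
        port.rmt_adv_speeds = ",".join(str(s) for s in speeds) if speeds else "N/A"
        polled.append(port.name)
    return polled

# Usage: only Ethernet0 (autoneg on, link down) is polled.
ports = [
    Port("Ethernet0", autoneg=True, oper_up=False),
    Port("Ethernet32", autoneg=False, oper_up=True),
    Port("Ethernet64", autoneg=True, oper_up=True),
]
polled = poll_remote_advertisement(ports, lambda name: [40000])
```

Restricting the poll to link-down ports keeps the per-tick cost proportional to the number of ports still negotiating, which is the rationale given in the reply above.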
1. When xcvrd starts, it reads media_settings.json and sets the pre-emphasis values in APPL_DB for each port.
2. When a new cable is inserted, xcvrd uses the values in media_settings.json to set the pre-emphasis value in APPL_DB for this port.
While it is possible to use CLI/CONFIG_DB to set the port auto-negotiation attributes, this feature is also available via [media_settings.json](https://github.com/Azure/SONiC/blob/master/doc/media-settings/Media-based-Port-settings.md).
You mean AN can be configured via media_settings.json?
No, it cannot. This is again to emphasize that the interface type should be specified by either CLI/CONFIG_DB or media_settings.json, not both.
@@ -86,6 +92,11 @@ Currently, SAI already defines a few port attributes to support port auto negoti
3. If autoneg is enabled and adv_speeds is not configured or empty, SAI must advertise all supported speeds.
4. If autoneg is enabled and adv_interface_types is not configured or empty, SAI must advertise all supported interface types.
5. If autoneg is disabled and interface_type is not configured, SAI must use SAI_PORT_INTERFACE_TYPE_NONE.
6. If autoneg is enabled, administrative port speed updates should not disable autoneg. The configured speed should be cached in swss#orchagent and replayed when autoneg transitions from enabled to disabled.
7. If autoneg is enabled, administrative interface type updates via CONFIG_DB should be blocked, while dynamic interface type updates from pmon#xcvrd via APPL_DB should be delivered to SAI to update the autoneg advertisement. Please refer to [PMON xcvrd Consideration](#pmon-xcvrd-consideration) for details.
8. If autoneg is enabled on a SFP/QSFP/QSFPDD port, SAI should also activate link-training to dynamically tune the TX FIR.
I thought that if AN is enabled in the HW, LT would follow automatically. Why is SW intervention required?
In the case of the Broadcom switch ASIC, this is true: LT is always enabled and can't be disabled if AN is enabled.
However, LT and AN are different hardware components, so this may or may not apply to other vendors.
e.g. With LT disabled during autoneg, speed and FEC negotiation could still be possible for optics, although this is not a standard port mode in IEEE.
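Requirement 6 in the diff above (cache the configured speed while autoneg is enabled and replay it when autoneg transitions from enabled to disabled) can be sketched as follows. The class `PortSpeedCache` and the `apply_speed` callback, which stands in for the SAI speed update, are hypothetical and are not taken from orchagent.

```python
class PortSpeedCache:
    """Hypothetical model of requirement 6 for a single port."""

    def __init__(self, apply_speed):
        # apply_speed: hypothetical callback standing in for the SAI speed set.
        self._apply_speed = apply_speed
        self.autoneg = False
        self.cached_speed = None

    def set_speed(self, speed):
        # Always remember the administrative speed, but never touch autoneg.
        self.cached_speed = speed
        if not self.autoneg:
            self._apply_speed(speed)  # applied right away only when autoneg is off

    def set_autoneg(self, enabled):
        was_enabled = self.autoneg
        self.autoneg = enabled
        if was_enabled and not enabled and self.cached_speed is not None:
            # Replay the cached speed on the enabled -> disabled transition.
            self._apply_speed(self.cached_speed)

# Usage: the speed set while autoneg is on is applied only once autoneg goes off.
applied = []
port = PortSpeedCache(applied.append)
port.set_autoneg(True)
port.set_speed(100000)
applied_while_autoneg_on = list(applied)
port.set_autoneg(False)
```

Keeping the cache separate from the autoneg flag means a speed update never flips the autoneg state as a side effect, which is the behavior the requirement asks for.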
* [autoneg] add support for remote speed advertisement Add support for remote speed advertisement, such that users could easily identify the connection issues when autoneg is enabled. HLD: sonic-net/SONiC#924 - What I did Add support for remote speed advertisement - How I did it Implementation is done according to the AutoNeg HLD Signed-off-by: Dante Su <[email protected]> * fix test failures in dump_state_test.py Signed-off-by: Dante Su <[email protected]> * address review comments Signed-off-by: Dante Su <[email protected]> * drop PORT_ADV_SPEEDS from state_db_port_status_get() Signed-off-by: Dante Su <[email protected]> * address review comments Signed-off-by: Dante Su <[email protected]>
Why I did this?
Related PRs:
Signed-off-by: Dante Su [email protected]