[202012] [Dell S5248F] Platform modules missing for DellEMC-S5248f #10887

Closed
jeff-yin opened this issue May 19, 2022 · 10 comments
Labels
MSFT Triaged this issue has been triaged


@jeff-yin
Collaborator

Opening this based on query from mailing list:
https://groups.google.com/g/sonicproject/c/Yv6R-_ubBzg/m/GOf6gfgODAAJ?utm_medium=email&utm_source=footer&pli=1

Description

pmon fails to start due to a traceback thrown in decode-syseeprom, which leads to the platform module not loading correctly.

Steps to reproduce the issue:

Install the 202012 sonic-broadcom image to the Dell S5248F-ON.

Describe the results you received:

show platform syseeprom
Traceback (most recent call last):
  File "/usr/local/bin/decode-syseeprom", line 171, in <module>
    exit(main())
  File "/usr/local/bin/decode-syseeprom", line 58, in main
    run(t, opts, args, support_eeprom_db)
  File "/usr/local/bin/decode-syseeprom", line 87, in run
    err = target.read_eeprom_db()
  File "/usr/local/lib/python3.7/dist-packages/sonic_eeprom/eeprom_tlvinfo.py", line 283, in read_eeprom_db
    db_state = self._redis_hget('EEPROM_INFO|State', 'Initialized')
  File "/usr/local/lib/python3.7/dist-packages/sonic_eeprom/eeprom_tlvinfo.py", line 630, in _redis_hget
    value = self.redis_client.hget(key, field)
  File "/usr/local/lib/python3.7/dist-packages/sonic_eeprom/eeprom_tlvinfo.py", line 625, in redis_client
    if not self._redis_client:
AttributeError: 'board' object has no attribute '_redis_client'
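
For readers unfamiliar with this error shape: an AttributeError raised inside a property getter is what Python produces when a lazily initialised property reads a backing attribute that was never set. The sketch below reproduces that generic pattern with hypothetical classes (`Decoder`, `BrokenBoard`, `FixedBoard`). It is not the actual `sonic_eeprom` code, and whether this is the mechanism on the S5248F is an assumption that the thread does not confirm.

```python
class Decoder:
    """Hypothetical stand-in for a base EEPROM decoder class."""

    def __init__(self):
        # Backing attribute for the lazy property below.
        self._redis_client = None

    @property
    def redis_client(self):
        # Raises AttributeError if Decoder.__init__ never ran.
        if not self._redis_client:
            self._redis_client = object()  # placeholder for a real client
        return self._redis_client


class BrokenBoard(Decoder):
    def __init__(self):
        # Base __init__ is skipped, so _redis_client is never created.
        pass


class FixedBoard(Decoder):
    def __init__(self):
        super().__init__()


def has_client(board):
    """Return True if the lazy property can be read, False on AttributeError."""
    try:
        return board.redis_client is not None
    except AttributeError:
        return False
```

Calling `has_client` on an instance of each hypothetical class shows the difference between a subclass that runs the base initialiser and one that does not.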

PMON docker details:

docker exec -it pmon bash
root@sonic:/usr/local/bin# python3 /usr/local/bin/thermalctld
Traceback (most recent call last):
  File "/usr/local/bin/thermalctld", line 14, in <module>
    import sonic_platform
ImportError: No module named sonic_platform
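
A quick way to confirm from inside the pmon container whether the platform package is importable at all, without starting a daemon, might look like the sketch below. It uses only the standard library; the helper name `module_available` and the second module name in the loop are assumptions made for illustration.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located on the current sys.path."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    # Module names to probe; adjust as needed for the platform under test.
    for name in ("sonic_platform", "sonic_platform_base"):
        print(name, "found" if module_available(name) else "MISSING")
```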

Describe the results you expected:

pmon should start successfully and there should not be any traceback errors in show platform syseeprom

Output of show version:

SONiC Version:

SONiC Software Version: SONiC.202012.100954-acfee3be9
Distribution: Debian 10.12
Kernel: 4.19.0-12-2-amd64
Build commit: acfee3be9
Build date: Thu May 19 13:38:29 UTC 2022
Built by: AzDevOps@sonic-build-workers-001IJN

Platform: x86_64-dellemc_s5248f_c3538-r0
HwSKU: DellEMC-S5248f-P-25G
ASIC: broadcom
ASIC Count: 1
Serial Number: 
Uptime: 12:32:27 up 22 min,  1 user,  load average: 0.16, 0.18, 0.18

Docker images:
REPOSITORY                    TAG                       IMAGE ID            SIZE
docker-sonic-mgmt-framework   202012.100954-acfee3be9   3043b6de019e        687MB
docker-sonic-mgmt-framework   latest                    3043b6de019e        687MB
docker-sonic-telemetry        202012.100954-acfee3be9   d26dc75a6ef3        451MB
docker-sonic-telemetry        latest                    d26dc75a6ef3        451MB
docker-fpm-frr                202012.100954-acfee3be9   b9ae3f4cea2e        391MB
docker-fpm-frr                latest                    b9ae3f4cea2e        391MB
docker-sflow                  202012.100954-acfee3be9   27e297133b7a        374MB
docker-sflow                  latest                    27e297133b7a        374MB
docker-nat                    202012.100954-acfee3be9   60b4fad032f2        376MB
docker-nat                    latest                    60b4fad032f2        376MB
docker-teamd                  202012.100954-acfee3be9   dfd090573019        373MB
docker-teamd                  latest                    dfd090573019        373MB
docker-orchagent              202012.100954-acfee3be9   cd5c3153656d        390MB
docker-orchagent              latest                    cd5c3153656d        390MB
docker-platform-monitor       202012.100954-acfee3be9   1d9f6c11e627        544MB
docker-platform-monitor       latest                    1d9f6c11e627        544MB
docker-snmp                   202012.100954-acfee3be9   d0fd0c2b84b6        405MB
docker-snmp                   latest                    d0fd0c2b84b6        405MB
docker-syncd-brcm             202012.100954-acfee3be9   b9c85ebbe4a1        654MB
docker-syncd-brcm             latest                    b9c85ebbe4a1        654MB
docker-router-advertiser      202012.100954-acfee3be9   af94a56e0148        362MB
docker-router-advertiser      latest                    af94a56e0148        362MB
docker-lldp                   202012.100954-acfee3be9   951ec937bab9        402MB
docker-lldp                   latest                    951ec937bab9        402MB
docker-dhcp-relay             202012.100954-acfee3be9   7cedcd5b5c5f        375MB
docker-dhcp-relay             latest                    7cedcd5b5c5f        375MB
docker-database               202012.100954-acfee3be9   580fd16d5bb6        362MB
docker-database               latest                    580fd16d5bb6        362MB
docker-mux                    202012.100954-acfee3be9   8cdc6f4f43c3        414MB
docker-mux                    latest                    8cdc6f4f43c3        414MB

Output of show techsupport:

Not provided, but if the repro is otherwise not straightforward, let's reach out to Madhu Paluru @ Aviz.

Additional information you deem important (e.g. issue happens only occasionally):

More logs provided by Madhu:

May 19 12:09:43 sonic rc.local[433]: + dpkg -i /host/image-202012.100954-acfee3be9/platform/x86_64-dellemc_s5248f_c3538-r0/platform-modules-s5248f_1.1_amd64.deb
May 19 12:09:43 sonic rc.local[555]: dpkg-deb: error: '/host/image-202012.100954-acfee3be9/platform/x86_64-dellemc_s5248f_c3538-r0/platform-modules-s5248f_1.1_amd64.deb' is not a Debian format archive
May 19 12:09:43 sonic rc.local[550]: dpkg: error processing archive /host/image-202012.100954-acfee3be9/platform/x86_64-dellemc_s5248f_c3538-r0/platform-modules-s5248f_1.1_amd64.deb (--install):
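
The `is not a Debian format archive` message means dpkg-deb could not read the file as an `ar` archive. A .deb file always starts with the 8-byte magic `!<arch>\n`, so a small check like the sketch below can tell apart a missing file, a dangling symbolic link, and a file with unexpected contents. The function name `inspect_deb` and its return labels are hypothetical.

```python
import os

AR_MAGIC = b"!<arch>\n"  # every .deb is an ar archive starting with these 8 bytes


def inspect_deb(path):
    """Classify a .deb path: 'ok', 'missing', 'dangling-symlink' or 'not-ar'."""
    # os.path.exists follows links, so it is False for a dangling symlink.
    if os.path.islink(path) and not os.path.exists(path):
        return "dangling-symlink"
    if not os.path.isfile(path):
        return "missing"
    with open(path, "rb") as f:
        head = f.read(len(AR_MAGIC))
    return "ok" if head == AR_MAGIC else "not-ar"
```

Running it against the `platform-modules-s5248f_1.1_amd64.deb` path from the log would show which of these cases applies on the affected switch.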
@jeff-yin
Collaborator Author

Please assign to @aravindmani-1 for initial triage

@madhupalu

madhupalu commented May 19, 2022

@jeff-yin

Thanks for tracking the issue - #10887

  • The issue is seen in other images as well, e.g. 202211.
  • We have confirmed this is not a hardware issue.

May I know who will be working on this?
Note: We see there is a fix made to reduce image size (https://github.com/Azure/sonic-buildimage/pull/10775/files); the issue got addressed in 202211, and a backport has been requested for 202012. Could someone from Dell or BRCM check whether this can fix this issue?

@KrupakarAnnam

sonic_dump_sonic_20220522_234126.tar.gz
Hi Team, Attaching 'show techsupport' dump here...

@abdosi
Contributor

abdosi commented May 24, 2022

@jeff-yin Is the installation of the new image done via sonic installer or ONIE?

@jeff-yin
Collaborator Author

@jeff-yin Is the installation of the new image done via sonic installer or ONIE?

@madhupalu @KrupakarAnnam can you respond? I'm assuming it's via ONIE.

@abdosi
Contributor

abdosi commented May 24, 2022

cc @anamehra

@KrupakarAnnam

Hi Jeff,
We did some more tests.

The issue happens on the 202012 branch with ONIE installation only.

  • Installed the SONiC 202111 build using ONIE: no Debian file loading issue.
  • Then installed the 202012 May 22 build using ONIE: the Debian file loading issue reproduced.
  • Tried installing the Debian file manually, but it failed: dpkg -i /host/image-202012.100954-acfee3be9/platform/x86_64-dellemc_s5248f_c3538-r0/platform-modules-s5248f_1.1_amd64.deb
  • Then installed the 202012 May 19 build using ONIE: the Debian file loading issue is still seen.
  • Then installed the SONiC 202111 build using ONIE: no Debian file loading issue.
  • Then installed the 202012 May 19 build using 'sonic install': the Debian file loading issue is not seen.

Reverting the PR solved the issue:

Installed the 202012 May 22 build using ONIE, and the Debian file loading issue reproduced.
Then installed a build with PR https://github.com/Azure/sonic-buildimage/pull/10775/files reverted, using ONIE, and the issue is no longer seen.

@aravindmani-1
Contributor

Thanks @KrupakarAnnam. We observed the same issue on the DellEMC S5232f as well. I believe we will hit this issue on most platforms. @abdosi, @xumia, could you please share your insights on this issue?

@abdosi
Contributor

abdosi commented May 24, 2022

@xumia Can we revert https://github.com/Azure/sonic-buildimage/pull/10775/files until you have a completely tested fix?

@zhangyanzhao zhangyanzhao added Triaged this issue has been triaged MSFT labels May 25, 2022
jonathantsai-qci referenced this issue May 27, 2022
Why I did it
The image size is too large when there are multiple lazy packages and multiple platforms. It is not necessary to keep multiple copies of the lazy installation packages.
For the Cisco image, the image size is reduced from 3.5G to 1.7G.

How I did it
Use symbolic links to keep only one copy of each lazy package.
Make a new folder, fsroot/platform/common.
Copy the lazy packages into that folder.
For each platform that uses a package, such as x86_64-grub, x86_64-8800_rp-r0, x86_64-8201_on-r0, etc., only create a symbolic link to the package in the common folder.
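
The approach described in "How I did it" can be sketched roughly as follows. This is an illustration only, not the code from the referenced PR: the function name `link_lazy_packages`, its parameters, and the choice of relative link targets are all assumptions.

```python
import os
import shutil


def link_lazy_packages(fsroot, platforms, packages):
    """Keep one copy of each package in platform/common and symlink it per platform.

    `packages` is a list of source .deb paths; names and layout are assumptions.
    """
    common = os.path.join(fsroot, "platform", "common")
    os.makedirs(common, exist_ok=True)
    for src in packages:
        name = os.path.basename(src)
        # Single real copy of the package.
        shutil.copy2(src, os.path.join(common, name))
        for platform in platforms:
            pdir = os.path.join(fsroot, "platform", platform)
            os.makedirs(pdir, exist_ok=True)
            link = os.path.join(pdir, name)
            if os.path.lexists(link):
                os.remove(link)
            # Relative target so the link survives moving fsroot elsewhere.
            os.symlink(os.path.join("..", "common", name), link)
```

A link like this only resolves if the common folder is present, and links are preserved, wherever the image is unpacked. An install path that loses either would leave files dpkg cannot read, which would be consistent with the `not a Debian format archive` error reported in this issue, although the thread does not confirm the exact mechanism.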
@aravindmani-1
Contributor

@KrupakarAnnam Please confirm whether this issue can be closed.
