Domain list page breaks for users with many domains #96354

Open
leonardost opened this issue Nov 13, 2024 · 13 comments
Assignees
Labels
[Experiment] AI labels added [Feature] Calypso & wp-admin Navigation All navigation in Calypso and wp-admin, and the unified transitions between the two. [Feature] Domain Management Tools for managing your site's domain(s). [Feature Group] Emails & Domains Features related to email integrations and domain management. [Feature] Site Performance Features related to the speed and performance of your site. Groundskeeping Issues handled through Dotcom Groundskeeping rotations [Platform] Simple [Pri] High Address as soon as possible after BLOCKER issues [Product] WordPress.com All features accessible on and related to WordPress.com. [Status] Escalated to Product Ambassadors Triaged To be used when issues have been triaged. [Type] Bug When a feature is broken and / or not performing as intended

Comments

@leonardost
Contributor

leonardost commented Nov 13, 2024

Quick summary

Users who have many domains (more than a couple hundred, I believe) can't manage their domains in Calypso because the domain list page (/domains/manage) doesn't load. Some users have thousands of domains due to the Google Domains Takeover initiative (pcYYhz-1ts-p2).

Steps to reproduce

  1. Open the domain list page (/domains/manage) for a user that has many domains

What you expected to happen

The domain list should be loaded correctly and I should be able to manage my domains.

What actually happened

The domain list never finishes loading, and sometimes the page eventually breaks completely.

Example screenshot from a user support session:

v5YA15N2i8Msd354KDa1OkWMBgxEfh8oZO36rCC7.jpg

Impact

Some (< 50%)

Available workarounds?

No and the platform is unusable

If the above answer is "Yes...", outline the workaround.

No response

Platform (Simple and/or Atomic)

Simple

Logs or notes

No response

@leonardost leonardost added [Feature Group] Emails & Domains Features related to email integrations and domain management. [Feature] Domain Management Tools for managing your site's domain(s). [Product] WordPress.com All features accessible on and related to WordPress.com. [Type] Bug When a feature is broken and / or not performing as intended Needs triage Ticket needs to be triaged labels Nov 13, 2024
@github-actions github-actions bot added [Status] Escalated to Product Ambassadors [Platform] Simple [Pri] BLOCKER Requires immediate attention. [Feature] Calypso & wp-admin Navigation All navigation in Calypso and wp-admin, and the unified transitions between the two. [Feature] Site Performance Features related to the speed and performance of your site. labels Nov 13, 2024

OpenAI suggested the following labels for this issue:

  • [Feature Group] Emails & Domains: The issue is directly related to domain management and the challenges users face with their domain listings.
  • [Feature] Domain Management: The problem specifically pertains to managing a long list of domains, which is a core feature of domain management.
  • [Feature] Calypso & wp-admin Navigation: The issue occurs within the Calypso interface when navigating to the domain management page.
  • [Feature] Site Performance: The inability to load the domain list signifies a performance issue within the site.

@renancarvalho
Contributor

Hello 👋 Do we know a user with this issue? That would help the investigation, since it is hard to set up an account with hundreds or thousands of domains.

@Robertght

Robertght commented Nov 18, 2024

While I don't have an account with that many domains, on my end, I didn't have this problem, but I noticed their whole account loaded slowly. @renancarvalho I'm going to send the details via Slack.

Later edit: I learned someone else will look into it. Please reach out via Slack once you see this. Thanks!

@Robertght Robertght moved this from Needs Triage to In Triage in Automattic Prioritization: The One Board ™ Nov 18, 2024
@Robertght Robertght moved this from In Triage to Triaged in Automattic Prioritization: The One Board ™ Nov 18, 2024
@Robertght Robertght added Triaged To be used when issues have been triaged. and removed Needs triage Ticket needs to be triaged labels Nov 18, 2024
@dsas
Contributor

dsas commented Nov 21, 2024

> While I don't have an account with that many domains, on my end, I didn't have this problem, but I noticed their whole account loaded slowly. @renancarvalho I'm going to send the details via Slack.
>
> Later edit: I learned someone else will look into it. Please reach out via Slack once you see this. Thanks!

Details at p1732188303637839-slack-C07GZ2UA3TN

@dsas dsas self-assigned this Nov 21, 2024
@dsas
Contributor

dsas commented Nov 21, 2024

The page makes one API call per wpcom site; for users with hundreds of domains, that is hundreds of API calls. They did eventually all get a response. I think they were all loaded into the page, and I was able to scroll down to domains beginning with Z.

Eventually I got an "Oh snap. error code 5" Chrome crash. This happens faster if dev tools are open, and slower if they're not, so long as you interact with the page by e.g. scrolling slowly enough through the list for it to start populating stuff.

Following along in the Chrome task manager, I can see CPU use nearly constantly above 200%. Memory use ranges between 2 and 6 GB, and it is still increasing at the point of the crash. I'm guessing the crash is due to some kind of memory pressure.

To set expectations: the problem is unlikely to be easy to find and fix.

@dsas
Contributor

dsas commented Nov 21, 2024

It looks like this has been happening for over a year: p1695308169001609-slack-C04H4NY6STW

Looked into this some more with @zaguiini. Some things we noticed:

1. The page loads all of the domains.

  • It makes one request per site (domain).
  • Eventually the requests start failing with net::ERR_INSUFFICIENT_RESOURCES which presumably means there wasn't enough memory free to decode it - not because of the size of the response (each is approx 2kb), but because of the number of them.
  • The request is used to populate the "owner" and "site" cells
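One way to keep the browser from running out of resources, whatever else changes, would be to cap how many of these per-site requests are in flight at once. This is only a minimal sketch: `mapWithConcurrency` is a hypothetical helper, not something that exists in Calypso, and the `task` callback stands in for whatever performs the per-site request.

```typescript
// Hypothetical helper (not part of Calypso): runs `task` over `items`
// with at most `limit` calls in flight at any time, preserving result order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T) => Promise<R>
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each worker pulls the next unclaimed index until none are left.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index]);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}
```

With a limit of, say, 6, a user with 2,000 sites would still trigger 2,000 requests, but never more than 6 at the same time. It reduces pressure on the browser; it does not reduce the total work.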

Ideally it should only request information for domains in the viewport - not all of the information for all of the domains the user has. This appears to happen under @zaguiini's account, but it doesn't happen for the problematic user.
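To illustrate the "only request what is in the viewport" idea, here is a rough sketch. It assumes fixed-height rows and a known scroll position, which may not match how the table is actually rendered; `visibleRowRange` and `domainsNeedingDetails` are hypothetical names, not existing Calypso functions.

```typescript
// Half-open range of row indexes: [start, end).
interface RowRange {
  start: number;
  end: number;
}

// Assumes every row has the same height (an assumption for this sketch).
// `overscan` adds a few rows above and below so scrolling feels smooth.
function visibleRowRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  overscan: number = 5
): RowRange {
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(totalRows, last + overscan),
  };
}

// Returns only the visible domains whose details have not been loaded yet.
function domainsNeedingDetails(
  domains: readonly string[],
  range: RowRange,
  loaded: ReadonlySet<string>
): string[] {
  return domains
    .slice(range.start, range.end)
    .filter((domain) => !loaded.has(domain));
}
```

The per-site request would then be issued only for the names returned by `domainsNeedingDetails`, and re-evaluated as the user scrolls.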

Over the next few weeks the table is being replaced by a dataview by @Automattic/nexus, so it might get resolved as a side effect. If we could figure out the problem first that would be better. pfuQfP-13x-p2

2. The purchases endpoint times out

/me/purchases returns an HTTP 504 (gateway timeout) for this user. This will probably cause them problems on other Calypso pages too.

/me/purchases doesn't currently have pagination: fbhepr%2Skers%2Sjcpbz%2Schoyvp.ncv%2Serfg%2Sjcpbz%2Qwfba%2Qraqcbvagf%2Spynff.jcpbz%2Qfgber%2Qncv%2Qraqcbvagf.cuc%3Se%3Q4174p112%23656-og. It does have some performance instrumentation with statsd.

It sounds like there have been repeated problems with this user's account causing fatals in payments code: p1694033164744309-slack-C096PD42U

3. The sites endpoint times out

/me/sites returns an HTTP 504 (gateway timeout) for this user. This will probably cause them problems on other Calypso pages too.

/me/sites doesn't currently have pagination: fbhepr%2Skers%2Sjcpbz%2Schoyvp.ncv%2Serfg%2Sjcpbz%2Qwfba%2Qraqcbvagf%2Spynff.jcpbz%2Qwfba%2Qncv%2Qzr%2Qfvgrf%2Qraqcbvag.cuc%3Se%3Q5o812359%236-og. It does have some performance instrumentation with statsd.
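For reference, this is roughly what a paginated response for either /me/purchases or /me/sites could look like. It is purely illustrative: the `page` and `perPage` parameters and the `Page` shape are assumptions for this sketch, and neither endpoint supports them today.

```typescript
// Hypothetical response envelope; the real endpoints return everything at once.
interface Page<T> {
  items: T[];
  page: number;
  perPage: number;
  total: number;
  totalPages: number;
}

// Slices a full result set into one page. Pages are 1-indexed.
function paginate<T>(all: readonly T[], page: number, perPage: number): Page<T> {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be a positive integer");
  }
  if (!Number.isInteger(perPage) || perPage < 1) {
    throw new RangeError("perPage must be a positive integer");
  }
  const start = (page - 1) * perPage;
  return {
    items: all.slice(start, start + perPage),
    page,
    perPage,
    total: all.length,
    totalPages: Math.ceil(all.length / perPage),
  };
}
```

In a real endpoint the slicing would need to happen in the database query rather than in memory, otherwise the request would still do all of the work that causes the timeout.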

It seems to me that we have at least three areas to improve to get the dashboard to load.

@dsas
Contributor

dsas commented Nov 21, 2024

Opened child issues for purchases and sites.

@dsas dsas removed their assignment Nov 21, 2024
@vykes-mac
Contributor

vykes-mac commented Nov 25, 2024

@dsas This issue feels like it should be flagged as an improvement project. A similar project (see pet6gk-5d-p2) was done recently to improve the performance of the plugins page. A project would be more focused than trying to resolve this across multiple GKs. From the mentioned project, I realised that some of our endpoints are not optimised to handle thousands of sites. Anywhere near 2k sites, it takes almost a minute to return the results, and in most cases the UI crashes.

@ouikhuan ouikhuan moved this from Triaged to Needs shaping in Automattic Prioritization: The One Board ™ Nov 29, 2024
@mmtr mmtr added the Groundskeeping Issues handled through Dotcom Groundskeeping rotations label Jan 21, 2025
@matticbot matticbot moved this from Needs shaping to Triaged in Automattic Prioritization: The One Board ™ Jan 21, 2025
@mmtr mmtr removed the Groundskeeping Issues handled through Dotcom Groundskeeping rotations label Jan 21, 2025
@mmtr mmtr moved this from Triaged to Needs shaping in Automattic Prioritization: The One Board ™ Jan 21, 2025
@merkushin
Member

merkushin commented Jan 29, 2025

To add more context: this post has a discussion on a proper implementation of the backend for domains: pdxWSz-23k-p2

To set expectations: we probably won't work on it, as the team's focus is changing.

@andres-blanco
Contributor

I've followed up with Nomado to get more context on the work.

Slack ref: p1738586222942579-slack-C0BNMNMNG

@valterlorran
Contributor

valterlorran commented Feb 9, 2025

Hey folks, I created a PR with an idea I had to help with this problem: 172801-ghe-Automattic/wpcom

My idea is to add a flag to the /rest/v1.1/all-domains endpoint that disables the "light loading" mode, so we can get the owner and the SSL status. We should still make the first request in light loading mode so we get the basic information ASAP, but instead of making individual requests for the owner and SSL status of each domain, we would make one more request to /rest/v1.1/all-domains?load_extra_data=true, where the load_extra_data flag is interpreted as turning off light loading.

With the change I mentioned, I believe we would be able to replace this code with a single request:

https://github.com/Automattic/wp-calypso/blob/trunk/packages/domains-table/src/domains-table/domains-table.tsx#L205-L211
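A rough sketch of how the client side of this could look, assuming the flag lands as proposed. Only the endpoint path and the `load_extra_data` flag come from the proposal above; the response field names (`domain`, `owner`, `sslStatus`) and both helper functions are hypothetical.

```typescript
// Hypothetical shapes: the real response fields may be named differently.
interface LightDomain {
  domain: string;
}

interface FullDomain extends LightDomain {
  owner?: string;
  sslStatus?: string;
}

// Builds the request path; the second, slower request turns light loading off.
function allDomainsPath(loadExtraData: boolean): string {
  const base = "/rest/v1.1/all-domains";
  return loadExtraData ? `${base}?load_extra_data=true` : base;
}

// Merges the extra data into the rows already rendered from the light request.
// Rows with no match in `full` are kept unchanged.
function mergeExtraData(
  light: readonly LightDomain[],
  full: readonly FullDomain[]
): FullDomain[] {
  const byName = new Map<string, FullDomain>();
  full.forEach((entry) => byName.set(entry.domain, entry));
  return light.map((row) => ({ ...row, ...byName.get(row.domain) }));
}
```

The table would render from the light response first, then fill in the owner and SSL cells once the second response arrives, so the number of requests stays at two regardless of how many domains the user has.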

@escapemanuele
Contributor

Hey @valterlorran, I assigned the issue to you since you are working on a solution.

@escapemanuele escapemanuele moved this from Needs shaping to Needs Review in Automattic Prioritization: The One Board ™ Feb 13, 2025
@lsl lsl added [Pri] High Address as soon as possible after BLOCKER issues and removed [Pri] BLOCKER Requires immediate attention. labels Feb 27, 2025
@matticbot matticbot moved this from Needs Review to Triaged in Automattic Prioritization: The One Board ™ Feb 27, 2025
@lsl
Contributor

lsl commented Feb 27, 2025

Reprioritized to high. This sounds quite broken, but it doesn't meet the bar for "requires immediate action" given it was opened in November.

@lsl lsl added the Groundskeeping Issues handled through Dotcom Groundskeeping rotations label Feb 27, 2025