Figure out where we use tags and how much we can clean them up #3424
Comments
No idea. I recently cleaned up an old pre-kuma syntax that allowed you to list all pages that had a particular tag - "because we don't do that anymore"; that's pretty much the only use of tags that I would want. |
We do index them in Elasticsearch. For no reason, because you can't search for them through the query API ( We certainly don't display them in any UI on the document page itself. So I think they're only used for some macros. Sidebar macros in particular. |
One thing I've noticed is that if you think the English tags are a mess, wait till you see the translated-content tags. So often they are different!
|
If the tags never appear in the UI, and we don't intend to show them in future, then there is no reason not to piggyback them off the en-US parent. In particular this is true because IMO it is desirable for sidebar navigation to be consistent across localisations. But I'd be considering dumping them for sidebar navigation (and MDN) altogether. My "experience" is that there are better ways to build navigation. They do make sense for something like Wikipedia, where navigation is by category, not "guided" (which is where I see the value of sidebars). |
Something just struck me about this. Am I remembering correctly that I could previously click on a tag and see all the pages with that tag? If there's a way to look at stats of those URLs, someone should. Such stats may or may not be arguments for keeping tags, but they could be evidence of use cases that still need to be supported. For example, tags may have been the only way to see a complete list of all events. |
In other words: "hold on before we kill all tags" :) I'll let you content gurus hash it out, but I can predict that it would be trivial to add a site search by tag. You can actually already search by prefixes. E.g. https://developer.mozilla.org/api/v1/search?q=foreach&slug_prefix=web/htMl |
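For anyone who wants to try the prefix search from a script, here is a minimal sketch that builds the same kind of URL. The endpoint and the `q` and `slug_prefix` parameters come from the example URL above; the helper name `build_search_url` is made up for illustration, and nothing here assumes a tag parameter exists.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in the comment above.
SEARCH_ENDPOINT = "https://developer.mozilla.org/api/v1/search"


def build_search_url(query: str, slug_prefix: str) -> str:
    """Build a search URL restricted to slugs that start with slug_prefix.

    build_search_url is a hypothetical helper, not part of any MDN tooling.
    """
    params = {"q": query, "slug_prefix": slug_prefix}
    # safe="/" keeps the slash in the prefix readable instead of %2F.
    return f"{SEARCH_ENDPOINT}?{urlencode(params, safe='/')}"


if __name__ == "__main__":
    print(build_search_url("foreach", "web/html"))
```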
By the way @jpmedley @wbamberg if you have ideas how to get rid of https://developer.mozilla.org/en-US/docs/Web/API/Index I'm all ears. That page is so huuuge that it's omitted from the site-search and our sitemaps. <a href="/en-US/search?slug_prefix=web/api">Search all Web API pages</a> ...or something. |
I believe that this page was only used by the doc status pages, which don't exist any more. But @chrisdavidmills might know better than me. |
I think Will is right here. |
You got the gist of it. I was going more for making sure we're clear on all the consequences before moving forward, and making sure we have new implementations for valid use cases such as (possibly) the events problem I alluded to above.
That doesn't look like any index I've ever seen. It looks like something you'd create to make up for a lack of site search. I'll call this misnamed and something that should probably go away. I'd add that the site has no index at all. If you wanted an index, that's a different discussion, but I'm not sure we do. Taking your question at face value, an index would be a list of links only organized under letters of the alphabet as headings. Links would ideally be subjects rather than to single pages. For example: S...
This kind of thing would be more useful in the expository content where locations of answers to specific questions are less predictable. It seems like there would also need to be separate indexes for the various sections. For example, it's unlikely that someone searching for information on packaging an extension would need to see entries related to PWAs. In the reference material, it's not just redundant with site search, it's, in a way, redundant with reference structure itself. If I've spent any time at all in the reference, I can predict where all of these topics will be found. The fact that the items in that page seem to all link to single pages reinforces my feeling that it was a site search substitute. |
The The only reason to not auto-generate the deprecation header is that it would be nice in cases where we have a deprecated API superseded by a clear replacement, if we could include the replacement as part of the macro. That would allow removal of "state" tags. |
Should this be in the BCD instead? That would make the information available to all BCD clients and open use cases we haven't thought of yet. |
Not sure if this helps, but looking at the original issue here from @wbamberg ...
Actually. I'm going to stop there. There are many more macros that depend on them.
So it's pretty clear it's still relevant and important. But all of this reminds me, I should write a translation-differences checker that makes sure the translated document's tags match the en-US one. |
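A translation-differences check along those lines could be as small as a set comparison. This is only a sketch: it assumes the tags of both documents have already been loaded as lists of strings, and the function name `tag_differences` is hypothetical.

```python
def tag_differences(en_us_tags: list[str], translated_tags: list[str]) -> dict[str, list[str]]:
    """Report how a translated document's tags differ from its en-US parent.

    Hypothetical helper: reading the tags out of the documents is
    left to the caller and is not shown here.
    """
    parent = set(en_us_tags)
    child = set(translated_tags)
    return {
        # Tags on the en-US parent that the translation lacks.
        "missing": sorted(parent - child),
        # Tags that only exist on the translation.
        "extra": sorted(child - parent),
    }


if __name__ == "__main__":
    print(tag_differences(["API", "Reference", "Experimental"], ["API", "Référence"]))
```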
It may be a longer project than I thought, but I'm still in favor of moving to BCD when we can. It seems like |
Yes, there are some tags that are used in macros. But there are many many tags that are not. That's why we would need to do the analysis of which tags are used, so we can clean them. But, many of the places we use tags in macros are to identify the type of page something is. From your partial list above:
What I would like to do here is have a "page-type" front matter key instead, as in mdn/yari#3350. This would be useful for all sorts of things, including (for example) figuring out whether a page ought to contain a BCD table, or a specifications table, or a box listing event properties, or a link to a constructor page, ... As you said in that issue, Peter, technically that's possible right now, thanks to making front matter available to KS. Most of the other tags you've listed are to do with the lifecycle status of the feature:
Some of these are already represented in BCD, and some of them ("Obsolete") we are trying to get away from. |
Would it help to figure out which tags are in documents but never ever used in the |
Another thing that's occurred to me is that it's quite possible that |
Some of this discussion moved on to https://github.com/mdn/content/discussions/5162. Just a few points. In support of what @wbamberg said, the idea is to separate out the metadata related to document/object state and type from the more arbitrary "tags". The information would still be there in some form.
That would be great, but BCD owners are pretty protective of what goes in there (and rightly so).
I think so - part of the analysis.
This is why fetching from BCD is so handy. But yes, IMO all metadata in translated pages should be ignored, and probably should be stripped. Only the "master" version should be used, and it should be used everywhere. The only exception might be "tags after we have pulled out all the useful ones into dedicated keys". Note, we're not rendering the metadata. |
You don't know until you ask.
Can you provide an example? When I do reference pages I always do the BCD first, and I require that of the contractors I've been supervising the last year.
I'd need to see a use case before I'd be convinced that's necessary. |
That's off-topic but a great point!! sidebar: css but you could omit that stuff automatically if it says: page-type: css-selector because Yari could have a mapping of page-types => sidebar. |
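To make the page-type to sidebar mapping idea concrete, here is a rough sketch. The table contents, the function name `resolve_sidebar`, and the rule that an explicit `sidebar` key wins over the mapping are all assumptions for illustration; only the `sidebar: css` and `page-type: css-selector` pair comes from the comment above, and this is not how Yari is known to work.

```python
from typing import Optional

# Hypothetical mapping of page types to sidebars; only the
# css-selector -> css pair is taken from the discussion.
PAGE_TYPE_TO_SIDEBAR = {
    "css-selector": "css",
}


def resolve_sidebar(front_matter: dict) -> Optional[str]:
    """Pick a sidebar for a page from its front matter.

    Assumption: an explicit "sidebar" key overrides the mapping,
    so existing pages keep working while page-type is rolled out.
    """
    if "sidebar" in front_matter:
        return front_matter["sidebar"]
    return PAGE_TYPE_TO_SIDEBAR.get(front_matter.get("page-type"))


if __name__ == "__main__":
    print(resolve_sidebar({"page-type": "css-selector"}))
```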
Yes on using page-type to map to sidebars. https://github.com/mdn/content/discussions/5162 talks about that as a usage for page-type. On BCD: I think the proper scope of BCD is to describe the level of browser support for web platform features. I'm uncomfortable with it becoming a general repository for data about the web platform. Or at least, we could decide we wanted to do this but it would be a definite change of scope. Instead I'd like us to consider using front matter for data about web platform features at least in some cases. (I think perhaps this ship has pretty much sailed for at least some things, though, like spec_url and deprecated.) |
I'm mindful of the concern with scope. My concern with front matter overriding what's in BCD is that it implies that the BCD is incorrect. I wouldn't want incorrect information showing up in the third-party apps that consume BCD. |
Where are we with this?
If it's helpful, I can write a script that logs exactly only the tags used in any |
Yes, next step I would say is: make a list of all the tags currently used by macros (I guess that is the "list of valid tags") and remove all the rest. Then when we land page-type we can update macros to use that instead of tags, and remove all the tags that are proxies for page type. (That's if we can agree that we should remove unused tags. I know Joe was concerned that we might be losing important use cases by doing this, but tbh the state of our tags at the moment is such that I can't see that data being useful.) I've not done any work on this yet because it's not as much of a priority for me as getting the content Markdown-ready (like removing inline styles and updating live samples).
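Once that list exists, the cleanup step itself is a simple filter. A minimal sketch, assuming the macro-used tags are available as a set and that matching is exact (whether it should be case-insensitive is an open question, not something decided in this thread); `prune_tags` is a hypothetical name.

```python
def prune_tags(document_tags: list[str], macro_tags: set[str]) -> tuple[list[str], list[str]]:
    """Split a document's tags into those macros use and those they don't.

    Hypothetical helper: exact string matching is an assumption.
    Returns (kept, removed), both in the document's original order.
    """
    kept = [tag for tag in document_tags if tag in macro_tags]
    removed = [tag for tag in document_tags if tag not in macro_tags]
    return kept, removed


if __name__ == "__main__":
    # Example tag values are illustrative, not a real "valid tags" list.
    valid = {"Experimental", "Deprecated", "Non-standard"}
    print(prune_tags(["API", "Experimental", "NeedsContent"], valid))
```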
Is that a reliable way to know which tags are used across all our macros? Is that the only way tags are exposed? (e.g. there isn't a |
For all
The number just represents how many times it's called. For example, A quick one that pops out to me is |
Another interesting one is: |
Actually, the |
I guess this explains the |
@wbamberg Re #3424 (comment) "Yes, next step I would say is " Yes, but in addition, "create a new key/keys for document state tags: Non-standard, Experimental, Deprecated". These are our three most used tags. They are what cause rendering of the little icons in sidebars and other macros. The info should be pulled out of BCD where possible, but we still need a way for the data to be inserted in the page if there is no BCD entry. They are important enough to separate, and they would not be part of the "page type" data (I don't think?) |
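One way to picture "pull from BCD where possible, fall back to the page otherwise" is the sketch below. It relies on the `experimental`, `standard_track`, and `deprecated` flags that BCD records in a feature's status block; the front matter fallback, its shape, and the function name `resolve_status_labels` are assumptions, since no such key has been agreed on here.

```python
from typing import Optional


def resolve_status_labels(bcd_status: Optional[dict], front_matter_status: Optional[dict]) -> list[str]:
    """Return the state labels to show for a page.

    bcd_status: the feature's BCD status block, or None if the page
    has no BCD entry. front_matter_status: a hypothetical fallback
    with the same three flags, used only when BCD has nothing.
    """
    source = bcd_status if bcd_status is not None else (front_matter_status or {})
    labels = []
    if source.get("experimental"):
        labels.append("Experimental")
    if source.get("deprecated"):
        labels.append("Deprecated")
    # Only an explicit False means non-standard; a missing flag means unknown.
    if source.get("standard_track") is False:
        labels.append("Non-standard")
    return labels


if __name__ == "__main__":
    print(resolve_status_labels(None, {"deprecated": True, "standard_track": True}))
```

With this ordering BCD always wins when it has an entry, which sidesteps the concern about front matter implying that BCD is incorrect.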
"Junk" was how pages on the wiki were marked for archival/deletion. |
I know that translated content is lower priority but MDN gets millions of people to the non-English documents so it's always worth keeping it in mind. Just wanted to highlight that mdn/yari#3955 is coming. Because at the moment, I think the tags are just a scary nuisance for the translators. They might unnecessarily worry about "Oh no, do I have to make sure that it always matches?!?" |
I propose we move this to the Discussions tracker. |
In talking to @escattone we were wondering where and for what we use tags in MDN. Our tags were migrated wholesale from Kuma and in general are a real mess.
I think that now we don't expose them directly to users any more (?) but they are used in some macros. It would be good to understand which macros use them, and which specific tag values these macros are looking for, and whether we could clean out the other values.