Libraries APIs are fully documented using efficient workflows #44969

Open
6 of 16 tasks
Tracked by #44314
carlossanlop opened this issue Nov 19, 2020 · 43 comments
Labels
area-Meta Cost:XL Work that requires one engineer more than 4 weeks Epic Groups multiple user stories. Can be grouped under a theme. Team:Libraries

Comments

@carlossanlop
Member

carlossanlop commented Nov 19, 2020

Today, some of our source code repos, like Runtime, WPF, WinForms and WCF, consider the dotnet-api-docs repo the source of truth for their documentation. This poses some challenges:

  • We only use triple slash comments in source for seeding the documentation.
  • We have to manually port these comments to dotnet-api-docs so they show up both in MS Docs and in IntelliSense.
  • Before the ported comments get merged in dotnet-api-docs, they need to go through language review, which may change the contents considerably.
  • Once they get merged, the triple slash comments in source become obsolete.
  • We depend on the Docs build system to generate IntelliSense for us with the language-reviewed contents.
  • We need to consume the generated IntelliSense in the source code repos via a nuget package to make it available in the published SDK.
  • Any documentation changes/fixes need to be done in dotnet-api-docs, which may cause even greater discrepancies with the original triple slash comments, unless the developer also submits a PR to fix the comments there.
  • This complex manual process and the dependency round trip made it difficult to ensure APIs introduced in 1.x and 2.x were fully documented in MS Docs and IntelliSense. We improved our process for 3.x and 5.0 and prevented documentation debt in those versions, but we still had to do the whole process manually.
  • The fact that dotnet-api-docs shows shared documentation for .NET Core and .NET Framework is one of the main reasons why this process has remained the way it currently is.
  • We rarely add code examples to new APIs. The few examples we have live in dotnet-api-docs. Some of the existing ones use obsolete APIs, use APIs that only exist in .NET Framework, or show old coding conventions.

We would like to propose a series of changes in our documentation process that will simplify the developer's role and automate some of the steps. During .NET 6, we piloted this new documentation process with a subset of the .NET Libraries. That pilot produced the following outputs:

  • Overall feasibility and promise of the new process
  • An assessment of contributor satisfaction with the new process
  • An understanding of the challenges that would need to be overcome across the remaining libraries
  • A project plan for either completing the migration or canceling the pilot and reverting to the previous process, with a new User Story created and all involved work estimated

We will continue this plan in .NET 8.


Bring documentation from Docs to triple slash

Substitute all the triple slash comments in source code with the language-reviewed documentation that exists in dotnet-api-docs. We will do this on an assembly by assembly basis, and will enable the MSBuild property <GenerateDocumentationFile> to ensure new public APIs cause a build warning when they don't have documentation.
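
As a minimal sketch of what this enables (the `TemperatureConverter` type is hypothetical, not an API from any of these repos): once `GenerateDocumentationFile` is set to `true` in a project, the C# compiler emits the documentation xml from the triple slash comments and reports warning CS1591 for any publicly visible member that lacks one.

```csharp
namespace DocsSketch
{
    /// <summary>Converts temperatures between scales.</summary>
    public static class TemperatureConverter
    {
        /// <summary>Converts a temperature from degrees Celsius to degrees Fahrenheit.</summary>
        /// <param name="celsius">The temperature in degrees Celsius.</param>
        /// <returns>The equivalent temperature in degrees Fahrenheit.</returns>
        public static double ToFahrenheit(double celsius) => celsius * 9.0 / 5.0 + 32.0;

        // No triple slash comment here: with GenerateDocumentationFile enabled,
        // the compiler reports CS1591 for this public member.
        public static double ToCelsius(double fahrenheit) => (fahrenheit - 32.0) * 5.0 / 9.0;
    }
}
```

Whether that warning is promoted to an error depends on each project's warning settings, which this sketch does not assume.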

We will be using the dotnet/api-docs-sync PortToTripleSlash tool for this effort, for which I added the feature to port dotnet-api-docs to triple slash comments: https://github.com/dotnet/api-docs-sync/tree/main/src/PortToTripleSlash

Remarks

We won't backport remarks for the following reasons:

  • They are bulky. Really long remarks would have to be moved to external files and linked in the triple slash comments.
  • They aren't shown in VS intellisense.
  • Remarks usually contain links to code example external files. Files with code snippets will remain in the dotnet-api-docs repo, untouched. When (and if) we backport remarks containing links to those code snippets, the links will be relative to the dotnet-api-docs repo.
  • Remarks also contain embedded markdown code snippets. They would have to be moved to their own files and merged directly in dotnet-api-docs, to avoid having huge triple slash comments sections.

.NET Framework-only APIs

APIs that only exist in .NET Framework will continue having dotnet-api-docs as their source of truth.

APIs that are shared by both .NET Core and .NET Framework will have their source of truth in triple slash comments in .NET Core, making sure we preserve the differences in behavior between versions.

Tasks

Here we will list the assemblies that got their documentation backported.

To do - Add one item per assembly and link to PRs as they are created.


Merge blocking label and docs reviewers

We already have a bot task that automatically adds the new-api-needs-documentation label to PRs that are introducing new public APIs, but we want to make sure it also becomes a merge blocker, like the * NO MERGE * label does.

Once the PR has been reviewed by a maintainer, and they confirmed the new APIs have proper documentation, the label can be manually removed to unblock merging.

We also want the bot to automatically add the @dotnet/docs members as PR reviewers for language review.

Tasks
  • Make the new-api-needs-documentation label mandatory.
  • Automatically add language reviewers to PRs adding new APIs.
  • Update our readmes to describe the purpose of the label and what to expect from a PR review.
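
The merge-blocking rule described in this section could be sketched as follows. This is only an illustration of the intended behavior, not the bot's actual implementation; `MergeGate` and `CanMerge` are hypothetical names, and only the two label strings come from this issue.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MergeGate
{
    // Labels that block merging while they are present on a PR.
    private static readonly HashSet<string> BlockingLabels = new(StringComparer.OrdinalIgnoreCase)
    {
        "* NO MERGE *",
        "new-api-needs-documentation",
    };

    // A PR can merge only when none of its labels is a blocking label.
    // A maintainer removing the label manually is what unblocks the PR.
    public static bool CanMerge(IEnumerable<string> prLabels) =>
        !prLabels.Any(label => BlockingLabels.Contains(label));
}
```

For example, `MergeGate.CanMerge(new[] { "area-System.IO", "new-api-needs-documentation" })` returns `false`, and it returns `true` once the documentation label is removed.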

Automatic Docs build

Note: We can only begin this work once we have finished backporting the documentation from all assemblies.

Currently, whenever new APIs are added to the source code repos, we send the updated ref assemblies to the Docs team so they feed them to the Docs build system, which causes the regeneration of the dotnet-api-docs xml files, showing the new APIs. After this point, we can then manually port the documentation from triple slash, using https://github.com/dotnet/api-docs-sync/tree/main/src/PortToTripleSlash

From now on, we want to automate the process by automatically merging the ref assembly drop (it's just a commit in an internal repo). This drop will also contain the build-generated IntelliSense xmls, which would now contain the documentation source of truth, removing the step of manual porting.

Tasks
  • Automatically ship our generated intellisense packages to customers, instead of the ones we normally would bring from the dotnet-api-docs internal feed.
  • Exclude assemblies with backported documentation from the ref assemblies drop. Instead, just include the intellisense xml.

Debt prevention and docs fixes

At the time of writing this document, we have 900+ issues open in the dotnet-api-docs repo. We would like to consider these as part of the regular work planning for our dev teams. To make it easier to filter issues by area, we want a bot to add area labels automatically and to notify area owners (on a subscription basis, like in runtime).

Contributors will still be able to report documentation issues in dotnet-api-docs, but fixes will now be done directly in triple slash comments in source. PRs will be disabled in dotnet-api-docs except for maintainers.

Documentation for APIs that only exist in .NET Framework will continue to be done directly in dotnet-api-docs (that will be its source of truth).

Tasks
  • Finish documenting APIs introduced in 1.x, 2.x, 3.x, 5.x, 6.x and 7.x (dotnet/runtime/projects/60)
  • Add bot task to automatically add area labels to dotnet-api-docs issues.
  • Onboard dotnet-api-docs to the fabric bot that auto-mentions people, so that only subscribed users get mentioned in comments on new docs issues.
  • Consider Docs for our sprint planning and triaging. cc @jeffhandley
  • Update readme with new guidance on debt prevention and docs fixes.

Low pri / Nice to have

The following are tasks that are out of scope for this effort, but we would like to consider in the near future:

  • Add CI validation to the code snippets in the dotnet-api-docs repo.
  • Redirect the MS Docs Edit button to the source code file instead of the dotnet-api-docs xml file. To achieve this, we would also have to include PDBs in the drop that contains the ref assemblies and xmls, and the Docs team would have to read them to determine the location of the source code for an API.
  • Consider creating a new PR label needs doc update that would also become merge blocking, to ensure documentation gets updated when behavior is changed. Here is a good argument in favor of that.

Questions and suggestions are welcome.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Nov 19, 2020
@carlossanlop carlossanlop removed the untriaged New issue has not been triaged by the area owner label Nov 19, 2020
@jkotas
Member

jkotas commented Nov 19, 2020

In which source is the documentation going to live for APIs that have separate implementations per OS, per runtime or per architecture?

@safern
Member

safern commented Nov 19, 2020

In which source is the documentation going to live for APIs that have separate implementations per OS, per runtime or per architecture?

@carlossanlop and I are still discussing this. I will write up a design proposal for that and discuss it with other people to make sure we have a good convention and the right features for people to specify the source of truth that makes its way to the final ref pack. Once I have that I can loop you into the conversations as well, or just include it in this issue, but it feels like the description is quite large already.

@safern
Member

safern commented Nov 19, 2020

We also need to come up with a plan for APIs that live in private assemblies (System.Private.CoreLib, System.Private.Uri, etc.): how will their documentation land in the .xml of the ref assembly that exposes them? For example, String APIs should be under System.Runtime.xml.

@carlossanlop
Member Author

carlossanlop commented Nov 19, 2020

Does it mean contributors will have to open PRs to both the runtime repo and the docs repo

@SingleAccretion

  • If you are only adding or editing documentation, we expect contributors to create PRs in the source code repo (runtime/wpf/winforms/wcf). We will take care of automatically syncing that content with MS Docs.
  • Samples will still live in dotnet-api-docs. Adding a sample will require two PRs, one in dotnet-api-docs to add the sample file, and another in the source code repo to add the link in the API remarks (a link that will be relative to the dotnet-api-docs repo root).

Edit: Seems the original comment I'm replying to got deleted?

@carlossanlop
Member Author

carlossanlop commented Nov 19, 2020

@carlossanlop and I are still discussing this.

One proposal we are considering (and it would be nice to hear some opinions on this) is to add documentation to one file, based on a pre-defined priority. The first file we would try to find is the AnyOS file. If not found, we try to find others, like Windows, then Win32, then Unix, etc. Again, we would have to define a priority.

Then, once documentation is added to one file, the rest of them should also have triple slash comments to prevent the build warning, but the text would just be a boilerplate message indicating that is not the main documentation file.
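
A minimal sketch of the priority lookup described above. The order in `SuffixPriority` is only the example order from this comment (the real order is undecided), and `DocSourcePicker` and `PickDocFile` are hypothetical names.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public static class DocSourcePicker
{
    // Hypothetical priority order; the thread leaves the real order undecided.
    private static readonly string[] SuffixPriority = { "AnyOS", "Windows", "Win32", "Unix" };

    // Returns the file that would hold the main documentation for a type,
    // or null when no candidate matches a known suffix.
    public static string? PickDocFile(string typeName, IEnumerable<string> files)
    {
        List<string> candidates = files.ToList();
        foreach (string suffix in SuffixPriority)
        {
            string expected = $"{typeName}.{suffix}.cs";
            string? match = candidates.FirstOrDefault(
                f => string.Equals(f, expected, StringComparison.OrdinalIgnoreCase));
            if (match is not null)
            {
                return match;
            }
        }
        return null;
    }
}
```

With this order, `DocSourcePicker.PickDocFile("Path", new[] { "Path.Unix.cs", "Path.Windows.cs" })` picks `Path.Windows.cs`, so `Path.Unix.cs` would carry only the boilerplate comments.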

@jkotas
Member

jkotas commented Nov 20, 2020

If not found, we try to find others, like Windows, then Win32, then Unix, etc. Again, we would have to define a priority.

It does not feel right to define priority ordering of OSes. Also, it would lead to odd situations like Unix specific behaviors having to be documented in .Windows.cs file, etc.

Maybe we should have a dummy "AnyOS" or similar file for these cases that would have a throwing implementation and contain the documentation?

@carlossanlop
Member Author

it would lead to odd situations like Unix specific behaviors having to be documented in .Windows.cs

Sure, just keep in mind that's how we currently have our documentation in the dotnet-api-docs xmls: We do not have multiple <summary> sections for each OS/architecture/version. We put all our documentation in one place.

We need to keep in mind that we can only send to the Docs build system one build-generated intellisense xml file per ref assembly, from which all the documentation (for all OS/architectures/versions) will be copied and pasted into the ECMAxml files in dotnet-api-docs.

@safern also suggested we could add an MSBuild property in the csproj that would indicate which file is the source of truth.

But your idea @jkotas of always having an AnyOS file makes sense if we want to avoid confusion and we want to avoid the MSBuild property.

@safern
Member

safern commented Nov 20, 2020

But your idea @jkotas of always having an AnyOS file makes sense if we want to avoid confusion and we want to avoid the MSBuild property.

Wouldn't this be the equivalent of having a super long file with the APIs? Something like a ref assembly?

@jkotas
Member

jkotas commented Nov 20, 2020

I meant that we would follow the existing factoring - one file per type. For example, we would have Path.cs, Path.Unix.cs, Path.Windows.cs and Path.AnyOS.cs. Path.AnyOS.cs would have the docs.
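
To make the idea concrete, here is a rough sketch of what such a file could look like. `SamplePath` and `GetDirectorySeparator` are hypothetical stand-ins (not the real `System.IO.Path` members), and the sketch assumes the file is compiled only in an OS-agnostic configuration, with the real implementations living in the `.Unix.cs` and `.Windows.cs` files.

```csharp
// SamplePath.AnyOS.cs (hypothetical file name following the convention above)
using System;

namespace DocsSketch
{
    public static partial class SamplePath
    {
        /// <summary>Returns the character used to separate directory levels on the current platform.</summary>
        /// <returns>The directory separator character.</returns>
        public static char GetDirectorySeparator() =>
            // Throwing implementation: this file exists to carry the documentation.
            throw new PlatformNotSupportedException();
    }
}
```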

@krwq
Member

krwq commented Nov 20, 2020

How many such OS/Arch-specific APIs are there? Would it be possible to handle them manually? Also, should APIs have a generic description without implementation details? If there are many of them, perhaps we could combine the summaries into a single one:

On Windows:
<Windows summary>

On Unix:
<Unix summary>
...

maybe that wouldn't be perfect but it would temporarily solve the problem, list of such APIs could be gathered into an issue and later those could be improved by hand.

(and not sure how would that work for arguments but still I'm curious about the numbers first, possibly this problem is very small)

@safern
Member

safern commented Nov 20, 2020

I meant that we would follow the existing factoring - one file per type. For example, we would have Path.cs, Path.Unix.cs, Path.Windows.cs and Path.AnyOS.cs. Path.AnyOS.cs would have the docs.

I see. There are 2 things that make me wonder if that's the best approach:

  1. We have various projects (OOB packages) that cross-compile with AnyOS, and for the AnyOS configuration the build generates the PNSE platform (the PNSE generator filters per API when APIs are already included in the Compile item, so that wouldn't be a problem). However, for libraries that don't have an AnyOS configuration, we would need to add one (potentially impacting build times), add it to their project references, and exclude that asset from the product/package. So it would be a "doc"-only build.

  2. People that create a OS specific file and want to add docs to APIs that are split in between those OS files, will need to remember to add an OS agnostic build configuration to their project, and an AnyOS file for those types, by having to manually include all APIs of that type there and add docs into that API.

I still need to do an analysis on how many OS specific files we have and the patterns we use on the csproj to think on a better solution, but so far it seems like the "most" reasonable approach yet.

cc: @ericstj any thoughts?

@safern
Member

safern commented Nov 20, 2020

Another idea that comes to mind would be injecting a tfm-docs build that is done on every vertical for every project. Then, for APIs that are OS-specific, people would have to add a .Docs.cs file that throws PNSE, and we would grab the .xml output from that vertical as the source of truth.

@carlossanlop
Member Author

@krwq I haven't yet seen a <summary> that describes a different behavior between operating systems. This description split is usually seen in the exceptions an API throws, and even those are not that common.

Having a <Unix summary> wouldn't work because we still need a <summary> tag, otherwise the build warning will show up when the assembly gets the mandatory documentation MSBuild property enabled, because that required tag would be missing.

Bringing documentation from dotnet-api-docs into triple slash would mean bringing a single text description for each tag (<summary>, <returns>, etc.) that applies for all OS already, but if we were to follow your suggested approach, we would have to manually split each text for each OS, even though in the majority of the cases, the behavior doesn't really change. That is not the issue being discussed here - the issue is deciding in which file we want to paste those backported comments in triple slash.
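
As a sketch of the shape being described, the example below keeps one `<summary>` that applies to every OS and puts any platform difference in the `<exception>` text. `SampleFileOps.Touch` is a hypothetical API, and the per-OS wording in the second `<exception>` tag is illustrative only, not a statement about real behavior.

```csharp
using System;
using System.IO;

namespace DocsSketch
{
    public static class SampleFileOps
    {
        // The OS-specific wording in the second <exception> tag is illustrative only.

        /// <summary>Sets the last write time of the specified file to the current UTC time.</summary>
        /// <param name="path">The path of the file to update.</param>
        /// <exception cref="FileNotFoundException">The file at <paramref name="path"/> does not exist.</exception>
        /// <exception cref="UnauthorizedAccessException">On Unix: the caller lacks write permission. On Windows: the file is read-only.</exception>
        public static void Touch(string path) => File.SetLastWriteTimeUtc(path, DateTime.UtcNow);
    }
}
```

Because the required `<summary>` tag is present exactly once, this shape would not trigger the missing-documentation build warning.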

@GSPP

GSPP commented Nov 22, 2020

Could it be even easier for the general public to submit documentation improvements? Contributing to the GitHub repos is already quite a time investment. People need an account, and maybe they have never done anything like this. Nobody will casually do this.

I sometimes read the documentation on the Microsoft website and think "this really should be explained differently". If there was a button that I could use to submit a documentation fix right there, and that would make it into the documentation eventually, I think a lot of people would do that. This might even become a bit of an addiction for power users, just as contributing to Stack Overflow is addictive.

@huoyaoyuan
Member

Some complaints:

For certain simple APIs (namely Math.Max, Math.Sin and so on), their documentation is totally meaningless, and can be much longer than the implementation itself. It's even more meaningless for non-English speakers.
Putting them into triple-slash doc in code will downgrade the code reading experience.

@carlossanlop
Member Author

carlossanlop commented Nov 23, 2020

their documentation is totally meaningless, and can be much longer than the implementation itself. It's even more meaningless for non-English speakers.

@huoyaoyuan Whether the documentation is meaningless or not is not the issue we are trying to solve. This issue is about making it easier for contributors to add documentation. People are more familiar with triple slash comments in C# than with ECMAxmls from dotnet-api-docs.

Putting them into triple-slash doc in code will downgrade the code reading experience.

Why? You can always press a key combination to collapse the comment sections. For example, in Visual Studio, you can press Control+M+O. In VS Code you can press Control+K+0.

@huoyaoyuan
Member

Why? You can always press a key combination to collapse the comment sections. For example, in Visual Studio, you can press Control+M+O. In VS Code you can press Control+K+0.

Thanks for pointing this out. But sometimes I need to read the code on GitHub or source.dot.net.

@krwq
Member

krwq commented Nov 24, 2020

@huoyaoyuan you should file a separate issue on that on https://github.com/dotnet/dotnet-api-docs or https://github.com/dotnet/docs (not sure which is the correct one but they will likely redirect you).

@iSazonov
Contributor

iSazonov commented Mar 4, 2021

Thank you @iSazonov for the feedback.

This is certainly a complex issue: if we place xml-docs near the code then we'd have to run 2 build pipelines, but if we don't, we likely degrade the developer experience and hence are less likely to get quality docs.

I already have an open PR where an MSFT reviewer asked that we update the XML docs. A problem is that English is not my native language, and it took three(!) iterations with three(!) people before we got an acceptable result. (I'm not even sure that every native English speaker is capable of writing good documentation that fully complies with the repo standards.)

The only way to get good docs is to delegate updating the documentation to a professional writer and do it asynchronously.

@carlossanlop
Member Author

Update:

We want to emphasize how important it is for us to ensure this effort does not become a ship blocker at any point. One of the main concerns was having a fully working "source of truth hybrid mode": having intellisense.xml files generated only for some assemblies, and use them as the source of truth, while other assemblies kept using dotnet-api-docs xmls as their source of truth.

Yesterday, we tested this and we are happy to confirm the hybrid mode works:

@safern created a ref assemblies drop with only two intellisense.xml files: System.IO.Compression.Brotli and System.Numerics.Vectors. @gewarren created a new test branch in the Docs build system and fed it with the assemblies drop. The output showed that all assemblies kept using their existing docs from dotnet-api-docs xmls, while only those two assemblies updated their docs using the contents from the passed intellisense.xml files.

cc @jeffhandley @ericstj @danmoseley

@safern
Member

safern commented Mar 11, 2021

Another thing to keep in mind, if we end up using the hybrid mode to avoid blocking a release when not all docs are ported, is making sure the intellisense xml files that we ship with the product are correct and up to date. For example:

We have System.Numerics.Vectors using triple slash comments, which means its intellisense xml file is produced from its build. But for others like System.Runtime, whose source of truth is the docs repo, when we update the docs for an API in System.Runtime.xml we need to produce a new intellisense package with the updated System.Runtime.xml, consume it in dotnet/runtime, and ship that updated System.Runtime.xml as part of the ref pack.

So we need to be careful in this scenario that System.Numerics.Vectors.xml in the ref pack is the one produced from its build, while others like System.Runtime.xml in my example, which haven't been ported, ship the xml file coming from the docs repo with the latest content.
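
The selection rule for the hybrid mode could be sketched like this. It is an illustration under assumptions, not the actual build logic: `HybridDocSource`, `XmlOrigin` and `Resolve` are hypothetical names, and the backported set simply reuses the two assemblies from the test drop mentioned earlier in this thread.

```csharp
using System;
using System.Collections.Generic;

// Where the intellisense xml shipped in the ref pack comes from.
public enum XmlOrigin
{
    LocalBuild,   // produced from triple slash comments by the assembly's own build
    DocsPackage,  // taken from the intellisense package generated from the docs repo
}

public static class HybridDocSource
{
    // Assemblies whose documentation has been backported to triple slash (assumed list).
    private static readonly HashSet<string> Backported = new(StringComparer.OrdinalIgnoreCase)
    {
        "System.IO.Compression.Brotli",
        "System.Numerics.Vectors",
    };

    // Backported assemblies ship their build-generated xml; all others keep the docs repo xml.
    public static XmlOrigin Resolve(string assemblyName) =>
        Backported.Contains(assemblyName) ? XmlOrigin.LocalBuild : XmlOrigin.DocsPackage;
}
```

Here `HybridDocSource.Resolve("System.Runtime")` yields `XmlOrigin.DocsPackage`, while `HybridDocSource.Resolve("System.Numerics.Vectors")` yields `XmlOrigin.LocalBuild`.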

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 12, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jun 21, 2021
@jeffhandley jeffhandley modified the milestones: 6.0.0, 7.0.0 Jun 25, 2021
@jeffhandley jeffhandley changed the title Pilot a new process that extracts API documentation from source code Libraries APIs are fully documented using efficient workflows Jan 9, 2022
@jeffhandley jeffhandley added Epic Groups multiple user stories. Can be grouped under a theme. Priority:2 Work that is important, but not critical for the release and removed User Story A single user-facing feature. Can be grouped under an epic. Bottom Up Work Not part of a theme, epic, or user story Priority:1 Work that is critical for the release, but we could probably ship without labels Jan 9, 2022
@jeffhandley jeffhandley modified the milestones: 7.0.0, Future Jul 7, 2022
@jeffhandley jeffhandley removed the Priority:2 Work that is important, but not critical for the release label Jul 7, 2022
@jeffhandley
Member

Update: I've moved this to Future and we are going to close all of the issues for backporting api docs to triple-slash comments for now. As we concluded early in the .NET 7.0 release cycle, we need to invest more into the DocsPortingTool to set this effort up for success. When we're able to revisit this, we will open new issues per area again.
