Install same emitter only once when generating SDK #5690

pshao25 · 2023-03-13T05:42:02Z

Current behavior: we do npm install for exactly same packages in all the cadl projects. Same npm install would do n times.

After this change: Same emitter and its dependencies would install only once at the root folder. Reduce time cost from O(n) to O(1).

azure-sdk · 2023-03-13T05:46:14Z

The following pipelines have been queued for testing:
java - template
java - template - tests
js - template
net - template
net - template - tests
python - template
python - template - tests
You can sign off on the approval gate to test the release stage of each pipeline.
See eng/common workflow

eng/common/scripts/Cadl-Project-Generate.ps1

weshaggard · 2023-03-13T22:31:09Z

@mikeharder is this a case where we should consider npm ci over npm install?

pshao25 · 2023-03-14T00:24:26Z

@mikeharder is this a case where we should consider npm ci over npm install?

We shouldn't have any package-lock.json there. Could you put more details why we should consider npm ci?

mikeharder · 2023-03-14T00:40:10Z

@pshao25: In general, any scenario where we are installing a JavaScript-based tool, we should be installing from a lockfile to ensure we are pinning the versions of all transitive dependencies. Otherwise, our tool can be broken at any time by a dependency publishing a new minor or patch version with a breaking change.

pshao25 · 2023-03-14T02:32:47Z

@pshao25: In general, any scenario where we are installing a JavaScript-based tool, we should be installing from a lockfile to ensure we are pinning the versions of all transitive dependencies. Otherwise, our tool can be broken at any time by a dependency publishing a new minor or patch version with a breaking change.

In development scenarios, I totally agree.

But this script is for user scenarios. We published our emitter packages and this script could help customer install that package (download from public registry). In this case, we don't even pack the lockfile into that package. So IMO we should still use npm install.

mikeharder · 2023-03-14T06:01:46Z

@pshao25: In general, any scenario where we are installing a JavaScript-based tool, we should be installing from a lockfile to ensure we are pinning the versions of all transitive dependencies. Otherwise, our tool can be broken at any time by a dependency publishing a new minor or patch version with a breaking change.

In development scenarios, I totally agree.

But this script is for user scenarios. We published our emitter packages and this script could help customer install that package (download from public registry). In this case, we don't even pack the lockfile into that package. So IMO we should still use npm install.

I think we should take a deeper look at the customer scenario as well. It's important to understand the difference between JS libraries and JS tools.

Libraries are intended to be used in applications, where the application developer can (and should) use their own lockfile to pin all transitive dependencies. Libraries should float their dependencies to the widest range possible, for maximum interoperability with other libraries and to reduce the number of dependency versions that need to be installed.

Tools are intended to be run directly from the command line, and the user will not have their own lockfile. Tools should do as much as they can to ensure they are not broken by dependency updates, since tools want to guarantee that if users install a particular version of the tool, it will always work exactly the same regardless of what new dependency versions. This means tools either need to ship their own lockfile or bundle all their dependencies. The easiest way is to include an npm-shrinkwrap.json file in the package itself.

https://docs.npmjs.com/cli/v9/configuring-npm/npm-shrinkwrap-json

pshao25 · 2023-03-14T07:35:39Z

@mikeharder Ah I got your point now. I believe our customers use it as a JS tool. @m-nash Is there any possibility that our customers would use emitter as JS package?

There is a quote from the link that discourage us from using shrinkwrap:

It's strongly discouraged for library authors to publish this file, since that would prevent end users from having control over transitive dependency updates.

@m-nash there are 2 solutions:

We include an npm-shrinkwrap.json file in the package itself as Mike suggested.
We could add a emitter-package-lock.json right near emitter-package.json, so that when we install emitter from sdk repo, we could lock the dependency version.

I think other language repos don't have above files. They might need to do the same thing.

m-nash · 2023-03-14T17:19:35Z

I will defer to @mikeharder expertise here if you think our best approach is for all emitters to include a npm-shrinkwrap.json and @pshao25 you don't see any blocker reasons to not do that lets go with option 1.

mikeharder · 2023-03-14T17:45:48Z

It's strongly discouraged for library authors to publish this file, since that would prevent end users from having control over transitive dependency updates.

If you want to publish something as both a library and a tool, you should publish two packages. For example, the tool package could be named foo, and the library package something like foo-lib. The tool package would depend on the library package and include an npm-shrinkwrap.json, and the library package would not include npm-shrinkwrap.json.

mikeharder · 2023-03-14T18:03:24Z

@pshao25: I think this conversation has diverged from the original purpose of this PR, which I believe is to install the emitter only once. Do all packages in the repo use the same emitter version, and do we think this will be true going forward? Do we ever see a world where different packages use different versions of the emitter? Might this be necessary as the emitter evolves and we may not want to regenerate all packages at once?

If we are sure all packages will use the same emitter, should we just install it globally using npm install -g?

lmazuel · 2023-03-14T19:59:53Z

FWIW, I'm in totally in sync with @mikeharder in spirit here, our emitters are applications, not library. While I'm not JS expert, if option 1 is the right eng approach to pin dependencies of our emitters here then let's do that.

lmazuel · 2023-03-14T20:02:38Z

@pshao25 Pan Shao FTE: I think this conversation has diverged from the original purpose of this PR, which I believe is to install the emitter only once. Do all packages in the repo use the same emitter version, and do we think this will be true going forward? Do we ever see a world where different packages use different versions of the emitter? Might this be necessary as the emitter evolves and we may not want to regenerate all packages at once?

If we are sure all packages will use the same emitter, should we just install it globally using npm install -g?

I think it should be an unique version for the repo. It doesn't mean you're forced to regenerate each time there is a codegen release (it's interesting debate, but not here :)), but if you regenerate, you need to follow the current recommendation hardcoded in the repo. It's been the way SDK Automation has been doing it by the way before even it was named like that (when it was SwaggerToSdk in a previous life)

pshao25 · 2023-03-15T08:15:07Z

@mikeharder I created a new issue to track npm-shrinkwrap.json change. For the PR itself,

Do all packages in the repo use the same emitter version, and do we think this will be true going forward?

Yes.

Do we ever see a world where different packages use different versions of the emitter?

No.

Might this be necessary as the emitter evolves and we may not want to regenerate all packages at once?

No, AFAIK.

If we are sure all packages will use the same emitter, should we just install it globally using npm install -g?

My concern is if we install emitter globally, the dependecies (typespec compiler and others) would also be updated when we run this script in our dev machine. I don't think it is a good practice, because developers might want to install specific version of typespec compiler. I really don't want to contaminate the global place, and really don't want the place where I'm fetching the emitter to be contaminated (I mean it is possible we could install globally at any other place, right?)

mikeharder · 2023-03-15T17:50:56Z

@pshao25

My concern is if we install emitter globally, the dependecies (typespec compiler and others) would also be updated when we run this script in our dev machine.

Is this script intended for use both in pipelines and on dev machines? If so, I agree this adds a lot of complications. Installing things globally in pipelines is usually fine, as long as it works for the duration of the pipeline. But we do want to be more careful on dev machines.

I think it's fairly unusual for scripts under tools/eng/common to be used in both pipelines and dev machines.

@weshaggard: Are you aware of any other scripts under tools/eng/common used on dev machines?

weshaggard · 2023-03-15T23:05:09Z

@weshaggard: Are you aware of any other scripts under tools/eng/common used on dev machines?

Yes there are other scripts under eng/common (e.g. prepare-release.ps1) and if we need these scripts to be in all the repos I think it should be under eng/common for either local or pipelines.

weshaggard · 2023-03-15T23:06:05Z

For my earlier question about npm ci I agree while we don't have a lock file we cannot use that and I agree we should try to use shrinkwrap instead to help get these tools installed.

pshao25 · 2023-03-16T07:00:10Z

Is this script intended for use both in pipelines and on dev machines?

Yes.

@mikeharder @weshaggard @benbp @lmazuel Thank you all for the suggestion. I will track the npm-shrinkwrap.json in another issue and notify other languages.

If there is no further concern, I would have this merged after "cadl rename" finished.

m-nash · 2023-03-16T21:20:36Z

Definitely should not globally install it would mess with all other development / testing constantly.

eng/common/scripts/Cadl-Project-Generate.ps1

benbp · 2023-03-23T20:01:24Z

Following up here with a response to your email @pshao25 regarding parallel npm installs powershell locking. I think it would be better to investigate ways to solve this via npm and the project structure rather than adding more script logic on top of npm. For example, could workspaces or even manual npm link commands work? Are the child directory package.json files different and if not, could we instead only keep a single package.json in the relevant parent directory (this way npm install would bubble up)?

pshao25 · 2023-03-24T10:17:46Z

For example, could workspaces or even manual npm link commands work?

No. We will still face the same problem when two projects start to install packages in parellel. For each project (which are independent and don't know each other), it calls "TypeSpec-Project-Generate.ps1" to 1) install npm packages and 2) generate SDK by this package. The problem here is: the package is not installed in advance, it is installed when we call "TypeSpec-Project-Generate.ps1". Therefore, no matter we are mono repo or not, as long as we install the pacakge in the same place, we are facing the process race issue.

Are the child directory package.json files different

No, they are the same.

and if not, could we instead only keep a single package.json in the relevant parent directory (this way npm install would bubble up)?

Yes, that is exactly what we are doing in this PR. We try to install the package in the parent directory and hope it is only installed once. Then it comes to the problem we are facing "two processes might install together". And we want to hear the suggestion from you if it is OK to apply a file lock implemented in this PR.

eng/common/scripts/Helpers/Lock-Helpers.ps1

pshao25 · 2023-03-31T06:14:41Z

@m-nash @weshaggard @benbp I removed the lock and add an $env:SKIPINSTALL to control the installation. The extra steps we need to take is: each pipeline should add this variable and a step to install the package itself. Please let me know if this solution is OK to go.

eng/common/scripts/TypeSpec-Project-Generate.ps1

weshaggard · 2023-03-31T18:06:25Z

eng/common/scripts/TypeSpec-Project-Generate.ps1

+            $installedVersion = GetNpmPackageVersion $installedPath
+            Write-Host "Installed emitter version: $installedVersion"
+
+            $replacementVersion = GetEmitterDependencyVersion $replacementPackageJson


Why not pass in the emitterName here instead of having the function called again?

Which function do you think I'm calling twice? Here I want to get two versions of the emitter package. One is the already installed one under root folder, the other is the going-to-be installed one in emitter-package.json.

$emitterName = &$GetEmitterNameFn You are calling this on line 50 and then again inside of the GetEmitterDeependecyVersion function.

I don't see you setting emitterName anywhere. Also you will need to pass it as a parameter here.

It set globally in Line 122 too.

weshaggard · 2023-03-31T18:10:49Z

eng/common/scripts/TypeSpec-Project-Generate.ps1

@@ -36,17 +108,9 @@ function NpmInstallForProject([string]$workingDirectory) {
            Remove-Item -Path "package-lock.json" -Force


Don't we also want to remove the places we are deleting the files here? I don't think we want to delete those unless we actually do an npm install.

We need to do that in case user installs emitter under the project folder. If we don't do that, node will not load the emitter installed under root. It will load the emitter installed under the project folder.

If files like package.json are present under this folder do we think that folks have it committed to the repo? If so then deleting it will cause a change in the files. I think we might actually want to warn people instead of blindly deleting these.

This place $workingDirectory is in the gitignore. It is actually {root}\sdk\anomalydetector\Azure.AI.AnomalyDetector\TempCadlFiles\AnomalyDetector for example.

The scenario I'm thinking is: before this change, there will be package.json and npm_modules in the $workingDirectory. Then users apply this change, and they should load packages from root folder. So I want to remove the package.json and npm_modules existing before.

If this is under the temp working directory why would there ever be a package.json and node_modules files under that directory?

Think about these steps:

Before this change, people run TypeSpec-Project-Generate.ps1. Then in $workingDirectory there are package.json and npm_modules.

We get this change merged. package.json and npm_modules are still there in $workingDirectory.

People run TypeSpec-Project-Generate.ps1 again, which will call typespec compile from npm_modules in $workingDirectory instead of from npm_modules in root folder. And that is a critical problem.

If that is the problem we are worried about we should just have folks remove the entire temp working directory (i.e. git clean -xdf). I don't think we should try and special case this one upgrade scenario.

eng/common/scripts/TypeSpec-Project-Generate.ps1

azure-sdk · 2023-04-07T04:51:32Z

The following pipelines have been queued for testing:

You can sign off on the approval gate to test the release stage of each pipeline.
See eng/common workflow

eng/common/scripts/TypeSpec-Project-Generate.ps1

azure-sdk · 2023-04-10T03:12:07Z

The following pipelines have been queued for testing:

You can sign off on the approval gate to test the release stage of each pipeline.
See eng/common workflow

weshaggard · 2023-04-10T16:47:11Z

eng/common/scripts/TypeSpec-Project-Generate.ps1

+
+        $hasPackageFile = Test-Path "package.json"
+
+        $npmInstallCommand = "npm install --prefix $root $emitterName@$replacementVersion --no-package-lock --omit=dev"


If we start installing as a specific package like this we are making the assumption that there are no other packages in the emitter.package.json file. @raych1 @lmazuel is that assumption valid? If it is perhaps we should avoid any confusion by instead not having package.json file and simply track the package name and version instead.

I see the JS has an additional dependency currently, see https://github.com/Azure/azure-sdk-for-js/blob/main/eng/emitter-package.json. @MaryGao do you know if that extra dependency is needed or will it be included as a dependency of the emitter?

I have no preference between these two. I did this change because @lmazuel mentioned above emitter is a tool not a library.

We should make an explicit decision to make this change as it has larger implications on other things like other dependencies. I don't want to change to this unless we have thought it through and switched to that strategy everywhere.

@weshaggard , it's not aligned for all language emitters. I would suggest to use package.json for installation rather than install emitter individually. we could drop package.json to sdk folder to avoid conflicts with package.json in root folder.

I tend to agree we should stick with a package.json file and I believe we can put it under the tempTypeSpec directory it just needs to live under the directory structure of wherever we put the TypeSpec project.

@weshaggard , the purpose of @pshao25 is to avoid multiple times installation for emitter. Put package.json under tempTypeSpec directory doesn't work and it should be put under upper parent folder like sdk folder.

That is fair we do need to put it in some shared location.

@weshaggard the design was that we didn't have any additional dependencies in the emitter-package.json. This simulates exactly what a customer might do if they start with a language emitter. They will simply npm install our emitter and that should have ALL necessary dependencies defined in order to run. Since they are required to be defined in the emitters package.json there is no need to duplicate them in the emitter-package.json file.

I don't think this will accommodate the devDependency pinning and emitter-package-lock.json work we're doing. We would prefer to npm ci using emitter-package.json and emitter-package-lock.json rather than explicit version installation.

weshaggard · 2023-04-10T17:05:33Z

eng/common/scripts/TypeSpec-Project-Generate.ps1

+        if ($LASTEXITCODE) { exit $LASTEXITCODE }
+
+        if ($hasPackageFile) {
+            git restore package.json


If we stick with the npm install $emitterName@$replacementVersion they I assume you don't need to worry about the package.json any longer.

No, npm install will still modify/create package.json. This is how npm install works:

Look for a package.json from current folder up to outer folders until it finds one. Then it installs packages in that folder. In build agent, there are some folders up to repo root folder that contains package.json. So I add --prefix $root to make sure the installation will only happen in repo root folder.

When it installs packages, it modifies the package.json to add that package if there is a package.json, otherwise it creates a new package.json.

If that is the case then that is all the more reason for use to continue to use the npm install with the emitter.package.json file as we don't want package.json files to be created or modifled based on that install command.

m-nash · 2023-05-30T20:28:38Z

@m-nash @weshaggard @benbp I removed the lock and add an $env:SKIPINSTALL to control the installation. The extra steps we need to take is: each pipeline should add this variable and a step to install the package itself. Please let me know if this solution is OK to go.

Can you detail out the dev experience when running things locally?

It sounds like we would need to do the following.

copy the emitter-package.json to the root of the lang repo
run npm install on this package.json
Set this env variable SKIPINSTALL
reload the env variables for all tools / shells that might use that env variable but cache these
run our sync / generate commands

This seems overly cumbersome for a local dev experience. Or are we assuming that locally you will never get this optimization to take effect?

Instead of an env variable have we considered an optional command line argument?

weshaggard · 2023-05-30T20:50:05Z

@m-nash @weshaggard @benbp I removed the lock and add an $env:SKIPINSTALL to control the installation. The extra steps we need to take is: each pipeline should add this variable and a step to install the package itself. Please let me know if this solution is OK to go.

Can you detail out the dev experience when running things locally?

It sounds like we would need to do the following.

copy the emitter-package.json to the root of the lang repo

run npm install on this package.json

Set this env variable SKIPINSTALL

reload the env variables for all tools / shells that might use that env variable but cache these

run our sync / generate commands

This seems overly cumbersome for a local dev experience. Or are we assuming that locally you will never get this optimization to take effect?

Instead of an env variable have we considered an optional command line argument?

I don't think we are depending on any environment variables any longer. @pshao25 has switched to a model of simply looking to see if it is already installed and if so skip install otherwise install. I believe that should work just fine for the local dev scenarios as well.

…feature/oneEmitter

github-actions · 2023-11-10T05:03:36Z

Hi @pshao25. Thank you for your interest in helping to improve the Azure SDK experience and for your contribution. We've noticed that there hasn't been recent engagement on this pull request. If this is still an active work stream, please let us know by pushing some changes or leaving a comment. Otherwise, we'll close this out in 7 days.

github-actions · 2023-11-17T08:31:57Z

Hi @pshao25. Thank you for your contribution. Since there hasn't been recent engagement, we're going to close this out. Feel free to respond with a comment containing /reopen if you'd like to continue working on these changes. Please be sure to use the command to reopen or remove the no-recent-activity label; otherwise, this is likely to be closed again with the next cleanup pass.

Install same emitter only once when generating SDK

a898146

pshao25 requested review from benbp, konrad-jamrozik and weshaggard as code owners March 13, 2023 05:42

pshao25 requested a review from m-nash March 13, 2023 05:47

archerzz approved these changes Mar 13, 2023

View reviewed changes

benbp reviewed Mar 13, 2023

View reviewed changes

eng/common/scripts/Cadl-Project-Generate.ps1 Outdated Show resolved Hide resolved

weshaggard requested review from chunyu3 and mikeharder March 13, 2023 22:28

pshao25 mentioned this pull request Mar 15, 2023

Add npm-shrinkwrap.json to published emitter Azure/autorest.csharp#3203

Open

m-nash reviewed Mar 16, 2023

View reviewed changes

eng/common/scripts/Cadl-Project-Generate.ps1 Outdated Show resolved Hide resolved

Merge branch 'main' into feature/oneEmitter

7bbb41c

iscai-msft mentioned this pull request Mar 22, 2023

[typespec] package json with just emitter dependency Azure/azure-sdk-for-python#29521

Merged

weshaggard reviewed Mar 24, 2023

View reviewed changes

eng/common/scripts/Helpers/Lock-Helpers.ps1 Outdated Show resolved Hide resolved

UpdatE

eecb802

weshaggard reviewed Mar 31, 2023

View reviewed changes

eng/common/scripts/TypeSpec-Project-Generate.ps1 Outdated Show resolved Hide resolved

weshaggard reviewed Mar 31, 2023

View reviewed changes

eng/common/scripts/TypeSpec-Project-Generate.ps1 Show resolved Hide resolved

weshaggard reviewed Mar 31, 2023

View reviewed changes

eng/common/scripts/TypeSpec-Project-Generate.ps1 Outdated Show resolved Hide resolved

weshaggard reviewed Mar 31, 2023

View reviewed changes

eng/common/scripts/TypeSpec-Project-Generate.ps1 Outdated Show resolved Hide resolved

pshao25 mentioned this pull request Apr 4, 2023

TypeSpec-Project-Sync.ps1 isn't thread safe in running in parallel Azure/azure-sdk-for-net#35310

Closed

Azure deleted a comment from weshaggard Apr 4, 2023

pshao25 added 2 commits April 6, 2023 13:55

Update

9c056d1

Update

fe7d415

benbp reviewed Apr 7, 2023

View reviewed changes

eng/common/scripts/TypeSpec-Project-Generate.ps1 Show resolved Hide resolved

weshaggard reviewed Apr 7, 2023

View reviewed changes

eng/common/scripts/TypeSpec-Project-Generate.ps1 Show resolved Hide resolved

UpdatE

e6ccda1

weshaggard reviewed Apr 10, 2023

View reviewed changes

Merge branch 'main' of https://github.com/Azure/azure-sdk-tools into …

4a09259

…feature/oneEmitter

pshao25 mentioned this pull request Jul 10, 2023

Optimize package installation Azure/autorest.csharp#3575

Closed

github-actions bot added the no-recent-activity There has been no recent activity on this issue. label Nov 10, 2023

github-actions bot closed this Nov 17, 2023

		@@ -36,17 +108,9 @@ function NpmInstallForProject([string]$workingDirectory) {
		Remove-Item -Path "package-lock.json" -Force


		$hasPackageFile = Test-Path "package.json"

		$npmInstallCommand = "npm install --prefix $root $emitterName@$replacementVersion --no-package-lock --omit=dev"

Install same emitter only once when generating SDK #5690

Install same emitter only once when generating SDK #5690

Uh oh!

Conversation

pshao25 commented Mar 13, 2023

Uh oh!

azure-sdk commented Mar 13, 2023

Uh oh!

Uh oh!

weshaggard commented Mar 13, 2023

Uh oh!

pshao25 commented Mar 14, 2023

Uh oh!

mikeharder commented Mar 14, 2023

Uh oh!

pshao25 commented Mar 14, 2023

Uh oh!

mikeharder commented Mar 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pshao25 commented Mar 14, 2023

Uh oh!

m-nash commented Mar 14, 2023

Uh oh!

mikeharder commented Mar 14, 2023

Uh oh!

mikeharder commented Mar 14, 2023

Uh oh!

lmazuel commented Mar 14, 2023

Uh oh!

lmazuel commented Mar 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pshao25 commented Mar 15, 2023

Uh oh!

mikeharder commented Mar 15, 2023

Uh oh!

weshaggard commented Mar 15, 2023

Uh oh!

weshaggard commented Mar 15, 2023

Uh oh!

pshao25 commented Mar 16, 2023

Uh oh!

m-nash commented Mar 16, 2023

Uh oh!

Uh oh!

benbp commented Mar 23, 2023

Uh oh!

pshao25 commented Mar 24, 2023

Uh oh!

Uh oh!

pshao25 commented Mar 31, 2023

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pshao25 Apr 4, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mikeharder commented Mar 14, 2023 •

edited

Loading

lmazuel commented Mar 14, 2023 •

edited

Loading

pshao25 Apr 4, 2023 •

edited

Loading

raych1 Apr 14, 2023 •

edited

Loading

m-nash May 30, 2023 •

edited

Loading