-
Notifications
You must be signed in to change notification settings - Fork 228
Install same emitter only once when generating SDK #5690
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following pipelines have been queued for testing: |
|
@mikeharder is this a case where we should consider npm ci over npm install? |
We shouldn't have any package-lock.json there. Could you put more details why we should consider npm ci? |
|
@pshao25: In general, any scenario where we are installing a JavaScript-based tool, we should be installing from a lockfile to ensure we are pinning the versions of all transitive dependencies. Otherwise, our tool can be broken at any time by a dependency publishing a new minor or patch version with a breaking change. |
In development scenarios, I totally agree. But this script is for user scenarios. We published our emitter packages and this script could help customer install that package (download from public registry). In this case, we don't even pack the lockfile into that package. So IMO we should still use |
I think we should take a deeper look at the customer scenario as well. It's important to understand the difference between JS libraries and JS tools. Libraries are intended to be used in applications, where the application developer can (and should) use their own lockfile to pin all transitive dependencies. Libraries should float their dependencies to the widest range possible, for maximum interoperability with other libraries and to reduce the number of dependency versions that need to be installed. Tools are intended to be run directly from the command line, and the user will not have their own lockfile. Tools should do as much as they can to ensure they are not broken by dependency updates, since tools want to guarantee that if users install a particular version of the tool, it will always work exactly the same regardless of what new dependency versions. This means tools either need to ship their own lockfile or bundle all their dependencies. The easiest way is to include an https://docs.npmjs.com/cli/v9/configuring-npm/npm-shrinkwrap-json |
|
@mikeharder Ah I got your point now. I believe our customers use it as a JS tool. @m-nash Is there any possibility that our customers would use emitter as JS package? There is a quote from the link that discourage us from using shrinkwrap:
@m-nash there are 2 solutions:
I think other language repos don't have above files. They might need to do the same thing. |
|
I will defer to @mikeharder expertise here if you think our best approach is for all emitters to include a npm-shrinkwrap.json and @pshao25 you don't see any blocker reasons to not do that lets go with option 1. |
If you want to publish something as both a library and a tool, you should publish two packages. For example, the tool package could be named |
|
@pshao25: I think this conversation has diverged from the original purpose of this PR, which I believe is to install the emitter only once. Do all packages in the repo use the same emitter version, and do we think this will be true going forward? Do we ever see a world where different packages use different versions of the emitter? Might this be necessary as the emitter evolves and we may not want to regenerate all packages at once? If we are sure all packages will use the same emitter, should we just install it globally using |
|
FWIW, I'm in totally in sync with @mikeharder in spirit here, our emitters are applications, not library. While I'm not JS expert, if option 1 is the right eng approach to pin dependencies of our emitters here then let's do that. |
I think it should be an unique version for the repo. It doesn't mean you're forced to regenerate each time there is a codegen release (it's interesting debate, but not here :)), but if you regenerate, you need to follow the current recommendation hardcoded in the repo. It's been the way SDK Automation has been doing it by the way before even it was named like that (when it was SwaggerToSdk in a previous life) |
|
@mikeharder I created a new issue to track npm-shrinkwrap.json change. For the PR itself,
Yes.
No.
No, AFAIK.
My concern is if we install emitter globally, the dependecies (typespec compiler and others) would also be updated when we run this script in our dev machine. I don't think it is a good practice, because developers might want to install specific version of typespec compiler. I really don't want to contaminate the global place, and really don't want the place where I'm fetching the emitter to be contaminated (I mean it is possible we could install globally at any other place, right?) |
Is this script intended for use both in pipelines and on dev machines? If so, I agree this adds a lot of complications. Installing things globally in pipelines is usually fine, as long as it works for the duration of the pipeline. But we do want to be more careful on dev machines. I think it's fairly unusual for scripts under @weshaggard: Are you aware of any other scripts under |
Yes there are other scripts under eng/common (e.g. prepare-release.ps1) and if we need these scripts to be in all the repos I think it should be under eng/common for either local or pipelines. |
|
For my earlier question about npm ci I agree while we don't have a lock file we cannot use that and I agree we should try to use shrinkwrap instead to help get these tools installed. |
Yes. @mikeharder @weshaggard @benbp @lmazuel Thank you all for the suggestion. I will track the npm-shrinkwrap.json in another issue and notify other languages. If there is no further concern, I would have this merged after "cadl rename" finished. |
|
Definitely should not globally install it would mess with all other development / testing constantly. |
|
Following up here with a response to your email @pshao25 regarding parallel npm installs powershell locking. I think it would be better to investigate ways to solve this via npm and the project structure rather than adding more script logic on top of npm. For example, could workspaces or even manual |
No. We will still face the same problem when two projects start to install packages in parellel. For each project (which are independent and don't know each other), it calls "TypeSpec-Project-Generate.ps1" to 1) install npm packages and 2) generate SDK by this package. The problem here is: the package is not installed in advance, it is installed when we call "TypeSpec-Project-Generate.ps1". Therefore, no matter we are mono repo or not, as long as we install the pacakge in the same place, we are facing the process race issue.
No, they are the same.
Yes, that is exactly what we are doing in this PR. We try to install the package in the parent directory and hope it is only installed once. Then it comes to the problem we are facing "two processes might install together". And we want to hear the suggestion from you if it is OK to apply a file lock implemented in this PR. |
|
@m-nash @weshaggard @benbp I removed the lock and add an |
| $installedVersion = GetNpmPackageVersion $installedPath | ||
| Write-Host "Installed emitter version: $installedVersion" | ||
|
|
||
| $replacementVersion = GetEmitterDependencyVersion $replacementPackageJson |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not pass in the emitterName here instead of having the function called again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which function do you think I'm calling twice? Here I want to get two versions of the emitter package. One is the already installed one under root folder, the other is the going-to-be installed one in emitter-package.json.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$emitterName = &$GetEmitterNameFn You are calling this on line 50 and then again inside of the GetEmitterDeependecyVersion function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see you setting emitterName anywhere. Also you will need to pass it as a parameter here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It set globally in Line 122 too.
| @@ -36,17 +108,9 @@ function NpmInstallForProject([string]$workingDirectory) { | |||
| Remove-Item -Path "package-lock.json" -Force | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we also want to remove the places we are deleting the files here? I don't think we want to delete those unless we actually do an npm install.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to do that in case user installs emitter under the project folder. If we don't do that, node will not load the emitter installed under root. It will load the emitter installed under the project folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If files like package.json are present under this folder do we think that folks have it committed to the repo? If so then deleting it will cause a change in the files. I think we might actually want to warn people instead of blindly deleting these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This place $workingDirectory is in the gitignore. It is actually {root}\sdk\anomalydetector\Azure.AI.AnomalyDetector\TempCadlFiles\AnomalyDetector for example.
The scenario I'm thinking is: before this change, there will be package.json and npm_modules in the $workingDirectory. Then users apply this change, and they should load packages from root folder. So I want to remove the package.json and npm_modules existing before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is under the temp working directory why would there ever be a package.json and node_modules files under that directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think about these steps:
- Before this change, people run
TypeSpec-Project-Generate.ps1. Then in$workingDirectorythere arepackage.jsonandnpm_modules. - We get this change merged.
package.jsonandnpm_modulesare still there in$workingDirectory. - People run
TypeSpec-Project-Generate.ps1again, which will calltypespec compilefromnpm_modulesin$workingDirectoryinstead of fromnpm_modulesin root folder. And that is a critical problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that is the problem we are worried about we should just have folks remove the entire temp working directory (i.e. git clean -xdf). I don't think we should try and special case this one upgrade scenario.
|
The following pipelines have been queued for testing: |
|
The following pipelines have been queued for testing: |
|
|
||
| $hasPackageFile = Test-Path "package.json" | ||
|
|
||
| $npmInstallCommand = "npm install --prefix $root $emitterName@$replacementVersion --no-package-lock --omit=dev" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we start installing as a specific package like this we are making the assumption that there are no other packages in the emitter.package.json file. @raych1 @lmazuel is that assumption valid? If it is perhaps we should avoid any confusion by instead not having package.json file and simply track the package name and version instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the JS has an additional dependency currently, see https://github.com/Azure/azure-sdk-for-js/blob/main/eng/emitter-package.json. @MaryGao do you know if that extra dependency is needed or will it be included as a dependency of the emitter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no preference between these two. I did this change because @lmazuel mentioned above emitter is a tool not a library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make an explicit decision to make this change as it has larger implications on other things like other dependencies. I don't want to change to this unless we have thought it through and switched to that strategy everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@weshaggard , it's not aligned for all language emitters. I would suggest to use package.json for installation rather than install emitter individually. we could drop package.json to sdk folder to avoid conflicts with package.json in root folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to agree we should stick with a package.json file and I believe we can put it under the tempTypeSpec directory it just needs to live under the directory structure of wherever we put the TypeSpec project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@weshaggard , the purpose of @pshao25 is to avoid multiple times installation for emitter. Put package.json under tempTypeSpec directory doesn't work and it should be put under upper parent folder like sdk folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is fair we do need to put it in some shared location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@weshaggard the design was that we didn't have any additional dependencies in the emitter-package.json. This simulates exactly what a customer might do if they start with a language emitter. They will simply npm install our emitter and that should have ALL necessary dependencies defined in order to run. Since they are required to be defined in the emitters package.json there is no need to duplicate them in the emitter-package.json file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this will accommodate the devDependency pinning and emitter-package-lock.json work we're doing. We would prefer to npm ci using emitter-package.json and emitter-package-lock.json rather than explicit version installation.
| if ($LASTEXITCODE) { exit $LASTEXITCODE } | ||
|
|
||
| if ($hasPackageFile) { | ||
| git restore package.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we stick with the npm install $emitterName@$replacementVersion they I assume you don't need to worry about the package.json any longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, npm install will still modify/create package.json. This is how npm install works:
- Look for a
package.jsonfrom current folder up to outer folders until it finds one. Then it installs packages in that folder. In build agent, there are some folders up to repo root folder that containspackage.json. So I add--prefix $rootto make sure the installation will only happen in repo root folder. - When it installs packages, it modifies the
package.jsonto add that package if there is apackage.json, otherwise it creates a newpackage.json.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that is the case then that is all the more reason for use to continue to use the npm install with the emitter.package.json file as we don't want package.json files to be created or modifled based on that install command.
Can you detail out the dev experience when running things locally? It sounds like we would need to do the following.
This seems overly cumbersome for a local dev experience. Or are we assuming that locally you will never get this optimization to take effect? Instead of an env variable have we considered an optional command line argument? |
I don't think we are depending on any environment variables any longer. @pshao25 has switched to a model of simply looking to see if it is already installed and if so skip install otherwise install. I believe that should work just fine for the local dev scenarios as well. |
|
Hi @pshao25. Thank you for your interest in helping to improve the Azure SDK experience and for your contribution. We've noticed that there hasn't been recent engagement on this pull request. If this is still an active work stream, please let us know by pushing some changes or leaving a comment. Otherwise, we'll close this out in 7 days. |
|
Hi @pshao25. Thank you for your contribution. Since there hasn't been recent engagement, we're going to close this out. Feel free to respond with a comment containing |
Current behavior: we do
npm installfor exactly same packages in all the cadl projects. Samenpm installwould dontimes.After this change: Same emitter and its dependencies would install only once at the root folder. Reduce time cost from
O(n)toO(1).