Too many open files when restoring on Mac #2163

Closed
moozzyk opened this issue Feb 24, 2016 · 32 comments
Labels: Functionality:Restore, Platform:Xplat, Resolution:External (this issue appears to be external to NuGet), Resolution:NotRepro (we are not able to reproduce this problem; better steps would be helpful), Type:Bug

@moozzyk

moozzyk commented Feb 24, 2016

Looks like the fix for #2004 was not enough. I am still hitting the same issue on some repos.

I am using dotnet-osx-x64.1.0.0.001496.tar.gz which contains the fix for #2004

To repro:

Log from the CI:
https://s3.amazonaws.com/archive.travis-ci.org/jobs/111327930/log.txt

@moozzyk
Author

moozzyk commented Mar 1, 2016

You can use opensnoop to track open files or lsof to take a snapshot of open files.
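
A minimal sketch of how the descriptor count of a running process could be sampled over time, assuming a POSIX shell. The function names `count_fds` and `watch_fds` are made up for illustration, and the `/proc` branch is an assumption that only applies on Linux; on a Mac the `lsof` fallback is used.

```shell
# Count open file descriptors for a PID.
# Uses /proc/<pid>/fd where it exists (Linux); otherwise falls back to lsof
# and keeps only rows whose FD column starts with a digit (real descriptors,
# not cwd/txt/mem entries).
count_fds() {
  pid=$1
  if [ -d "/proc/$pid/fd" ]; then
    ls "/proc/$pid/fd" | wc -l | tr -d ' '
  else
    lsof -p "$pid" 2>/dev/null | awk '$4 ~ /^[0-9]/ { n++ } END { print n + 0 }'
  fi
}

# Print one timestamped sample per second for as long as the process
# can be signalled (that is, while it is still running).
watch_fds() {
  pid=$1
  while kill -0 "$pid" 2>/dev/null; do
    printf '%s %s\n' "$(date +%H:%M:%S)" "$(count_fds "$pid")"
    sleep 1
  done
}
```

Running `watch_fds <pid>` in a second terminal while the restore is in progress should show how close the process gets to the `ulimit -n` value before it fails.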

@zhili1208
Contributor

@moozzyk I just verified with the latest bits; it looks like it's fixed by @emgarten's fix. Feel free to reopen it if it still repros.

@moozzyk
Author

moozzyk commented Mar 23, 2016

I hope the new dotnet CLI we pick up will have this fix so that I can verify this end to end.

@zhili1208
Contributor

@moozzyk please try the new dotnet; I think it already has this fix.

@agocke

agocke commented Mar 27, 2016

I just hit this with 1.0.0.002060

@zhili1208
Contributor

@agocke we just found one issue related to this and will fix it soon. Can you provide a repro solution? Then we can verify it with the fix.

@agocke

agocke commented Mar 28, 2016

@zhili1208 It's the Roslyn solution -- I'm in the process of making a PR for this. If you get a build, lemme know and I can try it. Otherwise I'll set up a repro for you.

@agocke

agocke commented Apr 14, 2016

I can confirm this is still happening:

Unhandled Exception: System.IO.IOException: Too many open files
   at Interop.ThrowExceptionForIoErrno(ErrorInfo errorInfo, String path, Boolean isDirectory, Func`2 errorRewriter)
   at Interop.CheckIo[TSafeHandle](TSafeHandle handle, String path, Boolean isDirectory, Func`2 errorRewriter)
   at System.ConsolePal.OpenStandardOutput()
   at System.Console.<>c.<get_Out>b__25_0()
   at System.Console.EnsureInitialized[T](T& field, Func`1 initializer)
   at System.Console.WriteLine(String value)
   at Microsoft.DotNet.Tools.Restore.RestoreCommand.<>c__DisplayClass1_0.<Run>b__2()
   at Microsoft.Dnx.Runtime.Common.CommandLine.CommandLineApplication.Execute(String[] args)
   at Microsoft.DotNet.Cli.Program.ProcessArgs(String[] args, ITelemetry telemetryClient)
   at Microsoft.DotNet.Cli.Program.Main(String[] args)
Restore failed
$ ./dotnet --version
1.0.0-rc2-002378

To repro

  1. Clone Roslyn
  2. Remove the ulimit setting in build/scripts/restore.sh
  3. Run make restore
  4. Replace the binaries in Binaries/toolset/roslyn.mac.5/dotnet-cli with the latest mac dotnet build.
  5. Delete Binaries/toolset/restore.semaphore and ~/.nuget
  6. Re-run make restore

@zhili1208
Contributor

@agocke yes, this is still happening. The issue comes from HttpClient; the dotnet core team will provide us an API to limit the number of open files, but not yet, so we are still waiting for them. If this blocks you, please use ulimit for now.
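
For anyone applying the ulimit workaround in a build script, a rough sketch follows. It assumes a POSIX shell; the function name `ensure_nofile_limit` and the target value in the usage line are illustrative assumptions, not something taken from the Roslyn scripts.

```shell
# Raise the soft open-file limit of the current shell to at least $1.
# Never lowers an existing limit and leaves "unlimited" alone.
ensure_nofile_limit() {
  wanted=$1
  current=$(ulimit -n)
  if [ "$current" = "unlimited" ]; then
    return 0
  fi
  if [ "$current" -lt "$wanted" ]; then
    # -S changes only the soft limit, so the hard limit is not touched.
    ulimit -S -n "$wanted" 2>/dev/null || {
      echo "could not raise open-file limit to $wanted (hard limit: $(ulimit -H -n))" >&2
      return 1
    }
  fi
}
```

A script could then call something like `ensure_nofile_limit 1024 && dotnet restore`. The limit only applies to that shell and the processes it starts.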

@jaredpar

Why is this issue still in the closed state? It's 100% reproducible in Roslyn and our build is completely broken unless we resort to ulimit hackiness to unblock us.

the issue comes from httpclient, dotnet core team will provide us an API to limit the open file number, but not now, so we are still waiting for them

That's fine but still seems like there is work to do on your end. Shouldn't this bug be tracking the work?

@zhili1208
Contributor

@zhili1208 zhili1208 reopened this Apr 18, 2016
@joelverhagen
Member

Did some digging, related: https://github.com/dotnet/corefx/issues/7932

Still investigating.

@agocke

agocke commented Apr 22, 2016

Note that we've also started seeing this on Linux: dotnet/roslyn#10768

@joelverhagen
Member

@agocke, we are going to pull in a fix for dotnet/corefx#7932 today, which addresses this issue. After that point, this problem should go away.

@agocke

agocke commented Apr 22, 2016

@joelverhagen Thanks! Can you try the Roslyn repro I posted above or let me know when a build is available so I can try it myself?

@joelverhagen
Member

The latest dotnet CLI has been fixed to address this problem. Basically, look for a version of System.Net.Http with a build number of 24022 or greater. This was bumped in the CLI with dotnet/cli@e420515 and in NuGet with NuGet/NuGet.Client@b3fecd4.

I just tried restoring Roslyn latest with dotnet CLI latest on OS X and had no issue. My dotnet CLI version is 1.0.0-rc2-002487.
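
The build-number check described above could be scripted roughly like this. It is a sketch that assumes the build number is the last dash-separated segment of the version string; the function name `has_fixed_http_build` and the version string in the usage line are hypothetical.

```shell
# Succeeds when the trailing build number of a version string is >= 24022,
# the threshold mentioned in the comment above.
# Returns 2 when the string has no purely numeric trailing segment.
has_fixed_http_build() {
  build=${1##*-}               # text after the last '-'
  case "$build" in
    ''|*[!0-9]*) return 2 ;;   # not a number, cannot decide
  esac
  [ "$build" -ge 24022 ]
}
```

For example, `has_fixed_http_build "4.0.1-rc2-24022" && echo "has the fix"`.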

@zhili1208 zhili1208 self-assigned this Feb 15, 2017
@zhili1208
Contributor

zhili1208 commented Feb 15, 2017

@starquake I can't repro this with dotnet 1.0.0-rc4-004823. Could you try the latest dotnet CLI and also share your nuget.config file from ~/.nuget/NuGet? Sometimes this is related to the number of package sources.

@starquake

@zhili1208 I will share the nuget config when I'm at work tomorrow. Tried it on another system and it executed without any problems.

If you want to close this ticket because it's not reproducible that would be fine too.

@rrelyea rrelyea modified the milestones: Future-1, 4.0 RTM Feb 15, 2017
@rrelyea
Contributor

rrelyea commented Feb 15, 2017

Moving out until we have a dependable repro to consider.

@starquake

starquake commented Feb 16, 2017

Is there a dotnet cli that is newer than 1.0.0-rc4? Where can I get it?

This is in my nuget.config:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" />
  </packageSources>
</configuration>
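
Since the number of package sources was raised as a possible factor, here is a small sketch for counting them. It assumes a POSIX shell with awk and a config laid out one element per line, as in the file above; the function name `count_package_sources` is made up. Commented-out entries would be counted too.

```shell
# Read a nuget.config on stdin and print how many <add .../> entries
# appear between <packageSources> and </packageSources>.
count_package_sources() {
  awk '
    /<packageSources>/   { inside = 1 }
    /<\/packageSources>/ { inside = 0 }
    inside && /<add /    { n++ }
    END { print n + 0 }
  '
}
```

Usage would be along the lines of `count_package_sources < ~/.nuget/NuGet/NuGet.Config`, which for the config above prints 1.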

@mrward
Member

mrward commented Feb 16, 2017

@starquake - More recent rc4 dotnet cli builds are available from https://github.com/dotnet/cli

@starquake

Installed 1.0.0-rc4-004834 and it still occurs.

I did some troubleshooting.

ulimit -n tells me 256 is the default
If I set it to 266 the error appears later.
If I set it to 276 the error appears even later.
If I set it to 286 there is no error anymore.

So it looks like I'm just over the boundary.

When I look at my open files for the shell with lsof, it's not that much:

➜  ~ lsof -p 4496 | wc -l
      19
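
The trial-and-error with different limits described above could be automated along these lines. This is a sketch assuming a POSIX shell; the function name `first_passing_limit` is made up, and each attempt runs in a subshell so the calling shell keeps its own limit.

```shell
# Run a command under each candidate soft open-file limit and print the
# first limit under which it succeeds. Returns 1 if none of them work.
first_passing_limit() {
  cmd=$1
  shift
  for limit in "$@"; do
    # The subshell keeps the changed limit local to this attempt.
    if ( ulimit -S -n "$limit" && sh -c "$cmd" ) >/dev/null 2>&1; then
      echo "$limit"
      return 0
    fi
  done
  return 1
}
```

For instance, `first_passing_limit 'dotnet restore' 256 266 276 286`. One caveat: a restore that fails partway may still leave packages in the local cache, so later attempts could need fewer descriptors than a clean run; clearing the cache between attempts is left out of this sketch.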

@joelverhagen
Member

Could you run lsof to determine what kind of handles are most prevalent? We have seen issues in the past where HttpClient was not properly closing sockets. We also open a lot of file handles so perhaps there is an issue there.
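
One way to get that breakdown, sketched under the assumption that `lsof` prints its usual header row with TYPE as the fifth column. The function name `summarize_lsof_types` is made up for illustration.

```shell
# Read `lsof` output on stdin and print a count per TYPE column
# (for example REG, DIR, IPv4, IPv6, unix, PIPE), most frequent first.
summarize_lsof_types() {
  awk 'NR > 1 { count[$5]++ } END { for (t in count) print count[t], t }' |
    sort -rn
}
```

It would be run as `lsof -p <pid> | summarize_lsof_types`, where `<pid>` is the running dotnet process rather than the shell that launched it, since descriptors are counted per process.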

@starquake

@am11

am11 commented Mar 9, 2017

Same situation with SDK v1.0.0 and v1.0.1: https://travis-ci.org/am11/MaxMind-DB-Reader-dotnet/jobs/209511300#L974-L996

Based on this comment: https://github.com/dotnet/cli/issues/809#issuecomment-194407913, would it be possible to limit the max connections (or MaxDegreeOfParallelism) to eight?

@Meligy

Meligy commented Mar 13, 2017

I'm having this (as mentioned in https://github.com/dotnet/cli/issues/6014#issuecomment-286060904), even though when I run ulimit, I get unlimited. Is that expected? (as in, is unlimited actually a bad value?)

Thanks.

@am11

am11 commented Mar 13, 2017

@Meligy, ulimit -n shows the actual value. For TravisCI, we are setting it like this.

@Meligy

Meligy commented Mar 13, 2017

Aha. Makes total sense (for the record mine shows 256 for ulimit -n, but probably everybody here knows that already).

Thanks a lot.
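
To spell out the difference between the two commands: with no option, `ulimit` reports the file-size limit (the same as `ulimit -f`), which is often "unlimited", while the open-file limit needs `-n`. A small sketch, assuming a POSIX shell; the function name `show_limits` is made up.

```shell
# Print the limits that tend to get confused in this thread.
show_limits() {
  echo "file size  (ulimit, same as ulimit -f): $(ulimit -f)"
  echo "open files, soft (ulimit -S -n):        $(ulimit -S -n)"
  echo "open files, hard (ulimit -H -n):        $(ulimit -H -n)"
}
```

The soft value is the one a process actually runs into; it can be raised up to the hard value without extra privileges.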

@rrelyea
Contributor

rrelyea commented Aug 16, 2017

@emgarten - can you make sure we get the right "ClosedAs" label set here? won't fix? by design?

@rrelyea rrelyea modified the milestones: 4.4, Future-1 Aug 16, 2017
@zhili1208
Contributor

@rrelyea I remember there was a fix in template; they opened many nupkg files in memory but didn't close them. I think this issue was fixed by their fix.

@rrelyea
Contributor

rrelyea commented Aug 16, 2017

From @mlorbetske:
that issue was fixed as of the 1.0.4 CLI. Either updating to a later CLI build or raising the value of ulimit should fix it.

Given that, I will mark this as external and notrepro. If anybody still hits this despite that, please chime in.

@rrelyea rrelyea added the Resolution:External and Resolution:NotRepro labels and removed the ClosedAs:WontFix label Aug 16, 2017