Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process.fromRawCommand combined with redirectOutput hangs without success. #2401

Closed
sergey-bulavskiy opened this issue Oct 3, 2019 · 19 comments
Labels
bug needs-repro This issue needs a way to reproduce in order to fix it

Comments

@sergey-bulavskiy
Copy link

sergey-bulavskiy commented Oct 3, 2019

Description

Goal, that i'm trying to achieve is to parse npm audit output and do some stuff with it. Unfortunately Npm.exec "audit" just fails with exception without output, so i've switched to CreateProcess.fromRawCommand approach.

Repro steps

I have this code:

CreateProcess.fromRawCommand npmPath ["audit"; "--no-color"]
    |> CreateProcess.withTimeout (TimeSpan.FromMinutes 1.0)
    |> CreateProcess.withWorkingDirectory npmParams.WorkingDirectory
    |> CreateProcess.redirectOutput
    |> CreateProcess.map parseOutput
    |> Proc.run

Expected behavior

On pc's of some of my colleagues it works fine and everything is finished without any problems.

Actual behavior

On my same windows pc it hangs because of redirectOutput (tried to remove it - after that code works fine.)
In case of redirectOutput presented, execution log looks like this:

C:\source\(Path to my project)> "C:\Program Files\nodejs\npm.CMD" audit --no-color (In: false, Out: true, Err: true)
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion

Related information

  • Operating system: Windows 10
  • Npm --version - 6.11.3, (also tried on 6.10.3, same problem)
  • Ran on .NET Core 2.2, .NET Core 3.0, doesn't change anything.
  • Indications of severity
  • Version of FAKE: fake --version FAKE 5 - F# Make (5.16.1) (this line is written to standard error, see https://github.com/fsharp/FAKE/issues/2066) FakePath: C:\Users\someUser\.dotnet\tools\.store\fake-cli\5.16.1\fake-cli\5.16.1\tools\netcoreapp2.1\any\Fake.Runtime.dll Paket.Core: 5.216.0
@matthid
Copy link
Member

matthid commented Oct 3, 2019

Apparently something doesn't work as it should. Best case would be to have a standalone way to reproduce in order to debug this. As you already noticed that it works on another machine I wouldn't be surprised if it works on my machine as well.
So we can either:

  • Try to find a failing scenario which works on my machine (so I can debug it)
  • You can try to debug into it and see why the redirect output task doesn't finish

The relevant code path is this: https://github.com/fsharp/FAKE/blob/921f1973c9f1f1b3d7ce14ef4cc4282b4e0b83ef/src/app/Fake.Core.Process/RawProc.fs#L184

@sergey-bulavskiy
Copy link
Author

@matthid yeah, i'll return to this issue a little bit later, when i'll have time. Thanks for insights, i'll try to debug it.

@matthid matthid added the bug label Oct 4, 2019
matthid added a commit that referenced this issue Oct 4, 2019
@matthid
Copy link
Member

matthid commented Oct 4, 2019

I tried to reproduce but was unable to do so. I added some environment variables so in the next release you can set the environment variable FAKE_DEBUG_PROCESS_HANG (and FAKE_ATTACH_DEBUGGER if you want to debug it) to get a crash and a mini-dump. Maybe that helps further analyzing this problem.

@sergey-bulavskiy
Copy link
Author

@matthid Thank you, will do it

@matthid
Copy link
Member

matthid commented Oct 8, 2019

@serjebulavsky You can use 5.16.2-alpha.1304
Basically you only need to update Fake.Core.Process

@matthid
Copy link
Member

matthid commented Oct 13, 2019

Any updates on this?

@sergey-bulavskiy
Copy link
Author

sergey-bulavskiy commented Oct 14, 2019

Tried with Core.Process 5.16.2-alpha.1304 and FAKE_DEBUG_PROCESS_HANG = true
Dump:

FailFast:
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion

   at System.Environment.FailFast(System.String, System.Exception)
   at [email protected](Microsoft.FSharp.Core.Unit)
   at Microsoft.FSharp.Control.AsyncPrimitives.CallThenInvoke[[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Microsoft.FSharp.Control.AsyncActivation`1<System.__Canon>, System.__Canon, Microsoft.FSharp.Core.FSharpFunc`2<System.__Canon,Microsoft.FSharp.Control.FSharpAsync`1<System.__Canon>>)
   at [email protected](Microsoft.FSharp.Control.AsyncActivation`1<Microsoft.FSharp.Core.Unit>)
   at [email protected](Microsoft.FSharp.Control.AsyncActivation`1<Microsoft.FSharp.Core.Unit>)
   at [email protected](Microsoft.FSharp.Control.AsyncActivation`1<Microsoft.FSharp.Core.Unit>)
   at [email protected](Microsoft.FSharp.Control.AsyncActivation`1<Microsoft.FSharp.Core.Unit>)
   at [email protected](Microsoft.FSharp.Control.AsyncActivation`1<Microsoft.FSharp.Core.Unit>)
   at [email protected](Microsoft.FSharp.Control.AsyncActivation`1<Boolean>)
   at Microsoft.FSharp.Control.Trampoline.Execute(Microsoft.FSharp.Core.FSharpFunc`2<Microsoft.FSharp.Core.Unit,Microsoft.FSharp.Control.AsyncReturn>)
   at <StartupCode$FSharp-Core>.$Async.once@1106[[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Microsoft.FSharp.Control.AsyncActivation`1<System.__Canon>, System.Threading.Thread, Microsoft.FSharp.Control.Latch, Microsoft.FSharp.Core.FSharpRef`1<Boolean>, Microsoft.FSharp.Core.FSharpRef`1<Microsoft.FSharp.Core.FSharpOption`1<Microsoft.FSharp.Core.FSharpFunc`2<Microsoft.FSharp.Core.Unit,Microsoft.FSharp.Control.AsyncReturn>>>, Microsoft.FSharp.Core.FSharpFunc`2<System.__Canon,Microsoft.FSharp.Control.AsyncReturn>, System.__Canon)
   at <StartupCode$FSharp-Core>.$Async+once@1106-1T[[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Invoke(Microsoft.FSharp.Core.FSharpFunc`2<System.__Canon,Microsoft.FSharp.Control.AsyncReturn>, System.__Canon)
   at Fake.Core.ProcessHelpers.AsyncExtensions.continuation@74[[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Microsoft.FSharp.Core.FSharpFunc`2<System.__Canon,Microsoft.FSharp.Core.Unit>, Microsoft.FSharp.Core.FSharpFunc`2<System.Exception,Microsoft.FSharp.Core.Unit>, System.Threading.Tasks.Task`1<System.__Canon>)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

@sergey-bulavskiy
Copy link
Author

sergey-bulavskiy commented Oct 14, 2019

Sorry, i didn't specified additional output in the first message in the issue, full output looks like this, if it can help somehow:

At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion

Could not kill process C:\Program Files\nodejs\npm.CMD  audit --no-color after timeout: System.InvalidOperationException: No process is associated with this object.
   at System.Diagnostics.Process.EnsureState(State state)
   at System.Diagnostics.Process.get_HasExited()
   at [email protected](Unit unitVar) in d:\a\1\s\src\app\Fake.Core.Process\CreateProcess.fs:line 569
   at Microsoft.FSharp.Control.AsyncPrimitives.CallThenInvoke[T,TResult](AsyncActivation`1 ctxt, TResult result1, FSharpFunc`2 part2) in E:\A\_work\130\s\src\fsharp\FSharp.Core\async.fs:line 398
   at Microsoft.FSharp.Control.Trampoline.Execute(FSharpFunc`2 firstAction) in E:\A\_work\130\s\src\fsharp\FSharp.Core\async.fs:line 109

So basically kill on timeout doesn't work because it cannot find this process.

And i don't have a possibility to debug it currently, sorry.

@matthid
Copy link
Member

matthid commented Oct 14, 2019

Thanks, is this the full output with latest including the environment variables I suggested above?

@sergey-bulavskiy
Copy link
Author

@matthid Yes, this is all

At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion
FailFast:
At least one redirection task did not finish:
ReadErrorTask: RanToCompletion, ReadOutputTask: WaitingForActivation, RedirectStdInTask: RanToCompletion

   at System.Environment.FailFast(System.String, System.Exception)
   at [email protected](Microsoft.FSharp.Core.Unit)
   at Microsoft.FSharp.Control.AsyncPrimitives.CallThenInvoke[[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Microsoft.FSharp.Control.AsyncActivation`1<System.__Canon>, System.__Canon, Microsoft.FSharp.Core.FSharpFunc`2<System.__Canon,Microsoft.FSharp.Control.FSharpAsync`1<System.__Canon>>)
   at [email protected](Microsoft.FSharp.Control.AsyncActivation`1<Microsoft.FSharp.Core.Unit>)
   at [email protected](Microsoft.FSharp.Control.AsyncActivation`1<Microsoft.FSharp.Core.Unit>)
   at [email protected](Microsoft.FSharp.Control.AsyncActivation`1<Microsoft.FSharp.Core.Unit>)
   at [email protected](Microsoft.FSharp.Control.AsyncActivation`1<Microsoft.FSharp.Core.Unit>)
   at [email protected](Microsoft.FSharp.Control.AsyncActivation`1<Microsoft.FSharp.Core.Unit>)
   at [email protected](Microsoft.FSharp.Control.AsyncActivation`1<Boolean>)
   at Microsoft.FSharp.Control.Trampoline.Execute(Microsoft.FSharp.Core.FSharpFunc`2<Microsoft.FSharp.Core.Unit,Microsoft.FSharp.Control.AsyncReturn>)
   at <StartupCode$FSharp-Core>.$Async.once@1106[[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Microsoft.FSharp.Control.AsyncActivation`1<System.__Canon>, System.Threading.Thread, Microsoft.FSharp.Control.Latch, Microsoft.FSharp.Core.FSharpRef`1<Boolean>, Microsoft.FSharp.Core.FSharpRef`1<Microsoft.FSharp.Core.FSharpOption`1<Microsoft.FSharp.Core.FSharpFunc`2<Microsoft.FSharp.Core.Unit,Microsoft.FSharp.Control.AsyncReturn>>>, Microsoft.FSharp.Core.FSharpFunc`2<System.__Canon,Microsoft.FSharp.Control.AsyncReturn>, System.__Canon)
   at <StartupCode$FSharp-Core>.$Async+once@1106-1T[[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Invoke(Microsoft.FSharp.Core.FSharpFunc`2<System.__Canon,Microsoft.FSharp.Control.AsyncReturn>, System.__Canon)
   at Fake.Core.ProcessHelpers.AsyncExtensions.continuation@74[[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Microsoft.FSharp.Core.FSharpFunc`2<System.__Canon,Microsoft.FSharp.Core.Unit>, Microsoft.FSharp.Core.FSharpFunc`2<System.Exception,Microsoft.FSharp.Core.Unit>, System.Threading.Tasks.Task`1<System.__Canon>)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

@matthid
Copy link
Member

matthid commented Oct 14, 2019

Ok, as you have no more time to debug, can you at least attach the debugger and somehow send me a full dump of the process (as I might be able to debug with it (if you can send me along the lockfile for the exact nuget versions you used)?
Basically, attach Visual Studio -> Create full dump -> (Zip it as it probably is big) -> Send it to me somehow, feel free to send via mail if this is something company specific (matthi.d at gmail dot com or root at matthid dot de).
We can try a minidump as well first but it will be another potential back and forth ;)

@sergey-bulavskiy
Copy link
Author

okay thanks, i will do it a little bit later :)

@sergey-bulavskiy
Copy link
Author

sergey-bulavskiy commented Oct 16, 2019

@matthid Hello, i've tried to reproduce it today to create a dump for you, but for some reason it stopped hanging. I haven't updated anything from yesterday (yesterdays evening it was reproducable), altough i have a small minidump from yesterday, i will send it to you.

@matthid
Copy link
Member

matthid commented Oct 16, 2019

@serjebulavsky Thanks for sending the dump, sadly I couldn't see any more information in the dump besides something obvious - that there is some other thread still hanging in "ReadFile" (to read the pipe of stdout of the process):

 	[Managed to Native Transition]	
>	System.Private.CoreLib.dll!System.IO.FileStream.ReadFileNative(Microsoft.Win32.SafeHandles.SafeFileHandle handle, System.Span<byte> bytes, System.Threading.NativeOverlapped* overlapped, out int errorCode) Line 1190	C#
 	System.Private.CoreLib.dll!System.IO.FileStream.ReadNative(System.Span<byte> buffer) Line 507	C#
 	System.Private.CoreLib.dll!System.IO.FileStream.ReadSpan(System.Span<byte> destination) Line 433	C#
 	System.Private.CoreLib.dll!System.IO.FileStream.Read(byte[] array, int offset, int count) Line 302	C#
 	System.Private.CoreLib.dll!System.IO.Stream.BeginReadInternal.AnonymousMethod__40_0(object <p0>) Line 307	C#
 	System.Private.CoreLib.dll!System.Threading.Tasks.Task<int>.InnerInvoke() Line 616	C#
 	System.Private.CoreLib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state) Line 167	C#
 	System.Private.CoreLib.dll!System.Threading.Tasks.Task.ExecuteWithThreadLocal(ref System.Threading.Tasks.Task currentTaskSlot) Line 2440	C#
 	System.Private.CoreLib.dll!System.Threading.ThreadPoolWorkQueue.Dispatch() Line 588	C#
 	[Native to Managed Transition]	

image

Not sure why it would hang there when the process is already closed. Not sure what to do when you yourself cannot even reproduce it anymore. Being able to reproduce it somehow is still preferred, it is not that unlikely that there is an edge case somewhere and I'm not seeing it. Starting processes in .net is surprisingly difficult to do correctly (which is the main reason for our own API).

@sergey-bulavskiy
Copy link
Author

sergey-bulavskiy commented Oct 17, 2019

@matthid Today it is reproducable again. I've tried to change number of vulnerable libraries in package.json (which leads to drastically decreased size of output), but no success, still hanging. Maybe you have other ideas or requests that i can try to do on my pc. Unfortunately right now (for a week at least) i just don't have enough time to deeply dive into it and debug it myself, sorry.

@matthid
Copy link
Member

matthid commented Oct 17, 2019

So your current workaround is to just "try again" I assume?
I'd still try to collect a full dump instead of the minidump (let me know if you need info hon how to do it), I hope it shows the running tasks in the parallel stacks window. Additionally, I can send you custom builds but that process will be painfully slow and include several back and forth.

@sergey-bulavskiy
Copy link
Author

@matthid no, i decided to work with built-in Npm helper audit command, without reading output, just the error code. For some reason npm helper doesn't provide output. It would be great if you will provide info on how to do it, thanks :)

@matthid
Copy link
Member

matthid commented Oct 20, 2019

I still have not been able to reproduce this in any scenario. To track this down further I need the following:

  • Make sure the process "hangs" (or the dialog to attach the debugger is open if you have set FAKE_ATTACH_DEBUGGER or FAKE_DEBUG_PROCESS_HANG)
  • install and open process explorer (or any other tool which allows to create full dumps)
  • make sure to find the correct process (the one with fake-cli.dll as first argument to dotnet or fake.exe depending on the installation):
    image
  • Create full dump: image
  • Send the dump (it might be quite large)

Let me know if you encounter difficulties.

Probably not useful and takes a long time

Additionally the following might help:

  • Attach visual Studio, either via the "Choose Just-In-Time Debugger" or for example via process explorer. Make sure to select "Manually choose the debugging engines.":
    image
    image

  • If not already hit "break all"

  • Open "Parallel Stacks" window, it probably looks like this:
    image

  • Install, open and setup DotPeek as symbol server (it is free)

  • Go to Tools->Options->Debugging->Symbols Add the symbol server url (for example http://localhost:33417 and press "Load all symbols" (see warning below):
    image

    This probably takes quite a while so be prepared to let it run for a while, you can see the progress in DotPeek (Windows -> Pdb Generation Status). Alternatively you can filter which symbols it tries to load before clicking "Load all symbols" (see below the button). Don't worry about yellow warnings in DotPeek as Visual Studio will try to load symbols for everything (not only .NET assemblies).

  • Go to Debugging->General and disable "Enable Just My Code" and enable "Enable .NET Framework source stepping"

  • Go to "Parallel Stacks" window -> Threads and copy all stacks, Go to "Parallel Stacks" window -> Tasks and copy all stacks and paste them here.

matthid added a commit to SteveGilham/FAKE that referenced this issue Oct 20, 2019
@matthid matthid added the needs-repro This issue needs a way to reproduce in order to fix it label Oct 20, 2019
@github-actions
Copy link
Contributor

There has not been any activity in this issue for the last 3 months so it will be closed in 14 days if there is no activity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug needs-repro This issue needs a way to reproduce in order to fix it
Projects
None yet
Development

No branches or pull requests

2 participants