Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make packages/io async #7

Merged
merged 4 commits into from
May 22, 2019
Merged

Make packages/io async #7

merged 4 commits into from
May 22, 2019

Conversation

jclem
Copy link
Contributor

@jclem jclem commented May 21, 2019

This pull request rewrites a bunch of logic in packages/io so that it's fully async for IO operations.

In the current iteration of packages/io, functions are returning promises, but are still internally calling the sync versions of fs modules.

@damccorm Let me know what you think. Another thing I wanted to point out was something w/async functions—when a function is declared to be async, it implicitly returns a promise—there's no need to create and then manually return one, rejecting and resolving, etc. Throwing inside of an async function will cause it to return a rejected promise (which will throw if that call is awaited).

jclem added 2 commits May 21, 2019 18:03
- `CopyOptions` on `cp` now defaults to empty
- Setting option values is easier now
@jclem jclem requested a review from damccorm May 21, 2019 22:49
@damccorm
Copy link
Contributor

In the current iteration of packages/io, functions are returning promises, but are still internally calling the sync versions of fs modules

Its unclear to me why this is a bad thing. It looks to me like you're basically replacing each fs.fooSync with await fs.promises.foo - doesn't this functionally do the same thing (aka block until the operation is complete)? When we await async functions don't they basically just become synchronous? Maybe I'm missing something here but I don't see the advantage

Everything else here makes sense to me

when a function is declared to be async, it implicitly returns a promise

Cool, thanks for the pointer

@jclem
Copy link
Contributor Author

jclem commented May 22, 2019

When we await async functions don't they basically just become synchronous?

The vertical flow of the code may make it look like that way, but they don't actually become synchronous. fs.statSync is a blocking IO operation, meaning that everything in the entire Node.js VM blocks while that operation is happening (since just wrapping a function in async doesn't mean it runs in a separate thread).

For example, in the following two snippets, the one using Sync calls would potentially run much slower than the one using async because each Sync call fully blocks the rest of the VM:

Generally slower (IO is not concurrent, one file read at a time):

aBunchOfFilePaths.map(fs.readFileSync)

Generally faster (IO is concurrent, many files read at a time):

await Promise.all(aBunchOfFilePaths.map(async path => {
  return await fs.promises.readFile(path)
}))

// Note that because async promises are implicit, that's the same as:
await Promise.all(aBunchOfFilePaths.map(fs.promises.readFile))

In our CI context, it may be, however, that there isn't going to be too much concurrent IO happening—this isn't a web server. That said, generally I think that Node.js users are used to asynchronous APIs for file system interaction. Further, providing truly async APIs reduces the cost of doing lots of file IO. I would also be concerned that putting an async API in front of fully synchronous code would have some adverse side effects—users may expect to be able to map quickly over our IO API and get concurrent IO benefits, but behind the scenes we'd actually just be doing 1 thing at a time.

@jclem jclem closed this May 22, 2019
@jclem jclem reopened this May 22, 2019
@jclem
Copy link
Contributor Author

jclem commented May 22, 2019

Whoops! Didn't mean to close.

@bryanmacfarlane
Copy link
Member

While it's true that it's blocking for that whole node VM, in practice it's not critical since folks are writing sequential script CI steps. Each step / action is run on the agent and each action step is run out of proc with a new instance of the node process.

However, technically, someone could invoke these methods concurrently and it would matter.

The use cases though was inspired by shelljs which aimed to write bash style scripts in javascript so the goal was synchronous and sequential.

As jclem said, it's not a web server here so it's not as critical. In a web server any sync code would be a huge non starter.

My 2 cents is if we're going after async/await style actions then it's consistent to have them all be awaitable and if they're awaitable, let's just make them async.

Make sense? Thoughts?

@bryanmacfarlane
Copy link
Member

I'm good to take these changes.

@damccorm
Copy link
Contributor

Ah cool, thanks for the explanation. I'm good with these changes then.

@damccorm damccorm merged commit 86228a8 into features/io May 22, 2019
@damccorm damccorm deleted the features/io-patch-jclem branch May 22, 2019 12:39
@damccorm damccorm mentioned this pull request May 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants