Add builtin to output to file #3153

Closed
myaaaaaaaaa opened this issue Jul 20, 2024 · 5 comments

Comments

@myaaaaaaaaa
Contributor

myaaaaaaaaa commented Jul 20, 2024

I would like to propose a new builtin, say output_file("filename"; "contents"), which copies its input to its output while saving its arguments as a file. It's similar in principle to the debug builtin.

If --sandbox is specified, it will simply output the filename to stderr, essentially acting as a dry run.

Having this builtin would make it less awkward to split JSON files. See below for some workarounds that are currently required:

Proposed semantics:

# sample script
to_entries[] | output_file(.key; .value) | .key

# stdin
{
	"a.txt": "string\nstring",
	"b/c.txt": "invalid",
	"d.json": {
		"e": 10
	}
}

# stdout
"a.txt"
"b/c.txt"
"d.json"

# stderr
b/c.txt: No such file or directory


# a.txt
string
string

# d.json
{"e":10}
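
For comparison, the behaviour sketched above can be approximated today with stock jq and a shell loop. This is only a rough sketch under stated assumptions: split_json is a made-up helper name, keys are assumed to contain no newline characters, and strings are written raw while other values are written as compact JSON, mirroring the sample output above.

```shell
#!/bin/sh
# Sketch: emulate the proposed output_file(.key; .value) with stock jq.
# split_json is a hypothetical helper name, not part of jq.
# Assumption: object keys contain no newline characters.
split_json() {
    src=$1
    jq -r 'keys_unsorted[]' "$src" | while IFS= read -r key; do
        # Strings are written raw, everything else as compact JSON.
        # A failed redirect (e.g. missing directory) is reported on stderr
        # by the shell and the loop keeps going, as in the proposal.
        jq -r --arg k "$key" \
            '.[$k] | if type == "string" then . else tojson end' \
            "$src" > "$key" || :
        # Echo the key, like the trailing `| .key` in the sample script.
        printf '%s\n' "$key"
    done
}

# Usage (input.json is a placeholder file name):
#   split_json input.json
```

Unlike the proposed builtin, this re-reads the input once per key, so it is only practical for small inputs.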
@wader
Member

wader commented Jul 31, 2024

If you're OK with using fq, I've used a tar hack a few times to output multiple files. Something like this:

Copy the tar code from https://github.com/wader/fq/wiki/snippets into tar.jq, then run:

$ fq -n -L . 'include "tar"; to_tar({filename: "a", data: "aaa"}, {filename: "b", data: "bbb"})' | tar tv
-rw-r--r--  0 user   group       3 Jan  1  1970 a
-rw-r--r--  0 user   group       3 Jan  1  1970 b

Maybe you could rewrite the tar code to work with standard jq, but since jq does not support raw binary output, you might be limited to ASCII-only file contents.

@itchyny
Contributor

itchyny commented Aug 1, 2024

A simple way of doing this is to output a shell script from jq. That's what @sh is for.

jq -r 'to_entries[] | @sh"echo \(.value|tostring) > \(.key)"' | sh
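
A slightly more defensive variant of the same idea is sketched below. It writes the generated script to a file so it can be reviewed before it is run, and it uses printf instead of echo, since echo may treat backslashes or a leading -n differently across shells. make_split_script is a made-up helper name, and the file names in the usage comment are placeholders.

```shell
#!/bin/sh
# Sketch: generate the script first so it can be reviewed, then run it.
# make_split_script is a hypothetical helper name, not part of jq.
make_split_script() {
    # Every \( ... ) interpolation inside an @sh string is shell-quoted,
    # including the printf format string itself.
    jq -r 'to_entries[]
        | @sh "printf \("%s\\n") \(.value | tostring) > \(.key)"' "$1"
}

# Usage (input.json and split.sh are placeholders):
#   make_split_script input.json > split.sh   # inspect split.sh first
#   sh split.sh
```

As with the one-liner, nested paths such as b/c.txt still fail unless the directory already exists.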

@myaaaaaaaaa
Contributor Author

myaaaaaaaaa commented Aug 5, 2024

> A simple way of doing this is to output a shell script from jq. That's what @sh is for.

In general, I also prefer outputting shell scripts over something like #3133 (although I didn't know about @sh for escaping - thanks for that!)

However, there have been several times when I've had to split the output into multiple shell scripts, and the resulting doubly-escaped script was a headache to review, which is what inspired this proposal.

Presumably, there are many other places where having this as a builtin would make for a nice quality-of-life improvement.

@wader
Member

wader commented Aug 6, 2024

Agreed that more I/O features would be nice. In my view the biggest issue is how to make it all fit together nicely. For example, #1843 includes file handle support that would make some of this possible to implement as builtins, I think. Then there is the question of good names and APIs: input/1 to read a file as JSON or as a string (and how to specify which)? output/1 to write? tee/1 to write and pass through? Things like that.

Maybe a way forward could be to flesh out what these APIs could look like and how a user would use them, and then see what subset could be implemented without major changes? That way we could minimize the risk of adding something that turns out to be incompatible or awkward to combine with future fancier I/O, coeval, etc. support.

@myaaaaaaaaa
Contributor Author

myaaaaaaaaa commented Aug 6, 2024

> Agreed that more I/O features would be nice. In my view the biggest issue is how to make it all fit together nicely. For example, #1843 includes file handle support that would make some of this possible to implement as builtins, I think.

For IO, I would advocate for having very few individually tailored high-level primitives, rather than many low-level building blocks like in that PR.

Because jq is a functional language, interacting with the outside world is a much more advanced feature than usual (see footnote 1), and can end up being surprisingly asymmetrical (see below).

I'm even willing to be convinced that IO doesn't belong in jq at all (hence this proposal being opened as an issue rather than a PR).


> input/1 to read a file as JSON or as a string (and how to specify which)?

I would actually advocate for something like an --input-var option instead, which reads all files into an $input variable containing a filename-to-contents map (essentially a more generalized form of --slurpfile and --rawfile)

Usage would be something like:

jq --input-var '$input | .["a.json"]' *.json
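
Until something like that exists, a similar filename-to-contents map can be built with features jq already has. The sketch below is an approximation, not the proposed option: input_map is a made-up helper name, every file is assumed to contain JSON, and if a file holds several JSON values only the last one is kept. It relies on the existing input_filename builtin.

```shell
#!/bin/sh
# Sketch: approximate the proposed --input-var with existing jq features.
# input_map is a hypothetical helper name, not part of jq.
# Assumption: each file contains JSON (the last value wins if there are several).
input_map() {
    # -n stops jq from consuming the first input itself; `inputs` then
    # yields every value, and input_filename names the file it came from.
    jq -n 'reduce inputs as $doc ({}; .[input_filename] = $doc)' "$@"
}

# Usage:
#   input_map *.json | jq '.["a.json"]'
```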

> Maybe a way forward could be to flesh out what these APIs could look like and how a user would use them, and then see what subset could be implemented without major changes? That way we could minimize the risk of adding something that turns out to be incompatible or awkward to combine with future fancier I/O, coeval, etc. support.

Another way to manage this risk could be to prefix experimental APIs (for example, this could be named _exp_output_file), and print warnings that the functionality is subject to change.

Footnotes

  1. For another example of typically standard functionality being treated as an advanced feature, note that jq officially considers variables an advanced feature

myaaaaaaaaa closed this as not planned Sep 19, 2024

3 participants