Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement RemoteStore.addToStoreFromDump #4009

Closed
wants to merge 6 commits into from

Conversation

roberth
Copy link
Member

@roberth roberth commented Sep 13, 2020

This implements RemoteStore.addToStoreFromDump, allowing nix-daemon to serve as a proxy for another nix-daemon. A possible use case for this is distrusting a container that runs as an otherwise trusted uid. The implementation was fairly straightforward.

@roberth roberth changed the title Proxy store Implement RemoteStore.addToStoreFromDump Sep 13, 2020
@edolstra
Copy link
Member

Not sure we should add this. addToStoreFromDump() is really superfluous since we already have an addToStore() that takes a Source. And the RemoteStore implementation of the latter implements it using FramedSink / FramedSource to ensure that the connection is left in a known state if an exception occurs.

@roberth
Copy link
Member Author

roberth commented Sep 14, 2020

Not sure we should add this. addToStoreFromDump() is really superfluous since we already have an addToStore() that takes a Source.

This is the only method in the Store interface that can take any nar source and return it as a content-addressable store path without knowing the hash in advance, which is good to keep around for performance.

This is more efficient than virtual void addToStore(const ValidPathInfo & info, Source & narSource, RepairFlag repair = NoRepair, CheckSigsFlag checkSigs = CheckSigs), because in order to use that method we'd have to

  • drain the nar stream to compute the hash for the ValidPathInfo
  • restart and drain it again while sending it to the store with addToStore

Whereas with addToStoreFromDump we don't need to precompute the hash and therefore only need to produce the nar once.

And the RemoteStore implementation of the latter implements it using FramedSink / FramedSource to ensure that the connection is left in a known state if an exception occurs.

That seems quite important. I'll check if there's more commonality that can be factored out.

Copy link
Member

@Ericson2314 Ericson2314 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Baring potentially using FramedSink/FramedSource, looks good to me. Agreed we should not get rid of addToStoreFromDump because it's the only one that calculates the content address for you.

@Ericson2314
Copy link
Member

For similar reasons I made a addToStoreFromDump implementation for BinaryCacheStore in #3935.

@Ericson2314
Copy link
Member

Also, https://github.com/NixOS/nix/pull/3829/files too helps with chaining stores.

@roberth
Copy link
Member Author

roberth commented Sep 15, 2020

I've added the new woAddToStoreCANar, which is a framed and cleaned up version of wopAddToStore.

TODO: I've overlooked its last usage in StorePath RemoteStore::addToStore(const string & name, const Path & _srcPath, FileIngestionMethod method, HashType hashAlgo, PathFilter & filter, RepairFlag repair) which might be possible to implement in terms of fromdump.

@Ericson2314
Copy link
Member

If we are making a new version of wopAddToStore we might consider some other changes.

The other end is s bit cumbersome because the protocol always dumps a NAR, but the addToStoreFromDump expects the format to correspond to the file ingestion method. We could either plain dump the file for flat, or we could add a second FileIngestionMethod parameter to addToStoreFromDump to indicate what what the dump is, as opposed to what sort of content-address we should use for the remote-side-computed narinfo.

@roberth
Copy link
Member Author

roberth commented Sep 16, 2020

@Ericson2314 I've spent a long time reasoning the implications of using a plain dump and trying to implement it, but it doesn't seem to pay off. We'll introduce a lot of complexity in the form of compatibility logic around the daemon protocol and all it might ever save is a single Source wrapper that turns a NAR into a file (not too hard). That's an upper bound in the cost of leaving the new wopAddToStoreCANar as is. I'll save you the long rant, but it boils down to

  • NAR is a good common format that supports all store paths. This allows us to simplify the Store api further than has been done; a possibility we'd throw out by supporting multiple dump formats. (Specifically this is about virtual methods. The rest are essentially static utility functions despite their syntax)
  • I've had a thorough look at addToStoreSlow specifically and its problems are not caused by having a single dump format. It can be clarified by refactoring in various ways, including merging all FileIngestionMethod conditionals into one and using previously mentioned Source wrapper instead of its Sink counterpart.

I have to conclude that the refactoring isn't worth it. I suggest someone could do it if they really want to, and they can do it even after this PR. I don't think it's worth their time and effort though.

Copy link
Member Author

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ready for review now.
I've added some comments to clarify the changes.

Just in case, I've done additional testing by building a fresh package including a filterSource source and running nix-store --verify-path on the derivation closure and output closure, using a chained nix-daemon where the final daemon is Nix 2.4 before this change, the intermediate daemon is from this PR and the client is either from this PR or from before.

}
}
conn.withFramedSink([&](Sink & sink) {
copyNAR(source, sink);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored, behavior unchanged.

@@ -982,6 +931,83 @@ std::exception_ptr RemoteStore::Connection::processStderr(Sink * sink, Source *
return nullptr;
}


struct FramedSink : nix::BufferedSink
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved for reuse, unchanged.

};

void
ConnectionHandle::withFramedSink(std::function<void(Sink &sink)> fun) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted from void RemoteStore::addToStore(const ValidPathInfo & info, Source & source, ....., unchanged.

Comment on lines +65 to +71
StorePath addToStoreFromDump(Source & dump, const string & name,
FileIngestionMethod method = FileIngestionMethod::Recursive, HashType hashAlgo = htSHA256,
PathFilter & filter = defaultPathFilter, RepairFlag repair = NoRepair) override;
RepairFlag repair = NoRepair) override;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub may render this as a "change" but it's really a deletion and an addition.

Addition:

  • Providing a real implementation instead of throwing an exception realizes the goal of this PR, allowing daemons to be chained

Deletion:

  • addToStore(... Path ...) has a default implementation that delegates to addToStoreFromDump which is all we need

@@ -398,4 +400,51 @@ struct StreamToSourceAdapter : Source
};


/* Like SizedSource, but guarantees that the whole frame is consumed after
destruction. This ensures that the original stream is in a known state. */
struct FramedSource : Source
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved, unchanged.

}
logger->stopWork();

if (path /* always */) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test can be removed or replaced by assert(path).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed the scope to a lambda to allow returning path without the optional contraption.

@@ -50,6 +50,7 @@ typedef enum {
wopAddToStoreNar = 39,
wopQueryMissing = 40,
wopQueryDerivationOutputMap = 41,
wopAddToStoreCANar = 42, // since Nix 3.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be called wopAddToStoreFromDump?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured it's a sibling of wopAddToStoreNar, so I've added CA to distinguish it, being only capable of producing that kind of store path. Dump sounds a bit vague to me, whereas Nar is specific.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah wopAddToStoreFromDump sounds like it isn't always expected nar-formatted objects, which would be confusing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually I wonder if we shouldn't just reuse wopAddToStore (by checking the daemon protocol) since this fully replaces it.

@Ericson2314
Copy link
Member

@roberth

@Ericson2314 I've spent a long time reasoning the implications of using a plain dump and trying to implement it, but it doesn't seem to pay off.

That's fine, we've had trouble with it too. But remember the second alternative I proposed is fixing up addToStoreFromDump itself.

Your current deamon code is wrong because it doesn't yet jump through the hoops of using RetrieveRegularNARSink in the Flat case.

Could you either change the interface of addToStoreFromDump or use RetrieveRegularNARSink in the daemon code?

@Ericson2314
Copy link
Member

If you go with the RetrieveRegularNARSink route, perhaps my never-finished #3802 could be of some use?

@roberth
Copy link
Member Author

roberth commented Sep 16, 2020

Your current deamon code is wrong because it doesn't yet jump through the hoops of using RetrieveRegularNARSink in the Flat case.

I don't think that's the case (I don't think I've made a mistake with my testing..) because the protocol now uses the exact same format as addToStoreFromDump, which is always a NAR.

That's fine, we've had trouble with it too. But remember the second alternative I proposed is fixing up addToStoreFromDump itself.

The only downside of having NAR as the format for Flat is to do with hashing, but this code doesn't need to do anything with hashing. The only reason I mentioned hashing is because I was also looking at the impact of changing the protocol not to use NAR for Flat. That impact seems rather small.

It's quite nice having to deal with only one dump format. Only LocalStore doesn't have that luxury, but that's solved with RetrieveRegularNARSink.

I'd like to keep a reasonable scope for this PR and not overhaul LocalStore::addToStore*.

@roberth
Copy link
Member Author

roberth commented Sep 16, 2020

Ok, rebased and formatted, so should be mergeable.

@Ericson2314 Thanks for the example. I think I'm going to leave it as is because the new daemon implementation is already constant memory with FramedSource.

@Ericson2314
Copy link
Member

format as addToStoreFromDump, which is always a NAR.

This is not true. addToStoreFromDump expects the "dump" source to match the file ingestion method, unless someone changed it in the past week or two.

@Ericson2314
Copy link
Member

Ericson2314 commented Sep 16, 2020

Yes, that is still the case. See LocalStore::addToStoreFromDump and how it uses the method parameter. This is why wopAddToStore need both with RetrieveRegularNARSink at all. And remeber that while nar -> flat is constant space, flat -> nar isn't because the nar needs to know the length up front which the flat won't provide.

Maybe let's just get the preperatory change pulling out FrameSource landed first? Because that is a great thing we can get out of the way, but I'm afraid what I just wrote will unfortunately mean a bunch more time needs to be spent on this.

Copy link
Member Author

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ericson2314 Thanks for pointing out a problem I wasn't going to find with testing.

I'll rename it to wopAddToStoreCA because it's not always a NAR.

I didn't assume depending on the server/client version was an option because it's not how compatibility was handled earlier, but it seems quite sensible to just replace wopAddToStore.

@edolstra I propose that we simplify the handling of these dumps by not parsing in "intermediate" methods unless strictly necessary. The only reason we had to do so is to find the end of the NAR, which can now be done with FramedSource, which separates the end of the NAR from the end of the stream that transports the frames. The legacy store protocol code and nix-store --import can still perform the NAR parsing that's required for their correctness, but it doesn't have to be part of the core.

EDIT: I'll update the PR later today.

Comment on lines +413 to +414
FramedSource narSource(from);
return store->addToStoreFromDump(narSource, baseName, method, hashAlgo);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't NAR-specific so it already works for either method.


{
FramedSink sink(conn, ex);
conn.withFramedSink([&](Sink & sink) {
copyNAR(source, sink);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated but refactored.

<< printHashType(hashAlgo);

conn.withFramedSink([&](Sink & sink) {
copyNAR(narSource, sink);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should fail on Flat but didn't in my testing because addToStoreFromDump invocations are rare and most if not all current use cases for adding Flat store paths build a ValidPathInfo first and use wopAddToStoreNar.

@roberth roberth mentioned this pull request Sep 17, 2020
@Ericson2314
Copy link
Member

@roberth Yeah it's very annoying it's not tested, but I think nix add-to-store --store ... can reproduce it. We should get a test case with that.

@roberth
Copy link
Member Author

roberth commented Sep 17, 2020

@Ericson2314 I've used builtins.path { recursive = false; } to test it. Also I've reimplemented things, so have a look at #4030.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants