Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disperser auth #984

Open
wants to merge 58 commits into
base: master
Choose a base branch
from
Open

Conversation

cody-littley
Copy link
Contributor

Why are these changes needed?

Authenticate StoreChunks() requests.

Checks

  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • I've checked the new test coverage and the coverage percentage didn't drop.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
@cody-littley cody-littley self-assigned this Dec 11, 2024
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
@cody-littley cody-littley marked this pull request as ready for review December 19, 2024 19:07
Copy link
Contributor

@ian-shim ian-shim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, few comments

api/clients/v2/request_signer.go Outdated Show resolved Hide resolved
api/clients/v2/node_client.go Outdated Show resolved Hide resolved
uint32 disperserID = 2;

// Signature using the disperser's ECDSA key over keccak hash of the batch. The purpose of this signature
// is to prevent hooligans from tricking DA nodes into storing data that they shouldn't be storing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤣

disperser/cmd/controller/config.go Show resolved Hide resolved
node/auth/authenticator.go Outdated Show resolved Hide resolved
core/eth/reader.go Show resolved Hide resolved
return nil
}

key, err := a.getDisperserKey(ctx, now, request.DisperserID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would result in an extra ETH call for any new DisperserID it sees.
Since we're already hardcoding the only allowed disperser, does it make sense if we consider a request invalid if DisperserID is unknown (not EigenLabsDisperserID) in isAuthenticationStillValid?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

	var defaultAddress gethcommon.Address
	if err != nil {
		return defaultAddress, fmt.Errorf("failed to get disperser address: %w", err)
	}
	if address == defaultAddress {
		return defaultAddress, fmt.Errorf("disperser with id %d not found", disperserID)
	}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this reply is for another comment in reader.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, you're correct @ian-shim, wires got crossed here. 🙃

The response I intended to type here was that the code does cache this value inside this cache: keyCache *lru.Cache[uint32, *keyWithTimeout].

nodeClientManager, err := controller.NewNodeClientManager(config.NodeClientCacheSize, logger)

var requestSigner clients.RequestSigner
if !config.DisperserStoreChunksSigningDisabled {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the store chunks signing is disabled, we should probably throw a warning on the start up logs.

Copy link
Contributor Author

@cody-littley cody-littley Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

	var requestSigner clients.DispersalRequestSigner
	if config.DisperserStoreChunksSigningDisabled {
		logger.Warn("StoreChunks() signing is disabled")
	} else {
		requestSigner, err = clients.NewDispersalRequestSigner(
			context.Background(),
			config.AwsClientConfig.Region,
			config.AwsClientConfig.EndpointURL,
			config.DisperserKMSKeyID)
		if err != nil {
			return fmt.Errorf("failed to create request signer: %v", err)
		}
	}


// authenticationTimeoutDuration is the duration for which an auth is valid.
// If this is zero, then auth saving is disabled, and each request will be authenticated independently.
authenticationTimeoutDuration time.Duration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep this value low for the beginning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've set this to be one minute by default. Is that sufficient, or should we lower it further?

Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
config *NodeClientConfig
initOnce sync.Once
conn *grpc.ClientConn
requestSigner DispersalRequestSigner
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A slightly different structuring of it: shall we pass in an interface, not a concrete implementation?
Like we have RequestSigners interface (and passed around), but with a KMS based implementation. This leaves space for potential non-KMS (i.e. non AWS specific) options (in a decentralized scenario it may be needed).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused about this comment. DispersalRequestSigner is already an interface. Is the interface ok in its current form?

// DispersalRequestSigner encapsulates the logic for signing GetChunks requests.
type DispersalRequestSigner interface {
	// SignStoreChunksRequest signs a StoreChunksRequest. Does not modify the request
	// (i.e. it does not insert the signature).
	SignStoreChunksRequest(ctx context.Context, request *grpc.StoreChunksRequest) ([]byte, error)
}

&bind.CallOpts{
Context: ctx,
},
disperserID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we settle on either ID or Key? Using both is unnecessary and adds a little bit confusion.
(if this is meant to be an int, ID seems fit)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with calling this an "ID". These are serial numbers. Eventually when we allow the community to register dispersers, we will allocate these in monotonic order via a smart contract.

Where do you see the terminology "key" being used in this context?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think I see what you are getting at. There is a method in the chain reader that references the disperser key:

getDisperserKey(
	ctx context.Context,
	now time.Time,
	disperserID uint32) (*gethcommon.Address, error)

The disperser key and the disperser ID are actually distinct things.

  • Disperser ID: a unique identifier for the disperser, a 32 bit serial number
  • Disperser (public) key: the public key used to verify StoreChunks() requests from the disperser, is an eth address

The reason why the disperser's public key is not a good identifier for the disperser is that we want the capability to re-key the disperser (e.g. if we lose the key, it gets compromised, or we just want to rotate it).

@@ -0,0 +1,62 @@
package clients
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can update the file name as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, forgot to rename the file. 😅 Fixed.

return nil
}

key, err := a.getDisperserKey(ctx, now, request.DisperserID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this reply is for another comment in reader.

// chainReader is used to read the chain state.
chainReader core.Reader

// keyCache is used to cache the public keys of dispersers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what public key has only 32 bits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The uint32 is the relay ID. These are serial numbers, not eth keys. Currently we have exactly one disperser, and its ID is hard coded to be uint32(0).

return fmt.Errorf("failed to verify request: %w", err)
}

a.saveAuthenticationResult(now, origin)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: cacheAuthenticationResult, "save" feels like it's persisting the data which it's not

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed

}
}

address, err := a.chainReader.GetDisperserAddress(ctx, disperserID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd cache this ID to address mapping to save RPCs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cached (see keyCache *lru.Cache[uint32, *keyWithTimeout]). By design, we re-fetch the same key again after a timeout just in case the key was changed. But not for each RPC.

@@ -238,6 +238,33 @@ var (
EnvVar: common.PrefixEnvVar(EnvVarPrefix, "CHUNK_DOWNLOAD_TIMEOUT"),
Value: 20 * time.Second,
}
DisableDispersalAuthenticationFlag = cli.BoolFlag{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why disabling is an option? shouldn't it be non optional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's enabled by default. Necessary to get things working in inabox. Inabox integration is kind of complex, so I'd like to merge this code with it disabled in the e2e test, and to enable it in a follow up PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file might be better fit in api/clients

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved all GRPC hashing code to a new package api/hashing. There is now no longer a dependency (outside of test code) between the client and this file.

@@ -3,6 +3,7 @@ package clients
import (
"context"
"fmt"
"github.com/Layr-Labs/eigenda/node/auth"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem to me this is a good dependency direction for api/ to depend on node/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this dependency is no longer present

Signed-off-by: Cody Littley <[email protected]>
Signed-off-by: Cody Littley <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants