Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i/prompting/requestprompts: add package to manage outstanding request prompts #13981

Merged

Conversation

olivercalder
Copy link
Member

@olivercalder olivercalder commented May 20, 2024

This is part of ongoing work to support AppArmor Prompting. When a request is received from the listener, a prompt is created with information which is provided to the user so that they may decide whether to allow or deny the requested action. This package manages the prompt database and individual request prompts.

This task is tracked internally by SNAPDENG-8329

@olivercalder olivercalder added the Needs Samuele review Needs a review from Samuele before it can land label May 20, 2024
return false
}
}
return true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does AvailablePermissions not matter here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AvailablePermissions are directly populated based on the interface of the prompt with which the constraints are associated. If the interface for two prompts is the same, the AvailablePermissions in the constraints will be the same, and two constraints should never be checked for equality unless they are associated with the same interface. So I don't really think it's necessary or beneficial to check their equality.

interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
//
// This should be called when snapd is shutting down, to notify prompt clients
// that the given prompts are no longer awaiting a reply.
func (pdb *PromptDB) CleanUp() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add note to the comment clarifying that replies are not sent to the listener when the prompts are cleaned up -- the listener should be closed separately by the caller of CleanUp and it is in charge of sending denies for outstanding prompts.

Copy link
Collaborator

@pedronis pedronis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did a first pass

interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
notifyPrompt: notifyPrompt,
}
// Importantly, set maxIDPath before attempting to load max ID
pdb.maxIDPath = filepath.Join(dirs.GlobalRootDir, "/tmp/snapd-request-prompt-max-id")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the goal to start from zero after each reboot? something under /run/snapd might be more appropriate for that nowadays

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like /run/snapd/prompts/<stuff-here> looks appropriate. We don't explicitly set RuntimeDirectory in systemd, but rahter create one during startup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll put the max ID in /run/snapd/request-prompt-max-id. For rules (rather than prompts) we'll want to save some state more permanently across reboots (but still not in state.json), so I don't think it makes sense to have run/snapd/prompting/max-{prompt,rule}-id or something of the sort here.

interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
//
// Returns the IDs of any prompts which were fully satisfied by the given rule
// contents.
func (pdb *PromptDB) HandleNewRule(user uint32, snap string, iface string, constraints *prompting.Constraints, outcome prompting.OutcomeType) ([]string, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AddOrMerge and this could share a struct for some of the arguments here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User, snap, and interface are shared across almost all prompting-related things, I'll create a prompting.Metadata struct with these fields.

The remaining ones left out of the Metadata or Constraints structs would then be outcome, lifespan, duration. I think that's fine, we can revisit later if need be.

if !modified {
continue
}
pdb.notifyPrompt(user, id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this notify first and reply after while Reply does the reverse? if that wasn't the case maybe the two could also share some code ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there were two reasons for the differences:

  1. Here, we continue if some but not all of the permissions of the prompt are satisfied by the new rule, in which case we want to tell the clients to re-check the prompt details via the prompting API. We need to record the notice before the continue, so it can't be any closer to the sendReply call.
  2. In Reply, we're sending a user-given reply to the listener. In the off-chance that this results in an error (could be possible in the future if we pass more than a bool as the reply), we don't actually want to resolve the prompt, since the listener request itself has not yet been correctly resolved. So we return an error before removing the prompt of recording a notice. But we shouldn't encounter errors when replying based on a new rule, since we're theoretically in control of the reply format more. But really we should be fully vetting user replies before sending anything to the listener, so this is potentially a moot point.

Given these considerations, I think the best we could do is move the notifyPrompt call to just before the listener req reply loop in Reply, and assume there will be no errors. Or leave the errors return in place in the off-chance that it occurs -- there's no harm in re-recording a notice, telling the client to check again (in this case nothing will have changed, though maybe the client will send a better reply next time).

I'm a bit torn, what do you think about this?

Note to self: in the error case in (2), the listener request can't actually be replied to again, even though the first reply failed, so need to fix that in the listener. Possibilities are either sending an auto-deny on error or using a mutex instead of atomic.CompareAndSwap.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fundamentally, it doesn't matter when within the function the notice is recorded, since the lock is held any future API requests to get more info will block until we're done with this function anyway.

interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
// interface corresponding to the prompt.
type promptConstraints struct {
Path string `json:"path"`
Permissions []string `json:"permissions"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are those requested permissions? if so could the field be named eg RequestedPermissions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are the requested permissions. Iirc, these started out as "requested-permissions", but we moved to "permissions" instead for some reason. I'd be happy to go back to "requested-permissions" if @pedronis agrees.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could call the field RequestedPermission why keeping the agreed external JSON name just permissions, otoh maybe the fact that the overall thing is Constrains is clear enough?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the idea was to make PromptConstraints as similar as possible to Constraints, with the exception that we must use path instead of path-pattern. I think using the Permissions field throughout for both types is clearer, especially since the Permissions might get pruned over time as new rules satisfy some of the permissions, so Permissions is no longer matches the originally requested permissions. Also, AvailablePermissions is never used internally, it's just populated once as information for the client.

Furthermore, I'll soon be working on the ability to send back a bitmask of permissions corresponding to those provided in the user's reply, rather than just exactly what was asked for, and I think we'll need to record the original list of abstract permissions (not exported or included in the json) so that prompts which are partially satisfied by several new rules still send back a reply with all the requested permissions. So RequestedPermissions would be a bit ambiguous here whether it's referring to the originally requested permissions or the current remaining permissions.

interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
notifyPrompt: notifyPrompt,
}
// Importantly, set maxIDPath before attempting to load max ID
pdb.maxIDPath = filepath.Join(dirs.GlobalRootDir, "/tmp/snapd-request-prompt-max-id")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like /run/snapd/prompts/<stuff-here> looks appropriate. We don't explicitly set RuntimeDirectory in systemd, but rahter create one during startup.

//
// The caller must ensure that the given permissions are in the order in which
// they appear in the available permissions list for the given interface.
func (pdb *PromptDB) AddOrMerge(user uint32, snap string, iface string, path string, permissions []string, listenerReq *listener.Request) (*Prompt, bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry if it's a naiive question, but if one user (A) tries to access files of another user (B), is the user here equal to A or B? I suppose it's B, but is the process uid (A) then part of the listener request?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user is the "subject" making the request to access some object (a file, probably). So the user is A here. We don't care about filesystem permissions/ownership here, just apparmor permissions, which can vary depending on the owner of the file, but we let the apparmor profile make those distinctions, and if the request reaches snapd, we no longer care who owns the file.

// Search for an identical existing prompt, merge if found
for _, prompt := range userEntry.ByID {
if prompt.Snap == snap && prompt.Interface == iface && prompt.Constraints.equals(constraints) {
prompt.listenerReqs = append(prompt.listenerReqs, listenerReq)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we're keeping those in memory, it'd be nice to have some sort of an upper limit on how many requests can be kept

Copy link
Member Author

@olivercalder olivercalder Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. A few considerations:

  • What should the limit be (1000?), and should it be global or per-prompt?
    • Should the limit be the responsibility of the listener or the prompt DB?
  • Listener requests sit waiting in their own goroutine for a reply, so it's rather important that we don't keep too many open.
  • Do we want to auto-deny new listener requests rather than adding them to this list, or replace and deny old ones (I think I favour the latter here)?
  • Do we want to apply the same to prompts themselves, not just listener requests associated with a given prompt?
    • Do we want to preserve the oldest prompts and auto-deny new ones, or auto-deny old ones and preserve new ones (I think I favour the former here)?

What do you think of these @bboozzoo and @pedronis ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there some sort of limiting on the kernel side or are we at the mercy of the sandboxed process? AFAIU in a stable scenario there should be only a handful of prompts and requests, shouldn't there? Unless the sandboxed process is multi threaded or uses async IO it should be blocked inside a syscall awaiting a reply to a prompt, so I think I'd start with some modest limits and then see how it works in practice. Say, 10 requests and 100 prompts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for the record, perhaps we can address this in a followup PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no limit on the kernel side, a process could theoretically trigger as many syscalls as it can and each could result in an apparmor request.

If the current approach of starting a new goroutine to wait for a reply for each request is too heavy, we could instead actually store the received message in memory and use it later to construct and send a reply. This would also be of great benefit if we want to persist requests (and prompts) across snapd restarts. However, persisting these across snapd restarts comes with a load of complexities on both the kernel and snapd side, discussed a bit below. So if the overhead is not much different, a waiting goroutine seems much safer and simpler.


When the listener is created, it opens a file descriptor on the notify.socket and registers it to receive notifications in userspace with some filter (currently empty). Associated with that fd, the kernel creates a notification queue with the given filter. When a new event triggers a notification, the kernel goes through the queues and places the notification into the first queue whose filter matches the notification.

When a listener is closed, it closes the associated fd, and the kernel redistributes any un-read notifications from that queue to the first other queue which matches the filter for the given notification.

Once a message is read from the queue by its associated listener, the kernel will not redistribute it even if that listener closes its fd before it has replied. So it would require lots of work on the kernel side to distribute notifications to multiple queues (which introduces complexity and reply races) or reclaim un-replied-to notifications which have been read by a listener which has since closed its fd without replying (not sure if this is even possible).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the questions of limits to the number of outstanding prompts (and/or listeners) is a UX question as well, so I'll wait to address it until we've discussed with Desktop and Design people about this point. Persisting across restarts is another related topic we'll discuss.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thing given that we do a linear scan we should cap at something that is unreasonable to keep scanning this way. Lower thousands perhaps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll set the outstanding prompts per user limit to 1000, and not set a limit on number of listener requests per prompt. I think we want to deal with that in the listener itself, since optimizations there could make storing listener requests extremely lightweight. And users are unlikely to be able to answer >1000 prompts, but snapd can easily responds to a million listener requests automatically when their associated prompt is satisfied.

interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
Comment on lines 284 to 384
allow, err := outcome.IsAllow()
if err != nil {
return nil, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks a bit awkward now. Anyhow, the check could be done before taking any lock

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it seems like it would return an error if it's not "allow" (e.g. if it's "deny"). This used to be called AsBool, but we renamed it when reviewing #13849 (#13849 (comment)). I'm happy to rename it if you and @pedronis agree.

For now, I'll at least move it up to before the lock, nice catch!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've re-renamed to AsBool for now since I think IsAllow suggests it must be outcome == "allow" else error.

Copy link
Contributor

@MiguelPires MiguelPires left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm once the other reviews are addressed

interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
// Search for an identical existing prompt, merge if found
for _, prompt := range userEntry.ByID {
if prompt.Snap == snap && prompt.Interface == iface && prompt.Constraints.equals(constraints) {
prompt.listenerReqs = append(prompt.listenerReqs, listenerReq)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there some sort of limiting on the kernel side or are we at the mercy of the sandboxed process? AFAIU in a stable scenario there should be only a handful of prompts and requests, shouldn't there? Unless the sandboxed process is multi threaded or uses async IO it should be blocked inside a syscall awaiting a reply to a prompt, so I think I'd start with some modest limits and then see how it works in practice. Say, 10 requests and 100 prompts?


// loadMaxID reads the previous ID from the file at maxIDPath and sets maxID.
//
// If no file exists at maxIDPath, sets maxID to be 0. If another error occurs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wold there be reasons to start with a randomized value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's not a good reason.

  • We can't use notification ID (from the kernel) since this is an opaque value which we can't assume to be unique
  • There might be many prompts, so if we're using a monotonically-increasing ID we don't want to start at too high a number where we risk overflowing
  • Always starting at 0 makes for easier integration testing, so we can very programmatically do some actions which we believe should trigger some known number of notices with particular IDs in a particular order

Copy link
Contributor

@zyga zyga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick pass through non-test code.

Should we avoid the perils of state and use an RWLock up front grabbing shared read lock or exclusive write lock to improve performance of what is otherwise entirely sequential code now?

The scheme of allocating IDs is super slow and I wonder if we could make it faster by holding the ID in memory that is mapped. The kernel will guarantee it is written back to storage even if we die.

interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
defer f.Close()

idBuf := [16]byte{}
_, err = io.ReadFull(f, idBuf[:])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come we read full 16 bytes when the number is in text and not binary? Will this not err when we have less to read?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number is ASCII hex padded by 0s, so "0000000000000000" to "FFFFFFFFFFFFFFFF". But I prefer your suggestion of directly writing an int to a memory-mapped file.

interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
return nil, err
}
}
delete(userEntry.ByID, id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we ever reallocate and garbage collect the map so that it does not hold a lot of memory from the past?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can try to make this better. If we store sub-maps by snap and interface, we could provide more opportunities for setting maps to nil (allowing GC). But if loads of prompts are coming from the snap using the same interface, it doesn't help much.

That suggestion would help with speed of looking for matching prompts, though.

interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
@olivercalder
Copy link
Member Author

olivercalder commented Jun 18, 2024

I explored letting permissions be bitflags which are only converted to []string when marshalled, as this would eliminate problems of ordering and make checking permission equality, containment, subtraction, etc. much faster/simpler. But then marshalling the bitflags into permissions requires knowing the interface, and we'd need a different custom marshaller for PromptConstraints vs Constraints, and both of these still require knowing the interface, which is, again, not known by constraints.

Even in top-level structs containing constraints, the interface isn't always known: https://docs.google.com/document/d/1tBnefdukP69EUJOlH8bgD2hrvZCYoE8-1ZlqRRYlOqc/edit?pli=1#heading=h.ls8izcpzf5fb

From the spec, the request body for posting to /v2/interfaces/requests/prompts/{id}:

{
	"action": "allow" | "deny",
	"lifespan": "single" | "session" | "forever" | "timespan",
	"duration": <duration for which decision applies, if lifespan == "timespan">,
	"constraints": {
		"path-pattern": <path glob for which decision applies>,
		"permissions": [<perm>, <perm>, ...]
	}
}

So at the moment I do not see how this would be viable. It would be nice, though.

@olivercalder
Copy link
Member Author

This branch contains the exploration of using bitflags for permissions: https://github.com/olivercalder/snapd/tree/attempt-permissions-bitflags

@olivercalder olivercalder requested a review from zyga June 19, 2024 03:40
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
listenerReqs: []*listener.Request{listenerReq},
}
userEntry.ByID[id] = prompt
pdb.notifyPrompt(metadata.User, id, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only remaining issue I have now is that the notify code is called with the lock held and we can't foresee if the callee won't block for any reason. Since we already have the user ID and prompt ID, this call could be done after we release the db lock?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really prefer to record the notice while the requestprompts lock is held, since this guarantees the order in which the notices are issued. This lets us rely nicely on the fact that the notice when a prompt is resolved will always be the final notice for that prompt, and thus the notice data indicating the resolved reason will be preserved. We don't need to guarantee this, but it would be much nicer for clients.

Discussed more here: https://warthogs.atlassian.net/browse/SNAPDENG-15165

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure it should block landing this but we have the issues that emitting notices atm always require the state lock so we will hold this and then need to get the state lock to emit the notice. Maybe there's a way out once we undetstand more what should happens on snapd restarts. We already know that it doesn't make sense to keep prompt notices across reboots

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I'm not sure what we can do to improve this at the moment. Perhaps in an ideal world notices would be decoupled from the state lock entirely, so they can relay information while the lock is held. And so that recording notices doesn't lock up the state all the time.

Perhaps something can be optimized so that the notifyPrompt closure kicks off the notice recording, guaranteeing order, but doesn't wait for the state lock to be acquired and released.

In any case, I think we want to call notifyPrompt with the prompt DB lock held, and it's up to the closure implementation to guarantee that it doesn't block.

@olivercalder
Copy link
Member Author

olivercalder commented Jul 8, 2024

Rebased to fix conflicts with usage of new PathPattern type in constraints.

Copy link
Collaborator

@pedronis pedronis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code looks good, some considerations

// interface corresponding to the prompt.
type promptConstraints struct {
Path string `json:"path"`
Permissions []string `json:"permissions"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could call the field RequestedPermission why keeping the agreed external JSON name just permissions, otoh maybe the fact that the overall thing is Constrains is clear enough?

listenerReqs: []*listener.Request{listenerReq},
}
userEntry.ByID[id] = prompt
pdb.notifyPrompt(metadata.User, id, nil)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure it should block landing this but we have the issues that emitting notices atm always require the state lock so we will hold this and then need to get the state lock to emit the notice. Maybe there's a way out once we undetstand more what should happens on snapd restarts. We already know that it doesn't make sense to keep prompt notices across reboots

Copy link
Contributor

@zyga zyga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks very good.

My main points of feedback are:

  • Use ints rather than strings internally and only get to strings when we are closer to external APIs. This will speed things up, use less memory and puts smaller pressure on the garbage collector
  • Use a few types to make the semantics of a few integers more clear
  • Close should Unmap even though we may eventually GC, this is still a file-like resource on top of the extra memory that is probably unsafe to just wait for
  • It'd be nice if we could stop holding the big lock when we transition to notice APIs but I have no insight as to how to do that yet.

interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
interfaces/prompting/requestprompts/requestprompts.go Outdated Show resolved Hide resolved
// rule.
type Metadata struct {
// User is the UID of the subject (user) triggering the applicable requests.
User uint32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick on the type, let's decide if it's an int or an uint32 and add a type definition for it. The code will read cleaner and we will have fewer places to fix in case we got the guess wrong.

interfaces/prompting/requestprompts/requestprompts_test.go Outdated Show resolved Hide resolved
Previously, if a new prompt with the same ID as another prompt which has
since been resolved happened to be created before a particular prompt
client replied to it, that prompt client might attempt to reply to (what
it believed to be) the original prompt, resulting in unexpected behavior
or errors.

Since prompt IDs are monotonically-increasing integers, we can write the
maximum prompt ID to disk and re-read it whenever snapd starts up. This
ensures that prompt IDs are unique across snapd restarts.

We use a file in `/tmp` to store the maximum ID. Since prompt clients
are restarted when the host system reboots, there is no need to persist
prompt ID uniqueness across system reboots.

Signed-off-by: Oliver Calder <[email protected]>
The new CleanUp function records a notice for every outstanding prompt,
and then deletes all prompts by clearing the per-user prompt DB. This
should be called when snapd is shutting down or restarting.

Note: this does not send a reply to the associated listener requests.
That is left as the responsibility of the listener itself, and the
caller of the CleanUp function should also call `listener.Close()`.

Signed-off-by: Oliver Calder <[email protected]>
…ompt

When a new request is identical to an existing one and is merged into an
existing prompt, record a notice again for that prompt, despite that the
prompt contents itself have not changed.

The rationale for this is that if a client attempts to reply to a prompt
but sends a malformed response, snapd will respond with an error and the
prompt will not be resolved. If the client does not re-try to reply, the
application will be left hanging, waiting for a reply which will not
come.

If a user attempts to re-do the operation which triggered the prompt,
that request may be merged into an existing prompt, and the client will
not re-prompt the user or send a reply, leaving the application blocked
again, this time with no hope of a response, since the client has
received no notice for the prompt since the initial one, to which it
already sent a (malformed) reply.

Thus, by re-recording a notice when a request is merged into an existing
prompt, the user has a way of poking the client into taking action
again, and applications always cause a notice to be recorded when they
perform an action which should be handled by a prompt.

Of course, it is important that clients correctly handle error responses
from snapd and retry replying when appropriate. Re-recording a notice
when a request is merged into an existing prompt does not help in a
situation where an application is blocked on an operation which the
client thinks it has handled, and the application does not retry (which
is likely, given that the application itself is often blocked and
cannot/will not perform other actions until that syscall completes).

Signed-off-by: Oliver Calder <[email protected]>
… resolved

A prompt may only be resolved once, and the notice recorded when that
prompt is resolved is the final notice which will be associated with
that prompt ID. Therefore, it is safe to include notice data without
risk of signal loss.

Currently, prompt clients need to query the prompting API in order to
check whether an outstanding prompt was modified (due to some but not
all of its permissions being satisfied by a new rule) or fully resolved,
either due to a direct reply, a new rule satisfying all remaining
permissions, or the prompt DB being closed.

Since the notice recorded when a prompt is resolved is the final notice
for that prompt, we can instead include the "resolved" key in the
notice data to indicate to the client that the prompt has been resolved
without the need to query the prompting API. And the value for the
"resolved" key can be "replied", "satisfied", or "cancelled", giving
additional information to the prompt client about why the prompt was
resolved.

Signed-off-by: Oliver Calder <[email protected]>
By using a memory-mapped file to keep track of the max prompt ID, we can
avoid most syscalls when incrementing the value, and the max ID value
can still be incremented atomically.

If multiple prompt DBs happen to be running at the same time, any
prompts generated by either DB will have unique IDs across both DBs, and
the new max ID is immediately readable from the file on disk.

Signed-off-by: Oliver Calder <[email protected]>
Several improvements to the handling of the max ID file mmap:
- `Close()` calls `Munmap()` to clean up the mmap file description
- Methods on prompt DB error if `Close()` has already been called
  - As a result, methods may now return errors
- `maxIDFile.SyscallConn().Control()` so that the max ID file/fd is not
  GC'd and closed during the syscall
- Use `maxIDFileSize` whenever possible to reduce hard-coding

Signed-off-by: Oliver Calder <[email protected]>
Rather than using a map from ID to prompt, instead store all prompts for
the given user in a slice, where new prompts are appended to the end and
when a prompt is removed, the final prompt in the slice is moved to the
index of the removed slice, and the slice is truncated by one element.

To look up prompts by ID, the user prompt DB also stores a map from ID
to the index of the corresponding prompt in the prompt slice.

The benefit of this split storage system is that it is much easier to
extract the full list of prompts for a user, since they are already
stored as a slice. The one caveat is that we can't guarantee that the
slice won't be mutated internally after it has been returned, so we must
return a copy of the slice. The copy operation is still much faster than
iterating through all elements of a map and flattening it into a slice,
however.

Signed-off-by: Oliver Calder <[email protected]>
@olivercalder
Copy link
Member Author

Thanks for re-reviews! I've rebased to pull in test fixes from master.

@alfonsosanchezbeato alfonsosanchezbeato merged commit e586e70 into canonical:master Jul 24, 2024
47 of 53 checks passed
@olivercalder olivercalder added this to the 2.65 milestone Aug 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Samuele review Needs a review from Samuele before it can land Squash-merge Please squash this PR when merging.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants