-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OXT-1521: Replace UID with Quark #1109
base: master
Are you sure you want to change the base?
Conversation
This should be considered for merge after #1101 |
Please create a JIRA ticket for this new component. |
This new implementation requires discussion on JIRA before a merge can occur.
|
Tested downstream and works as expected. +1 |
@jean-edouard @eric-ch @dozylynx @jandryuk thoughts? |
Thoughts about replacing an obsolete and un-maintained ocaml solution with a current and upstream C one? |
-LDFLAGS = -s | ||
+CPPFLAGS ?= -DVERSION=\"$(VERSION)\" -D_DEFAULT_SOURCE -D_XOPEN_SOURCE=700 -D_BSD_SOURCE | ||
+CFLAGS ?= -std=c99 -pedantic -Wall -Wextra -O | ||
+LDFLAGS ?= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weak assignment is no issue, but why change the defaults?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CFLAGS change was a leftover change from when I was debugging and can be reverted. The LDFLAGS change (I believe) was necessary to compile debug symbols for testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, -s
is not desirable. Switching to weak assignment lets Bitbake override the variables with the toolchain configuration though.
It is unfortunately done in many projects :).
- die("getgrnam '%s': %s", group, errno ? strerror(errno) : | ||
- "Entry not found"); | ||
- } | ||
- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you remove this? This is not documented in the patch and should be split in its own patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The user/group validation and the uid operations in your below comment required quark to have some admin-like SELinux privileges. I wanted to avoid that, so I just removed the functionality entirely. We don't need these as far as I'm aware anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually you left the options definitions in getopt()
and the usage, so not entirely. It really belongs in a separate patch "Add POST and v4v support to quark." is not really related to this.
SELinux handles these sort of perms, plenty of other daemons drop root. It can be added to the policy file.
- } | ||
- if (getgid() == 0) { | ||
- die("Won't run as root group", argv0); | ||
- } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as before, also this is crippling options -u
and -g
which could be used should the default user/group hardcoded in there not be available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment above
"Date: %s\r\n" | ||
"Connection: close\r\n" | ||
"Last-Modified: %s\r\n" | ||
+ "Expires: Wed, 1 Jan 2030 00:00:00 GMT\r\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is that necessary? Is there some clever caching being done on the UI side?[1].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reason this is here is because UID sent a similar line in its response header (see https://github.com/OpenXT/uid/blob/85bdfe033454efc8016362c466599b0658d91c79/file_handler.ml#L82). It's supposed to make use of HTTP caching, but I'm not even really sure that the client is taking advantage of that. i can remove it if desired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UID does it on images only, this is probably to not reload the entire UI and get things a little faster, the browser likely handles the caching.
While the images in the UI page won't change, could this trigger some fun on other resources?
Also quark
already has helpers to handle timestamps, this could be now + 10years, or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good catch I didn't realize it was just for images. I can do something similar using the helpers.
+ FD_SET(fd, &writefds); | ||
+ } | ||
+ | ||
+ select(fd + 1, &readfds, &writefds, NULL, &tv); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
select(2) can be interrupted by signals(7).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concur. Also, since you have a timeout specified and the function returns void, you don't know which occurred when calling this function, either read access, timeout, or interrupt. The return of the select() can help make this determination clearer, or allow you to put in a bit of a spinlock if you want to keep this function returning void.
+ /* wait until our fd is available for the next operation */ | ||
+ wait_fd_ready(fd, true, true); | ||
+ ret = resp_file(fd, RELPATH(realtarget), r, &st, mime, lower, upper); | ||
+ wait_fd_ready(fd, true, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not fix anything. If it does, we have bigger problems.
resp_file()
will only write tofd
.write(2)
should only block if there is no space left to send the message.
With that in mind:
- Waiting on
read
before writing would be equivalent to sleep(500ms). - Waiting on
write
before writing should not block. - Waiting on
read
after writing most likely sleep(500ms), but maybe the client will have time to reply something (not that anything is expected?) or close the connection.
So I am slightly curious: What was blocking exactly and are both wait_fd_ready
necessary to no longer observe the issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In our first iteration of the issue, we had sleep(100)'s where the wait_fd_ready() functions are now. The delays were all necessary for proper execution, and to honest I still don't know why. In an effort to make the fix less hacky, I replaced them with select() calls, but based on your comment I'm guessing this was the wrong choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am asking to figure out if this reveals a problem in V4V or between the UI and Quark (or something else entirely...).
If this is something V4V does not handle, that problem will occur somewhere else eventually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not confident this is a v4v issue, but unfortunately my knowledge of socket operations is not very strong at all. Would someone else who's more knowledgeable be able to take a look? I can stick a TODO in there for now so as to not hold up this merge.
I would be willing to +1 if the packagegroup explicit replace change is instead made to be a check against DISTRO_FEATURES for quark at which point it is included with the default being to use uid. |
@dpsmith -1 to that. If it's optional and disabled by default, it might as well not be there. |
@crogers1 not sure if this is on your radar but this should be merged with OpenXT/toolstack-data#51 |
@jed, okay then -1 from me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not opposed to switching from UID to quark. Getting rid of a custom component is good. However this particular PR feels a little messy. It made quark work, but didn't go back and clean up the patch set.
The one patch, add-POST-and-v4v-support.patch, states 2 things, but really does 5.
- Add POST
This is okay and needed - Random wait_fd_ready calls
As @eric-ch said, these seem wrong. We control v4v, so if something is broken there then we should fix that. - Makefile changes
Again, needed, but they should be explained. - Strip out user/group changing
I think these should be dropped since we can change our SELinux policy to accommodate. - Hard-codes an Expires header
I don't know enough to evaluate this.
These are all independent changes, so I would structure them in separate patches.
We want to align with upstream sources to reduce our maintenance burden going forward. Therefore, we should only patch what we absolutely must. If we can re-work other parts of OpenXT, then we should. This is where I'd rather fix v4v and SELinux than patch quark.
+ /* keep the first newline */ | ||
+ p += 2; | ||
+ | ||
+ /* find second occurrence - if exists, POST request, else GET */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't the request state "POST" or "GET" explicitly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming this is reading the full HTTP message, POST or GET should be the absolute first characters seen, so it would be far more efficient to look in the absolute positions than a string match when handling GET requests. The first occurrence of double-CRLF is required, but the second search could probably be taken out completely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second search was less about figuring out the type of request and more about finding where to null terminate it and copying the correct number of bytes from the payload. That being said the second search can be done conditionally to avoid searching unnecessarily for a GET request.
+ if (!status) { | ||
+ if (r.method == M_POST) { | ||
+ /* handle POST */ | ||
+ syslog(LOG_INFO, "[quark] %s", r.payload); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A syslog message for each post seems excessive. Maybe drop it to LOG_DEBUG? Also, POST requests don't get a response?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, logging all requests at LOG_INFO seems excessive, especially if you are simply dumping the payload into the log message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell this is how UID handled POST messages as well. The only time POST is ever used is for debug messages (see https://github.com/OpenXT/uid/blob/85bdfe033454efc8016362c466599b0658d91c79/httpserver.ml#L205).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And see https://github.com/OpenXT/toolstack-data/blob/master/dist/script/lib/debug.js#L42 for where the POST calls originate from. AFAIK there are none besides these UI logging calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @tsirakisn . Yes, I forgot about the POST logging. I guess that means it should stay LOG_INFO.
I see uid sends an HTTP response to the POST messages. I think those should be maintained in quark. I don't know if it matters, but surf may be hanging onto resources for those unanswered requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 I'll mirror UID's response header in quark
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll also add a lengthier comment above the syslog call explaining my above comments
echo "Stopping Quark server" | ||
stop_quark | ||
;; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nitpick. Inconsistent whitespacing, some have a blank space between branches and some do not.
@jandryuk Just some background, the direction I gave my team was to avoid modifying v4v or the interposer if at all possible. Replacing UID with Quark affects a relatively small number of components, whereas a v4v or interposer change potentially affects every component that uses v4v, which is a considerable number, including critical ones such as rpc-proxy. With the massive changes moving to 64-bit, that seemed like the correct approach at the time. Now that x64 is merged if we want to reevaluate how we approach Quark, that's fine with me. @dpsmith @jean-edouard, Putting this in a layer as an optional component may be the best path forward, seeing as there's a pretty even split among contributors. I can get behind that. |
@crogers1 that would be fine and to be clear, my opposition is not in that I don't want to see uid replaced. It's just that uid works today with no major bugs, it has been exercised across many releases, and has been proven in production. While this commit introduces quark as a wholesale replacement that modifies its behavior from upstream thus this patched implemented has no track record of production usage to demonstrate stability, e.g. it is an experimental feature. As an experimental feature it should be an explicit action to be enabled for those that want to use and test. After a release or two with better testing, iterations of bugs, and demonstrated stability in production, then uid could be retired and quark could be made the default. |
2b9de62
to
e0ef433
Compare
Quark is a lightweight C-based webserver. This commit removes the old OCAML-based webserver, UID, and replaces it with Quark. We also add POST support to Quark to allow for the UI code to log debug messages to dom0 Signed-off-by: Chris Rogers <[email protected]>
The removal of UID and addition of Quark requires updates to the refpolicy. This commit adds a new quark module and removes several uid_t rules in other .te files. Signed-off-by: Nicholas Tsirakis <[email protected]> Signed-off-by: Chris Rogers <[email protected]>
Rebased this PR. Compiled on latest master. @eric-ch if we could get a custom build from the build server with this in it, that'd be awesome! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I lost track of reviewing this.
Previous feedback mentioned the single add-POST-and-argo-support.patch patch names 2 things, but actually does 4 things. I think that should be split into 4.
Has any effort been made to investigated the need for wait_fd_ready? Including as much detail of "server lockup" would be good.
Instead of patching out setuid/setgid, was running as non-root investigated? /dev/argo_stream would need a group assignment, but it would be nice. Though I have no idea what other changes may be needed to get it to actually work :)
type uid_tmp_t; | ||
files_tmp_file(uid_tmp_t) | ||
files_tmp_filetrans(uid_t, uid_tmp_t, file) | ||
xc_files_rw_v4v_chr(quark_t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want: xc_files_rw_argo_chr
@@ -62,7 +62,6 @@ kernel_read_vm_overcommit_sysctl(updatemgr_t) | |||
logging_send_syslog_msg(updatemgr_t) | |||
|
|||
dbd_dbus_chat(updatemgr_t) | |||
uid_dbus_chat(updatemgr_t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have quark_dbus_chat - don't you need to replace these?
UID is an Ocaml based webserver in dom0 that supports the backend operations of the UI and UIVM. In the current state of OpenXT, much of the code is unused. It simply serves the javascript files to the browser in the UI and listens for POST messages for debug/logging.
We can replace UID with a simpler webserver, Quark, that has several benefits. It's more lightweight, written in C, and is actively maintained. This PR seeks to replace UID and update the refpolicy to define quark-specific types.