Add user-space message queue library to the OSAL (GSFC DCR 22160) #73

Open
skliper opened this issue Sep 30, 2019 · 18 comments


@skliper
Contributor

skliper commented Sep 30, 2019

The GSFC ATLAS project developed an alternate queue library to use with POSIX to overcome a performance limitation of the Linux POSIX message queues.

Incorporate this enhancement (or similar enhancement) into the OSAL for POSIX, RTEMS, and VxWorks.

@skliper skliper self-assigned this Sep 30, 2019
@skliper
Contributor Author

skliper commented Sep 30, 2019

Imported from trac issue 50. Created by sstrege on 2015-05-14T18:58:30, last modified: 2019-08-14T14:11:46

@skliper
Contributor Author

skliper commented Sep 30, 2019

Trac comment by sstrege on 2015-05-14 19:08:17:

Solution is dependent on Trac Ticket #28. In the current OSAL model one would have to write a separate implementation for each of VxWorks, RTEMS, and POSIX, yet this feature is fully self-contained and not OS-dependent so all 3 would be identical. Once #28 is merged in this can be done once in a shared area - much cleaner.

@skliper skliper removed their assignment Sep 30, 2019
@CDKnightNASA
Contributor

Looks like mqueue does have "priority" but we aren't really using it...and we have no control over its logic. Writing our own queues would allow us full control.
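
To make "writing our own queues" a bit more concrete, here is a minimal sketch of a user-space bounded queue built on a pthread mutex and condition variable. It is not the ATLAS library or any existing OSAL code; the `uq_` names, the depth, and the message size are all made up for illustration, and priority handling is left out.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define UQ_DEPTH    4  /* hypothetical queue depth */
#define UQ_MSG_SIZE 32 /* hypothetical maximum message size in bytes */

/* Hypothetical fixed-size ring buffer of messages, guarded by one mutex. */
typedef struct
{
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    size_t          head;  /* index of the oldest message */
    size_t          count; /* number of messages currently stored */
    size_t          len[UQ_DEPTH];
    unsigned char   data[UQ_DEPTH][UQ_MSG_SIZE];
} uq_t;

void uq_init(uq_t *q)
{
    memset(q, 0, sizeof(*q));
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Non-blocking put: returns false if the queue is full or the message is too large. */
bool uq_try_put(uq_t *q, const void *msg, size_t len)
{
    bool ok = false;

    if (len > UQ_MSG_SIZE)
    {
        return false;
    }

    pthread_mutex_lock(&q->lock);
    if (q->count < UQ_DEPTH)
    {
        size_t tail = (q->head + q->count) % UQ_DEPTH;

        memcpy(q->data[tail], msg, len);
        q->len[tail] = len;
        q->count++;
        ok = true;
        pthread_cond_signal(&q->not_empty);
    }
    pthread_mutex_unlock(&q->lock);

    return ok;
}

/* Blocking get: waits for a message, copies at most buf_size bytes, returns the copied length. */
size_t uq_get(uq_t *q, void *buf, size_t buf_size)
{
    size_t len;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
    {
        pthread_cond_wait(&q->not_empty, &q->lock);
    }

    len = q->len[q->head];
    if (len > buf_size)
    {
        len = buf_size; /* truncate to the caller's buffer */
    }
    memcpy(buf, q->data[q->head], len);
    q->head = (q->head + 1) % UQ_DEPTH;
    q->count--;
    pthread_mutex_unlock(&q->lock);

    return len;
}
```

Because the depth here is just a compile-time constant in our own code, nothing in it depends on a kernel-imposed queue limit.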

@jphickey
Contributor

jphickey commented Mar 3, 2020

Note that in addition to WSL this also helps with BSD variants and derivatives (Mac OS, FreeBSD, etc) which don't seem to offer POSIX mqueues.

This change would be highly valuable for improving the cross-platform applicability of OSAL by removing the dependency on POSIX queues.

@CDKnightNASA
Contributor

Note that FPrime has a userspace queue for OSX support; it's written in C++ (of course) but I'm wondering if we could unify our codebase for queues.

https://github.com/nasa/fprime/blob/master/Os/MacOs/IPCQueueStub.cpp

@ivanperez-keera
Member

ivanperez-keera commented Feb 2, 2023

If I understand this correctly, a reimplementation of message queues as proposed would also facilitate using cFS in docker containers without root on the host (which is a common restriction on NASA machines).

Right now, I can't run cFS inside docker because /proc/sys/fs/mqueue/msg_max is 10 in the container and I cannot increase it.

@skliper
Contributor Author

skliper commented Feb 2, 2023

@ivanperez-keera - if you can set msg_max higher than your maximum requested queue depth on the host then you'll avoid the issue, or if you can live with the limit of 10 there's the OSAL_CONFIG_DEBUG_PERMISSIVE_MODE:

https://github.com/nasa/cFE/blob/6d96c6e856a654f7c96e66a87b003aa01ff96874/cmake/sample_defs/native_osconfig.cmake#L39

For development in Docker containers on a desktop/laptop host I typically just use permissive mode. For performance testing or similar, where it really matters, I either run it on a more representative system (or an emulator of one) or get an admin to increase the msg_max setting so that I can use deeper queues. Either way avoids the need for root on the host.

User space queues would avoid the issue though, which would be nice.

@ivanperez-keera
Member

get an admin to increase the msg_max

On the host, msg_max is 4096. Inside docker, it's reduced to 10. I have yet to figure out why. I also opened a question on stackoverflow months ago but received no replies: https://stackoverflow.com/questions/75329421/docker-fs-mqueue-msg-max-set-to-10-in-spite-of-hosts-being-4096.

@ivanperez-keera
Member

@skliper That seems to be set to true by default when using native, which I am using. I'm very confused about why I'm still getting an error. Is there something else I need to do to set permissive mode during compilation?

When I grep for PERMISSIVE in my tree after building, my CMakeCache.txt files all indicate that OSAL_CONFIG_DEBUG_PERMISSIVE_MODE is false.

Here's my dockerfile: nasa/cFS#718 (comment)

@skliper
Contributor Author

skliper commented Dec 4, 2023

get an admin to increase the msg_max

On the host, msg_max is 4096. Inside docker, it's reduced to 10. I have yet to figure out why. I also opened a question on stackoverflow months ago but received no replies: https://stackoverflow.com/questions/75329421/docker-fs-mqueue-msg-max-set-to-10-in-spite-of-hosts-being-4096.

@ivanperez-keera - I increase msg_max in my Docker container by passing a parameter to docker run: --sysctl fs.mqueue.msg_max=10000. Try that, and if it doesn't work, could you post the error message?

@ivanperez-keera
Member

Try that and if it doesn't work could you post the error message?

In some environments, I can't sudo. The message I get there is:

docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc
create failed: unable to start container process: error during container init: open
/proc/sys/fs/mqueue/msg_max: permission denied: unknown.

@ivanperez-keera
Member

@skliper Nevertheless, you said "if you can live with the limit of 10 there's the OSAL_CONFIG_DEBUG_PERMISSIVE_MODE:".

However, when I tried that, I couldn't make it work. See: #73 (comment)

How do I set PERMISSIVE? In the Dockerfile linked below, and without adjusting fs.mqueue.msg_max, what would I have to change to run the app without root privileges on the host while staying under the limit of 10?

nasa/cFS#718 (comment)

@skliper
Contributor Author

skliper commented Dec 4, 2023

Permissive is set here in the example setup when native:
https://github.com/nasa/cFE/blob/b72dd4e1f9f44c7dbb7a12895b5ac1635eb239b2/cmake/sample_defs/native_osconfig.cmake#L39
There's some file matching magic, but you could set it in the default config if you want it to apply to more than native.

If you can't set permissive mode, in theory you could override software bus queue depths to all be <= 10. I'm not sure they are configurable in every app though.

As a quick fix/test (if PERMISSIVE can't be fixed quickly), you could just force the limit from the OSAL API implementation for queues. I haven't done the analysis to figure out what might require more than a depth of 10 for nominal operations; maybe sbr and/or the various forms of to (to_lab, etc.), depending on how they are designed. See:

    if (geteuid() != 0)
    {
        fp = fopen("/proc/sys/fs/mqueue/msg_max", "r");
        if (fp)
        {
            if (fgets(buffer, sizeof(buffer), fp) != NULL)
            {
                OS_BSP_Global.MaxQueueDepth = OSAL_BLOCKCOUNT_C(strtoul(buffer, NULL, 10));
                BSP_DEBUG("Maximum user msg queue depth = %u\n", (unsigned int)OS_BSP_Global.MaxQueueDepth);
            }
            fclose(fp);
        }
    }

#ifdef OSAL_CONFIG_DEBUG_PERMISSIVE_MODE
    /*
     * Use the BSP-provided limit
     */
    POSIX_GlobalVars.TruncateQueueDepth = OS_BSP_Global.MaxQueueDepth;
#else
    /*
     * Initialize this to zero to indicate no limit
     */
    POSIX_GlobalVars.TruncateQueueDepth = OSAL_BLOCKCOUNT_C(0);
#endif
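
For illustration, a truncation value like the one above could be applied when a queue is created roughly as follows. This is only a sketch: `effective_queue_depth` is a hypothetical helper, not an OSAL function, and it assumes the same convention as the snippet above, where zero means no limit.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns the depth to actually request from the OS.
 * A truncate_depth of 0 means "no limit", so the requested depth is used as-is.
 */
size_t effective_queue_depth(size_t requested, size_t truncate_depth)
{
    if (truncate_depth != 0 && requested > truncate_depth)
    {
        return truncate_depth;
    }

    return requested;
}
```

With a limit of 10, a request for a depth of 50 would be reduced to 10, while a request for 5 would pass through unchanged.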

@skliper
Contributor Author

skliper commented Dec 4, 2023

Oh... I wonder if the method for getting MaxQueueDepth is broken for your setup. Might be worth a backup of 10 if the fopen/fgets fails.
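
A rough sketch of what such a backup could look like, written as a standalone function rather than a patch to the BSP. The name `read_max_queue_depth` and the `FALLBACK_QUEUE_DEPTH` constant are hypothetical; the value 10 is simply the limit discussed in this thread.

```c
#include <stdio.h>
#include <stdlib.h>

#define FALLBACK_QUEUE_DEPTH 10UL /* assumption: the backup value suggested above */

/*
 * Hypothetical helper: reads a queue depth limit from the given file
 * (e.g. /proc/sys/fs/mqueue/msg_max) and falls back to a fixed default
 * if the file cannot be opened, read, or parsed as a positive number.
 */
unsigned long read_max_queue_depth(const char *path)
{
    char          buffer[32];
    unsigned long depth = FALLBACK_QUEUE_DEPTH;
    FILE *        fp    = fopen(path, "r");

    if (fp != NULL)
    {
        if (fgets(buffer, sizeof(buffer), fp) != NULL)
        {
            char *        end;
            unsigned long value = strtoul(buffer, &end, 10);

            /* Accept the value only if at least one digit was parsed */
            if (end != buffer && value > 0)
            {
                depth = value;
            }
        }
        fclose(fp);
    }

    return depth;
}
```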

@ivanperez-keera
Member

ivanperez-keera commented Dec 4, 2023

I wonder if the method for getting MaxQueueDepth is broken for your setup.

I don't know why that would be. See my docker image: I'm using the standard cFS.

Does that image work for you at all without root?

@skliper
Contributor Author

skliper commented Dec 5, 2023

It's because in your Docker container geteuid() == 0, so cFS thinks it has privileges and skips the use of the msg_max limit. If you comment out the geteuid() != 0 check, it works; it did for me (it IS using PERMISSIVE w/ your setup).

@ivanperez-keera
Member

That was it. Commenting the if line out (but leaving the following block in) makes cFS not crash. Thanks a bunch!

I wonder 1) if root can go beyond msg_max normally (otherwise, why is there such a check) and, 2) if so, then should there be a flag so that the condition is not solely based on the user id.
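
On 1), as far as the mq_overview(7) man page describes it, msg_max is ignored for processes holding CAP_SYS_RESOURCE, which would explain the check; root inside a container may not hold that capability. On 2), one possible shape for such a flag is sketched below as a pure decision function. Everything here is hypothetical: `should_apply_msg_max` is not an OSAL function, and the idea of feeding it from an environment variable (say, `OSAL_FORCE_QUEUE_LIMIT`) is only an assumption for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: decides whether the msg_max limit should be applied.
 * force_flag is the value of an opt-in setting (NULL if unset); "1" forces
 * the limit even for root. Otherwise the decision falls back to the user ID.
 */
bool should_apply_msg_max(unsigned int euid, const char *force_flag)
{
    if (force_flag != NULL && strcmp(force_flag, "1") == 0)
    {
        return true;
    }

    return euid != 0;
}
```

A caller could pass in `(unsigned int)geteuid()` and the result of `getenv()` for the chosen variable, so that root in a restricted container can still opt in to the limit.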

@ivanperez-keera
Member

For completeness, in case someone has the same problem I did, this is my current Dockerfile:

FROM i386/debian:bullseye

# Apt should not ask questions during configuration
ENV DEBIAN_FRONTEND=noninteractive

# Update packages available
RUN apt-get update

# cFS dependencies
RUN apt-get install -y cmake build-essential gcc-multilib g++-multilib

# Generic dependencies needed
RUN apt-get install -y git

# Get copy of cFS
RUN git clone --recursive https://github.com/nasa/cFS
WORKDIR cFS
RUN git submodule init
RUN git submodule update

RUN cp cfe/cmake/Makefile.sample Makefile
RUN cp -r cfe/cmake/sample_defs sample_defs

# We have to either modify the following file to remove a check based on the
# user ID, or compile and install everything as a different user.
RUN sed -i -e '66s/\<if\>/\/\/ if/g' osal/src/bsp/generic-linux/src/bsp_start.c

RUN make SIMULATION=native prep
RUN make
RUN make install
WORKDIR build/exe/cpu1/
CMD ./core-cpu1
