Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding test to cfe_testcase app demonstrating non-detection of pipe destruction by CFE_SB_ReceiveBuffer #1777

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jbohren-hbr
Copy link
Contributor

Describe the contribution
A clear and concise description of what the contribution is.

  • This adds a test to the cfe_testcase app demonstrating a potential bug in CFE_SB_ReceiveBuffer()
  • When CFE_SB_ReceiveBuffer() is called with CFE_SB_PEND_FOREVER, and its pipe is destroyed, it does not return with an error code. (It is unclear at this time whether it blocks indefinitely or returns with CFE_SUCCESS).
  • If this test case asserts the correct behavior, it indicates a change may be necessary in CFE_SB_ReceiveBuffer()

Testing performed
Steps taken to test the contribution:

  1. Create template mission from sample_defs
  2. Add cfe_testcase to MISSION_GLOBAL_APPLIST in targets.cmake:
    list(APPEND MISSION_GLOBAL_APPLIST sample_app sample_lib cfe_testcase)
  3. Add cfe_testcase to cpu1_cfe_es_startup.scr:
    CFE_APP, cfe_testcase, CFE_TestMain, TESTCASE_APP, 50, 16384, 0x0, 0;
  4. Run ./core-cpu1 out of standard build directory

Expected behavior changes

  • The test case does the following:

    • Creates a pipe
    • Transmits one message, which should be received by that pipe
    • Creates a child task with a loop servicing that pipe
    • Waits one second
    • Destroys the pipe
    • Waits one second
    • Checks to see that the child task has exited
    • Checks to see if CFE_SB_ReceiveBuffer() has returned an error code
  • Output from the failing check:

EVS Port1 66/1/TESTCASE_APP 7: [BEGIN] 36 Test destroying a pipe while waiting.
EVS Port1 66/1/TESTCASE_APP 2: [ FAIL] 36.007 sb_task_test.c:127 - D->exited (0) == 1 (1)
EVS Port1 66/1/TESTCASE_APP 2: [ FAIL] 36.008 sb_task_test.c:128 - D->status (0) != CFE_SUCCESS (0)
EVS Port1 66/1/TESTCASE_APP 8: [  END] 36 Test destroying a pipe while waiting. TOTAL::8     PASS::6     FAIL::2     MIR::0     TSF::0     TTF::0     N$

This indicates that 1.0 seconds after deleting an SB pipe, a call to CFE_SB_ReceiveBuffer() has not exited with a failure condition.

System(s) tested on

  • Hardware: PC
  • OS: Ubuntu 18.04
  • Versions:
    • cFE Development Build: 6.8.0-rc1+dev290
    • OSAL Development Build: v5.1.0-rc1+dev586
    • PSP Development Build: v1.5.0-rc1+dev118

Additional context
We've been working with an internal fork of SBN running with cFE (version Development Build: 6.8.0-rc1+dev290) and encountered a race condition where a blocking call to CFE_SB_ReceiveBuffer() is not failing if its SB pipe is destroyed while waiting. This is modeled after the implementation in SBN [ref].

cc @CDKnightNASA

Third party code
No third-party code is included.

Contributor Info - All information REQUIRED for consideration of pull request
Full name and company/organization/center of all contributors ("Personal" if individual work)

  • Jonathan Bohren, Honeybee Robotics

@jbohren-hbr jbohren-hbr force-pushed the add-sb-receive-pipe-destruction-test branch from 2c6de53 to a4e23ca Compare August 10, 2021 16:17
@astrogeco
Copy link
Contributor

astrogeco commented Aug 10, 2021

Hi! Thanks for the contribution! Can you open an issue that describes the problem this PR solves?

Maybe reference any shortcomings in the implementations for #1651 and #1657

@jbohren-hbr
Copy link
Contributor Author

@astrogeco Thanks for taking a look. I opened an #1799 which related this test case to the extant one you referenced.

My main question is what is the expected behavior of a blocking call to CFE_SB_ReceiveBuffer() when the pipe provided to it is deleted in another task.

@skliper
Copy link
Contributor

skliper commented Aug 10, 2021

At this point that behavior is undefined :)

Haven't had a chance to look into it, but it may be challenging to provide a defined behavior based in the differing underlying implementations (message queues, etc). My suggestion from a user standpoint would be to avoid the undefined behavior by providing a "shutdown" message or similar which would then be consistent/more portable.

I haven't dug into the design... but you can also use the child task management APIs if the intent is to stop the child task or similar.

@jbohren-hbr
Copy link
Contributor Author

@skliper thanks for the note. I definitely appreciate the challenge in making this work reliably across platforms. My go-to for avoiding these sort of issues is to just never use indefinitely-blocking calls.

In our SBN fork we've just replaced this call with a 1000ms timeout.

Maybe SB could push a sentinel message into the pipe before destroying it, which would produce an appropriately descriptive error code on any blocking listeners. This might still require some additional bookkeeping, but I also haven't dug into what's already there.

@skliper
Copy link
Contributor

skliper commented Aug 11, 2021

My go-to for avoiding these sort of issues is to just never use indefinitely-blocking calls.

Good approach!

Maybe SB could push a sentinel message into the pipe before destroying it, which would produce an appropriately descriptive error code on any blocking listeners. This might still require some additional bookkeeping, but I also haven't dug into what's already there.

Good suggestion! Definitely worth consideration. I'm not sure how much analysis was done relative to multiple tasks sharing/using/managing a single pipe and ensuring consistent/guaranteed behavior in those scenarios.

@astrogeco
Copy link
Contributor

CCB:2021-08-18

  • Postpone to Draco

@astrogeco astrogeco added this to the cFS-Draco milestone Aug 18, 2021
@skliper skliper removed this from the Draco milestone Mar 18, 2022
@astrogeco astrogeco added the CCB:Ignore Pull Request can be ignored. Will be re-examined at by next CCB. label May 27, 2022
@dzbaker dzbaker self-assigned this Sep 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CCB:Ignore Pull Request can be ignored. Will be re-examined at by next CCB.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants