Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CFE_ES_CheckCounterIdSlotUsed doesn't handle error case (NULL dereference) - static analysis warning #1194

Closed
skliper opened this issue Mar 1, 2021 · 9 comments · Fixed by #1256 or #1258
Assignees
Milestone

Comments

@skliper
Copy link
Contributor

skliper commented Mar 1, 2021

Is your feature request related to a problem? Please describe.
CFE_ES_CheckCounterIdSlotUsed -> CFE_ES_LocateCounterRecordByID can return NULL, and CFE_ES_CounterRecordIsUsed dereferences

bool CFE_ES_CheckCounterIdSlotUsed(CFE_ResourceId_t CheckId)
{
return CFE_ES_CounterRecordIsUsed(CFE_ES_LocateCounterRecordByID(CFE_ES_COUNTERID_C(CheckId)));
}

CFE_ES_GenCounterRecord_t* CFE_ES_LocateCounterRecordByID(CFE_ES_CounterId_t CounterID)
{
CFE_ES_GenCounterRecord_t *CounterRecPtr;
uint32 Idx;
if (CFE_ES_CounterID_ToIndex(CounterID, &Idx) == CFE_SUCCESS)
{
CounterRecPtr = &CFE_ES_Global.CounterTable[Idx];
}
else
{
CounterRecPtr = NULL;
}
return CounterRecPtr;
}

static inline bool CFE_ES_CounterRecordIsUsed(const CFE_ES_GenCounterRecord_t *CounterRecPtr)
{
return CFE_RESOURCEID_TEST_DEFINED(CounterRecPtr->CounterId);
}

Describe the solution you'd like
Handle null

Describe alternatives you've considered
None

Additional context
Static analysis warning

Requester Info
Jacob Hageman - NASA/GSFC

@skliper skliper added the bug label Mar 1, 2021
@skliper skliper added this to the 7.0.0 milestone Mar 1, 2021
@skliper
Copy link
Contributor Author

skliper commented Mar 1, 2021

Same for CFE_ES_CheckAppIdSlotUsed, CFE_ES_CheckLibIdSlotUsed, CFE_ES_CheckMemPoolSlotUsed, CFE_SB_CheckPipeDescSlotUsed, CFE_ES_CheckCDSHandleSlotUsed

@jphickey
Copy link
Contributor

jphickey commented Mar 1, 2021

It's not really a bug though, it is guaranteed elsewhere that the code never invokes that with an invalid ID, so its impossible to get null in the use case for which that function was intended to be used.

@jphickey
Copy link
Contributor

jphickey commented Mar 1, 2021

To clarify this particular function is only supposed to be used with CFE_ResourceId_FindNext() which limits its range - so this is guaranteed to never be invoked with an out-of-range ID.

It is also supposed to be an internal helper function (i.e. not part of public API) and as such not possible that external entities other than ES itself can ever invoke this function. (which IIRC internal functions do not require NULL checks on inputs)

@skliper
Copy link
Contributor Author

skliper commented Mar 1, 2021

I see it as either it's guaranteed to return success, so you don't need to check the return (and set NULL as the else case) or it can be NULL and you should handle it. I don't follow the logic for having both.. especially when setting it to NULL would cause the following dereference to segfault. Why set up a case like that when it's all internal?

@skliper skliper added enhancement and removed bug labels Mar 1, 2021
@jphickey
Copy link
Contributor

jphickey commented Mar 2, 2021

It's not all internal. This is a case of an internal function (With controlled/known inputs) calling a common API so it doesn't need to repeat the logic.

The public API needs to check - per normal requirements of all APIs that have unknown inputs - but these failure cases will never be reached in the context of the helper function where the input is already known to be within range.

In this case the code is primarily run at start up so the cost of the redundant test is negligible - this isn't code the runs frequently. So if the objective here is to appease the static analysis tool that doesn't see the full context - then OK....

I'm definitely more concerned about redundant tests in code that executes during steady state/runtime than code that is primarily startup like this one.

@jphickey
Copy link
Contributor

jphickey commented Mar 2, 2021

To clarify that a bit -- the CFE_ES_LocateCounterRecordByID() is not itself public - but it is intended to range-check the ID on behalf of the caller if/when they are invoked from a public API. (e.g. CFE_ES_IncrementGenCounter, CFE_ES_DeleteGenCounter, etc). This is true of all the various "CFE_ES_Locate*ByID()" functions - the serve the role of validating the input on behalf of the caller so it doesn't need to be repeated in every public API function.

There's also a fair number of cases where these functions are also called from an internal context where the ID is known to be good or otherwise range checked already. It just isn't worth implementing a separate "internal" variant that skips the checks - that would cause other problems (duplicate logic, etc). So the existing/common API is used, but it just doesn't (re-?)check for failure when the input is known to be good already.

@skliper
Copy link
Contributor Author

skliper commented Mar 2, 2021

I find it interesting where we split on future-proofing. I get opposition when removing a default case or double checking a range where the case could never be exercised even with stubs when its done within the same unit, yet when there's 3 different units involved (the caller, CFE_ES_LocateCounterRecordByID, CFE_ES_CounterID_ToIndex) there's an argument for efficiency that leaves a possible future segmentation fault if API implementations or caller logic changes in any of these locations. I obviously prefer a bit more resiliency when utilizing APIs. I do recognize there's absolutely a trade to be made (efficiency/avoiding duplication/future-proofing, and even the "don't fix what isn't broken" perspective), just struggling with objective justification. As a general practice we protect from errors on many APIs that should never fail and at minimum avoid a segmentation fault, even though it's less efficient. Is this case really that unique? Maybe there could be a no-error checking internal implementation and the API could call that with the added protection (and these internal routines could use the more efficient internal implementation directly)?

@jphickey
Copy link
Contributor

jphickey commented Mar 5, 2021

That's what I was getting at earlier - the other option is to split up the underlying CFE_ES_CounterID_ToIndex() - currently this checks range of its input value AND converts to a local index if the range was valid. In this use-case we know the range is already valid (it was computed/generated based on the same limits) so it only really needs to convert to a local index.

To clarify - I have no strong objection to double-checking the value here. It is the lesser of two evils. Unnecessarily duplicating some logic and array lookup stuff to bypass checks would be a greater evil.

I guess I'm not really worried about a bug getting introduced (via changing one of the functions involved) that causes an out-of-range condition and corresponding segfault because that's a major bug and it would be really obvious the first time your run it that you broke it. "Hiding" the bug by not segfaulting here is probably just going to defer the segfault till a little later, where it's a little more decoupled from the real problem and harder to debug. But either way if someone breaks this code it'll probably segfault at some point - so I don't see this extra check as "saving the system from a segfault" - if someone breaks this code it WILL break the system, an "if" here won't save it.

@skliper skliper added the CCB:Ready Ready for discussion at the Configuration Control Board (CCB) label Mar 17, 2021
@skliper
Copy link
Contributor Author

skliper commented Mar 17, 2021

CCB - Could do a similar NULL debug check like OSAL with optional abort. Or just put in the if to check and report.

Suggest return true

@skliper skliper added CCB:2021-03-17 and removed CCB:Ready Ready for discussion at the Configuration Control Board (CCB) labels Mar 17, 2021
jphickey added a commit to jphickey/cFE that referenced this issue Mar 23, 2021
These are internal helper functions that determine if a table
slot corresponding to a given ID is in use or free/available.

This updates the function to handle NULL pointers even though
in context they are used the lookup should always work.
astrogeco added a commit that referenced this issue Mar 25, 2021
Fix #1194, check for NULL in SlotUsed helpers
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants