Is it appropriate to specify an exception vs interrupt priority? #544
Comments
Let us just say that the Compliance TG would not be pleased with making
this implementation dependent.
High performance implementations treat bus error as asynchronous, because
they've already completed subsequent instructions by the time the bus error
is reported (even not-so-high performance implementations do).
I'm using bus error to mean a HW fault, but it could possibly be an access to
non-existent memory (or IO device register) which can't be detected before
the request is made externally (so legal permissions, legal address range).
The synchronous exceptions can be re-executed and possibly succeed; bus
errors can't (by my definition of bus error).
So, that's the easy fix: it hurts when you do that, don't do that - make
bus errors asynchronous interrupts. It should be a rare case, and it
should be fatal.
…On Tue, Jul 21, 2020 at 10:03 AM Greg Chadwick ***@***.***> wrote:
The current privileged spec specifies:
Synchronous exceptions are of lower priority than all interrupts
I feel this may be better off being left as implementation defined or the
meaning of priority in this instance should have further clarification.
Consider the following case:
1. Core fetches a load instruction and begins its execution
2. Core issues a load request on its external memory interface (i.e.
it's not being satisfied by a cache if one exists)
3. Before a response to the load request is received an interrupt is
raised
4. When the response to the load request does appear it signals a bus
error which results in a synchronous exception (either interrupt and bus
error response are seen on the same cycle or bus error response occurs
after interrupt)
The obvious interpretation of the line I quote from the spec would be that
in this scenario we should take the interrupt (whether we see the bus error
response on the same cycle or not). However what do we then do with our
pending load/pending exception on bus error?
If the response has yet to be received for the load then we need to keep
some state around so we're aware there is a pending load response along
with some information about the instruction and similarly if we saw the
error response the same cycle as the interrupt we need some state to
indicate a 'pending' exception with logic to handle the special pending
load -> pending exception transition when the response returns. For larger
cores this may be perfectly fine and indeed just 'fall out' of the existing
micro-architectural design of a super-scalar out of order core. At the
other end of the spectrum with tiny 2 or even 1 stage in-order pipelines
this adds extra area and complexity.
Then there's also the question of how a 'pending exception' should be
dealt with. One answer would be to immediately take the exception upon
returning from an interrupt with an mret that results in a jump back to the
code that saw the exception though I don't think the architecture strictly
specifies an interrupt handler must end with mret (though it is probably
inadvisable for software to use a different mechanism). You also have to
deal with nested cases where you have a pending exception and end up with
another interrupt and exception co-incident to get a second pending
exception. For further complexity you could end up having the same PCs both
times. Clearly a big hardware stack of pending exceptions is not something
you want to architecturally mandate.
In some cases on taking the interrupt we could set the mepc to the
currently executing load and ignore any response received for it, so we
simply repeat the load on return from the interrupt. However whether this
is permissible is highly system/implementation dependent. For example
repeating the load may be fine when loading from some 'ordinary' memory but
would not be fine when the load targets some device register where a read
triggers some action (such as popping from a FIFO) and clearly wouldn't be
fine for stores.
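
To illustrate why replaying is not always safe, here is a minimal Python sketch. The `Ram` and `FifoDevice` classes and the `load_with_replay` helper are hypothetical models invented for this example; they are not taken from any core or from the spec. The sketch assumes the core discards the first response when the interrupt is taken and reissues the load after the handler returns.

```python
from collections import deque


class Ram:
    """Ordinary memory: a read has no side effects, so a replay is harmless."""

    def __init__(self, contents):
        self.contents = dict(contents)

    def load(self, addr):
        return self.contents[addr]


class FifoDevice:
    """Hypothetical device register: every read pops one entry from a FIFO."""

    def __init__(self, entries):
        self.fifo = deque(entries)

    def load(self, addr):
        # The read itself changes device state.
        return self.fifo.popleft()


def load_with_replay(target, addr):
    """Model a load whose first response is ignored because an interrupt was
    taken, and which is executed again on return (mepc points at the load)."""
    discarded = target.load(addr)  # response dropped when the interrupt is taken
    replayed = target.load(addr)   # same load reissued after the handler returns
    return discarded, replayed


ram = Ram({0x1000: 42})
fifo = FifoDevice([10, 20, 30])

ram_result = load_with_replay(ram, 0x1000)
fifo_result = load_with_replay(fifo, 0x0)
```

For the RAM model both reads return the same value, so software cannot tell that the load ran twice. For the FIFO model the discarded read has already consumed an entry, so the value the program finally sees is the second entry and the first is lost.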
Another way to interpret the prioritization is it only applies between
instructions, i.e. you wait until an instruction execution has been
resolved (so for a load/store you know whether it will succeed or see a bus
error) before taking an interrupt handler. In this sense the exception
occurs 'before' the interrupt so you would take the exception first. Then,
if you see an interrupt in the same cycle in which you would send a load or
store request, you prevent that request from going out so that the interrupt
can be taken before a potential exception.
This still suffers, though, from being complex to apply to different
microarchitectures. In a big out of order core you are always executing
multiple instructions so there is no neat dividing point at which to pay
attention to interrupts. For a small core, preventing a load or store
request in the same cycle you see an interrupt may introduce a nasty timing
path (from interrupt in -> memory request out).
Ultimately I think the asynchronous vs synchronous nature of these two
things means you can't sensibly specify a priority between them without
adding extra baggage to the architecture to explain how things are
synchronized (and potentially require this 'pending exception' concept). It
is better off being left to implementors to decide (you could imagine a
core that wanted the lowest latency interrupt possible may be happy to
implement some kind of 'pending exception' idea or simply take a
non-recoverable general 'system error' exception to allow prioritizing the
interrupt given this case should be very rare and bus errors are generally
software issues rather than an expected event with a correct program).
Incidentally I came across this whilst looking at an issue on the lowRISC
Ibex core (some discussion here: lowRISC/ibex#1034), which is an in-order 2 or 3
stage pipeline. There I plan to deal with this scenario by waiting for the
load or store response and taking the synchronous exception if a bus error
is seen (the interrupt then occurring in the synchronous exception handler
when interrupts are re-enabled).
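
The ordering this approach produces can be sketched as a small event trace. This is an illustrative model only, not Ibex RTL; the function name `run_load` and the trace strings are assumptions made for the example.

```python
def run_load(bus_error, irq_during_load):
    """Trace one load on a core that waits for the memory response before
    acting on an interrupt raised while the load is outstanding."""
    trace = ["issue load request"]
    irq_pending = False

    if irq_during_load:
        # The interrupt line changes state, but the core does not act on it
        # until the outstanding load has been resolved.
        irq_pending = True
        trace.append("interrupt raised (deferred)")

    if bus_error:
        trace.append("error response")
        trace.append("take synchronous exception")
        # Interrupts are disabled on trap entry; the deferred interrupt is
        # taken once the exception handler re-enables them.
        trace.append("handler re-enables interrupts")
    else:
        trace.append("ok response")
        trace.append("load retires")

    if irq_pending:
        trace.append("take interrupt")
    return trace


both = run_load(bus_error=True, irq_during_load=True)
irq_only = run_load(bus_error=False, irq_during_load=True)
error_only = run_load(bus_error=True, irq_during_load=False)
```

In this model the load is never left half completed: either it retires or its exception is taken, and the interrupt follows in both cases.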
|
"synchronous exception" in the above refers to idempotent failures that occur instead of executing an instruction, e.g. page faults. The purpose of that sentence is that low-privilege code in a fault loop, executing an infinite sequence of ecalls or misaligned accesses or whatever, must not be able to starve high-privilege code of interrupts. |
Sure, I'd argue you can build a high-performance implementation that can take synchronous exceptions on bus errors, on loads at least, but my issue concerns the scenario where a bus error results in a synchronous exception. So if you're using an asynchronous exception for this then this issue doesn't apply. You could still end up with multiple pending asynchronous bus error exceptions as I described above, but you could just say your asynchronous exception will only give the details (PC + Addr) of the first bus error seen.
I think this is a reasonable definition of bus error, but the specification does not define a bus error at all, simply a defined synchronous exception called 'Load access fault' with no requirements placed on what a 'Load access fault' must be. A reasonable interpretation here seems to be that a load or store that results in a bus error produces a synchronous exception with a 'Load access fault' or 'Store/AMO access fault' cause.
But this isn't mandated in the specification; there is no real definition of what a 'synchronous exception' is, and indeed there is space in the cause table for custom use. An implementer is free to define some synchronous exception that only occurs on a load or store and depends on the system's response to the request, for instance. That response could be a bus error or something else entirely. In this situation, dealing with an interrupt that occurs after the request but before the response whilst strictly sticking to the specification is problematic, as I've outlined. Perhaps the intent is that no synchronous exception should have these properties (implying that bus errors must result in an asynchronous exception).
Well if the interrupt vs synchronous exception priority was implementation defined it'd be perfectly possible for the interrupt to take priority over the ecall/misaligned access exception and enter the handler. If the low-privilege code is intentionally hitting a bus error in a loop then you'd still enter the interrupt handler just a little later as soon as the exception handler re-enabled interrupts. |
I think you found a typo in the spec. "Access faults" in the "Machine level CSRs" chapter are the same thing as "access exceptions" in the "Physical memory protection" chapter; it is not simply a placeholder for user-defined errors.
General problem with the spec unfortunately.
Some kernels, e.g. seL4, never enable interrupts in kernel code and the next opportunity to take an interrupt is after returning to user. |
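
The starvation concern can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not a description of seL4 or of any real core: every user-mode instruction faults, the kernel handles each fault with interrupts masked and returns to user mode, and an interrupt becomes pending at user-mode step `irq_at`. The function name and parameters are hypothetical.

```python
def steps_until_interrupt(interrupt_first, irq_at, max_steps=1000):
    """Return the user-mode step at which the interrupt is taken, or None if
    it is never taken within max_steps.

    interrupt_first=True models the spec's rule that a pending interrupt
    beats a synchronous exception; False models the opposite priority.
    """
    for step in range(max_steps):
        irq_pending = step >= irq_at
        # At this user-mode instruction both a pending interrupt and the
        # instruction's own synchronous exception are candidates.
        if irq_pending and interrupt_first:
            return step
        # Otherwise the exception wins. The kernel runs with interrupts
        # masked and returns to the faulting loop, so nothing changes.
    return None


with_priority = steps_until_interrupt(interrupt_first=True, irq_at=5)
without_priority = steps_until_interrupt(interrupt_first=False, irq_at=5)
```

With interrupt priority the interrupt is taken at the first user-mode step after it becomes pending. With the opposite priority, and a kernel that never unmasks interrupts itself, the fault loop keeps winning for as long as it runs.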
The purpose of a specification is to define enough for software to be able to work using only defined behavior. A kernel like seL4 needs timer and external hardware interrupts to take priority over software exceptions that can be generated by user code; to leave that implementation-defined renders the specification roughly as useful as if the existence of I think what you and I actually want here is to declare:
I don't think we really specify the timing when interrupts become pending, but this is what cores actually do, and if an interrupt is ready before the memory access is initiated you don't initiate a memory access. |
I'm not familiar with seL4 but if the exception prioritization was implementation defined as proposed then in the hostile/broken user-code case you suggest where there is a tight loop of faulting loads or similar then the interrupt handler will be executed immediately after the exception handler returns with an
Yes I think we want something like that though what 'fetched' means isn't straight-forward. I think wording around 'instruction boundaries' or 'beginning of instruction execution' may be better, e.g.
Some clarification on what should be done on a bus error would also be useful. I'd argue we shouldn't constrain that it must be an asynchronous/imprecise exception but if the 'access fault' cause is specifically for PMP failures another spec defined cause for a generic 'bus error' or 'system fault' or something would be useful. Otherwise everyone will choose their own cause value which doesn't seem like a good idea for something that is a generic issue that will be seen on many systems. You'd be unable to write a kernel that can see an exception was due to a bus error without having to know implementation specifics. |
Can this be closed, as it seems to not have traction? |
Bus errors, and the machine check architecture in general, should be
defined in some TG, and I suspect that is Functional Safety.
But the original title, priority of interrupts vs. exceptions, is already
architecturally defined, right?
So no exception trap should be taken if an interrupt is pending and enabled
- it should take the interrupt, return to the instruction that caused the
exception and re-trap.
If an exception happens inside an interrupt handler - then it may be
irrecoverable unless the interrupt handler has saved all the relevant state.
…On Tue, Aug 10, 2021 at 11:49 AM Joshua Scheid ***@***.***> wrote:
Can this be closed, as it seems to not have traction?
|
My confusion/concern here ultimately stemmed from trying to literally apply the architecture to micro-architectural events. In this case the processor is awaiting a response to a memory transaction and will take a synchronous exception if an error response is seen. Whilst awaiting this an interrupt is raised (as in the physical wire into the processor changes state). Taking that interrupt before you can deal with the potential synchronous exception is tricky/impossible as you end up with a half completed memory instruction that needs to trap if an error response comes back but you can't replay it so it's not as easy as pointing
However it is easily resolved in an architecturally permissible manner. Just because an interrupt line has physically changed state doesn't mean that's architecturally visible yet. You simply state that pending interrupts become visible as you begin instruction execution, so ignoring a change on an interrupt line whilst in the middle of an instruction is permissible. Perhaps some wording saying it's implementation defined when a change in an interrupt line becomes architecturally visible would be useful?
I do think there should be some mention of bus errors in the specification, even if just to state it's implementation defined and some may choose an asynchronous interrupt, others may use a synchronous exception. |
The thread has bifurcated into a discussion of the desire for standardized handling of bus errors and a desire for clarification of the originally quoted remark that "Synchronous exceptions are of lower priority than all interrupts". The bus-error topic will be addressed by an RVI TG.
To the original question: The arrival of most interrupts is asynchronous to the instruction stream, and consequently, implementations are usually offered the luxury of deferring the taking of such interrupts until it's convenient for them to do so. Only at defined points are implementations required to constrain this behavior. Although we're still working on the language, the specific situation you described isn't one of those events, and so it would be valid to take the exception over the interrupt.
What, then, was the point of the remark that "Synchronous exceptions are of lower priority than all interrupts"? That describes what happens when the instruction stream is synchronized to the list of pending interrupts. One such event is executing an MRET. If an interrupt is pending in the MRET's target privilege mode at the time of the MRET, then that interrupt must be taken before any synchronous exception that would've occurred as a result of execution in the target privilege mode. This isn't in conflict with your example.
So, I think there's nothing actionable here, and in a roundabout way you'll actually get what you want. I'm going to close this issue. Feel free to open subsequent issues that are more targeted (but let's keep them to a single topic). |
The way I think about this is that interrupts are taken (logically) between
instructions.
An OOO machine has to pick a "between" point, which means it completes some
outstanding instructions, and flushes the pipe of anything beyond that
point.
Outstanding transactions (outside the core) either complete - or, if
speculative, can be flushed.
If that 'between' point is after an instruction completes with an
exception, the interrupt is taken, and upon return the exception trap is
taken.
That implies re-executing the excepting (and flushed) instructions.
That's something an excepting instruction without an interrupt would do
anyway, after fixing things up so it doesn't get another exception
(and that doesn't always work if there is more than one exception
cause)
A bus error is the only case I can think of that makes that difficult, and
that has to be addressed (sic) at some point.
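
One way to picture choosing a "between" point is the sketch below. It is a simplified model under stated assumptions, not how any particular OOO core works: in-flight instructions are listed in program order as `(pc, completed)` pairs, the core retires the longest completed prefix, and everything younger is flushed and re-executed after the interrupt handler returns. The function name `pick_between_point` is hypothetical.

```python
def pick_between_point(in_flight):
    """Split in-flight instructions (program order) at the first one that
    has not completed.

    Returns (retired_pcs, flushed_pcs, resume_pc), where resume_pc is the
    address the interrupt handler returns to, or None if nothing is flushed.
    """
    cut = 0
    # Retire the longest prefix of instructions that have already completed.
    while cut < len(in_flight) and in_flight[cut][1]:
        cut += 1

    retired = [pc for pc, _ in in_flight[:cut]]
    # Everything from the cut onwards is discarded, even if it had finished
    # executing out of order, and runs again after the handler returns.
    flushed = [pc for pc, _ in in_flight[cut:]]
    resume_pc = flushed[0] if flushed else None
    return retired, flushed, resume_pc


window = [(0x100, True), (0x104, True), (0x108, False), (0x10C, True)]
retired, flushed, resume_pc = pick_between_point(window)
```

The sketch assumes flushed instructions can be re-executed safely. A load whose request has already gone out to a device with read side effects breaks that assumption, which is the difficult bus-error case.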
…On Wed, Aug 11, 2021 at 1:14 AM Andrew Waterman ***@***.***> wrote:
Closed #544.
|