"Progressive" or "dynamic" SARIF #661

davidmalcolm · 2024-09-24T15:11:25Z

SARIF defines an object model, the top level of which is the sarifLog object (§3.13), which contains the results of one or more analysis runs. The runs do not need to be produced by the same analysis tool.

A SARIF log file SHALL contain a serialization of the SARIF object model into the JSON format.

NOTE 1: In the future, other serializations might be defined.

The top-level value in the log file, representing the sarifLog object, SHALL conform to the JSON object grammar; that is, it SHALL consist of a comma-separated sequence of name/value pairs, enclosed in curly brackets, as specified by JSON [RFC8259].

i.e. we have a log, which contains 0 or more run objects, each of which contains 0 or more result objects.
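As a rough illustration of that nesting, here is a minimal sketch in Python. The tool names and messages are made up, and the dictionary is only SARIF-shaped, not a complete or validated log; `count_results` is a hypothetical helper.

```python
# Illustrative only: a SARIF-shaped log with two runs from different tools.
sarif_log = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "tool-a"}},  # made-up tool name
            "results": [
                {"message": {"text": "first finding"}},
                {"message": {"text": "second finding"}},
            ],
        },
        {
            "tool": {"driver": {"name": "tool-b"}},  # made-up tool name
            "results": [],
        },
    ],
}


def count_results(log):
    """Return the number of result objects in each run, in run order."""
    return [len(run.get("results", [])) for run in log.get("runs", [])]


counts = count_results(sarif_log)
print(counts)  # [2, 0]
```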

Is there an implicit assumption in the spec of a kind of "atomic" or "after-the-fact" model in which the log is generated/accessible after all runs have finished?

If so, it might be useful to support a more dynamic/progressive model, in which a consumer might see all or part of the SARIF before all of the results are in.

For example, in #572 and https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3358r0.html#msvc Sy describes how MSVC can send diagnostics in the form of SARIF result objects back to an IDE, and how it's useful to the IDE to be able to start displaying results before the run has completed (in this case, a C++ compiler attempting to compile a C++ source file).

Similarly, for SARIF 3 we're looking to support dynamic analysis, where it seems useful to have a listener be able to listen for results, and dynamically update a UI based on them.

I'm filing this issue to split out this discussion from #572.

Some issues are:

- What can change?
  - Can you merely send notifications that append result objects onto a run? Might you need to append artifacts, rules, or other shared objects (e.g. adding them "on-demand" to the SARIF tree)? Perhaps the endTimeUtc (§3.20.8) property could be set via a notification when a run is complete. If so, would a more general JSON modification protocol be appropriate, e.g. "append array element at a JSON pointer"? When would schema validity be guaranteed? Perhaps there could be an atomic modification set/transaction, where schema validity is assumed before and after a transaction, but not necessarily guaranteed while some of the modifications in the set have yet to be handled.
- What protocols would be in use to send the change messages?
  - e.g. JSON-RPC? Framing via multiple HTTP-style headers (akin to LSP)?
  - https://jsonlines.org/
  - https://en.wikipedia.org/wiki/JSON_streaming#Concatenated_JSON
  - etc.
- How would SARIF producers and consumers work with such a model?
- Are there other tools (SARIF transformers??) that would work with such a model?
- etc.
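To make the "append array element at a JSON pointer" and transaction ideas concrete, here is a minimal sketch in Python of a consumer reading one transaction per line (JSON Lines style). Everything about the message format is an assumption for illustration: the `ops`, `op`, `path`, and `value` field names and the `append`/`set` operation kinds are hypothetical and are not defined by SARIF or by any of the protocols listed above. Schema validation is not performed; the sketch only shows the all-or-nothing behaviour.

```python
import copy
import json


def resolve_pointer(doc, pointer):
    """Follow an RFC 6901 JSON Pointer and return the value it refers to."""
    if pointer == "":
        return doc
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {pointer!r}")
    node = doc
    for token in pointer[1:].split("/"):
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def apply_transaction(log, ops):
    """Apply every op to a copy of the log; the input log is never modified.

    If any op fails, an exception propagates and the caller still holds the
    last consistent log, which is the "atomic modification set" idea.
    """
    draft = copy.deepcopy(log)
    for op in ops:
        if op["op"] == "append":  # hypothetical op: push onto an array
            target = resolve_pointer(draft, op["path"])
            if not isinstance(target, list):
                raise TypeError(f"{op['path']} is not an array")
            target.append(op["value"])
        elif op["op"] == "set":  # hypothetical op: set an object property
            parent_path, _, key = op["path"].rpartition("/")
            key = key.replace("~1", "/").replace("~0", "~")
            parent = resolve_pointer(draft, parent_path)
            if not isinstance(parent, dict):
                raise TypeError(f"{parent_path} is not an object")
            parent[key] = op["value"]
        else:
            raise ValueError(f"unknown op: {op['op']!r}")
    return draft


def consume(log, lines):
    """Yield a new consistent log after each transaction line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        log = apply_transaction(log, json.loads(line)["ops"])
        yield log


# Starting point: one run with no results yet (tool name is made up).
initial = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-tool"}},
        "invocations": [{"executionSuccessful": True}],
        "results": [],
    }],
}

# Two transactions as a producer might emit them, one JSON document per line.
stream = [
    json.dumps({"ops": [{"op": "append", "path": "/runs/0/results",
                         "value": {"message": {"text": "unused variable"}}}]}),
    json.dumps({"ops": [{"op": "set",
                         "path": "/runs/0/invocations/0/endTimeUtc",
                         "value": "2024-09-24T15:11:25Z"}]}),
]

snapshots = list(consume(initial, stream))
```

Each element of `snapshots` is a complete log that a UI could render, so a consumer never observes a half-applied transaction. Copying the whole log per transaction is the simplest way to get that property; a real implementation would likely want something cheaper.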

Or am I missing something here?

davidmalcolm · 2024-09-24T15:23:15Z

Note to self:
DOM has a concept of mutation observers, and mutation records expressing changes to a tree. Not sure if that's overkill for this, but presumably would be very flexible.
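A loose sketch of what that could look like for a SARIF tree, in Python. This is only an analogy to the DOM mechanism, not a port of it: `MutationRecord`, `ObservableLog`, and their methods are hypothetical names, only an "append" kind of change is modelled, and the slash-separated path is a simplified stand-in for a JSON Pointer (no escaping).

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class MutationRecord:
    """One change to the tree (hypothetical shape, loosely modelled on DOM)."""
    kind: str   # only "append" in this sketch
    path: str   # simplified slash-separated path to the array that changed
    value: Any  # the element that was appended


class ObservableLog:
    """Wraps a SARIF-shaped dict and reports changes to observers in batches."""

    def __init__(self, log: dict) -> None:
        self._log = log
        self._observers: List[Callable[[List[MutationRecord]], None]] = []
        self._pending: List[MutationRecord] = []

    def observe(self, callback: Callable[[List[MutationRecord]], None]) -> None:
        self._observers.append(callback)

    def append(self, path: str, value: Any) -> None:
        """Append value to the array at path and queue a record for it."""
        target: Any = self._log
        for token in path.strip("/").split("/"):
            target = target[int(token)] if isinstance(target, list) else target[token]
        target.append(value)
        self._pending.append(MutationRecord("append", path, value))

    def flush(self) -> None:
        """Deliver all queued records to every observer as one batch."""
        records, self._pending = self._pending, []
        if records:
            for callback in self._observers:
                callback(records)


# Usage: a UI-like listener collects the message text of each new result.
seen: List[str] = []
log = ObservableLog({"version": "2.1.0", "runs": [{"results": []}]})
log.observe(lambda records: seen.extend(r.value["message"]["text"] for r in records))
log.append("/runs/0/results", {"message": {"text": "first"}})
log.append("/runs/0/results", {"message": {"text": "second"}})
log.flush()
```

Batching the records until `flush` mirrors how DOM observers receive a list of records rather than one callback per change, which would also fit the transaction idea.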
