Design Rationale
See this page for general implementation advice on how STIX should currently be used. This wiki page tracks why things are the way they are, not how to use them.
The requirements when choosing the first STIX language implementation were:
- Tooling available in most languages that makes it easy to work with
- An accepted "schema" implementation, ideally also supported by tooling across many languages
- Ability to reference other specifications as necessary (e.g. OpenIOC, CIQ)
The best fit for these requirements seemed to be XML. Moving forward, the expectation is that an independent model will be developed and one or more bindings to actual serialization formats published.
The use cases for CybOX patterning that led to the current design were:
- Ability to represent commonly used indicators of compromise (IPs, domains, etc.)
- Ability to expand beyond commonly used IOCs to advanced matching (full files, e-mails, etc.)
- Ability to represent wildcards and other non-equality conditions
Pros of current approach:
- Same structures represent both instances and patterns, meaning similar code can process them
- Patterns directly against the object structures make it easy to see where they apply
Cons of current approach:
- Instances cannot be strictly typed, because patterns require formats that don't match the expected type (e.g. you can't force an IP address to dotted decimal if you want to wildcard it)
- Can be confusing
- Whether an observable is a pattern is denoted per field rather than per observable, so a single observable can be inconsistent (one field has a condition, another doesn't)
- Many implementors neglect to add @condition="Equals" to their pattern observables that check for equality
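The per-field ambiguity listed in the cons above can be illustrated with a minimal sketch. The XML fragment is a hypothetical, namespace-free stand-in for a CybOX object (real documents use namespaced elements and a richer structure), and the helper names `classify_fields` and `is_ambiguous` are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment: one field carries a condition,
# the other does not, so it is unclear whether this is a pattern.
OBSERVABLE = """
<Observable>
  <File>
    <File_Name condition="Contains">invoice</File_Name>
    <File_Extension>exe</File_Extension>
  </File>
</Observable>
"""


def classify_fields(observable_xml):
    """Map each leaf field to its condition, or None if no condition is set."""
    root = ET.fromstring(observable_xml)
    return {
        el.tag: el.get("condition")
        for el in root.iter()
        if len(el) == 0  # leaf elements are the ones that carry values
    }


def is_ambiguous(fields):
    """True when some fields carry a condition and others do not."""
    flags = [cond is not None for cond in fields.values()]
    return any(flags) and not all(flags)


fields = classify_fields(OBSERVABLE)
print(fields)
print(is_ambiguous(fields))
```

A consumer that wants to treat the whole observable as either an instance or a pattern has to make this kind of check itself, because the distinction is not stated at the observable level.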
CybOX Issue #352
Use cases for sightings are:
- TODO
Issue #306
The rationale for allowing both embedded and referenced relationships is:
- Referenced relationships are required because many-to-many relationships are necessary, and these are not possible with embedded relationships.
- Embedded relationships are nice because they allow people to use a simpler approach if they don't need the complexity of referencing relationships.
It has been suggested that either embedded or referenced relationships be removed: #201. Note that if relationships become top-level constructs (per #291) this decision point is irrelevant.
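The difference between the two relationship styles can be sketched as follows. This is a minimal illustration, assuming a simplified, namespace-free fragment in which `id` and `idref` attributes link constructs; the element names and the helper `ttp_titles` are hypothetical and do not reflect the full STIX schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: indicators 1 and 2 share one TTP by reference
# (many-to-one), while indicator 3 embeds its own TTP inline.
DOC = """
<Package>
  <TTP id="example:ttp-1"><Title>Phishing</Title></TTP>
  <Indicator id="example:indicator-1">
    <Indicated_TTP><TTP idref="example:ttp-1"/></Indicated_TTP>
  </Indicator>
  <Indicator id="example:indicator-2">
    <Indicated_TTP><TTP idref="example:ttp-1"/></Indicated_TTP>
  </Indicator>
  <Indicator id="example:indicator-3">
    <Indicated_TTP><TTP><Title>Malware X</Title></TTP></Indicated_TTP>
  </Indicator>
</Package>
"""


def ttp_titles(package_xml):
    """Return {indicator id: TTP title}, following idref where present."""
    root = ET.fromstring(package_xml)
    # Index every element that declares an id so references can be resolved.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    titles = {}
    for indicator in root.findall("Indicator"):
        ttp = indicator.find("Indicated_TTP/TTP")
        idref = ttp.get("idref")
        target = by_id[idref] if idref else ttp  # referenced vs. embedded
        titles[indicator.get("id")] = target.findtext("Title")
    return titles


print(ttp_titles(DOC))
```

The embedded case needs no lookup at all, which is the simplicity argument; the referenced case needs an ID index but lets several indicators point at the same TTP.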
The rationale behind using xsi:type (http://stixproject.github.io/documentation/concepts/xsi-type/) was that it allowed for de-coupling of the schemas such that:
- Component schemas (indicators, campaigns, etc.) were de-coupled from each other
- Extension schemas were de-coupled from core
- Controlled vocabs were de-coupled from where they are used
In retrospect, a negative of this is that it's difficult to understand and implement if you're working with the XML directly in code or even manually (i.e. not via bindings and APIs).
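As a rough illustration of that difficulty, the sketch below resolves an xsi:type value by hand using only the Python standard library. The namespace URIs under `example.com` and the function name `resolve_xsi_type` are assumptions made for this example; only the XML Schema instance namespace is the real one.

```python
import io
import xml.etree.ElementTree as ET

# ElementTree exposes xsi:type under its fully qualified attribute name.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical document; the example.com namespace URIs are placeholders.
DOC = (
    '<stix:Indicator xmlns:stix="http://example.com/stix" '
    'xmlns:indicator="http://example.com/indicator" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:type="indicator:IndicatorType" id="example:indicator-1"/>'
)


def resolve_xsi_type(xml_text):
    """Return (namespace URI, local type name) for the root's xsi:type."""
    prefixes = {}
    root = None
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            # The attribute value is a prefixed string, so the prefix
            # declarations must be captured separately during parsing.
            # Simplification: prefixes are treated as document-wide.
            prefix, uri = payload
            prefixes[prefix] = uri
        elif root is None:
            root = payload
    raw = root.get(XSI_TYPE)
    if raw is None:
        return None
    prefix, _, local = raw.rpartition(":")
    return prefixes.get(prefix), local


print(resolve_xsi_type(DOC))
```

The type is stored as a prefixed string inside an attribute value, so a consumer has to track prefix declarations itself before it can tell which schema type an element actually uses.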
Issue #311
- TODO: Explain how and why malware, attack patterns, infrastructure, etc. are all types of TTPs
Schemas were versioned separately to show how each evolved over time, with the intention that each release would change some schemas and leave others untouched. Separate versioning would also allow component schemas to be released outside the main STIX release cycle.
In retrospect, a negative of this is that it can be confusing. The @version attribute on components in particular has led to confusion many times. We've also not seen many cases where a document consists of schemas from more than one STIX release (and in any case they would likely not be compatible with each other due to changes in STIX Common).
Issue #312
The primary use case that led to QName IDs was the ability to have a namespace for the IDs. This helps to prevent ID collisions (in particular if a tool is not using GUIDs) and indicates which producer created the construct.
It has been suggested that STIX move to URI-based IDs. This is more in line with the W3C specification on using QNames as IDs but the approach would need to ensure unique IDs and indicate which producer created the content (assuming that is still a requirement).
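The namespacing use case for QName IDs can be sketched as below. This is an illustration only: the producer prefixes, the namespace URIs, and the helpers `make_id` and `producer_of` are hypothetical, and the `prefix:construct-uuid` shape is an assumed convention rather than a stated requirement.

```python
import uuid

# Hypothetical producer namespaces, keyed by the prefix used in IDs.
PRODUCERS = {
    "acme": "http://acme.example.com/",
    "other": "http://other.example.org/",
}


def make_id(prefix, construct):
    """Build a namespaced ID such as 'acme:indicator-<uuid>'."""
    return f"{prefix}:{construct}-{uuid.uuid4()}"


def producer_of(qname_id, producers=PRODUCERS):
    """Split a namespaced ID into (producer namespace URI, local part)."""
    prefix, sep, local = qname_id.partition(":")
    if not sep or not local:
        raise ValueError(f"not a namespaced ID: {qname_id!r}")
    return producers[prefix], local


# Two producers can reuse the same local part without colliding,
# because the expanded (namespace, local) pairs differ.
a = producer_of("acme:indicator-1")
b = producer_of("other:indicator-1")
print(a, b, a != b)
```

Any replacement scheme, such as URI-based IDs, would need to preserve both properties shown here: uniqueness across producers and the ability to recover the producer from the ID.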
Issue #301