3.6 Structured Data Queries (XPath, JSONPath, YamlPath)

Gabe Stocco edited this page Sep 27, 2022 · 2 revisions

Background

Starting with Application Inspector 1.6, you can restrict a rule's patterns to run only on selected elements of YAML, XML, and JSON documents.

Below are several examples drawn from the documents and rules used in the test cases. Use the jsonpaths field to specify an OR'd list of JSONPaths, the ymlpaths field to specify an OR'd list of YamlPaths, and the xpaths field to specify an OR'd list of XPaths. The pattern is then applied only to the values those paths select.

XML Rule Example

Sample XML Document

Imagine you are scanning XML documents like the one below (a pom.xml file) and you'd like to check the value of the java.version tag. This is difficult to do with a regular expression alone because the properties may appear in any order.

<?xml version="1.0" encoding="UTF-8"?>
<project>
  <modelVersion>4.0.0</modelVersion>

  <groupId>xxx</groupId>
  <artifactId>xxx</artifactId>
  <version>0.1.0-SNAPSHOT</version>
  <packaging>pom</packaging>

  <name>${project.groupId}:${project.artifactId}</name>
  <description />

  <properties>
    <java.version>17</java.version>
  </properties>

</project>

Sample XML Rule

Using an XPath-based query, you can easily write such a rule.

[{
    "name": "Source code: Java 17",
    "id": "CODEJAVA000000",
    "description": "Java 17 maven configuration",
    "applies_to_file_regex": [
      "pom.xml"
    ],
    "tags": [
      "Code.Java.17"
    ],
    "severity": "critical",
    "patterns": [
      {
        "pattern": "17",
        "xpaths" : ["/project/properties/java.version"],
        "type": "regex",
        "scopes": [
          "code"
        ],
        "modifiers": [
          "i"
        ],
        "confidence": "high"
      }
    ]
  }]
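
To make the effect of the xpaths restriction concrete, here is a minimal Python sketch of the same idea using the standard library's xml.etree.ElementTree. It is an illustration only and not how Application Inspector evaluates rules. ElementTree supports a limited XPath subset and expects paths relative to the root element, so the rule's /project/properties/java.version is written as ./properties/java.version. The helper name match_at_xpath and the second document, POM_OTHER, are hypothetical and exist only for this example; re.IGNORECASE stands in for the "i" modifier.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the sample pom.xml above.
POM = """<?xml version="1.0" encoding="UTF-8"?>
<project>
  <modelVersion>4.0.0</modelVersion>
  <version>0.1.0-SNAPSHOT</version>
  <properties>
    <java.version>17</java.version>
  </properties>
</project>
"""

# Hypothetical variant: "17" appears in the project version, not in java.version.
POM_OTHER = """<?xml version="1.0" encoding="UTF-8"?>
<project>
  <version>17.0.1</version>
  <properties>
    <java.version>11</java.version>
  </properties>
</project>
"""


def match_at_xpath(xml_text, relative_xpath, pattern):
    """Return the text of each selected element whose text matches pattern."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    selected = root.findall(relative_xpath)
    return [
        el.text
        for el in selected
        if el.text and re.search(pattern, el.text, re.IGNORECASE)
    ]


# The pattern is only tried against the java.version element.
print(match_at_xpath(POM, "./properties/java.version", "17"))
print(match_at_xpath(POM_OTHER, "./properties/java.version", "17"))

# A regex over the raw text cannot tell which element a match came from.
print(bool(re.search("17", POM_OTHER)))
```

The second document shows the point of the restriction: a plain regex over the file matches the unrelated project version, while the path-restricted check reports nothing.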

Sample XML Document with Namespace

You may be querying XML documents that declare a namespace. In that case you will need to use the XPath local-name() function or specify the namespace explicitly.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>xxx</groupId>
  <artifactId>xxx</artifactId>
  <version>0.1.0-SNAPSHOT</version>
  <packaging>pom</packaging>

  <name>${project.groupId}:${project.artifactId}</name>
  <description />

  <properties>
    <java.version>17</java.version>
  </properties>

</project>

Sample XML Rule for Document with Namespace

[{
    "name": "Source code: Java 17",
    "id": "CODEJAVA000000",
    "description": "Java 17 maven configuration",
    "applies_to_file_regex": [
      "pom.xml"
    ],
    "tags": [
      "Code.Java.17"
    ],
    "severity": "critical",
    "patterns": [
      {
        "pattern": "17",
        "xpaths" : ["/*[local-name(.)='project']/*[local-name(.)='properties']/*[local-name(.)='java.version']"],
        "type": "regex",
        "scopes": [
          "code"
        ],
        "modifiers": [
          "i"
        ],
        "confidence": "high"
      }
    ]
  }]
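
The reason an unqualified path stops matching once a default namespace is declared can be reproduced with a short Python sketch. It uses xml.etree.ElementTree purely for illustration. ElementTree does not implement local-name(), so the sketch names the namespace explicitly through a prefix map instead; the prefix m is an arbitrary choice made for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the namespaced pom.xml above.
POM_NS = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <properties>
    <java.version>17</java.version>
  </properties>
</project>
"""

root = ET.fromstring(POM_NS)

# With a default namespace, every element name is namespace-qualified,
# so the unqualified path selects nothing.
unqualified = root.findall("./properties/java.version")

# Binding a prefix to the namespace URI lets the same path match again.
ns = {"m": "http://maven.apache.org/POM/4.0.0"}
qualified = root.findall("./m:properties/m:java.version", ns)

print(len(unqualified))
print([el.text for el in qualified])
```

The local-name() form used in the rule above reaches the same elements without having to declare a prefix, which is convenient when a rule only carries a path string.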

JSON Rule Example

Sample JSON Document

The sample JSON document here is a hypothetical listing from a bookshop.

{
    "books":
    [
        {
            "category": "fiction",
            "title" : "A Wild Sheep Chase",
            "author" : "Haruki Murakami",
            "price" : 22.72
        },
        {
            "category": "fiction",
            "title" : "The Night Watch",
            "author" : "Sergei Lukyanenko",
            "price" : 23.58
        },
        {
            "category": "fiction",
            "title" : "The Comedians",
            "author" : "Graham Greene",
            "price" : 21.99
        },
        {
            "category": "memoir",
            "title" : "The Night Watch",
            "author" : "David Atlee Phillips",
            "price" : 260.90
        },
        {
            "category": "memoir",
            "title" : "The Autobiography of Benjamin Franklin",
            "author" : "Benjamin Franklin",
            "price" : 123.45
        }
    ]
}

Sample JSON Rule

This sample rule finds books whose title contains Franklin.

[
    {
        "id": "SA000005",
        "name": "Testing.Rules.JSON",
        "tags": [
            "Testing.Rules.JSON"
        ],
        "severity": "Critical",
        "description": "This rule finds books from the JSON titled with Franklin.",
        "patterns": [
            {
                "pattern": "Franklin",
                "type": "string",
                "confidence": "High",
                "scopes": [
                    "code"
                ],
                "jsonpaths" : ["$.books[*].title"]
            }
        ],
        "_comment": ""
    }
]
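
As a rough illustration of what the $.books[*].title restriction changes, the Python sketch below resolves that one path by hand with the standard json module (the standard library has no JSONPath support, and this is not Application Inspector's implementation). The document is a shortened copy of the bookshop listing above.

```python
import json

# Two entries from the bookshop listing above.
BOOKSHOP = """{"books": [
  {"category": "fiction", "title": "The Comedians",
   "author": "Graham Greene", "price": 21.99},
  {"category": "memoir", "title": "The Autobiography of Benjamin Franklin",
   "author": "Benjamin Franklin", "price": 123.45}
]}"""

doc = json.loads(BOOKSHOP)

# Hand-written equivalent of $.books[*].title: the title of every entry in books.
titles = [book["title"] for book in doc["books"]]
title_hits = [title for title in titles if "Franklin" in title]

# Searching the raw text also counts the author field of the same book.
raw_hits = BOOKSHOP.count("Franklin")

print(title_hits)
print(raw_hits)
```

Only the title is reported when the path is applied, whereas a search over the raw text sees Franklin twice, once in the title and once in the author field.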

YAML Rule Example

Sample YAML Document

hash_name:
  a_key: 0
  b_key: 1
  c_key: 2
  d_key: 3
  e_key: 4

Sample YAML Rule

[{
    "name": "YamlPathValidate",
    "id": "YmlPath",
    "description": "find documents where a_key as subkey of hash_name is 0",
    "severity": "critical",
    "patterns": [
      {
        "pattern": "0",
        "ymlpaths" : ["/hash_name/a_key"],
        "type": "string",
        "scopes": [
          "code"
        ],
        "modifiers": [
          "i"
        ],
        "confidence": "high"
      }
    ]
  }]

XML and JSON Combined Example

You may have conceptually similar data stored in two different formats, at different paths. You can create a rule with a single pattern that applies to both.

XML and JSON Sample Documents

XML

<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
    <book genre="autobiography" publicationdate="1981-03-22" ISBN="1-861003-11-0">
        <title>The Autobiography of Benjamin Franklin</title>
        <author>
            <first-name>Benjamin</first-name>
            <last-name>Franklin</last-name>
        </author>
        <price>8.99</price>
    </book>
    <book genre="novel" publicationdate="1967-11-17" ISBN="0-201-63361-2">
        <title>The Confidence Man</title>
        <author>
            <first-name>Herman</first-name>
            <last-name>Melville</last-name>
        </author>
        <price>11.99</price>
    </book>
    <book genre="philosophy" publicationdate="1991-02-15" ISBN="1-861001-57-6">
        <title>The Gorgias</title>
        <author>
            <name>Plato</name>
        </author>
        <price>9.99</price>
    </book>
</bookstore>

JSON

{
    "books":
    [
        {
            "category": "fiction",
            "title" : "A Wild Sheep Chase",
            "author" : "Haruki Murakami",
            "price" : 22.72
        },
        {
            "category": "fiction",
            "title" : "The Night Watch",
            "author" : "Sergei Lukyanenko",
            "price" : 23.58
        },
        {
            "category": "fiction",
            "title" : "The Comedians",
            "author" : "Graham Greene",
            "price" : 21.99
        },
        {
            "category": "memoir",
            "title" : "The Night Watch",
            "author" : "David Atlee Phillips",
            "price" : 260.90
        },
        {
            "category": "memoir",
            "title" : "The Autobiography of Benjamin Franklin",
            "author" : "Benjamin Franklin",
            "price" : 123.45
        }
    ]
}

Combined XML and JSON Rule

This rule finds books with Franklin in the title, located at either the JSONPath or the XPath specified.

[
    {
        "id": "SA000005",
        "name": "Testing.Rules.JSONandXML",
        "tags": [
            "Testing.Rules.JSON.JSONandXML"
        ],
        "severity": "Critical",
        "description": "This rule finds books from the JSON or XML titled with Franklin.",
        "patterns": [
            {
                "pattern": "Franklin",
                "type": "string",
                "confidence": "High",
                "scopes": [
                    "code"
                ],
                "jsonpaths" : ["$.books[*].title"],
                "xpaths" : ["/bookstore/book/title"]
            }
        ],
        "_comment": ""
    }
]
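
The idea of one pattern served by two path lists can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Application Inspector's logic: the helper names selected_values and matches are hypothetical, the format is guessed by trying to parse the text as JSON first, and both paths are resolved by hand (./book/title is the relative form of /bookstore/book/title, and the list comprehension stands in for $.books[*].title). The documents are shortened copies of the samples above.

```python
import json
import xml.etree.ElementTree as ET

XML_DOC = """<bookstore>
  <book genre="autobiography">
    <title>The Autobiography of Benjamin Franklin</title>
    <author><last-name>Franklin</last-name></author>
  </book>
  <book genre="novel">
    <title>The Confidence Man</title>
  </book>
</bookstore>"""

JSON_DOC = """{"books": [
  {"title": "The Night Watch", "author": "Sergei Lukyanenko"},
  {"title": "The Autobiography of Benjamin Franklin", "author": "Benjamin Franklin"}
]}"""


def selected_values(text):
    """Collect the title values from whichever format the text parses as."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON, so treat it as XML (an assumption of this sketch).
        root = ET.fromstring(text)
        return [el.text or "" for el in root.findall("./book/title")]
    return [book["title"] for book in doc["books"]]


def matches(text, pattern="Franklin"):
    """Return the selected values that contain the pattern."""
    return [value for value in selected_values(text) if pattern in value]


print(matches(XML_DOC))
print(matches(JSON_DOC))
```

In both documents the author fields also contain Franklin, but only the titles are reported because only the titles are selected by the paths.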