conflicting ID-types for attribute "id" #211

Open
sthibaul opened this issue Aug 31, 2016 · 15 comments

@sthibaul

Hello,

As reported by Vincent Lefevre in Debian bug report http://bugs.debian.org/834555 :


jing yields an error on a valid XML file (neither xmllint, nor
Emacs nXML complain).

Consider the following files:

==> tdb.xml <==

.

==> tdb.rnc <==
default namespace = "http://localhost/"

include "/usr/share/xml/docbook/schema/rng/5.0/docbook.rnc" { start |= notAllowed }

root =
element root {
attribute xml:id { xsd:ID },
db.para
}

start = root

==> tdb.rng <==

<grammar ns="http://localhost/" xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
















Note: I generated tdb.rng with "trang tdb.rnc tdb.rng" and updated
the path to docbook.rng to reuse the schemas from the docbook5-xml
package.

I get the following error:

zira:~> jing tdb.rng tdb.xml
[warning] /usr/bin/jing: No java runtime was found
/usr/share/xml/docbook/schema/rng/5.0/docbook.rng:83:16: error: conflicting ID-types for attribute "id" from namespace "http://www.w3.org/XML/1998/namespace" of element "root" from namespace "http://localhost/"

while with xmllint from libxml2-utils:

zira:~> xmllint --noout --relaxng tdb.rng tdb.xml
tdb.xml validates

and when I open tdb.xml in Emacs, it is said:

-UUU:----F1 tdb.xml All L1 (nXML Valid) --------------
Using schema ~/tdb.rnc

@sthibaul
Author

Of course GitHub mangled the XML code... The files are attached.

@sthibaul
Author

@sthibaul
Author

(I had to append .txt extensions for GitHub to be happy...)

@georgebina
Contributor

georgebina commented Sep 1, 2016

ID checking is defined by the RELAX NG DTD Compatibility specification. Because it is inherited from DTDs, the ID-type of an attribute must be consistent across every pattern that can match a given element.
The problem usually appears when the same content can also be matched by a wildcard-like pattern (any element with any content and any attribute): the wildcard-like pattern gives the attribute no ID-type, while the more concrete pattern that defines the element and the attribute gives it the ID type.
If you look at the indicated line in docbook.rng, you will see a wildcard-like pattern there. It matches xml:id with no ID-type, whereas your schema defines xml:id as an ID, hence the error.
You can turn off ID checking in Jing (look for the available options), or you can change the schema to avoid the problem. One possibility is to change the "any" pattern to exclude xml:id and match it explicitly as an ID, something like below:

    <define name="db._any.attribute">
      <choice>
        <attribute>
          <a:documentation>Any attribute including any attribute in any
            namespace.</a:documentation>
          <anyName>
            <except>
              <name>xml:id</name>
            </except>
          </anyName>
        </attribute>
        <attribute name="xml:id">
          <data type="ID"/>
        </attribute>
      </choice>
    </define>

Regards,
George

@vinc17fr

vinc17fr commented Sep 1, 2016

But db._any isn't used anywhere in the XML file. There may be an error in the DocBook 5 schema (for instance, if an XML file uses xml:id somewhere in MathML contents as a descendant of a DocBook element, the user wouldn't get what he may expect), but here xml:id doesn't appear as a descendant of a DocBook element, so that db._any isn't used and there shouldn't be any error.

@georgebina
Contributor

The thing is that a "root" element could appear in that area of an instance document, and in that case the processor would not know which ID-type to assign to the xml:id attribute. This is a static error found by analysing the schema, not a runtime error on a specific instance document. I mentioned the options above; one is to turn off ID checking.

Best Regards,
George

@vinc17fr

vinc17fr commented Sep 1, 2016

If a "root" element appears in db._any, then the xml:id attribute would have type text in this context, because this is what the grammar says. Consider the following XML file:

<?xml version="1.0" encoding="utf-8"?>
<root xmlns="http://localhost/" xml:id="foo">
  <para xmlns="http://docbook.org/ns/docbook" linkend="foo">
    <inlineequation>
      <foo xmlns="http://www.w3.org/1998/Math/MathML">
        <root xmlns="http://localhost/" xml:id="bar"/>
      </foo>
    </inlineequation>
  </para>
</root>

Here, the xml:id="foo" would be of type ID because one has start = element root { attribute xml:id { xsd:ID }, db.para }. However the other "root" element is part of db._any, with db._any = element * - (db:* | html:*) { (db._any.attribute | text | db._any)* } and db._any.attribute = attribute * { text }, so that the type of xml:id="bar" would be text.

That said, the "xml:" namespace is special, as it is standard. https://www.w3.org/XML/1998/namespace says: "The xml:id specification defines a single attribute, xml:id, known to be of type ID independently of any DTD or schema." Note the "independently". So, because of this, xml:id="bar" is of type ID. This is how libxml2 behaves (I've checked, replacing linkend="foo" by linkend="bar"). That's probably why the DocBook 5 schema doesn't exclude xml:* in db._any.

Note also that ID checking is useful; I don't want to turn it off. Currently, jing cannot work with any serious schema that mixes DocBook 5 and another namespace.

@georgebina
Contributor

I agree that this is one of the major pain points with Relax NG, but I do not know the best way forward...
Ideally, wildcard name classes like anyName or nsName should not contribute to identifying the ID-type, and ID-type assignment should be done only using the information from elements/attributes specified with specific names. A similar issue appears for DITA 1.3, which uses Relax NG as its normative schema, and there we need to exclude some element names to get the schemas working.
Maybe the best solution would be an update to the DTD compatibility spec http://relaxng.org/compatibility-20011203.html#id to say that the anyName and nsName name classes should not be considered when checking whether two element/attribute to ID-type mappings compete; Jing could then be updated accordingly.
Maybe @jclark can share some insight on this.

Regards,
George
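
To make the suggested relaxation concrete, here is a minimal Python sketch comparing the current rule with the proposed one. It is not Jing's implementation: the `AttrDecl` type, the `conflicting_names` function and the use of `"*"` to stand for any anyName/nsName wildcard are all assumptions made for illustration, and the sketch assumes the containing element definitions already compete.

```python
from typing import List, NamedTuple, Optional, Set


class AttrDecl(NamedTuple):
    """One attribute pattern on competing element definitions (simplified)."""
    name: str                # attribute name, or "*" for an anyName/nsName wildcard
    id_type: Optional[str]   # "ID", "IDREF", "IDREFS", or None for no ID-type


def conflicting_names(decls: List[AttrDecl], ignore_wildcards: bool = False) -> Set[str]:
    """Return attribute names whose overlapping declarations disagree on ID-type."""
    if ignore_wildcards:
        # Proposed relaxation: wildcard name classes take no part in the check.
        decls = [d for d in decls if d.name != "*"]
    found = set()
    for i, a in enumerate(decls):
        for b in decls[i + 1:]:
            # Simplification: a wildcard is assumed to overlap every name.
            overlap = a.name == b.name or "*" in (a.name, b.name)
            if overlap and a.id_type != b.id_type:
                found.add(b.name if a.name == "*" else a.name)
    return found


# DocBook-style wildcard attribute plus the explicit xml:id from tdb.rnc.
docbook_case = [AttrDecl("*", None), AttrDecl("xml:id", "ID")]
# A named attribute declared both as ID and as plain text.
named_case = [AttrDecl("myid", "ID"), AttrDecl("myid", None)]

print(conflicting_names(docbook_case))                         # current rule
print(conflicting_names(docbook_case, ignore_wildcards=True))  # proposed rule
print(conflicting_names(named_case, ignore_wildcards=True))    # proposed rule
```

Under these assumptions the wildcard case stops being reported once wildcards are ignored, while two explicitly named declarations that disagree are still reported, which matches the DTD restriction the spec is meant to preserve.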

@vinc17fr

vinc17fr commented Sep 1, 2016

If this can be useful in tests, here are two standalone examples.

The first example

<?xml version="1.0" encoding="utf-8"?>
<ex1>
  <foo xml:id="a">
    <bar ref="a b">
      <foo xml:id="b"/>
    </bar>
  </foo>
</ex1>

The corresponding schema:

start =
  element ex1 {
    element foo {
      attribute xml:id { xsd:ID }?,
      element bar {
        attribute ref { xsd:IDREFS }?,
        element foo {
          attribute xml:id { text }?
        }
      }
    }
  }

<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <element name="ex1">
      <element name="foo">
        <optional>
          <attribute name="xml:id">
            <data type="ID"/>
          </attribute>
        </optional>
        <element name="bar">
          <optional>
            <attribute name="ref">
              <data type="IDREFS"/>
            </attribute>
          </optional>
          <element name="foo">
            <optional>
              <attribute name="xml:id"/>
            </optional>
          </element>
        </element>
      </element>
    </element>
  </start>
</grammar>

For this first example, according to the "xml:" namespace specifications, xml:id is always of type ID (what's inside attribute xml:id { } should be ignored). For this reason, I don't think there is a DTD compatibility issue concerning this example (this will be different in the second example, which I assume is less common).

The second example

<?xml version="1.0" encoding="utf-8"?>
<ex2>
  <foo myid="a">
    <bar ref="a">
      <foo myid="b"/>
    </bar>
  </foo>
</ex2>

The corresponding schema:

start =
  element ex2 {
    element foo {
      attribute myid { xsd:ID }?,
      element bar {
        attribute ref { xsd:IDREFS }?,
        element foo {
          attribute myid { text }?
        }
      }
    }
  }

<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <element name="ex2">
      <element name="foo">
        <optional>
          <attribute name="myid">
            <data type="ID"/>
          </attribute>
        </optional>
        <element name="bar">
          <optional>
            <attribute name="ref">
              <data type="IDREFS"/>
            </attribute>
          </optional>
          <element name="foo">
            <optional>
              <attribute name="myid"/>
            </optional>
          </element>
        </element>
      </element>
    </element>
  </start>
</grammar>

Here, I've replaced the standard xml:id by myid. So, the first myid instance myid="a" is of type ID, but not the second myid instance myid="b" (the validation should fail if ref="a" is replaced by ref="b"). Again, libxml2 behaves that way.

@georgebina
Contributor

Please note that this functionality is part of the DTD compatibility specification, and that means you cannot have two different declarations for the same attribute in the same element, because you cannot have that in a DTD.

@vinc17fr

vinc17fr commented Sep 1, 2016

I suggest two possibilities:

  1. In a schema, for xml:id attributes, ignore the specified type and force it to ID as required by the XML specs. This should solve the issue with DocBook 5 and the first example (but not with the second example).
  2. Add an option to ignore RELAX NG DTD Compatibility, while still checking ID/IDREF/IDREFS as specified in the validity constraints on the XML attribute types. This would solve issues in simple cases, but not in the first example, as one would get an error because the IDREF "b" does not have a matching ID; in practice, such errors would occur when a DocBook 5 document has an IDREF attribute referring to an ID found in MathML or SVG (this is where db._any is involved) contained in the document.
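
As a rough illustration of the first possibility, the following Python sketch treats every xml:id attribute as an ID regardless of the schema, then checks uniqueness and reference resolution on the first standalone example. It uses only the standard library; the `check_ids` helper and the hard-coded list of IDREFS attributes (`ref`) are assumptions, since a real validator would take the reference attributes from the schema.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

EX1 = b"""<?xml version="1.0" encoding="utf-8"?>
<ex1>
  <foo xml:id="a">
    <bar ref="a b">
      <foo xml:id="b"/>
    </bar>
  </foo>
</ex1>
"""


def check_ids(xml_bytes, idrefs_attrs=("ref",)):
    """Return (duplicate IDs, dangling references), treating every xml:id as an ID."""
    root = ET.fromstring(xml_bytes)
    seen, duplicates = set(), []
    # First pass: collect every xml:id value, whatever the schema says its type is.
    for element in root.iter():
        value = element.get(XML_ID)
        if value is None:
            continue
        if value in seen:
            duplicates.append(value)
        seen.add(value)
    # Second pass: every whitespace-separated token must name a collected ID.
    dangling = []
    for element in root.iter():
        for attr in idrefs_attrs:
            for token in element.get(attr, "").split():
                if token not in seen:
                    dangling.append(token)
    return duplicates, dangling


print(check_ids(EX1))
```

With both xml:id values counted as IDs, the reference "a b" resolves. If only the outer xml:id were typed as ID, as under the second possibility, "b" would be reported as dangling.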

@sideshowbarker
Contributor

sideshowbarker commented Nov 1, 2018

Is this an outright bug that we ideally should fix in the sources? Or is it more of an enhancement request?

@sthibaul
Author

sthibaul commented Nov 1, 2018

Well, it looks like a bug: the tool says the tdb.xml file is invalid while it is valid.

@georgebina
Contributor

The conflicting ID-type error is not reported on the XML document; it is a problem reported on the schema, and it is related to the DTD compatibility spec [1]. DTD compatibility ID checking is controlled by an option [2], so you can disable it. This check does what the DTD compatibility spec says, so it is not a problem in Jing. If the spec is updated, Jing can follow the updated spec.

[1] https://www.oasis-open.org/committees/relax-ng/compatibility-20011203.html#id

if its attribute parent has any competing attribute elements, then each such competing attribute element has a data or value child specifying a datatype associated with the same ID-type. Two attribute elements
<attribute> nc1 p1 </attribute>
and
<attribute> nc2 p2 </attribute>
compete if and only if the containing definitions compete and there is a name n that belongs to both nc1 and nc2. Note that a definition competes with itself.
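
The "there is a name n that belongs to both nc1 and nc2" condition can be sketched in Python as follows. This is a simplified model rather than Jing's code: the `Name` and `AnyName` classes and the `share_a_name` function are hypothetical, nsName is left out, and an except clause is restricted to a finite list of names.

```python
from dataclasses import dataclass
from typing import Tuple, Union

XML_NS = "http://www.w3.org/XML/1998/namespace"
QName = Tuple[str, str]  # (namespace URI, local name)


@dataclass(frozen=True)
class Name:
    """A <name> name class: exactly one qualified name."""
    qname: QName

    def contains(self, qname: QName) -> bool:
        return qname == self.qname


@dataclass(frozen=True)
class AnyName:
    """An <anyName> name class with an optional finite <except> list of names."""
    excluded: Tuple[QName, ...] = ()

    def contains(self, qname: QName) -> bool:
        return qname not in self.excluded


NameClass = Union[Name, AnyName]


def share_a_name(nc1: NameClass, nc2: NameClass) -> bool:
    """True if some name n belongs to both name classes."""
    if isinstance(nc1, Name):
        return nc2.contains(nc1.qname)
    if isinstance(nc2, Name):
        return nc1.contains(nc2.qname)
    # Two wildcards minus finitely many names always have a name in common.
    return True


xml_id = Name((XML_NS, "id"))
plain_wildcard = AnyName()
wildcard_without_xml_id = AnyName(excluded=((XML_NS, "id"),))

print(share_a_name(plain_wildcard, xml_id))           # attribute * vs xml:id
print(share_a_name(wildcard_without_xml_id, xml_id))  # wildcard with xml:id excepted
```

In this model a plain wildcard shares the name xml:id with the explicit declaration, so the two attribute patterns compete, whereas a wildcard that excepts xml:id does not, which is why excluding xml:id from the "any" attribute pattern avoids the error.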

[2] http://www.thaiopensource.com/relaxng/jing.html

-i
Disables checking of ID/IDREF/IDREFS. By default, Jing enforces the constraints imposed by RELAX NG DTD Compatibility with respect to ID/IDREF/IDREFS.

@vinc17fr

vinc17fr commented Nov 1, 2018

The -i option is not OK, since I still want ID checking. Compare with xmllint --relaxng, for instance.
