Lesson: make your terminology aware of xml namespaces
This Tutorial is known to work with om version 3.0.4.
Please update this wiki to reflect any other versions that have been tested.
- Define Terms in a Terminology that refer to XML elements with specific attribute values
Unlike in the previous lessons, suppose you want to create MODS XML that uses the mods namespace and has a root node of <mods> instead of <fields>:
<mods version="3.0" xmlns="http://www.loc.gov/mods/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<name>
<namePart type="given">Zoia</namePart>
<namePart type="family">Horn</namePart>
<role>
<roleTerm type="text">Author</roleTerm>
<roleTerm type="code">AUT</roleTerm>
</role>
</name>
</mods>
Disclaimer about MODS: we are not actually creating valid MODS XML here. We're just using the mods namespace as an example.
Reopen fancy_book_metadata.rb and modify the line that calls t.root so that it declares a path of "mods" instead of "fields" and an :xmlns that points to the URI of your namespace.
t.root(:path=>"mods", :xmlns=>"http://www.loc.gov/mods/v3")
You also need to update the .xml_template method to match these changes:
def self.xml_template
Nokogiri::XML.parse('<mods xmlns="http://www.loc.gov/mods/v3"/>')
end
Setting the :xmlns on the root of an OM Terminology makes it the default namespace for all of the Terms.
bundle console
Require the FancyBookMetadata class definition.
require "./fancy_book_metadata"
fancybook = FancyBookMetadata.new
Now rerun the same commands we ran in the last lesson and see what's different about the resulting XML.
fancybook.name.given_name = "Zoia"
=> "Zoia"
fancybook.name.family_name = "Horn"
=> "Horn"
fancybook.name.role.text = "author"
=> "author"
fancybook.name.role.code = "AUT"
=> "AUT"
fancybook.name(1).family_name = "Caesar"
=> "Caesar"
fancybook.name(1).given_name = "Julius"
=> "Julius"
fancybook.name(1).role.text = "Contributor"
=> "Contributor"
fancybook.name(1).role.code = "CON"
=> "CON"
puts fancybook.to_xml
<mods xmlns="http://www.loc.gov/mods/v3">
<name><namePart type="given">Zoia</namePart><namePart type="family">Horn</namePart><role><roleTerm type="text">author</roleTerm><roleTerm type="code">AUT</roleTerm></role></name>
<name><namePart type="family">Caesar</namePart><namePart type="given">Julius</namePart><role><roleTerm type="text">Contributor</roleTerm><roleTerm type="code">CON</roleTerm></role></name>
</mods>
=> nil
Now the mods namespace is declared as the xmlns on the root of the XML document. Though this looks like a small change, it has an important impact on how XPath queries are run against any XML documents you parse with this Terminology.
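To see what a default namespace means for the generated XML, here is a minimal sketch. It assumes Ruby's bundled REXML library rather than the Nokogiri parser that OM uses, and the helper name `element_namespaces` is hypothetical. It reports the namespace of the root, its first child, and that child's first child in a document shaped like the output above.

```ruby
require "rexml/document"

# Hypothetical helper: list [element name, namespace URI] for the root,
# its first child element, and that child's first child element.
def element_namespaces(xml)
  root = REXML::Document.new(xml).root
  child = root.elements[1]        # first child element (REXML indexes from 1)
  grandchild = child.elements[1]
  [root, child, grandchild].map { |el| [el.name, el.namespace] }
end

xml = '<mods xmlns="http://www.loc.gov/mods/v3">' \
      '<name><namePart type="given">Zoia</namePart></name>' \
      '</mods>'

element_namespaces(xml).each do |name, uri|
  puts "#{name}: #{uri}"
end
```

Even though only the root carries the xmlns attribute, each unprefixed descendant should report the same namespace URI, because a default namespace declaration applies to the element it appears on and to every unprefixed element beneath it.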
Put the following into a file called funny_sample.xml:
<mods xmlns:mods="http://www.loc.gov/mods/v3">
<name>
<namePart type="given">Zoia</namePart>
<namePart type="family">Horn</namePart>
<role>
<roleTerm type="text">author</roleTerm>
<roleTerm type="code">AUT</roleTerm>
</role>
</name>
<name>
<namePart type="family">Caesar</namePart>
<namePart type="given">Julius</namePart>
<role>
<roleTerm type="text">Contributor</roleTerm>
<roleTerm type="code">CON</roleTerm>
</role>
</name>
</mods>
Notice that the document declares the mods prefix, but none of the nodes use it, so none of them are in the mods namespace. This means that our Terminology will not find them.
bundle console
require "./fancy_book_metadata"
file = File.new("funny_sample.xml")
funnysample = FancyBookMetadata.from_xml(file)
funnysample.name.count
=> 0
Now edit the funny_sample.xml file so that the <name> elements and everything inside them are in the mods namespace:
<mods xmlns:mods="http://www.loc.gov/mods/v3">
<mods:name>
<mods:namePart type="given">Zoia</mods:namePart>
<mods:namePart type="family">Horn</mods:namePart>
<mods:role>
<mods:roleTerm type="text">author</mods:roleTerm>
<mods:roleTerm type="code">AUT</mods:roleTerm>
</mods:role>
</mods:name>
<mods:name>
<mods:namePart type="family">Caesar</mods:namePart>
<mods:namePart type="given">Julius</mods:namePart>
<mods:role>
<mods:roleTerm type="text">Contributor</mods:roleTerm>
<mods:roleTerm type="code">CON</mods:roleTerm>
</mods:role>
</mods:name>
</mods>
Now re-open the file, parse it again, and re-run the query for names. Because the <name> nodes are now in the mods namespace, they will be found by the XPath query.
bundle console
require "./fancy_book_metadata"
file = File.new("funny_sample.xml")
funnysample = FancyBookMetadata.from_xml(file)
funnysample.name.count
=> 2
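The difference between the two versions of funny_sample.xml can also be reproduced outside of OM. The following is a minimal sketch, assuming Ruby's bundled REXML library instead of the Nokogiri parser that OM uses. The helper name `count_names` and the shortened inline documents are hypothetical. It runs a namespace-aware XPath, `//oxns:name`, against an unprefixed and a prefixed document, with the "oxns" prefix bound to the MODS URI.

```ruby
require "rexml/document"

# Hypothetical helper: count <name> elements that are in the given namespace.
# The "oxns" prefix is bound to the namespace URI only for this query.
def count_names(xml, namespace_uri)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//oxns:name", { "oxns" => namespace_uri }).size
end

mods_uri = "http://www.loc.gov/mods/v3"

# Declares the mods prefix, but the <name> nodes do not use it.
unprefixed = <<~XML
  <mods xmlns:mods="http://www.loc.gov/mods/v3">
    <name><namePart type="given">Zoia</namePart></name>
    <name><namePart type="given">Julius</namePart></name>
  </mods>
XML

# Same content, with every <name> and <namePart> in the mods namespace.
prefixed = <<~XML
  <mods xmlns:mods="http://www.loc.gov/mods/v3">
    <mods:name><mods:namePart type="given">Zoia</mods:namePart></mods:name>
    <mods:name><mods:namePart type="given">Julius</mods:namePart></mods:name>
  </mods>
XML

puts count_names(unprefixed, mods_uri)
puts count_names(prefixed, mods_uri)
```

The two counts should mirror the 0 and 2 seen in the console sessions: the query matches on the namespace URI, not on the prefix spelled in the document, so nodes that merely sit under a namespace declaration without using it are skipped.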
How does this work? How does OM handle these namespaces? In short, when you have declared a namespace on your Terminology, OM injects that namespace into its XPath queries. Look at the XPath for the name Term.
funnysample.name.xpath
=> "//oxns:name"
If you remember, in an earlier lesson you had a Terminology that didn't have :xmlns defined, and calling funnysample.name.xpath would have returned //name, but now it returns //oxns:name. This "oxns" is a placeholder that OM uses to signify "whichever namespace has been set as the default namespace on the Terminology". To see what namespaces are being used by a Terminology, use the namespaces method on the OM Document's associated Terminology.
FancyBookMetadata.terminology.namespaces
=> {"xmlns"=>"http://www.loc.gov/mods/v3", "oxns"=>"http://www.loc.gov/mods/v3"}
funnysample.class.terminology.namespaces
=> {"xmlns"=>"http://www.loc.gov/mods/v3", "oxns"=>"http://www.loc.gov/mods/v3"}
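The relationship between the "oxns" placeholder and the namespaces hash can be sketched in plain Ruby. This is an illustration only, not OM's implementation: the helpers `term_xpath` and `namespace_map` are hypothetical names that imitate the behavior described above, where the placeholder prefix appears in the XPath only when a default namespace has been declared.

```ruby
# Hypothetical helper, not OM's implementation: build the XPath for a term,
# adding the placeholder prefix only when a default namespace is configured.
def term_xpath(path, default_namespace = nil)
  default_namespace ? "//oxns:#{path}" : "//#{path}"
end

# Hypothetical helper: the prefix-to-URI map that would accompany the query.
# Both "xmlns" and "oxns" point at the same default namespace URI.
def namespace_map(default_namespace)
  return {} if default_namespace.nil?

  { "xmlns" => default_namespace, "oxns" => default_namespace }
end

uri = "http://www.loc.gov/mods/v3"

puts term_xpath("name")        # Terminology without :xmlns on the root
puts term_xpath("name", uri)   # Terminology with :xmlns on the root
p namespace_map(uri)
```

Keeping the placeholder separate from any prefix used in a particular file is what lets the same Terminology match documents that write the namespace as a default xmlns or under a prefix such as mods.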
Go on to Lesson: Parse an Existing XML File with a Terminology or return to the Tame your XML with OM page.