For more details about OAI-PMH see the documentation: http://www.openarchives.org/pmh/
The oaipmh component is used for polling OAI-PMH data providers. By default, Camel polls the provider every 60 seconds.
Maven users will need to add the following dependency to their pom.xml for this component:
<dependency>
<groupId>es.upm.oeg.camel</groupId>
<artifactId>camel-oaipmh</artifactId>
<version>x.x.x</version>
</dependency>
Note: The component currently only supports polling (consuming) feeds.
Note: You must include this repository in your pom.xml:
<repositories>
<!-- GitHub Repository -->
<repository>
<id>camel-oaipmh-mvn-repo</id>
<url>https://raw.github.com/cbadenes/camel-oaipmh/mvn-repo/</url>
<snapshots>
<enabled>true</enabled>
<updatePolicy>always</updatePolicy>
</snapshots>
</repository>
</repositories>
oaipmh:oaipmhURI
Where oaipmhURI is the URI of the OAI-PMH data provider to poll.
You can append query options to the URI in the following format: ?option=value&option=value&...
Property | Default | Description |
---|---|---|
delay | 60000 | Delay in milliseconds between each poll |
initialDelay | 1000 | Milliseconds before polling starts |
userFixedDelay | false | Set to true to use a fixed delay between polls; otherwise a fixed rate is used. See ScheduledExecutorService in the JDK for details. |
verb | ListRecords | Future versions will handle ListIdentifiers, Identify, GetRecord, ListSets and ListMetadataFormats. |
metadataPrefix | oai_dc | Specifies the metadataPrefix of the format that should be included in the metadata part of the returned records. |
from | | Specifies a lower bound for datestamp-based selective harvesting. UTC DateTime value. After the first request, this value is updated to the current time if no upper bound is defined. |
until | | Specifies an upper bound for datestamp-based selective harvesting. UTC DateTime value. |
set | | Specifies set membership as a criterion for set-based selective harvesting. |
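As a rough illustration of how the options above combine into an endpoint URI, the following plain-Java sketch joins option pairs into a query string and formats a `from` bound as a UTC datetime. The class `OaipmhUriSketch` and its helpers `buildUri` and `utc` are hypothetical names used only for this example and are not part of the component; the sketch also assumes the component accepts datetimes in the `YYYY-MM-DDThh:mm:ssZ` form used by OAI-PMH.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of camel-oaipmh.
class OaipmhUriSketch {

    // Appends the options as ?key=value&key=value after the provider address.
    public static String buildUri(String provider, Map<String, String> options) {
        String query = options.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("&"));
        return "oaipmh://" + provider + (query.isEmpty() ? "" : "?" + query);
    }

    // Formats an instant as a UTC datetime with seconds granularity.
    // Assumption: this is the form the component expects for from/until.
    public static String utc(Instant instant) {
        return instant.truncatedTo(ChronoUnit.SECONDS).toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("delay", "60000");
        options.put("metadataPrefix", "oai_dc");
        options.put("from", utc(Instant.parse("2015-01-01T00:00:00Z")));
        System.out.println(buildUri(
                "aprendeenlinea.udea.edu.co/revistas/index.php/ingenieria/oai", options));
    }
}
```

The resulting string can be passed to `from(...)` in a route. Options left out of the map fall back to the defaults listed in the table.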
Camel initializes the IN body on the Exchange with the response message in XML format. For ListXX requests, Camel returns one message for each element of the received list.
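To give a feel for working with such an XML body, here is a minimal plain-Java sketch that reads the identifier from a single record. It assumes the body is one `<record>` element in the OAI-PMH 2.0 namespace; the exact shape of the message delivered by the component may differ. The class `RecordBodySketch`, the method `identifierOf`, and the sample XML are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of camel-oaipmh.
class RecordBodySketch {

    static final String OAI_NS = "http://www.openarchives.org/OAI/2.0/";

    // Hypothetical sample body: a single record with only its header.
    static final String SAMPLE =
            "<record xmlns=\"" + OAI_NS + "\">"
          + "<header>"
          + "<identifier>oai:example.org:1</identifier>"
          + "<datestamp>2015-01-01T00:00:00Z</datestamp>"
          + "<setSpec>physics</setSpec>"
          + "</header>"
          + "</record>";

    // Parses the XML and returns the text of the first <identifier> element.
    public static String identifierOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to look elements up by namespace
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagNameNS(OAI_NS, "identifier").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read record XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(identifierOf(SAMPLE));
    }
}
```

Inside a route, the same logic could sit in a processor or bean that receives the body as a String.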
The oaipmh component ships with an OAIPMH data format that can be used to convert between String (XML) and the OAIPMHType model object (JAXB).
marshal = from OAIPMHType to XML String
unmarshal = from XML String to OAIPMHType
More details about these xsd here.
A route using this would look something like this:
from("oaipmh://aprendeenlinea.udea.edu.co/revistas/index.php/ingenieria/oai?delay=60000").unmarshal().jaxb("es.upm.oeg.camel.oaipmh.model").to("mock:result");
The purpose of this feature is to make it possible to use Camel's lovely built-in expressions for manipulating OAI-PMH messages. As shown below, an XPath expression can be used to filter the OAI-PMH message:
from("oaipmh://aprendeenlinea.udea.edu.co/revistas/index.php/ingenieria/oai?delay=60000").unmarshal().jaxb("es.upm.oeg.camel.oaipmh.model").filter().xpath("//item/request/set[contains(.,'physics')]").to("mock:result");
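To see how a predicate like the one in this route behaves, the following plain-Java sketch evaluates the same XPath expression outside Camel using the JDK's `javax.xml.xpath` API. The class `XPathFilterSketch`, the method `matches`, and the sample XML documents are hypothetical; the samples simply mirror the element path named in the expression and are not taken from a real provider response.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of camel-oaipmh.
class XPathFilterSketch {

    // Same expression as in the route above.
    static final String EXPRESSION = "//item/request/set[contains(.,'physics')]";

    // Returns true when the expression selects at least one node in the given XML.
    public static boolean matches(String xml) {
        try {
            return (Boolean) XPathFactory.newInstance().newXPath().evaluate(
                    EXPRESSION, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical inputs shaped to fit the expression's path.
        String physics = "<item><request><set>physics:optics</set></request></item>";
        String biology = "<item><request><set>biology</set></request></item>";
        System.out.println(matches(physics));
        System.out.println(matches(biology));
    }
}
```

In the route, only messages for which the predicate holds continue to `mock:result`; the others are dropped by the filter.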
This work is funded by the EC project DrInventor (www.drinventor.eu).