@PsypherPunk
Contributor

As per #219, this adds optional recursion and filename filtering to the WatchedCDXSource. The defaults are equivalent to:

<property name="source">
    <bean class="org.archive.wayback.resourceindex.WatchedCDXSource">
        <property name="recursive" value="false" />
        <property name="filters">
            <list>
                <value>^.+\.cdx$</value>
            </list>
        </property>
        <property name="path" value="/wayback/cdx-index/" />
    </bean>
</property>
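To make the default filter concrete, here is a minimal sketch of how a list of filename patterns like the one above could be applied. The class `CdxFilterSketch` and its `accepts` method are hypothetical names used only for illustration; they are not taken from the OpenWayback code, and the sketch assumes each pattern is matched against the bare file name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the actual WatchedCDXSource code.
public class CdxFilterSketch {
    // Same expression as the default <value> in the config above.
    static final List<Pattern> FILTERS =
            Arrays.asList(Pattern.compile("^.+\\.cdx$"));

    // Assumption: a file is accepted if its name matches any filter.
    static boolean accepts(String fileName) {
        for (Pattern filter : FILTERS) {
            if (filter.matcher(fileName).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions a name such as `index.cdx` is accepted, while `index.cdx.gz` is rejected because the expression is anchored at the end of the name.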

I had to add an equals() method to the FlatFile class, as it wasn't otherwise possible to easily delete a CDXIndex. Two objects are now considered equal if both are of the FlatFile class and getPath() returns the same value for both.
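
As a rough sketch of the equality rule described above (same class, same getPath() result), something along these lines would let a collection find and remove an entry by path. `FlatFileSketch` and `removeByPath` are hypothetical stand-ins, not the real FlatFile class. The exact class check and the accompanying hashCode() are assumptions, the latter added only to keep the usual Java equals/hashCode contract.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for FlatFile; illustration only.
public class FlatFileSketch {
    private final String path;

    public FlatFileSketch(String path) {
        this.path = path;
    }

    public String getPath() {
        return path;
    }

    // Equal when both objects are of this class and report the same path.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (other == null || getClass() != other.getClass()) {
            return false;
        }
        return Objects.equals(path, ((FlatFileSketch) other).getPath());
    }

    // Assumption: hashCode kept consistent with equals.
    @Override
    public int hashCode() {
        return Objects.hashCode(path);
    }

    // List.remove(Object) relies on equals(), so a fresh instance with the
    // same path is enough to remove the stored one.
    static boolean removeByPath(List<FlatFileSketch> sources, String path) {
        return sources.remove(new FlatFileSketch(path));
    }
}
```

Without an equals() override, `List.remove(Object)` falls back to identity comparison, so a newly constructed object for a deleted file would never match the stored one.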

@ibnesayeed
Contributor

I think it would be good to have a commented example config added to the CDXCollection.xml file. Also, the default path should be ${wayback.basedir}/cdx-index/. If ${wayback.basedir} is not available where the defaults are set, it should be hard-coded to /tmp/openwayback/cdx-index/, which is the default in the default wayback.xml, where the placeholders are set.

@PsypherPunk
Contributor Author

While the example in the wiki should be changed to use ${wayback.basedir}/cdx-index/, I'm averse to adding more examples to the default config files.

The current wayback.xml is an example of an overly large config file dominated by commented-out, unused sections. I'd much prefer that the documentation be accurate and, if our users are having genuine difficulty finding examples, easier to access.

@kris-sigur
Member

Merged manually.

@kris-sigur kris-sigur closed this May 11, 2015
@PsypherPunk PsypherPunk deleted the watchedcdxsource-enhancements branch May 28, 2015 15:17