Exclude URLs from a specific time range #212

@MohammedElsayyed

As described in wayback.xml, the following configuration blocks URLs from the ResourceIndex. It reads a plain text file (e.g. "/tmp/exclude.txt") that contains URL prefixes:

<bean id="excluder-factory-static" class="org.archive.wayback.accesscontrol.staticmap.StaticMapExclusionFilterFactory">
  <property name="file" value="/tmp/exclude.txt" />
  <property name="checkInterval" value="600000" />
</bean>

Can we change the exclusion file format so that an optional start and end date can follow each URL? OpenWayback (ResourceIndex) would check whether a line has a start and end date. If it does, only snapshots captured within that range are blocked. If it does not, the entry behaves as it does today and every snapshot under that URL prefix is blocked. The proposed 3-column exclusion file format is as follows:

The 1st column is the URL prefix to block. (required)
The 2nd and 3rd columns are the start and end date, respectively. (optional)
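
To make the proposal concrete, here is a minimal sketch of how such a line could be parsed and checked. It is not OpenWayback code: the class `DatedExclusionRule` and its methods are hypothetical, and it assumes columns are separated by whitespace, that dates are 14-digit Wayback-style timestamps (YYYYMMDDhhmmss), that both range bounds are inclusive, and that URLs are matched by plain string prefix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed 3-column format; not part of OpenWayback.
final class DatedExclusionRule {
    final String urlPrefix;
    final String start; // null when the line has no date range
    final String end;   // null when the line has no date range

    DatedExclusionRule(String urlPrefix, String start, String end) {
        this.urlPrefix = urlPrefix;
        this.start = start;
        this.end = end;
    }

    // Parses one line: either "<prefix>" or "<prefix> <start> <end>".
    // Assumption: columns are whitespace-separated.
    static DatedExclusionRule parse(String line) {
        String[] cols = line.trim().split("\\s+");
        if (cols.length == 1 && !cols[0].isEmpty()) {
            return new DatedExclusionRule(cols[0], null, null);
        }
        if (cols.length == 3) {
            return new DatedExclusionRule(cols[0], cols[1], cols[2]);
        }
        throw new IllegalArgumentException("expected 1 or 3 columns: " + line);
    }

    // Parses a whole file's worth of lines, skipping blank ones.
    static List<DatedExclusionRule> parseAll(List<String> lines) {
        List<DatedExclusionRule> rules = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                rules.add(parse(line));
            }
        }
        return rules;
    }

    // Returns true if a snapshot of `url` captured at `timestamp` is blocked.
    // Assumption: all timestamps are full 14-digit strings, so lexicographic
    // comparison matches chronological order.
    boolean blocks(String url, String timestamp) {
        if (!url.startsWith(urlPrefix)) {
            return false;
        }
        if (start == null) {
            return true; // no range given: block every snapshot, as today
        }
        return timestamp.compareTo(start) >= 0 && timestamp.compareTo(end) <= 0;
    }
}
```

With this sketch, a line such as `example.com/news 20100101000000 20121231235959` would block only snapshots of that prefix captured in 2010 through 2012, while a line containing just `example.com/private` would keep the current behaviour and block all of them.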
