Omeka plugin to extract TOC from PDF files, and show it on public page.

JBPressac/Plugin-PdfToc

PDF TOC (plugin for Omeka)

Summary

Omeka plugin that extracts the table of contents (TOC) from PDF files and displays it with the BookReader viewer plugin or with the Universal Viewer plugin (the latter in combination with the TOC for Universal Viewer plugin).

See a demo in the Bibliothèque numérique de l'université Rennes 2 (France).

You can also display the TOC on the item show page. See a demo in 1886, the digital library of Université Bordeaux 3 (France).

Installation

  • This plugin requires the pdftk command-line tool on your server (pdftk 2.0.1 or higher is required for PdfToc 1.0.2 and higher):
    sudo apt-get install pdftk
  • Upload the PDF TOC plugin folder into your plugins folder on the server
  • Rename the plugin folder as "PdfToc"
  • Alternatively, install the plugin directly from GitHub:
    cd omeka/plugins
    git clone git@github.com:symac/Plugin-PdfToc.git "PdfToc"
  • Activate it from the admin → Settings → Plugins page
  • Click the Configure link to choose whether existing PDF files should be processed.
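
Before activating the plugin, it can help to confirm that pdftk is actually reachable on the server. The following is a minimal sketch, not part of the plugin: check_tool is a hypothetical helper name, and it only checks that a command is on the PATH, not which version is installed.

```shell
#!/bin/sh
# check_tool NAME: hypothetical helper (not part of the plugin).
# Prints "NAME: found" and returns 0 when NAME is on the PATH,
# otherwise prints a message on stderr and returns 1.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
        return 0
    fi
    echo "$1: not found on PATH" >&2
    return 1
}

# Usage:
#   check_tool pdftk
```

Note that the web server may run under a different user with a different PATH than your login shell, so a successful check in a terminal is a hint rather than a guarantee.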

Using the PDF TOC Plugin

  • Create an item
  • Add PDF file(s) to this item
  • Save the item
  • To locate the extracted table of contents, select the item to which the PDF is attached. The TOC is stored in the text field of the PDF Table of Contents element set.

Optional plugins

Release notes

  • v1.0.3
    • The pdftk dump_data command converts non-ASCII characters (accented letters, line feeds, etc.) to HTML entities, so a character such as é is stored in the PDF Table of Contents as an HTML entity rather than as the literal character. This is not a problem when the TOC is displayed with BookReader, which generates an HTML version of the TOC, but it is a problem with Universal Viewer, which needs a JSON version of the TOC and displays the HTML entities as they are. This version of the plugin uses the pdftk dump_data_utf8 command to generate a UTF-8 version of the table of contents that is compatible with both viewers.
    • Leading and trailing spaces are removed from TOC titles.
  • v1.0.2: Compatible with pdftk 1.0.2 and higher (which introduced a different structure for the data generated by the dump_data command).
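
To see what the plugin works from, you can run pdftk by hand and look at the bookmark records it reports. The sketch below is an illustration only, not the plugin's own code: parse_bookmarks is a hypothetical helper, book.pdf is a placeholder file name, and the record layout (BookmarkBegin, BookmarkTitle, BookmarkLevel, BookmarkPageNumber) is assumed to be the one produced by recent pdftk versions.

```shell
#!/bin/sh
# parse_bookmarks: hypothetical helper (not part of the plugin).
# Reads pdftk dump_data_utf8 output on stdin and prints one line per
# bookmark as "level<TAB>page<TAB>title", with the title trimmed.
parse_bookmarks() {
    awk '
        # Print the bookmark collected so far, if any.
        function emit() {
            if (have) printf "%s\t%s\t%s\n", level, page, title
            have = 0
        }
        /^BookmarkBegin/ { emit(); have = 1; title = ""; level = ""; page = "" }
        /^BookmarkTitle:/ {
            title = $0
            sub(/^BookmarkTitle:[ \t]*/, "", title)  # drop key and leading spaces
            sub(/[ \t]+$/, "", title)                # drop trailing spaces
        }
        /^BookmarkLevel:/      { level = $2 }
        /^BookmarkPageNumber:/ { page = $2 }
        END { emit() }
    '
}

# Usage (requires pdftk; book.pdf is a placeholder):
#   pdftk book.pdf dump_data_utf8 | parse_bookmarks
```

Running the same pipeline with dump_data instead of dump_data_utf8 is a quick way to compare the entity-encoded titles with the UTF-8 ones.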

Troubleshooting

See online PDF TOC issues.

License

This plugin is published under the GNU/GPL.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

Contact

  • Sylvain Machefert, Université Bordeaux 3 (see symac)
