Björn Schießle 54fa53dd1a
use higher resolution and adjust to latest theme changes
4 years ago

README.rst

Introduction

This is a fork of https://github.com/miohtama/pdf-to-html.git. I adjusted the output format to work with my Hugo theme, https://gitlab.com/BeS/hugo-sustain-ng

This is a Python script that converts a PDF into a series of HTML <img> tags with alt texts. This makes a presentation suitable for embedding in a blog post and for reading on a mobile device.

Example Workflow:

  • Export the presentation from Apple Keynote to a PDF file. In the Export dialog, untick include date and add borders around slides.
  • Run the script against the generated PDF file to convert it to a series of JPEG files and an HTML snippet with <img> tags.
  • Optionally, the script adds a full URL prefix to <img src>, so you don't need to manually link the images to the absolute URL of your hosting service.
  • Copy and paste the generated HTML into your blog post.
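
The HTML step of this workflow can be pictured roughly as follows. This is a minimal sketch, not the code in pdf2html.py: the function name build_img_tags, the slide-N.jpg file naming and the exact tag layout are assumptions made for illustration.

```python
from html import escape


def build_img_tags(alt_texts, url_prefix=""):
    """Return one <img> tag per slide; alt_texts holds one string per page."""
    tags = []
    for number, alt in enumerate(alt_texts, start=1):
        # Hypothetical naming scheme; the real script may name files differently.
        src = f"{url_prefix}slide-{number}.jpg"
        # Escape both attributes so quotes or ampersands cannot break the markup.
        tags.append(
            f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'
        )
    return "\n".join(tags)
```

Calling build_img_tags with a list of per-slide texts and a prefix such as "http://example.com/uploads/" would yield a snippet ready to paste into a post. With an empty prefix, the src values stay relative.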

Tested with Apple Keynote exported PDFs, but the approach should work for any PDF content.

See example blog post and presentation.

Installation

Dependencies (OSX):

sudo port install ghostscript

Please note that Ghostscript 9.06 crashed for me during the export. Please upgrade to 9.07.

Setting up virtualenv and installing the code:

git clone xxx
cd pdf-presentation-to-html
curl -L -o virtualenv.py https://raw.github.com/pypa/virtualenv/master/virtualenv.py
python virtualenv.py venv
. venv/bin/activate
pip install pyPdf

Usage

Example:

. venv/bin/activate
python pdf2html.py test.pdf output

Advanced example:

. venv/bin/activate
python pdf2html.py test.pdf output

Even more advanced example with hardcoded URL:

GHOSTSCRIPT=/usr/local/bin/gs python pdf2html.py test.pdf output http://opensourcehacker.com/wp-content/uploads/wpd2013/
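
The GHOSTSCRIPT variable in the command above points the script at a specific Ghostscript binary. A script might pick up such a variable and build the rendering command along these lines. This is a hedged sketch only: the function name ghostscript_command, the fallback to "gs", the default resolution and the slide-%d.jpg pattern are assumptions, not taken from pdf2html.py. The Ghostscript switches themselves (-dNOPAUSE, -dBATCH, -sDEVICE, -r, -sOutputFile) are standard ones.

```python
import os


def ghostscript_command(pdf_path, output_dir, resolution=150):
    """Build an argument list that renders each PDF page to a JPEG file."""
    # Assumed behaviour: fall back to "gs" on PATH when GHOSTSCRIPT is unset.
    gs = os.environ.get("GHOSTSCRIPT", "gs")
    return [
        gs,
        "-dNOPAUSE",      # do not pause between pages
        "-dBATCH",        # exit after the last page
        "-sDEVICE=jpeg",  # write JPEG output
        f"-r{resolution}",  # rendering resolution in DPI
        # Ghostscript expands %d to the page number.
        f"-sOutputFile={os.path.join(output_dir, 'slide-%d.jpg')}",
        pdf_path,
    ]
```

The returned list could be passed to subprocess.check_call to run the conversion.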

Then upload the output to the server so that WordPress can access it:

rsync -av pycon2014 yourserver.example.com:/srv/yoursite/wordpress/wp-content/uploads

Author

Mikko Ohtamaa (blog, Facebook, Twitter, Google+)