Tip: using Pandoc to create truly standalone HTML files

If you’re using the excellent Pandoc to convert between different document formats, and you:

  • want your final output to be in HTML;
  • want the HTML to be styled with CSS;
  • and want the HTML document to be truly standalone;

then read on.

The most common approach with Pandoc is, I think, to write in Markdown, and then convert it to RTF, PDF or HTML. There are all sorts of more advanced options too; but here we are only concerned with HTML.

The pandoc command has an option which allows you to style the resulting HTML with CSS. Example 3 in the User’s Guide shows how you do this, with the -c option. The example also uses the -s option, which means that we are creating a standalone HTML document, as distinct from a fragment that is to be embedded in another document. The full command is:

pandoc -s -S --toc -c pandoc.css -A footer.html README -o example3.html

If you inspect the generated HTML file after running this, you will see it contains a line like this:

    <link rel="stylesheet" href="pandoc.css" type="text/css" />

That links to the CSS stylesheet, keeping the formatting information separate from the content. Very good practice if you’re publishing a document on the web.
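If you would rather not open the file to look for that line, a quick check from the shell can do it for you. This is only a sketch: the function name has_external_css is invented for this example, and it assumes the <link> element sits on a single line with rel before href, as in the output shown above.

```shell
# Hypothetical helper (not part of pandoc): succeeds if the HTML file
# still links to a stylesheet that lives outside the file itself.
# A stylesheet embedded as a data: URI is not counted as external.
has_external_css() {
    grep '<link[^>]*rel="stylesheet"' "$1" | grep -qv 'href="data:'
}

# example3.html is the output name used in the command above.
if [ -f example3.html ] && has_external_css example3.html; then
    echo "example3.html depends on an external stylesheet"
fi
```

If the message is printed, the document will lose its styling as soon as it is separated from the CSS file.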

But what about that “standalone” idea that you expressed with the -s option? What that does is make sure that the HTML is a complete document, beginning with a DOCTYPE declaration, an <html> tag, and so on. But if, for example, you have to email the document you just created, or upload it to your company’s document store, then things fall apart. When your reader opens it, they’ll see what you wrote, all right; but it won’t be styled the way you wanted it, because that pandoc.css file with the styling is back on your machine, in the same directory as the original Markdown file.

What you really want is to use embedded CSS; you want the content of pandoc.css to be included along with the prose you wrote in your HTML file.

Luckily HTML supports that, and Pandoc provides a way to make it all happen: the -H option or, in its long form, --include-in-header=FILE.

First you’ll have to make sure that your pandoc.css file [1] starts and ends with HTML <style> tags, so it should look something like this:

<style type="text/css">
body {
    margin: auto;
    padding-right: 1em;
    padding-left: 1em;
    max-width: 44em; 
    border-left: 1px solid black;
    border-right: 1px solid black;
    color: black;
    font-family: Verdana, sans-serif;
    font-size: 100%;
    line-height: 140%;
    color: #333;
}
</style>

Then run the pandoc command like this:

pandoc -s -S --toc -H pandoc.css -A footer.html README -o example3.html

and you’re done. A fully standalone HTML document.
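If you would like to keep pandoc.css as a plain stylesheet, so that it still works with -c, one option is to generate the wrapped copy on the fly instead of editing the file. The following is a minimal sketch, not anything Pandoc provides: wrap_css and header.html are names made up for this example.

```shell
# Hypothetical helper: print a stylesheet wrapped in <style> tags,
# leaving the original .css file untouched.
wrap_css() {
    printf '<style type="text/css">\n'
    cat "$1"
    printf '</style>\n'
}

# Assumed file names: pandoc.css is the stylesheet from the example,
# header.html is a scratch file invented here.
if [ -f pandoc.css ]; then
    wrap_css pandoc.css > header.html
fi
```

You would then pass header.html to -H in place of pandoc.css in the command above.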

  1. It doesn’t have to be called that, by the way.

Comments

  1. A better approach for generating self-contained HTML (at least when using recent releases) is to keep the CSS file as normal (without the <style> tag), incorporate it with -c as in the example, and then add the --self-contained option of pandoc.

    That way, pandoc automatically puts your CSS file inline in the HTML file, and it’ll also embed the scripts, images, videos… that your document may link to.

  2. Hi Vladimir, I’d suggest you visit the Pandoc site to check out that kind of thing. Also, it’s going to depend on what mail client you’re using to send from, and what it can understand or import. I use one for the Mac, for example, called MailMate, which allows you to use Markdown in composing your mail. Which is nice.

    1. But for the HTML document to be truly self-contained, use ‘--self-contained’, as this option will also embed images in the HTML document.
