Pandoc Tips

Albert Krewinkel, Ilona Silverwood


Quarto

Quarto is an open-source scientific and technical publishing system. Built on pandoc, it allows you to create reports, documents, websites, blogs, presentations, and articles in a variety of formats. Quarto supports several code editors, including JupyterLab, VS Code, and Neovim, as well as text editors. Quarto also has a powerful visual editor for pandoc markdown.

As a bonus, Quarto is thoroughly documented, and even has a Get Started guide with tutorial. If you want to connect with the Quarto community in the fediverse, use the #QuartoPub hashtag.


Setting the document language

It is good practice to set the document language, e.g. via the lang metadata field on the command line:

pandoc -M lang=de-DE

This allows for proper hyphenation and other improvements; e.g. headers in LaTeX PDFs will be localized, with translations for “Figure”, “Table of Contents” and similar labels. Or put it directly into the doc; e.g. in Markdown one can use a YAML block

---
lang: es
---

and in Org mode

#+language: fr

Fountain reader

Custom reader for the “Fountain” format, a plain text markup language designed for screenwriters: https://github.com/pandoc/pandoc-fountain


Code modes

Whether you prefer dark mode or light mode for your code, pandoc has you covered: use --highlight-style to choose a consistent setting which will apply to all code blocks. To view all options, you can call a list of built-in style choices with --list-highlight-styles. These styling settings work with docx, pptx, LaTeX, groff, HTML, and EPUB.

Find more information on syntax highlighting, including information on customizing settings, here.


The pandoc website has an online converter at https://pandoc.org/try; it offers a wealth of options, including file upload for docx or epub conversions. It’s perfect for experiments, short demonstrations, and one-off conversions.


The oldest Markdown syntax for code blocks, which is also the one best supported across Markdown implementations, uses four-space indentation, e.g.

Assign a variable

    a = 10

There is no md syntax to specify the language for these blocks, but setting the cli parameter --indented-code-class=python marks all such blocks in a document as python code, thereby enabling syntax highlighting.


Separating lists

Blank lines alone do not separate Markdown lists, but an empty HTML comment does:

1. one
2. two

<!-- -->

1. un
2. deux

Omitting the <!-- --> snippet in the example above results in a single list with four items.


Extract all images embedded in a docx or epub file into a media folder with

pandoc --extract-media=media my-file.docx -o my-file.md

Docker has accepted pandoc as an open source project; rate limits for Docker images have been lifted, and images are marked with a “sponsored OSS” badge. https://hub.docker.com/u/pandoc


LaTeX PDF tip: When desired, disable the top and bottom rules of tables with

\renewcommand{\toprule}[2]{}
\renewcommand{\bottomrule}[2]{}

The snippet can be placed in the header-includes or directly in Markdown.

For pandoc versions before 2.18 use this instead:

\let\toprule\relax
\let\bottomrule\relax

On Jan 11, 2013, Aaron Swartz died. He was a co-creator of Markdown, an internet activist, and a strong proponent of Open Access. The documentary “The Internet’s Own Boy” tells the story of his life. https://archive.org/details/TheInternetsOwnBoyTheStoryOfAaronSwartz


If images become too deeply nested, use --extract-media=. and/or apply this Lua filter:

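-- Strip the leading "media/" directory from all media paths.
local pattern = '^media/(.*)'

-- Re-insert every mediabag item under its shortened path.
for fp, mt, contents in pandoc.mediabag.items() do
  pandoc.mediabag.delete(fp)
  pandoc.mediabag.insert(fp:match(pattern) or fp, mt, contents)
end

-- Point images at the shortened paths; leave other paths untouched.
function Image (img)
  img.src = img.src:match(pattern) or img.src
  return img
end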

Babelmark

Babelmark is a great tool to see how different Markdown processors handle input. https://babelmark.github.io


Need a template engine? 🦀 Why not pandoc?

Template (a.txt):

Hello from ${place}!

Command:

pandoc --template=a.txt -V place='Decapod 10' /dev/null

Output:

Hello from Decapod 10!

The template engine is flexible and powerful; it was designed specifically for pandoc. Features include

  • interpolation,
  • partials (for modularization),
  • conditionals,
  • loops,
  • text layouting utils, and
  • simple text modifications.

https://pandoc.org/MANUAL#template-syntax


Page breaks

The pagebreak Lua filter adds support for manual page-break markers in the text with the raw LaTeX \newpage command. “Translated” by the filter, this marker works with HTML & EPUB, Asciidoc, Word docx, groff ms, and ConTeXt output.

The filter is available from https://github.com/pandoc-ext/pagebreak.

Quarto and RMarkdown ship with an older version of the pagebreak filter, so there is no need to install it when pandoc is called through either of those. Quarto also supports the {{< pagebreak >}} shortcode syntax.

https://quarto.org/docs/authoring/markdown-basics.html#page-breaks


Markdown ATX-style headings, i.e. those marked by one or more hash signs, should always have a space between marker and text.

# Intro

Pandoc enforces this space by default, but can be “convinced” to accept the space-less variant

#Intro

by disabling the space_in_atx_header extension:

pandoc --from=markdown-space_in_atx_header

Pandoc warns when it cannot convert parts of the input. Use

pandoc --fail-if-warnings

to make sure that all of the input is converted as expected: no output will be produced in case of a warning.


Line block syntax in pandoc’s Markdown is borrowed from reStructuredText: each line is prefixed with a pipe character and a space. It can be used for line-oriented text such as addresses or poems.

| One Ring to rule them all, One Ring to find them,
| One Ring to bring them all and in the darkness bind them.

The main difference from other methods of handling line-oriented contents is that line blocks convert leading spaces into non-breaking spaces, thereby preserving indentation.

https://pandoc.org/MANUAL.html#line-blocks


“Pandoc-discuss” is pandoc’s official mailing list: https://groups.google.com/g/pandoc-discuss. It is the central point of contact for pandoc users and devs; use it for any questions about pandoc.

The web interface is misleading in that it makes it seem like a Google account is required to post. However, it’s enough to send a mail to the list’s subscription address to subscribe with any mail address.


TeX macros in Markdown math

Pandoc Markdown respects \newcommand definitions in math formulæ, independent of the chosen output format. The definition command can be included in Markdown text, e.g.:

\newcommand\rationals{\mathbb{Q}}

We can prove that $\pi\not\in\rationals$.

Output with pandoc -t plain:

We can prove that π ∉ ℚ.

Banish the braces

Do you need Markdown code blocks to be written without braces and dots for the language?

``` {.python}
print("hello")
```

Disable the writer’s fenced_code_attributes extension with --to markdown-fenced_code_attributes to show the language without extra characters.

``` python
print("hello")
```

Citations in pandoc’s Markdown can be written with curly braces around the citation key. This allows the use of complex keys, including URLs @{https://en.wikipedia.org/wiki/pandoc} and DOI identifiers @doi:10.5281/zenodo.1038654.


Underline with reStructuredText

reStructuredText follows the logic that underlined text means that it’s a link, so it doesn’t have dedicated syntax for (non-link) underlined text. Support can be added with this syntax

.. role:: underline
    :class: underline

:underline:`this`

in combination with a short Lua filter

function Span(s)
  return s.classes[1] == 'underline' and pandoc.Underline(s.content) or nil
end

Versioning

Pandoc follows the “Package Versioning Policy”, a convention used in the Haskell ecosystem. The version number components are EPOCH.MAJOR.MINOR.PATCH; bumps in the major version happen when there are API changes in the Haskell library.
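
The rule can be turned into a quick check, e.g. in a script that decides whether code built against the pandoc library needs attention after an upgrade. A minimal sketch, not part of pandoc: the function name is_api_bump is made up for illustration, and the version strings are passed in by hand.

```shell
# Hypothetical helper: succeeds if two pandoc versions differ in their
# EPOCH.MAJOR part, i.e. if the Haskell library API may have changed.
is_api_bump () {
  old_major=$(printf '%s\n' "$1" | cut -d. -f1-2)
  new_major=$(printf '%s\n' "$2" | cut -d. -f1-2)
  [ "$old_major" != "$new_major" ]
}

# 3.1.2 -> 3.2 bumps the major version; 3.1.2 -> 3.1.3 would not.
if is_api_bump 3.1.2 3.2; then
  echo "major version changed"
fi
```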


Blog post on fancy Markdown lists and how to write long, interrupted lists without having to count items.

https://tarleb.com/posts/list-continuation/

Spoiler: uses a Lua pandoc filter


Two Markdown tricks to “nest” code blocks, i.e., to use backtick-based code block syntax within a code block:

  1. More backticks for the outer block:

    ````` markdown
    ``` r
    print('Hello!')
    ```
    `````
  2. Tildes:

    ~~~ markdown
    ``` r
    print('Bonjour!')
    ```
    ~~~

There’s also a third method:

  3. Indent by four spaces

        ``` r
        print('Moin')
        ```

Note though that this last option differs from the other two in that it doesn’t enable syntax highlighting.


EPUB tip: sometimes pandoc doesn’t bundle all resources required for the desired rendering – fonts being a typical example. Include those files as images in placeholder metadata fields:

---
extra-files:
- '![](my-font.ttf)'
---

This ensures that the files are picked up and packed into the generated EPUB archive.


Starting with pandoc 2.19, reStructuredText-style grid tables can contain advanced table features, including multi-row headers and cells that can span across rows and columns. Grid tables work with Markdown, RST, and Org mode input.


Render references from a BibLaTeX file as Word docx:

pandoc refs.bib --citeproc -o refs.docx

The first ordered list item in pandoc’s Markdown and reStructuredText not only determines the list style, but also which number/letter it starts from. Useful e.g. to continue lists after an intermediate paragraph:

i) one
#) another

Interruption; not part of any list.

iii) continue
#)   keep counting

Line numbers in code blocks

Pandoc supports numbering code lines in many output formats, incl. LaTeX, ConTeXt, and HTML (but not docx).

Markdown:

``` lua {.number-lines}
print 'Hello!'
return true
```

Org mode:

#+begin_src lua +n
print 'Please answer!'
return 42
#+end_src

Rendered HTML (image): two lines of syntax-highlighted Lua code, each preceded by a line number.


Pandoc’s Markdown and reStructuredText allow for fancy lists using different numbering styles (numbers, letters, roman) and delimiters (parens, period, …):

a. alpha
#. bravo
#. charlie

I) primus
#) secundus
#) tertius

a. alpha
b. bravo
c. charlie

I)  primus
II) secundus
III) tertius

The --toc command line option adds a table of contents to the resulting document. The -s or --standalone flag must be set as well to make the TOC show up in plain text formats.


Attributes

Pandoc’s Markdown supports attributes on headings, code, spans, etc. Attributes are enclosed by curly braces and follow this syntax:

{#identifier-value .class-one .class-two some-key="a value"}

Attributes can define an internal cross-ref target and carry a variety of additional information about the element, e.g. language or styling.

peace: [שָׁלוֹם]{lang="he" dir="rtl"}

Format extensions

The handling of formats can be tweaked by enabling or disabling format extensions. Running

pandoc --list-extensions=my_format

lists available extensions and their default status. Extensions are enabled by prefixing them with +, and disabled with -:

pandoc --from my_format+enabled_extension-disabled_extension

Example: disable the smart handling of quotes, ellipses, etc in Markdown with

pandoc --from=markdown-smart
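
In scripts, the format specifier can be assembled from a base format and a list of extension modifiers. A minimal sketch under stated assumptions: the helper name format_spec is made up for illustration and is not part of pandoc; it only concatenates strings and does not check whether an extension exists.

```shell
# Hypothetical helper: build a format specifier from a base format and
# extension modifiers, each of which must start with + or -.
format_spec () {
  spec=$1
  shift
  for ext in "$@"; do
    case "$ext" in
      +*|-*) spec="$spec$ext" ;;
      *) echo "extension '$ext' needs a + or - prefix" >&2; return 1 ;;
    esac
  done
  printf '%s\n' "$spec"
}

# Prints: markdown-smart+east_asian_line_breaks
format_spec markdown -smart +east_asian_line_breaks
```

The result can be passed on directly, e.g. pandoc --from="$(format_spec markdown -smart)" …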

New pandoc filter / quarto extension: run-lua, a filter to evaluate small Lua snippets embedded in the document. E.g.

Built with pandoc <?lua PANDOC_VERSION?>

or with Quarto

Generator: Quarto <?lua quarto.version?>

https://github.com/pandoc-ext/run-lua


Slim Arch install

Arch Linux users probably know that installing pandoc with pacman pulls in a large number of Haskell dependencies. This is due to Arch’s dynamic linking approach. The pandoc-bin package on AUR is an alternative to the default packages; it has a much smaller disk footprint.


Org mode tip: use

#+MACRO: myvar Replacement text

to define macros. Use them with

{{{myvar}}}

Macros with arguments are possible, too:

#+MACRO: greet Hello, $1!
{{{greet(Mike)}}}

Pandoc supports this feature.


Step-by-step guide published by @natalie, detailing a workflow that converts from Scrivener to Word docx with pandoc and Zotero. https://nataliekraneiss.com/wp-content/uploads/2022/11/Steps-from-Scrivener-to-Word-with-Live-Citations.pdf


Reveal.js tip: Adding the r-stretch class to an image produces an image that’s stretched to fill the slide. The image should be the only element on that slide. E.g.

pandoc -t revealjs -s slides.md

with slide content

![Results](graph.png){.r-stretch}

Use the command line option --number-sections or -N to make pandoc generate output with numbered sections. Headings with class “unnumbered” are treated specially and won’t be numbered.


CommonMark allows attributes to be added to all elements when the attributes extension is enabled:

{.fruit}
- apple
- banana

Pandoc doesn’t support attributes on some elements (yet); this includes lists. The attributes are attached to a wrapping div in those cases, so the above is equivalent to this:

::: {.fruit}
- apple
- banana
:::

The extension is enabled by default in “commonmark_x”.


Markdown as a simpler LaTeX

Markdown can be used as a simpler LaTeX, but with the raw power of TeX still available when targeting PDF. TeX commands can be embedded in Markdown; the options --natbib/--biblatex are available for when pandoc’s citation handling via CSL isn’t powerful enough for once.


Exit codes are useful, e.g. when embedding pandoc in wrapper-scripts. Pandoc exiting with code zero signals success, with other numbers indicating the type of error that occurred. For example, exit code 84 hints at a problem in the Lua subsystem. The full list is in the manual: https://pandoc.org/MANUAL.html#exit-codes
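
In a wrapper script this might look as follows. A minimal sketch: the function name run_pandoc and the messages are made up for illustration; only the codes 0 and 84 are taken from the text above, all other codes are reported generically.

```shell
# Hypothetical wrapper: run pandoc and explain the exit code on failure.
run_pandoc () {
  status=0
  pandoc "$@" || status=$?
  case "$status" in
    0)  ;;  # success, nothing to report
    84) echo "pandoc: problem in the Lua subsystem" >&2 ;;
    *)  echo "pandoc: failed with exit code $status" >&2 ;;
  esac
  return "$status"
}
```

The function takes the same arguments as pandoc and passes the exit code on to its caller, so surrounding scripts can still react to it.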


Markdown tip: put the title of a section in square brackets to create a link to that section.

# Julius Caesar

⋮

"Alea iacta est" is a phrase attributed to [Julius Caesar].

reStructuredText allows the use of a hash symbol instead of a number in ordered list markers:

#. this
#. that

Pandoc’s Markdown supports this syntax as well via the fancy_lists extension; it is enabled by default.


East Asian Line Breaks

Line breaks within paragraphs are usually treated as spaces in Markdown, but this can give bad results when writing in an East Asian language. Using pandoc -f markdown+east_asian_line_breaks solves this: it ensures that line breaks between East Asian wide characters are ignored.


The --no-highlight cli flag disables pandoc’s syntax highlighting. This comes in handy when external highlighting engines are used, e.g. client-side highlighting with HTML and https://highlightjs.org.


Prevent automatic addition of header-block

Trick to prevent pandoc from adding title, date or author to the document text: unset the title variable by setting it to an empty string

pandoc -V title='' …

The values will still be included in the document’s metadata, but won’t become part of the text. The most common use-case for this is HTML output, where the default header block may be unwanted.


Pandoc’s Markdown comes with special syntax for excluding headings from section numbering: a heading is marked as unnumbered via the attribute string {-}

# Unnumbered heading {-}

abstract-section Lua filter

The abstract-section Lua filter makes it possible to write an abstract as part of the main text instead of having to place it in the YAML metadata. The filter can also be installed and used as a Quarto extension.


Delightful introduction to Lua filters by James Adams. It shows how and why these document modifiers can be useful. https://jmablog.com/post/pandoc-filters/ Most of the article is about plain pandoc, despite the title.


Internationalized Docker images

The default pandoc Docker images ship with support for common languages that use scripts based on the Latin alphabet. Documents in other writing systems generally require custom images. Example image that can be used with Ukrainian documents, adding a font with Cyrillic glyphs:

FROM pandoc/latex
RUN tlmgr install babel-ukrainian
RUN apk --no-cache add font-linux-libertine && fc-cache -f

Build with

docker build -t pandoc-ukr -f Dockerfile .


An image created with the above method can then be used with

docker run … pandoc-ukr --pdf-engine=lualatex …

Example document:

---
title: "Приклад українською"
mainfont: Linux Libertine
lang: uk
---

Цей текст не дуже цікавий.



The reStructuredText reader is unusual in that it switches behavior when pandoc is called with -s or --standalone: In that case, definitions at the start of the document get treated as metadata.

:title: My Document
:author: Jane Doe
:date: 2023-01-05

Without --standalone, that data is treated as part of the main document and parsed as a definition list.


Download a nightly

If you’re a pandoc user who needs a version that includes a recent bug fix, or who wants to test a feature recently added to the development version, download a nightly.

Binaries of the current pandoc development version are created every 24 hours. To access them,

  1. go to GitHub actions at https://github.com/jgm/pandoc/actions?query=workflow%3ANightly
  2. click on the latest result
  3. download an artifact for Linux, macOS, or Windows.

Markdown figures

An image with a description is treated as a figure if it is the only element in a Markdown paragraph:

![Male mandrill](mandrill.jpg)

This can be turned off for individual images by appending a (possibly empty) HTML comment

![Peppers](peppers.jpg)<!-- -->

or globally by disabling the implicit_figures extension:

pandoc --from=markdown-implicit_figures

Three methods to add a comment to pandoc Markdown texts:

  1. <!-- HTML comment; will show up in HTML output,
         unless suppressed with `--strip-comments` -->
  2. ---
    # YAML block with comment.
    # Always removed.
    ---
  3. ``` {=comment}
    Creative use of raw blocks.
    Included in md output, dropped everywhere else.
    ```

Pandoc produces “flat” HTML output by default, placing all header elements on the same level in the document tree. The --section-divs option enables a hierarchical element structure, wrapping headings and the respective contents in <section> elements:

<section class="level1">
  <h1>Intro</h1>
  <p>Text</p>
  <section class="level2">
    <h2>Subsection</h2>
    <p>More</p>
  </section>
</section>

This adds semantic info and can help with styling.


Syntax-highlighted code files

Custom Lua readers make it possible to support additional formats and other interesting features. Here’s one converting code files to syntax-highlighted docs.

function to_cb (s)
  local _, lang = pandoc.path.split_extension(s.name)
  return pandoc.Div{
    pandoc.Header(1, s.name == '' and '<stdin>' or s.name),
    pandoc.CodeBlock(s.text, {class=lang}),
  }
end
function Reader (input)
  return pandoc.Pandoc(input:map(to_cb))
end

Usage: pandoc -f src.lua -o out.docx

https://pandoc.org/custom-readers


Viewing docs in the terminal

Files of supported formats can be read like Unix man pages in the terminal:

pandoc -s -t man INPUT_FILE | man -l -

Define a shell function for extra convenience:

mandoc () { pandoc -s -t man "$@" | man -l -; }

Voilà, a short and friendly command to read documents:

mandoc letter.docx

Works on Linux and Mac.


Slide themes

The theme variable changes the style of a presentation. Personal favorites are --variable=theme:serif for reveal.js and --variable=theme:metropolis for beamer.

See also: https://revealjs.com/themes/ https://hartwork.org/beamer-theme-matrix/


Pandoc 3 introduced Markdown support for wikilinks:

[[pandoc|https://github.com]]

There is no consensus across tools on whether the link title should come before or after the pipe character, so pandoc supports both. Choose by enabling either of the new extensions

+wikilinks_title_after_pipe

or

+wikilinks_title_before_pipe

Works with CommonMark, too: for GitHub wiki input, use

pandoc --from=gfm+wikilinks_title_after_pipe …

Two ways to get a hard linebreak within a paragraph with Markdown:

  1. End the line with two spaces (most portable).

    I want a linebreak after this.␠␠
    This is a new line.
  2. “Backslash escape” the linebreak (more readable).

    I want a linebreak after this.\
    This is a new line.

Both methods work with pandoc and any CommonMark processor.


Lesser-known pandoc Markdown extension: definition lists (also called “description lists”)

apple
: a delicious fruit

bad apple
: a metaphor
: graphical "Hello, World!"

Enabled by default in markdown and commonmark_x.


The pandoc-quotes filter by @odin adjusts quotation marks to fit the document language. E.g. running the below

---
lang: es
---
"¡Hola!"

through pandoc -L pandoc-quotes.lua -t plain yields

«¡Hola!»

The filter is available at https://github.com/odkr/pandoc-quotes.lua


Docker images

We maintain a number of Docker images for quick ’n easy access to pandoc, e.g. in CI systems. Types of pandoc/* Docker images:

  • minimal – very small, just the bare pandoc binary;

  • core – includes pandoc-crossref and helpers needed for SVG image conversion;

  • latex – like core, plus TeXLive with all packages required by the default template.

https://hub.docker.com/u/pandoc

The code used to generate the Docker images is available from https://github.com/pandoc/dockerfiles. The accompanying README contains more info as well as usage instructions.


JOSS, the Journal of Open Source Software, is now in the fediverse. The journal is remarkable for multiple reasons, with one reason being that papers are authored in Markdown and processed by pandoc.

Of course, all of the journal’s systems are open source. E.g., the publishing pipeline is available here: https://github.com/openjournals/inara The pipeline currently produces PDF, JATS, and Crossref XML output.

For more info on the JOSS publishing pipeline, esp. the JATS generation, see this article https://www.ncbi.nlm.nih.gov/books/NBK579698/ and the accompanying presentation https://jats.nlm.nih.gov/jats-con/2022/presentations/jatscon22-krewinkel.html#/title-slide


GNU troff is a powerful yet lesser-known typesetting system that can serve as an alternative to LaTeX. It is much faster and a good choice when performance is important. Use it by specifying --pdf-engine=pdfroff or --to=ms when converting to PDF. https://www.gnu.org/software/groff/


Tagged PDFs via ConTeXt

Pandoc 3.0 comes with a new ConTeXt extension for improved PDF tagging:

pandoc --to=context+tagging -V pdfa -o out.pdf

This will produce “tagged” output, i.e. the PDF file contains structured text that is easier to process by accessibility tools.

ConTeXt output is tagged with and without the new “tagging” extension, but what’s new and important is that the output gets optimized for tagging, resulting in much better results for paragraphs and emphasized text.


Reference docs make it possible to adjust the style of docx and odt files. Generate a reference doc with

pandoc -o refdoc.docx --print-default-data-file reference.docx

Open the refdoc.docx document and modify the styles in there to your liking. Then pass the modified document to let pandoc use the new styles:

pandoc --reference-doc=refdoc.docx ...

Speaker Notes

Create speaker notes for slides by adding a “notes” div:

# Abbey Road
Here comes the sun

::: notes
And I say it's all right.
:::

Works as expected with pptx. In reveal.js use the speaker view by pressing s. For beamer compile with -V classoption=notes to get slides and notes on one page, or -V classoption=notes=only to get just the notes.


Accessing pandoc from Quarto

Pandoc is shipped as part of Quarto; users of the latter can run any raw pandoc command in the terminal with

quarto pandoc …

Example: find which pandoc version is shipped with quarto by running

quarto pandoc --version

🛡️ Security Advice 🛡️

Filters are small programs with access to the file system. They should be treated with due diligence. Run filters only if their authors and sources are trustworthy.


John MacFarlane, creator of pandoc and coauthor of the CommonMark spec, designed “Djot”. Published in July 2022, Djot is a new markup language that is similar to Markdown, but more principled in many ways. Parse and produce djot files with pandoc via the custom reader and custom writer included in the repository. https://djot.net/


ConTeXt is an advanced typesetting system based on TeX; it’s the preferred system of many professional typesetters. Pandoc can convert to ConTeXt markup with --to=context and generate PDF files via ConTeXt with --pdf-engine=context. The system makes it possible to generate accessible, tagged PDFs, which is otherwise difficult to do:

pandoc --pdf-engine=context -V pdfa=3a ...

ConTeXt in the fediverse: @context@fosstodon.org.


No-break abbreviations

Markdown tip: pandoc prevents line breaks after certain abbreviations. For example, in order to treat Oct. 31 as a single word, pandoc replaces the normal space character with a no-break space.

To view the complete list of default abbreviations which use no-break spaces, use

pandoc --print-default-data-file=abbreviations

If you want to set custom abbreviations, create a file with one abbreviation per line and pass it to pandoc with

pandoc --abbreviations=...

One more for reveal.js: set a slide’s background image with the background attribute. Example using pandoc’s Markdown:

# End {background=sunset.jpg}

Alternatively, set a color with

# 🚀 {background-color=black}

Default input files

Defaults files can fix the set of input files, as well as their order. Example: process the files foo.md and bar.md in this order by writing

input-files:
- foo.md
- bar.md

to all.yaml. Then use it by calling pandoc with

pandoc -d all



Syntax highlighting

Pandoc performs syntax highlighting for 148 different markup- or programming languages. To see the complete list, use pandoc --list-highlight-languages. Not every language is supported by default; to add a language yourself,

  1. check https://kate-editor.org/syntax/
  2. download the definition
  3. pass it to pandoc via the --syntax-definition option.

Zettlr

Zettlr is an open source “Markdown editor for the 21st century” that comes with pandoc included. It works on all major platforms, plays well with reference managers, supports the “zettelkasten” system for note taking, and more.

https://zettlr.com


Patat is a fun (and very nerdy) program that uses pandoc as a library. It runs presentations in an ANSI terminal and accepts all pandoc-supported formats as input. https://github.com/jaspervdj/patat/#patat


Ordered lists are numbered automatically in most lightweight markup languages, including Markdown, Org mode, and reStructuredText; the generated item numbering in the output is identical for both lists below:

1. one
2. two
3. three

and

1. one
1. two
1. three

pandoc Markdown tip: “backslash-escape” spaces to make them non-breaking. E.g.: J.\ R.\ R.\ Tolkien or 128\ cm. This prevents the surrounding words from ending up on different lines.


Pandoc supports many extensions for CommonMark; pandoc version 2.10.1 introduced the “commonmark_x” format, which is the standardized CommonMark plus a default set of useful extensions. This includes support for YAML blocks, fancy lists, syntax for divs and spans, and many other extensions that are part of pandoc’s Markdown.

CommonMark_x is almost a drop-in replacement. Citations are the exception; they haven’t been implemented yet.


One can document the access date of a referenced URL with the urldate key in BibLaTeX or the accessed field in CSL JSON. E.g. in BibLaTeX:

@online{pandoc,
  title = {Pandoc manual},
  url = {https://pandoc.org/MANUAL},
  date = {2022},
  urldate = {2023-01-03}
}

Not all citation styles make use of this info, but “author-date” styles do. The Zotero CSL style database can be filtered for those; pick the one you want here: https://www.zotero.org/styles?format=author-date


Default image extension

If you generate figures in multiple formats and want to choose the format best suited for the target, using the --default-image-extension command line parameter will allow you to set your desired format. For example, if you export an image as both pdf and svg, then you can write this Markdown ![caption](my-figure) and run pandoc with pandoc --to=html --default-image-extension=svg or pandoc --to=latex --default-image-extension=pdf.
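
A build script that produces several output formats could look up the extension per format. A minimal sketch: the helper name image_ext_for is made up for illustration, and the mapping only covers the two formats from the example above; other formats are rejected rather than guessed.

```shell
# Hypothetical helper: pick the image extension for an output format.
image_ext_for () {
  case "$1" in
    html)  echo svg ;;
    latex) echo pdf ;;
    *)     echo "no image extension configured for '$1'" >&2; return 1 ;;
  esac
}
```

With fmt set to html or latex, the call then becomes pandoc --to="$fmt" --default-image-extension="$(image_ext_for "$fmt")" …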


Pretty URLs

Just out: “pretty-urls”, a tiny Lua filter that “prettifies” bare URLs by removing the protocol prefix; i.e., it drops the https:// from the link text while leaving the actual link unchanged.

https://github.com/pandoc-ext/pretty-urls

Also usable as a Quarto extension.


The names Markdown and CommonMark are often used interchangeably. The latter refers to the formal specification published in 2014, resolving the syntax ambiguities of the original Markdown spec. CommonMark and “pandoc’s Markdown” differ in subtle ways, which is why pandoc has an extra CommonMark parser:

pandoc --from=commonmark

https://commonmark.org/


Underline

Underline text in different lightweight markup languages:

  • Emacs Org mode: _this_

  • Doku wiki: __this__

  • Textile, Jira: +this+

There is no Markdown syntax for underlined text, but pandoc’s Markdown reader treats the content of spans with class “underline” or “ul” as underlined:

[important]{.underline}
[nota bene]{.ul}
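
To see what the span turns into, one can pipe it through pandoc. A minimal sketch, assuming a reasonably recent pandoc on the PATH (older releases may render the span differently):

```shell
# Convert a span with class "underline" to HTML and print the result
out=$(printf '[important]{.underline}\n' | pandoc --from=markdown --to=html)
printf '%s\n' "$out"
```

In current versions the span should come out wrapped in a u element; other writers use their own underline markup.
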
HTMLCSSquotation

HTML <q> tags

Pandoc’s --html-q-tags parameter triggers the use of <q> elements instead of quotation marks in HTML/EPUB output.

pandoc <<< '"Hello", they said.'
⇒ <p>“Hello”, they said.</p>

pandoc --html-q-tags <<< '"Hello", they said.'
⇒ <p><q>Hello</q>, they said.</p>

This makes it easy to apply extra styling to quotes via CSS. See also: https://developer.mozilla.org/en-US/docs/Web/CSS/quotes

Luafilterlinks

Pandoc Lua filter enforcing all external links to be opened in a new tab:

function Link (link)
  if link.target:match '^https?%:' then
    link.attributes.target = '_blank'
    return link
  end
end
HTMLPDF

Video by Craig Parker on using pandoc to create both HTML and PDF output from a large corpus of documentation. In his words: “I tried to make it exciting, for the record, but things don’t really start happening until almost the end (5:30-ish).” https://youtu.be/_eFQsRNbCKE

TeXLaTeXl10nquoted

The “csquotes” LaTeX package provides advanced facilities for inline and display quotations. The package has excellent support for different languages, too. Set the csquotes variable to make pandoc use this package when rendering quotes. For best results, set the lang variable as well. E.g., the first output below was created with pandoc -o a.pdf and the second with pandoc -o a.pdf -V csquotes.

[Images: default English quotation marks (without csquotes); proper use of guillemets (with csquotes)]

gfmCommonMark

The parser for GitHub Flavored Markdown (gfm) is based on CommonMark as well.

pandoc --from=gfm ...

This is the format GitHub uses to render README files; it should be used when producing such files from different formats:

pandoc --to=gfm --output README.md ...

GitHub published a spec for this format, but unfortunately the spec is not kept up to date.

https://github.github.com/gfm/

layoutLaTeXPDF

Page margins

Control page margins in PDFs via the geometry variable, e.g. in YAML

---
geometry: margin=2cm
---

or on the command line

pandoc -V geometry=left=3cm,right=4cm
highlightingKDEsyntax highlightingtheme

Pandoc ships with 8 syntax highlighting styles: pygments (the default), tango, espresso, zenburn, kate, monochrome, breezedark, and haddock. Set --highlight-style to one of these values to vary code coloring etc. Not enough? Pandoc can also use KDE highlighting themes. There’s a list at https://kate-editor.org/themes/, and theme files are available from https://github.com/KDE/syntax-highlighting/tree/master/data/themes. Pass the downloaded file as the highlight style.

templateLaTeX

The Eisvogel template is a clean and beautiful pandoc LaTeX template. https://github.com/Wandmalfarbe/pandoc-latex-template

Markdowncode

Markdown tip: inline code can be delimited by an arbitrary number of backticks, which is helpful if the code itself contains one or more backticks.

``const url = `this/${id}`;``{.typescript}

Leading and trailing spaces in code are ignored:

just a single, verbatim backtick: `` ` ``
MarkdownCommonMark

Smart quotes in Markdown and CommonMark work with multi-paragraph quotations, where only the last paragraph has a closing quotation mark. This style is used primarily in English prose.

"Enough work for today.

"Would you like some tea?"
CLIyaml

“Defaults files” are a convenient tool to store, group, and combine related command line options. E.g., instead of typing

pandoc --pdf-engine=xelatex -V csquotes

one could create a file pdf.yaml with

pdf-engine: xelatex
variables: {csquotes: true}

and use it with

pandoc -d pdf.yaml

https://pandoc.org/MANUAL#defaults-files

TeXLaTeXPDFLuafilter

Parsing embedded LaTeX

There are situations where our primary target is LaTeX (for PDF), but we still need decent output in other formats. The Lua filter linked below parses all raw LaTeX snippets when converting to other formats, so even a carefully crafted LaTeX table embedded into Markdown will show up in HTML output.

https://github.com/tarleb/parse-latex

imageconversionDPI

Vector graphic rasterization

Dissatisfied with the resolution of images in the pandoc-generated output? Some formats require the conversion of vector graphics to raster images, a process that can be controlled with the --dpi option. The default for it is 96; set the parameter to a higher value to get images with higher resolutions.

pandoc --dpi=300 ...
presentationTeXLaTeXbeamerreveal.jspowerpoint

Presentation slides

Pandoc supports multiple slide formats, thereby enabling quick-and-easy creation of presentations. The most popular choices are probably --to=beamer for mathy and content-heavy slides, --to=revealjs for less-formal talks, and --to=pptx (PowerPoint) when the company requires it.

Markdown

Appending {target=_blank} to a pandoc Markdown link ensures that a new tab will be opened when the link is clicked.

[manual](https://pandoc.org/MANUAL){target="_blank"}

Add the below to the document’s header-includes to enable this behavior for all links:

<base target="_blank" />
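
One way to try this without editing the document is to pass the snippet as a template variable on the command line. The sketch below assumes pandoc is on the PATH; the title "Demo" is a placeholder that only serves to avoid the missing-title warning.

```shell
# -V sets a template variable, so the raw HTML is inserted as-is;
# --standalone is needed because header-includes lives in the template
out=$(printf '[manual](https://pandoc.org/MANUAL)\n' |
  pandoc --standalone --to=html --metadata title=Demo \
         -V header-includes='<base target="_blank" />')

# Show the line that ended up in the document head
printf '%s\n' "$out" | grep 'base target'
```

Since the base element lives in the document head, it only has an effect in standalone output.
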
LuafilterBibTeXBibLaTeX

Do you maintain one big BibLaTeX database? Get the subset of just those entries required for an article with

pandoc -L getbib.lua paper.md -t biblatex -o paper.bib

where getbib.lua contains

function Pandoc (doc)
  doc.meta.references = pandoc.utils.references(doc)
  doc.meta.bibliography = nil
  return doc
end
Beamerslides

Divs with class “block” that wrap a heading get turned into boxes on Beamer slides. Adding the class “example” or “alert” selects alternative coloring. Shown here for --slide-level=2 and with theme “Frankfurt”.

::: {.block}
### A block
Try to block it out!
:::

::: {.block .example}
### Example
Demo
:::

::: {.block .alert}
### Heads up!
Watch out.
:::

[Image: Beamer block types]

Luafilter

Pandoc is highly customizable. E.g., the binary includes a Lua interpreter that can be used to programmatically modify the internal document representation (AST) before the output is generated. See the Lua filters documentation for details and examples.

Organizing large documents

Large documents are easier to handle when split into smaller files. Pandoc will treat multiple input files as if they were one long file.

pandoc -o out.pdf intro.md methods.md

For extra convenience, define the order via filename prefixes, e.g. 01-intro.md, 02-methods.md, and use the shell’s globbing feature:

pandoc -o out.pdf *.md
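
The shell expands the glob in sorted order before pandoc ever sees the file names, which is why zero-padded prefixes matter once there are ten or more files. A quick sketch with hypothetical, empty chapter files (no pandoc required):

```shell
# Scratch directory with empty, hypothetical chapter files
cd "$(mktemp -d)"

# Without padding, 10-appendix.md sorts before 2-methods.md
touch 1-intro.md 2-methods.md 10-appendix.md
unpadded=$(echo *.md)

# With zero-padded prefixes the glob matches the intended order
rm ./*.md
touch 01-intro.md 02-methods.md 10-appendix.md
padded=$(echo *.md)

echo "unpadded: $unpadded"
echo "padded:   $padded"
```

Running echo with the glob first is also a cheap way to check the order before starting a long conversion.
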
LuafilterQuarto

Pandoc filter and Quarto extension: multibib, a tool to add multiple bibliographies to a document. Feedback, contributions, and feature requests are all welcome. https://github.com/pandoc-ext/multibib

HTMLsemantic

Pandoc produces “flat” HTML output by default, placing all heading elements on the same level in the document tree. The --section-divs option enables a hierarchical element structure, wrapping headings and their contents in <section> elements:

<section class="level1">
  <h1>Intro</h1>
  <p>Text</p>
  <section class="level2">
    <h2>Subsection</h2>
    <p>More</p>
  </section>
</section>

This adds semantic info and can help with styling.
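
The difference is easy to check from the shell. The sketch below assumes pandoc is on the PATH and uses a two-heading sample document made up for illustration; it counts the section tags produced with and without the option.

```shell
# Sample input with one heading nested inside another
md='# Intro

Text

## Subsection

More'

# Count lines that open a <section>; "|| true" keeps a zero count from
# being treated as a failure
flat=$(printf '%s\n' "$md" | pandoc --to=html | grep -c '<section' || true)
nested=$(printf '%s\n' "$md" | pandoc --to=html --section-divs | grep -c '<section' || true)

echo "flat: $flat, nested: $nested"
```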

EmacsOrg mode

Headers to lists in Org mode

When converting Emacs Org mode documents, be aware that pandoc respects many org export options and uses the same defaults. E.g., a common source of confusion is that 4th level headers become paragraphs by default. It happens in Emacs too, but can be prevented by adding #+OPTIONS: H:9 to the top of the org document.

MarkdownCommonMarkgfmemojiformat extension

Markdown Emoji extension

The emoji extension gives simple access to a large set of emojis via the :smile: syntax.

:page_facing_up: :arrow_right: :mage: :arrow_right: :book: :sparkles:

📄 ➡️ 🧙 ➡️ 📖 ✨

The extension is enabled by default in the gfm and commonmark_x input formats.
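
In pandoc’s own Markdown the extension is off by default and has to be requested with +emoji. A minimal sketch, assuming pandoc is on the PATH:

```shell
# Default pandoc Markdown: the alias is kept as literal text
plain=$(printf ':sparkles:\n' | pandoc --from=markdown --to=plain)

# With the extension enabled, the alias is replaced by the emoji
emoji=$(printf ':sparkles:\n' | pandoc --from=markdown+emoji --to=plain)

printf '%s\n%s\n' "$plain" "$emoji"
```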