Quarto
Quarto is an open-source scientific and technical publishing system. Built on pandoc, it allows you to create reports, documents, websites, blogs, presentations, and articles in a variety of formats. Quarto supports several code editors, including JupyterLab, VS Code, and Neovim, as well as text editors. Quarto also has a powerful visual editor for pandoc markdown.
As a bonus, Quarto is thoroughly documented, and even has a Get Started guide with tutorials. If you want to connect with the Quarto community in the fediverse, use the #QuartoPub hashtag.
Setting the document language
It is good practice to set the document language, e.g. via the lang metadata field on the command line:
pandoc -M lang=de-DE
This allows for proper hyphenation and other improvements; e.g. headers in LaTeX PDFs will be localized, with translations for “Figure”, “Table of Contents” and similar labels. Or put it directly into the doc; e.g. in Markdown one can use a YAML block
---
lang: es
---
and in Org mode
#+language: fr
Fountain reader
Custom reader for the “Fountain” format, a plain text markup language designed for screenwriters: https://github.com/pandoc/pandoc-fountain
Code modes
Whether you prefer dark mode or light mode for your code, pandoc has you covered: use --highlight-style to choose a consistent setting which will apply to all code blocks. To view all options, list the built-in style choices with --list-highlight-styles. These styling settings work with docx, pptx, LaTeX, groff, HTML, and EPUB.
More information on syntax highlighting, including how to customize settings, is in the syntax highlighting section of the pandoc manual.
The pandoc website has an online converter at https://pandoc.org/try; it offers a wealth of options, including file upload for docx or epub conversions. It’s perfect for experiments, short demonstrations, and one-off conversions.
The oldest Markdown syntax for code blocks, which is also the best supported across different Markdown implementations, uses four-space indentation, e.g.
Assign a variable

    a = 10

There is no Markdown syntax to specify the language for these blocks, but setting the CLI parameter --indented-code-class=python marks all such blocks in a document as Python code, thereby enabling syntax highlighting.
Separating lists
Blank lines alone do not separate Markdown lists, but an empty HTML comment does:
1. one
2. two
<!-- -->
1. un
2. deux
Omitting the <!-- --> snippet in the example above results in a single list with four items.
Extract all images embedded in a docx or epub file into a media folder with
pandoc --extract-media=media my-file.docx -o my-file.md
Docker has accepted pandoc as an open source project; rate limits for Docker images have been lifted, and images are marked with a “sponsored OSS” badge. https://hub.docker.com/u/pandoc
LaTeX PDF tip: When desired, disable the top and bottom rules of tables with
\renewcommand{\toprule}[2]{}
\renewcommand{\bottomrule}[2]{}
The snippet can be placed in the header-includes or directly in Markdown.
For pandoc versions before 2.18 use this instead:
\let\toprule\relax
\let\bottomrule\relax
On Jan 11, 2013, Aaron Swartz died. He was a co-creator of Markdown, an internet activist, and a strong proponent of Open Access. The documentary “The Internet’s Own Boy” tells the story of his life. https://archive.org/details/TheInternetsOwnBoyTheStoryOfAaronSwartz
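If images become too deeply nested, use --extract-media=. and/or apply this Lua filter:
local pattern = '^media/(.*)'
-- Re-insert each media bag item under a path without the leading "media/".
for fp, mt, contents in pandoc.mediabag.items() do
  pandoc.mediabag.delete(fp)
  pandoc.mediabag.insert(fp:match(pattern) or fp, mt, contents)
end
-- Point image sources at the shortened paths; leave other paths untouched.
function Image (img)
  img.src = img.src:match(pattern) or img.src
  return img
end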
Babelmark
Babelmark is a great tool to see how different Markdown processors handle input. https://babelmark.github.io
Need a template engine?
🦀
Why not pandoc?
Template (a.txt):
Hello from ${place}!
Command:
pandoc --template=a.txt -V place='Decapod 10' /dev/null
Output:
Hello from Decapod 10!
The template engine is flexible and powerful; it was designed specifically for pandoc. Features include
- interpolation,
- partials (for modularization),
- conditionals,
- loops,
- text layouting utils, and
- simple text modifications.
https://pandoc.org/MANUAL#template-syntax
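For readers who know Python: the ${place} interpolation above looks much like the placeholder syntax of the standard library's string.Template. The sketch below reproduces the example's output with that class. It is only an analogy and says nothing about how pandoc's own engine works, which additionally offers the partials, conditionals, and loops listed above.

```python
from string import Template

# Same placeholder syntax as the a.txt template above.
template = Template("Hello from ${place}!")

# Rough analogue of passing -V place='Decapod 10' on the command line.
rendered = template.substitute(place="Decapod 10")
print(rendered)
```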
Page breaks
The pagebreak Lua filter adds support for manual page-break markers in the text with the raw LaTeX \newpage command.
“Translated” by the filter, this marker works with HTML & EPUB, AsciiDoc, Word docx, groff ms, and ConTeXt output.
The filter is available from https://github.com/pandoc-ext/pagebreak.
Quarto and RMarkdown ship with an older version of the pagebreak filter, so there is no need to install it when pandoc is called through either of those. Quarto also supports the {{< pagebreak >}} shortcode syntax.
https://quarto.org/docs/authoring/markdown-basics.html#page-breaks
Markdown ATX-style headings, i.e. those marked by one or more hash signs, should always have a space between marker and text.
# Intro
Pandoc enforces this space by default, but can be “convinced” to accept the space-less variant
#Intro
by disabling the space_in_atx_header extension:
pandoc --from=markdown-space_in_atx_header
Pandoc warns when it cannot convert parts of the input. Use
pandoc --fail-if-warnings
to make sure that all of the input is converted as expected: no output will be produced in case of a warning.
Line block syntax in pandoc’s Markdown is borrowed from reStructuredText: each line is prefixed with a pipe character and a space. It can be used for line-oriented text such as addresses or poems.
| One Ring to rule them all, One Ring to find them,
| One Ring to bring them all and in the darkness bind them.
The main difference to other methods of handling line-oriented contents is that line blocks convert leading spaces into non-breaking spaces, thereby preserving indentation.
https://pandoc.org/MANUAL.html#line-blocks
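What "converting leading spaces into non-breaking spaces" amounts to can be sketched in a few lines of Python. This is an illustration of the idea only, not pandoc's implementation; the function name and the indented second verse line are made up for the example.

```python
NBSP = "\u00a0"  # Unicode no-break space

def preserve_indent(line: str) -> str:
    """Replace leading spaces with no-break spaces; leave the rest untouched."""
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    return NBSP * indent + stripped

# Hypothetical line-block content; the second line is indented by three spaces.
verse = ["One Ring to rule them all,", "   One Ring to find them,"]
converted = [preserve_indent(line) for line in verse]
```

Because no-break spaces are not collapsed the way ordinary leading whitespace is, the indentation survives into the output.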
“Pandoc-discuss” is pandoc’s official mailing list: https://groups.google.com/g/pandoc-discuss. It is the central point of contact for pandoc users and devs; use it for any questions about pandoc.
The web interface is misleading in that it makes it seem like a Google account is required to post. However, it’s enough to send an email to pandoc-discuss+subscribe@googlegroups.com to subscribe to the list with any mail address.
TeX macros in Markdown math
Pandoc Markdown respects \newcommand definitions in math formulæ, independent of the chosen output format. The definition command can be included in Markdown text, e.g.:
\newcommand\rationals{\mathbb{Q}}
We can prove that $\pi\not\in\rationals$.
Output with pandoc -t plain:
We can prove that π ∉ ℚ.
Banish the braces
Do you need Markdown code blocks to be written without braces and dots for the language?
``` {.python}
print("hello")
```
Disable the fenced_code_attributes extension with --to markdown-fenced_code_attributes to show the language without extra characters.
``` python
print("hello")
```
Citations in pandoc’s Markdown can be written with curly braces around the citation key. This allows the use of complex keys, including URLs
@{https://en.wikipedia.org/wiki/pandoc}
and DOI identifiers
@doi:10.5281/zenodo.1038654
.
Underline with reStructuredText
reStructuredText follows the logic that underlined text means that it’s a link, so it doesn’t have dedicated syntax for (non-link) underlined text. Support can be added with this syntax
.. role:: underline
   :class: underline

:underline:`this`
in combination with a short Lua filter
function Span(s)
  -- Turn spans whose first class is "underline" into Underline elements.
  return s.classes[1] == 'underline' and pandoc.Underline(s.content) or nil
end
Versioning
Pandoc follows the “Package Versioning Policy”, a convention used in the Haskell ecosystem. The version number components are EPOCH.MAJOR.MINOR.PATCH; bumps in the major version happen when there are API changes in the Haskell library.
Blog post on fancy Markdown lists and how to write long, interrupted lists without having to count items.
https://tarleb.com/posts/list-continuation/
Spoiler: uses a Lua pandoc filter
Two Markdown tricks to “nest” code blocks, i.e., to use backtick-based code block syntax within a code block:
More backticks for the outer block:
````` markdown
``` r
print('Hello!')
```
`````
Tildes:
~~~ markdown
``` r
print('Bonjour!')
```
~~~
There’s also a third method:
Indent by four spaces
    ``` r
    print('Moin')
    ```
Note though that this last option differs from the other two in that it doesn’t enable syntax highlighting.
EPUB tip: sometimes pandoc doesn’t bundle all resources required for the desired rendering – fonts being a typical example. Include those files as images in placeholder metadata fields:
---
extra-files:
  - '![](my-font.ttf)'
---
This ensures that the files are picked up and packed into the generated EPUB archive.
Starting with pandoc 2.19, reStructuredText-style grid tables can contain advanced table features, including multi-row headers and cells that can span across rows and columns. Grid tables work with Markdown, RST, and Org mode input.
Render references from a BibLaTeX file as Word docx:
pandoc refs.bib --citeproc -o refs.docx
The first ordered list item in pandoc’s Markdown and reStructuredText not only determines the list style, but also which number/letter it starts from. Useful e.g. to continue lists after an intermediate paragraph:
i) one
#) another

Interruption; not part of any list.

iii) continue
#) keep counting
Line numbers in code blocks
Pandoc supports numbering code lines in many output formats, incl. LaTeX, ConTeXt, and HTML (but not docx).
Markdown:
``` lua {.number-lines}
print 'Hello!'
return true
```
Org mode:
#+begin_src lua +n
print 'Please answer!'
return 42
#+end_src
Rendered HTML:
Two lines of Lua code, each preceded by a line number. The code has been syntax highlighted.
Pandoc’s Markdown and reStructuredText allow for fancy lists using different numbering styles (numbers, letters, roman numerals) and delimiters (parens, period, …):
a. alpha
#. bravo
#. charlie
I) primus
#) secundus
#) tertius
⇒
a. alpha
b. bravo
c. charlie
I) primus
II) secundus
III) tertius
Link bare URIs
The autolink_bare_uris extension for Markdown allows omitting the angle brackets that are otherwise required to mark URLs as links. E.g., the input
The manual’s URL is https://pandoc.org/MANUAL.html.
becomes identical to
The manual’s URL is <https://pandoc.org/MANUAL.html>.
as can be seen by running the former through
pandoc --from markdown+autolink_bare_uris --to markdown
The --toc command line option adds a table of contents to the resulting document. The -s or --standalone flag must be set as well to make the TOC show up in plain text formats.
Attributes
Pandoc’s Markdown supports attributes on headings, code, spans, etc. Attributes are enclosed by curly braces and follow this syntax:
{#identifier-value .class-one .class-two some-key="a value"}
Attributes can define an internal cross-ref target and carry a variety of additional information about the element, e.g. language or styling.
peace: [שָׁלוֹם]{lang="he" dir="rtl"}
Format extensions
The handling of formats can be tweaked by enabling or disabling format extensions. Running
pandoc --list-extensions=my_format
lists available extensions and their default status. Extensions are enabled by prefixing them with + and disabled with -:
pandoc --from my_format+enabled_extension-disabled_extension
Example: disable the smart handling of quotes, ellipses, etc in Markdown with
pandoc --from=markdown-smart
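When pandoc is called from a script, the format string can be assembled programmatically. The helper below is a minimal sketch; format_spec is a hypothetical name, and the extension names are simply ones mentioned elsewhere in these tips.

```python
def format_spec(base, enable=(), disable=()):
    """Build a format string such as 'markdown+foo-bar' for --from or --to."""
    enabled = "".join(f"+{ext}" for ext in enable)
    disabled = "".join(f"-{ext}" for ext in disable)
    return base + enabled + disabled

smart_off = format_spec("markdown", disable=["smart"])
mixed = format_spec(
    "markdown",
    enable=["east_asian_line_breaks"],
    disable=["implicit_figures"],
)
print(smart_off)
print(mixed)
```

The first call yields the same string as the example above, ready to be passed as --from=markdown-smart.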
New pandoc filter / quarto extension: run-lua, a filter to evaluate small Lua snippets embedded in the document. E.g.
Built with pandoc <?lua PANDOC_VERSION?>
or with Quarto
Generator: Quarto <?lua quarto.version?>
https://github.com/pandoc-ext/run-lua
Slim Arch install
Arch Linux users probably know that installing pandoc with pacman pulls in a large number of Haskell dependencies. This is due to Arch’s dynamic linking approach. The pandoc-bin package on AUR is an alternative to the default packages; it has a much smaller disk footprint.
Org mode tip: use
#+MACRO: myvar Replacement text
to define macros. Use them with
{{{myvar}}}
Macros with arguments are possible, too:
#+MACRO: greet Hello, $1!
{{{greet(Mike)}}}
Pandoc supports this feature.
Step-by-step guide published by @natalie, detailing a workflow that converts from Scrivener to Word docx with pandoc and Zotero. https://nataliekraneiss.com/wp-content/uploads/2022/11/Steps-from-Scrivener-to-Word-with-Live-Citations.pdf
Reveal.js tip: Adding the r-stretch class to an image produces an image that’s stretched to fill the slide. The image should be the only element on that slide. E.g.
pandoc -t revealjs -s slides.md
with slide content
![Results](graph.png){.r-stretch}
Use the command line option --number-sections or -N to make pandoc generate output with numbered sections. Headings with class “unnumbered” are treated specially and won’t be numbered.
CommonMark allows attributes to be added to all elements when the attributes extension is enabled:
{.fruit}
- apple
- banana
Pandoc doesn’t support attributes on some elements (yet); this includes lists. The attributes are attached to a wrapping div in those cases, so the above is equivalent to this:
::: {.fruit}
- apple
- banana
:::
The extension is enabled by default in “commonmark_x”.
Markdown as a simpler LaTeX
Markdown can be used as a simpler LaTeX, but with the raw power of TeX still available when targeting PDF. TeX commands can be embedded in Markdown; the options --natbib/--biblatex are available for when pandoc’s citation handling via CSL isn’t powerful enough for once.
Exit codes are useful, e.g. when embedding pandoc in wrapper-scripts. Pandoc exiting with code zero signals success, with other numbers indicating the type of error that occurred. For example, exit code 84 hints at a problem in the Lua subsystem. The full list is in the manual: https://pandoc.org/MANUAL.html#exit-codes
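A wrapper script might act on the exit code roughly as follows. This is a sketch under stated assumptions: describe_exit and run_pandoc are hypothetical helper names, only the two codes mentioned above (0 and 84) are interpreted, and everything else is deferred to the manual.

```python
import subprocess

def describe_exit(code: int) -> str:
    """Turn a pandoc exit code into a short message for a wrapper script."""
    if code == 0:
        return "success"
    if code == 84:
        # The only non-zero code named in the tip above.
        return "error 84: problem in the Lua subsystem"
    return f"error {code}: see the exit-codes section of the manual"

def run_pandoc(args: list[str]) -> str:
    """Run pandoc with the given arguments and describe the outcome."""
    result = subprocess.run(["pandoc", *args])
    return describe_exit(result.returncode)
```

Calling run_pandoc(["--version"]) should report success on any machine with pandoc on the PATH.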
Reference links
Markdown has two kinds of links:
[inline links](https://pandoc.org/MANUAL#inline-links)
and
[reference links][1]
[1]: https://pandoc.org/MANUAL#reference-links
The former is the default when producing Markdown, but the latter can be generated by calling pandoc with the --reference-links option. Control the placement of reference lists with --reference-location: either after each block, each section, or at the end of the document.
Markdown tip: put the title of a section in square brackets to create a link to that section.
# Julius Caesar
⋮
"Alea iacta est" is a phrase attributed to [Julius Caesar].
reStructuredText allows a hash symbol to be used instead of a number in ordered list markers:
#. this
#. that
Pandoc’s Markdown supports this syntax as well via the fancy_lists extension; it is enabled by default.
East Asian Line Breaks
Line breaks within paragraphs are usually treated as spaces in Markdown, but this can give bad results when writing in an East Asian language. Using pandoc -f markdown+east_asian_line_breaks solves this: it ensures that line breaks between East Asian wide characters get ignored.
The --no-highlight CLI flag disables pandoc’s syntax highlighting. This comes in handy when external highlighting engines are used, e.g. client-side highlighting with HTML and https://highlightjs.org.
Prevent automatic addition of header-block
Trick to prevent pandoc from adding title, date or author to the document text: unset the title variable by setting it to an empty string
pandoc -V title='' …
The values will still be included in the document’s metadata, but won’t become part of the text. The most common use-case for this is HTML output, where the default header block may be unwanted.
Pandoc’s Markdown comes with special syntax for unnumbered headings, allowing headings to be marked as unnumbered via the attribute string {-}
# Unnumbered heading {-}
abstract-section Lua filter
The abstract-section Lua filter allows writing an abstract as part of the main text instead of having to place it in the YAML metadata. The filter can also be installed and used as a Quarto extension.
Delightful introduction to Lua filters by James Adams. It shows how and why these document modifiers can be useful. https://jmablog.com/post/pandoc-filters/ Most of the article is about plain pandoc, despite the title.
Internationalized Docker images
The default pandoc Docker images ship with support for common languages that use scripts based on the Latin alphabet. Documents in other writing systems generally require custom images. Example image that can be used with Ukrainian documents, adding a font with Cyrillic glyphs:
FROM pandoc/latex
RUN tlmgr install babel-ukrainian
RUN apk --no-cache add font-linux-libertine && fc-cache -f
Build with
docker build -t pandoc-ukr -f Dockerfile .
An image created with the above method can then be used with
docker run … pandoc-ukr --pdf-engine=lualatex …
Example document:
---
title: "Приклад українською"
mainfont: Linux Libertine
lang: uk
---
Цей текст не дуже цікавий.
The reStructuredText reader is unusual in that it switches behavior when pandoc is called with -s or --standalone: in that case, definitions at the start of the document get treated as metadata.
:title: My Document
:author: Jane Doe
:date: 2023-01-05
Without --standalone, that data is treated as part of the main document and parsed as a definition list.
Download a nightly
If you’re a pandoc user who needs a version that includes a recent bug fix, or who wants to test a feature recently added to the development version, download a nightly.
Binaries of the current pandoc development version are created every 24 hours. To access them,
- go to GitHub actions at https://github.com/jgm/pandoc/actions?query=workflow%3ANightly
- click on the latest result
- download an artifact for Linux, macOS, or Windows.
Markdown figures
An image with a description is treated as a figure if it is the only element in a Markdown paragraph:
![Male mandrill](mandrill.jpg)
This can be turned off for individual images by appending a (possibly empty) HTML comment
![Peppers](peppers.jpg)<!-- -->
or globally by disabling the implicit_figures extension:
pandoc --from=markdown-implicit_figures
Three methods to add a comment to pandoc Markdown texts:
- <!-- HTML comment; will show up in HTML output, unless suppressed with `--strip-comments` -->
- ---
  # YAML block with comment.
  # Always removed.
  ---
- ``` {=comment}
  Creative use of raw blocks.
  Included in md output, dropped everywhere else.
  ```
Pandoc produces “flat” HTML output by default, placing all header elements on the same level in the document tree. The --section-divs option enables a hierarchical element structure, wrapping headings and the respective contents in <section> elements:
<section class="level1">
<h1>Intro</h1>
<p>Text</p>
<section class="level2">
<h2>Subsection</h2>
<p>More</p>
</section>
</section>
This adds semantic info and can help with styling.
Syntax-highlighted code files
Custom Lua readers make it possible to support additional formats and other interesting features. Here’s one converting code files to syntax-highlighted docs.
function to_cb (s)
  -- The extension includes the leading dot; strip it to get the language name.
  local _, ext = pandoc.path.split_extension(s.name)
  local lang = ext:gsub('^%.', '')
  return pandoc.Div{
    pandoc.Header(1, s.name == '' and '<stdin>' or s.name),
    pandoc.CodeBlock(s.text, {class=lang}),
  }
end
-- Turn every input source into a heading followed by a code block.
function Reader (input)
  return pandoc.Pandoc(input:map(to_cb))
end
Usage: pandoc -f src.lua -o out.docx
https://pandoc.org/custom-readers
Viewing docs in the terminal
Files of supported formats can be read like Unix man pages in the terminal:
pandoc -s -t man INPUT_FILE | man -l -
Define a shell function for extra convenience:
mandoc () { pandoc -s -t man "$@" | man -l -; }
Voilà, a short and friendly command to read documents:
mandoc letter.docx
Works on Linux and Mac.
Slide themes
The theme variable allows changing the style of a presentation. Personal favorites are --variable=theme:serif for reveal.js and --variable=theme:metropolis for beamer.
See also: https://revealjs.com/themes/ https://hartwork.org/beamer-theme-matrix/
Pandoc 3 introduced Markdown support for wikilinks:
[[pandoc|https://github.com]]
There is no consensus across tools on whether the link title should come before or after the pipe character, so pandoc supports both. Choose by enabling either of the new extensions
+wikilinks_title_after_pipe
or
+wikilinks_title_before_pipe
Works with CommonMark, too: for GitHub wiki input, use
pandoc --from=gfm+wikilinks_title_after_pipe …
Two ways to get a hard linebreak within a paragraph with Markdown:
End the line with two spaces (most portable).
I want a linebreak after this.␠␠
This is a new line.
“Backslash escape” the linebreak (more readable).
I want a linebreak after this.\
This is a new line.
Both methods work with pandoc and any CommonMark processor.
Lesser-known pandoc Markdown extension: definition lists (also called “description lists”)
apple
: a delicious fruit
bad apple
: a metaphor
: graphical "Hello, World!"
Enabled by default in markdown and commonmark_x.
The pandoc-quotes filter by @odin adjusts quotation marks to fit the document language. E.g. running the below
---
lang: es
---
"¡Hola!"
through pandoc -L pandoc-quotes.lua -t plain yields
«¡Hola!»
The filter is available at https://github.com/odkr/pandoc-quotes.lua
Docker images
We maintain a number of Docker images for quick ’n easy access to pandoc, e.g. in CI systems. Types of pandoc/* Docker images:
minimal – very small, just the bare pandoc binary;
core – includes pandoc-crossref and helpers needed for SVG image conversion;
latex – like core, plus TeXLive with all packages required by the default template.
https://hub.docker.com/u/pandoc
The code used to generate the Docker images is available from https://github.com/pandoc/dockerfiles. The accompanying README contains more info as well as usage instructions.
JOSS, the Journal of Open Source Software, is now in the fediverse. The journal is remarkable for multiple reasons, with one reason being that papers are authored in Markdown and processed by pandoc.
Of course, all of the journal’s systems are open source. E.g., the publishing pipeline is available here: https://github.com/openjournals/inara The pipeline currently produces PDF, JATS, and Crossref XML output.
For more info on the JOSS publishing pipeline, esp. the JATS generation, see this article https://www.ncbi.nlm.nih.gov/books/NBK579698/ and the accompanying presentation https://jats.nlm.nih.gov/jats-con/2022/presentations/jatscon22-krewinkel.html#/title-slide
GNU troff is a powerful yet lesser-known typesetting system that can serve as an alternative to LaTeX. It is much faster and a good choice when performance is important. Use it by specifying --pdf-engine=pdfroff or --to=ms when converting to PDF.
https://www.gnu.org/software/groff/
Tagged PDFs via ConTeXt
Pandoc 3.0 comes with a new ConTeXt extension for improved PDF tagging:
pandoc --to=context+tagging -V pdfa -o out.pdf
This will produce “tagged” output, i.e. the PDF file contains structured text that is easier to process by accessibility tools.
ConTeXt output is tagged with and without the new “tagging” extension, but what’s new and important is that the output gets optimized for tagging, resulting in much better results for paragraphs and emphasized text.
Reference docs allow adjusting the style of docx and odt files. Generate a reference doc with
pandoc -o refdoc.docx --print-default-data-file reference.docx
Open the refdoc.docx document and modify the styles in there to your liking. Then pass the modified document to let pandoc use the new styles:
pandoc --reference-doc=refdoc.docx ...
Speaker Notes
Create speaker notes for slides by adding a “notes” div:
# Abbey Road
Here comes the sun
::: notes
And I say it's all right.
:::
Works as expected with pptx. In reveal.js use the speaker view by pressing s. For beamer compile with -V classoption=notes to get slides and notes on one page, or -V classoption=notes=only to get just the notes.
Accessing pandoc from Quarto
Pandoc is shipped as part of Quarto; users of the latter can run any raw pandoc command in the terminal with
quarto pandoc …
Example: find which pandoc version is shipped with quarto by running
quarto pandoc --version
🛡️ Security Advice 🛡️
Filters are small programs with access to the file system. They should be treated with due diligence. Run filters only if their authors and sources are trustworthy.
John MacFarlane, creator of pandoc and coauthor of the CommonMark spec, designed “Djot”. Published in July 2022, Djot is a new markup language that is similar to Markdown, but more principled in many ways. Parse and produce djot files with pandoc via the custom reader and custom writer included in the repository. https://djot.net/
ConTeXt is an advanced typesetting system based on TeX; it’s the preferred system of many professional typesetters. Pandoc can convert to ConTeXt markup with --to=context and generate PDF files via ConTeXt with --pdf-engine=context. The system makes it possible to generate accessible, tagged PDFs, which is otherwise difficult to do:
pandoc --pdf-engine=context -V pdfa=3a ...
ConTeXt in the fediverse: @context@fosstodon.org.
No-break abbreviations
Markdown tip: pandoc prevents line breaks after certain abbreviations. For example, in order to treat Oct. 31 as a single word, pandoc replaces the normal space character with a no-break space.
To view the complete list of default abbreviations which use no-break spaces, use
pandoc --print-default-data-file=abbreviations
If you want to set custom abbreviations, create a file listing them and pass it to pandoc with
pandoc --abbreviations=...
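The effect of the abbreviation list can be imitated in a few lines of Python. This is a rough sketch of the idea, not pandoc's parser: the three-entry set is a made-up stand-in for the real data file, and protect_abbreviations is a hypothetical name.

```python
NBSP = "\u00a0"  # Unicode no-break space

# Tiny, hypothetical abbreviation set; pandoc's default list is much longer.
ABBREVIATIONS = {"Oct.", "Mr.", "Dr."}

def protect_abbreviations(text: str) -> str:
    """Join each known abbreviation to the following word with a no-break space."""
    words = text.split(" ")
    out = []
    for i, word in enumerate(words):
        out.append(word)
        if i < len(words) - 1:
            # Use a no-break space only directly after an abbreviation.
            out.append(NBSP if word in ABBREVIATIONS else " ")
    return "".join(out)

result = protect_abbreviations("Due on Oct. 31 at noon")
```

Only the space after "Oct." changes; all other spaces stay breakable.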
One more for reveal.js: set a slide’s background image with the background attribute. Example using pandoc’s Markdown:
# End {background=sunset.jpg}
Alternatively, set a color with
# 🚀 {background-color=black}
Default input files
Defaults files can fix the set of input files, as well as their order. Example: process the files foo.md and bar.md in this order by writing
input-files:
- foo.md
- bar.md
to all.yaml. Then use it by calling pandoc with
pandoc -d all
Syntax highlighting
Pandoc performs syntax highlighting for 148 different markup or programming languages. To see the complete list, use pandoc --list-highlight-languages. Not every language is supported by default; to add a language yourself,
- check https://kate-editor.org/syntax/
- download the definition
- pass it to pandoc via the --syntax-definition option.
Zettlr
Zettlr is an open source “Markdown editor for the 21st century” that comes with pandoc included. It works on all major platforms, plays well with reference managers, supports the “zettelkasten” system for note taking, and more.
https://zettlr.com
Patat is a fun (and very nerdy) program that uses pandoc as a library. It runs presentations in an ANSI terminal and accepts all pandoc-supported formats as input. https://github.com/jaspervdj/patat/#patat
Ordered lists are numbered automatically in most lightweight markup languages, including Markdown, Org mode, and reStructuredText; the generated item numbering in the output is identical for both lists below:
1. one
2. two
3. three
and
1. one
1. two
1. three
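The numbering rule can be sketched in Python: only the first marker matters, and every later item just counts up from it. The function below is an illustration with a hypothetical name; it handles decimal markers followed by a period and nothing else.

```python
def renumber(markers: list[str]) -> list[str]:
    """Number items consecutively, starting from the first marker's value."""
    start = int(markers[0].rstrip("."))
    return [f"{start + i}." for i in range(len(markers))]

first = renumber(["1.", "2.", "3."])
second = renumber(["1.", "1.", "1."])
print(first == second)
```

Both example lists come out with the same markers, which is why the two inputs above render identically.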
Pandoc Markdown tip: “backslash-escape” spaces to make them non-breaking. E.g.: J.\ R.\ R.\ Tolkien or 128\ cm. This prevents the surrounding words from ending up on different lines.
Pandoc supports many extensions for CommonMark; pandoc version 2.10.1 introduced the “commonmark_x” format, which is the standardized CommonMark plus a default set of useful extensions. This includes support for YAML blocks, fancy lists, syntax for divs and spans, and many other extensions that are part of pandoc’s Markdown.
commonmark_x is almost a drop-in replacement. Citations are the exception; they haven’t been implemented yet.
One can document the access date of a referenced URL with the urldate key in BibLaTeX or the accessed field in CSL JSON. E.g. in BibLaTeX:
@online{pandoc,
  title = {Pandoc manual},
  url = {https://pandoc.org/MANUAL},
  date = {2022},
  urldate = {2023-01-03}
}
Not all citation styles make use of this info, but “author-date” styles do. The Zotero CSL style database can be filtered for those; pick the one you want here: https://www.zotero.org/styles?format=author-date
Default image extension
If you generate figures in multiple formats and want to choose the format best suited for the target, the --default-image-extension command line parameter lets you set the desired format. For example, if you export an image as both pdf and svg, then you can write this Markdown ![caption](my-figure) and run pandoc with pandoc --to=html --default-image-extension=svg or pandoc --to=latex --default-image-extension=pdf.
Pretty URLs
Just out: “pretty-urls”, a tiny Lua filter that “prettifies” bare URLs by removing the protocol prefix; i.e., it drops the https:// from the link text while leaving the actual link unchanged.
https://github.com/pandoc-ext/pretty-urls
Also usable as a Quarto extension.
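The core transformation is small enough to sketch in Python. The real filter is written in Lua and may differ in its details; in particular, also stripping http:// is an assumption made here, and prettify is a hypothetical name.

```python
import re

def prettify(url: str) -> str:
    """Return the link text for a bare URL: the URL without its protocol prefix."""
    return re.sub(r"^https?://", "", url)

url = "https://github.com/pandoc-ext/pretty-urls"
link_text = prettify(url)  # text shown to the reader
link_target = url          # the actual link stays unchanged
print(link_text)
```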
The names Markdown and CommonMark are often used interchangeably. The latter refers to the formal specification published in 2014, resolving the syntax ambiguities of the original Markdown spec. CommonMark and “pandoc’s Markdown” differ in subtle ways, which is why pandoc has an extra CommonMark parser:
pandoc --from=commonmark
https://commonmark.org/
Underline
Underline text in different lightweight markup languages:
Emacs Org mode:
_this_
DokuWiki:
__this__
Textile, Jira:
+this+
There is no Markdown syntax for underlined text, but pandoc’s Markdown reader treats the content of spans with class “underline” or “ul” as underlined:
[important]{.underline}
[nota bene]{.ul}
Pandoc Lua filter enforcing all external links to be opened in a new tab:
function Link (link)
  -- Only links with an http(s) target count as external.
  if link.target:match '^https?%:' then
    link.attributes.target = '_blank'
    return link
  end
end
Video by Craig Parker on using pandoc to create both HTML and PDF output from a large corpus of documentation. In his words: “I tried to make it exciting, for the record, but things don’t really start happening until almost the end (5:30-ish).” https://youtu.be/_eFQsRNbCKE
The “csquotes” LaTeX package provides advanced facilities for inline and display quotations. The package has excellent support for different languages, too. Set the csquotes variable to make pandoc use this package when rendering quotes. For best results, set the lang variable as well. E.g., the first output below was created with pandoc -o a.pdf and the second with pandoc -o a.pdf -V csquotes.
The parser for GitHub Flavored Markdown (gfm) is based on CommonMark as well.
pandoc --from=gfm ...
This is the format GitHub uses to render README files; it should be used when producing such files from different formats:
pandoc --to=gfm --output README.md ...
GitHub maintained a spec for their format, but the spec is not kept up to date, unfortunately.
https://github.github.com/gfm/
Page margins
Control page margins in PDFs via the geometry variable, e.g. in YAML
---
geometry: margin=2cm
---
or on the command line
pandoc -V geometry=left=3cm,right=4cm
Pandoc ships with 8 syntax highlighting styles: pygments (the default), tango, espresso, zenburn, kate, monochrome, breezedark, and haddock. Set --highlight-style to one of these values to vary code coloring etc. Not enough? Pandoc can also use KDE highlighting themes. There’s a list at https://kate-editor.org/themes/; theme files are available from https://github.com/KDE/syntax-highlighting/tree/master/data/themes. Pass the downloaded file as highlight style.
The Eisvogel template is a clean and beautiful pandoc LaTeX template. https://github.com/Wandmalfarbe/pandoc-latex-template
Markdown tip: inline code can be delimited by an arbitrary number of backticks, which is helpful if the code itself contains one or more backticks.
``const url = `this/${id}`;``{.typescript}
Leading and trailing spaces in code are ignored:
just a single, verbatim backtick: `` ` ``
Smart quotes in Markdown and CommonMark work with multi-paragraph quotations, where only the last paragraph has a closing quotation mark. This style is used primarily in English prose.
"Enough work for today.
"Would you like some tea?"
“Defaults files” are a convenient tool to store, group, and combine related command line options. E.g., instead of typing
pandoc --pdf-engine=xelatex -V csquotes
one could create a file pdf.yaml with
pdf-engine: xelatex
variables: {csquotes: true}
and use it with
pandoc -d pdf.yaml
https://pandoc.org/MANUAL#defaults-files
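The steps above can be scripted. The following sketch writes the defaults file with a heredoc and combines it with a second one; german.yaml and paper.md are hypothetical names, and passing -d more than once to combine files reflects my understanding of the option rather than something stated above. Building the PDF additionally assumes a LaTeX installation with xelatex and the csquotes package.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Store the options from the example above in a defaults file.
cat > pdf.yaml <<'EOF'
pdf-engine: xelatex
variables:
  csquotes: true
EOF

# A second, hypothetical defaults file holding language settings.
cat > german.yaml <<'EOF'
metadata:
  lang: de-DE
EOF

# Hypothetical input document.
printf '%s\n' 'Sie sagte: "Guten Morgen".' > paper.md

# Combine both defaults files; skipped unless pandoc and xelatex are installed.
if command -v pandoc >/dev/null 2>&1 && command -v xelatex >/dev/null 2>&1; then
  pandoc -d pdf.yaml -d german.yaml paper.md -o paper.pdf
fi
```

Keeping format-specific and language-specific settings in separate files makes it easy to reuse each of them across projects.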
Parsing embedded LaTeX
There are situations where our primary target is LaTeX (for PDF), but we still need decent output in other formats. The Lua filter linked below parses all raw LaTeX snippets when converting to other formats, so even a carefully crafted LaTeX table embedded into Markdown will show up in HTML output. https://github.com/tarleb/parse-latex
Vector graphic rasterization
Dissatisfied with the resolution of images in the pandoc-generated output? Some formats require the conversion of vector graphics to raster images, a process that can be controlled with the --dpi option. The default is 96; set the parameter to a higher value to get images with higher resolutions.
pandoc --dpi=300 ...
Presentation slides
Pandoc supports multiple slide formats, thereby enabling quick-and-easy creation of presentations. The most popular choices are probably --to=beamer for mathy and content-heavy slides, --to=revealjs for less-formal talks, and --to=pptx (PowerPoint) when the company requires it.
Appending {target=_blank} to a pandoc Markdown link ensures that a new tab will be opened when the link is clicked.
[manual](https://pandoc.org/MANUAL){target="_blank"}
Add the below to the document’s header-includes to enable this behavior for all links:
<base target="_blank" />
Do you maintain one big BibLaTeX database? Get the subset of just those entries required for an article with
pandoc -L getbib.lua paper.md -t biblatex -o paper.bib
where getbib.lua contains
function Pandoc (doc)
  doc.meta.references = pandoc.utils.references(doc)
  doc.meta.bibliography = nil
  return doc
end
Divs with class “block” that wrap a heading get turned into boxes on Beamer slides. The additional classes “example” and “alert” allow for alternative coloring. Shown here for --slide-level=2 and with theme “Frankfurt”.
::: {.block}
### A block
Try to block it out!
:::
::: {.block .example}
### Example
Demo
:::
::: {.block .alert}
### Heads up!
Watch out.
:::
Pandoc is highly customizable. E.g., the binary includes a Lua interpreter that can be used to programmatically modify the internal document representation (AST) before the output is generated. See the Lua filters documentation for details and examples.
Organizing large documents
Large documents are easier to handle when split into smaller files. Pandoc will treat multiple input files as if they were one long file.
pandoc -o out.pdf intro.md methods.md
For extra convenience, define the order via filename prefixes, e.g. 01-intro.md, 02-methods.md, and use the shell’s globbing feature:
pandoc -o out.pdf *.md
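The ordering relies on the shell expanding *.md in sorted order. The sketch below demonstrates this with three hypothetical chapter files created in a scratch directory, deliberately in the wrong order; it prints the command line that pandoc would effectively receive and, if pandoc is installed, builds an HTML file from the chapters.

```shell
# Create numbered chapter files (hypothetical names) in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '# Results\n' > 10-results.md
printf '# Methods\n' > 02-methods.md
printf '# Intro\n' > 01-intro.md

# The glob is expanded in sorted order, regardless of creation order.
set -- *.md
file_order="$*"
echo "pandoc -o out.pdf $file_order"
# prints: pandoc -o out.pdf 01-intro.md 02-methods.md 10-results.md

# Build a single HTML document from all chapters (skipped if pandoc is missing).
if command -v pandoc >/dev/null 2>&1; then
  pandoc -o out.html "$@"
fi
```

Zero-padding the prefixes matters: without it, 10-results.md would sort before 2-methods.md.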
Pandoc filter and Quarto extension: multibib, a tool to add multiple bibliographies to a document. Feedback, contributions, and feature requests are all welcome. https://github.com/pandoc-ext/multibib
Pandoc produces “flat” HTML output by default, placing all heading elements on the same level in the document tree. The --section-divs option enables a hierarchical element structure, wrapping headings and the respective contents in <section> elements:
<section class="level1">
<h1>Intro</h1>
<p>Text</p>
<section class="level2">
<h2>Subsection</h2>
<p>More</p>
</section>
</section>
This adds semantic info and can help with styling.
Headers to lists in Org mode
When converting Emacs Org mode documents, be aware that pandoc respects many org export options and uses the same defaults. E.g., a common source of confusion is that 4th level headers become paragraphs by default. It happens in Emacs too, but can be prevented by adding
#+OPTIONS: H:9
to the top of the org document.
Markdown Emoji extension
The emoji extension gives simple access to a large set of emojis via the :smile: syntax.
:page_facing_up: :arrow_right: :mage: :arrow_right: :book: :sparkles:
⇒
📄 ➡️ 🧙 ➡️ 📖 ✨
The extension is enabled by default in the gfm and commonmark_x input formats.
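For pandoc’s own Markdown the extension has to be switched on explicitly by appending +emoji to the format name. A minimal sketch, assuming pandoc is on the PATH and reusing the sample line from above:

```shell
# Emoji shortcodes taken from the example above.
input=':page_facing_up: :arrow_right: :mage: :arrow_right: :book: :sparkles:'

# Enable the emoji extension for pandoc's Markdown and render as plain text
# (skipped if pandoc is missing).
if command -v pandoc >/dev/null 2>&1; then
  printf '%s\n' "$input" | pandoc --from=markdown+emoji --to=plain
fi
```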