pandoc

Universal markup converter



Statistics on pandoc

Number of watchers on Github 11041
Number of open issues 452
Average time to close an issue 2 days
Main language Haskell
Average time to merge a PR 2 days
Open pull requests 106+
Closed pull requests 51+
Last commit 8 months ago
Repo Created over 8 years ago
Repo Last Updated 8 months ago
Size 29.7 MB
Homepage http://johnmacfar...
Organization / Author jgm
Latest Release 2.1.2
Contributors 115

Pandoc

The universal markup converter

Pandoc is a [Haskell](https://www.haskell.org) library for converting from one markup format to another, and a command-line tool that uses this library. Pandoc can read [Markdown](http://daringfireball.net/projects/markdown/), [CommonMark](http://commonmark.org), [PHP Markdown Extra](https://michelf.ca/projects/php-markdown/extra/), [GitHub-Flavored Markdown](https://help.github.com/articles/github-flavored-markdown/), [MultiMarkdown](http://fletcherpenney.net/multimarkdown/), and (subsets of) [Textile](http://redcloth.org/textile), [reStructuredText](http://docutils.sourceforge.net/docs/ref/rst/introduction.html), [HTML](http://www.w3.org/html/), [LaTeX](http://latex-project.org), [MediaWiki markup](https://www.mediawiki.org/wiki/Help:Formatting), [TWiki markup](http://twiki.org/cgi-bin/view/TWiki/TextFormattingRules), [TikiWiki markup](https://doc.tiki.org/Wiki-Syntax-Text#The_Markup_Language_Wiki-Syntax), [Creole 1.0](http://www.wikicreole.org/wiki/Creole1.0), [Haddock markup](https://www.haskell.org/haddock/doc/html/ch03s08.html), [OPML](http://dev.opml.org/spec2.html), [Emacs Org mode](http://orgmode.org), [DocBook](http://docbook.org), [JATS](https://jats.nlm.nih.gov), [Muse](https://amusewiki.org/library/manual), [txt2tags](http://txt2tags.org), [Vimwiki](https://vimwiki.github.io), [EPUB](http://idpf.org/epub), [ODT](http://en.wikipedia.org/wiki/OpenDocument), and [Word docx](https://en.wikipedia.org/wiki/Office_Open_XML). 
Pandoc can write plain text, [Markdown](http://daringfireball.net/projects/markdown/), [CommonMark](http://commonmark.org), [PHP Markdown Extra](https://michelf.ca/projects/php-markdown/extra/), [GitHub-Flavored Markdown](https://help.github.com/articles/github-flavored-markdown/), [MultiMarkdown](http://fletcherpenney.net/multimarkdown/), [reStructuredText](http://docutils.sourceforge.net/docs/ref/rst/introduction.html), [XHTML](http://www.w3.org/TR/xhtml1/), [HTML5](http://www.w3.org/TR/html5/), [LaTeX](http://latex-project.org) (including [`beamer`](https://ctan.org/pkg/beamer) slide shows), [ConTeXt](http://www.contextgarden.net/), [RTF](http://en.wikipedia.org/wiki/Rich_Text_Format), [OPML](http://dev.opml.org/spec2.html), [DocBook](http://docbook.org), [JATS](https://jats.nlm.nih.gov), [OpenDocument](http://opendocument.xml.org), [ODT](http://en.wikipedia.org/wiki/OpenDocument), [Word docx](https://en.wikipedia.org/wiki/Office_Open_XML), [GNU Texinfo](http://www.gnu.org/software/texinfo/), [MediaWiki markup](https://www.mediawiki.org/wiki/Help:Formatting), [DokuWiki markup](https://www.dokuwiki.org/dokuwiki), [ZimWiki markup](http://zim-wiki.org/manual/Help/Wiki_Syntax.html), [Haddock markup](https://www.haskell.org/haddock/doc/html/ch03s08.html), [EPUB](http://idpf.org/epub) (v2 or v3), [FictionBook2](http://www.fictionbook.org/index.php/Eng:XML_Schema_Fictionbook_2.1), [Textile](http://redcloth.org/textile), [groff man](http://man7.org/linux/man-pages/man7/groff_man.7.html), [groff ms](http://man7.org/linux/man-pages/man7/groff_ms.7.html), [Emacs Org mode](http://orgmode.org), [AsciiDoc](http://www.methods.co.nz/asciidoc/), [InDesign ICML](http://wwwimages.adobe.com/www.adobe.com/content/dam/acom/en/devnet/indesign/sdk/cs6/idml/idml-cookbook.pdf), [TEI Simple](https://github.com/TEIC/TEI-Simple), [Muse](https://amusewiki.org/library/manual), [PowerPoint](https://en.wikipedia.org/wiki/Microsoft_PowerPoint) slide shows and 
[Slidy](http://www.w3.org/Talks/Tools/Slidy/), [Slideous](http://goessner.net/articles/slideous/), [DZSlides](http://paulrouget.com/dzslides/), [reveal.js](http://lab.hakim.se/reveal-js/) or [S5](http://meyerweb.com/eric/tools/s5/) HTML slide shows. It can also produce [PDF](https://www.adobe.com/pdf/) output on systems where LaTeX, ConTeXt, `pdfroff`, `wkhtmltopdf`, `prince`, or `weasyprint` is installed.

Pandoc's enhanced version of Markdown includes syntax for tables, definition lists, metadata blocks, `Div` blocks, footnotes and citations, embedded LaTeX (including math), Markdown inside HTML block elements, and much more. These enhancements, described further under Pandoc's Markdown, can be disabled using the `markdown_strict` format.

Pandoc has a modular design: it consists of a set of readers, which parse text in a given format and produce a native representation of the document (like an *abstract syntax tree* or AST), and a set of writers, which convert this native representation into a target format. Thus, adding an input or output format requires only adding a reader or writer. Users can also run custom [pandoc filters](http://pandoc.org/filters.html) to modify the intermediate AST.

Because pandoc's intermediate representation of a document is less expressive than many of the formats it converts between, one should not expect perfect conversions between every format and every other. Pandoc attempts to preserve the structural elements of a document, but not formatting details such as margin size. And some document elements, such as complex tables, may not fit into pandoc's simple document model. While conversions from pandoc's Markdown to all formats aspire to be perfect, conversions from formats more expressive than pandoc's Markdown can be expected to be lossy.
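The reader, AST, writer split described above can be illustrated with a toy model. This is only a sketch in Python, not pandoc's Haskell code: the `Header` and `Para` types, the `READERS` and `WRITERS` tables, and the `convert` function are all hypothetical names chosen for illustration, and the toy writers do no escaping.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Union


@dataclass
class Header:
    level: int
    text: str


@dataclass
class Para:
    text: str


Block = Union[Header, Para]


def read_markdown(src: str) -> List[Block]:
    """Toy reader: '#'-prefixed lines become headers, other non-empty lines paragraphs."""
    blocks: List[Block] = []
    for line in src.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            level = len(line) - len(line.lstrip("#"))
            blocks.append(Header(level, line[level:].strip()))
        else:
            blocks.append(Para(line))
    return blocks


def write_html(blocks: Sequence[Block]) -> str:
    """Toy writer: render the AST as HTML (no escaping, for brevity)."""
    out = []
    for b in blocks:
        if isinstance(b, Header):
            out.append(f"<h{b.level}>{b.text}</h{b.level}>")
        else:
            out.append(f"<p>{b.text}</p>")
    return "\n".join(out)


def write_plain(blocks: Sequence[Block]) -> str:
    """Toy writer: render the same AST as plain text."""
    return "\n\n".join(
        b.text.upper() if isinstance(b, Header) else b.text for b in blocks
    )


# Adding a format means adding one entry to one of these tables.
READERS: Dict[str, Callable[[str], List[Block]]] = {"markdown": read_markdown}
WRITERS: Dict[str, Callable[[Sequence[Block]], str]] = {
    "html": write_html,
    "plain": write_plain,
}


def convert(src: str, from_fmt: str, to_fmt: str, filters=()) -> str:
    """Read into the AST, apply filters to the AST, then write."""
    ast = READERS[from_fmt](src)
    for f in filters:
        ast = f(ast)
    return WRITERS[to_fmt](ast)
```

Because every writer consumes the same AST, a filter written once (for example, one that demotes every header by a level) applies to all output formats.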

Installing

Here's how to install pandoc.

Documentation

Pandoc's website contains a full User's Guide. It is also available here as pandoc-flavored Markdown. The website also contains some examples of the use of pandoc and a limited online demo.

Contributing

Pull requests, bug reports, and feature requests are welcome. Please make sure to read the contributor guidelines before opening a new issue.

License

© 2006-2018 John MacFarlane (jgm@berkeley.edu). Released under the GPL, version 2 or greater. This software carries no warranty of any kind. (See COPYRIGHT for full copyright and warranty notices.)

pandoc open issues
  • about 2 years [odt] no support for multiple text modifiers
  • about 2 years Error in $: mempty
  • about 2 years Option to convert to PDF/A
  • about 2 years Use data-dir as a default location to search for pandoc-citeproc files
  • about 2 years Windows: Invalid UTF-8 stream when using Pandoc as a filter
  • about 2 years Fenced code blocks in footnotes
  • about 2 years underscore not recognized as emphasis
  • about 2 years Footnote numbering is wrong in inserted Titleblock content.
  • about 2 years top-level-division not replacing chapters with sections in reports
  • about 2 years LaTeX reader should optionally capture comments as raw TeX, like Markdown does with raw HTML comments
  • about 2 years Beamer overlays (\only, \uncover, etc)
  • about 2 years Visible <div> tags in writePlain
  • about 2 years Better handling of implicit figures
  • about 2 years Pipe Tables written by pandoc cannot be written as pipe tables again
  • about 2 years Reduce memory usage
  • about 2 years Feature Request: MathJax-node in Pandoc
  • about 2 years Change back arrow in epub footnotes
  • about 2 years Docx reader: parse bidirectional (left-to-right and right-to-left) text
  • about 2 years Syntax to force md parsing
  • about 2 years Command-line options -c and -H override metadata css and header-includes
  • about 2 years `header-includes` cannot be used as template fragments
  • about 2 years paths using a tilde prefix not discovered with --self-contained
  • about 2 years Weird interaction between definition lists and tables
  • about 2 years Embedded html
  • about 2 years Code block in notes div causes compilation error with beamer
  • about 2 years Multiple metadata blocks
  • about 2 years Commented out \begin and \end commands error
  • about 2 years Conversion of tex -> html does not preserve figure references
  • about 2 years docx writer: Option to customize (or remove) title/author block
  • about 2 years ReStructured Text tables messed up converting to tex/docx/odt.
pandoc open pull requests
  • Update LaTeX writer to align images
  • Implement a Mallard Reader.
  • LaTeX writer: figure label
  • Zim writer: initial implementation
  • Add support for alt text as short title in latex
  • Ikiwiki wikilinks
  • Option to preserve empty paragraphs in Docx Reader
  • Consistent underline for Readers
  • Adding SVG support for PDF & DOCX
  • Expanded description of description metadata field
  • Note on non-working YAML export to docx for author
  • Work in progress: Asciidoc reader
  • MediaWiki reader: Fix quotation mark parsing
  • Drawer capability for the Org Writer
  • Fixes #661: --smart must also convert << to « and >> to »
  • Markdown writer: Add --bullet-list-marker argument option.
  • Adding fenced div blocks to markdown reader
  • Add --prefer-fenced-code-blocks option
  • Add "--revealjs-title-content" to flags
  • export more Html functions and use StateT monad transformer instead of S...
  • Add --parts command line option to LaTeX writer.
  • Overhaul line blocks.
  • Implement figure numbering for HTML output
  • pandoc.hs: Add --include-early argument option
  • Add support for pagebreaks for LaTeX input and HTML output.
  • Option for settings listings code block environment name
  • Added bibliography support to Pandoc.PDF
  • Command line option to specify that DocBook should use recursive <section> elements
  • Add image caption support for ODT
  • Exposes parsers in the API (issue #418)
  • Open Document writer: set first level of blockquotes to not use indent
  • Context line blocks
  • LaTeX Writer: fix polyglossia to babel env mapping
  • LaTeX writer: Add missing languages.
  • Fixes #1762
  • Org reader: support org-ref style citations
  • Cleanup and compact org-related changelog entries
  • Documentation updates
  • Textile reader: Added raw link parsing.
  • Don't equate "Latin letters" with ASCII letters
  • Linebreak handling control
  • LaTeX reader: drop duplicate `*` in bibtexKeyChars
  • synchronize spacing of footnotes in help output
  • Request for comments: LineBlock element
  • Put note on structured vars in separate paragraph
  • Add \begin{html} environment to LaTeX reader
  • Remove Compat.Monoid
  • LaTeX: Do not set [htbp] figure placement options.
  • Translate NARROW NO-BREAK SPACE into LaTeX.
  • Support `yaml_metadata_block` extension in more formats
  • Use the markdown version of COPYING from GNU
  • Build README.md from the MANUAL.txt and README.md.in
  • [ODT Parser] Include list's starting value
  • Add command line option allowing to set type of top-level divisions
  • Markdown Reader: add attributes for autolink
  • [tex] Positioning of figures
  • [odt] Infer table's caption from the style's name
  • [tex] No invalid inlines in sections' options
  • Using all features supported by `.github/`
  • Small caps in Bracketed Spans
  • travis: use language generic
  • Add pagebreaks to Pandoc
  • Rearrange and extend badges in README
  • Text/Pandoc/Writers/PsuedoPod - implementation to merge, with tests
  • Put content of \ref commands into span…
  • Add basic \textcolor support to LaTeX reader
  • organize README
  • [WIP] Add Muse reader
  • Improve SVG image size code
  • Add Support for `glossaries` and `acronym` package
  • Support LaTeX \figurename, \tablename and \lstlistingname
  • [WIP] Add colspan/rowspan support
  • EPUB: improve SVG image output, closes #2766
  • Org: Fix reading of citations before punctuation
  • Added support for textcquotes in LaTeX Reader
  • Implemented #168
  • Docx writer: make more deterministic to facilitate testing
  • MANUAL.txt: self-contained implies standalone
  • RFC: Verbose docx
  • ConTeXt writer: attempt to generate PDF/A
  • Add image class to HTML figure. Closes #3928.
  • Add beamer option to show notes on second screen
  • WIP: colspan/rowspan
  • Issue #1800 Add XWiki Support
  • Stop using overlapping instances
  • Get ready for `Semigroup` as `Monoid` superclass.
  • Endnotes
  • LaTeX reader: ignore \noindent and flush(left|right) environments.
  • Enable TOC display for DOCX in YAML metadata
  • WIP: add header colours
  • [WIP] Parse siunitx num command
  • Shared.hierarchicalize: fix the "header-in-div" bug (#997)
  • Provide `bidi` package's option using `\PassOptionsToPackage`.
  • Remove extraneous, significant whitespace in JATS writer output
  • strip spaces from inline contents when writing RST, closes #4327
  • 4320
  • Add instructions for background images reveal.js
  • Add support to parse unit string of \SI command
  • [latex] forced linebreaks for fenced code using
  • Add option to specify table env. in LaTeX output
  • LaTeX writer: Update \lstinline delimiters.
  • Add support for latex mintinline
  • Add -V beameroption variable
  • RST writer: refactor separating inline transformation logic from writ…
  • 4434
  • Remove rawInlineOr because `\ref` and `\label` should be parsed in any case even with raw_tex extension
pandoc questions on Stackoverflow
  • Generating a docx file using Pandoc: images missing! Due to multiple requests?
  • Converting webpage to pdf with pandoc
  • Using Pandoc with Swift
  • Pandoc-mode ignoring settings file?
  • correctly sizing PNG images in markdown with pandoc for html/pdf/docx
  • Rmarkdown - Pandoc version issues while rendering R markdown in Linux environment
  • Why Pandoc does not retrieve the image file?
  • rmarkdown: pandoc: pdflatex not found
  • Pandoc: HTML-to-Markdown--can I replace elements using templates or scripts?
  • From Markdown to PDF: how to change the font-size with Pandoc?
  • Relative system path to miktex and pandoc - Shiny Application packaged as Windows desktop app
  • Use a specific class depending on bullet with Markdown/Pandoc
  • How to use latex macros with pandoc?
  • How should I bundle pandoc with my OpenShift application?
  • Pandoc: Is there a way to include an appendix of links in a PDF from markdown?
  • Is it possible to write conditional statements in a pandoc template based on variable values?
  • pandoc document conversion failed with error 127
  • How to specify the font used for word doc exported using pandoc?
  • pandoc-generated docx misses italic variables in equations
  • markdown -> pandoc -> PDF | chokes on rendering tables
  • Error: pandoc document conversion failed with error 127
  • Can't pass arguments to pandoc through rmarkdown
  • Pandoc markdown: default colour for inline verbatim
  • Using pandoc as a library to make a PDF
  • SublimeText3 + pandown + pandoc: includes_paths not working
  • Atom and Pandoc
  • How do I make a reference to a figure in markdown using pandoc?
  • Blank page at start of pdf created via pandoc
  • Convert .odt to .docx in pandoc
  • Pandoc: get array size
pandoc latest release notes
pandoc 2.1.2
  • Markdown reader:

    • Fix parsing bug with nested fenced divs (#4281). Previously we allowed nonindent spaces before the opening and closing :::, but this interfered with list parsing, so now we require the fences to be flush with the margin of the containing block.
  • Commonmark reader:

    • raw_html is now on by default. It can be disabled explicitly using -f commonmark-raw_html.
  • Org reader (Albert Krewinkel):

    • Move citation tests to separate module.
    • Allow changing emphasis syntax (#4378). The characters allowed before and after emphasis can be configured via #+pandoc-emphasis-pre and #+pandoc-emphasis-post, respectively. This makes it possible to change which strings are recognized as emphasized text on a per-document or even per-paragraph basis. Example:

      #+pandoc-emphasis-pre: "-\t ('\"{"
      #+pandoc-emphasis-post: "-\t\n .,:!?;'\")}["
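
To make the role of the two character sets concrete, here is a simplified Python model of border-based emphasis matching. It is a sketch under stated assumptions, not the Org reader's actual algorithm: `find_bold` is a hypothetical helper, it handles only `*...*` spans, and the defaults simply reuse the two strings from the example above.

```python
# Character sets copied from the example directives above.
PRE = "-\t ('\"{"
POST = "-\t\n .,:!?;'\")}["


def find_bold(text, pre=PRE, post=POST):
    """Return the contents of *...* spans whose neighbours are permitted.

    The opening '*' must be at the start of the text or follow a `pre`
    character; the closing '*' must be at the end or precede a `post`
    character. The span body may not start or end with whitespace.
    """
    found = []
    i = 0
    while i < len(text):
        if text[i] == "*" and (i == 0 or text[i - 1] in pre):
            j = text.find("*", i + 2)  # require at least one body character
            while j != -1:
                body = text[i + 1:j]
                after_ok = j + 1 == len(text) or text[j + 1] in post
                if after_ok and not body[0].isspace() and not body[-1].isspace():
                    found.append(body)
                    i = j  # resume scanning after the closing marker
                    break
                j = text.find("*", j + 1)
        i += 1
    return found
```

With the default sets, `x*b* c` yields no emphasis because `x` is not an allowed preceding character; passing `pre=PRE + "x"` changes that, which is the kind of per-document adjustment the directives allow.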
      
  • LaTeX reader:

    • Fixed comments inside citations (#4374).
    • Fix regression in package options including underscore (#4424).
    • Make --trace work.
    • Fixed parsing of tabular* environment (#4279).
  • RST reader:

    • Fix regression in parsing of headers with trailing space (#4280).
  • Muse reader (Alexander Krotov):

    • Enable <literal> tags even if amuse extension is enabled. Amusewiki disables tags for security reasons. If a user wants similar behavior in pandoc, RawBlocks and RawInlines can be removed or replaced with filters.
    • Remove space prefix from <literal> tag contents.
    • Do not consume whitespace while looking for closing end tag.
    • Convert alphabetical list markers to decimal in round-trip test. Alphabetical lists are an addition of Text::Amuse. They are not present in Emacs Muse and can be ambiguous when a list starts with i., c. etc.
    • Allow <quote> and other tags to be indented.
    • Allow single colon in definition list term.
    • Fix parsing of verse in lists.
    • Improved parsing efficiency. Avoid parseFromString. Lists are parsed in linear instead of exponential time now.
    • Replace ParserState with MuseState.
    • Prioritize lists with roman numerals over alphabetical lists. This is to make sure i. starts a roman numbered list, instead of a list with letter i (followed by j, k, ...).
    • Fix directive parsing.
    • Parse definition lists with multiple descriptions.
    • Parse next list item before parsing more item contents.
    • Fixed a bug: headers did not terminate lists.
    • Move indentation parsing from definitionListItem to definitionList.
    • Paragraph indentation does not indicate nested quote. Muse allows indentation to indicate quotation or alignment, but only on the top level, not within a <quote> or list.
    • Require that block tags are on separate lines. Text::Amuse already explicitly requires it anyway.
    • Fix matching of closing inline tags.
    • Various internal changes.
    • Fix parsing of nested definition lists.
    • Require only one space for nested definition list indentation.
    • Do not remove trailing whitespace from <code>.
    • Fix parsing of trailing whitespace. Newline after whitespace now results in softbreak instead of space.
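
The roman-before-alphabetical priority mentioned in the Muse reader notes can be sketched as an ordered sequence of checks. The Python below is a hypothetical illustration (`classify_marker` and the `ROMAN` pattern are not pandoc names): it tries the roman-numeral interpretation first, so ambiguous markers such as `i.` and `c.` are treated as roman.

```python
import re

# Lowercase roman numerals; the lookahead rejects the empty string.
ROMAN = re.compile(
    r"^(?=[ivxlcdm])m*(cm|cd|d?c{0,3})(xc|xl|l?x{0,3})(ix|iv|v?i{0,3})$"
)


def classify_marker(marker: str) -> str:
    """Classify a list marker such as '3.', 'iv.' or 'b.'.

    The order of the checks encodes the priority: roman numerals are
    tried before single letters.
    """
    body = marker.rstrip(".")
    if body.isdigit():
        return "decimal"
    if ROMAN.match(body):
        return "lower-roman"
    if len(body) == 1 and body.isalpha() and body.islower():
        return "lower-alpha"
    return "none"
```

Swapping the second and third checks would make `i.` start an alphabetical list, which is the behavior the change avoids.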
  • Docx reader (Jesse Rosenthal, except where noted):

    • Handle nested sdt tags (#4415).
    • Don't look up dependent run styles if +styles is enabled.
    • Move pandoc inline styling inside custom-style span.
    • Read custom styles (#1843). This will read all paragraph and character classes as divs and spans, respectively. Dependent styles will still be resolved, but will be wrapped with appropriate style tags. It is controlled by the +styles extension (-f docx+styles). This can be used in conjunction with the custom-style feature in the docx writer for a pandoc-docx editing workflow. Users can convert from an input docx, reading the custom-styles, and then use that same input docx file as a reference-doc for producing an output docx file. Styles will be maintained across the conversion, even if pandoc doesn't understand them.
    • Small change to Fields hyperlink parser. Previously, unquoted string required a space at the end of the line (and consumed it). Now we either take a space (and don't consume it), or end of input.
    • Pick table width from the longest row or header (Francesco Occhipinti, #4360).
  • Muse writer (Alexander Krotov):

    • Change verse markup: > instead of <verse> tag.
    • Remove empty strings during inline normalization.
    • Don't indent nested definition lists.
    • Use unicode quotes for quoted text.
    • Write image width specified in percent in Text::Amuse mode.
    • Don't wrap displayMath into <verse>.
    • Escape nonbreaking space (~~).
    • Join code with different attributes during normalization.
    • Indent lists inside Div.
    • Support definitions with multiple descriptions.
  • Powerpoint writer (Jesse Rosenthal):

    • Use table styles. This will use the default table style in the reference-doc file. As a result, tables are easier to use in a template and match its color scheme.
    • Remove empty slides. Because of the way that slides were split, these could be accidentally produced by comments after images. When animations are added, there will be a way to add an empty slide with either incremental lists or pauses.
    • Implement syntax highlighting. Note that background colors can't be implemented in PowerPoint, so highlighting styles that require these will be incomplete.
    • New test framework for pptx. We now compare the output of the Powerpoint writer with files that we know to (a) not be corrupt, and (b) to show the desired output behavior (details below).
    • Add notesMaster to presentation.xml if necessary.
    • Ignore links and (end)notes in speaker notes.
    • Output speaker notes.
    • Read speaker note templates conditionally. If there are speaker notes in the presentation, we read in the notesMasters templates from the reference pptx file.
    • Fix deletion track changes (#4303, Jesse Rosenthal).
  • Markdown writer: properly escape @ to avoid capture as citation (#4366).
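
As an illustration of why `@` needs escaping, the Python sketch below backslash-escapes an `@` only where it could begin a citation key. The exact rule pandoc's Markdown writer applies is not stated here; the condition used (an `@` not preceded by a letter or digit and followed by a key character) is an assumption, and `escape_at` is a hypothetical helper.

```python
import re


def escape_at(text: str) -> str:
    """Backslash-escape '@' where it might be read as a citation.

    Assumed rule: escape when '@' is not preceded by an alphanumeric
    character and is followed by a letter, digit or underscore.
    """
    return re.sub(r"(?<![A-Za-z0-9])@(?=[A-Za-z0-9_])", r"\\@", text)
```

Under this rule `see @doe99` becomes `see \@doe99`, while an address such as `user@example.com` is left alone.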

  • LaTeX writer:

    • Put hypertarget inside figure environment (#4388). This works around a problem with the endfloat package and makes pandoc's output compatible with it.
    • Fix image height with percentage (#4389). This previously caused the image to be resized to a percentage of textwidth, rather than textheight.
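
The image height fix comes down to choosing the right reference length for each axis. The following Python sketch shows the intended mapping; `latex_dimension` is a hypothetical helper, not a function from the LaTeX writer.

```python
def latex_dimension(value: str, axis: str) -> str:
    """Translate an image dimension into a LaTeX length.

    Percentages are relative to \\textwidth for widths and to
    \\textheight for heights; other values pass through unchanged.
    """
    if axis not in ("width", "height"):
        raise ValueError(f"unknown axis: {axis}")
    if value.endswith("%"):
        fraction = float(value[:-1]) / 100
        reference = r"\textwidth" if axis == "width" else r"\textheight"
        return f"{fraction:g}{reference}"
    return value
```

The bug described above corresponds to using `\textwidth` as the reference for both axes.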
  • ConTeXt writer (Henri Menke):

    • New section syntax and support --section-divs (#2609). \section[my-header]{My Header} -> \section[title={My Header},reference={my-header}]. The ConTeXt writer now supports the --section-divs option to write sections in the fenced style, with \startsection and \stopsection.
    • xtables: correct wrong usage of caption (Henri Menke).
  • Docx writer:

    • Fix image resizing with multiple images (#3930, Andrew Pritchard).
    • Use new golden framework (Jesse Rosenthal).
    • Make more deterministic to facilitate testing (Jesse Rosenthal).
      • getUniqueId now calls to the state to get an incremented digit, instead of calling to P.uniqueHash.
      • we always start the PRNG in mkNumbering/mkAbstractNum with the same seed (1848), so our randoms should be the same each time.
    • Fix ids in comment writing (Jesse Rosenthal). Comments from --track-changes=all were producing corrupt docx, because the writer was trying to get id from the (ID,_,_) field of the attributes, and ignoring the id entry in the key-value pairs. We now check both.
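
The two determinism changes in the Docx writer (a counter held in state instead of a hash, and a fixed PRNG seed) can be sketched in Python as follows. `IdState` and its methods are hypothetical names; only the seed value 1848 comes from the notes above.

```python
import itertools
import random


class IdState:
    """Writer state that hands out reproducible identifiers."""

    def __init__(self, seed: int = 1848):
        # Incrementing counter instead of a hash of external state.
        self._counter = itertools.count(1)
        # Fixed seed, so the "random" numbering ids repeat on every run.
        self._rng = random.Random(seed)

    def unique_id(self) -> str:
        return f"id{next(self._counter)}"

    def numbering_id(self) -> int:
        return self._rng.randrange(10**8)
```

Two fresh `IdState` objects produce identical sequences, so the output of two conversions of the same input can be compared byte for byte in tests.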
  • Ms writer: Added papersize variable.

  • TEI writer:

    • Use height instead of depth for images (#4331).
    • Ensure that id prefix is always used.
    • Don't emit role attribute; that was a leftover from the Docbook writer.
    • Use xml:id, not id attribute (#4371).
  • AsciiDoc writer:

    • Do not output implicit heading IDs (#4363, Alexander Krotov). Convert to asciidoc-auto_identifiers for old behaviour.
  • RST writer:

    • Remove blockToRST' moving its logic into fixBlocks (Francesco Occhipinti).
    • Insert comment between lists and quotes (#4248, Francesco Occhipinti).
  • RST template: remove definition of math role as raw. This used to be needed prior to v 0.8 of docutils, but now math support is built-in.

  • Slides: Use divs to set incremental/non-incremental (#4381, Jesse Rosenthal). The old method (list inside blockquote) still works, but we are encouraging the use of divs with class incremental or nonincremental.

  • Text.Pandoc.ImageSize:

    • Make image size detection for PDFs more robust (#4322).
    • Determine image size for PDFs (#4322).
    • EMF Image size support (#4375, Andrew Pritchard).
  • Text.Pandoc.Extensions:

    • Add Ext_styles (Jesse Rosenthal, API change). This will be used in the docx reader (defaulting to off) to read paragraph and character styles not understood by pandoc (as divs and spans, respectively).
    • Made Ext_raw_html default for commonmark format.
  • Text.Pandoc.Parsing:

    • Export manyUntil (Alexander Krotov, API change).
    • Export improved sepBy1 (Alexander Krotov).
    • Export list marker parsers: upperRoman, lowerRoman, decimal, lowerAlpha, upperAlpha (Alexander Krotov, API change).
  • Tests/Lua: fix tests on windows (Albert Krewinkel).

  • Lua: register script name in global variable (#4393). The name of the Lua script which is executed is made available in the global Lua variable PANDOC_SCRIPT_FILE, both for Lua filters and custom writers.

  • Tests: Abstract powerpoint tests out to OOXML tests (Jesse Rosenthal). There is very little pptx-specific in these tests, so we abstract out the basic testing function so it can be used for docx as well. This should allow us to catch some errors in the docx writer that slipped by the roundtrip testing.

  • Lua filters: store constructors in registry (Albert Krewinkel). Lua functions used to construct AST element values are stored in the Lua registry for quicker access. Getting a value from the registry is much faster than getting a global value (partly due to idiosyncrasies of hslua); this change results in a considerable performance boost.

  • Documentation:

    • doc/org.md Add draft of Org-mode documentation (Albert Krewinkel).
    • doc/lua-filters.md: document global vars set for filters (Albert Krewinkel).
    • INSTALL.md: mention Stack version. (#4343, Adam Brandizzi).
    • MANUAL: add documentation on custom styles (Jesse Rosenthal).
    • MANUAL.txt: Document incremental and nonincremental divs (Jesse Rosenthal). Blockquoted lists are still described, but fenced divs are presented in preference.
    • MANUAL.txt: document header and footer variables (newmana).
    • MANUAL.txt: self-contained implies standalone (#4304, Daniel Lublin).
    • CONTRIBUTING.md: label was renamed. (#4310, Alexander Brandizzi).
  • Require tagsoup 0.14.3 (#4282), fixing HTML tokenization bug.

  • Use latest texmath.

  • Use latest pandoc-citeproc.

  • Allow exceptions 0.9.

  • Require aeson-pretty 0.8.5 (#4394).

  • Bump blaze-markup, blaze-html lower bounds to 0.8, 0.9 (#4334).

  • Update tagsoup to 0.14.6 (Alexander Krotov, #4282).

  • Removed ghc-prof-options. As of cabal 1.24, sensible defaults are used.

  • Update default.nix to current nixpkgs-unstable for hslua-0.9.5 (#4348, jarlg).

pandoc 2.1.1
  • Markdown reader:

    • Don't coalesce adjacent raw LaTeX blocks if they are separated by a blank line. See lierdakil/pandoc-crossref#160.
    • Improved inlinesInBalancedBrackets (#4272, jgm/pandoc-citeproc#315). The change both improves performance and fixes a regression whereby normal citations inside inline notes and figure captions were not parsed correctly.
  • RST reader:

    • Better handling for headers with an anchor (#4240). Instead of creating a Div containing the header, we put the id directly on the header. This way header promotion will work properly.
    • Add aligned environment when needed in math (#4254). rst2latex.py uses an align* environment for math in .. math:: blocks, so this math may contain line breaks. If it does, we put the math in an aligned environment to simulate rst2latex.py's behavior.
  • HTML reader:

    • Fix col width parsing for percentages < 10% (#4262, n3fariox).
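
For context, a percentage column width has to be turned into a fraction of the table width regardless of how many digits it has. The Python sketch below shows one way to parse such values; it is not the HTML reader's code, the cause of the original bug is not stated here, and `parse_col_width` is a hypothetical name.

```python
import re
from typing import Optional


def parse_col_width(value: str) -> Optional[float]:
    """Parse a width such as '5%' or '25%' into a fraction of the table width.

    Returns None for values that are not percentages.
    """
    m = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*%\s*", value)
    if m is None:
        return None
    return float(m.group(1)) / 100
```

Single-digit percentages such as `5%` go through the same path as two-digit ones.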
  • LaTeX reader:

    • Advance source position at end of stream.
    • Pass through macro defs in rawLaTeXBlock even if the latex_macros extension is set (#4246). This reverts to earlier behavior and is probably safer on the whole, since some macros only modify things in included packages, which pandoc's macro expansion can't modify.
    • Fixed pos calculation in tokenizing escaped space.
    • Allow macro definitions inside macros (#4253). Previously we went into an infinite loop with

      \newcommand{\noop}[1]{#1}
      \noop{\newcommand{\foo}[1]{#1}}
      \foo{hi}
      
    • Fix inconsistent column widths (#4238). This fixes a bug whereby column widths for the body were different from widths for the header in some tables.

  • Docx reader (Jesse Rosenthal):

    • Parse hyperlinks in instrText tags (#3389, #4266). This was a form of hyperlink found in older versions of word. The changes introduced for this, though, create a framework for parsing further fields in MS Word (see the spec, ECMA-376-1:2016, 17.16.5, for more on these fields). We introduce a new module, Text.Pandoc.Readers.Docx.Fields which contains a simple parsec parser. At the moment, only simple hyperlink fields are accepted, but that can be extended in the future.
  • Muse reader (Alexander Krotov):

    • Parse ~~ as non-breaking space in Text::Amuse mode.
    • Refactor list parsing.
  • Powerpoint writer (Jesse Rosenthal):

    • Change reference to notesSlide to endNotesSlide.
    • Move image sizing into picProps.
    • Improve table placement.
    • Make our own _rels/.rels file.
    • Import reference-doc images properly.
    • Move Presentation.hs out of PandocMonad.
    • Refactor into separate modules. T.P.W.Powerpoint.Presentation defines the Presentation datatype and goes Pandoc->Presentation; T.P.W.Pandoc.Output goes Presentation->Archive. Text.Pandoc.Writers.Powerpoint is a thin wrapper around the two modules.
    • Avoid overlapping blocks in column output.
    • Position images correctly in two-column layout.
    • Make content shape retrieval environment-aware.
    • Improve image handling. We now determine image and caption placement by getting the dimensions of the content box in a given layout. This allows for images to be correctly sized and positioned in a different template. Note that images without captions and headers are no longer full-screened. We can't do this dependably in different layouts, because we don't know where the header is (it could be to the side of the content, for example).
    • Read presentation size from reference file. Our presentation size is now dependent on the reference/template file we use.
    • Handle (sub)headers above slidelevel correctly. Above the slidelevel, subheaders will be printed in bold and given a bit of extra space before them. Note that at the moment, no distinction is made between levels of headers above the slide header, though that can be changed.
    • Check for required files. Since we now import from reference/dist file by glob, we need to make sure that we're getting the files we need to make a non-corrupt Powerpoint. This performs that check.
    • Improve templating using --reference-doc. Templating should work much more reliably now.
    • Include Notes slide in TOC.
    • Set notes slide header to slide-level.
    • Add table of contents. This is triggered by the --toc flag. Note that in a long slide deck this risks overrunning the text box. The user can address this by setting --toc-depth=1.
    • Set notes slide number correctly.
    • Clean up adding metadata slide. We want to count the slide numbers correctly if it's in there.
    • Add anchor links. For anchor-type links ([foo](#bar)) we produce an anchor link. In powerpoint these are links to slides, so we keep track of a map relating anchors to the slides they occur on.
    • Make the slide number available to the blocks. For anchors, block-processing functions need to know what slide number they're in. We make the envCurSlideId available to blocks.
    • Move curSlideId to environment.
    • Allow setting toc-title in metadata.
    • Link notes to endnotes slide.
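The anchor bookkeeping described in this list can be pictured as two steps: record which slide each anchor occurs on, then resolve anchor-type links against that record. The following is a minimal Python sketch of the idea, not the writer's Haskell code; the slide representation and the names `build_anchor_map` and `resolve_link` are hypothetical.

```python
from typing import Dict, List, Optional


def build_anchor_map(slides: List[List[str]]) -> Dict[str, int]:
    """Map each anchor id to the 1-based number of the slide it occurs on."""
    anchor_map: Dict[str, int] = {}
    for slide_number, anchors in enumerate(slides, start=1):
        for anchor in anchors:
            # Keep the first occurrence if an id is repeated (an assumption).
            anchor_map.setdefault(anchor, slide_number)
    return anchor_map


def resolve_link(target: str, anchor_map: Dict[str, int]) -> Optional[int]:
    """Return the slide number for an anchor-type link such as '#bar'.

    Returns None for external links and for anchors that were never seen.
    """
    if not target.startswith("#"):
        return None
    return anchor_map.get(target[1:])


# Each slide is represented only by the anchor ids that occur on it.
slides = [["intro"], [], ["bar", "baz"]]
anchor_map = build_anchor_map(slides)
```

With this data, `resolve_link("#bar", anchor_map)` yields slide 3, while an external URL is left for ordinary hyperlink handling.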
  • Markdown writer:

    • Fix cell width calculation (#4265). Previously we could get ever-lengthening cell widths when a table was run repeatedly through pandoc -f markdown -t markdown.
  • LaTeX writer:

    • Escape & in lstinline (Robert Schütz).
  • ConTeXt writer:

    • Use xtables instead of Tables (#4223, Henri Menke). Default to xtables for context output. Natural Tables are used if the new ntb extension is set.
  • HTML writer:

    • Fixed footnote backlinks with --id-prefix (#4235).
  • Text.Pandoc.Extensions: Added Ext_ntb constructor (API change, Henri Menke).

  • Text.Pandoc.ImageSize: add derived Eq instance to Dimension (Jesse Rosenthal, API change).

  • Lua filters (Albert Krewinkel):

    • Make PANDOC_READER_OPTIONS available. The options which were used to read the document are made available to Lua filters via the PANDOC_READER_OPTIONS global.
    • Add lua module pandoc.utils.run_json_filter, which runs a JSON filter on a Pandoc document.
    • Refactor filter-handling code into Text.Pandoc.Filter.JSON, Text.Pandoc.Filter.Lua, and Text.Pandoc.Filter.Path.
    • Improve error messages. Provide more context about the task which caused an error.
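For context on what a JSON filter is: it is any program that reads the document's JSON AST on stdin and writes a (possibly modified) AST to stdout. The sketch below shows the core of such a filter in Python using only the standard library. The function names are hypothetical, and the example says nothing about the exact signature of pandoc.utils.run_json_filter.

```python
import json


def walk(node, action):
    """Rebuild the JSON tree, applying `action` to every AST element.

    An AST element is a dict carrying a "t" (type tag) key. Children are
    processed before their parent.
    """
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        rebuilt = {key: walk(value, action) for key, value in node.items()}
        return action(rebuilt) if "t" in rebuilt else rebuilt
    return node


def upcase_str(element):
    """Example action: turn the text of every Str element into upper case."""
    if element["t"] == "Str":
        return {"t": "Str", "c": element["c"].upper()}
    return element


def run_filter(document_json: str) -> str:
    """Parse a pandoc JSON document, transform it, and serialize it again."""
    return json.dumps(walk(json.loads(document_json), upcase_str))
```

A standalone filter script would wrap this as `sys.stdout.write(run_filter(sys.stdin.read()))`.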
  • data/pandoc.lua (Albert Krewinkel):

    • Accept singleton inline as a list. Every constructor which accepts a list of inlines now also accepts a single inline element for convenience.
    • Accept single block as singleton list. Every constructor which accepts a list of blocks now also accepts a single block element for convenience. Furthermore, strings are accepted as shorthand for {pandoc.Str "text"} in constructors.
    • Add attr, listAttributes accessors. Elements with attributes got an additional attr accessor. Previously, attributes were accessible only via identifier, classes, and attributes, which conflicted with the documentation, which indirectly states that such elements have an attr property.
    • Drop _VERSION. Having a _VERSION became superfluous, as this module is closely tied to the pandoc version, which is available via PANDOC_VERSION.
    • Fix access to Attr components. Accessing an Attr value (e.g., Attr().classes) was broken; the more common case of accessing it via an Inline or Block element was unaffected by this.
  • Move metaValueToInlines from Docx writer to Text.Pandoc.Writers.Shared, so it can be used by other writers (Jesse Rosenthal).

  • MANUAL.txt:

    • Clarify otherlangs in LaTeX (#4072).
    • Clarify latex_macros extension.
    • Recommend use of raw_attribute extension in header includes (#4253).
  • Allow latest QuickCheck, tasty, criterion.

  • Remove custom prelude and ghc 7.8 support.

  • Reduce compiler noise (exact paths for compiled modules).

2.1 pandoc 2.1
  • Allow filters and lua filters to be interspersed (#4196). Previously we ran all lua filters before JSON filters. Now we run filters in the order they are presented on the command line, whether lua or JSON. There are two incompatible API changes: The type of applyFilters has changed, and applyLuaFilters has been removed. Filter is also now exported.
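The difference between the old and new ordering can be illustrated with a small Python sketch. Documents are stood in for by strings and filters by plain functions; `apply_lua_first` and `apply_in_order` are hypothetical names, not the Haskell applyFilters API.

```python
from typing import Callable, List, Tuple

# A filter is tagged with its kind ("lua" or "json") and a transformation.
Filter = Tuple[str, Callable[[str], str]]


def apply_lua_first(filters: List[Filter], doc: str) -> str:
    """Old behavior: every Lua filter runs before any JSON filter."""
    lua = [f for f in filters if f[0] == "lua"]
    other = [f for f in filters if f[0] != "lua"]
    for _, transform in lua + other:
        doc = transform(doc)
    return doc


def apply_in_order(filters: List[Filter], doc: str) -> str:
    """New behavior: filters run in the order given on the command line."""
    for _, transform in filters:
        doc = transform(doc)
    return doc


# A JSON filter listed first, then a Lua filter.
filters: List[Filter] = [
    ("json", lambda d: d + "-json"),
    ("lua", lambda d: d + "-lua"),
]
```

With the filters listed as JSON then Lua, the old scheme produces `doc-lua-json`, while the new one produces `doc-json-lua`.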

  • Use latest skylighting and omit the missingIncludes check, fixing a major performance regression in earlier releases of the 2.x series (#4226). Behavior change: If you use a custom syntax definition that refers to a syntax you haven't loaded, pandoc will now complain when it is highlighting the text, rather than doing a check at the start. This change dramatically speeds up invocations of pandoc on short inputs.

  • Text.Pandoc.Class: make FileTree opaque (don't export FileTree constructor). This forces users to interact with it using insertInFileTree and getFileInfo, which normalize file names.
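The motivation for hiding the constructor is that every access then goes through code that normalizes the path first. A rough Python analogue follows, assuming POSIX-style paths and storing raw bytes rather than pandoc's actual file-info record. The class and method names are illustrative only.

```python
import posixpath
from typing import Dict, Optional


class FileTree:
    """An in-memory file store whose keys are always normalized paths."""

    def __init__(self) -> None:
        # Underscore-prefixed: callers are expected not to touch this directly.
        self._files: Dict[str, bytes] = {}

    @staticmethod
    def _normalize(path: str) -> str:
        # Collapses "./", "a/../" and repeated slashes.
        return posixpath.normpath(path)

    def insert(self, path: str, contents: bytes) -> None:
        """Store contents under the normalized form of `path`."""
        self._files[self._normalize(path)] = contents

    def get(self, path: str) -> Optional[bytes]:
        """Look up `path` after normalizing it; None if absent."""
        return self._files.get(self._normalize(path))


tree = FileTree()
tree.insert("./img/../img/a.png", b"data")
```

Because both insertion and lookup normalize, `tree.get("img//a.png")` finds the entry stored under a differently spelled path.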

  • Markdown reader:

    • Rewrite inlinesInBalancedBrackets. The rewrite is much more direct, avoiding parseFromString. And it performs significantly better; unfortunately, parsing time still increases exponentially (see #1735).
    • Avoid parsing raw tex unless \ + letter seen. This seems to help with the performance problem, #4216.
  • LaTeX reader: Simplified a check for raw tex command.

  • Muse reader (Alexander Krotov):

    • Enable round trip test (#4107).
    • Automatically translate #cover into #cover-image. Amusewiki uses #cover directive to specify cover image.
  • Docx reader (Jesse Rosenthal):

    • Allow for insertion/deletion of paragraphs (#3927). If the paragraph has a deleted or inserted paragraph break (depending on the track-changes setting) we hold onto it until the next paragraph. This takes care of accept and reject. For this we introduce a new state which holds the ils from the previous para if necessary. For --track-changes=all, we add an empty span with class paragraph-insertion/paragraph-deletion at the end of the paragraph prior to the break to be inserted or deleted.
    • Remove unused anchors (#3679). Docx produces a lot of anchors with nothing pointing to them; we now remove these to produce cleaner output. Note that this has to occur at the end of the process because it has to follow link/anchor rewriting.
    • Read multiple children of w:sdtContents.
    • Combine adjacent anchors. There isn't any reason to have numerous anchors in the same place, since we can't maintain docx's non-nesting overlapping. So we reduce to a single anchor.
    • Improved tests.
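The two anchor clean-ups in this list interact: adjacent anchors are merged and links retargeted first, and only then are unreferenced anchors dropped. The Python sketch below illustrates that ordering on a toy inline representation. The tuple encoding and function names are hypothetical and much simpler than the reader's real data types.

```python
from typing import Dict, List, Tuple

# ("anchor", id), ("link", target id) or ("text", content)
Inline = Tuple[str, str]


def combine_adjacent_anchors(inlines: List[Inline]) -> List[Inline]:
    """Collapse each run of adjacent anchors into its first anchor and
    retarget links that pointed at the dropped anchors."""
    renamed: Dict[str, str] = {}
    combined: List[Inline] = []
    for kind, value in inlines:
        if kind == "anchor" and combined and combined[-1][0] == "anchor":
            renamed[value] = combined[-1][1]
            continue
        combined.append((kind, value))
    return [
        (kind, renamed.get(value, value)) if kind == "link" else (kind, value)
        for kind, value in combined
    ]


def remove_unused_anchors(inlines: List[Inline]) -> List[Inline]:
    """Drop anchors that no link points to. Run this after link rewriting."""
    targets = {value for kind, value in inlines if kind == "link"}
    return [
        (kind, value)
        for kind, value in inlines
        if kind != "anchor" or value in targets
    ]


inlines: List[Inline] = [
    ("anchor", "a"), ("anchor", "b"), ("text", "Title"),
    ("anchor", "c"), ("link", "b"),
]
cleaned = remove_unused_anchors(combine_adjacent_anchors(inlines))
```

Here the link to `b` is retargeted to `a`, and the unreferenced anchor `c` disappears. Running the removal first would wrongly drop `a`.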
  • Muse writer (Alexander Krotov): don't escape URIs from AST.

  • Docx writer:

    • Removed redundant subtitle in title (Sebastian Talmon).
    • firstRow table definition compatibility for Word 2016 (Sebastian Talmon). Word 2016 seems to use a default value of 1 for table headers if there is no firstRow definition (although a default value of 0 is documented), so all tables get the first row formatted as a header. Setting the parameter to 0 if the table has no header row fixes this for Word 2016.
    • Fix custom styles with spaces in the name (#3290).
  • Powerpoint writer (Jesse Rosenthal):

    • Ignore Notes div for parity with other slide outputs.
    • Set default slidelevel correctly. We had previously defaulted to slideLevel 2. Now we use the correct behavior of defaulting to the highest level header followed by content. We change an expected test result to match this behavior.
    • Split blocks correctly for linked images.
    • Combine adjacent runs.
    • Make inline code inherit code size. Previously (a) the code size wasn't set when we force size, and (b) the properties were set from the default, instead of inheriting.
    • Simplify replaceNamedChildren function.
    • Allow linked images. The following markdown: [![Image Title](image.jpg)](http://www.example.com) will now produce a linked image in the resulting PowerPoint file.
    • Fix error with empty table cell. We require an empty <a:p> tag, even if the cell contains no paragraphs; otherwise PowerPoint complains of corruption.
    • Implement two-column slides. This uses the columns/column div format described in the pandoc manual. At the moment, only two columns (half the screen each) are allowed. Custom widths are not supported.
    • Added more tests.
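The default slide-level rule mentioned in this list (the highest-level header that is directly followed by content) can be sketched in Python as below. Blocks are reduced to (kind, level) tuples, and the function name and the fallback value for documents with no qualifying header are assumptions made for this illustration.

```python
from typing import List, Tuple

# ("header", level) or ("content", 0)
Block = Tuple[str, int]


def default_slide_level(blocks: List[Block], fallback: int = 1) -> int:
    """Return the smallest header level that is immediately followed by
    a non-header block, or `fallback` if there is none."""
    candidates = [
        level
        for (kind, level), (next_kind, _) in zip(blocks, blocks[1:])
        if kind == "header" and next_kind != "header"
    ]
    return min(candidates, default=fallback)


# Level-1 headers only group sections; content appears under level 2.
sectioned = [
    ("header", 1), ("header", 2), ("content", 0),
    ("header", 2), ("content", 0),
]
# Content appears directly under a level-1 header.
flat = [("header", 1), ("content", 0), ("header", 2), ("content", 0)]
```

For `sectioned` the slide level is 2, because no level-1 header is directly followed by content; for `flat` it is 1.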
  • OpenDocument/ODT writers: improved rendering of formulas (#4170, oltolm).

  • Lua filters (Albert Krewinkel):

    • data/pandoc.lua: drop pandoc-api-version from Pandoc objects
    • The current pandoc-types version is made available to Lua programs in the global PANDOC_API_VERSION. It contains the version as a list of numbers.
    • The pandoc version is available as a global PANDOC_VERSION (a list of numbers).
    • data/pandoc.lua: make Attr an AstElement.
    • data/pandoc.lua: make all types subtypes of AstElement. Pandoc, Meta, and Citation were just plain functions and did not set a metatable on the returned value, which made it difficult to amend objects of these types with new behavior. They are now subtypes of AstElement, meaning that all their objects can gain new features when a method is added to the behavior object (e.g., pandoc.Pandoc.behavior).
    • data/pandoc.lua: split type and behavior tables. Clearly distinguish between a type and the behavioral properties of an instance of that type. The behavior of a type (and all its subtypes) can now be amended by adding methods to that type's behavior object, without exposing the type object's internals. E.g.:

      pandoc.Inline.behavior.frob = function () print'42' end
      local str = pandoc.Str'hello'
      str.frob() -- outputs '42'
      
    • data/pandoc.lua: fix Element inheritance. Extending all elements of a given type (e.g., all inline elements) was difficult, as the table used to look up unknown methods would be reset every time a new element of that type was created, preventing recursive property lookup. This was changed so that all methods and attributes of supertypes are now available to their subtypes.

    • data/pandoc.lua: fix attribute names of Citation (#4222). The fields were named like the Haskell fields, not like the documented, shorter version. The names are changed to match the documentation and Citations are given a shared metatable to enable simple extensibility.

    • data/pandoc.lua: drop function pandoc.global_filter.

    • Bump hslua version to 0.9.5. This version fixes a bug that made it difficult to handle failures while getting lists or a Map from Lua. A bug in pandoc, which made it necessary to always pass a tag when using MetaList or MetaBlock, is fixed as a result. Using the pandoc module's constructor functions for these values is now optional (if still recommended).

    • Stop exporting pushPandocModule (API change). The introduction of runPandocLua renders direct use of this function obsolete.

    • Update generation of module docs for lua filters.

    • Lua.Module.Utils: make stringify work on MetaValues (John MacFarlane). I'm sure this was intended in the first place, but currently only Meta is supported.
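Since PANDOC_VERSION and PANDOC_API_VERSION are exposed as lists of numbers, a version check is a component-wise comparison. The sketch below shows that comparison in Python; the helper name is hypothetical, and this is not part of pandoc's API.

```python
from itertools import zip_longest
from typing import Sequence


def version_at_least(version: Sequence[int], minimum: Sequence[int]) -> bool:
    """Compare two versions given as lists of numbers.

    The shorter list is padded with zeros, so [2, 1] equals [2, 1, 0].
    """
    for have, want in zip_longest(version, minimum, fillvalue=0):
        if have != want:
            return have > want
    return True
```

For example, `version_at_least([2, 1], [2, 0, 6])` holds, while `version_at_least([2, 0, 6], [2, 1])` does not.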

  • Improve benchmarks.

    • Set the default extensions properly.
    • Improve benchmark argument parsing. You can now say make bench BENCHARGS="markdown latex reader" and both the markdown and latex readers will be benchmarked.
  • MANUAL.txt: simplify and add more structure (Mauro Bieg).

  • Generate README.md from template and MANUAL.txt. make README.md will generate the README.md after changes to MANUAL.txt have been made.

  • Update copyright notices to include 2018 (Albert Krewinkel).
