How to convert R Markdown to PDF

knitrpandocr-markdown

I've previously asked about the commands for converting R Markdown to HTML.

What is a good way to convert R Markdown files to PDF documents?

A good solution would preserve as much as possible of the content (e.g., images, equations, html tables, etc.). The solution needs to be able to be run from the command-line. A good solution would also be cross-platform, and ideally minimise dependencies to make it easier to share makefiles and so forth.

Specifically, there are a lot of options:

  • Whether to convert RMD to MD to HTML to PDF; or RMD to MD to PDF; or RMD to PDF
  • If using the markdown package in R, which options to specify
  • Whether to use pandoc, a package built into R, or something else

Here's an example rmd file that presumably provides a reasonable test of any proposed solution. It was used as the basis for this blog post.

Best Answer

Updated Answer (10 Feb 2013)

rmarkdown package: There is now an rmarkdown package available on github that interfaces with Pandoc. It includes a render function. The documentation makes it pretty clear how to convert rmarkdown to pdf among a range of other formats. This includes including output formats in the rmarkdown file or running supplying an output format to the rend function. E.g.,

render("input.Rmd", "pdf_document")

Command-line: When I run render from the command-line (e.g., using a makefile), I sometimes have issues with pandoc not being found. Presumably, it is not on the search path. The following answer explains how to add pandoc to the R environment.

So for example, on my computer running OSX, where I have a copy of pandoc through RStudio, I can use the following:

Rscript -e "Sys.setenv(RSTUDIO_PANDOC='/Applications/RStudio.app/Contents/MacOS/pandoc');library(rmarkdown);  library(utils); render('input.Rmd', 'pdf_document')"

Old Answer (circa 2012)

So, a number of people have suggested that Pandoc is the way to go. See notes below about the importance of having an up-to-date version of Pandoc.

Using Pandoc

I used the following command to convert R Markdown to HTML (i.e., a variant of this makefile), where RMDFILE is the name of the R Markdown file without the .rmd component (it also assumes that the extension is .rmd and not .Rmd).

RMDFILE=example-r-markdown  
Rscript -e "require(knitr); require(markdown); knit('$RMDFILE.rmd', '$RMDFILE.md'); markdownToHTML('$RMDFILE.md', '$RMDFILE.html', options=c('use_xhml'))"

and then this command to convert to pdf

Pandoc -s example-r-markdown.html -o example-r-markdown.pdf


A few notes about this:

  • I removed the reference in the example file which exports plots to imgur to host images.
  • I removed a reference to an image that was hosted on imgur. Figures appear to need to be local.
  • The options in the markdownToHTML function meant that image references are to files and not to data stored in the HTML file (i.e., I removed 'base64_images' from the option list).
  • The resulting output looked like this. It has clearly made a very LaTeX style document in contrast to what I get if I print the HTML file to pdf from a browser.

Getting up-to-date version of Pandoc

As mentioned by @daroczig, it's important to have an up-to-date version of Pandoc in order to output pdfs. On Ubuntu as of 15th June 2012, I was stuck with version 1.8.1 of Pandoc in the package manager, but it seems from the change log that for pdf support you need at least version 1.9+ of Pandoc.

Thus, I installed caball-install. And then ran:

cabal update
cabal install pandoc

Pandoc was installed in ~/.cabal/bin/pandoc Thus, when I ran pandoc it was still seeing the old version. See here for adding to the path.

Related Topic