- ---
- title: "Explore tidyverse with liftr"
- author: "Nan Xiao <<https://nanx.me>>"
- date: "`r Sys.Date()`"
- bibliography: liftr-tidyverse.bib
- output:
- rmarkdown::html_vignette:
- toc: true
- number_sections: true
- css: liftr.css
- vignette: >
- %\VignetteEngine{knitr::rmarkdown}
- %\VignetteIndexEntry{Explore tidyverse with liftr}
- ---
- # Introduction
- Creating Docker images from scratch can be time-consuming and labor-intensive.
- Fortunately, many pre-built and regularly updated Docker images
- for the R community are ready for use, especially when creating
- your own containerized R Markdown documents with liftr.
- Such sources of pre-built Docker images include the
- [rocker project](https://github.com/rocker-org/rocker) and
- [Bioconductor Docker containers](https://bioconductor.org/help/docker/).
- In this article, we will use the [tidyverse image](https://hub.docker.com/r/rocker/tidyverse/)
- provided by rocker. This image includes the essential tidyverse packages
- and the devtools environment loved by many data scientists [@wickham2014tidy].
- We will demonstrate how to containerize and render your tidyverse-heavy
- R Markdown document using Docker in only a few minutes.
- # Install Docker
- If Docker is not yet installed on your system, run `install_docker()`
- and follow the guidelines to install it. Afterwards, `check_docker_install()`
- and `check_docker_running()` help you verify that Docker is
- installed and running properly.
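As a quick pre-flight check that does not depend on liftr, the base R sketch below tests whether a `docker` executable can be found on the `PATH`. The helper name `docker_on_path()` is hypothetical and not part of liftr. Finding the executable does not guarantee that the Docker daemon is running.

```r
# Hypothetical helper (not part of liftr): is a `docker` binary on the PATH?
docker_on_path <- function() {
  # Sys.which() returns "" when the command cannot be found
  nzchar(unname(Sys.which("docker")))
}

if (!docker_on_path()) {
  message("Docker executable not found; consider running install_docker().")
}
```

This only looks up the executable. `check_docker_running()` remains the way to confirm that the daemon is up.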
- # Example document
- Let's create a new folder first and copy the example R Markdown document
- to this folder:
- ```{r, eval = FALSE}
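- # folder that will hold the example document
- path = "~/liftr-tidyverse/"
- dir.create(path)
- # copy the example document shipped with liftr into that folder
- file.copy(system.file("examples/liftr-tidyverse.Rmd", package = "liftr"), path)
- input = paste0(path, "liftr-tidyverse.Rmd")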
- ```
- If we open the R Markdown file, we will see that its header
- includes a `liftr` section, which defines the Docker system
- environment required to render this document. In our case,
- it is quite simple:
- ```yaml
- ---
- title: "Explore tidyverse with liftr"
- author: "Nan Xiao <<[email protected]>>"
- date: "`r Sys.Date()`"
- output:
- rmarkdown::pdf_document:
- toc: true
- number_sections: true
- liftr:
- from: "rocker/tidyverse:latest"
- maintainer: "Nan Xiao"
- email: "[email protected]"
- pandoc: false
- texlive: true
- cran:
- - nycflights13
- ---
- ```
- Most of the fields are self-explanatory:
- - `from` specifies the latest `rocker/tidyverse` image as our
- base image, which saves us the time of building a custom
- base image with all the tidyverse dependencies.
- - `pandoc: false` skips the custom `pandoc` installation because the
- tidyverse image already includes `pandoc`.
- - `texlive: true` includes TeX Live, since we intend to render a PDF file in the end.
- - `cran` lists the CRAN data package `nycflights13` to be installed.
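To inspect these fields from R rather than by opening the file, one option is to parse the YAML header with `rmarkdown::yaml_front_matter()`. This is a minimal sketch that assumes the rmarkdown package is installed. The helper name `liftr_fields()` is hypothetical, and this is not necessarily how liftr itself reads the metadata.

```r
# Hypothetical helper: return the `liftr` block of an R Markdown header as a list.
# Assumes the rmarkdown package is installed.
liftr_fields <- function(rmd) {
  rmarkdown::yaml_front_matter(rmd)$liftr
}

# Example usage, guarded so nothing happens if the example file is absent
input <- path.expand("~/liftr-tidyverse/liftr-tidyverse.Rmd")
if (file.exists(input)) {
  str(liftr_fields(input))
}
```

For the header shown above, the returned list should have elements such as `from`, `pandoc`, `texlive`, and `cran`.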
- # Containerize the document
- Let's containerize this document by generating a `Dockerfile` for it,
- using `liftr::lift`:
- ```{r, eval = FALSE}
- lift(input)
- ```
- A file named `Dockerfile` will be generated in the same directory
- as the input R Markdown file. It contains the commands needed to build
- the Docker image used for rendering the document.
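Before building anything, it can be useful to look at what was generated. The base R sketch below reads the `Dockerfile` located next to the input file. The helper name `read_dockerfile()` is hypothetical, and the exact contents of the file depend on your liftr version and metadata.

```r
# Hypothetical helper: read the Dockerfile that sits next to an input Rmd file.
read_dockerfile <- function(rmd) {
  dockerfile <- file.path(dirname(rmd), "Dockerfile")
  if (!file.exists(dockerfile)) {
    stop("No Dockerfile found next to ", rmd, "; run lift() first.")
  }
  readLines(dockerfile)
}

# Example usage, guarded so nothing happens if lift() has not been run yet
input <- path.expand("~/liftr-tidyverse/liftr-tidyverse.Rmd")
if (file.exists(file.path(dirname(input), "Dockerfile"))) {
  writeLines(read_dockerfile(input))
}
```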
- # Render the document
- We can use `render_docker()` to start the Docker container,
- and render the document inside it:
- ```{r, eval = FALSE}
- render_docker(input)
- ```
- Let's view the rendered document:
- ```{r, eval = FALSE}
- browseURL(paste0(path, "liftr-tidyverse.pdf"))
- ```
- In the last section of the rendered PDF, we will see that the session
- information is probably different from that of your current system.
- That is because the document is generated entirely inside
- a newly built, isolated Linux system environment, using Docker.
- In this way, the R Markdown document gains a higher, system-level
- reproducibility and can be easily replicated by other users who might not
- have a system and R package environment identical to yours.
- This is a good thing for team collaboration and large-scale document
- orchestration. The best part is that all you need to share is still the
- document itself, with only a few extra metadata fields.
- # Housekeeping
- The Docker images stored on your system can take up a few gigabytes,
- and the total grows as you build more images. Let's remove
- the generated Docker image to save some disk space:
- ```{r, eval = FALSE}
- prune_image(paste0(path, "liftr-tidyverse.docker.yml"))
- ```
- If we do this, the Docker image will be rebuilt the next time
- you run `render_docker()`. Otherwise, the image stays cached
- on the system and is reused when compiling the document later,
- which saves you some time.
- # References