---
title: "Explore tidyverse with liftr"
author: "Nan Xiao <<https://nanx.me>>"
date: "`r Sys.Date()`"
bibliography: liftr-tidyverse.bib
output:
  rmarkdown::html_vignette:
    toc: true
    number_sections: true
    css: liftr.css
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Explore tidyverse with liftr}
---

# Introduction

Creating Docker images from scratch can be time-consuming and labor-intensive.
Fortunately, many pre-built and regularly updated Docker images
for the R community are ready for use, which is especially helpful when creating
your own containerized R Markdown documents with liftr.
Sources of such pre-built Docker images include the
[rocker project](https://github.com/rocker-org/rocker) and
[Bioconductor Docker containers](https://bioconductor.org/help/docker/).

In this article, we will use the [tidyverse image](https://hub.docker.com/r/rocker/tidyverse/)
provided by rocker. This image includes the essential tidyverse packages
and the devtools environment favored by many data scientists [@wickham2014tidy].
We will demonstrate how to containerize and render a tidyverse-heavy
R Markdown document using Docker in only a few minutes.

# Install Docker

If Docker has not been installed on your system, please use `install_docker()`
and follow the guidelines to install it. After that, `check_docker_install()`
and `check_docker_running()` help you confirm that Docker is
installed and running properly.
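
To give a feel for what such checks involve, here is a minimal base R sketch. It is not liftr's implementation: the helper names `docker_on_path()` and `docker_daemon_up()` are hypothetical, and the sketch assumes the Docker command-line client is called `docker` and that `docker info` exits with status 0 only when it can reach a running daemon.

```r
# Hypothetical helper (not part of liftr):
# TRUE if a `docker` executable can be found on the PATH.
docker_on_path = function() {
  unname(nzchar(Sys.which("docker")))
}

# Hypothetical helper (not part of liftr):
# TRUE if `docker info` exits with status 0, which is assumed here
# to mean that the client could reach a running Docker daemon.
docker_daemon_up = function() {
  if (!docker_on_path()) return(FALSE)
  status = suppressWarnings(
    system2("docker", "info", stdout = FALSE, stderr = FALSE)
  )
  identical(as.integer(status), 0L)
}

docker_on_path()
docker_daemon_up()
```

In practice, the liftr functions named above are the more convenient route; the sketch only illustrates the two questions being asked: is the client installed, and is the daemon reachable.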

# Example document

Let's first create a new folder and copy the example R Markdown document
into it:

```{r, eval = FALSE}
# Working folder for this example
path = "~/liftr-tidyverse/"
dir.create(path)
# Copy the example document shipped with liftr into the folder
file.copy(system.file("examples/liftr-tidyverse.Rmd", package = "liftr"), path)
input = paste0(path, "liftr-tidyverse.Rmd")
```

If we open the R Markdown file, we will see that the header
includes a `liftr` section, which defines the Docker system
environment required to render this document. In our case,
it is quite simple:

```yaml
---
title: "Explore tidyverse with liftr"
author: "Nan Xiao <<[email protected]>>"
date: "`r Sys.Date()`"
output:
  rmarkdown::pdf_document:
    toc: true
    number_sections: true
liftr:
  from: "rocker/tidyverse:latest"
  maintainer: "Nan Xiao"
  email: "[email protected]"
  pandoc: false
  texlive: true
  cran:
    - nycflights13
---
```

Most of the fields are self-explanatory:

- Here we specify the latest `rocker/tidyverse` image as our
  base image, which saves us the time of building a custom
  base image with all the tidyverse dependencies.
- The custom `pandoc` installation is turned off because the
  tidyverse image already includes `pandoc`.
- We include TeX Live since we intend to render a PDF file in the end.
- The CRAN data package `nycflights13` will be installed.
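
To see how these fields look once parsed, the sketch below writes a trimmed copy of the header to a temporary file and reads it back with `rmarkdown::yaml_front_matter()`. This only illustrates the structure of the metadata; it assumes the rmarkdown package is installed, and the way liftr itself reads the header may differ.

```r
library(rmarkdown)

# A trimmed copy of the header discussed above (illustration only)
header = c(
  "---",
  'title: "Explore tidyverse with liftr"',
  "liftr:",
  '  from: "rocker/tidyverse:latest"',
  "  pandoc: false",
  "  texlive: true",
  "  cran:",
  "    - nycflights13",
  "---",
  "",
  "Body text."
)

tmp = tempfile(fileext = ".Rmd")
writeLines(header, tmp)

# Parse the YAML header and keep only the `liftr` section
meta = rmarkdown::yaml_front_matter(tmp)$liftr

meta$from     # base image
meta$pandoc   # whether a custom pandoc installation is requested
meta$texlive  # whether TeX Live is requested
meta$cran     # CRAN packages to install
```

The logical fields come back as R logicals and the package list as a character vector, which makes the header easy to inspect before containerizing the document.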

# Containerize the document

Let's containerize this document by generating a `Dockerfile` for it,
using `liftr::lift()`:

```{r, eval = FALSE}
lift(input)
```

A file named `Dockerfile` will be generated in the same directory
as the input R Markdown file. It contains the commands needed to build
the Docker image used for rendering the document.
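
Before building anything, it can be useful to read the generated file. The following base R sketch prints the `Dockerfile` if it exists; it assumes the `~/liftr-tidyverse/` folder used earlier in this article and makes no claim about what the file contains.

```r
# Location of the generated file, assuming the folder used in this article
dockerfile = file.path(path.expand("~/liftr-tidyverse"), "Dockerfile")

if (file.exists(dockerfile)) {
  # Print the generated instructions for review
  writeLines(readLines(dockerfile))
} else {
  message("No Dockerfile found at ", dockerfile, "; run lift(input) first.")
}
```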

# Render the document

We can use `render_docker()` to start the Docker container
and render the document inside it:

```{r, eval = FALSE}
render_docker(input)
```

Let's view the rendered document:

```{r, eval = FALSE}
browseURL(paste0(path, "liftr-tidyverse.pdf"))
```

In the last section of the rendered PDF, we will see that the session
information is probably different from that of your current system.
That is because the document was generated entirely inside
a newly built, isolated Linux system environment, using Docker.
In this way, the R Markdown document gains a higher, system-level
reproducibility and can be easily replicated by other users who might not
have a system and R package environment identical to yours.
This is a good thing for team collaboration and large-scale document
orchestration. The best part is that all you need to share is still the
document itself, with only a few extra metadata fields.
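
To make the comparison concrete, you could collect a few facts about your local session and set them side by side with the session information in the rendered PDF. This is a small base R sketch; the name `local_env` is only illustrative.

```r
# A few facts about the local (host) R session, for comparison with
# the session information printed in the rendered document
local_env = list(
  r_version = R.version.string,
  os        = Sys.info()[["sysname"]],
  platform  = R.version$platform
)

str(local_env)
```

If the operating system or R version listed here differs from what the PDF reports, that is the isolation at work: the document was rendered by the container, not by your local session.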

# Housekeeping

The Docker images stored on your system can take up a few gigabytes,
and the total grows as you build more images. Let's remove
the generated Docker image to free some disk space:

```{r, eval = FALSE}
prune_image(paste0(path, "liftr-tidyverse.docker.yml"))
```

If we do this, the Docker image will be rebuilt the next time
you use `render_docker()`. If not, the image stays cached
on the system and is reused when compiling the document later,
which saves some time.
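
To see how much space local images occupy before and after pruning, one option is to call the Docker command-line client from R. The helper below is a hypothetical sketch, not a liftr function; it assumes the client is available as `docker` on the PATH and relies on the standard `docker images` command.

```r
# Hypothetical helper (not part of liftr): print the local Docker
# images and their sizes by calling the Docker CLI.
list_local_images = function() {
  if (!nzchar(Sys.which("docker"))) {
    message("Docker CLI not found on the PATH.")
    return(invisible(NA_integer_))
  }
  # Returns the exit status of `docker images` invisibly
  invisible(system2("docker", "images"))
}

list_local_images()
```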

# References