liftr-intro.Rmd 6.4 KB

---
title: "A Quick Introduction to liftr"
author: "Nan Xiao <<https://nanx.me>>"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    number_sections: true
    css: liftr.css
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{A Quick Introduction to liftr}
---

# Introduction

In essence, liftr aims to solve the problem of _persistent reproducible reporting_.
To achieve this goal, it extends the [R Markdown](http://rmarkdown.rstudio.com)
metadata format, and uses Docker to containerize and render R Markdown documents.

# Metadata for containerization

To containerize your R Markdown document, the first step is adding `liftr`
fields to the YAML metadata section of the document. For example:

```yaml
---
title: "The Missing Example of liftr"
author: "Author Name"
date: "`r Sys.Date()`"
output: rmarkdown::html_document
liftr:
  maintainer: "Maintainer Name"
  email: "name@example.com"
  from: "rocker/r-base:latest"
  pandoc: true
  texlive: false
  sysdeps:
    - gfortran
  cran:
    - glmnet
  bioc:
    - Gviz
  remotes:
    - "road2stat/liftr"
  include: "DockerfileSnippet"
---
```
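
One way to check that the `liftr` fields sit where you expect in the front matter is to read them back into R. The sketch below is not how liftr itself parses the document; it writes a shortened copy of the example to a temporary file and, assuming the rmarkdown package is installed, inspects it with `rmarkdown::yaml_front_matter()`.

```r
# Write a shortened version of the example document to a temporary file
rmd <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"The Missing Example of liftr\"",
  "liftr:",
  "  maintainer: \"Maintainer Name\"",
  "  from: \"rocker/r-base:latest\"",
  "  cran:",
  "    - glmnet",
  "---",
  "",
  "Body text."
), rmd)

# Read the front matter back and show the nested liftr fields
# (skipped when rmarkdown is not installed)
if (requireNamespace("rmarkdown", quietly = TRUE)) {
  meta <- rmarkdown::yaml_front_matter(rmd)
  str(meta$liftr)
}
```

If a field such as `cran` shows up at the top level instead of under `liftr`, the indentation in the YAML block is likely off.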

All available metadata fields are explained below.

## Required metadata

- `maintainer`

    Maintainer's name for the Dockerfile.

- `email`

    Maintainer's email address for the Dockerfile.

## Optional metadata

- `from`

    Base image for building the Docker image. Default is
    `"rocker/r-base:latest"`. R users may want to start with the images
    offered by the [rocker project](https://github.com/rocker-org)
    and [Bioconductor](https://bioconductor.org/help/docker/).

- `pandoc`

    Whether to install pandoc in the container. Default is `true`.
    If pandoc is already installed in the base image, set this to
    `false` to avoid potential errors. For example, for
    [`rocker/rstudio` images](https://registry.hub.docker.com/u/rocker/rstudio/)
    and [`bioconductor/...` images](https://www.bioconductor.org/help/docker/),
    this option is automatically set to `false` since they already
    have pandoc installed.

- `texlive`

    Whether a TeX environment is needed to render the document. Default is
    `false`. Set it to `true` in particular when the output format is PDF.

- `sysdeps`

    Debian/Ubuntu system packages that the document depends on.
    Also include the system packages required by the R packages listed
    below. For example, `gfortran` is required here for compiling `glmnet`.

- `cran`

    CRAN packages that the document depends on.
    If only `pkgname` is provided, `liftr` will install the _latest_
    version of the package on CRAN. To improve reproducibility,
    we recommend giving the package name with a specific version number:
    `pkgname/pkgversion` (e.g. `ggplot2/1.0.0`), even if that version
    is currently the latest. Note: `pkgversion` must be provided
    to install archived versions of packages.

- `bioc`

    Bioconductor packages that the document depends on.

- `remotes`

    Remote R packages that are not available from CRAN or Bioconductor.
    The [remote package naming specification](https://github.com/hadley/devtools/blob/master/vignettes/dependencies.Rmd)
    from devtools is adopted here. Packages can be installed from GitHub,
    Bitbucket, Git/SVN servers, URLs, etc.

- `include`

    The path to a text file that contains a custom Dockerfile snippet.
    The snippet will be included in the generated Dockerfile.
    This can be used to install additional software packages
    or further configure the system environment.
    Note that this file should be in the same directory as the
    input R Markdown file.
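
To make the `pkgname/pkgversion` convention for `cran` entries concrete, here is a minimal base R sketch that splits an entry into its name and optional version. `parse_cran_spec()` is a hypothetical helper written for illustration; it is not part of liftr and does not claim to mirror how liftr processes the field.

```r
# Hypothetical helper (not part of liftr): split a `cran` entry of the form
# "pkgname" or "pkgname/pkgversion" into its name and optional version.
parse_cran_spec <- function(spec) {
  parts <- strsplit(spec, "/", fixed = TRUE)[[1]]
  list(
    name = parts[1],
    # NA means no version was pinned, i.e. the latest CRAN release
    version = if (length(parts) >= 2) parts[2] else NA_character_
  )
}

specs <- c("glmnet", "ggplot2/1.0.0")
parsed <- lapply(specs, parse_cran_spec)

# Report which entries are pinned to a specific version
for (p in parsed) {
  if (is.na(p$version)) {
    message(p$name, ": latest version on CRAN (not pinned)")
  } else {
    message(p$name, ": pinned to version ", p$version)
  }
}
```

Running a check like this over the `cran` field is a quick way to spot entries that are not yet pinned.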

# Containerize the document

After adding proper `liftr` metadata to the document YAML data block,
we can use `lift()` to parse the document and generate a Dockerfile.

We will use
[a minimal example](https://github.com/road2stat/liftr/blob/master/inst/examples/liftr-minimal.Rmd)
included in the liftr package. First, we create a new directory and copy
the R Markdown document into the directory:

```{r, eval = FALSE}
path = "~/liftr-minimal/"
dir.create(path)
file.copy(system.file("examples/liftr-minimal.Rmd", package = "liftr"), path)
```

Then, we use `lift()` to parse the document and generate the Dockerfile:

```{r, eval = FALSE}
library("liftr")
input = paste0(path, "liftr-minimal.Rmd")
lift(input)
```

After successfully running `lift()`, the Dockerfile will be in the
`~/liftr-minimal/` directory.
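
Before building an image, it can be worth looking at what was generated. The following base R sketch prints the first lines of the Dockerfile; `show_dockerfile()` is a hypothetical helper, and it assumes the generated file is named `Dockerfile` and sits in the document's directory, as described above.

```r
# Hypothetical helper (not part of liftr): print the first `n` lines of the
# Dockerfile in `dir`, returning all lines invisibly (or NULL if it is missing).
show_dockerfile <- function(dir, n = 10) {
  dockerfile <- file.path(dir, "Dockerfile")
  if (!file.exists(dockerfile)) {
    message("No Dockerfile found in ", dir)
    return(invisible(NULL))
  }
  lines <- readLines(dockerfile)
  writeLines(head(lines, n))
  invisible(lines)
}

show_dockerfile("~/liftr-minimal")
```

If the helper reports that no Dockerfile was found, check that `lift()` finished without errors.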

# Render the document

Now we can use `render_docker()` to render the document into an HTML file
inside a Docker container:

```{r, eval = FALSE}
render_docker(input)
```

The function `render_docker()` will parse the Dockerfile, build a new
Docker image, and run a Docker container to render the input document.
If rendering succeeds, the output `liftr-minimal.html` will be in
the `~/liftr-minimal/` directory. You can also pass additional
`rmarkdown::render()` arguments to this function.

To share the dockerized R Markdown document, simply share the
`.Rmd` file. Other users can use the `lift()` and `render_docker()`
functions to render the document as above.
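
As a hedged sketch of forwarding an extra argument, the code below passes `quiet = TRUE`, which is an `rmarkdown::render()` argument, and assumes it is handed through as described above. `expected_output()` is a hypothetical helper, and the guard conditions (liftr installed, a `docker` executable on the path, `lift()` already run) are our own precautions rather than something liftr requires you to write.

```r
# Hypothetical helper (not part of liftr): derive the expected HTML path
# by swapping the .Rmd extension of the input file.
expected_output <- function(input) sub("\\.Rmd$", ".html", input)

input <- "~/liftr-minimal/liftr-minimal.Rmd"
output <- expected_output(input)

# Only attempt rendering when liftr, the docker CLI, the input document,
# and the Dockerfile generated by lift() are all present.
ready <- requireNamespace("liftr", quietly = TRUE) &&
  nzchar(Sys.which("docker")) &&
  file.exists(input) &&
  file.exists(file.path(dirname(input), "Dockerfile"))

if (ready) {
  # `quiet = TRUE` is assumed to be forwarded to rmarkdown::render()
  liftr::render_docker(input, quiet = TRUE)
  if (file.exists(output)) message("Rendered: ", output)
} else {
  message("Skipping: liftr, Docker, the input file, or the Dockerfile is missing.")
}
```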

# Housekeeping

By default, the argument `prune` is set to `TRUE` in `render_docker()`,
so any dangling containers or images left by unsuccessful
builds are cleaned up automatically.

To clean up dangling containers, dangling images, or everything without
specifying names, use `prune_container_auto()`,
`prune_image_auto()`, and `prune_all_auto()`, respectively.

If you wish to manually remove the Docker container or
image (whose information is stored in an output YAML file)
after successful rendering, use `prune_container()` and `prune_image()`:

```{r, eval = FALSE}
prune_image(paste0(path, "liftr-minimal.docker.yml"))
```

The input YAML file above contains basic information about the
Docker container, the image, and the commands used to render the document.
It is generated by setting `prune_info = TRUE` (default) in `render_docker()`.

# Install Docker

Docker is an essential system requirement when using liftr to render
R Markdown documents. `install_docker()` will help you find the
proper guide to install and set up Docker on your system.

To check if Docker is correctly installed, use `check_docker_install()`;
to check if the Docker daemon is running, use `check_docker_running()`.

In particular, Linux users should configure Docker to
[run without sudo](https://docs.docker.com/engine/installation/linux/linux-postinstall/).
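
For readers who want to see what such checks amount to, here is a minimal base R sketch that does not use liftr's own functions. `docker_available()` and `docker_running()` are hypothetical helpers; they assume that a `docker` executable on the search path means Docker is installed, and that a zero exit status from `docker info` means the daemon is reachable.

```r
# Hypothetical helpers (not part of liftr), using base R only.

# TRUE when a `docker` executable is found on the search path
docker_available <- function() {
  nzchar(Sys.which("docker"))
}

# TRUE when `docker info` exits with status 0, taken here as a sign that
# the daemon is reachable by the current user
docker_running <- function() {
  if (!docker_available()) return(FALSE)
  status <- suppressWarnings(
    system2("docker", "info", stdout = FALSE, stderr = FALSE)
  )
  isTRUE(status == 0)
}

if (!docker_available()) {
  message("Docker does not appear to be installed.")
} else if (!docker_running()) {
  message("Docker is installed, but the daemon is not reachable.")
} else {
  message("Docker is installed and running.")
}
```

On Linux, the second message can also appear when Docker still requires `sudo` for the current user.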