
---
title: "Dockerize R Markdown Documents"
author: "Nan Xiao <<[email protected]>>"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    number_sections: true
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Dockerize R Markdown Documents}
---

# Add `liftr` Metadata

To dockerize an R Markdown document, first add `liftr` options to the document's YAML front matter. For example:

```
---
title: "The Missing Example of liftr"
author: "Author Name"
date: "`r Sys.Date()`"
output:
  html_document:
    highlight: haddock
    theme: readable
liftr:
  maintainer: "Author Name"
  maintainer_email: "[email protected]"
  from: "rocker/r-base:latest"
  latex: false
  pandoc: true
  syslib:
    - gfortran
    - samtools
  cranpkg:
    - randomForest
  biocpkg:
    - Gviz
    - ggbio
  ghpkg:
    - "road2stat/liftr"
  rabix: true
  rabix_json: "https://s3.amazonaws.com/rabix/rabix-test/bwa-mem.json"
  rabix_d: "~/liftr_rabix/bwa/"
  rabix_args:
    - reference: "https://s3.amazonaws.com/rabix/rabix-test/chr20.fa"
    - reads: "https://s3.amazonaws.com/rabix/rabix-test/example_human_Illumina.pe_1.fastq"
    - reads: "https://s3.amazonaws.com/rabix/rabix-test/example_human_Illumina.pe_2.fastq"
---
```

All available options are explained below.

## Required options

* `maintainer` - Maintainer name for the `Dockerfile`.
* `maintainer_email` - Maintainer email address for the `Dockerfile`.

## Optional options

* `from` - [Base image](https://docs.docker.com/reference/builder/#from) for building the Docker image. Default is `"rocker/r-base:latest"`.
* `latex` - Is a TeX environment needed when rendering the document? Default is `false`.
* `pandoc` - Should pandoc be installed in the container? Default is `true`. If pandoc is already installed in the base image, set this to `false` to avoid potential errors. For [`rocker/rstudio`](https://registry.hub.docker.com/u/rocker/rstudio/) and [`bioconductor/...`](https://www.bioconductor.org/help/docker/) images, this option is automatically set to `false` because they already include pandoc.
* `syslib` - Debian/Ubuntu system packages that the document depends on. Please also include system packages required by the R packages listed below, such as `gfortran`, which is needed here to compile `randomForest`.
* `cranpkg` - CRAN packages that the document depends on. If only `pkgname` is provided, `liftr` will install the _latest_ version of the package on CRAN. To improve reproducibility, we recommend giving the package name with a specific version number: `pkgname/pkgversion` (e.g. `ggplot2/1.0.0`), even if that version is currently the latest. Note: `pkgversion` must be provided to install archived versions of packages.
* `biocpkg` - Bioconductor packages that the document depends on.
* `ghpkg` - GitHub R packages that the document depends on. We accept the same format as the `repo` argument of `devtools::install_github`. Normally, `"username/repo"` is sufficient.
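
To make the `pkgname/pkgversion` convention for `cranpkg` concrete, here is a minimal sketch of how such a string could be split into a name and an optional version. The helper `parse_cranpkg()` is hypothetical and written only for illustration; it is not a `liftr` function and may differ from what `liftr` does internally.

```r
# Hypothetical helper (not part of liftr): split "pkgname/pkgversion"
# into a package name and an optional version string.
parse_cranpkg = function(spec) {
  parts = strsplit(spec, "/", fixed = TRUE)[[1]]
  list(
    name = parts[1],
    # NA means no version was pinned, i.e. "use the latest CRAN release"
    version = if (length(parts) >= 2) parts[2] else NA_character_
  )
}

pinned = parse_cranpkg("ggplot2/1.0.0")  # name and version both given
latest = parse_cranpkg("randomForest")   # name only, version is NA
```

Pinning a version this way keeps the rendered output stable even after newer releases of the package reach CRAN.
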

## Rabix options

The Rabix options are optional. Set `rabix: true` when you need to enable Rabix support.

* `rabix` - Logical. Should Rabix support be enabled for this document?
* `rabix_json` - The URI (local file path or HTTP/HTTPS URL) of a JSON document that describes the Rabix app.
* `rabix_d` - Working directory for the task. Required when `rabix: true`. For better reproducibility and easier access to the output, we recommend setting this to the directory that contains the input R Markdown document, or to one of its subdirectories.
* `rabix_args` - Additional arguments for Rabix and the Rabix app, usually the inputs and parameters. Run `rabix -h` or [read this page](https://github.com/rabix/rabix/blob/master/README.md) for more details.
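
The rules above (two required options, plus `rabix_d` whenever `rabix: true`) can be checked before calling `lift()`. The following is a rough sketch that assumes the `liftr` block has already been parsed into an R list. The function `check_liftr_meta()` and the placeholder email address are hypothetical and not part of `liftr`.

```r
# Hypothetical validator (not a liftr function): returns a character
# vector describing problems found in a parsed `liftr` metadata list.
check_liftr_meta = function(meta) {
  problems = character(0)
  # `maintainer` and `maintainer_email` are the two required options
  for (field in c("maintainer", "maintainer_email")) {
    if (is.null(meta[[field]])) {
      problems = c(problems, paste0("missing required option: ", field))
    }
  }
  # `rabix_d` is required only when Rabix support is switched on
  if (isTRUE(meta[["rabix"]]) && is.null(meta[["rabix_d"]])) {
    problems = c(problems, "rabix_d is required when rabix: true")
  }
  problems
}

# Placeholder values for illustration only
meta_ok = list(
  maintainer = "Author Name",
  maintainer_email = "author@example.com",
  rabix = TRUE,
  rabix_d = "~/liftr_rabix/bwa/"
)
meta_bad = list(maintainer = "Author Name", rabix = TRUE)

check_liftr_meta(meta_ok)   # no problems reported
check_liftr_meta(meta_bad)  # reports the missing email and rabix_d
```
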

# Use `lift()` and `drender()`

After adding proper `liftr` metadata to the document's YAML block, we can use `lift()` to parse the document and generate a `Dockerfile` (it will also generate a `Rabixfile` if necessary).

We will use [docker.Rmd](https://github.com/road2stat/liftr/blob/master/inst/docker.Rmd) included in the package as an example. First, we create a new directory and copy the example document into it:

```{r, eval = FALSE}
# Working directory for the Docker example
dir_docker = "~/liftr_docker/"
dir.create(dir_docker)
# Copy the example document shipped with the package
file.copy(system.file("examples/docker.Rmd", package = "liftr"), dir_docker)
```

Then, we use `lift()` to parse the document and generate the `Dockerfile`:

```{r, eval = FALSE}
library("liftr")
# Full path to the copied example document
docker_input = paste0(dir_docker, "docker.Rmd")
lift(docker_input)
```

After `lift()` runs successfully on `docker.Rmd`, the `Dockerfile` will be in the `~/liftr_docker/` directory.
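
Before building the image, it can be useful to confirm that the `Dockerfile` was actually written. The snippet below is only a sketch using base R; it assumes the `~/liftr_docker/` directory from the example above and makes no assumption about what the generated file contains.

```r
# Assumes the `~/liftr_docker/` directory used in the example above
dockerfile_path = file.path(path.expand("~/liftr_docker"), "Dockerfile")

if (file.exists(dockerfile_path)) {
  # Print the first few lines for a quick visual check
  writeLines(readLines(dockerfile_path, n = 10L))
} else {
  message("No Dockerfile found at ", dockerfile_path)
}
```
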

Now we can use `drender()` on `docker.Rmd` to render the document to an HTML file inside a Docker container:

```{r, eval = FALSE}
drender(docker_input)
```

The `drender()` function will parse the `Dockerfile`, build a new Docker image, and run a container to render the input document. If rendering succeeds, the output `docker.html` will be in the `~/liftr_docker/` directory. You can also pass additional `rmarkdown::render` arguments to this function.

To share the dockerized R Markdown document, simply share the `.Rmd` file. Other users can use the `lift()` and `drender()` functions to render the document as above.

# Render an Interactive R Markdown Shiny Document

This generates a Docker container for your Shiny document, which you can launch anywhere and view in a web browser.

```{r, eval = FALSE}
# Reuses `dir_docker` from the previous section
file.copy(system.file("examples/ShinyDoc.Rmd", package = "liftr"), dir_docker)
docker_input = paste0(dir_docker, "ShinyDoc.Rmd")
lift(docker_input)
drender(docker_input, clean = TRUE, browseURL = TRUE)
```

# Dockerize a Shiny App

This generates a Docker container for your Shiny app folder, which you can launch anywhere and view in a web browser.

```{r, eval = FALSE}
# Copy the whole example app folder, not just a single file
file.copy(system.file("examples/shinyapp", package = "liftr"),
          dir_docker, recursive = TRUE)
docker_input = paste0(dir_docker, "shinyapp")
lift(docker_input)
drender(docker_input, clean = TRUE, browseURL = TRUE)
```

# Rabix Support (Under Development)

[Rabix](https://www.rabix.org) is an open source implementation of the [Common Workflow Language](https://github.com/common-workflow-language/common-workflow-language) specification for building portable bioinformatics pipelines. Users can write JSON-based tools/workflows and run them with Rabix.

We will use `rabix.Rmd` included in the package as an example. As before, we create a new directory and copy the example document into it:

```{r, eval = FALSE}
dir_rabix = "~/liftr_rabix/"
dir.create(dir_rabix)
file.copy(system.file("rabix.Rmd", package = "liftr"), dir_rabix)
```

Use `lift()` and `drender()` as before:

```{r, eval = FALSE}
library("liftr")
rabix_input = paste0(dir_rabix, "rabix.Rmd")
lift(rabix_input)
drender(rabix_input)
```

The Rabix tools/workflows run first, and the document is rendered afterwards. This way, we can use the output of the bioinformatics pipelines for further analysis in our R Markdown document. See [rabix.Rmd](https://github.com/road2stat/liftr/blob/master/inst/rabix.Rmd) for details.

If rendering succeeds, the output `rabix.html` will be in the `~/liftr_rabix/` directory.

# System Requirements

Linux is currently the preferred host platform; on other platforms, running Docker has certain limitations and performance issues.

## Docker

We need Docker installed to render the documents.

To install Docker in Ubuntu:

    sudo apt-get install docker.io

We should configure Docker to [run without sudo](https://docs.docker.com/installation/ubuntulinux/#create-a-docker-group). To avoid `sudo` when using the `docker` command, make sure a group named `docker` exists and add yourself to it:

    sudo usermod -aG docker your-username

[Here](https://docs.docker.com/installation/) is a detailed guide for installing Docker on most platforms. In any case, make sure you can run `docker` from the shell.

## Rabix

Rabix needs to be installed if you want to run Rabix tools/workflows before rendering the documents. Make sure you can run `rabix` from the shell after installation.

To install Rabix in Ubuntu:

    sudo apt-get install python-dev python-pip docker.io phantomjs libyaml-dev
    sudo pip install rabix

[Here](https://github.com/rabix/rabix/blob/master/README.md) is a more detailed guide for installing Rabix on other platforms.
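
Both requirements come down to whether the `docker` and `rabix` commands can be found from the shell. As a minimal sketch, this can be checked from R with base `Sys.which()`. The helper `has_command()` is hypothetical and not part of `liftr`; `rabix` only matters if Rabix support is enabled.

```r
# Hypothetical helper (not part of liftr): TRUE if `cmd` is on the PATH
has_command = function(cmd) {
  # Sys.which() returns "" for commands it cannot find
  unname(nzchar(Sys.which(cmd)))
}

for (cmd in c("docker", "rabix")) {
  if (!has_command(cmd)) {
    message("`", cmd, "` was not found on the PATH")
  }
}
```

Note that finding `docker` on the PATH does not prove the current user may talk to the Docker daemon without `sudo`; that still depends on the group setup described above.
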

<hr>

`--EOF--`