template package

Generic template for research or data analysis projects structured as R packages


Rmarkdown documents are great for keeping scientific workflows reproducible, tightly integrating code, results and text. Yet once a project involves more complicated data analysis and custom code and functions, structuring it as an R package can bring many advantages (see e.g. Marwick et al., although counterpoints have been raised too).

Hence this package works as a template for new research or data analysis projects, with the idea of having everything (data, R scripts, functions, and manuscript reporting results) self-contained in the same package (a “research compendium”) to facilitate collaboration and promote reproducibility.

A short presentation introducing this approach on ‘Structuring data analysis projects as R packages’ is available here: https://doi.org/10.6084/m9.figshare.12479984.v1

Installation

# install.packages("remotes")
remotes::install_github("Pakillo/template")

Usage

First, load the package:

library("template")

Now run the function new_project to create a directory with all the scaffolding (slightly modified from R package structure). For example, to start a new project about tree growth, just use:

new_project("treegrowth")

If you want to create a GitHub repository for the project at the same time, use instead:

new_project("treegrowth", github = TRUE, private.repo = FALSE)

You can choose either a public or a private repository. Note that to create a GitHub repo you need to have configured your system as explained in https://usethis.r-lib.org/articles/articles/usethis-setup.html.

There are other options you could choose, like setting up testthat or continuous integration (Travis-CI, GitHub Actions…). See ?new_project for all options.

Developing the project

  1. Now edit README.Rmd and the DESCRIPTION file with some basic information about your project: title, brief description, licence, package dependencies, etc.
  2. Place original (raw) data in the data-raw folder. Save all R scripts (or Rmarkdown documents) used for data preparation in the same folder.
  3. Save final (clean, tidy) datasets in the data folder. You may write documentation for these data.
  4. R scripts or Rmarkdown documents used for data analyses may be placed in the analyses folder. The final manuscript/report may be placed in the manuscript folder. You could use one of the many Rmarkdown templates available out there (e.g. rticles, rrtools or rmdTemplates).
  5. If you write custom functions, place them in the R folder. Document all your functions with Roxygen. Write tests for your functions and place them in the tests folder.
  6. If your analysis uses functions from other CRAN packages, include these as dependencies (Imports) in the DESCRIPTION file (e.g. using usethis::use_package() or rrtools::add_dependencies_to_description()). Also, use Roxygen @import or @importFrom in your function definitions, or alternatively package::function(), to import these dependencies in the namespace.
  7. I recommend using an advanced tool like drake or targets to manage your project workflow. A simpler alternative might be writing a makefile or master script to organise and execute all parts of the analysis. A template makefile is included with this package (use makefile = TRUE when calling new_project).
  8. Render Rmarkdown reports using rmarkdown::render, and use the RStudio Build menu to create/update documentation, run tests, build the package, etc.
  9. Record the exact dependencies of your project. One option is simply running sessionInfo(), but many more sophisticated alternatives exist. For example, automagic::make_deps_file() or renv::snapshot() will create a file recording the exact versions of all packages used, which can be used to recreate that environment in the future or on another computer. If you want to use Docker, you could use e.g. containerit::dockerfile() or rrtools::use_dockerfile().
  10. Archive your repository (e.g. in Zenodo), get a DOI, and include citation information in your README.
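
As a rough illustration of steps 2, 3 and 5 above, the following base R sketch walks a tiny dataset from data-raw to data through a Roxygen-documented function. It is not part of the template itself: the function clean_growth(), the file names trees_raw.csv and trees.rda, and the data values are all made up for the example, and everything runs in a temporary directory rather than in a real project.

```r
# Hypothetical example: all file names, values and clean_growth() are assumptions.
# A throwaway directory stands in for a project created with new_project().
proj <- file.path(tempdir(), "treegrowth")
for (d in c("data-raw", "data", "R")) {
  dir.create(file.path(proj, d), recursive = TRUE, showWarnings = FALSE)
}

# Step 2: raw data is stored untouched in data-raw/ (made-up values).
raw <- data.frame(tree = c("a", "b", "c"), growth_mm = c(2.1, NA, 3.4))
write.csv(raw, file.path(proj, "data-raw", "trees_raw.csv"), row.names = FALSE)

# Step 5: a custom function, which would normally live in R/clean_growth.R.
#' Drop trees with missing growth measurements
#'
#' @param x A data frame with a numeric column `growth_mm`.
#' @return `x` without the rows where `growth_mm` is NA.
#' @export
clean_growth <- function(x) {
  x[!is.na(x$growth_mm), , drop = FALSE]
}

# Step 3: the preparation script reads the raw file, cleans it,
# and saves the tidy dataset in data/.
trees <- clean_growth(read.csv(file.path(proj, "data-raw", "trees_raw.csv")))
save(trees, file = file.path(proj, "data", "trees.rda"))
```

Inside a real package project, usethis::use_data(trees) is a common alternative to the final save() call, since it writes data/trees.rda in the format R packages expect.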
