Loading [MathJax]/jax/output/CommonHTML/jax.js
+ - 0:00:00
Notes for current slide
Notes for next slide

Creating R Packages

Andy Beet
Ecosystem Dynamics and Assessment
Integrated Statistics & Northeast Fisheries Science Center

1 / 26

What is an R package?

An R package is a collection of files (functions & data) with documentation.

You will have already installed and used many packages in your work.

Packages developed by others need to be installed. Often these are installed from either CRAN (Comprehensive R Archive Network) or GitHub.

  • To install from CRAN install.packages("packageName")

  • To install from github remotes::install_github("packageName")

  • To install vignettes associated with packages on GitHub

remotes::install_github("packageName",build_vignettes = TRUE)

2 / 26

What is an R package? cont ...

The whole package can be loaded into memory using

library(packageName)

All functions in the package can then be used by its functionName

For example, the package readxl has a dozen or so functions that help you get data out of excel files and into R. To load all the functions into memory:

library(readxl)

Individual functions from a package can be loaded without having to load the entire library. This becomes important when writing your own packages. Use the syntax:

packageName::functionName

For example if you want to read a legacy excel file into R:

readxl::read_xls("filename.xls")

3 / 26

Why create your own R packages?

  • Keeps you organized

  • Reduces copy and pasting of functions from project to project

  • Allows to to keep track of code and easily reuse it

  • Allow you to easily share your code (eg. through a repository on GitHub). Important for publications

  • Great options for documentation (roxygen, rmarkdown, pkgdown)

  • Better understanding of how R works

R packages by H.Wickham

Your own package will be organized in a way akin to folders on your computer. You will keep R code in one folder, data in another, documentation in another etc.

4 / 26

Pre-requisites: devtools ,remotes & usethis

Three key packages you will need (if you dont have them already) are:

  1. devtools
  2. remotes
  3. usethis.

usethis is a package to aid in development of packages.

To check you have it installed, type the following line:

"usethis" %in% row.names(installed.packages())

A result of FALSE means you dont have it. Install it

The devtools package was split into remotes and usethis. However there are still a few cases in which you will need devtools. It is advisable to install all three.

install.packages(c("remotes","devtools","usethis"))

5 / 26

Prerequisites: Rtools, XCode

You will also need to make sure you have another component installed (This is not an R package) and this component differs between operating sytems.

  1. Windows: Rtools
  2. Mac: XCode (free in app store)
  3. Linux: install R development tools.

To check you have everything installed type in the console:

devtools::has_devel()

If the result is "Your system is ready to build packages!", then you have all required tools.

For more info regarding platform specific options: See What they forgot to teach you about R

6 / 26

Prerequisites: Other packages

  • Rmarkdown package (to creating general documentation)

  • Knitr package (to convert documentation into html or pdf)

  • Roxygen package (for function documentation)

  • devtools package (for package development)

  • usethis package (for package development)

  • remotes package (for installing packages from github)

To install all at once:

install.packages(c("knitr","roxygen2","rmarkdown","devtools","remotes","usethis"))

7 / 26

Naming your package

  • Keep it concise, unique, but descriptive.

  • only use letters, numbers, (and periods).

  • no underscores or hyphens.

  • use abbreviations or acronyms (we should be good at that!)

Hadley wickham has other recommandations in his book R-packages

8 / 26

Create your package

To create a project from scratch_

File -> New Project -> New Directory -> R Package

  • Select a name for your package (this will also be the name of the folder created to store all of your files)

  • Select where you want to save this folder

A collection of files and folders will be created for you. (next slide)

To create a project from existing code_

If you'd like to turn existing code in to a package use:

usethis::create_package()

This will create all of the same content with the exception of the man folder used for documentation.

To include this: devtools::document()

9 / 26

Package structure

All packages have a specific structure that needs to be adhered to. We will use RStudios GUI to create a package (File -> New Project -> New Directory -> R Package).

This creates the basic elements for a package (5 components)

  • A DESCRIPTION file - Basic information about the package, authors, contributors, version, title, description, the type of license, and your package dependencies (a list of other packages that your package depends on!)
  • A NAMESPACE file - defines the functions in your package made available to other users. It also defines external functions and packages that are imported. (This is automatically generated if using roxygen2 for documentation so dont worry about this.).
  • An R/ directory where your R code will reside
  • A man/ directory where function documentation will reside (this is handled by roxygen2)
  • An .rbuildignore file where you can specify which files you do not want to include in the package

Note: If you create a package this way, YOU MUST delete the NAMESPACE file. It will get regenerated at a later point in time.

10 / 26

Project preferences

You will also need to change some preferences in the project options.

Tools -> Project Options -> Build tools

Check "Generate documentation with Roxygen" then click configure. Check everything

Note: If you already have a project in RStudio, you can convert it into a package at any time.

11 / 26

Packages: Now what?

First you should edit the DESCRIPTIONS file to give some basic information about your package. However this can be done at anytime prior to sharing.

  • Package: Enter the name of your package
  • Title: Enter a single line describing your project
  • Author: Enter the authors, emails, and their roles e.g.

c(person("Joe","Bloggs",email = "joebloggs\@mydom.com"), role=c("aut","cre"), person("Jill","Bloggs",email = "jillbloggs\@mydom.com"), role="aut"))

  • Description: Paragraph describing your package
  • License: Licensing info
12 / 26

Packages: Workflow

At this point you are ready to enter the workflow loop:

  1. Develop your code
  2. Document it (Dont forget to @export)
  3. Build your package (Ctrl+Shift+B)
  4. See if the code does what you expect
  5. Look at documentation using ? to see if it looks ok
  6. Repeat

As you continue in this workflow loop you may realize you need to:

  • Include data in your package. (usethis::use_data)
  • Do some preprocessing of data that you don't want in your package (just the result of the processing) (usethis::use_data_raw)
  • Add an overview of how to use your package (usethis::use_vignette)
  • Utilize functions in external packages (usethis::use_package)
13 / 26

Example

Lets create a package called simpleStats. It has a function called calc_stats which given a vector as input calculates the mean, range, and standard deviation. The result is a list containing these 3 results. Create a data set to be bunded with the package called myData.Rdata. Make myData be a numeric vector of size 100 of a white noise time series (mean=0, sd=1). Import the package ggplot2 and plot the data

14 / 26

Documentation

Documetation is a critical part of any package, since if your code isn't documented then how will anyone know how to use it. It is also useful for you. And it is super easy and straightforward.

R provides a standard way of documenting functions and data in your package. These are .Rd files that reside in the man/ folder are similar in structure to LaTeX. However you dont need to know any of that, since roxygen2 will create these for you with minimal effort. The main advantages of using roxygen2 are:

  • The documentation resides in the same file as your code. It is then translated to the format R requires (.Rd files) and copied to the man folder.
  • The NAMESPACE is updated to reflect if code it to be made available to user
  • The DESCRIPTIONS file is updated to reflect use of external packages
15 / 26

Documentation: Functions

All functions should be documented. Every line of documentation begins with #'. Note the apostrophe! The order of the comments translate directly to the order they appear when typing ?. There are quite a few tags you could use but the main tags are:

  • @param : the name of an argument passed to the function
  • @return : a description of the result returned from your function (scalar, matrix, list)
  • @section : An arbitrary section you'd like to include in the help document
  • @seealso : A section for you to add links to other files and functions
  • @examples : Provide examples of how the function works. Include examples!
  • @export : Determins if the function is to be accessed by anyone or only internally within the package
16 / 26

Documentation: Function example

#' function Title
#'
#' function description
#'
#' @param argument1 description of argument1
#' @param argument2 description of argument2
#'
#' @return A list containing
#'
#' \item{res1}{description of res1}
#' \item{res2}{description of res2}
#'
#' @section sectionHeadingName:
#' Write whatever you want
#'
#' @seealso \code{\link{some_other_function_name}}
#'
#' @examples
#' write examples of code usage
#' myfunction(argument1,argument2)
#'
#' @export
myfunction <- function(argument1, argument2) {
do something....
res <- list(res1=res1,res2=res2)
return(res)
}
17 / 26

Documetnation: External data

You can bundle data along with your package and make it available to the user with minimal effort. Any data placed in data/ will be exported and available for users. The data can be "lazily loaded" when the package is loaded into memory. This means the data wont use any memory until it is used. This is handled in the DESCRIPTIONS file. (LazyData: true). Packages work well (and fast) with data files in the .RData format.

Data documentation has a similar feel to function documentation.

Note: If your data file is named myData.RData then it must contain one object with the same name as the filename (myData) and your documentation should be named myData.R. This file should reside in your R/ folder with your other code. (usethis::use_data() can simplify this process). NEVER @export data

18 / 26

Documetnation: External data example

#' data Title
#'
#' data description
#'
#' @format description of data class and size (often a data frame)
#' \describe{
#' \item{col1}{description of col1}
#' \item{col2}{description of col2}
#' ...
#' }
#' @section optional Section:
#' section description
#'
#' @source See authors et al (2018) \url{http:://www.somesite.com}
"myData"
19 / 26

Documentation: Internal data

You may create a package that requires data to function. You may not want to make this data available to a user. You can use

usethis::use_data(..., internal=TRUE)

to create a file (in your project root directory) called sysdata.rda. This will not get exported. No need to document.

Data documentation: Processed (Raw) data

Often before you are ready to distribute your data you have to clean it up, process it in some way, or make it more user friendly. You should include the code used to do this. This code should reside in data-raw/ and should not be bundled with your package.

usethis::use_data_raw()

will create the folder for you and include this folder in the .RbuildIgnore file

20 / 26

Documentation: Package

To access the package details by using package?packageName a file called packageName.R needs to be created:

#' Package Title
#'
#' Package description
#'
#'@section headingName:
#' Write whatever you want
#'
#'@docType package
#'@name packageName
NULL
21 / 26

Documentation: Vignettes

Package vignettes are a type of documentation (written in Rmarkdown) that give an overview of your package, explain the components and even show how the functions work. It should behave like a guide. Examples of these can be found by typing browseVignettes("packageName"). To create a vignette for your package type:

usethis::use_vignette("vignetteName")

This will:

  • Create a folder called vignettes/
  • Create a template rmarkdown .rmd file in the vignettes folder
  • Adds the Suggests: knitr, rmarkdown to your DESCRIPTION file
22 / 26

Importing packages

You may need use other packages in your package. These are termed dependencies and are handled in the DESCRIPTIONS file. To add a dependency to you package simply type

usethis::use_package("packageName")

This will add the line Imports: packageName in your DESCRIPTIONS file.

You can now use all of the functions in this package in your code. I recommend whenever possible to use the syntax packageName::functionName instead of functionName. This helps you in the future to debug your code and identiy which functions are imported. You can use as many packages as you like. These packages will be downloaded and installed on the users machine when they install your package so try to be efficient.

23 / 26

Using C++

You can speed up portions of your code by using compiled code like C++. All compiled code resides in its own folder src/. To include this functionality in your package you need the Rcpp package. This is achieved by typing:

usethis::use_rcpp()

This command does several things

  • Creates the src/ folder
  • Adds the lines (LinkingTo: Rcpp and Imports: Rcpp) to the DESCRIPTIONS file
  • Add the necessary export info to the NAMESPACE file
  • Adds the compiled files to .gitignore (if using version control)
  • Informs you of additional lines required in your packageName.R documentation file
    #' @useDynLib packageName
    #' @importFrom Rcpp sourceCpp
24 / 26

Sharing your code

At some point in your package development you should start using git (locally in RStudio) to track ongoing changes to your project/package. Eventually linking to GitHub so you can share your package with others. Even if you dont plan to share your package, GitHub is still a good option for keeping all of your packages in a central location.

  • Link RStudio to GitHub and push all of your files.
  • create a readMe.md file on GitHub to tell the world what this repository is and instructions on how to download it.

Packages can be downloaded from GitHub using:

remotes::install_github(package="packageName",build_vignettes = TRUE)

25 / 26

Step by step

  1. File->New Project->R package - Creates package in RStudio (Folder name = package name)
  2. Tools->Project Options->Build tools. Check Roxygen. Check All
  3. Edit DESCRIPTIONS file
  4. Delete NAMESPACE file
  5. devtools::document() - creates man/ folder and new NAMESPACE
  6. usthis::use_data_raw() - creates data-raw/ folder
  7. usethis::use_vignette("vignetteName") - creates vignettes/ folder
  8. Create main package documentation, packageName.R, and document it.
  9. Write some code and document it. Save file as .R in the R/ folder
  10. Build Tab (Rstudio)->"Run and Build" or "Install and Restart" to build package and test for errors. (Or Ctrl+Shift+B)
  11. usethis::use_package("packagName1","packageName2") to include external packages
  12. usethis::use_data("dataName") - .RData saved in /data folder
26 / 26

What is an R package?

An R package is a collection of files (functions & data) with documentation.

You will have already installed and used many packages in your work.

Packages developed by others need to be installed. Often these are installed from either CRAN (Comprehensive R Archive Network) or GitHub.

  • To install from CRAN install.packages("packageName")

  • To install from github remotes::install_github("packageName")

  • To install vignettes associated with packages on GitHub

remotes::install_github("packageName",build_vignettes = TRUE)

2 / 26
Paused

Help

Keyboard shortcuts

, , Pg Up, k Go to previous slide
, , Pg Dn, Space, j Go to next slide
Home Go to first slide
End Go to last slide
Number + Return Go to specific slide
b / m / f Toggle blackout / mirrored / fullscreen mode
c Clone slideshow
p Toggle presenter mode
t Restart the presentation timer
?, h Toggle this help
Esc Back to slideshow