1 Data and Code Access

1.0.1 About

The Technical Documentation for the State of the Ecosystem (SOE) reports is a bookdown document; hosted on the NOAA Northeast Fisheries Science Center (NEFSC) Ecosystems Dynamics and Assessment Branch Github page, and developed in R. Derived data used to populate figures in this document are queried directly from the ecodata R package or the NEFSC ERDDAP server. ERDDAP queries are made using the R package rerddap.

1.0.2 Accessing data and build code

In this technical documentation, we hope to shine a light on the processing and analytical steps involved to get from source data to final product. This means that whenever possible, we have included the code involved in source data extraction, processing, and analyses. We have also attempted to thoroughly describe all methods in place of or in supplement to provided code. Example plotting code for each indicator is presented in sections titled “Plotting”, and these code chunks can be used to recreate the figures found in ecosystem reporting documents where each respective indicator was included1.

Source data for the derived indicators in this document are linked to in the text unless there are privacy concerns involved. In that case, it may be possible to access source data by reaching out to the Point of Contact associated with that data set. Derived data sets make up the majority of the indicators presented in ecosystem reporting documents, and these data sets are available for download through the ecodata R package.

1.0.3 Building the document

Start a local build of the SOE bookdown document by first cloning the project’s associated git repository. Next, if you would like to build a past version of the document, use git checkout [version_commit_hash] to revert the project to a past commit of interest, and set build_latest <- FALSE in this code chunk. This will ensure the project builds from a cached data set, and not the most updated versions present on the NEFSC ERDDAP server. Once the tech-doc.Rproj file is opened in RStudio, run bookdown::serve_book() from the console to build the document.

1.0.3.1 A note on data structures

The majority of the derived time series used in State of the Ecosystem reports are in long format. This approach was taken so that all disparate data sets could be “bound” together for ease of use in our base plotting functions.

catalog link No associated catalog page


  1. There are multiple R scripts sourced throughout this document in an attempt to keep code concise. These scripts include BasePlot_source.R, GIS_source.R, and get_erddap.R. The scripts BasePlot_source.R and GIS_source.R refer to deprecated code used prior to the 2019 State of the Ecosystem reports. Indicators that were not included in reports after 2018 make use of this syntax, whereas newer indicators typically use ggplot2 for plotting.↩︎