Overview

Introduction

This report provides contextual ecosystem information for summer flounder (Paralichthys dentatus) on the Northeast U.S. Continental Shelf. Data extractions for spring and fall are confined to the summer flounder stock area based on respective survey strata sets. The information is intended to span a range of potential factors affecting the productivity and distribution of summer flounder, including: surface and bottom temperature and salinity, chlorophyll concentrations, indices of habitat, diet composition, and abundance of key zooplankton prey of larval summer flounder. These factors can be used to qualitatively inform the interpretation of population status and/or quantitatively to improve model responsiveness to ecosystem factors. The range and complexity of ecosystem data makes it unlikely to find the most relevant and comprehensive factor variables with a first evaluation; this process will require an iterative approach of evaluation and feedback. Additional indices can be included to address the needs of the working group.

Several aspects of the ecosystem seem to be changing in the most recent years. Fall bottom and surface temperature are increasing and salinity is at or near the historical high levels. These physical series may have shifted around 2012, the warmest year on record for this ecosystem. Spring chlorophyll concentrations, a measure of bottom-up ecosystem production in the summer flounder stock area, are variable, but the fall time series is decreasing, especially so over the period 2013-2017. Spring abundances for key zooplankton prey are variable and may be worth examining alongside recruitment patterns. Both probability of occurrence and modeled habitat area show similar patterns of increases from the 1990s to the present, which suggests despite reduced abundance in the past five years, the distribution footprint has not contracted.

Summary

  • The environment within the stock area for summer flounder has changed in recent decades as indicated by increasing temperature and salinity. These changes may indicate a shift in ecosystem function and productivity.

  • Change in the ecology of the stock area is supported by the declining trends in both primary (chlorophyll) and secondary (zooplankton) production indicators. In addition to serving as indicators of secondary production, these zooplankton taxa have been observed in the diet of larval summer flounder, hence the recent decline in spring abundance of these taxa may play a role in summer flounder recruitment.

  • Summer flounder diet has changed in recent decades, which may be driven by ecosystem effects on the distribution and availability of forage species.

  • Occupancy habitat models for summer flounder suggest the size of the summer flounder habitat is increasing, which may create a mismatch between the areas used by the species and the stock definition area.

Strata definition

Strata map for summer flounder (*Paralichthys dentatus*) on the NE shelf

Strata map for summer flounder (Paralichthys dentatus) on the NE shelf

Surface and bottom temperature

An optimal interpolation procedure was used to estimate NE Shelf surface and bottom temperatures for two seasonal time frames (see methods). The temperature estimates were standardized to April 3 and October 11 for spring and fall over the period 1968-2017. Spring surface and bottom temperature within the spring summer flounder stock areas are shown in the upper and lower left figures. Fall surface and bottom temperature within the fall summer flounder stock areas are shown in the upper and lower right figures. Error bars show ±1 SD of the observations.

Spring surface and bottom temperatures within the spring summer flounder stock areas vary over an approximate range of 3°C. It is difficult to see a clear trend in the spring data, however, the fall surface and bottom temperatures (upper and lower right figures) are clearly increasing, especially over the most recent decade. It is worth noting that in 2012, the warmest year on record on the Northeast Shelf, those warm condition are evident in the spring but not in the fall where the warmest conditions were recorded with the last two years. Marine organisms have thermal tolerances and optima that define in which habitats they can grow and reproduce. The warming conditions within the stock area for summer flounder is likely changing its use of habitats both within and outside the stock area.

Figure

Fall surface and bottom temperature within the fall summer flounder stock areas. Error bars show ±1 SD of the observations.

Data

Surface and bottom salinity

An optimal interpolation procedure was used to estimate NE Shelf surface and bottom salinity for two seasonal time frames (see methods). Though collected with temperature data, reliable instrumentation limits this time series to 1992-2017. The salinity estimates were standardized to April 3 and October 11 for spring and fall. Spring surface and bottom salinity within the spring summer flounder stock areas are shown in the upper and lower left figures. Fall surface and bottom salinity within the fall summer flounder stock areas are shown in the upper and lower right figures. Error bars show ±1 SD of the observations.

Similar to fall temperature, in recent years bottom and surface salinity are at or near historical highs. Shifting salinity conditions in concert with changing temperature suggest the stock area for summer flounder may be impacted by different source water, which may have different nutrient loading supporting primary production.

Figures

Fall surface and bottom salinity within the fall summer flounder stock areas. Error bars show ±1 SD of the observations.

Data

Chlorophyll concentration

The concentration of chlorophyll was measured with a suite of satellite sensors and merged into a single dataset (see methods). Chlorophyll concentrations in the spring and fall summer flounder stock areas are shown in the left and right figures, and error bars show ±1 SD of the observations. The time series of spring chlorophyll concentration may have impacts on recruitment and growth of the stock, while fall chlorophyll values may affect overwintering nutrition.

The production cycle of the Northeast Shelf is complex and varies spatially. Patterns of seasonal blooms and the overall annual primary production varies within the stock area for summer flounder; investigation into the linkages between phytoplankton production and summer flounder may require additional production metrics.

Figures

Chlorophyll concentrations in the spring and fall summer flounder stock areas. Error bars show ±1 SD of the observations.

Data

Zooplankton abundance

The primary zooplankton species identified in the diet of summer flounder larvae are Centropages typicus (CTYP), Temora longicornis (TLONG), and Pseudocalanus spp. (PSEUDO) (Grover 1998). The abundance of these taxa within the respective seasonal stock areas for spring and fall are depicted below (see methods). It is assumed that the spring abundances of these taxa would be most relevant to the growth and survival of larval summer flounder. Note that the abundance estimates are based on data that are spatially smoothed within year by first kriging the observational data before the abundances are extracted for the stock area. The spring abundance for C. typicus, T. longicornis, and Pseudocalanus are shown in the plots on left, top to bottom, respectively. Error bars indicate ±1 SD of the observations.

The abundance C. typicus and T. longicornis increased through the 1980s and 1990s before declining over the most recent decades noting that the last few years suggest higher abundances. Pseudocalanus, on the other hand, remained at a high abundance level until the year 2001, when it abundance declined and in some years dramatically so. It should also be noted that the larval stomach content data from Glover (1998) would be condition on pre-2000 samples. The fall abundance of these taxa have all declined in spring; precipitously for Pseudocalanus. Though the fall time frame would be past the larval feeding period, the changes in zooplankton abundance may be an indication of changes in the food web in the fall that might impact the success of young of the year and feeding and condition of adults.

Figures

The spring (left) and fall (right) abundance for C. typicus, T. longicornis, and Pseudocalanus. Error bars indicate ±1 SD of the observations.

Data

Diet composition

Trends in diet composition may indicate changes in the habitat for generalist predators, and may have implications for growth and productivity at the population level if the overall energy content of the diet changes. Food habits collected aboard NEFSC bottom trawl surveys in the summer flounder seasonal survey strata were used to estimate weighted diet composition over time (see methods), as well as length composition of sampled fish and sample size. Here we show seasonal weighted diet composition for major prey (averaging over 10 percent of diet across years). Food habits data for summer flounder have been collected from 1977-present (Smith and Link 2010).

Using the criteria above, more of the spring diet composition is captured than the fall or winter, suggesting more diverse diets in those seasons. Diet compositions across seasons and years include “Loligo” (recently renamed Doryteuthis) squid species as a major component, as well as unidentified bony fish (“other fish” and “osteichthyes” below). Cephalopods in addition to Loligo are major components in winter and spring, but not fall. Scombrids, Scomber scombrus (Atlantic mackerel), Clupea harengus (Atlantic herring), and Stenotomus chrysops (scup) are intermittent spring diet components, while Etrumeus teres (round herring), Gadids and Urophycis regia (spotted hake) appear in fall diets. Ammodytes spp. (sandlance) appear in both spring and fall diets, but more consistently in fall.

While trends are difficult to discern in diet components, Gadids only appear in fall diets since 2008 and may be increasing, while spotted hake proportions may be decreasing over the same period. Similarly, the proportion of scup in spring diets may be decreasing since 2010. Full seasonal diet compositions with numbers of stomachs and survey stations are part of the diet dataset for analysis but not visualized in this summary.

Figures

Seasonal weighted diet composition for major summer flounder prey (1977-2016).

Data

Probability of occurence and habitat area

Probability of occurrence of summer flounder in NEFSC bottom trawl tows was estimated in each season and year using a logistic regression, i.e., a GLM with a bionomial response and log-odds link (code: https://github.com/NOAA-EDAB/ECSA/blob/master/prob_occurence.R). Error bars represent the 95% confidence intervals.

Figures

Probability of occurrence of summer flounder in NEFSC bottom trawl tows. Error bars represent the 95% confidence intervals.

Data

Occupancy habitat

The probability of occurrence was also estimated using random forest classification models (see methods). The probabilities were extracted for the spring and fall stock definition areas. The occupancy area is shown in the upper plots and the occupancy probability is shown in the lower plots, with spring on the left and fall on the right for all plots.

The occupancy probability has increased in both stock areas and suggest similar trends to the GLM model results above for the overlapping period. The area of the Northeast Shelf with an estimated occurrence probability of 0.5 or greater has increased in both seasons. In spring, the occupancy area have increased from approximately 50,000 km\(^2\) to 80,000 km\(^2\), noting a decrease in 2017. The fall habitat has increased from approximately 40,000 km\(^2\) to 60,000 km\(^2\) with no indication of the trend abating.

Figures

Probability of occurrence estimated using random forest classification models.

Data

Methods

Surface and Bottom Temperature and Salinity

Study System

The U.S. Northeast Shelf ecosystem, which roughly aligns with the boundaries of the Northeast U.S. Continental Shelf Large Marine Ecosystem (LME), is the study area for the surface and bottom thermal environments. Surface and bottom temperatures were estimated over a 0.1° latitude/longitude grid, termed the estimation grid, which circumscribes the range of ecosystem assessment areas in the region. The difference between the extents of the estimation grid from the extent of the LME relate to the resource management programs that are the sources of the data, which are focused on fishery management needs in the region.

Data Source

Temperature and salinity were collected on the Northeast Shelf as part of ongoing resource and ecosystem surveys conducted by the Northeast Fisheries Science Center. Water column temperatures have been collected contemporaneously to trawl tows associated with a bottom trawl survey beginning in the fall of 1963 and five years later during spring (Despres-Patanjo, Azarovitz, and Byrne 1988). In addition, the ecosystem has been surveyed by multiple sampling programs with varying sampling designs. The two most comprehensive monitoring programs over the study period were the MARMAP (1977-1987) and the Ecosystem Monitoring Program or EcoMon (1992-present) programs, both serving as shelf-wide surveys of the ecosystem (Sherman et al. 1998; J. Kane 2007). Temperature measurements were made with a mix of Conductivity Temperature and Depth (CTD) instruments, analog XBT, digital XBT, mechanical BT, glass thermometers (bottle temps) and trawl mounted temperature loggers instruments collecting either water column profiles or temperatures measured at targeted depths. Salinity measurements used in this analysis was limited to 1992-2017 when CTD instrument were used. Surface and bottom temperatures were identified from these measurements. Temperatures representing the spring period include data collected during the months of February to June; however, 99% of the samples were collected during the months of March to May. Likewise, the fall period samples include data collected during September to December, with 99% of the samples collected during September to November. The total number of surface temperature measurements were 14,540 and 14,666 for spring and fall, respectively; and, 14,450 and 14,656 for spring and fall bottom temperature, respectively. On average, there were 290 temperature measurements by season, depth, and year. The number of salinity observation per year were similar.

Interpolation Procedure

Seasonal surface and bottom temperature fields were estimated using an optimal interpolation approach. The optimal interpolation combined a climatological depiction of temperature and an annual estimates based on the data for a particular depth and season. (There were a number of precursor steps that are described below.). The surveys used to collect the data were conducted during the same period each year, however, there was variation in survey timing. To account for this, temperatures were standardized to a collection date of April 3 for spring surveys and October 11 for fall using linear regression for each depth and season over the sample grid (see Appendix, Figure A1). For each depth and season, annual shelf-wide mean temperatures were calculated using the data from sample grid locations with at least 80% of the time series present. The annual observations were then transformed to anomalies by subtracting the appropriate annual mean. All anomalies for a season and depth were combined over years into a single anomaly field or climatology.

The annual estimate of temperature for a depth and season was imputed by used universal kriging to estimate the temperature over the estimation grid with depth as a covariate. The kriging yielded temperature estimates and a variance estimate over the same grid. The optimal interpolation field was assembled by combining the annual estimate and information from the anomaly climatology. The climatology was re-leveled from anomaly values to temperatures by adding back the appropriate annual mean. For each location in the estimation grid, temperature was calculated as a weighted mean between the kriged annual estimate and the releveled climatology. The weightings in the calculations were partitioned based on the variance field of the annual kriging. The field was divided into quartiles from low to high variance with the weighting ratio of annual:climatology temperatures of 4:1, 3:1, 2:1, and 1:1, respectively. Hence, in areas of low variance the weighted mean was based on a weighting of 4:1, which would reflect a higher contribution of information from the annual estimate and thus be closer to an observation. In areas with high variance, the weighting ratio of 1:1 would reflect a greater effect of the climatology in determining the interpolation estimate.

The optimal interpolation temperature fields were evaluated using cross validation and a comparison to external data. The performance of optimal interpolation was compared to the predictive skill of using either the climatology or annual interpolation alone by doing ten random cross validations of each treatment. Each random draw of training and test sets sampled 3% of the data for the test set, or about 500 observations per draw. The temperature fields were fit with the training data and compared with the test set data. The lowest error rates were realized with the optimal interpolation contrasted with highest predictive error associated with fields based on the climatology alone (see Appendix, Figure A2). The spatial distribution of error had distinct depth and seasonal patterns. The spatial errors associated with the surface estimates were generally low with the exception of a few locations along the shelf break between latitudes 39-41°N; the spatial error in the bottom temperature varied by season, and was concentrated along the shelf break in spring and across the shelf between latitudes 36-40°N in fall (see Appendix, Figure A3). The optimal interpolation data were also compared to external data from other collection programs. The absolute errors between surface temperature data collected by satellites and the interpolation had interquartile ranges of approximately ±0.75°C (see Appendix, Figure A4). The absolute error in a comparison of interpolated bottom temperature to opportunist sampling was approximately ±0.75°C for spring bottom temperature and approximately 0.75-1.0°C in the fall. Salinity was estimated in the same way with the exception of collection correction, which was deemed unnecessary for the salinity data.

Map of the study system with half-degree sample grid (blue lines) and extent of estimation grid shown in beige points (a). Major features of the study system with shelf break marked in purple line (b). 100m depth shown as dashed line.

Map of the study system with half-degree sample grid (blue lines) and extent of estimation grid shown in beige points (a). Major features of the study system with shelf break marked in purple line (b). 100m depth shown as dashed line.

Chlorophyll data

Chlorophyll a concentration ([Chl]) data extracted from satellite remote-sensing databases based on measurements made with the Sea-viewing Wide Field of View Sensor (SeaWiFS), Moderate Resolution Imaging Spectroradiometer on the Aqua satellite (MODIS), Medium Resolution Imaging Spectrometer (MERIS), and Visible and Infrared Imaging/Radiometer Suite (VIIRS) sensors. We used the Garver, Siegel, Maritorena Model (GSM) merged data product at 100 km (equivalent to a 1° grid) and 8-day spatial and temporal resolutions, respectively, obtained from the Hermes GlobColour website. These four sensors provide an overlapping time series of [Chl] during the period 1997 to 2018 and were combined based on a bio-optical model inversion algorithm (Maritorena et al. 2010).

Zooplankton abundance

Zooplankton abundance is measured by the Ecosystem Monitoring Program (EcoMon), which conducts shelf-wide bimonthly surveys of the NES ecosystem (J. Kane 2007). Zooplankton and ichthyoplankton are collected throughout the water column to a maximum depth of 200 m using paired 61-cm Bongo samplers equipped with 333-micron mesh nets. Sample location in this survey is based on a randomized strata design, with strata defined by bathymetry and along-shelf location. Plankton taxa are sorted and identified. We used the bio-volume of the 18 most abundant taxonomic categories as potential predictor variables (Table 2). The zooplankton sample time series has some missing values, which necessitated the removal of spring data for the years 1989, 1990, 1991, and 1994 and fall data for the years 1989, 1990, and 1992. The data for each seasonal time frame was interpolated to a complete field over the estimation grid using ordinary kriging.

Variable and species names for the 18 most abundant zooplankton taxa in EcoMon survey data.
Variable name Full name
acarspp Acartia spp.
calfin Calanus finmarchicus
chaeto Chaetognatha
cham Centropages hamatus
cirr Cirripedia
ctyp Centropages typicus
echino Echinodermata
evadnespp Evadne spp.
gas Gastropoda
hyper Hyperiidea
larvaceans Appendicularians
mlucens Metridia lucens
oithspp Oithona spp.
para Paracalanus parvus
penilia Penilia spp.
pseudo Pseudocalanus spp.
salps Salpa
tlong Temora longicornis

Occupancy models

Occupancy and productivity habitats for summer flounder were estimated with random forest classification and regression models using a suite of static and dynamic predictor variables. Variation in species presence or absence and biomass across space, bathymetry factors, productivity factors, and climate factors were tested. Models were constructed separately for spring and fall seasons. The response variables were the occurrence and catch-per-unit-effort of summer flounder in the Northeast Fisheries Science Center bottom trawl survey, which is a fishery-independent survey on the Northeast US Shelf. The survey is conducted in the spring and fall of the year and is based on a stratified random design, which provides both spatial and temporal depictions of fish and macroinvertebrate abundances (Grosslein 1969). The independent or predictor variable set included physical environment variables, habitat descriptors, zooplankton variables, and remote sensing variables; the variables will be described in more detail below. Occupancy models were fit as two-factor classification models (absence as 0; presence as 1) using the randomForest R package (Liaw and Wiener 2002).

Prior to fitting the model, the independent variable set was first tested for multi-collinearity among the predictors and correlated variables were eliminated (R package Evans and Murphy (2018), version 2.1-3). From this reduced set of predictors, the final model variables were selected utilizing the model selection criteria of (Murphy, Evans, and Storfer 2010) as implemented in rfUtilities. Productivity models were fit as regression models with log10 transformed biomass-per-unit-effort as the response variable and the same starting set of predictor variables as in the occupancy models. As with the occupancy models, independent variables were tested for multi-collinearity and the model selection criteria was applied. Habitat was estimated from the model fits over a standard 0.1° grid, which circumscribes the range of ecosystem assessment areas in the region.

Three types of visualizations were created from the output of the tree models. The first visualization was used to see the average probability of occupancy over space and the rate of change (Sen’s slope) in occupancy over the years. The second visualization was used to see the mean occupancy gradient magnitude, or frontal strength and the rate of change (Sen’s slope) in occupancy gradient magnitude over the years. Gradient magnitude was calculated by calculating the median of the occupancy probabilities with a moving window and then summing those medians with a moving window with a matrix of weights. The third visualization was used to see the average biomass over space and the rate of change (Sen’s slope) in biomass over the years. Trends in total occupancy habitat area, with occupancy probabilities of 25, 50, and 75% over time were plotted as well by calculating the sum of the area with occupancy probabilities at each percentage during each year.

Predictor Variables

Static variables were kept constant over years where dynamic variables varied annually. Hence, the length of the time series of model fits is constrained by the shortest dynamic variable time series to meet the requirement of complete cases in the Random Forest fitting. The fitting time series was constrained to 1992 – 2016, which was determined by the length of the station salinity data.

Physical environment

Station data included observations made contemporaneously to survey bottom trawl stations. Depth of the station was used as a static variable in the analysis. The observed depth was used in model fitting where model predictions were based on depths from the ETOPO1 dataset, which provided Northeast Shelf bathymetry at a resolution of 0.0167° (below).

Estimation grid for predictor variable and habitat estimates, grid location spaced by 0.1° longitude and latitude.

Estimation grid for predictor variable and habitat estimates, grid location spaced by 0.1° longitude and latitude.

Surface and bottom water temperature and salinity were used as dynamic variables in the analysis. Temperature and salinity on the NE Shelf was collected using Conductivity/Temperature/Depth (CTD) instruments with the most complete sample coverage associated with spring (February –April) and fall (September-November) time frames. Surface and bottom temperatures were used to develop date of collection corrections using linear regression for each time frame. Temperatures were standardized to a collection date of April 3 for spring collections and October 11 for fall. A date of collection correction was not indicated for salinity data. The observed date-corrected temperature (°C) and uncorrected salinity data (PSU) was used in model fitting.

Model predictions were based on temperature and salinity fields using an optimal interpolation approach where annual data were combined with a climatology by season. For a half degree grid of the ecosystem, mean bottom temperature or salinity was calculated by year and season. For grid locations that had data for at least 80% of the time series, the data from those locals were used to calculate a seasonal mean. The annual seasonal means were used to calculate anomalies, which were combined over the time series to provide seasonal, surface and bottom anomaly climatologies.

Returning to the raw data, the observations for a year, season, and depth were then used to estimate an annual field using universal kriging with depth as a covariate. The kriging was done on a standard 0.1° grid using the automap (Hiemstra et al. 2008, version 1.0–14). The annual field was then combined with the climatology anomaly field, adjusted by the annual mean, using the variance field from the kriging as the basis for a weighted mean between the two. The variance field was divided into quartiles with the first quartile (lowest kriging variance) carrying a weighting of 4:1 between the annual and climatology values. Hence, the optimal interpolated field at these locations were skewed towards the annual data reflecting their proximity to actual data locations and low kriging variance associated with them. The weighting ratio shifted to 1:1 in the highest variance quartile reflecting less information from the annual field and more from the climatology.

Habitat Descriptors

Habitat descriptors are a series of static variables that reflect the shape and complexity of benthic habitats. Since the response variables for these models are derived from bottom trawl gear, naturally the range of candidate taxa for modelling is skewed to benthic organisms, making these descriptors particularly relevant. Most of the variables are based on depth measurement, including the complexity, BBI, VRM, Prcurv, rugostity, seabedforms, slp, and slpslp variables (Table 1). The soft-sed variable is based on benthic sediment grain size and the vorticity variable is based on current estimates.

Zooplankton Data

Zooplankton abundance is measured by the Ecosystem Monitoring Program (EcoMon), which conducts shelf-wide bimonthly surveys of the NES ecosystem (J. Kane 2007). Zooplankton and ichthyoplankton are collected throughout the water column to a maximum depth of 200 m using paired 61-cm Bongo samplers equipped with 333-micron mesh nets. Sample location in this survey is based on a randomized strata design, with strata defined by bathymetry and along-shelf location. Plankton taxa are sorted and identified. We used the bio-volume of the 18 most abundant taxonomic categories as potential predictor variables (Table 2). The zooplankton sample time series has some missing values which were ameliorated by summing data over five-year time steps for each seasonal time frame and interpolating a complete field using ordinary kriging. Thus, for example, the data for spring 2000 would include the available data from 1998-2002 tows.

Remote Sensing Data

Chlorophyll concentration and SST from remote sensing sources were applied in the habitat models as static variables. Chlorophyll and SST were summarized as monthly means with their associated gradient magnitude or frontal fields. The basis for the chlorophyll concentration was measurements made with the Sea-viewing Wide Field of View Sensor (SeaWiFS), Moderate Resolution Imaging Spectroradiometer on the Aqua satellite (MODIS), Medium Resolution Imaging Spectrometer (MERIS), and Visible and Infrared Imaging/Radiometer Suite (VIIRS) sensors during the period 1997-2016. The data is a merged product using the Garver, Siegel, Maritorena Model (GSM) algorithm obtained from the Hermes GlobColour website.

These four sensors provide an overlapping time series of chlorophyll concentration during the period and were combined based on a bio-optical model inversion algorithm (Maritorena et al. 2010). Monthly SST fields were based on data from the MODIS Terra sensor data available from the Ocean Color Website. From these data, mean monthly fields were generated for both chlorophyll and SST. There are a range of methods used to identify fronts (Belkin and O’Reilly 2009) in oceanographic data that usually apply some focal filter to reduce noise and identify gradient magnitude with a Sobel filter. These calculations were done in R using the raster package (Hijmans 2017, version 2.6–7) using a 3 by 3 mean focal filter and a Sobel filter to generate x and y derivatives, which are then used to calculate gradient magnitude.

Model Selection Criteria and Variable Importance

The habitat models were evaluated for fit based on out-of-bag classification accuracy. For occupancy models accuracy, AUC (Area Under the ROC Curve), and Cohen’s Kappa were calculated using the irr R package (Gamer et al. 2012, version 0.84). For regression models, the variance explained by the model, mean absolute error, the root mean square error, and bias were calculated using the Metrics R package (Hamner and Frasco 2018, version 0.1.3). To evaluate variable importance in both occupancy and regression models, we plotted the number of times a variable was the root variable versus the mean minimum node depth for the variable, highlighting the top ten important variables using the randomForestExplainer R package (Paluszynska and Biecek 2017, version 0.9). For occupancy models we also plotted the Gini index decrease versus accuracy decrease, whereas for the regression models we plotted node purity increase versus MSE increase, also highlighting the top ten most important variables.

Habitat descriptors used in occupancy model parameterization.
Variables Notes References
Namera bpi BPI is a second order derivative of the surface depth using the TNC Northwest Atlantic Marine Ecoregional Assessment (“NAMERA”) data with an inner radius=5 and outer radius=50. Lundblad et al. (2006)
Namera_vrm Vector Ruggedness Measure (VRM) measures terrain ruggedness as the variation in three-dimensional orientation of grid cells within a neighborhood based the TNC Northwest Atlantic Marine Ecoregional Assessment (“NAMERA”) data. Hobson (1972); Sappington, Longshore, and Thompson (2007)
Prcurv (2 km, 10 km, and 20 km) Benthic profile curvature at 2km, 10km and 20 km spatial scales was derived from depth data. A. Winship et al. (2018)
Rugosity A measure of small-scale variations of amplitude in the height of a surface, the ratio of the real to the geometric surface area. Friedman et al. (2012)
seabedforms Seabed topography as measured by a combination of seabed position and slope. http://www.northeastoceandata.org/
Slp (2 km, 10 km, and 20 km) Benthic slope at 2km, 10km and 20km spatial scales. A. Winship et al. (2018)
Slpslp (2 km, 10 km, and 20 km) Benthic slope of slope at 2km, 10km and 20km spatial scales A. Winship et al. (2018)
soft_sed Soft-sediments is based on grain size distribution from the USGS usSeabed: Atlantic coast offshore surficial sediment data. http://www.northeastoceandata.org/
Vort (fall - fa; spring - sp; summer - su; winter - wi) Benthic current vorticity at a 1/6 degree (approx. 19 km) spatial scale. Kinlan et al. (2016)

Diet composition

NEFSC bottom-trawl sampling occurs twice annually in the spring (March-May) and fall (September-November). The survey area encompasses about 293,000 square km of continental shelf from Cape Hatteras, NC, to Nova Scotia, Canada in depths from 8-400 m. Food habits sampling has been conducted since 1973.

Stomachs are collected at sea by NEFSC, and have been primarily analyzed at sea since 1981. Total stomach volume is estimated, each prey item is identified and sorted to the lowest possible taxonomic level, and the proportion of each prey item is estimated. Detailed methods are described in Link and Almeida (2000). Prey composition percent by weight was calculated using a weighted mean (\(\overline{W_{ijs}}\)) (Link and Almeida 2000; Smith and Link 2010) to estimate mean weight of prey \(i\) in predator \(j\) for statistical group \(s\). Note: Prey volumes are used as proxies for prey weight. It may be calculated as

\[ \overline{W_{ijs}} = \frac{\sum_{t=1}^{N_{ts}}N_{jts}\overline{w_{ijts}}}{N_{ts}} \]

where \(t\) represents an individual bottom trawl tow, \(N_{jts}\) is the number of predator \(j\) stomachs in tow \(t\) for statistical group \(s\), \(N_{ts}\) is the number of tows in statistical group \(s\), and

\[ \overline{w_{ijts}} = \frac{\sum_{k=1}^{N_{jts}}w_{ijtsk}}{N_{jts}} \]

Appendix

Date of collection correction

The dates of survey data collection varied by year. To date correct temperature measurements, regressions were estimated between temperature and day of the year over the sample grid (0.5° grid) by season and depth. Spring data were transformed to a temperature representing April 3 and fall to October 11 based on the slope estimates shown in the maps in Figure A1.

**Figure A1**: Spring (a) and fall (b) linear slope coefficients used to date correct surface temperature to standard spring and fall dates; same for spring (c) and fall (d) bottom temperature correction.

Figure A1: Spring (a) and fall (b) linear slope coefficients used to date correct surface temperature to standard spring and fall dates; same for spring (c) and fall (d) bottom temperature correction.


Cross validation performance of temperature interpolation

A series of random draws of training and test sets were taken to evaluate the predictive skill of the estimation procedure. A set of climatology, annual interpolation, or optimal interpolation fields were estimated using the training set and compared to the held-out test set.

**Figure A2**: Mean and 95% confidence intervals of squared errors for spring (a) and fall (b) surface temperature estimates based on climatology, annual interpolation, and optimal interpolation (OI); same data for spring (c) and fall (d) bottom temperature estimates.

Figure A2: Mean and 95% confidence intervals of squared errors for spring (a) and fall (b) surface temperature estimates based on climatology, annual interpolation, and optimal interpolation (OI); same data for spring (c) and fall (d) bottom temperature estimates.


The distribution of errors from the optimal interpolation test sets were evaluated spatially to determine where the larger errors occurred and what ecosystem features they are associated with.

**Figure A3**: Mean absolute error by 0.1 degree latitude and longitude intervals for spring (a) and fall (b) surface temperature; same data for spring (c) and fall (d) bottom temperature. Red indicates a positive error where blue indicates negative.

Figure A3: Mean absolute error by 0.1 degree latitude and longitude intervals for spring (a) and fall (b) surface temperature; same data for spring (c) and fall (d) bottom temperature. Red indicates a positive error where blue indicates negative.


Comparisons to external data sources

Sea surface temperature estimated in this study were compared to the OISST dataset serving as a source of data for external comparison. The OISST data for April 3 (spring) and October 11 (fall) over the period 1982-2017 were extracted on the same 0.5° grid used in the study. Time series of matching spring and fall temperatures were differenced (external minus interpolation data) and presented in the figure below (sample size for surface spring and fall data were 5220 and 5365, respectively). The interquartile range for the surface comparisons were generally symmetric around zero with differences between 0.5-1°C. Bottom temperature estimated in this study were compared to the data from a cooperative data collection program for external comparison. The “Environmental Monitors on Lobster Traps” (EMOLT) is a cooperative research program that collects bottom temperatures. The data for spring and fall were extracted from rasters of the study data to match the extremal data. The external data covered the period 1968-2017, with 360 data points for the spring and 728 for the fall. Spring bottom comparisons yielded the smallest interquartile range of differences less than 0.5°C, where fall comparisons were slightly larger and biased to lower interpolation estimates compared to the external data.

**Figure A4**: Box plots (box: interquartile range, whisker: 5-95% range, square symbol: mean, line: median) of the difference between external and interpolation data.

Figure A4: Box plots (box: interquartile range, whisker: 5-95% range, square symbol: mean, line: median) of the difference between external and interpolation data.

References

Belkin, Igor M, and John E O’Reilly. 2009. “An Algorithm for Oceanic Front Detection in Chlorophyll and Sst Satellite Imagery.” Journal of Marine Systems 78 (3). Elsevier: 319–26.

Despres-Patanjo, Linda I, Thomas R Azarovitz, and Charles J Byrne. 1988. “Twenty-Five Years of Fish Surveys in the Northwest Atlantic: The Nmfs Northeast Fisheries Center’s Bottom Trawl Survey Program.” Marine Fisheries Review 50 (4). NMFS, Scientific Publications Office: 69–71.

Evans, Jeffrey S., and Melanie A. Murphy. 2018. RfUtilities. https://cran.r-project.org/package=rfUtilities.

Friedman, Ariell, Oscar Pizarro, Stefan B. Williams, and Matthew Johnson-Roberson. 2012. “Multi-Scale Measures of Rugosity, Slope and Aspect from Benthic Stereo Image Reconstructions.” PLoS ONE 7 (12). doi:10.1371/journal.pone.0050440.

Gamer, Matthias, Jim Lemon, Ian Fellows, and Puspendra Singh. 2012. Irr: Various Coefficients of Interrater Reliability and Agreement. https://CRAN.R-project.org/package=irr.

Grosslein, MD. 1969. Groundfish Survey Methods. Bureau of Commercial Fisheries, Biological Laboratory. https://nefsc.noaa.gov/publications/series/whlrd/whlrd6902.pdf.

Hamner, Ben, and Michael Frasco. 2018. Metrics: Evaluation Metrics for Machine Learning. https://CRAN.R-project.org/package=Metrics.

Hiemstra, P.H., E.J. Pebesma, C.J.W. Twenh"ofel, and G.B.M. Heuvelink. 2008. “Real-time automatic interpolation of ambient gamma dose rates from the Dutch Radioactivity Monitoring Network.” Computers & Geosciences.

Hijmans, Robert J. 2017. Raster: Geographic Data Analysis and Modeling. https://CRAN.R-project.org/package=raster.

Hobson, R. D. 1972. Surface roughness in topography: quantitative approach. New York, New York: Harper; Row.

Kane, Joseph. 2007. “Zooplankton abundance trends on Georges Bank, 1977-2004.” ICES Journal of Marine Science 64 (5): 909–19. doi:10.1093/icesjms/fsm066.

Kinlan, Brian P., Arliss J. Winship, Timothy P. White, and John Christensen. 2016. “Modeling At-Sea Occurrence and Abundance of Marine Birds to Support Atlantic Marine Renewable Energy Planning Phase I Report.”

Liaw, Andy, and Matthew Wiener. 2002. “Classification and Regression by randomForest.” R News 2 (3): 18–22. https://CRAN.R-project.org/doc/Rnews/.

Lundblad, Emily R., Dawn J. Wright, Joyce Miller, Emily M. Larkin, Ronald Rinehart, David F. Naar, Brian T. Donahue, S. Miles Anderson, and Tim Battista. 2006. “A benthic terrain classification scheme for American Samoa.” Marine Geodesy 29 (2): 89–111. doi:10.1080/01490410600738021.

Murphy, Melanie A., Jeffrey S. Evans, and Andrew Storfer. 2010. “Quantifying Bufo boreas connectivity in Yellowstone National Park with landscape genetics.” Ecology 91 (1): 252–61. doi:10.1890/08-0879.1.

Paluszynska, Aleksandra, and Przemyslaw Biecek. 2017. randomForestExplainer: Explaining and Visualizing Random Forests in Terms of Variable Importance. https://CRAN.R-project.org/package=randomForestExplainer.

Sappington, J. M., Kathleen M. Longshore, and Daniel B. Thompson. 2007. “Quantifying Landscape Ruggedness for Animal Habitat Analysis: A Case Study Using Bighorn Sheep in the Mojave Desert.” Journal of Wildlife Management 71 (5): 1419–26. doi:10.2193/2005-723.

Sherman, Kenneth, A Solow, J Jossi, and J Kane. 1998. “Biodiversity and Abundance of the Zooplankton of the Northeast Shelf Ecosystem.” ICES Journal of Marine Science 55 (4). Oxford University Press: 730–38.

Smith, Brian E., and Jason S. Link. 2010. The Trophic Dynamics of 50 Finfish and 2 Squid Species on the Northeast US Continental Shelf. NOAA Technichal Memorandum NMFS-NE-216. National Marine Fisheries Service, 166 Water Street, Woods Hole, MA 02543-1026. http://www.nefsc.noaa.gov/publications/tm/tm216/.

Winship, A.J., Kinlan B.P., T.P. White, and J.B. Leirness. 2018. “Modeling At-Sea Density of Marine Birds to Support Atlantic Marine Renewable Energy Planning: Final Report.” OCS Study BOEM 2018-010.