1 Introduction

The Atlantic herring research track working group (WG) is interested in exploring impacts of food availability on herring stock dynamics. Indices of zooplankton abundance were proposed for exploration.

These indices could potentially explain:

herring recruitment
herring growth/condition
patterns in herring “survival” random effects

The WG decided to focus on patterns in recruitment, as low recruitment in recent years is an important issue for the stock and for fishery management.

The WG used a boosted regression tree analysis (Molina 2024) to identify zooplankton indices that best explained patterns in herring recruitment. These indices included large copepods in spring (influencing growth of herring postlarvae and juveniles), and small copeopods in fall (influencing survival of herring larvae over the winter).

In this working paper, we develop indices of large copepods in spring and small copepods in fall using zooplankton survey data (citation) from 1982-2022. We tailor the indices to the spatial footprint of the Atlantic herring stock assessment, and further use information on the changing spatial distribution of herring larvae over time from the same dataset to evaluate potential changes in availability of small copepods to herring larvae.

2 Methods

We used spatio-temporal modeling to develop zooplankton indices using the ECOMON data as an input.

Decisions:

zooplankton species
which zooplankton metric (area or volume)
spatial footprint
temporal scale (seasonal? annual?)
potential covariates

2.1 Zooplankton species

We use herring diet information to determine which zooplankton species to include. This information is from both literature and food habits data.

“The herring is a plankton feeder…. Examination of 1,500 stomachs showed that adult herring near Eastport were living solely on copepods and on pelagic euphausiid shrimps (Meganyctiphanes norwegica), fish less than 4 inches long depending on the former alone, while the larger herring were eating both. When first hatched, and before the disappearance of the yolk sac, the larvae (European) feed on larval snails and crustaceans, on diatoms, and on peridinians, but they soon begin taking copepods, and depend exclusively on these for a time after they get to be 12 mm. long, especially on the little Pseudocalanus elongatus. As they grow older they feed more and more on the larger copepods and amphipods, pelagic shrimps, and decapod crustacean larvae (Collette and Klein-Macphee 2002).”

What herring eat from NEFSC food habits data is summarized here: https://fwdp.shinyapps.io/tm2020/ Copepods are highest diet proportion, followed by well digested prey (AR), and euphausiids. Broken down by decade and season, it is clear that copeopods dominate adult diets in winter and spring, and more so recently.

Maybe copeopods are bad for hering??

# Use Brian's new Rmd to get herring diet by decade and season or season and region

herringdiet <- readRDS(here::here("data/herringdiet.rds"))

herringdecseastot <- herringdiet %>%
  # make decade column
  # make 2 seasons winter--> spring and summer--> fall
  dplyr::mutate(decade = floor(year/10)*10) |>
                #season2 = case_when(season %in% c("WINTER", "SPRING") ~ "SPRING",
                #                    season %in% c("SUMMER", "FALL") ~ "FALL")) %>%
                #season2 = case_when(season %in% c( "SPRING") ~ "SPRING",
                #                    season %in% c( "FALL") ~ "FALL")) %>%
  dplyr::select(decade, season, totwt) %>%
  dplyr::distinct() %>%
  dplyr::group_by(decade, season) %>%
  dplyr::mutate(totwt2 = sum(totwt, na.rm = TRUE),
                nyrs = n_distinct(year)) %>%
  dplyr::select(decade, season, totwt2, nyrs) %>%
  dplyr::distinct()

diet <- read.csv(here::here("data/allwtp2024-09-28.csv"))

preylook <- read.csv(here::here("data/SASPREY24.csv"))

preylook <- dplyr::select(preylook, c(Pynam, pycomnam, COLLCAT))

# add blueprey list
decadeseason <- diet %>%
  dplyr::mutate(copeprey = ifelse(prey %in% c("COPEPO"), TRUE, FALSE),
                plotcol = ifelse(copeprey, "blue", "lightgrey"),
                decade = as.integer(gsub("s","",decade))) %>%
  dplyr::left_join(herringdecseastot, by=c("season" = "season", 
                                            "decade" = "decade")) %>%

  dplyr::group_by(decade, season) %>%
  dplyr::filter(meansw>0) %>%
  dplyr::arrange(desc(copeprey), .by_group = TRUE)


bars <- map(unique(decadeseason$decade)
            , ~geom_bar(aes(width=nyrs), stat = "identity", colour = "white"#, position = "stack", 
                       , data = decadeseason %>% filter(decade == .x)))


p2 <-   ggplot(decadeseason, aes(decade, relmsw, fill=copeprey)) +
                                    #fill=factor(prey, 
                                    #            levels = c(as.factor(names(preycolaggsel)),
                                    #                       setdiff(prey, preycolaggsel)),
                                    #           ordered = TRUE
                                    #           ))) +
   #geom_bar(aes(width=nyrs), stat = "identity") + #
   bars +
   ylab("Percent in Diet") +
   xlab("Decade") +
   geom_text(aes(y=relmsw, 
             label = ifelse(relmsw > 3, prey, "")),
             position = position_stack(vjust = 0.5),
             size=2.5)+
   facet_wrap(~fct_relevel(season,"WINTER", "SPRING","SUMMER", "FALL")) +
   theme_bw() +
   #viridis::scale_fill_viridis(discrete = TRUE) +
   scale_fill_manual(values=c( "grey90", "lightblue")) + 
   theme(legend.position = "none"
         #legend.position="bottom",
         #legend.text=element_text(size=5)
         ) #+
    #geom_bar_interactive(stat = "identity", aes(tooltip = prey, data_id = prey))

#ggiraph(code = print(p1)) 
#p1

p2

Our focus in developing the indices is on herring recruitment, so we prioritize indices of food for larvae through juveniles.

Models developed for the Herring RT are for the following copepod categories:

Large copepods ALL: Calanus finmarchicus, Metridia lucens, Calanus minor, Eucalanus spp., Calanus spp.
Small copepods ALL: Centropages typicus, Pseudocalanus spp., Temora longicornis, Centropages hamatus, Paracalanus parvus, Acartia spp., Clausocalanus arcuicornis, Acartia longiremis, Clausocalanus furcatus, Temora stylifera, Temora spp., Tortanus discaudatus, Paracalanus spp.

These categories included all life stages of each copepod species.

(Models were also developed for Calanus finmarchicus alone, total zooplankton volume, and Euphausiids, which may be useful for comparison with adult Atlantic herring condition or growth in the future.)

An additional model was developed for Atlantic herring larvae using the same dataset in order to characterize the spatial extent of herring larvae compared with their zooplankton prey. This model was not used as an index due to missing data during several years of the time series.

2.2 Zooplankton metric

Previous zooplankton indices use the numbers per 100 cubic meters of filtered water volume _100m3 data (Kane 2007; Morse et al. 2017). Other current analyses use the numbers per 100 cubic meters as well.

2.3 Spatial scale

NEFSC survey strata definitions are built into the VAST northwest-atlantic extrapolation grid already. Therefore, fall and spring survey strata sets used in the Atlantic herring assessment were also applied to the zooplankton indices. Indices were also calculated in Ecological Production Units (EPUs) used in regional ecosystem reporting (Fig. \(\ref{fig:maps}\)).

herring_spring <- c(01010, 01020, 01030, 01040, 01050, 01060, 01070, 01080, 01090, 
                    01100, 01110, 01120, 01130, 01140, 01150, 01160, 01170, 01180, 
                    01190, 01200, 01210, 01220, 01230, 01240, 01250, 01260, 01270, 
                    01280, 01290, 01300, 01360, 01370, 01380, 01390, 01400, 01610, 
                    01620, 01630, 01640, 01650, 01660, 01670, 01680, 01690, 01700, 
                    01710, 01720, 01730, 01740, 01750, 01760)
herring_fall <- c(01050, 01060, 01070, 01080, 01090, 01100, 01110, 01120, 01130, 
                  01140, 01150, 01160, 01170, 01180, 01190, 01200, 01210, 01220, 
                  01230, 01240, 01250, 01260, 01270, 01280, 01290, 01300, 01360, 
                  01370, 01380, 01390, 01400)

MAB <- c(1010:1080, 1100:1120, 1600:1750, 3010:3450, 3470, 3500, 3510)
GB  <- c(1090, 1130:1210, 1230, 1250, 3460, 3480, 3490, 3520:3550)
GOM <- c(1220, 1240, 1260:1290, 1360:1400, 3560:3830)
SS  <- c(1300:1352, 3840:3990)

D70pct_nwast <- readRDS(url("https://github.com/NOAA-EDAB/zooplanktonindex/raw/main/spatialdat/D70pct_nwa_strat2.rds"))

# MAB EPU
MAB2 <- D70pct_nwast %>% 
  dplyr::filter(stratum_number %in% MAB) 

# MAB herring larvae area
MAB2herr <- MAB2 %>%
  dplyr::filter(stratum_number2 %% 10 == 1) 

# MAB outside larval area
MAB2out <- MAB2 %>%
  dplyr::filter(stratum_number2 %% 10 == 2) 

# Georges Bank EPU
GB2 <- D70pct_nwast %>% 
  dplyr::filter(stratum_number %in% GB) 

# GB herring larvae
GB2herr <- GB2 %>%
  dplyr::filter(stratum_number2 %% 10 == 1) 

#GB outside larval area
GB2out <- GB2 %>%
  dplyr::filter(stratum_number2 %% 10 == 2)

# gulf of maine EPU 
GOM2 <- D70pct_nwast %>%
  dplyr::filter(stratum_number %in% GOM)

# GOM herring larvae
GOM2herr <- GOM2 %>%
  dplyr::filter(stratum_number2 %% 10 == 1) 

#GOM outside larval area
GOM2out <- GOM2 %>%
  dplyr::filter(stratum_number2 %% 10 == 2)

# scotian shelf EPU 
SS2 <- D70pct_nwast %>%
  dplyr::filter(stratum_number %in% SS) 

# SS herring larvae
SS2herr <- SS2 %>%
  dplyr::filter(stratum_number2 %% 10 == 1) 

#SS outside larval area
SS2out <- SS2 %>%
  dplyr::filter(stratum_number2 %% 10 == 2)

# whole herring larval area
herrlarv <- dplyr::bind_rows(MAB2herr, GB2herr, GOM2herr, SS2herr)

# outside herring larval area
nolarv <- dplyr::bind_rows(MAB2out, GB2out, GOM2out, SS2out)

# spring herring NEFSC BTS
her_spr2 <- D70pct_nwast %>%
  dplyr::filter(stratum_number %in% herring_spring) 

# fall herring NEFSC BTS
her_fall2 <- D70pct_nwast %>%
  dplyr::filter(stratum_number %in% herring_fall) 

# all epus
allEPU2 <- D70pct_nwast %>%
  dplyr::filter(stratum_number %in% c(MAB, GB, GOM, SS))

theme_set(theme_bw())

ggplot(data = ecodata::coast) +
  geom_sf() + 
  geom_point(data = FishStatsUtils::northwest_atlantic_grid, aes(x = Lon, y = Lat),  colour = "coral4", size=0.05, alpha=0.1) +
  geom_point(data = MAB2, aes(x = Lon, y = Lat), colour = "green3", size=0.05, alpha=0.5) +
  geom_point(data = GB2, aes(x = Lon, y = Lat), colour = "orange", size=0.05, alpha=0.5) +
  geom_point(data = GOM2, aes(x = Lon, y = Lat), colour = "red", size=0.05, alpha=0.5) +
  geom_point(data = SS2, aes(x = Lon, y = Lat), colour = "purple", size=0.05, alpha=0.1) +
  #coord_sf(xlim = c(-79, -65.5), ylim = c(33, 45)) + #full extent of VAST model
  coord_sf(xlim =c(-78, -65.5), ylim = c(35, 45)) + #zoomed to Hatteras and N
  ggtitle("MAB (green), GB (orange), GOM (red), SS(purple)")


ggplot(data = ecodata::coast) +
  geom_sf() + 
  geom_point(data = FishStatsUtils::northwest_atlantic_grid, aes(x = Lon, y = Lat),  colour = "coral4", size=0.05, alpha=0.1) +
  geom_point(data = her_spr2, aes(x = Lon, y = Lat), size=0.05, colour = "yellow",  alpha=0.7) +
  #geom_point(data = her_fall2, aes(x = Lon, y = Lat), size=0.03, colour = "lightblue",  alpha=0.3) +
  #coord_sf(xlim = c(-79, -65.5), ylim = c(33, 45)) +
  coord_sf(xlim =c(-78, -65.5), ylim = c(35, 45)) +
  ggtitle("Herring spring (yellow) survey strata")

ggplot(data = ecodata::coast) +
  geom_sf() + 
  geom_point(data = FishStatsUtils::northwest_atlantic_grid, aes(x = Lon, y = Lat),  colour = "coral4", size=0.05, alpha=0.1) +
  #geom_point(data = her_spr2, aes(x = Lon, y = Lat), size=0.05, colour = "yellow",  alpha=0.1) +
  geom_point(data = her_fall2, aes(x = Lon, y = Lat), size=0.03, colour = "lightblue",  alpha=0.7) +
  #coord_sf(xlim = c(-79, -65.5), ylim = c(33, 45)) +
  coord_sf(xlim =c(-78, -65.5), ylim = c(35, 45)) +
  ggtitle("Herring fall (light blue) survey strata")


# ggplot(data = ecodata::coast) +
#   geom_sf() + 
#   geom_point(data = FishStatsUtils::northwest_atlantic_grid, aes(x = Lon, y = Lat),  colour = "coral4", size=0.05, alpha=0.1) +
#   geom_point(data = her_spr2, aes(x = Lon, y = Lat), size=0.05, colour = "yellow",  alpha=0.1) +
#   geom_point(data = herrlarv, aes(x = Lon, y = Lat), size=0.03, colour = "blue",  alpha=0.5) +
#   #coord_sf(xlim = c(-79, -65.5), ylim = c(33, 45)) +
#   coord_sf(xlim =c(-78, -65.5), ylim = c(35, 45)) +
#   ggtitle("Herring larval area Sept-Feb (blue), over spring survey strata (yellow")

Maps of key areas for Herring assessment indices. The full VAST model grid is shown in brown.

Figure 2.1: Maps of key areas for Herring assessment indices. The full VAST model grid is shown in brown.

2.4 Temporal scale

Options explored for temporal scale included:

An annual model mixes many generations of zooplankton; “all” is months 1-12
Splitting the year into two seasons, rolling winter into spring and summer into fall is done for multiple fish species
- “SPRING” months 1-6
- “FALL” months 7-12
Splitting the year into 4 seasons, 2 matching survey seasons (spring/fall) and try two for winter and summer to cover other portions of herring life history. Current designations are
- “winter” months 12, 1, 2
- “spring” months 3, 4, 5
- “summer” months 6, 7, 8
- “fall” months 9, 10, 11
Finally, timing of herring larvae in the system was derived from the herring larvae data
- “sepfeb” months 9-12 and 1-2 of the following year
- “nolarv” months 3-8

Testing models with data split over 4 seasons for 1982-2022 resulted in poor model performance, inability to estimate parameters, and lack of convergence for cases where parameters could be estimated. Therefore, we used the SPRING = January-June and FALL = July-December temporal scale as a starting point.

2.5 Spatio-temporal modeling

We applied the Vector Autoregressive Spatio-Temporal (VAST) model (Thorson and Barnett 2017; Thorson 2019) to estimate zooplankton indices from zooplankton survey data.

VAST is a Geostatistical generalized linear mixed effects model (GLMM) that models two linear predictors for an index: 1. encounter rate, and 2. positive catch (amount in stomach)

A full model for the first linear predictor \(\rho_1\) for each observation \(i\) can include: * fixed intercepts \(\beta_1\) for each category \(c\) and time \(t\), * spatial random effects \(\omega_1\) for each location \(s\) and category, * spatio-temporal random effects \(\varepsilon_1\) for each location, category, and time, * fixed vessel effects \(\eta_1\) by vessel \(v\) and category, and * fixed catchability impacts \(\lambda_1\) of covariates \(Q\) for each observation and variable \(k\):

\[\rho_1(i) = \beta_1(c_i, t_i) + \omega_1^*(s_i, c_i) + \varepsilon_1^*(s_i, c_i, t_i) + \eta_1(v_i, c_i) + \sum_{k=1}^{n_k} \lambda_1(k) Q(i,k)\]

The full model for the second linear predictor \(\rho_2\) has the same structure, estimating \(\beta_2\), \(\omega_2\), \(\varepsilon_2\), \(\eta_2\), and \(\lambda_2\) using the observations, categories, locations, times, and covariates.

VAST model code and documentation is available here: https://github.com/James-Thorson-NOAA/VAST

Spatial and spatio-temporal correlation decay with increasing distance estimated as \(\kappa\) in a Matern function with fixed smoothness and geometric anisotropy (directional correlation, optionally estimated by the model).

Initial model selection consistently supported the inclusion of spatial and spatio-temporal random effects and anisotropy across all datasets: fall, spring, and annual.

Observations in space are used to define fixed locations (“knots”) covering the full extent of the model. We assume constant variation within a timestep at a knot.

Modelers specify the number of knots to group observation locations, to balance computation time and resolution. We used 500 knots (Fig. \(\ref{fig:knots}\)).

knitr::include_graphics(here::here("pyindex/lgcopeALL_spring_500_biascorrect_doy/Data_and_knots.png"))

Observations correlated in space and in space over time due to unmeasured processes are modeled as multivariate normal Gaussian Random Fields (GRF):

\(\omega_1\) ~ MVN(0, \(\mathbf R_1\)); \(\omega_2\) ~ MVN(0, \(\mathbf R_2\))
\(\varepsilon_1\)(,t) ~ MVN(0, \(\mathbf R_1\)); \(\varepsilon_2\)(,t) ~ MVN(0, \(\mathbf R_2\))

Spatial and spatio-temporal correlation decay with increasing distance \(\mathbf d\) estimated as \(\kappa\) in a Matérn function with fixed smoothness \(\nu\) and geometric anisotropy \(H\) (directional correlation).

Correlation function between locations \(s\) and \(s'\):

\[\mathbf R_1(s, s') = \frac{1}{2^{\nu-1}\Gamma(\nu)} \times (\kappa_1 \lvert \mathbf d(s,s') \mathbf H \rvert)^\nu \times K_\nu (\kappa_1 \lvert \mathbf d(s,s') \mathbf H \rvert)\]

Estimation uses stochastic partial differential equation (SPDE) approximation.

Observation model “Index2”, Gamma distribution for positive catches and Alternative “Poisson-link delta-model” using log-link for numbers-density and log-link for biomass per number. It is intended for continuous data, which includes biomass data and “numbers standardized to a fixed area.”

Probability of encounter Poisson: \(p(i) = 1 - exp[-n(i)]\) set to 1 for groups found at all stations, e.g. small copepods or zooplankton volume

Number of zooplankton cells per volume given encounter: \(r(i) = \frac{n(i)}{p(i)}w(i)\)

Probability for numbers per volume \(B\) where \(g\) is a Gamma function. :

\[ Pr[b(i) = B] = \begin{cases} 1-p(i), B = 0\\\\p(i) \times g[B|r(i), \sigma_b^2], B > 0 \end{cases},\]

We are interpreting zooplankton abundance per 100 cubic meters as numbers standardized to a fixed area (volume) in applying the Gamma observation model.

Density \(b\) at a location (knot) \(s\) for year \(t\) is then the predicted number of cells per volume (linear predictor for encounter * linear predictor for cells/vol given encounter):

\[\hat{b}_{s,t} = \hat{n}_{s,t}\hat{w}_{s,t}\]

Index based on area \(a\) weighting for each of 500 knots (or subsets):

\[I_t = \sum_{s=1}^{500} a_s\hat{b}_{s,t}\]

For final models, bias correction was done as in Thorson and Kristensen (2016).

2.6 Structural decisions

Two different observation models were applied, depending on the dataset. The default VAST index standardization (purpose = "index2" in make_settings) uses a Gamma distribution for positive catches and an alternative “Poisson-link delta-model” using log-link for numbers-density and log-link for biomass per number (ObsModel= c(2,1)).

We applied the default observation model to Calanus finmarchicus (calfin_100m3) and Large copepods (calfin_100m3 + mlucens_100m3 + calminor_100m3 + euc_100m3 + calspp_100m3) datasets.

The default was used for index standardization of stomach contents data for pelagic and benthic forage indices. It is intended for continuous data, which includes biomass data and “numbers standardized to a fixed area” (see section starting at line 239 in the VAST user manual here. I am interpreting zooplankton abundance per 100 cubic meters as numbers standardized to a fixed area (volume) in applying the Gamma observation model.

For data where there are some years where the species is present in all (or 0) samples, estimating the probability of encounter fails (or at least, VAST won’t let you try). In these cases, the options are to treat intercepts representing temporal variability as random effects (by setting RhoConfig Beta or Epsilon entries to something other than 0), or to use a different link function.

We intend for our indices to potentially be used in assessment (though as a covariate rather than an index, so maybe I’m being too strict), so the recommendation is to “minimize covariance in the estimated index by excluding any temporal correlation on model components (i.e., the intercept is a fixed effect in each year, and the spatio-temporal term is independent in each year)” (Thorson 2019).

All of the small copepods datasets had at least one year where our small copepods groupings were encountered at all stations. None had years with 0 encounters.

Therefore, we used a different link function, the Poisson-link fixing encounter probability=1 for any year where all samples encounter the species. We kept all other settings for index standardization the same, but set (ObsModel= c(2,4)).

2.7 Model validation

Zooplankton indices have not previously been developed using spatio-temporal modeling in the Northeast US. Therefore, we compare previously estimated annual abundance anomalies for Calanus finmarchicus with annual anomalies estimated from our VAST models. These anomalies are estimated at the ecoregion (Mid Atlantic Bight–MAB, Georges Bank–GB, and Gulf of Maine–GOM) scale.

2.8 Model selection

Model selection was done in two stages, the first looking at spatial, spatio-temporal random effects and the second looking at covariates, following Ng et al. (2021) and Gaichas et al. (2023). The model selection script is available at https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/VASTscripts/VASTunivariate_zoopindex_modselection.R

Models were run using Reduced Maximum Likelihood (REML) and without bias correction for the first stage of model selection.

The only “catchability” covariates tested were Day of Year (“doy”) to compensate for the fast turnover rate of zooplankton during the season.

Converged models with the lowest AIC for each taxonomic/season combination were considered best.

2.9 Model diagnostics

After selecting the converged model structures with the most support based on AIC, standard diagnostics were evaluated for each model: data available by season, year, and in space, quantile residuals in aggregate and in space. Model results include the ln density each year and the resulting indices for each spatial unit.

2.10 Defining herring larval presence in time and space

When developing a zooplankton index, it is important to determine whether the zooplankton are available to herring larvae in time and space. We used data available on herring larvae in the zooplankton dataset to define the most likely time and space for herring larvae, and used those temporal and spatial definitions to further refine selected zooplankton indices.

A herring larvae VAST model was developed using the same methods as above, and resulting indices were compared with those estimated by Richardson et al. (2010).

Models were initially developed using the same seasons as for zooplankton (Spring = January-June, Fall = July-December). However, nearly all observations of herring larvae in the 1982-2022 dataset were from September - February (also reflected in the index developed in Richardson et al. (2010).) Therefore, another model for September-February was developed, which aligned observations in January-February with those from the preceeding fall September-December.

To define areas with the most dense herring larvae, we calculate quantiles of density in space across the time series. First we sum the density in each cells over all years, then calculate the quantiles of summed log density from the September-February herring larvae VAST model.

stratlook <- data.frame(Stratum = c("Stratum_1",
                                    "Stratum_2",
                                    "Stratum_3",
                                    "Stratum_4",
                                    "Stratum_5",
                                    "Stratum_6",
                                    "Stratum_7"),
                        Region  = c("AllEPU",
                                    "her_sp",
                                    "her_fa",
                                    "MAB",
                                    "GB",
                                    "GOM",
                                    "SS"))

stratlook2 <- data.frame(Stratum = c("Stratum_1",
                                    "Stratum_2",
                                    "Stratum_3",
                                    "Stratum_4",
                                    "Stratum_5",
                                    "Stratum_6",
                                    "Stratum_7",
                                    "Stratum_8",
                                    "Stratum_9"),
                        Region  = c("AllEPU",
                                    "her_sp",
                                    "her_fa",
                                    "her_larv",
                                    "no_larv",
                                    "MAB",
                                    "GB",
                                    "GOM",
                                    "SS"))

source(here::here("R/utils_VASToutput.R")) #IEA/HerringRT_2025/

outdir <- here::here("pyindex") #IEA/HerringRT_2025/
moddirs <- list.dirs(outdir) 
moddirs <- moddirs[-1]
# keep folder name
modnames <- list.dirs(outdir, full.names = FALSE)

modselect <- modtable(moddirs)

# lets only look at indices for converged models
modcompare_conv <- modselect |>
  dplyr::ungroup() |>
  dplyr::filter(converged2 == "likely") |>
  dplyr::select(modname) |>
  as.vector() |>
  unname() |>
  unlist()

moddirs_conv <-  moddirs[grepl(sprintf("\\.*(%s)$", paste(modcompare_conv, collapse = '|')), moddirs)]


modcompareindex <- purrr::map_dfr(moddirs_conv, purrr::possibly(getmodindex, otherwise = NULL))

splitoutput <- modcompareindex %>%
  dplyr::mutate(Season = modname |> map(str_split, pattern = "_") |> map_chr(c(1,2))) %>%
  dplyr::mutate(Data = modname |> map(str_split, pattern = "_") |> map_chr(c(1,1))) %>%
  dplyr::mutate(Estimate = ifelse(Estimate == 0, NA, Estimate)) |>
  dplyr::left_join(stratlook)

# only larvarea models have this strata set
splitoutput2 <- modcompareindex %>%
  dplyr::filter(str_detect(modname, "larvarea")) |>
  dplyr::mutate(Season = modname |> map(str_split, pattern = "_") |> map_chr(c(1,2))) %>%
  dplyr::mutate(Data = modname |> map(str_split, pattern = "_") |> map_chr(c(1,1))) %>%
  dplyr::mutate(Estimate = ifelse(Estimate == 0, NA, Estimate)) |>
  dplyr::left_join(stratlook2)

zoomax <- max(splitoutput$Estimate, na.rm=T)

zootsmean <- splitoutput %>%
  dplyr::group_by(modname, Region) %>%
  dplyr::mutate(fmean = mean(Estimate, na.rm=T))

# one option, sum D_gct over all years then take quantiles in space

D <- readRDS("Dherr_sepfeb.rds")

Dtot <- D |> 
  dplyr::group_by(cell) |>
  dplyr::mutate(Dsum = sum(D, na.rm=TRUE),
                logD = log(as.vector(Dsum))) |>
  dplyr::select(!c(D, Year)) |>
  dplyr::distinct()

Dvec <- terra::vect(Dtot, geom=c("Lon", "Lat"))

q <- quantile(log(as.vector(Dtot$Dsum)), probs=seq(0,1,0.1))
q2 <- quantile(Dtot$logD, probs=seq(0,1,0.1))

qvec <- terra::quantile(Dvec, probs=c(0.6, 0.65, 0.7, 0.75, 0.8))

quants <- classInt::classIntervals(log(as.vector(Dtot$Dsum)), 
                                   style = "quantile", n = 10)

#quants$brks


D60pct <- Dtot |>
  dplyr::filter(log(as.vector(Dsum))>qvec["60%","logD"]) |>
  dplyr::rename("60th" = "Include")

D65pct <- Dtot |>
  dplyr::filter(log(as.vector(Dsum))>qvec["65%","logD"]) |>
  dplyr::rename("65th" = "Include")

D70pct <- Dtot |>
  dplyr::filter(log(as.vector(Dsum))>qvec["70%","logD"]) |>
  dplyr::rename("70th" = "Include")

D75pct <- Dtot |>
  dplyr::filter(log(as.vector(Dsum))>qvec["75%","logD"]) |>
  dplyr::rename("75th" = "Include")

D80pct <- Dtot |>
  dplyr::filter(log(as.vector(Dsum))>qvec["80%","logD"]) |>
  dplyr::rename("80th" = "Include")

qvec

##     Include     logD
## 60%       1 3.274264
## 65%       1 3.828164
## 70%       1 4.554015
## 75%       1 5.215182
## 80%       1 5.668452

Mapping the summed density below.

d.name <- moddirs[str_detect(moddirs, "herringlarvae_sepfeb")]
fit <- readRDS(paste0(d.name, "/fit.rds"))
#fit <- VAST::reload_model(fit) #added to try to make work after restart, no previous VAST run
years <- unique(fit$data_frame$t_i)
years <- c(min(years):max(years))

mdl <- FishStatsUtils::make_map_info(Region = fit$settings$Region,
                     spatial_list = fit$spatial_list,
                     Extrapolation_List = fit$extrapolation_list#,
                     #added to try to make work after restart, no previous VAST run
                     #Include = fit$extrapolation_list[["Area_km2_x"]] > 0 &  rowSums(fit$extrapolation_list[["a_el"]]) >  0
                     )

gmap <- ggplot(data = ecodata::coast) +
  geom_sf() + 
  #aes(x = lon, y = lat, group = group) +
  #geom_polygon(fill="black", colour = "white") +
  scale_color_viridis_c(option = "magma") +  # now make this quantiles...
  theme(axis.title.x=element_blank(),
        axis.text.x=element_blank(),
        axis.ticks.x=element_blank(),
        axis.title.y=element_blank(),
        axis.text.y=element_blank(),
        axis.ticks.y=element_blank(),
        panel.spacing.x=unit(0, "lines"),
        panel.spacing.y=unit(0, "lines"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank() ) +
  coord_sf(xlim=mdl$Xlim, ylim=mdl$Ylim)

g <- gmap +
  geom_point(data=Dtot, aes(Lon, Lat, color=log(as.vector(Dsum)), group=NULL)
             ,
             ## These settings are necessary to avoid
             ## overlplotting which is a problem here. May need
             ## to be tweaked further.
             size=1.5, stroke=0,shape=16
             ) 

g

Mapping the quantiles from 60% (white) to 80% (light blue).

Quantiles below 70% include densities south of Long Island and off Cape Hatteras. The WG thinks the larvae found to the south may be lost to the population so we likely don’t want to use a quantile below the 70th percentile.

The 80th percentile starts to show a gap on Georges Bank and along the southwest coast of Maine. This may be slicing things too finely to cover general herring larval habitat.

I think this leaves us with using either the 70th (light green) or 75th (dark green) percentile of summed density over all years to define herring larvae relevant habitat for the small copepods index.

gq <- gmap +
 geom_point(data=D60pct, 
             aes(x=Lon, y=Lat, color = "60th"), 
            # color="white"
            #)
            # ,
             ## These settings are necessary to avoid
             ## overlplotting which is a problem here. May need
             ## to be tweaked further.
             size=1.5, stroke=0,shape=16
             )  +
  
  geom_point(data=D65pct, 
             aes(x=Lon, y=Lat, color = "65th"), 
             #color="yellow"
             #)
             #,
             ## These settings are necessary to avoid
             ## overlplotting which is a problem here. May need
             ## to be tweaked further.
             size=1.5, stroke=0,shape=16
             )  +
  
  geom_point(data=D70pct, 
             aes(x=Lon, y=Lat, color = "70th"), 
             #color="green"
             #)
             #,
             ## These settings are necessary to avoid
             ## overlplotting which is a problem here. May need
             ## to be tweaked further.
             size=1.5, stroke=0,shape=16
             )  +
  
  geom_point(data=D75pct, 
             aes(x=Lon, y=Lat, color = "75th"), 
             #color="darkgreen"
              #)
             #,
             ## These settings are necessary to avoid
             ## overlplotting which is a problem here. May need
             ## to be tweaked further.
             size=1.5, stroke=0,shape=16
             )  +
  
   geom_point(data=D80pct, 
             aes(x=Lon, y=Lat, color = "80th"), 
             #color="lightblue"
             #)
             #,
             ## These settings are necessary to avoid
             ## overlplotting which is a problem here. May need
             ## to be tweaked further.
             size=1.5, stroke=0,shape=16
             )  +
  
  ggplot2::scale_color_manual(name = "Quantile",
                        breaks = c("60th", "65th", "70th", "75th", "80th"),
                        values = c("60th" = "white", "65th" = "yellow", 
                                   "70th" = "green", "75th" = "darkgreen", 
                                   "80th" = "lightblue") )
  
gq + ggplot2::theme_grey()

# another option, pull from fit$report Omega 1 and 2?

We selected the 70th percentile as the largest that did not include areas in the southern portion of the range to define the “herring larval area” for use in other zooplankton VAST models.

To make a new extrapolation grid using herring larval density 70th percentile, we first make the 70th percentile+ points into an sf object by outlining the “concave hull” of the area, then intersect that object with the built in FishStatsUtils::northwest_atlantic_grid.

Next, we define new strata based on that intersection and make a new extrapolation grid. Then we can use this grid and call the new strata as strata.limits when running the small copepods model.

See the full process here: https://noaa-edab.github.io/zooplanktonindex/HerringLarvaeExplore.html

Using the new grid, run the smallcopeALL_sepfeb dataset in VAST to get an index in the herring larvae area as well as in the others.

Script for doing this is https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/VASTscripts/VASTunivariate_zoopindex_smcopeALL_herrlarvarea.R

3 Results

3.1 Sampling during each season

Sample sizes in each of the temporal categories are reported below. Definitions: “all” = January-December; “FALL” = July-December; “SPRING” = January-June; “fall” = September-November; “spring” = March-May; “summer” = June-August; “winter” = December-February; “nolarv” = March-August; “sepfeb” = September-February of the following year.

herringfood_stn <- readRDS(here::here("data/herringfood_stn_all_OISST.rds"))

# make SST column that uses surftemp unless missing or 0

herringfood_stn <- herringfood_stn |>
  dplyr::mutate(sstfill = ifelse((is.na(sfc_temp)|sfc_temp==0), oisst, sfc_temp),
                season_larv = month %in% c(1:2, 9:12)) |>
  dplyr::filter(year>1981)

alltot <- herringfood_stn |> 
  dplyr::summarise(totstn = n(),
                   across(c(calfin_100m3,
                            lgcopeALL_100m3,
                            smallcopeALL_100m3,
                            cluhar_100m3,
                            euph_100m3,
                            volume_100m3
                            ), ~ sum(.x != 0, na.rm = TRUE))) |>
  dplyr::mutate(season = "all")


seasontot <- herringfood_stn |> 
  dplyr::group_by(season = season_ng) |>
  dplyr::summarise(totstn = n(),
                   across(c(calfin_100m3,
                            lgcopeALL_100m3,
                            smallcopeALL_100m3,
                            cluhar_100m3,
                            euph_100m3,
                            volume_100m3
                            ), ~ sum(.x != 0, na.rm = TRUE)))

season4tot <- herringfood_stn |> 
  dplyr::group_by(season = season_4) |>
  dplyr::summarise(totstn = n(),
                   across(c(calfin_100m3,
                            lgcopeALL_100m3,
                            smallcopeALL_100m3,
                            cluhar_100m3,
                            euph_100m3,
                            volume_100m3
                            ), ~ sum(.x != 0, na.rm = TRUE)))

larvtot <- herringfood_stn |> 
  dplyr::mutate(season = dplyr::if_else(season_larv, "sepfeb", "nolarv")) |>
  dplyr::group_by(season) |>
  dplyr::summarise(totstn = n(),
                   across(c(calfin_100m3,
                            lgcopeALL_100m3,
                            smallcopeALL_100m3,
                            cluhar_100m3,
                            euph_100m3,
                            volume_100m3
                            ), ~ sum(.x != 0, na.rm = TRUE)))

samptab <- dplyr::bind_rows(alltot, seasontot, season4tot, larvtot) |>
  dplyr::select(season, 
                stations = totstn,
                zoopvol = volume_100m3,
                largecope = lgcopeALL_100m3,
                smallcope = smallcopeALL_100m3,
                herringlarv = cluhar_100m3,
                calanusfin = calfin_100m3,
                euphausid = euph_100m3)
  
flextable::flextable(samptab) |>
  flextable::set_caption("Number of survey stations 1982-2022 by season with number of postive samples by zooplankton species group.") |>
  flextable::colformat_double(big.mark = "", digits = 0) |>
  #flextable::autofit()
  flextable::width(width = 1)

``` ```{=html}

(\#tab:unnamed-chunk-6)Number of survey stations 1982-2022 by season with number of postive samples by zooplankton species group.
season	stations	zoopvol	largecope	smallcope	herringlarv	calanusfin	euphausid
all	27,774	25,266	23,813	25,255	3,815	21,524	11,710
FALL	13,249	12,935	12,081	12,946	2,765	10,355	5,906
SPRING	14,525	12,331	11,732	12,309	1,050	11,169	5,804
fall	8,745	8,558	8,012	8,557	2,134	6,526	4,154
spring	7,739	6,549	6,176	6,540	227	5,972	3,360
summer	5,810	5,471	5,134	5,488	25	4,939	2,566
winter	5,480	4,688	4,491	4,670	1,429	4,087	1,630
nolarv	13,549	12,020	11,310	12,028	252	10,911	5,926
sepfeb	14,225	13,246	12,503	13,227	3,563	10,613	5,784

3.1.1 Zooplankton stations by season

zoopvol <- moddirs[str_detect(moddirs, "zoopvol")]
zoopvolnocovs <- zoopvol[!str_detect(zoopvol, "_doy")]

for(d.name in zoopvolnocovs){
  
  modpath <- unlist(str_split(d.name, pattern = "/"))
  modname <- modpath[length(modpath)]
  
  cat(modname, "\n")
  cat(paste0("![](",d.name, "/Data_by_year.png)"))
  cat("\n\n")
  
}

zoopvol_ann_500_biascorrect

zoopvol_fall_500_biascorrect

zoopvol_spring_500_biascorrect

3.1.2 Herring larvae stations by season

Fall sampling for herring larvae was completed in most years aside from GLOBEC. In our definition of spring, herring larvae primarily occur in January and February. We now include a shifted season to better match larval herring availability throughout the time series: September - February. Year corresponds to September-December, and the following January and February are aligned with the previous year (hatch year) in these analyses. Data were missing for

for(d.name in moddirs[str_detect(moddirs, "herring")]){
  
  modpath <- unlist(str_split(d.name, pattern = "/"))
  modname <- modpath[length(modpath)]
  
  cat(modname, "\n")
  cat(paste0("![](",d.name, "/Data_by_year.png)"))
  cat("\n\n")
  
}

herringlarvae_fall_500_biascorrect

herringlarvae_sepfeb_yrshift_500_biascorrect

herringlarvae_spring_500_biascorrect

3.2 Model validation

We compared a preliminary annual scale VAST Calanus finmarchicus model output (no covariates or bias correction) with the annual anomalies currently included in the R ecodata package (overview, methods). Estimation methods are completely different and spatial strata are slightly different, but the timing of high and low anomalies is well-aligned between the two methods for the region mainly occupied by both Atlantic herring and Calanus finmarchicus, Georges Bank (GB) and the Gulf of Maine (GOM).

stratlook <- data.frame(Stratum = c("Stratum_1",
                                    "Stratum_2",
                                    "Stratum_3",
                                    "Stratum_4",
                                    "Stratum_5",
                                    "Stratum_6",
                                    "Stratum_7"),
                        Region  = c("AllEPU",
                                    "her_sp",
                                    "her_fa",
                                    "MAB",
                                    "GB",
                                    "GOM",
                                    "SS"))

stratlook2 <- data.frame(Stratum = c("Stratum_1",
                                    "Stratum_2",
                                    "Stratum_3",
                                    "Stratum_4",
                                    "Stratum_5",
                                    "Stratum_6",
                                    "Stratum_7",
                                    "Stratum_8",
                                    "Stratum_9"),
                        Region  = c("AllEPU",
                                    "her_sp",
                                    "her_fa",
                                    "her_larv",
                                    "no_larv",
                                    "MAB",
                                    "GB",
                                    "GOM",
                                    "SS"))

source(here::here("R/utils_VASToutput.R")) #IEA/HerringRT_2025/

outdir <- here::here("pyindex_tests") #IEA/HerringRT_2025/
moddirs <- list.dirs(outdir) 
moddirs <- moddirs[-1]
# keep folder name
modnames <- list.dirs(outdir, full.names = FALSE)

modselect <- modtable(moddirs)

# lets only look at indices for converged models
modcompare_conv <- modselect |>
  dplyr::ungroup() |>
  dplyr::filter(converged2 == "likely") |>
  dplyr::select(modname) |>
  as.vector() |>
  unname() |>
  unlist()

moddirs_conv <-  moddirs[grepl(sprintf("\\.*(%s)$", paste(modcompare_conv, collapse = '|')), moddirs)]


modcompareindex <- purrr::map_dfr(moddirs_conv, purrr::possibly(getmodindex, otherwise = NULL))

splitoutput <- modcompareindex %>%
  dplyr::mutate(Season = modname |> map(str_split, pattern = "_") |> map_chr(c(1,2))) %>%
  dplyr::mutate(Data = modname |> map(str_split, pattern = "_") |> map_chr(c(1,1))) %>%
  dplyr::mutate(Estimate = ifelse(Estimate == 0, NA, Estimate)) |>
  dplyr::left_join(stratlook)

# only larvarea models have this strata set
splitoutput2 <- modcompareindex %>%
  dplyr::filter(str_detect(modname, "larvarea")) |>
  dplyr::mutate(Season = modname |> map(str_split, pattern = "_") |> map_chr(c(1,2))) %>%
  dplyr::mutate(Data = modname |> map(str_split, pattern = "_") |> map_chr(c(1,1))) %>%
  dplyr::mutate(Estimate = ifelse(Estimate == 0, NA, Estimate)) |>
  dplyr::left_join(stratlook2)

zoomax <- max(splitoutput$Estimate, na.rm=T)

zootsmean <- splitoutput %>%
  dplyr::group_by(modname, Region) %>%
  dplyr::mutate(fmean = mean(Estimate, na.rm=T))

VASTmean <- splitoutput |>
    dplyr::filter(Data %in% c("calfin"),
                  Season %in% c("annual"),
                  Region %in% c("MAB", "GB", "GOM")) |>
    dplyr::group_by(Region, Season) |>
    dplyr::summarise(tsmean = mean(Estimate))

VASTanom <- splitoutput |>
    dplyr::filter(Data %in% c("calfin"),
                  Season %in% c("annual"),
                  Region %in% c("MAB", "GB", "GOM")) |>
    dplyr::left_join(VASTmean) |>
    dplyr::group_by(Region, Season) |>
    dplyr::mutate(#Value = Value/resca,
      Mean = as.numeric(Estimate),
      SE = Std..Error.for.Estimate,
      Mean = (Mean-tsmean)/tsmean,
      SE = (SE-tsmean)/tsmean,
      Upper = Mean + SE,
      Lower = Mean - SE) |>
  dplyr::rename(EPU = Region)

SOEcalfin <- ecodata::zoo_abundance_anom |>
  dplyr::mutate(Value = as.numeric(Value)) |>
  dplyr::filter(Var == "Calfin") |>
  ggplot2::ggplot(aes(x = Time, y = Value)) +
  ggplot2::geom_point(color="darkred") + 
  ggplot2::geom_line(color="darkred") + 
  ecodata::geom_gls() +
  ggplot2::geom_line(data = VASTanom, aes(x=Time, y=Mean), color = "blue", size=1.5) +
  ecodata::theme_facet() +
  ggplot2::facet_wrap(~EPU)

SOEcalfin + ggplot2::ggtitle("Calanus finmarchicus annual anomaly: ecodata in dark red; VAST in blue")

GB = Georges Bank, GOM = Gulf of Maine, MAB = Mid Atlantic Bight.

Comparison of spatial definitions: ecodata anomalies use the dark red line while VAST models use the blue shaded polygons.

url <- "https://github.com/NOAA-EDAB/presentations/raw/master/docs/EDAB_images/EPU_Designations_Map.jpg"
knitr::include_graphics(url)

3.3 Model selection Stage 1 results

# from each output folder in pyindex, 
outdir <- here::here("pyindex_modsel1")
moddirs <- list.dirs(outdir) 
moddirs <- moddirs[-1]
# keep folder name
modnames <- list.dirs(outdir, full.names = FALSE)


# function to apply extracting info
getmodinfo <- function(d.name){
  # read settings
  modpath <- stringr::str_split(d.name, "/", simplify = TRUE)
  modname <- modpath[length(modpath)]
  
  settings <- read.table(file.path(d.name, "settings.txt"), comment.char = "",
    fill = TRUE, header = FALSE)
  
  n_x <- as.numeric(as.character(settings[(which(settings[,1]=="$n_x")+1),2]))
  grid_size_km <- as.numeric(as.character(settings[(which(settings[,1]=="$grid_size_km")+1),2]))
  max_cells <- as.numeric(as.character(settings[(which(settings[,1]=="$max_cells")+1),2]))
  use_anisotropy <- as.character(settings[(which(settings[,1]=="$use_anisotropy")+1),2])
  fine_scale <- as.character(settings[(which(settings[,1]=="$fine_scale")+1),2])
  bias.correct <- as.character(settings[(which(settings[,1]=="$bias.correct")+1),2])
  
  #FieldConfig
  if(settings[(which(settings[,1]=="$FieldConfig")+1),1]=="Component_1"){
    omega1 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+2),2])
    omega2 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+3),1])
    epsilon1 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+4),2])
    epsilon2 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+5),1])
    beta1 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+6),2])
    beta2 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+7),1])
  }
  
  if(settings[(which(settings[,1]=="$FieldConfig")+1),1]=="Omega1"){
    omega1 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+3),1])
    omega2 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+4),1])
    epsilon1 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+3),2])
    epsilon2 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+4),2])
    beta1 <- "IID"
    beta2 <- "IID"
  }
  
  
  #RhoConfig
  rho_beta1 <- as.numeric(as.character(settings[(which(settings[,1]=="$RhoConfig")+3),1]))
  rho_beta2 <- as.numeric(as.character(settings[(which(settings[,1]=="$RhoConfig")+3),2]))
  rho_epsilon1 <- as.numeric(as.character(settings[(which(settings[,1]=="$RhoConfig")+4),1]))
  rho_epsilon2 <- as.numeric(as.character(settings[(which(settings[,1]=="$RhoConfig")+4),2]))
  
  # read parameter estimates, object is called parameter_Estimates
  if(file.exists(file.path(d.name, "parameter_estimates.RData"))) {
    load(file.path(d.name, "parameter_estimates.RData"))
    
    AIC <- parameter_estimates$AIC[1]  
    converged <- parameter_estimates$Convergence_check[1]
    fixedcoeff <- unname(parameter_estimates$number_of_coefficients[2])
    randomcoeff <- unname(parameter_estimates$number_of_coefficients[3])
    
  }else if(file.exists(file.path(d.name, "parameter_estimates.txt"))){
    
    reptext <- readLines(file.path(d.name, "parameter_estimates.txt"))
    
    AIC <- as.double(reptext[grep(reptext, pattern = "AIC")+1])
    converged <- reptext[grep(reptext, pattern = "Convergence_check")+1]
    fixedcoeff <- as.integer(stringr::str_split(reptext[grep(reptext, pattern = "number_of_coefficients")+2], 
                                     boundary("word"))[[1]][2])
    randomcoeff <- as.integer(stringr::str_split(reptext[grep(reptext, pattern = "number_of_coefficients")+2], 
                                     boundary("word"))[[1]][3])
    
  }else{
    
    AIC <- NA_real_
    converged <- NA_character_
    fixedcoeff <- NA_integer_
    randomcoeff <- NA_integer_
  }
  
  #index <- read.csv(file.path(d.name, "Index.csv"))
  
  
  # return model attributes as a dataframe
  out <- data.frame(modname = modname,
                    n_x = n_x,
                    grid_size_km = grid_size_km,
                    max_cells = max_cells,
                    use_anisotropy = use_anisotropy,
                    fine_scale =  fine_scale,
                    bias.correct = bias.correct,
                    omega1 = omega1,
                    omega2 = omega2,
                    epsilon1 = epsilon1,
                    epsilon2 = epsilon2,
                    beta1 = beta1,
                    beta2 = beta2,
                    rho_epsilon1 = rho_epsilon1,
                    rho_epsilon2 = rho_epsilon2,
                    rho_beta1 = rho_beta1,
                    rho_beta2 = rho_beta2,
                    AIC = AIC,
                    converged = converged,
                    fixedcoeff = fixedcoeff,
                    randomcoeff = randomcoeff#,
                    #index = index
  )
    
    return(out)

}


modcompare <- purrr::map_dfr(moddirs, getmodinfo)

Models compared using REML are identified by model name (“modname” in Table Sb??) which always includes all prey aggregated, season (“all” for annual models of months 1-12, “fall” for models of months 7-12, and “spring” for models of months 1-6), number of knots (500 for all models), and which fixed and random spatial and spatio-temporal effects were included in which linear predictor (1 or 2). The names for model options and associated VAST model settings are:

Model selection 1 (spatial, spatio-temporal effects, no covariates) options (following Ng et al. (2021)) and naming: * All models set Use_REML = TRUE in fit_model specifications.
* Modeled effects, model name suffix, and VAST settings by model:

“_alleffectson” = Spatial and spatio-temporal random effects plus anisotropy in both linear predictors; FieldConfig default (all IID)
“_noaniso” = Spatial and spatio-temporal random effects in both linear predictors with anisotopy turned off; FieldConfig default (all IID) and use_anisotopy = FALSE
“_noomeps2” = Spatial and spatio-temporal random effects plus anisotropy only in linear predictor 1; FieldConfig = 0 for Omega2, Epsilon2
“_noomeps2_noaniso” = Spatial and spatio-temporal random effects only in linear predictor 1 with anisotopy turned off; FieldConfig = 0 for Omega2, Epsilon2 and use_anisotopy = FALSE
“_noomeps2_noeps1” = Spatial random effects plus anisotropy only in linear predictor 1; FieldConfig = 0 for Omega2, Epsilon2, Epsilon1
“_noomeps2_noeps1_noaniso” = Spatial random effects only in linear predictor 1 with anisotopy turned off; FieldConfig = 0 for Omega2, Epsilon2, Epsilon1 and use_anisotopy = FALSE
“_noomeps12” = Anisotropy, but no spatial or spatio-temporal random effects in either linear predictor; FieldConfig = 0 for both Omega, Epsilon
“_noomeps12_noaniso” = No spatial or spatio-temporal random effects in either linear predictor; FieldConfig = 0 for both Omega, Epsilon and use_anisotopy = FALSE

modselect <- modcompare |>
  dplyr::mutate(season = dplyr::case_when(stringr::str_detect(modname, "_fall_") ~ "Fall",
                            stringr::str_detect(modname, "spring") ~ "Spring",
                            stringr::str_detect(modname, "_all_") ~ "Annual",
                            TRUE ~ as.character(NA))) |>
  dplyr::mutate(converged2 = dplyr::case_when(stringr::str_detect(converged, "no evidence") ~ "likely",
                                stringr::str_detect(converged, "is likely not") ~ "unlikely",
                                TRUE ~ as.character(NA))) |>
  dplyr::mutate(copegroup = stringr::str_extract(modname, "[^_]+")) |>
  #dplyr::mutate(modname = str_extract(modname, '(?<=allagg_).*')) |>
  dplyr::group_by(copegroup, season) |>
  dplyr::mutate(deltaAIC = AIC-min(AIC)) |>
  dplyr::select(copegroup, modname, season, deltaAIC, fixedcoeff,
         randomcoeff, use_anisotropy, 
         omega1, omega2, epsilon1, epsilon2, 
         beta1, beta2, AIC, converged2) |>
  dplyr::arrange(copegroup, season, AIC)

# DT::datatable(modselect, rownames = FALSE, 
#               options= list(pageLength = 25, scrollX = TRUE),
#               caption = "Comparison of delta AIC values using Restricted Maxiumum Likelihood (REML) for alternative fixed and random effects model structures. See text for model descriptions.")

flextable::flextable(modselect |>
                       dplyr::select(-c(use_anisotropy, 
         omega1, omega2, epsilon1, epsilon2, 
         beta1, beta2))) |>
  flextable::set_header_labels(copegroup = "Copepod group",
                               modname = "Model name",
                               season = "Season",
                               deltaAIC = "dAIC",
                               fixedcoeff = "N fixed",
                               randomcoeff = "N random",
                               converged2 = "Convergence") |>
  flextable::set_caption("Comparison of delta AIC (dAIC) values using Restricted Maxiumum Likelihood (REML) for alternative fixed and random effects model structures, with number of fixed (N fixed) and random (N random) coefficients. See text for model descriptions associated with each model name.") |>
  flextable::fontsize(size = 9, part = "all") |>
  flextable::colformat_double(digits = 2) |>
  flextable::set_table_properties(layout = "autofit", width = 1)

``` ```{=html}

(\#tab:modsel1)Comparison of delta AIC (dAIC) values using Restricted Maxiumum Likelihood (REML) for alternative fixed and random effects model structures, with number of fixed (N fixed) and random (N random) coefficients. See text for model descriptions associated with each model name.
Copepod group	Model name	Season	dAIC	N fixed	N random	AIC	Convergence
calfin	calfin_fall_500_alleffectson	Fall	0.00	9	46,616	208,373.00	likely
calfin	calfin_fall_500_noaniso	Fall	79.25	7	46,616	208,452.25	likely
calfin	calfin_fall_500_noomeps2	Fall	1,753.04	6	23,348	210,126.05	likely
calfin	calfin_fall_500_noomeps2_noaniso	Fall	1,814.90	4	23,348	210,187.90	likely
calfin	calfin_fall_500_noomeps2_noeps1	Fall	3,474.74	5	634	211,847.75	likely
calfin	calfin_fall_500_noomeps2_noeps1_noaniso	Fall	3,501.57	3	634	211,874.57	likely
calfin	calfin_fall_500_noomeps12	Fall	9,589.69	1	80	217,962.69	likely
calfin	calfin_fall_500_noomeps12_noaniso	Fall	9,589.69	1	80	217,962.69	likely
calfin	calfin_spring_500_alleffectson	Spring	0.00	9	46,534	239,281.74	likely
calfin	calfin_spring_500_noaniso	Spring	48.35	7	46,534	239,330.09	likely
calfin	calfin_spring_500_noomeps2	Spring	864.36	6	23,308	240,146.10	likely
calfin	calfin_spring_500_noomeps2_noaniso	Spring	889.01	4	23,308	240,170.75	likely
calfin	calfin_spring_500_noomeps2_noeps1	Spring	2,325.90	5	635	241,607.64	likely
calfin	calfin_spring_500_noomeps2_noeps1_noaniso	Spring	2,339.84	3	635	241,621.59	likely
calfin	calfin_spring_500_noomeps12	Spring	6,517.46	1	82	245,799.20	likely
calfin	calfin_spring_500_noomeps12_noaniso	Spring	6,517.46	1	82	245,799.20	likely
lgcopeALL	lgcopeALL_fall_500_alleffectson	Fall	0.00	9	46,616	251,030.89	likely
lgcopeALL	lgcopeALL_fall_500_noaniso	Fall	100.03	7	46,616	251,130.92	likely
lgcopeALL	lgcopeALL_fall_500_noomeps2	Fall	1,496.48	6	23,348	252,527.36	likely
lgcopeALL	lgcopeALL_fall_500_noomeps2_noaniso	Fall	1,588.70	4	23,348	252,619.58	likely
lgcopeALL	lgcopeALL_fall_500_noomeps2_noeps1	Fall	3,572.64	5	634	254,603.53	likely
lgcopeALL	lgcopeALL_fall_500_noomeps2_noeps1_noaniso	Fall	3,610.16	3	634	254,641.04	likely
lgcopeALL	lgcopeALL_fall_500_noomeps12	Fall	7,346.16	1	80	258,377.05	likely
lgcopeALL	lgcopeALL_fall_500_noomeps12_noaniso	Fall	7,346.16	1	80	258,377.05	likely
lgcopeALL	lgcopeALL_spring_500_alleffectson	Spring	0.00	9	46,450	257,499.63	likely
lgcopeALL	lgcopeALL_spring_500_noaniso	Spring	78.55	7	46,450	257,578.17	likely
lgcopeALL	lgcopeALL_spring_500_noomeps2	Spring	1,746.74	6	23,266	259,246.36	likely
lgcopeALL	lgcopeALL_spring_500_noomeps2_noaniso	Spring	1,798.07	4	23,266	259,297.70	likely
lgcopeALL	lgcopeALL_spring_500_noomeps2_noeps1	Spring	4,612.34	5	634	262,111.96	likely
lgcopeALL	lgcopeALL_spring_500_noomeps2_noeps1_noaniso	Spring	4,622.66	3	634	262,122.28	likely
lgcopeALL	lgcopeALL_spring_500_noomeps12	Spring	7,906.90	1	82	265,406.53	likely
lgcopeALL	lgcopeALL_spring_500_noomeps12_noaniso	Spring	7,906.90	1	82	265,406.53	likely
smallcopeALL	smallcopeALL_fall_500_alleffectson	Fall	0.00	9	46,607	303,280.90	likely
smallcopeALL	smallcopeALL_fall_500_noaniso	Fall	47.18	7	46,607	303,328.07	likely
smallcopeALL	smallcopeALL_fall_500_noomeps2	Fall	2,231.62	6	23,339	305,512.51	likely
smallcopeALL	smallcopeALL_fall_500_noomeps2_noaniso	Fall	2,265.50	4	23,339	305,546.39	likely
smallcopeALL	smallcopeALL_fall_500_noomeps2_noeps1	Fall	4,501.12	5	625	307,782.01	likely
smallcopeALL	smallcopeALL_fall_500_noomeps2_noeps1_noaniso	Fall	4,508.11	3	625	307,789.01	likely
smallcopeALL	smallcopeALL_fall_500_noomeps12	Fall	8,563.21	1	71	311,844.11	likely
smallcopeALL	smallcopeALL_fall_500_noomeps12_noaniso	Fall	8,563.21	1	71	311,844.11	likely
smallcopeALL	smallcopeALL_spring_500_alleffectson	Spring	0.00	9	46,441	282,327.03	likely
smallcopeALL	smallcopeALL_spring_500_noaniso	Spring	56.67	7	46,441	282,383.71	likely
smallcopeALL	smallcopeALL_spring_500_noomeps2	Spring	3,961.69	6	23,257	286,288.72	likely
smallcopeALL	smallcopeALL_spring_500_noomeps2_noaniso	Spring	4,004.70	4	23,257	286,331.73	likely
smallcopeALL	smallcopeALL_spring_500_noomeps2_noeps1_noaniso	Spring	7,586.19	3	625	289,913.22	likely
smallcopeALL	smallcopeALL_spring_500_noomeps2_noeps1	Spring	7,586.74	5	625	289,913.78	likely
smallcopeALL	smallcopeALL_spring_500_noomeps12	Spring	13,390.23	1	73	295,717.26	likely
smallcopeALL	smallcopeALL_spring_500_noomeps12_noaniso	Spring	13,390.23	1	73	295,717.26	likely
smallcopeSOE	smallcopeSOE_fall_500_alleffectson	Fall	0.00	9	46,616	290,760.56	likely
smallcopeSOE	smallcopeSOE_fall_500_noaniso	Fall	61.29	7	46,616	290,821.85	likely
smallcopeSOE	smallcopeSOE_fall_500_noomeps2	Fall	2,338.60	6	23,348	293,099.16	likely
smallcopeSOE	smallcopeSOE_fall_500_noomeps2_noaniso	Fall	2,380.30	4	23,348	293,140.86	likely
smallcopeSOE	smallcopeSOE_fall_500_noomeps2_noeps1	Fall	4,708.61	5	634	295,469.17	likely
smallcopeSOE	smallcopeSOE_fall_500_noomeps2_noeps1_noaniso	Fall	4,720.67	3	634	295,481.22	likely
smallcopeSOE	smallcopeSOE_fall_500_noomeps12	Fall	8,233.60	1	80	298,994.16	likely
smallcopeSOE	smallcopeSOE_fall_500_noomeps12_noaniso	Fall	8,233.60	1	80	298,994.16	likely
smallcopeSOE	smallcopeSOE_spring_500_alleffectson	Spring	0.00	9	46,445	278,928.98	likely
smallcopeSOE	smallcopeSOE_spring_500_noaniso	Spring	72.09	7	46,445	279,001.07	likely
smallcopeSOE	smallcopeSOE_spring_500_noomeps2	Spring	4,100.24	6	23,261	283,029.22	likely
smallcopeSOE	smallcopeSOE_spring_500_noomeps2_noaniso	Spring	4,157.09	4	23,261	283,086.06	likely
smallcopeSOE	smallcopeSOE_spring_500_noomeps2_noeps1_noaniso	Spring	7,636.58	3	629	286,565.55	likely
smallcopeSOE	smallcopeSOE_spring_500_noomeps2_noeps1	Spring	7,638.62	5	629	286,567.60	likely
smallcopeSOE	smallcopeSOE_spring_500_noomeps12	Spring	12,907.54	1	77	291,836.51	likely
smallcopeSOE	smallcopeSOE_spring_500_noomeps12_noaniso	Spring	12,907.54	1	77	291,836.51	likely

Using REML, models including spatial and spatio-temporal random effects as well as anisotropy were best supported by the data. This was true for the spring datasets and the fall datasets for calfin, lgcopeALL, smallcopeALL, and smallcopeSOE.

However, the small copepod fall model had the spatial random effects parameter for encounter rate, \(\omega_1\), approach 0. We allowed it to be estimated at 0 rather than fixing it at 0. Resulting indices were identical.

Full outputs from stage 1 model selection are posted to google drive here.

3.4 Model selection stage 2 results

Including the day of year covariate had mixed success. While some models fit better with this covariate, others did not meet the convergence criteria. Therefore, we moved forward with the best fitting model that converged to generate indices.

Full results from the best fitting models are posted to google drive here

Only spring large zooplankton did better with a day of year catchability covariate among the models related to herring recruitment.

# from each output folder in pyindex, 
outdir <- here::here("pyindex")
moddirs <- list.dirs(outdir) 
moddirs <- moddirs[-1]
# keep folder name
modnames <- list.dirs(outdir, full.names = FALSE)

modcompare <- purrr::map_dfr(moddirs, getmodinfo)

modselect <- modcompare |>
  dplyr::mutate(season = dplyr::case_when(stringr::str_detect(modname, "_fall_") ~ "Fall",
                            stringr::str_detect(modname, "spring") ~ "Spring",
                            stringr::str_detect(modname, "_ann_") ~ "Annual",
                            stringr::str_detect(modname, "sepfeb") ~ "SepFeb",
                            TRUE ~ as.character(NA))) |>
  dplyr::mutate(converged2 = dplyr::case_when(stringr::str_detect(converged, "no evidence") ~ "likely",
                                stringr::str_detect(converged, "is likely not") ~ "unlikely",
                                TRUE ~ as.character(NA))) |>
  dplyr::mutate(copegroup = stringr::str_extract(modname, "[^_]+")) |>
  #dplyr::mutate(modname = str_extract(modname, '(?<=allagg_).*')) |>
  dplyr::group_by(copegroup, season) |>
  dplyr::mutate(deltaAIC = AIC-min(AIC)) |>
  dplyr::select(copegroup, modname, season, deltaAIC, fixedcoeff,
         randomcoeff, 
         omega1, omega2, #epsilon1, epsilon2, 
         #beta1, beta2, use_anisotropy, 
         AIC, converged2) |>
  dplyr::arrange(copegroup, season, AIC)

flextable::flextable(modselect) %>%
                       #dplyr::select(-c(use_anisotropy, 
         #omega1, omega2, epsilon1, epsilon2, 
         #beta1, beta2))
         #) %>%
  flextable::set_header_labels(modname = "Model name",
                               season = "Season",
                               #deltaAIC = "dAIC",
                               fixedcoeff = "N fixed",
                               randomcoeff = "N random",
                               omega2 = "All other effects",
                               converged2 = "Convergence") |>
  #flextable::set_caption("Comparison of delta AIC (dAIC) values using Restricted Maxiumum Likelihood (REML) for alternative fixed and random effects model structures, with number of fixed (N fixed) and random (N random) coefficients. See text for model descriptions associated with each model name.") %>%
  flextable::fontsize(size = 9, part = "all") %>%
  flextable::colformat_double(digits = 2) |>
  flextable::set_table_properties(layout = "autofit", width = 1)

copegroup	Model name	Season	deltaAIC	N fixed	N random	omega1	All other effects	AIC	Convergence
calfin	calfin_ann_500_biascorrect_doy	Annual	0.00	93	46,536	IID	IID	453,074.91	unlikely
calfin	calfin_fall_500_biascorrect_doy	Fall	0.00	91	46,536	IID	IID	206,053.34	unlikely
calfin	calfin_fall_500_biascorrect	Fall	2,405.54	89	46,536	IID	IID	208,458.88	likely
calfin	calfin_spring_500_biascorrect	Spring		91	46,452	IID	IID	239,403.84	likely
calfin	calfin_spring_500_biascorrect_doy	Spring				IID	IID
euph	euph_ann_500_biascorrect_doy	Annual	0.00	93	46,536	IID	IID	205,163.31	unlikely
euph	euph_ann_500_biascorrect	Annual	45.86	91	46,536	IID	IID	205,209.17	likely
euph	euph_fall_500_biascorrect_doy	Fall	0.00	91	46,536	IID	IID	100,475.13	unlikely
euph	euph_fall_500_biascorrect	Fall	96.55	89	46,536	IID	IID	100,571.67	likely
euph	euph_spring_500_biascorrect_doy	Spring	0.00	93	46,452	IID	IID	101,850.11	likely
euph	euph_spring_500_biascorrect	Spring	819.63	91	46,452	IID	IID	102,669.73	likely
herringlarvae	herringlarvae_fall_500_biascorrect	Fall	0.00	79	46,536	IID	IID	31,298.46	likely
herringlarvae	herringlarvae_sepfeb_yrshift_500_biascorrect	SepFeb	0.00	79	46,620	IID	IID	38,202.32	likely
herringlarvae	herringlarvae_spring_500_biascorrect	Spring	0.00	73	46,368	IID	IID	9,683.77	likely
lgcopeALL	lgcopeALL_ann_500_biascorrect_doy	Annual	0.00	93	46,368	IID	IID	515,161.55	unlikely
lgcopeALL	lgcopeALL_fall_500_biascorrect_doy	Fall	0.00	91	46,536	IID	IID	250,277.08	unlikely
lgcopeALL	lgcopeALL_fall_500_biascorrect	Fall	824.41	89	46,536	IID	IID	251,101.49	likely
lgcopeALL	lgcopeALL_spring_500_biascorrect_doy	Spring	0.00	93	46,368	IID	IID	255,470.75	likely
lgcopeALL	lgcopeALL_spring_500_biascorrect	Spring	2,175.71	91	46,368	IID	IID	257,646.46	likely
smallcopeALL	smallcopeALL_ann_500_biascorrect_doy	Annual	0.00	90	46,368	IID	IID	593,223.14	unlikely
smallcopeALL	smallcopeALL_fall_500_biascorrect_noom1_doy	Fall		81	45,982	0	IID	303,199.76	unlikely
smallcopeALL	smallcopeALL_fall_500_biascorrect_noom1	Fall		79	45,982	0	IID	303,358.72	likely
smallcopeALL	smallcopeALL_fall_500_biascorrect	Fall		80	46,536	IID	IID	303,360.72	likely
smallcopeALL	smallcopeALL_fall_500_biascorrect_fail	Fall				IID	IID
smallcopeALL	smallcopeALL_sepfeb_yrshift_500_larvarea_biascorrect	SepFeb	0.00	84	46,452	IID	IID	306,107.29	likely
smallcopeALL	smallcopeALL_sepfeb_yrshift_500_larvarea_biascorrect_oldarea	SepFeb	8.72	84	46,452	IID	IID	306,116.01	likely
smallcopeALL	smallcopeALL_sepfeb_yrshift_500_biascorrect	SepFeb	18.99	84	46,620	IID	IID	306,126.28	likely
smallcopeALL	smallcopeALL_spring_500_biascorrect_doy	Spring	0.00	84	46,368	IID	IID	281,080.63	unlikely
smallcopeALL	smallcopeALL_spring_500_biascorrect	Spring	1,385.07	82	46,368	IID	IID	282,465.70	likely
smallcopeSOE	smallcopeSOE_ann_500_biascorrect_doy	Annual	0.00	93	46,368	IID	IID	577,717.71	unlikely
smallcopeSOE	smallcopeSOE_fall_500_biascorrect_doy	Fall	0.00	91	46,536	IID	IID	290,576.94	unlikely
smallcopeSOE	smallcopeSOE_fall_500_biascorrect	Fall	307.39	89	46,536	IID	IID	290,884.33	likely
smallcopeSOE	smallcopeSOE_spring_500_biascorrect_doy	Spring	0.00	88	46,368	IID	IID	277,659.94	likely
smallcopeSOE	smallcopeSOE_spring_500_biascorrect	Spring	1,418.56	86	46,368	IID	IID	279,078.51	likely
zoopvol	zoopvol_ann_500_biascorrect_doy	Annual	0.00	52	46,536	IID	IID	226,109.45	unlikely
zoopvol	zoopvol_ann_500_biascorrect	Annual	77.87	50	46,536	IID	IID	226,187.32	likely
zoopvol	zoopvol_fall_500_biascorrect	Fall		49	46,536	IID	IID	111,955.59	likely
zoopvol	zoopvol_fall_500_biascorrect_doy	Fall				IID	IID
zoopvol	zoopvol_spring_500_biascorrect_doy	Spring	0.00	52	46,452	IID	IID	107,636.22	unlikely
zoopvol	zoopvol_spring_500_biascorrect	Spring	2,644.91	50	46,452	IID	IID	110,281.13	likely

3.5 Indices by zooplankton group

Although models were run in multiple seasons and at multiple spatial scales, we focus on results from three indices that were evaluated as covariates in the Atlantic herring assessment model. Based on the results of the boosted regression tree analysis (Molina 2024), the first two indices are large copepods in spring (influencing growth of herring postlarvae and juveniles), and small copeopods in fall (influencing survival of herring larvae over the winter). The spatial scale of these indices is the Atlantic herring bottom trawl survey strata in spring and fall, respectively. The third index is more refined to focus on herring larvae, using temporal and spatial bounds definerd by herring larvae observations in the zooplankton dataset. This third index focuses on small copepods from September-February in the area where more than 70% of herring larval density was identified between 1982-2022.

# lets only look at indices for converged models

stratlook <- data.frame(Stratum = c("Stratum_1",
                                    "Stratum_2",
                                    "Stratum_3",
                                    "Stratum_4",
                                    "Stratum_5",
                                    "Stratum_6",
                                    "Stratum_7"),
                        Region  = c("AllEPU",
                                    "her_sp",
                                    "her_fa",
                                    "MAB",
                                    "GB",
                                    "GOM",
                                    "SS"))

stratlook2 <- data.frame(Stratum = c("Stratum_1",
                                    "Stratum_2",
                                    "Stratum_3",
                                    "Stratum_4",
                                    "Stratum_5",
                                    "Stratum_6",
                                    "Stratum_7",
                                    "Stratum_8",
                                    "Stratum_9"),
                        Region  = c("AllEPU",
                                    "her_sp",
                                    "her_fa",
                                    "her_larv",
                                    "no_larv",
                                    "MAB",
                                    "GB",
                                    "GOM",
                                    "SS"))


modcompare_conv <- modselect |>
  dplyr::ungroup() |>
  dplyr::filter(converged2 == "likely") |>
  dplyr::select(modname) |>
  as.vector() |>
  unname() |>
  unlist()

# or only converged and best AIC
modcompare_best <- modselect |>
  dplyr::ungroup() |>
  dplyr::filter(converged2 == "likely") |>
  dplyr::distinct(copegroup, season, .keep_all = T) |> # assumes orderd by dAIC!
  dplyr::select(modname) |>
  as.vector() |>
  unname() |>
  unlist()

moddirs_conv <-  moddirs[grepl(sprintf("\\.*(%s)$", paste(modcompare_conv, collapse = '|')), moddirs)]

moddirs_best <-  moddirs[grepl(sprintf("\\.*(%s)$", paste(modcompare_best, collapse = '|')), moddirs)]

modcompareindex <- purrr::map_dfr(moddirs_best, purrr::possibly(getmodindex, otherwise = NULL))

splitoutput <- modcompareindex %>%
  dplyr::mutate(Season = modname |> map(str_split, pattern = "_") |> map_chr(c(1,2))) %>%
  dplyr::mutate(Data = modname |> map(str_split, pattern = "_") |> map_chr(c(1,1))) %>%
  dplyr::mutate(Estimate = ifelse(Estimate == 0, NA, Estimate)) |>
  dplyr::left_join(stratlook)

# only larvarea models have this strata set
splitoutput2 <- modcompareindex %>%
  dplyr::filter(str_detect(modname, "larvarea")) |>
  dplyr::mutate(Season = modname |> map(str_split, pattern = "_") |> map_chr(c(1,2))) %>%
  dplyr::mutate(Data = modname |> map(str_split, pattern = "_") |> map_chr(c(1,1))) %>%
  dplyr::mutate(Estimate = ifelse(Estimate == 0, NA, Estimate)) |>
  dplyr::left_join(stratlook2)

zoomax <- max(splitoutput$Estimate, na.rm=T)

zootsmean <- splitoutput %>%
  dplyr::group_by(modname, Region) %>%
  dplyr::mutate(fmean = mean(Estimate, na.rm=T))

3.5.1 Spring (January-June) large copepods

Spring indices for large copepods in the herring spring survey strata (her_sp), on Georges Bank (GB), and in the Gulf of Maine (GOM).

plot_zooindices(splitoutput = splitoutput |> dplyr::filter(Season == "spring"), 
             plotdata = "lgcopeALL", 
             plotregions = c("her_sp", "GB", "GOM"), #"AllEPU", "SS"
             plottitle = "Large copeopods")

VAST-estimated density of large copepods

knitr::include_graphics(here::here("pyindex/lgcopeALL_spring_500_biascorrect_doy/ln_density-predicted.png"))

3.5.2 Fall (July-December) small copepods

Spring indices for small copepods in the herring fall survey strata (her_fa), on Georges Bank (GB), and in the Gulf of Maine (GOM).

plot_zooindices(splitoutput = splitoutput |> dplyr::filter(Season == "fall"), 
             plotdata = "smallcopeALL", 
             plotregions = c("her_fa", "GB", "GOM"), #"AllEPU", "SS"
             plottitle = "small copeopods")

VAST-estimated density of small copepods

knitr::include_graphics(here::here("pyindex/smallcopeALL_fall_500_biascorrect/ln_density-predicted.png"))

3.5.3 September-February small copepods in herring larval area

Small copepods available to larvae

plot_zooindices(splitoutput = splitoutput2, 
             plotdata = "smallcopeALL", 
             plotregions = c("her_larv", "her_fa"), 
             plottitle = "Small copepods Sept-Feb") +
  ggplot2::theme(legend.position="none")

Comparison with small copepods outside the larval area

plot_zooindices_onepanel(splitoutput = splitoutput2 |> dplyr::filter(Region %in% c("her_larv", "no_larv")), 
             plotdata = "smallcopeALL", 
             plottitle = "Small copepods Sept-Feb") +
  ggplot2::ggtitle(NULL)

VAST-estimated density of small copepods September-February

knitr::include_graphics(here::here("pyindex/smallcopeALL_sepfeb_yrshift_500_larvarea_biascorrect/ln_density-predicted.png"))

4 Discussion

The indices suggest that spring large copepods incereased to a relatively high level in 2015 but have been declining since, although there is no long term trend in the time series.

Fall small copepods have generally increased from a low in the mid-2000s and have remained relatively stable 2010-present, and similarly have no long term trend in the time series.

September-February small copepods have very similar trajectories in the Atlantic herring bottom trawl survey strata and in the herring larval area defined using larval data. The trend for the bottom trawl survey strata shows a significant long term increase, while that in the herring larval area does not. Both time series have increased from a low in the mid-2000s.

There is some evidence that the proportion of small copepods outside the herring larval area was higher than within the larval area up until the mid-2010s. Since then, the estimated abundance of small copepods is similar inside and outside the herring larval area.

References

Collette, B.B., and Klein-Macphee, G. 2002. Bigelow and Schroeder’s Fishes of the Gulf of Maine, Third Edition. In 3rd ed. edition. Smithsonian Books, Washington, DC.

Gaichas, S.K., Gartland, J., Smith, B.E., Wood, A.D., Ng, E.L., Celestino, M., Drew, K., Tyrell, A.S., and Thorson, J.T. 2023. Assessing small pelagic fish trends in space and time using piscivore stomach contents. Canadian Journal of Fisheries and Aquatic Sciences: cjfas-2023-0093. doi:10.1139/cjfas-2023-0093.

Kane, J. 2007. Zooplankton abundance trends on Georges Bank, 1977–2004. ICES Journal of Marine Science 64(5): 909–919. doi:10.1093/icesjms/fsm066.

Morse, R.E., Friedland, K.D., Tommasi, D., Stock, C., and Nye, J. 2017. Distinct zooplankton regime shift patterns across ecoregions of the U.S. Northeast continental shelf Large Marine Ecosystem. Journal of Marine Systems 165: 77–91. doi:10.1016/j.jmarsys.2016.09.011.

Ng, E.L., Deroba, J.J., Essington, T.E., Grüss, A., Smith, B.E., and Thorson, J.T. 2021. Predator stomach contents can provide accurate indices of prey biomass. ICES Journal of Marine Science 78(3): 1146–1159. doi:10.1093/icesjms/fsab026.

Richardson, D.E., Hare, J.A., Overholtz, W.J., and Johnson, D.L. 2010. Development of long-term larval indices for Atlantic herring (Clupea harengus) on the northeast US continental shelf. ICES Journal of Marine Science 67(4): 617–627. doi:10.1093/icesjms/fsp276.

Thorson, J.T. 2019. Guidance for decisions using the Vector Autoregressive Spatio-Temporal (VAST) package in stock, ecosystem, habitat and climate assessments. Fisheries Research 210: 143–161. doi:10.1016/j.fishres.2018.10.013.

Thorson, J.T., and Barnett, L.A.K. 2017. Comparing estimates of abundance trends and distribution shifts using single- and multispecies models of fishes and biogenic habitat. ICES Journal of Marine Science 74(5): 1311–1321. doi:10.1093/icesjms/fsw193.

Thorson, J.T., and Kristensen, K. 2016. Implementing a generic method for bias correction in statistical models using random effects, with spatial and population dynamics examples. Fisheries Research 175: 66–74. doi:10.1016/j.fishres.2015.11.016.

Working Paper: Developing zooplankton indices in VAST

Sarah Gaichas

2024-11-12