The Atlantic herring research track working group (WG) is interested in exploring impacts of food availability on herring stock dynamics. Indices of zooplankton abundance were proposed for exploration.
These indices could potentially explain:
The WG decided to focus on patterns in recruitment, as low recruitment in recent years is an important issue for the stock and for fishery management.
The WG used a boosted regression tree analysis (Molina 2024) to identify zooplankton indices that best explained patterns in herring recruitment. These indices included large copepods in spring (influencing growth of herring postlarvae and juveniles), and small copeopods in fall (influencing survival of herring larvae over the winter).
In this working paper, we develop indices of large copepods in spring and small copepods in fall using zooplankton survey data (citation) from 1982-2022. We tailor the indices to the spatial footprint of the Atlantic herring stock assessment, and further use information on the changing spatial distribution of herring larvae over time from the same dataset to evaluate potential changes in availability of small copepods to herring larvae.
We used spatio-temporal modeling to develop zooplankton indices using the ECOMON data as an input.
Decisions:
We use herring diet information to determine which zooplankton species to include. This information is from both literature and food habits data.
“The herring is a plankton feeder…. Examination of 1,500 stomachs showed that adult herring near Eastport were living solely on copepods and on pelagic euphausiid shrimps (Meganyctiphanes norwegica), fish less than 4 inches long depending on the former alone, while the larger herring were eating both. When first hatched, and before the disappearance of the yolk sac, the larvae (European) feed on larval snails and crustaceans, on diatoms, and on peridinians, but they soon begin taking copepods, and depend exclusively on these for a time after they get to be 12 mm. long, especially on the little Pseudocalanus elongatus. As they grow older they feed more and more on the larger copepods and amphipods, pelagic shrimps, and decapod crustacean larvae (Collette and Klein-Macphee 2002).”
What herring eat from NEFSC food habits data is summarized here: https://fwdp.shinyapps.io/tm2020/ Copepods are highest diet proportion, followed by well digested prey (AR), and euphausiids. Broken down by decade and season, it is clear that copeopods dominate adult diets in winter and spring, and more so recently.
Maybe copeopods are bad for hering??
# Use Brian's new Rmd to get herring diet by decade and season or season and region
herringdiet <- readRDS(here::here("data/herringdiet.rds"))
herringdecseastot <- herringdiet %>%
# make decade column
# make 2 seasons winter--> spring and summer--> fall
dplyr::mutate(decade = floor(year/10)*10) |>
#season2 = case_when(season %in% c("WINTER", "SPRING") ~ "SPRING",
# season %in% c("SUMMER", "FALL") ~ "FALL")) %>%
#season2 = case_when(season %in% c( "SPRING") ~ "SPRING",
# season %in% c( "FALL") ~ "FALL")) %>%
dplyr::select(decade, season, totwt) %>%
dplyr::distinct() %>%
dplyr::group_by(decade, season) %>%
dplyr::mutate(totwt2 = sum(totwt, na.rm = TRUE),
nyrs = n_distinct(year)) %>%
dplyr::select(decade, season, totwt2, nyrs) %>%
dplyr::distinct()
diet <- read.csv(here::here("data/allwtp2024-09-28.csv"))
preylook <- read.csv(here::here("data/SASPREY24.csv"))
preylook <- dplyr::select(preylook, c(Pynam, pycomnam, COLLCAT))
# add blueprey list
decadeseason <- diet %>%
dplyr::mutate(copeprey = ifelse(prey %in% c("COPEPO"), TRUE, FALSE),
plotcol = ifelse(copeprey, "blue", "lightgrey"),
decade = as.integer(gsub("s","",decade))) %>%
dplyr::left_join(herringdecseastot, by=c("season" = "season",
"decade" = "decade")) %>%
dplyr::group_by(decade, season) %>%
dplyr::filter(meansw>0) %>%
dplyr::arrange(desc(copeprey), .by_group = TRUE)
bars <- map(unique(decadeseason$decade)
, ~geom_bar(aes(width=nyrs), stat = "identity", colour = "white"#, position = "stack",
, data = decadeseason %>% filter(decade == .x)))
p2 <- ggplot(decadeseason, aes(decade, relmsw, fill=copeprey)) +
#fill=factor(prey,
# levels = c(as.factor(names(preycolaggsel)),
# setdiff(prey, preycolaggsel)),
# ordered = TRUE
# ))) +
#geom_bar(aes(width=nyrs), stat = "identity") + #
bars +
ylab("Percent in Diet") +
xlab("Decade") +
geom_text(aes(y=relmsw,
label = ifelse(relmsw > 3, prey, "")),
position = position_stack(vjust = 0.5),
size=2.5)+
facet_wrap(~fct_relevel(season,"WINTER", "SPRING","SUMMER", "FALL")) +
theme_bw() +
#viridis::scale_fill_viridis(discrete = TRUE) +
scale_fill_manual(values=c( "grey90", "lightblue")) +
theme(legend.position = "none"
#legend.position="bottom",
#legend.text=element_text(size=5)
) #+
#geom_bar_interactive(stat = "identity", aes(tooltip = prey, data_id = prey))
#ggiraph(code = print(p1))
#p1
p2
Our focus in developing the indices is on herring recruitment, so we prioritize indices of food for larvae through juveniles.
Models developed for the Herring RT are for the following copepod categories:
Large copepods ALL: Calanus finmarchicus, Metridia lucens, Calanus minor, Eucalanus spp., Calanus spp.
Small copepods ALL: Centropages typicus, Pseudocalanus spp., Temora longicornis, Centropages hamatus, Paracalanus parvus, Acartia spp., Clausocalanus arcuicornis, Acartia longiremis, Clausocalanus furcatus, Temora stylifera, Temora spp., Tortanus discaudatus, Paracalanus spp.
These categories included all life stages of each copepod species.
(Models were also developed for Calanus finmarchicus alone, total zooplankton volume, and Euphausiids, which may be useful for comparison with adult Atlantic herring condition or growth in the future.)
An additional model was developed for Atlantic herring larvae using the same dataset in order to characterize the spatial extent of herring larvae compared with their zooplankton prey. This model was not used as an index due to missing data during several years of the time series.
Previous zooplankton indices use the numbers per 100 cubic meters of filtered water volume _100m3
data (Kane 2007; Morse et al. 2017). Other current analyses use the numbers per 100 cubic meters as well.
NEFSC survey strata definitions are built into the VAST northwest-atlantic
extrapolation grid already. Therefore, fall and spring survey strata sets used in the Atlantic herring assessment were also applied to the zooplankton indices. Indices were also calculated in Ecological Production Units (EPUs) used in regional ecosystem reporting (Fig. \(\ref{fig:maps}\)).
herring_spring <- c(01010, 01020, 01030, 01040, 01050, 01060, 01070, 01080, 01090,
01100, 01110, 01120, 01130, 01140, 01150, 01160, 01170, 01180,
01190, 01200, 01210, 01220, 01230, 01240, 01250, 01260, 01270,
01280, 01290, 01300, 01360, 01370, 01380, 01390, 01400, 01610,
01620, 01630, 01640, 01650, 01660, 01670, 01680, 01690, 01700,
01710, 01720, 01730, 01740, 01750, 01760)
herring_fall <- c(01050, 01060, 01070, 01080, 01090, 01100, 01110, 01120, 01130,
01140, 01150, 01160, 01170, 01180, 01190, 01200, 01210, 01220,
01230, 01240, 01250, 01260, 01270, 01280, 01290, 01300, 01360,
01370, 01380, 01390, 01400)
MAB <- c(1010:1080, 1100:1120, 1600:1750, 3010:3450, 3470, 3500, 3510)
GB <- c(1090, 1130:1210, 1230, 1250, 3460, 3480, 3490, 3520:3550)
GOM <- c(1220, 1240, 1260:1290, 1360:1400, 3560:3830)
SS <- c(1300:1352, 3840:3990)
D70pct_nwast <- readRDS(url("https://github.com/NOAA-EDAB/zooplanktonindex/raw/main/spatialdat/D70pct_nwa_strat2.rds"))
# MAB EPU
MAB2 <- D70pct_nwast %>%
dplyr::filter(stratum_number %in% MAB)
# MAB herring larvae area
MAB2herr <- MAB2 %>%
dplyr::filter(stratum_number2 %% 10 == 1)
# MAB outside larval area
MAB2out <- MAB2 %>%
dplyr::filter(stratum_number2 %% 10 == 2)
# Georges Bank EPU
GB2 <- D70pct_nwast %>%
dplyr::filter(stratum_number %in% GB)
# GB herring larvae
GB2herr <- GB2 %>%
dplyr::filter(stratum_number2 %% 10 == 1)
#GB outside larval area
GB2out <- GB2 %>%
dplyr::filter(stratum_number2 %% 10 == 2)
# gulf of maine EPU
GOM2 <- D70pct_nwast %>%
dplyr::filter(stratum_number %in% GOM)
# GOM herring larvae
GOM2herr <- GOM2 %>%
dplyr::filter(stratum_number2 %% 10 == 1)
#GOM outside larval area
GOM2out <- GOM2 %>%
dplyr::filter(stratum_number2 %% 10 == 2)
# scotian shelf EPU
SS2 <- D70pct_nwast %>%
dplyr::filter(stratum_number %in% SS)
# SS herring larvae
SS2herr <- SS2 %>%
dplyr::filter(stratum_number2 %% 10 == 1)
#SS outside larval area
SS2out <- SS2 %>%
dplyr::filter(stratum_number2 %% 10 == 2)
# whole herring larval area
herrlarv <- dplyr::bind_rows(MAB2herr, GB2herr, GOM2herr, SS2herr)
# outside herring larval area
nolarv <- dplyr::bind_rows(MAB2out, GB2out, GOM2out, SS2out)
# spring herring NEFSC BTS
her_spr2 <- D70pct_nwast %>%
dplyr::filter(stratum_number %in% herring_spring)
# fall herring NEFSC BTS
her_fall2 <- D70pct_nwast %>%
dplyr::filter(stratum_number %in% herring_fall)
# all epus
allEPU2 <- D70pct_nwast %>%
dplyr::filter(stratum_number %in% c(MAB, GB, GOM, SS))
theme_set(theme_bw())
ggplot(data = ecodata::coast) +
geom_sf() +
geom_point(data = FishStatsUtils::northwest_atlantic_grid, aes(x = Lon, y = Lat), colour = "coral4", size=0.05, alpha=0.1) +
geom_point(data = MAB2, aes(x = Lon, y = Lat), colour = "green3", size=0.05, alpha=0.5) +
geom_point(data = GB2, aes(x = Lon, y = Lat), colour = "orange", size=0.05, alpha=0.5) +
geom_point(data = GOM2, aes(x = Lon, y = Lat), colour = "red", size=0.05, alpha=0.5) +
geom_point(data = SS2, aes(x = Lon, y = Lat), colour = "purple", size=0.05, alpha=0.1) +
#coord_sf(xlim = c(-79, -65.5), ylim = c(33, 45)) + #full extent of VAST model
coord_sf(xlim =c(-78, -65.5), ylim = c(35, 45)) + #zoomed to Hatteras and N
ggtitle("MAB (green), GB (orange), GOM (red), SS(purple)")
ggplot(data = ecodata::coast) +
geom_sf() +
geom_point(data = FishStatsUtils::northwest_atlantic_grid, aes(x = Lon, y = Lat), colour = "coral4", size=0.05, alpha=0.1) +
geom_point(data = her_spr2, aes(x = Lon, y = Lat), size=0.05, colour = "yellow", alpha=0.7) +
#geom_point(data = her_fall2, aes(x = Lon, y = Lat), size=0.03, colour = "lightblue", alpha=0.3) +
#coord_sf(xlim = c(-79, -65.5), ylim = c(33, 45)) +
coord_sf(xlim =c(-78, -65.5), ylim = c(35, 45)) +
ggtitle("Herring spring (yellow) survey strata")
ggplot(data = ecodata::coast) +
geom_sf() +
geom_point(data = FishStatsUtils::northwest_atlantic_grid, aes(x = Lon, y = Lat), colour = "coral4", size=0.05, alpha=0.1) +
#geom_point(data = her_spr2, aes(x = Lon, y = Lat), size=0.05, colour = "yellow", alpha=0.1) +
geom_point(data = her_fall2, aes(x = Lon, y = Lat), size=0.03, colour = "lightblue", alpha=0.7) +
#coord_sf(xlim = c(-79, -65.5), ylim = c(33, 45)) +
coord_sf(xlim =c(-78, -65.5), ylim = c(35, 45)) +
ggtitle("Herring fall (light blue) survey strata")
# ggplot(data = ecodata::coast) +
# geom_sf() +
# geom_point(data = FishStatsUtils::northwest_atlantic_grid, aes(x = Lon, y = Lat), colour = "coral4", size=0.05, alpha=0.1) +
# geom_point(data = her_spr2, aes(x = Lon, y = Lat), size=0.05, colour = "yellow", alpha=0.1) +
# geom_point(data = herrlarv, aes(x = Lon, y = Lat), size=0.03, colour = "blue", alpha=0.5) +
# #coord_sf(xlim = c(-79, -65.5), ylim = c(33, 45)) +
# coord_sf(xlim =c(-78, -65.5), ylim = c(35, 45)) +
# ggtitle("Herring larval area Sept-Feb (blue), over spring survey strata (yellow")
Options explored for temporal scale included:
Testing models with data split over 4 seasons for 1982-2022 resulted in poor model performance, inability to estimate parameters, and lack of convergence for cases where parameters could be estimated. Therefore, we used the SPRING = January-June and FALL = July-December temporal scale as a starting point.
We applied the Vector Autoregressive Spatio-Temporal (VAST) model (Thorson and Barnett 2017; Thorson 2019) to estimate zooplankton indices from zooplankton survey data.
VAST is a Geostatistical generalized linear mixed effects model (GLMM) that models two linear predictors for an index: 1. encounter rate, and 2. positive catch (amount in stomach)
A full model for the first linear predictor \(\rho_1\) for each observation \(i\) can include: * fixed intercepts \(\beta_1\) for each category \(c\) and time \(t\), * spatial random effects \(\omega_1\) for each location \(s\) and category, * spatio-temporal random effects \(\varepsilon_1\) for each location, category, and time, * fixed vessel effects \(\eta_1\) by vessel \(v\) and category, and * fixed catchability impacts \(\lambda_1\) of covariates \(Q\) for each observation and variable \(k\):
\[\rho_1(i) = \beta_1(c_i, t_i) + \omega_1^*(s_i, c_i) + \varepsilon_1^*(s_i, c_i, t_i) + \eta_1(v_i, c_i) + \sum_{k=1}^{n_k} \lambda_1(k) Q(i,k)\]
The full model for the second linear predictor \(\rho_2\) has the same structure, estimating \(\beta_2\), \(\omega_2\), \(\varepsilon_2\), \(\eta_2\), and \(\lambda_2\) using the observations, categories, locations, times, and covariates.
VAST model code and documentation is available here: https://github.com/James-Thorson-NOAA/VAST
Spatial and spatio-temporal correlation decay with increasing distance estimated as \(\kappa\) in a Matern function with fixed smoothness and geometric anisotropy (directional correlation, optionally estimated by the model).
Initial model selection consistently supported the inclusion of spatial and spatio-temporal random effects and anisotropy across all datasets: fall, spring, and annual.
Observations in space are used to define fixed locations (“knots”) covering the full extent of the model. We assume constant variation within a timestep at a knot.
Modelers specify the number of knots to group observation locations, to balance computation time and resolution. We used 500 knots (Fig. \(\ref{fig:knots}\)).
knitr::include_graphics(here::here("pyindex/lgcopeALL_spring_500_biascorrect_doy/Data_and_knots.png"))
Observations correlated in space and in space over time due to unmeasured processes are modeled as multivariate normal Gaussian Random Fields (GRF):
\(\omega_1\) ~ MVN(0, \(\mathbf R_1\)); \(\omega_2\) ~ MVN(0, \(\mathbf R_2\))
\(\varepsilon_1\)(,t) ~ MVN(0, \(\mathbf R_1\)); \(\varepsilon_2\)(,t) ~ MVN(0, \(\mathbf R_2\))
Spatial and spatio-temporal correlation decay with increasing distance \(\mathbf d\) estimated as \(\kappa\) in a Matérn function with fixed smoothness \(\nu\) and geometric anisotropy \(H\) (directional correlation).
Correlation function between locations \(s\) and \(s'\):
\[\mathbf R_1(s, s') = \frac{1}{2^{\nu-1}\Gamma(\nu)} \times (\kappa_1 \lvert \mathbf d(s,s') \mathbf H \rvert)^\nu \times K_\nu (\kappa_1 \lvert \mathbf d(s,s') \mathbf H \rvert)\]
Estimation uses stochastic partial differential equation (SPDE) approximation.
Observation model “Index2”, Gamma distribution for positive catches and Alternative “Poisson-link delta-model” using log-link for numbers-density and log-link for biomass per number. It is intended for continuous data, which includes biomass data and “numbers standardized to a fixed area.”
Probability of encounter Poisson: \(p(i) = 1 - exp[-n(i)]\) set to 1 for groups found at all stations, e.g. small copepods or zooplankton volume
Number of zooplankton cells per volume given encounter: \(r(i) = \frac{n(i)}{p(i)}w(i)\)
Probability for numbers per volume \(B\) where \(g\) is a Gamma function. :
\[ Pr[b(i) = B] = \begin{cases} 1-p(i), B = 0\\\\p(i) \times g[B|r(i), \sigma_b^2], B > 0 \end{cases},\]
We are interpreting zooplankton abundance per 100 cubic meters as numbers standardized to a fixed area (volume) in applying the Gamma observation model.
Density \(b\) at a location (knot) \(s\) for year \(t\) is then the predicted number of cells per volume (linear predictor for encounter * linear predictor for cells/vol given encounter):
\[\hat{b}_{s,t} = \hat{n}_{s,t}\hat{w}_{s,t}\]
Index based on area \(a\) weighting for each of 500 knots (or subsets):
\[I_t = \sum_{s=1}^{500} a_s\hat{b}_{s,t}\]
For final models, bias correction was done as in Thorson and Kristensen (2016).
Two different observation models were applied, depending on the dataset. The default VAST index standardization (purpose = "index2"
in make_settings
) uses a Gamma distribution for positive catches and an alternative “Poisson-link delta-model” using log-link for numbers-density and log-link for biomass per number (ObsModel= c(2,1)
).
We applied the default observation model to Calanus finmarchicus (calfin_100m3) and Large copepods (calfin_100m3 + mlucens_100m3 + calminor_100m3 + euc_100m3 + calspp_100m3) datasets.
The default was used for index standardization of stomach contents data for pelagic and benthic forage indices. It is intended for continuous data, which includes biomass data and “numbers standardized to a fixed area” (see section starting at line 239 in the VAST user manual here. I am interpreting zooplankton abundance per 100 cubic meters as numbers standardized to a fixed area (volume) in applying the Gamma observation model.
For data where there are some years where the species is present in all (or 0) samples, estimating the probability of encounter fails (or at least, VAST won’t let you try). In these cases, the options are to treat intercepts representing temporal variability as random effects (by setting RhoConfig
Beta
or Epsilon
entries to something other than 0), or to use a different link function.
We intend for our indices to potentially be used in assessment (though as a covariate rather than an index, so maybe I’m being too strict), so the recommendation is to “minimize covariance in the estimated index by excluding any temporal correlation on model components (i.e., the intercept is a fixed effect in each year, and the spatio-temporal term is independent in each year)” (Thorson 2019).
All of the small copepods datasets had at least one year where our small copepods groupings were encountered at all stations. None had years with 0 encounters.
Therefore, we used a different link function, the Poisson-link fixing encounter probability=1 for any year where all samples encounter the species. We kept all other settings for index standardization the same, but set (ObsModel= c(2,4)
).
Zooplankton indices have not previously been developed using spatio-temporal modeling in the Northeast US. Therefore, we compare previously estimated annual abundance anomalies for Calanus finmarchicus with annual anomalies estimated from our VAST models. These anomalies are estimated at the ecoregion (Mid Atlantic Bight–MAB, Georges Bank–GB, and Gulf of Maine–GOM) scale.
Model selection was done in two stages, the first looking at spatial, spatio-temporal random effects and the second looking at covariates, following Ng et al. (2021) and Gaichas et al. (2023). The model selection script is available at https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/VASTscripts/VASTunivariate_zoopindex_modselection.R
Models were run using Reduced Maximum Likelihood (REML) and without bias correction for the first stage of model selection.
The only “catchability” covariates tested were Day of Year (“doy”) to compensate for the fast turnover rate of zooplankton during the season.
Converged models with the lowest AIC for each taxonomic/season combination were considered best.
After selecting the converged model structures with the most support based on AIC, standard diagnostics were evaluated for each model: data available by season, year, and in space, quantile residuals in aggregate and in space. Model results include the ln density each year and the resulting indices for each spatial unit.
When developing a zooplankton index, it is important to determine whether the zooplankton are available to herring larvae in time and space. We used data available on herring larvae in the zooplankton dataset to define the most likely time and space for herring larvae, and used those temporal and spatial definitions to further refine selected zooplankton indices.
A herring larvae VAST model was developed using the same methods as above, and resulting indices were compared with those estimated by Richardson et al. (2010).
Models were initially developed using the same seasons as for zooplankton (Spring = January-June, Fall = July-December). However, nearly all observations of herring larvae in the 1982-2022 dataset were from September - February (also reflected in the index developed in Richardson et al. (2010).) Therefore, another model for September-February was developed, which aligned observations in January-February with those from the preceeding fall September-December.
To define areas with the most dense herring larvae, we calculate quantiles of density in space across the time series. First we sum the density in each cells over all years, then calculate the quantiles of summed log density from the September-February herring larvae VAST model.
stratlook <- data.frame(Stratum = c("Stratum_1",
"Stratum_2",
"Stratum_3",
"Stratum_4",
"Stratum_5",
"Stratum_6",
"Stratum_7"),
Region = c("AllEPU",
"her_sp",
"her_fa",
"MAB",
"GB",
"GOM",
"SS"))
stratlook2 <- data.frame(Stratum = c("Stratum_1",
"Stratum_2",
"Stratum_3",
"Stratum_4",
"Stratum_5",
"Stratum_6",
"Stratum_7",
"Stratum_8",
"Stratum_9"),
Region = c("AllEPU",
"her_sp",
"her_fa",
"her_larv",
"no_larv",
"MAB",
"GB",
"GOM",
"SS"))
source(here::here("R/utils_VASToutput.R")) #IEA/HerringRT_2025/
outdir <- here::here("pyindex") #IEA/HerringRT_2025/
moddirs <- list.dirs(outdir)
moddirs <- moddirs[-1]
# keep folder name
modnames <- list.dirs(outdir, full.names = FALSE)
modselect <- modtable(moddirs)
# lets only look at indices for converged models
modcompare_conv <- modselect |>
dplyr::ungroup() |>
dplyr::filter(converged2 == "likely") |>
dplyr::select(modname) |>
as.vector() |>
unname() |>
unlist()
moddirs_conv <- moddirs[grepl(sprintf("\\.*(%s)$", paste(modcompare_conv, collapse = '|')), moddirs)]
modcompareindex <- purrr::map_dfr(moddirs_conv, purrr::possibly(getmodindex, otherwise = NULL))
splitoutput <- modcompareindex %>%
dplyr::mutate(Season = modname |> map(str_split, pattern = "_") |> map_chr(c(1,2))) %>%
dplyr::mutate(Data = modname |> map(str_split, pattern = "_") |> map_chr(c(1,1))) %>%
dplyr::mutate(Estimate = ifelse(Estimate == 0, NA, Estimate)) |>
dplyr::left_join(stratlook)
# only larvarea models have this strata set
splitoutput2 <- modcompareindex %>%
dplyr::filter(str_detect(modname, "larvarea")) |>
dplyr::mutate(Season = modname |> map(str_split, pattern = "_") |> map_chr(c(1,2))) %>%
dplyr::mutate(Data = modname |> map(str_split, pattern = "_") |> map_chr(c(1,1))) %>%
dplyr::mutate(Estimate = ifelse(Estimate == 0, NA, Estimate)) |>
dplyr::left_join(stratlook2)
zoomax <- max(splitoutput$Estimate, na.rm=T)
zootsmean <- splitoutput %>%
dplyr::group_by(modname, Region) %>%
dplyr::mutate(fmean = mean(Estimate, na.rm=T))
# one option, sum D_gct over all years then take quantiles in space
D <- readRDS("Dherr_sepfeb.rds")
Dtot <- D |>
dplyr::group_by(cell) |>
dplyr::mutate(Dsum = sum(D, na.rm=TRUE),
logD = log(as.vector(Dsum))) |>
dplyr::select(!c(D, Year)) |>
dplyr::distinct()
Dvec <- terra::vect(Dtot, geom=c("Lon", "Lat"))
q <- quantile(log(as.vector(Dtot$Dsum)), probs=seq(0,1,0.1))
q2 <- quantile(Dtot$logD, probs=seq(0,1,0.1))
qvec <- terra::quantile(Dvec, probs=c(0.6, 0.65, 0.7, 0.75, 0.8))
quants <- classInt::classIntervals(log(as.vector(Dtot$Dsum)),
style = "quantile", n = 10)
#quants$brks
D60pct <- Dtot |>
dplyr::filter(log(as.vector(Dsum))>qvec["60%","logD"]) |>
dplyr::rename("60th" = "Include")
D65pct <- Dtot |>
dplyr::filter(log(as.vector(Dsum))>qvec["65%","logD"]) |>
dplyr::rename("65th" = "Include")
D70pct <- Dtot |>
dplyr::filter(log(as.vector(Dsum))>qvec["70%","logD"]) |>
dplyr::rename("70th" = "Include")
D75pct <- Dtot |>
dplyr::filter(log(as.vector(Dsum))>qvec["75%","logD"]) |>
dplyr::rename("75th" = "Include")
D80pct <- Dtot |>
dplyr::filter(log(as.vector(Dsum))>qvec["80%","logD"]) |>
dplyr::rename("80th" = "Include")
qvec
## Include logD
## 60% 1 3.274264
## 65% 1 3.828164
## 70% 1 4.554015
## 75% 1 5.215182
## 80% 1 5.668452
Mapping the summed density below.
d.name <- moddirs[str_detect(moddirs, "herringlarvae_sepfeb")]
fit <- readRDS(paste0(d.name, "/fit.rds"))
#fit <- VAST::reload_model(fit) #added to try to make work after restart, no previous VAST run
years <- unique(fit$data_frame$t_i)
years <- c(min(years):max(years))
mdl <- FishStatsUtils::make_map_info(Region = fit$settings$Region,
spatial_list = fit$spatial_list,
Extrapolation_List = fit$extrapolation_list#,
#added to try to make work after restart, no previous VAST run
#Include = fit$extrapolation_list[["Area_km2_x"]] > 0 & rowSums(fit$extrapolation_list[["a_el"]]) > 0
)
gmap <- ggplot(data = ecodata::coast) +
geom_sf() +
#aes(x = lon, y = lat, group = group) +
#geom_polygon(fill="black", colour = "white") +
scale_color_viridis_c(option = "magma") + # now make this quantiles...
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.spacing.x=unit(0, "lines"),
panel.spacing.y=unit(0, "lines"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank() ) +
coord_sf(xlim=mdl$Xlim, ylim=mdl$Ylim)
g <- gmap +
geom_point(data=Dtot, aes(Lon, Lat, color=log(as.vector(Dsum)), group=NULL)
,
## These settings are necessary to avoid
## overlplotting which is a problem here. May need
## to be tweaked further.
size=1.5, stroke=0,shape=16
)
g
Mapping the quantiles from 60% (white) to 80% (light blue).
Quantiles below 70% include densities south of Long Island and off Cape Hatteras. The WG thinks the larvae found to the south may be lost to the population so we likely don’t want to use a quantile below the 70th percentile.
The 80th percentile starts to show a gap on Georges Bank and along the southwest coast of Maine. This may be slicing things too finely to cover general herring larval habitat.
I think this leaves us with using either the 70th (light green) or 75th (dark green) percentile of summed density over all years to define herring larvae relevant habitat for the small copepods index.
gq <- gmap +
geom_point(data=D60pct,
aes(x=Lon, y=Lat, color = "60th"),
# color="white"
#)
# ,
## These settings are necessary to avoid
## overlplotting which is a problem here. May need
## to be tweaked further.
size=1.5, stroke=0,shape=16
) +
geom_point(data=D65pct,
aes(x=Lon, y=Lat, color = "65th"),
#color="yellow"
#)
#,
## These settings are necessary to avoid
## overlplotting which is a problem here. May need
## to be tweaked further.
size=1.5, stroke=0,shape=16
) +
geom_point(data=D70pct,
aes(x=Lon, y=Lat, color = "70th"),
#color="green"
#)
#,
## These settings are necessary to avoid
## overlplotting which is a problem here. May need
## to be tweaked further.
size=1.5, stroke=0,shape=16
) +
geom_point(data=D75pct,
aes(x=Lon, y=Lat, color = "75th"),
#color="darkgreen"
#)
#,
## These settings are necessary to avoid
## overlplotting which is a problem here. May need
## to be tweaked further.
size=1.5, stroke=0,shape=16
) +
geom_point(data=D80pct,
aes(x=Lon, y=Lat, color = "80th"),
#color="lightblue"
#)
#,
## These settings are necessary to avoid
## overlplotting which is a problem here. May need
## to be tweaked further.
size=1.5, stroke=0,shape=16
) +
ggplot2::scale_color_manual(name = "Quantile",
breaks = c("60th", "65th", "70th", "75th", "80th"),
values = c("60th" = "white", "65th" = "yellow",
"70th" = "green", "75th" = "darkgreen",
"80th" = "lightblue") )
gq + ggplot2::theme_grey()
# another option, pull from fit$report Omega 1 and 2?
We selected the 70th percentile as the largest that did not include areas in the southern portion of the range to define the “herring larval area” for use in other zooplankton VAST models.
To make a new extrapolation grid using herring larval density 70th percentile, we first make the 70th percentile+ points into an sf
object by outlining the “concave hull” of the area, then intersect that object with the built in FishStatsUtils::northwest_atlantic_grid
.
Next, we define new strata based on that intersection and make a new extrapolation grid. Then we can use this grid and call the new strata as strata.limits when running the small copepods model.
See the full process here: https://noaa-edab.github.io/zooplanktonindex/HerringLarvaeExplore.html
Using the new grid, run the smallcopeALL_sepfeb dataset in VAST to get an index in the herring larvae area as well as in the others.
Script for doing this is https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/VASTscripts/VASTunivariate_zoopindex_smcopeALL_herrlarvarea.R
Sample sizes in each of the temporal categories are reported below. Definitions: “all” = January-December; “FALL” = July-December; “SPRING” = January-June; “fall” = September-November; “spring” = March-May; “summer” = June-August; “winter” = December-February; “nolarv” = March-August; “sepfeb” = September-February of the following year.
herringfood_stn <- readRDS(here::here("data/herringfood_stn_all_OISST.rds"))
# make SST column that uses surftemp unless missing or 0
herringfood_stn <- herringfood_stn |>
dplyr::mutate(sstfill = ifelse((is.na(sfc_temp)|sfc_temp==0), oisst, sfc_temp),
season_larv = month %in% c(1:2, 9:12)) |>
dplyr::filter(year>1981)
alltot <- herringfood_stn |>
dplyr::summarise(totstn = n(),
across(c(calfin_100m3,
lgcopeALL_100m3,
smallcopeALL_100m3,
cluhar_100m3,
euph_100m3,
volume_100m3
), ~ sum(.x != 0, na.rm = TRUE))) |>
dplyr::mutate(season = "all")
seasontot <- herringfood_stn |>
dplyr::group_by(season = season_ng) |>
dplyr::summarise(totstn = n(),
across(c(calfin_100m3,
lgcopeALL_100m3,
smallcopeALL_100m3,
cluhar_100m3,
euph_100m3,
volume_100m3
), ~ sum(.x != 0, na.rm = TRUE)))
season4tot <- herringfood_stn |>
dplyr::group_by(season = season_4) |>
dplyr::summarise(totstn = n(),
across(c(calfin_100m3,
lgcopeALL_100m3,
smallcopeALL_100m3,
cluhar_100m3,
euph_100m3,
volume_100m3
), ~ sum(.x != 0, na.rm = TRUE)))
larvtot <- herringfood_stn |>
dplyr::mutate(season = dplyr::if_else(season_larv, "sepfeb", "nolarv")) |>
dplyr::group_by(season) |>
dplyr::summarise(totstn = n(),
across(c(calfin_100m3,
lgcopeALL_100m3,
smallcopeALL_100m3,
cluhar_100m3,
euph_100m3,
volume_100m3
), ~ sum(.x != 0, na.rm = TRUE)))
samptab <- dplyr::bind_rows(alltot, seasontot, season4tot, larvtot) |>
dplyr::select(season,
stations = totstn,
zoopvol = volume_100m3,
largecope = lgcopeALL_100m3,
smallcope = smallcopeALL_100m3,
herringlarv = cluhar_100m3,
calanusfin = calfin_100m3,
euphausid = euph_100m3)
flextable::flextable(samptab) |>
flextable::set_caption("Number of survey stations 1982-2022 by season with number of postive samples by zooplankton species group.") |>
flextable::colformat_double(big.mark = "", digits = 0) |>
#flextable::autofit()
flextable::width(width = 1)
season | stations | zoopvol | largecope | smallcope | herringlarv | calanusfin | euphausid |
---|---|---|---|---|---|---|---|
all | 27,774 | 25,266 | 23,813 | 25,255 | 3,815 | 21,524 | 11,710 |
FALL | 13,249 | 12,935 | 12,081 | 12,946 | 2,765 | 10,355 | 5,906 |
SPRING | 14,525 | 12,331 | 11,732 | 12,309 | 1,050 | 11,169 | 5,804 |
fall | 8,745 | 8,558 | 8,012 | 8,557 | 2,134 | 6,526 | 4,154 |
spring | 7,739 | 6,549 | 6,176 | 6,540 | 227 | 5,972 | 3,360 |
summer | 5,810 | 5,471 | 5,134 | 5,488 | 25 | 4,939 | 2,566 |
winter | 5,480 | 4,688 | 4,491 | 4,670 | 1,429 | 4,087 | 1,630 |
nolarv | 13,549 | 12,020 | 11,310 | 12,028 | 252 | 10,911 | 5,926 |
sepfeb | 14,225 | 13,246 | 12,503 | 13,227 | 3,563 | 10,613 | 5,784 |
zoopvol <- moddirs[str_detect(moddirs, "zoopvol")]
zoopvolnocovs <- zoopvol[!str_detect(zoopvol, "_doy")]
for(d.name in zoopvolnocovs){
modpath <- unlist(str_split(d.name, pattern = "/"))
modname <- modpath[length(modpath)]
cat(modname, "\n")
cat(paste0("![](",d.name, "/Data_by_year.png)"))
cat("\n\n")
}
zoopvol_ann_500_biascorrect
zoopvol_fall_500_biascorrect
zoopvol_spring_500_biascorrect
Fall sampling for herring larvae was completed in most years aside from GLOBEC. In our definition of spring, herring larvae primarily occur in January and February. We now include a shifted season to better match larval herring availability throughout the time series: September - February. Year corresponds to September-December, and the following January and February are aligned with the previous year (hatch year) in these analyses. Data were missing for
for(d.name in moddirs[str_detect(moddirs, "herring")]){
modpath <- unlist(str_split(d.name, pattern = "/"))
modname <- modpath[length(modpath)]
cat(modname, "\n")
cat(paste0("![](",d.name, "/Data_by_year.png)"))
cat("\n\n")
}
herringlarvae_fall_500_biascorrect
herringlarvae_sepfeb_yrshift_500_biascorrect
herringlarvae_spring_500_biascorrect
We compared a preliminary annual scale VAST Calanus finmarchicus model output (no covariates or bias correction) with the annual anomalies currently included in the R ecodata
package (overview, methods). Estimation methods are completely different and spatial strata are slightly different, but the timing of high and low anomalies is well-aligned between the two methods for the region mainly occupied by both Atlantic herring and Calanus finmarchicus, Georges Bank (GB) and the Gulf of Maine (GOM).
stratlook <- data.frame(Stratum = c("Stratum_1",
"Stratum_2",
"Stratum_3",
"Stratum_4",
"Stratum_5",
"Stratum_6",
"Stratum_7"),
Region = c("AllEPU",
"her_sp",
"her_fa",
"MAB",
"GB",
"GOM",
"SS"))
stratlook2 <- data.frame(Stratum = c("Stratum_1",
"Stratum_2",
"Stratum_3",
"Stratum_4",
"Stratum_5",
"Stratum_6",
"Stratum_7",
"Stratum_8",
"Stratum_9"),
Region = c("AllEPU",
"her_sp",
"her_fa",
"her_larv",
"no_larv",
"MAB",
"GB",
"GOM",
"SS"))
source(here::here("R/utils_VASToutput.R")) #IEA/HerringRT_2025/
outdir <- here::here("pyindex_tests") #IEA/HerringRT_2025/
moddirs <- list.dirs(outdir)
moddirs <- moddirs[-1]
# keep folder name
modnames <- list.dirs(outdir, full.names = FALSE)
modselect <- modtable(moddirs)
# lets only look at indices for converged models
modcompare_conv <- modselect |>
dplyr::ungroup() |>
dplyr::filter(converged2 == "likely") |>
dplyr::select(modname) |>
as.vector() |>
unname() |>
unlist()
moddirs_conv <- moddirs[grepl(sprintf("\\.*(%s)$", paste(modcompare_conv, collapse = '|')), moddirs)]
modcompareindex <- purrr::map_dfr(moddirs_conv, purrr::possibly(getmodindex, otherwise = NULL))
splitoutput <- modcompareindex %>%
dplyr::mutate(Season = modname |> map(str_split, pattern = "_") |> map_chr(c(1,2))) %>%
dplyr::mutate(Data = modname |> map(str_split, pattern = "_") |> map_chr(c(1,1))) %>%
dplyr::mutate(Estimate = ifelse(Estimate == 0, NA, Estimate)) |>
dplyr::left_join(stratlook)
# only larvarea models have this strata set
splitoutput2 <- modcompareindex %>%
dplyr::filter(str_detect(modname, "larvarea")) |>
dplyr::mutate(Season = modname |> map(str_split, pattern = "_") |> map_chr(c(1,2))) %>%
dplyr::mutate(Data = modname |> map(str_split, pattern = "_") |> map_chr(c(1,1))) %>%
dplyr::mutate(Estimate = ifelse(Estimate == 0, NA, Estimate)) |>
dplyr::left_join(stratlook2)
zoomax <- max(splitoutput$Estimate, na.rm=T)
zootsmean <- splitoutput %>%
dplyr::group_by(modname, Region) %>%
dplyr::mutate(fmean = mean(Estimate, na.rm=T))
VASTmean <- splitoutput |>
dplyr::filter(Data %in% c("calfin"),
Season %in% c("annual"),
Region %in% c("MAB", "GB", "GOM")) |>
dplyr::group_by(Region, Season) |>
dplyr::summarise(tsmean = mean(Estimate))
VASTanom <- splitoutput |>
dplyr::filter(Data %in% c("calfin"),
Season %in% c("annual"),
Region %in% c("MAB", "GB", "GOM")) |>
dplyr::left_join(VASTmean) |>
dplyr::group_by(Region, Season) |>
dplyr::mutate(#Value = Value/resca,
Mean = as.numeric(Estimate),
SE = Std..Error.for.Estimate,
Mean = (Mean-tsmean)/tsmean,
SE = (SE-tsmean)/tsmean,
Upper = Mean + SE,
Lower = Mean - SE) |>
dplyr::rename(EPU = Region)
SOEcalfin <- ecodata::zoo_abundance_anom |>
dplyr::mutate(Value = as.numeric(Value)) |>
dplyr::filter(Var == "Calfin") |>
ggplot2::ggplot(aes(x = Time, y = Value)) +
ggplot2::geom_point(color="darkred") +
ggplot2::geom_line(color="darkred") +
ecodata::geom_gls() +
ggplot2::geom_line(data = VASTanom, aes(x=Time, y=Mean), color = "blue", size=1.5) +
ecodata::theme_facet() +
ggplot2::facet_wrap(~EPU)
SOEcalfin + ggplot2::ggtitle("Calanus finmarchicus annual anomaly: ecodata in dark red; VAST in blue")
GB = Georges Bank, GOM = Gulf of Maine, MAB = Mid Atlantic Bight.
Comparison of spatial definitions: ecodata anomalies use the dark red line while VAST models use the blue shaded polygons.
url <- "https://github.com/NOAA-EDAB/presentations/raw/master/docs/EDAB_images/EPU_Designations_Map.jpg"
knitr::include_graphics(url)
# from each output folder in pyindex,
outdir <- here::here("pyindex_modsel1")
moddirs <- list.dirs(outdir)
moddirs <- moddirs[-1]
# keep folder name
modnames <- list.dirs(outdir, full.names = FALSE)
# function to apply extracting info
getmodinfo <- function(d.name){
# read settings
modpath <- stringr::str_split(d.name, "/", simplify = TRUE)
modname <- modpath[length(modpath)]
settings <- read.table(file.path(d.name, "settings.txt"), comment.char = "",
fill = TRUE, header = FALSE)
n_x <- as.numeric(as.character(settings[(which(settings[,1]=="$n_x")+1),2]))
grid_size_km <- as.numeric(as.character(settings[(which(settings[,1]=="$grid_size_km")+1),2]))
max_cells <- as.numeric(as.character(settings[(which(settings[,1]=="$max_cells")+1),2]))
use_anisotropy <- as.character(settings[(which(settings[,1]=="$use_anisotropy")+1),2])
fine_scale <- as.character(settings[(which(settings[,1]=="$fine_scale")+1),2])
bias.correct <- as.character(settings[(which(settings[,1]=="$bias.correct")+1),2])
#FieldConfig
if(settings[(which(settings[,1]=="$FieldConfig")+1),1]=="Component_1"){
omega1 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+2),2])
omega2 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+3),1])
epsilon1 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+4),2])
epsilon2 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+5),1])
beta1 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+6),2])
beta2 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+7),1])
}
if(settings[(which(settings[,1]=="$FieldConfig")+1),1]=="Omega1"){
omega1 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+3),1])
omega2 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+4),1])
epsilon1 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+3),2])
epsilon2 <- as.character(settings[(which(settings[,1]=="$FieldConfig")+4),2])
beta1 <- "IID"
beta2 <- "IID"
}
#RhoConfig
rho_beta1 <- as.numeric(as.character(settings[(which(settings[,1]=="$RhoConfig")+3),1]))
rho_beta2 <- as.numeric(as.character(settings[(which(settings[,1]=="$RhoConfig")+3),2]))
rho_epsilon1 <- as.numeric(as.character(settings[(which(settings[,1]=="$RhoConfig")+4),1]))
rho_epsilon2 <- as.numeric(as.character(settings[(which(settings[,1]=="$RhoConfig")+4),2]))
# read parameter estimates, object is called parameter_Estimates
if(file.exists(file.path(d.name, "parameter_estimates.RData"))) {
load(file.path(d.name, "parameter_estimates.RData"))
AIC <- parameter_estimates$AIC[1]
converged <- parameter_estimates$Convergence_check[1]
fixedcoeff <- unname(parameter_estimates$number_of_coefficients[2])
randomcoeff <- unname(parameter_estimates$number_of_coefficients[3])
}else if(file.exists(file.path(d.name, "parameter_estimates.txt"))){
reptext <- readLines(file.path(d.name, "parameter_estimates.txt"))
AIC <- as.double(reptext[grep(reptext, pattern = "AIC")+1])
converged <- reptext[grep(reptext, pattern = "Convergence_check")+1]
fixedcoeff <- as.integer(stringr::str_split(reptext[grep(reptext, pattern = "number_of_coefficients")+2],
boundary("word"))[[1]][2])
randomcoeff <- as.integer(stringr::str_split(reptext[grep(reptext, pattern = "number_of_coefficients")+2],
boundary("word"))[[1]][3])
}else{
AIC <- NA_real_
converged <- NA_character_
fixedcoeff <- NA_integer_
randomcoeff <- NA_integer_
}
#index <- read.csv(file.path(d.name, "Index.csv"))
# return model attributes as a dataframe
out <- data.frame(modname = modname,
n_x = n_x,
grid_size_km = grid_size_km,
max_cells = max_cells,
use_anisotropy = use_anisotropy,
fine_scale = fine_scale,
bias.correct = bias.correct,
omega1 = omega1,
omega2 = omega2,
epsilon1 = epsilon1,
epsilon2 = epsilon2,
beta1 = beta1,
beta2 = beta2,
rho_epsilon1 = rho_epsilon1,
rho_epsilon2 = rho_epsilon2,
rho_beta1 = rho_beta1,
rho_beta2 = rho_beta2,
AIC = AIC,
converged = converged,
fixedcoeff = fixedcoeff,
randomcoeff = randomcoeff#,
#index = index
)
return(out)
}
modcompare <- purrr::map_dfr(moddirs, getmodinfo)
Models compared using REML are identified by model name (“modname” in Table Sb??) which always includes all prey aggregated, season (“all” for annual models of months 1-12, “fall” for models of months 7-12, and “spring” for models of months 1-6), number of knots (500 for all models), and which fixed and random spatial and spatio-temporal effects were included in which linear predictor (1 or 2). The names for model options and associated VAST model settings are:
Model selection 1 (spatial, spatio-temporal effects, no covariates) options (following Ng et al. (2021)) and naming:
* All models set Use_REML = TRUE in fit_model specifications.
* Modeled effects, model name suffix, and VAST settings by model:
use_anisotopy = FALSE
use_anisotopy = FALSE
use_anisotopy = FALSE
use_anisotopy = FALSE
modselect <- modcompare |>
dplyr::mutate(season = dplyr::case_when(stringr::str_detect(modname, "_fall_") ~ "Fall",
stringr::str_detect(modname, "spring") ~ "Spring",
stringr::str_detect(modname, "_all_") ~ "Annual",
TRUE ~ as.character(NA))) |>
dplyr::mutate(converged2 = dplyr::case_when(stringr::str_detect(converged, "no evidence") ~ "likely",
stringr::str_detect(converged, "is likely not") ~ "unlikely",
TRUE ~ as.character(NA))) |>
dplyr::mutate(copegroup = stringr::str_extract(modname, "[^_]+")) |>
#dplyr::mutate(modname = str_extract(modname, '(?<=allagg_).*')) |>
dplyr::group_by(copegroup, season) |>
dplyr::mutate(deltaAIC = AIC-min(AIC)) |>
dplyr::select(copegroup, modname, season, deltaAIC, fixedcoeff,
randomcoeff, use_anisotropy,
omega1, omega2, epsilon1, epsilon2,
beta1, beta2, AIC, converged2) |>
dplyr::arrange(copegroup, season, AIC)
# DT::datatable(modselect, rownames = FALSE,
# options= list(pageLength = 25, scrollX = TRUE),
# caption = "Comparison of delta AIC values using Restricted Maxiumum Likelihood (REML) for alternative fixed and random effects model structures. See text for model descriptions.")
flextable::flextable(modselect |>
dplyr::select(-c(use_anisotropy,
omega1, omega2, epsilon1, epsilon2,
beta1, beta2))) |>
flextable::set_header_labels(copegroup = "Copepod group",
modname = "Model name",
season = "Season",
deltaAIC = "dAIC",
fixedcoeff = "N fixed",
randomcoeff = "N random",
converged2 = "Convergence") |>
flextable::set_caption("Comparison of delta AIC (dAIC) values using Restricted Maxiumum Likelihood (REML) for alternative fixed and random effects model structures, with number of fixed (N fixed) and random (N random) coefficients. See text for model descriptions associated with each model name.") |>
flextable::fontsize(size = 9, part = "all") |>
flextable::colformat_double(digits = 2) |>
flextable::set_table_properties(layout = "autofit", width = 1)
Copepod group | Model name | Season | dAIC | N fixed | N random | AIC | Convergence |
---|---|---|---|---|---|---|---|
calfin | calfin_fall_500_alleffectson | Fall | 0.00 | 9 | 46,616 | 208,373.00 | likely |
calfin | calfin_fall_500_noaniso | Fall | 79.25 | 7 | 46,616 | 208,452.25 | likely |
calfin | calfin_fall_500_noomeps2 | Fall | 1,753.04 | 6 | 23,348 | 210,126.05 | likely |
calfin | calfin_fall_500_noomeps2_noaniso | Fall | 1,814.90 | 4 | 23,348 | 210,187.90 | likely |
calfin | calfin_fall_500_noomeps2_noeps1 | Fall | 3,474.74 | 5 | 634 | 211,847.75 | likely |
calfin | calfin_fall_500_noomeps2_noeps1_noaniso | Fall | 3,501.57 | 3 | 634 | 211,874.57 | likely |
calfin | calfin_fall_500_noomeps12 | Fall | 9,589.69 | 1 | 80 | 217,962.69 | likely |
calfin | calfin_fall_500_noomeps12_noaniso | Fall | 9,589.69 | 1 | 80 | 217,962.69 | likely |
calfin | calfin_spring_500_alleffectson | Spring | 0.00 | 9 | 46,534 | 239,281.74 | likely |
calfin | calfin_spring_500_noaniso | Spring | 48.35 | 7 | 46,534 | 239,330.09 | likely |
calfin | calfin_spring_500_noomeps2 | Spring | 864.36 | 6 | 23,308 | 240,146.10 | likely |
calfin | calfin_spring_500_noomeps2_noaniso | Spring | 889.01 | 4 | 23,308 | 240,170.75 | likely |
calfin | calfin_spring_500_noomeps2_noeps1 | Spring | 2,325.90 | 5 | 635 | 241,607.64 | likely |
calfin | calfin_spring_500_noomeps2_noeps1_noaniso | Spring | 2,339.84 | 3 | 635 | 241,621.59 | likely |
calfin | calfin_spring_500_noomeps12 | Spring | 6,517.46 | 1 | 82 | 245,799.20 | likely |
calfin | calfin_spring_500_noomeps12_noaniso | Spring | 6,517.46 | 1 | 82 | 245,799.20 | likely |
lgcopeALL | lgcopeALL_fall_500_alleffectson | Fall | 0.00 | 9 | 46,616 | 251,030.89 | likely |
lgcopeALL | lgcopeALL_fall_500_noaniso | Fall | 100.03 | 7 | 46,616 | 251,130.92 | likely |
lgcopeALL | lgcopeALL_fall_500_noomeps2 | Fall | 1,496.48 | 6 | 23,348 | 252,527.36 | likely |
lgcopeALL | lgcopeALL_fall_500_noomeps2_noaniso | Fall | 1,588.70 | 4 | 23,348 | 252,619.58 | likely |
lgcopeALL | lgcopeALL_fall_500_noomeps2_noeps1 | Fall | 3,572.64 | 5 | 634 | 254,603.53 | likely |
lgcopeALL | lgcopeALL_fall_500_noomeps2_noeps1_noaniso | Fall | 3,610.16 | 3 | 634 | 254,641.04 | likely |
lgcopeALL | lgcopeALL_fall_500_noomeps12 | Fall | 7,346.16 | 1 | 80 | 258,377.05 | likely |
lgcopeALL | lgcopeALL_fall_500_noomeps12_noaniso | Fall | 7,346.16 | 1 | 80 | 258,377.05 | likely |
lgcopeALL | lgcopeALL_spring_500_alleffectson | Spring | 0.00 | 9 | 46,450 | 257,499.63 | likely |
lgcopeALL | lgcopeALL_spring_500_noaniso | Spring | 78.55 | 7 | 46,450 | 257,578.17 | likely |
lgcopeALL | lgcopeALL_spring_500_noomeps2 | Spring | 1,746.74 | 6 | 23,266 | 259,246.36 | likely |
lgcopeALL | lgcopeALL_spring_500_noomeps2_noaniso | Spring | 1,798.07 | 4 | 23,266 | 259,297.70 | likely |
lgcopeALL | lgcopeALL_spring_500_noomeps2_noeps1 | Spring | 4,612.34 | 5 | 634 | 262,111.96 | likely |
lgcopeALL | lgcopeALL_spring_500_noomeps2_noeps1_noaniso | Spring | 4,622.66 | 3 | 634 | 262,122.28 | likely |
lgcopeALL | lgcopeALL_spring_500_noomeps12 | Spring | 7,906.90 | 1 | 82 | 265,406.53 | likely |
lgcopeALL | lgcopeALL_spring_500_noomeps12_noaniso | Spring | 7,906.90 | 1 | 82 | 265,406.53 | likely |
smallcopeALL | smallcopeALL_fall_500_alleffectson | Fall | 0.00 | 9 | 46,607 | 303,280.90 | likely |
smallcopeALL | smallcopeALL_fall_500_noaniso | Fall | 47.18 | 7 | 46,607 | 303,328.07 | likely |
smallcopeALL | smallcopeALL_fall_500_noomeps2 | Fall | 2,231.62 | 6 | 23,339 | 305,512.51 | likely |
smallcopeALL | smallcopeALL_fall_500_noomeps2_noaniso | Fall | 2,265.50 | 4 | 23,339 | 305,546.39 | likely |
smallcopeALL | smallcopeALL_fall_500_noomeps2_noeps1 | Fall | 4,501.12 | 5 | 625 | 307,782.01 | likely |
smallcopeALL | smallcopeALL_fall_500_noomeps2_noeps1_noaniso | Fall | 4,508.11 | 3 | 625 | 307,789.01 | likely |
smallcopeALL | smallcopeALL_fall_500_noomeps12 | Fall | 8,563.21 | 1 | 71 | 311,844.11 | likely |
smallcopeALL | smallcopeALL_fall_500_noomeps12_noaniso | Fall | 8,563.21 | 1 | 71 | 311,844.11 | likely |
smallcopeALL | smallcopeALL_spring_500_alleffectson | Spring | 0.00 | 9 | 46,441 | 282,327.03 | likely |
smallcopeALL | smallcopeALL_spring_500_noaniso | Spring | 56.67 | 7 | 46,441 | 282,383.71 | likely |
smallcopeALL | smallcopeALL_spring_500_noomeps2 | Spring | 3,961.69 | 6 | 23,257 | 286,288.72 | likely |
smallcopeALL | smallcopeALL_spring_500_noomeps2_noaniso | Spring | 4,004.70 | 4 | 23,257 | 286,331.73 | likely |
smallcopeALL | smallcopeALL_spring_500_noomeps2_noeps1_noaniso | Spring | 7,586.19 | 3 | 625 | 289,913.22 | likely |
smallcopeALL | smallcopeALL_spring_500_noomeps2_noeps1 | Spring | 7,586.74 | 5 | 625 | 289,913.78 | likely |
smallcopeALL | smallcopeALL_spring_500_noomeps12 | Spring | 13,390.23 | 1 | 73 | 295,717.26 | likely |
smallcopeALL | smallcopeALL_spring_500_noomeps12_noaniso | Spring | 13,390.23 | 1 | 73 | 295,717.26 | likely |
smallcopeSOE | smallcopeSOE_fall_500_alleffectson | Fall | 0.00 | 9 | 46,616 | 290,760.56 | likely |
smallcopeSOE | smallcopeSOE_fall_500_noaniso | Fall | 61.29 | 7 | 46,616 | 290,821.85 | likely |
smallcopeSOE | smallcopeSOE_fall_500_noomeps2 | Fall | 2,338.60 | 6 | 23,348 | 293,099.16 | likely |
smallcopeSOE | smallcopeSOE_fall_500_noomeps2_noaniso | Fall | 2,380.30 | 4 | 23,348 | 293,140.86 | likely |
smallcopeSOE | smallcopeSOE_fall_500_noomeps2_noeps1 | Fall | 4,708.61 | 5 | 634 | 295,469.17 | likely |
smallcopeSOE | smallcopeSOE_fall_500_noomeps2_noeps1_noaniso | Fall | 4,720.67 | 3 | 634 | 295,481.22 | likely |
smallcopeSOE | smallcopeSOE_fall_500_noomeps12 | Fall | 8,233.60 | 1 | 80 | 298,994.16 | likely |
smallcopeSOE | smallcopeSOE_fall_500_noomeps12_noaniso | Fall | 8,233.60 | 1 | 80 | 298,994.16 | likely |
smallcopeSOE | smallcopeSOE_spring_500_alleffectson | Spring | 0.00 | 9 | 46,445 | 278,928.98 | likely |
smallcopeSOE | smallcopeSOE_spring_500_noaniso | Spring | 72.09 | 7 | 46,445 | 279,001.07 | likely |
smallcopeSOE | smallcopeSOE_spring_500_noomeps2 | Spring | 4,100.24 | 6 | 23,261 | 283,029.22 | likely |
smallcopeSOE | smallcopeSOE_spring_500_noomeps2_noaniso | Spring | 4,157.09 | 4 | 23,261 | 283,086.06 | likely |
smallcopeSOE | smallcopeSOE_spring_500_noomeps2_noeps1_noaniso | Spring | 7,636.58 | 3 | 629 | 286,565.55 | likely |
smallcopeSOE | smallcopeSOE_spring_500_noomeps2_noeps1 | Spring | 7,638.62 | 5 | 629 | 286,567.60 | likely |
smallcopeSOE | smallcopeSOE_spring_500_noomeps12 | Spring | 12,907.54 | 1 | 77 | 291,836.51 | likely |
smallcopeSOE | smallcopeSOE_spring_500_noomeps12_noaniso | Spring | 12,907.54 | 1 | 77 | 291,836.51 | likely |
Using REML, models including spatial and spatio-temporal random effects as well as anisotropy were best supported by the data. This was true for the spring datasets and the fall datasets for calfin, lgcopeALL, smallcopeALL, and smallcopeSOE.
However, the small copepod fall model had the spatial random effects parameter for encounter rate, \(\omega_1\), approach 0. We allowed it to be estimated at 0 rather than fixing it at 0. Resulting indices were identical.
Full outputs from stage 1 model selection are posted to google drive here.
Including the day of year covariate had mixed success. While some models fit better with this covariate, others did not meet the convergence criteria. Therefore, we moved forward with the best fitting model that converged to generate indices.
Full results from the best fitting models are posted to google drive here
Only spring large zooplankton did better with a day of year catchability covariate among the models related to herring recruitment.
# from each output folder in pyindex,
outdir <- here::here("pyindex")
moddirs <- list.dirs(outdir)
moddirs <- moddirs[-1]
# keep folder name
modnames <- list.dirs(outdir, full.names = FALSE)
modcompare <- purrr::map_dfr(moddirs, getmodinfo)
modselect <- modcompare |>
dplyr::mutate(season = dplyr::case_when(stringr::str_detect(modname, "_fall_") ~ "Fall",
stringr::str_detect(modname, "spring") ~ "Spring",
stringr::str_detect(modname, "_ann_") ~ "Annual",
stringr::str_detect(modname, "sepfeb") ~ "SepFeb",
TRUE ~ as.character(NA))) |>
dplyr::mutate(converged2 = dplyr::case_when(stringr::str_detect(converged, "no evidence") ~ "likely",
stringr::str_detect(converged, "is likely not") ~ "unlikely",
TRUE ~ as.character(NA))) |>
dplyr::mutate(copegroup = stringr::str_extract(modname, "[^_]+")) |>
#dplyr::mutate(modname = str_extract(modname, '(?<=allagg_).*')) |>
dplyr::group_by(copegroup, season) |>
dplyr::mutate(deltaAIC = AIC-min(AIC)) |>
dplyr::select(copegroup, modname, season, deltaAIC, fixedcoeff,
randomcoeff,
omega1, omega2, #epsilon1, epsilon2,
#beta1, beta2, use_anisotropy,
AIC, converged2) |>
dplyr::arrange(copegroup, season, AIC)
flextable::flextable(modselect) %>%
#dplyr::select(-c(use_anisotropy,
#omega1, omega2, epsilon1, epsilon2,
#beta1, beta2))
#) %>%
flextable::set_header_labels(modname = "Model name",
season = "Season",
#deltaAIC = "dAIC",
fixedcoeff = "N fixed",
randomcoeff = "N random",
omega2 = "All other effects",
converged2 = "Convergence") |>
#flextable::set_caption("Comparison of delta AIC (dAIC) values using Restricted Maxiumum Likelihood (REML) for alternative fixed and random effects model structures, with number of fixed (N fixed) and random (N random) coefficients. See text for model descriptions associated with each model name.") %>%
flextable::fontsize(size = 9, part = "all") %>%
flextable::colformat_double(digits = 2) |>
flextable::set_table_properties(layout = "autofit", width = 1)
copegroup | Model name | Season | deltaAIC | N fixed | N random | omega1 | All other effects | AIC | Convergence |
---|---|---|---|---|---|---|---|---|---|
calfin | calfin_ann_500_biascorrect_doy | Annual | 0.00 | 93 | 46,536 | IID | IID | 453,074.91 | unlikely |
calfin | calfin_fall_500_biascorrect_doy | Fall | 0.00 | 91 | 46,536 | IID | IID | 206,053.34 | unlikely |
calfin | calfin_fall_500_biascorrect | Fall | 2,405.54 | 89 | 46,536 | IID | IID | 208,458.88 | likely |
calfin | calfin_spring_500_biascorrect | Spring | 91 | 46,452 | IID | IID | 239,403.84 | likely | |
calfin | calfin_spring_500_biascorrect_doy | Spring | IID | IID | |||||
euph | euph_ann_500_biascorrect_doy | Annual | 0.00 | 93 | 46,536 | IID | IID | 205,163.31 | unlikely |
euph | euph_ann_500_biascorrect | Annual | 45.86 | 91 | 46,536 | IID | IID | 205,209.17 | likely |
euph | euph_fall_500_biascorrect_doy | Fall | 0.00 | 91 | 46,536 | IID | IID | 100,475.13 | unlikely |
euph | euph_fall_500_biascorrect | Fall | 96.55 | 89 | 46,536 | IID | IID | 100,571.67 | likely |
euph | euph_spring_500_biascorrect_doy | Spring | 0.00 | 93 | 46,452 | IID | IID | 101,850.11 | likely |
euph | euph_spring_500_biascorrect | Spring | 819.63 | 91 | 46,452 | IID | IID | 102,669.73 | likely |
herringlarvae | herringlarvae_fall_500_biascorrect | Fall | 0.00 | 79 | 46,536 | IID | IID | 31,298.46 | likely |
herringlarvae | herringlarvae_sepfeb_yrshift_500_biascorrect | SepFeb | 0.00 | 79 | 46,620 | IID | IID | 38,202.32 | likely |
herringlarvae | herringlarvae_spring_500_biascorrect | Spring | 0.00 | 73 | 46,368 | IID | IID | 9,683.77 | likely |
lgcopeALL | lgcopeALL_ann_500_biascorrect_doy | Annual | 0.00 | 93 | 46,368 | IID | IID | 515,161.55 | unlikely |
lgcopeALL | lgcopeALL_fall_500_biascorrect_doy | Fall | 0.00 | 91 | 46,536 | IID | IID | 250,277.08 | unlikely |
lgcopeALL | lgcopeALL_fall_500_biascorrect | Fall | 824.41 | 89 | 46,536 | IID | IID | 251,101.49 | likely |
lgcopeALL | lgcopeALL_spring_500_biascorrect_doy | Spring | 0.00 | 93 | 46,368 | IID | IID | 255,470.75 | likely |
lgcopeALL | lgcopeALL_spring_500_biascorrect | Spring | 2,175.71 | 91 | 46,368 | IID | IID | 257,646.46 | likely |
smallcopeALL | smallcopeALL_ann_500_biascorrect_doy | Annual | 0.00 | 90 | 46,368 | IID | IID | 593,223.14 | unlikely |
smallcopeALL | smallcopeALL_fall_500_biascorrect_noom1_doy | Fall | 81 | 45,982 | 0 | IID | 303,199.76 | unlikely | |
smallcopeALL | smallcopeALL_fall_500_biascorrect_noom1 | Fall | 79 | 45,982 | 0 | IID | 303,358.72 | likely | |
smallcopeALL | smallcopeALL_fall_500_biascorrect | Fall | 80 | 46,536 | IID | IID | 303,360.72 | likely | |
smallcopeALL | smallcopeALL_fall_500_biascorrect_fail | Fall | IID | IID | |||||
smallcopeALL | smallcopeALL_sepfeb_yrshift_500_larvarea_biascorrect | SepFeb | 0.00 | 84 | 46,452 | IID | IID | 306,107.29 | likely |
smallcopeALL | smallcopeALL_sepfeb_yrshift_500_larvarea_biascorrect_oldarea | SepFeb | 8.72 | 84 | 46,452 | IID | IID | 306,116.01 | likely |
smallcopeALL | smallcopeALL_sepfeb_yrshift_500_biascorrect | SepFeb | 18.99 | 84 | 46,620 | IID | IID | 306,126.28 | likely |
smallcopeALL | smallcopeALL_spring_500_biascorrect_doy | Spring | 0.00 | 84 | 46,368 | IID | IID | 281,080.63 | unlikely |
smallcopeALL | smallcopeALL_spring_500_biascorrect | Spring | 1,385.07 | 82 | 46,368 | IID | IID | 282,465.70 | likely |
smallcopeSOE | smallcopeSOE_ann_500_biascorrect_doy | Annual | 0.00 | 93 | 46,368 | IID | IID | 577,717.71 | unlikely |
smallcopeSOE | smallcopeSOE_fall_500_biascorrect_doy | Fall | 0.00 | 91 | 46,536 | IID | IID | 290,576.94 | unlikely |
smallcopeSOE | smallcopeSOE_fall_500_biascorrect | Fall | 307.39 | 89 | 46,536 | IID | IID | 290,884.33 | likely |
smallcopeSOE | smallcopeSOE_spring_500_biascorrect_doy | Spring | 0.00 | 88 | 46,368 | IID | IID | 277,659.94 | likely |
smallcopeSOE | smallcopeSOE_spring_500_biascorrect | Spring | 1,418.56 | 86 | 46,368 | IID | IID | 279,078.51 | likely |
zoopvol | zoopvol_ann_500_biascorrect_doy | Annual | 0.00 | 52 | 46,536 | IID | IID | 226,109.45 | unlikely |
zoopvol | zoopvol_ann_500_biascorrect | Annual | 77.87 | 50 | 46,536 | IID | IID | 226,187.32 | likely |
zoopvol | zoopvol_fall_500_biascorrect | Fall | 49 | 46,536 | IID | IID | 111,955.59 | likely | |
zoopvol | zoopvol_fall_500_biascorrect_doy | Fall | IID | IID | |||||
zoopvol | zoopvol_spring_500_biascorrect_doy | Spring | 0.00 | 52 | 46,452 | IID | IID | 107,636.22 | unlikely |
zoopvol | zoopvol_spring_500_biascorrect | Spring | 2,644.91 | 50 | 46,452 | IID | IID | 110,281.13 | likely |
Although models were run in multiple seasons and at multiple spatial scales, we focus on results from three indices that were evaluated as covariates in the Atlantic herring assessment model. Based on the results of the boosted regression tree analysis (Molina 2024), the first two indices are large copepods in spring (influencing growth of herring postlarvae and juveniles), and small copeopods in fall (influencing survival of herring larvae over the winter). The spatial scale of these indices is the Atlantic herring bottom trawl survey strata in spring and fall, respectively. The third index is more refined to focus on herring larvae, using temporal and spatial bounds definerd by herring larvae observations in the zooplankton dataset. This third index focuses on small copepods from September-February in the area where more than 70% of herring larval density was identified between 1982-2022.
# lets only look at indices for converged models
stratlook <- data.frame(Stratum = c("Stratum_1",
"Stratum_2",
"Stratum_3",
"Stratum_4",
"Stratum_5",
"Stratum_6",
"Stratum_7"),
Region = c("AllEPU",
"her_sp",
"her_fa",
"MAB",
"GB",
"GOM",
"SS"))
stratlook2 <- data.frame(Stratum = c("Stratum_1",
"Stratum_2",
"Stratum_3",
"Stratum_4",
"Stratum_5",
"Stratum_6",
"Stratum_7",
"Stratum_8",
"Stratum_9"),
Region = c("AllEPU",
"her_sp",
"her_fa",
"her_larv",
"no_larv",
"MAB",
"GB",
"GOM",
"SS"))
modcompare_conv <- modselect |>
dplyr::ungroup() |>
dplyr::filter(converged2 == "likely") |>
dplyr::select(modname) |>
as.vector() |>
unname() |>
unlist()
# or only converged and best AIC
modcompare_best <- modselect |>
dplyr::ungroup() |>
dplyr::filter(converged2 == "likely") |>
dplyr::distinct(copegroup, season, .keep_all = T) |> # assumes orderd by dAIC!
dplyr::select(modname) |>
as.vector() |>
unname() |>
unlist()
moddirs_conv <- moddirs[grepl(sprintf("\\.*(%s)$", paste(modcompare_conv, collapse = '|')), moddirs)]
moddirs_best <- moddirs[grepl(sprintf("\\.*(%s)$", paste(modcompare_best, collapse = '|')), moddirs)]
modcompareindex <- purrr::map_dfr(moddirs_best, purrr::possibly(getmodindex, otherwise = NULL))
splitoutput <- modcompareindex %>%
dplyr::mutate(Season = modname |> map(str_split, pattern = "_") |> map_chr(c(1,2))) %>%
dplyr::mutate(Data = modname |> map(str_split, pattern = "_") |> map_chr(c(1,1))) %>%
dplyr::mutate(Estimate = ifelse(Estimate == 0, NA, Estimate)) |>
dplyr::left_join(stratlook)
# only larvarea models have this strata set
splitoutput2 <- modcompareindex %>%
dplyr::filter(str_detect(modname, "larvarea")) |>
dplyr::mutate(Season = modname |> map(str_split, pattern = "_") |> map_chr(c(1,2))) %>%
dplyr::mutate(Data = modname |> map(str_split, pattern = "_") |> map_chr(c(1,1))) %>%
dplyr::mutate(Estimate = ifelse(Estimate == 0, NA, Estimate)) |>
dplyr::left_join(stratlook2)
zoomax <- max(splitoutput$Estimate, na.rm=T)
zootsmean <- splitoutput %>%
dplyr::group_by(modname, Region) %>%
dplyr::mutate(fmean = mean(Estimate, na.rm=T))
Spring indices for large copepods in the herring spring survey strata (her_sp), on Georges Bank (GB), and in the Gulf of Maine (GOM).
plot_zooindices(splitoutput = splitoutput |> dplyr::filter(Season == "spring"),
plotdata = "lgcopeALL",
plotregions = c("her_sp", "GB", "GOM"), #"AllEPU", "SS"
plottitle = "Large copeopods")
VAST-estimated density of large copepods
knitr::include_graphics(here::here("pyindex/lgcopeALL_spring_500_biascorrect_doy/ln_density-predicted.png"))
Spring indices for small copepods in the herring fall survey strata (her_fa), on Georges Bank (GB), and in the Gulf of Maine (GOM).
plot_zooindices(splitoutput = splitoutput |> dplyr::filter(Season == "fall"),
plotdata = "smallcopeALL",
plotregions = c("her_fa", "GB", "GOM"), #"AllEPU", "SS"
plottitle = "small copeopods")
VAST-estimated density of small copepods
knitr::include_graphics(here::here("pyindex/smallcopeALL_fall_500_biascorrect/ln_density-predicted.png"))
Small copepods available to larvae
plot_zooindices(splitoutput = splitoutput2,
plotdata = "smallcopeALL",
plotregions = c("her_larv", "her_fa"),
plottitle = "Small copepods Sept-Feb") +
ggplot2::theme(legend.position="none")