This gives an overview of the source data. Data are processed for input into VAST in a script linked in the next section.
Read in the file obtained from Scott Large, Harvey Walsh, or whoever has posted it to a Google Drive and confirmed that it is the most recent data. This is the part that requires automation. Everything downstream of here depends on this file being in the same format as the one used in 2024, and on it being updated by Harvey.
THE FILE READ IN HERE IS AN OLD FILE AND SHOULD NOT BE USED AS VAST INPUT
ecomonall <- readr::read_csv("data/EcoMon_Plankton_Data_v3_8.csv")
## Rows: 32693 Columns: 289
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): cruise_name, zoo_gear, ich_gear, date
## dbl (284): station, lat, lon, depth, sfc_temp, sfc_salt, btm_temp, btm_salt...
## time (1): time
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
names(ecomonall)
## [1] "cruise_name" "station" "zoo_gear"
## [4] "ich_gear" "lat" "lon"
## [7] "date" "time" "depth"
## [10] "sfc_temp" "sfc_salt" "btm_temp"
## [13] "btm_salt" "volume_1m2" "ctyp_10m2"
## [16] "calfin_10m2" "pseudo_10m2" "penilia_10m2"
## [19] "tlong_10m2" "cham_10m2" "echino_10m2"
## [22] "larvaceans_10m2" "para_10m2" "gas_10m2"
## [25] "acarspp_10m2" "mlucens_10m2" "evadnespp_10m2"
## [28] "salps_10m2" "oithspp_10m2" "cirr_10m2"
## [31] "chaeto_10m2" "hyper_10m2" "gam_10m2"
## [34] "evadnord_10m2" "calminor_10m2" "copepoda_10m2"
## [37] "clauso_10m2" "dec_10m2" "euph_10m2"
## [40] "prot_10m2" "acarlong_10m2" "euc_10m2"
## [43] "pel_10m2" "poly_10m2" "podon_10m2"
## [46] "fish_10m2" "bry_10m2" "fur_10m2"
## [49] "calspp_10m2" "oncaea_10m2" "cory_10m2"
## [52] "ost_10m2" "tstyl_10m2" "oithspin_10m2"
## [55] "mysids_10m2" "temspp_10m2" "tort_10m2"
## [58] "paraspp_10m2" "scyphz_10m2" "anthz_10m2"
## [61] "siph_10m2" "hydrom_10m2" "coel_10m2"
## [64] "ctenop_10m2" "euph1_10m2" "thysin_10m2"
## [67] "megan_10m2" "thysra_10m2" "thyslo_10m2"
## [70] "eupham_10m2" "euphkr_10m2" "euphspp_10m2"
## [73] "thysgr_10m2" "nemaspp_10m2" "stylspp_10m2"
## [76] "stylel_10m2" "nemame_10m2" "thysspp_10m2"
## [79] "shysac_10m2" "thypsp_10m2" "nemabo_10m2"
## [82] "thecos_10m2" "spirre_10m2" "spirhe_10m2"
## [85] "spirin_10m2" "spirtr_10m2" "spirspp_10m2"
## [88] "clispp_10m2" "crevir_10m2" "diatri_10m2"
## [91] "clicus_10m2" "clipyr_10m2" "cavunc_10m2"
## [94] "cavinf_10m2" "cavlon_10m2" "stysub_10m2"
## [97] "spirbu_10m2" "crespp_10m2" "cavspp_10m2"
## [100] "cavoli_10m2x" "gymnos_10m2" "pnespp_10m2"
## [103] "paedol_10m2" "clilim_10m2" "pnepau_10m2"
## [106] "volume_100m3" "ctyp_100m3" "calfin_100m3"
## [109] "pseudo_100m3" "penilia_100m3" "tlong_100m3"
## [112] "cham_100m3" "echino_100m3" "larvaceans_100m3"
## [115] "para_100m3" "gas_100m3" "acarspp_100m3"
## [118] "mlucens_100m3" "evadnespp_100m3" "salps_100m3"
## [121] "oithspp_100m3" "cirr_100m3" "chaeto_100m3"
## [124] "hyper_100m3" "gam_100m3" "evadnord_100m3"
## [127] "calminor_100m3" "copepoda_100m3" "clauso"
## [130] "dec_100m3" "euph_100m3" "prot_100m3"
## [133] "acarlong_100m3" "euc_100m3" "pel_100m3"
## [136] "poly_100m3" "podon_100m3" "fish_100m3"
## [139] "bry_100m3" "fur_100m3" "calspp_100m3"
## [142] "oncaea_100m3" "cory_100m3" "ost_100m3"
## [145] "tstyl_100m3" "oithspin_100m3" "mysids_100m3"
## [148] "temspp_100m3" "tort_100m3" "paraspp_100m3"
## [151] "scyphz_100m3" "anthz_100m3" "siph_100m3"
## [154] "hydrom_100m3" "coel_100m3" "ctenop_100m3"
## [157] "euph1_100m3" "thysin_100m3" "megan_100m3"
## [160] "thysra_100m3" "thyslo_100m3" "eupham_100m3"
## [163] "euphkr_100m3" "euphspp_100m3" "thysgr_100m3"
## [166] "nemaspp_100m3" "stylspp_100m3" "stylel_100m3"
## [169] "nemame_100m3" "thysspp_100m3" "shysac_100m3"
## [172] "thypsp_100m3" "nemabo_100m3" "thecos_100m3"
## [175] "spirre_100m3" "spirhe_100m3" "spirin_100m3"
## [178] "spirtr_100m3" "spirspp_100m3" "clispp_100m3"
## [181] "crevir_100m3" "diatri_100m3" "clicus_100m3"
## [184] "clipyr_100m3" "cavunc_100m3" "cavinf_100m3"
## [187] "cavlon_100m3" "stysub_100m3" "spirbu_100m3"
## [190] "crespp_100m3" "cavspp_100m3" "cavoli_100m3x"
## [193] "gymnos_100m3" "pnespp_100m3" "paedol_100m3"
## [196] "clilim_100m3" "pnepau_100m3" "nofish_10m2"
## [199] "bretyr_10m2" "cluhar_10m2" "cycspp_10m2"
## [202] "diaspp_10m2" "cermad_10m2" "benspp_10m2"
## [205] "urospp_10m2" "enccim_10m2" "gadmor_10m2"
## [208] "melaeg_10m2" "polvir_10m2" "meralb_10m2"
## [211] "merbil_10m2" "centstr_10m2" "pomsal_10m2"
## [214] "cynreg_10m2" "leixan_10m2" "menspp_10m2"
## [217] "micund_10m2" "tauads_10m2" "tauoni_10m2"
## [220] "auxspp_10m2" "scosco_10m2" "pepspp_10m2"
## [223] "sebspp_10m2" "prispp_10m2" "myoaen_10m2"
## [226] "myooct_10m2" "ammspp_10m2" "phogun_10m2"
## [229] "ulvsub_10m2" "anaspp_10m2" "citarc_10m2"
## [232] "etrspp_10m2" "syaspp_10m2" "botspp_10m2"
## [235] "hipobl_10m2" "parden_10m2" "pseame_10m2"
## [238] "hippla_10m2" "limfer_10m2" "glycyn_10m2"
## [241] "scoaqu_10m2" "sypspp_10m2" "lopame_10m2"
## [244] "nofish_100m3" "bretyr_100m3" "cluhar_100m3"
## [247] "cycspp_100m3" "diaspp_100m3" "cermad_100m3"
## [250] "benspp_100m3" "urospp_100m3" "enccim_100m3"
## [253] "gadmor_100m3" "melaeg_100m3" "polvir_100m3"
## [256] "meralb_100m3" "merbil_100m3" "centstr_100m3"
## [259] "pomsal_100m3" "cynreg_100m3" "leixan_100m3"
## [262] "menspp_100m3" "micund_100m3" "tauads_100m3"
## [265] "tauoni_100m3" "auxspp_100m3" "scosco_100m3"
## [268] "pepspp_100m3" "sebspp_100m3" "prispp_100m3"
## [271] "myoaen_100m3" "myooct_100m3" "ammspp_100m3"
## [274] "phogun_100m3" "ulvsub_100m3" "anaspp_100m3"
## [277] "citarc_100m3" "etrspp_100m3" "syaspp_100m3"
## [280] "botspp_100m3" "hipobl_100m3" "parden_100m3"
## [283] "pseame_100m3" "hippla_100m3" "limfer_100m3"
## [286] "glycyn_100m3" "scoaqu_100m3" "sypspp_100m3"
## [289] "lopame_100m3"
A lookup of these column headings is/was here: https://www.fisheries.noaa.gov/inport/item/35054
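Because everything downstream assumes the 2024 column layout, a quick check on a newly delivered file can catch format changes early. The sketch below is one possible way to do that: check_ecomon_columns is a hypothetical helper (it is not in the repo), the required column list is only an illustrative subset of the names printed above, and the toy data frame stands in for a real file.

```r
# Hypothetical helper, not part of the zooplanktonindex repo:
# return the expected columns that are absent from a new file.
check_ecomon_columns <- function(dat, required) {
  setdiff(required, names(dat))
}

# Illustrative subset of the column names printed above (an assumption
# about which columns matter downstream, not a definitive list).
required_cols <- c("cruise_name", "station", "lat", "lon", "date",
                   "sfc_temp", "volume_100m3", "calfin_100m3", "euph_100m3")

# Toy stand-in for a new file that has lost one column.
toy <- data.frame(cruise_name = "AA0001", station = 1, lat = 40.5, lon = -70.2,
                  date = "01-Jan-00", sfc_temp = 5.1,
                  volume_100m3 = 12.3, calfin_100m3 = 100)

missing_cols <- check_ecomon_columns(toy, required_cols)
if (length(missing_cols) > 0) {
  message("Missing expected columns: ", paste(missing_cols, collapse = ", "))
}
```

With a real file, the same call would be made on the object returned by readr::read_csv before running the processing script.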
The data are currently grouped for the SOE in each VAST script.
Models for the SOE included four copepod categories, plus zooplankton volume and Euphausiids.
The copepod categories are defined as follows, with the actual species in each:
Large copepods SOE (used in small-large index): Calanus finmarchicus
Large copepods ALL: Calanus finmarchicus, Metridia lucens, Calanus minor, Eucalanus spp., Calanus spp.
Small copepods ALL: Centropages typicus, Pseudocalanus spp., Temora longicornis, Centropages hamatus, Paracalanus parvus, Acartia spp., Clausocalanus arcuicornis, Acartia longiremis, Clausocalanus furcatus, Temora stylifera, Temora spp., Tortanus discaudatus, Paracalanus spp.
Small copepods SOE (used in small-large index): Centropages typicus, Pseudocalanus spp., Temora longicornis, Centropages hamatus
Data are processed using the script in the zooplanktonindex repo data folder https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/data/VASTzoopindex_processinputs.R
Change the location of the dataset in line 13 to the new dataset. I recommend posting the dataset to github (if permitted) for full transparency. The dataset used for 2024 indices was not allowed to be posted.
The processing script sums the copepod categories, brings in the euphausiid and zooplankton volume data, and produces a single dataset that is used in all VAST scripts.
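As a rough illustration of the summing step, the sketch below builds the two SOE copepod groups from toy data. It is not the processing script itself. The mapping from species to column names (for example ctyp for Centropages typicus), the choice of the 100m3 columns, the output column names, and the treatment of missing values as zero are all assumptions to check against the column lookup and VASTzoopindex_processinputs.R.

```r
# Assumed species-to-column mapping; verify against the InPort lookup
# and the processing script before relying on it.
sm_soe_cols <- c("ctyp_100m3", "pseudo_100m3", "tlong_100m3", "cham_100m3")
lg_soe_cols <- "calfin_100m3"

# Toy abundances standing in for three stations.
toy <- data.frame(
  ctyp_100m3   = c(10, 0, NA),
  pseudo_100m3 = c(5, 2, 1),
  tlong_100m3  = c(0, 3, 1),
  cham_100m3   = c(1, 1, 1),
  calfin_100m3 = c(20, 0, 4)
)

# Row-wise sums per group; na.rm = TRUE treats a missing taxon as zero,
# which is an assumption made for this toy example.
toy$smallcopeSOE <- rowSums(toy[, sm_soe_cols, drop = FALSE], na.rm = TRUE)
toy$lgcopeSOE    <- rowSums(toy[, lg_soe_cols, drop = FALSE], na.rm = TRUE)
```

Keeping each group as a named character vector makes it easy to see, and to change, which taxa contribute to each category.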
The data processing script looks for the processed OISST data used in the forage index and adds it to the zooplankton dataset for cases where surface temperature is missing, so that those data are not dropped.
There is a hardcoded reference at line 288 locating the SST datasets used in generating the forage index. This will need to be changed to the new location of the processed OISST data. Coordination with the production of the forage index is recommended, because the forage index uses the same OISST data, and updating the SST data is time consuming so should not be done twice.
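The fill logic can be sketched on toy data as follows. This assumes the OISST value has already been matched to each station, and it assumes sstfill is the name of the filled column; oisst is a hypothetical column name used only here.

```r
# Toy station data; 'oisst' is a hypothetical name for the matched OISST value.
stations <- data.frame(
  station  = 1:4,
  sfc_temp = c(12.1, NA, 15.4, NA),
  oisst    = c(12.4, 13.0, 15.0, 9.8)
)

# Use the in situ temperature where present, otherwise the OISST value,
# so that stations without a measured temperature are not dropped.
stations$sstfill <- ifelse(is.na(stations$sfc_temp),
                           stations$oisst, stations$sfc_temp)
```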
The SST data are not currently used in the zooplankton model. The join with OISST step could be skipped, but the code to run the model in each script needs to be changed to comment out any reference to temperature data (comment out all sstfill in dplyr::select statements creating the input data). If you choose to leave out SST data, this modification needs to be done in multiple places in every VAST script noted below.
For the SOE I suggest running the same scripts run in 2024. Model selection investigating the inclusion of spatial and spatio-temporal random effects is extremely time consuming and unlikely to result in the selection of different configurations. Experimentation with different covariates is encouraged if time permits in the future since there wasn’t much time for this in 2024, but that will also be a time consuming process. This is a long way of saying don’t rerun the model selection script (VASTunivariate_zoopindex_modselection.R).
There are 6 zooplankton categories listed above, and there are 3 corresponding VAST scripts that need to be run with updated data: euphausiids, zooplankton volume, and all the copepod combinations (4 categories). Each script produces output for two seasons, “Spring” and “Fall” for its zooplankton categories.
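To make the bookkeeping concrete, the sketch below enumerates the minimum set of category by season fits. The category labels paraphrase the list above and the script labels are shorthand rather than file names. Annual models and covariate variants, where a script runs them, add to this count.

```r
# Six categories by two seasons; labels are shorthand, not file names.
runs <- expand.grid(
  category = c("Euphausiids", "Zooplankton volume",
               "Large copepods SOE", "Large copepods ALL",
               "Small copepods ALL", "Small copepods SOE"),
  season = c("Spring", "Fall"),
  stringsAsFactors = FALSE
)

# Assign each category to the script that fits it.
runs$script <- ifelse(grepl("copepods", runs$category), "copepod script",
               ifelse(runs$category == "Euphausiids", "euphausiid script",
                      "zooplankton volume script"))

# Number of fits per script.
table(runs$script)
```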
These are the scripts run in 2024 for the 2025 SOE:
Euphausiids: https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/VASTscripts/VASTunivariate_zoopindex_euph.R
Zooplankton volume: https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/VASTscripts/VASTunivariate_zoopindex_zoopvolume.R
The Euphausiids and Zooplankton volume scripts produce fits with and without the Day of Year covariate. Sometimes this produced a better fitting model, sometimes not. See the previous results in https://noaa-edab.github.io/zooplanktonindex/CopeModResults.html and decide whether you want the extra overhead of running these models again. If not, take the covariate you don’t want out of the list before the loop. Annual models also may not be necessary and can be taken out of the list structures by removing the mod.season, mod.dat, and mod.obsmod components corresponding to the annual models. Or just plan a long run on a server and sort them out afterwards.
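The trimming described above might look like the sketch below. Only the names mod.season, mod.dat, and mod.obsmod come from the scripts. Their contents here, and the mod.covar name for the covariate list, are hypothetical placeholders, so the real objects will differ in shape and values.

```r
# Hypothetical stand-ins for the list structures in the VAST scripts.
mod.season <- c("fall", "spring", "annual")
mod.dat    <- list("fall_data", "spring_data", "annual_data")
mod.obsmod <- list("obsmod_fall", "obsmod_spring", "obsmod_annual")
mod.covar  <- c("base", "doy")  # hypothetical name for the covariate list

# Drop the annual entries from all three parallel structures at once,
# so they stay aligned.
keep       <- mod.season != "annual"
mod.season <- mod.season[keep]
mod.dat    <- mod.dat[keep]
mod.obsmod <- mod.obsmod[keep]

# Drop the covariate variant that is not wanted before the loop runs.
mod.covar <- setdiff(mod.covar, "doy")
```

Using a single logical index for all three structures avoids the mismatch that can occur when entries are deleted by hand in one list but not the others.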
The copepod script was most recently run with the day of year covariate, and lacks the loop structure for the covariate that the Euphausiid and Zooplankton volume scripts have. I suggest checking the previous results linked above to see which copepod models fit better with this covariate; I think most did not. Your options are to add a similar loop structure to run models both with and without the day of year covariate, or to pick the model structure that worked best last time and run only that. As above, the annual models could also be removed if you don’t plan to use them in the SOE.
Another time saving shortcut would be to run the models only for the SOE regions and comment out the regions for herring spring and fall survey areas in the strata.limits = portion of the code.
If strata limits are changed in the VAST script, you also need to change them similarly in the SOEinputs function defined at the end of the page https://noaa-edab.github.io/zooplanktonindex/ZoopCOG.html
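One way to keep the two places in step is to subset the strata by a single vector of region names, sketched below. This assumes strata.limits is a named list of stratum vectors. The region names and stratum numbers are hypothetical placeholders, not the values used in the scripts.

```r
# All names and stratum numbers below are hypothetical placeholders.
all_strata <- list(
  soe_region_a   = c(1, 2, 3),
  soe_region_b   = c(4, 5),
  herring_spring = c(2, 3, 4),
  herring_fall   = c(1, 5)
)

# One vector of region names, reused wherever strata are referenced
# (the VAST script and the SOEinputs function).
soe_regions <- c("soe_region_a", "soe_region_b")

# Keep only the SOE regions instead of commenting entries out by hand.
strata.limits <- all_strata[soe_regions]
names(strata.limits)
```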
All the scripts write model output to a pyindex folder with subfolders for each model run. This structure is important to maintain if you want to use the scripts below that make the SOE input datasets.
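Before building the SOE inputs, it can help to confirm that every expected run has its own subfolder. The sketch below does this in a temporary directory. The run folder names are hypothetical, and with real output the root would be the pyindex folder in the working directory.

```r
# Build a toy pyindex tree in a temporary location (run names are hypothetical).
root <- file.path(tempdir(), "pyindex")
expected_runs <- c("run_a_fall", "run_a_spring", "run_b_fall")
for (d in expected_runs[1:2]) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
}

# List the immediate subfolders and report any expected run without one.
found_runs   <- basename(list.dirs(root, recursive = FALSE))
missing_runs <- setdiff(expected_runs, found_runs)
missing_runs
```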
Once models are run, the process of making the SOE input datasets is relatively well documented here: https://noaa-edab.github.io/zooplanktonindex/ZoopCOG.html
In brief, there are functions in that page that compare across model runs for converged best fit models for each zooplankton category, then create both the index and center of gravity SOE data from them.
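The comparison step can be pictured with the toy sketch below, which keeps only converged fits and then picks one model per category. It is not the code from that page. The column names, the model labels, the AIC values, and the use of lowest AIC as the "best fit" criterion are all assumptions made for illustration.

```r
# Toy summary of model runs; all values and column names are hypothetical.
fits <- data.frame(
  category  = c("euph", "euph", "euph", "zoopvol", "zoopvol"),
  season    = "Fall",
  model     = c("base", "doy", "doy_alt", "base", "doy"),
  converged = c(TRUE, TRUE, FALSE, TRUE, TRUE),
  AIC       = c(1050.2, 1043.7, 1001.0, 880.4, 882.9),
  stringsAsFactors = FALSE
)

# Discard non-converged fits first, even if their AIC looks better.
ok <- fits[fits$converged, ]

# Within each category, keep the converged fit with the lowest AIC.
best <- do.call(rbind, lapply(split(ok, ok$category),
                              function(d) d[which.min(d$AIC), ]))
best[, c("category", "season", "model", "AIC")]
```

Filtering on convergence before ranking matters here: the toy doy_alt fit has the lowest AIC but is excluded because it did not converge.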