Zooplankton data overview

This gives an overview of the source data. Data are processed for input into VAST in a script linked in the next section.

Read in the file obtained from Scott Large, Harvey Walsh, or whoever has posted it to a Google Drive and indicated that it is the most recent data. This is the step that requires automation. Everything downstream depends on this file being in the same format as the one used in 2024, and on it being updated by Harvey.

THE FILE READ IN HERE IS AN OLD FILE AND SHOULD NOT BE USED AS VAST INPUT

ecomonall <- readr::read_csv("data/EcoMon_Plankton_Data_v3_8.csv")
## Rows: 32693 Columns: 289
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr    (4): cruise_name, zoo_gear, ich_gear, date
## dbl  (284): station, lat, lon, depth, sfc_temp, sfc_salt, btm_temp, btm_salt...
## time   (1): time
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
names(ecomonall)
##   [1] "cruise_name"      "station"          "zoo_gear"        
##   [4] "ich_gear"         "lat"              "lon"             
##   [7] "date"             "time"             "depth"           
##  [10] "sfc_temp"         "sfc_salt"         "btm_temp"        
##  [13] "btm_salt"         "volume_1m2"       "ctyp_10m2"       
##  [16] "calfin_10m2"      "pseudo_10m2"      "penilia_10m2"    
##  [19] "tlong_10m2"       "cham_10m2"        "echino_10m2"     
##  [22] "larvaceans_10m2"  "para_10m2"        "gas_10m2"        
##  [25] "acarspp_10m2"     "mlucens_10m2"     "evadnespp_10m2"  
##  [28] "salps_10m2"       "oithspp_10m2"     "cirr_10m2"       
##  [31] "chaeto_10m2"      "hyper_10m2"       "gam_10m2"        
##  [34] "evadnord_10m2"    "calminor_10m2"    "copepoda_10m2"   
##  [37] "clauso_10m2"      "dec_10m2"         "euph_10m2"       
##  [40] "prot_10m2"        "acarlong_10m2"    "euc_10m2"        
##  [43] "pel_10m2"         "poly_10m2"        "podon_10m2"      
##  [46] "fish_10m2"        "bry_10m2"         "fur_10m2"        
##  [49] "calspp_10m2"      "oncaea_10m2"      "cory_10m2"       
##  [52] "ost_10m2"         "tstyl_10m2"       "oithspin_10m2"   
##  [55] "mysids_10m2"      "temspp_10m2"      "tort_10m2"       
##  [58] "paraspp_10m2"     "scyphz_10m2"      "anthz_10m2"      
##  [61] "siph_10m2"        "hydrom_10m2"      "coel_10m2"       
##  [64] "ctenop_10m2"      "euph1_10m2"       "thysin_10m2"     
##  [67] "megan_10m2"       "thysra_10m2"      "thyslo_10m2"     
##  [70] "eupham_10m2"      "euphkr_10m2"      "euphspp_10m2"    
##  [73] "thysgr_10m2"      "nemaspp_10m2"     "stylspp_10m2"    
##  [76] "stylel_10m2"      "nemame_10m2"      "thysspp_10m2"    
##  [79] "shysac_10m2"      "thypsp_10m2"      "nemabo_10m2"     
##  [82] "thecos_10m2"      "spirre_10m2"      "spirhe_10m2"     
##  [85] "spirin_10m2"      "spirtr_10m2"      "spirspp_10m2"    
##  [88] "clispp_10m2"      "crevir_10m2"      "diatri_10m2"     
##  [91] "clicus_10m2"      "clipyr_10m2"      "cavunc_10m2"     
##  [94] "cavinf_10m2"      "cavlon_10m2"      "stysub_10m2"     
##  [97] "spirbu_10m2"      "crespp_10m2"      "cavspp_10m2"     
## [100] "cavoli_10m2x"     "gymnos_10m2"      "pnespp_10m2"     
## [103] "paedol_10m2"      "clilim_10m2"      "pnepau_10m2"     
## [106] "volume_100m3"     "ctyp_100m3"       "calfin_100m3"    
## [109] "pseudo_100m3"     "penilia_100m3"    "tlong_100m3"     
## [112] "cham_100m3"       "echino_100m3"     "larvaceans_100m3"
## [115] "para_100m3"       "gas_100m3"        "acarspp_100m3"   
## [118] "mlucens_100m3"    "evadnespp_100m3"  "salps_100m3"     
## [121] "oithspp_100m3"    "cirr_100m3"       "chaeto_100m3"    
## [124] "hyper_100m3"      "gam_100m3"        "evadnord_100m3"  
## [127] "calminor_100m3"   "copepoda_100m3"   "clauso"          
## [130] "dec_100m3"        "euph_100m3"       "prot_100m3"      
## [133] "acarlong_100m3"   "euc_100m3"        "pel_100m3"       
## [136] "poly_100m3"       "podon_100m3"      "fish_100m3"      
## [139] "bry_100m3"        "fur_100m3"        "calspp_100m3"    
## [142] "oncaea_100m3"     "cory_100m3"       "ost_100m3"       
## [145] "tstyl_100m3"      "oithspin_100m3"   "mysids_100m3"    
## [148] "temspp_100m3"     "tort_100m3"       "paraspp_100m3"   
## [151] "scyphz_100m3"     "anthz_100m3"      "siph_100m3"      
## [154] "hydrom_100m3"     "coel_100m3"       "ctenop_100m3"    
## [157] "euph1_100m3"      "thysin_100m3"     "megan_100m3"     
## [160] "thysra_100m3"     "thyslo_100m3"     "eupham_100m3"    
## [163] "euphkr_100m3"     "euphspp_100m3"    "thysgr_100m3"    
## [166] "nemaspp_100m3"    "stylspp_100m3"    "stylel_100m3"    
## [169] "nemame_100m3"     "thysspp_100m3"    "shysac_100m3"    
## [172] "thypsp_100m3"     "nemabo_100m3"     "thecos_100m3"    
## [175] "spirre_100m3"     "spirhe_100m3"     "spirin_100m3"    
## [178] "spirtr_100m3"     "spirspp_100m3"    "clispp_100m3"    
## [181] "crevir_100m3"     "diatri_100m3"     "clicus_100m3"    
## [184] "clipyr_100m3"     "cavunc_100m3"     "cavinf_100m3"    
## [187] "cavlon_100m3"     "stysub_100m3"     "spirbu_100m3"    
## [190] "crespp_100m3"     "cavspp_100m3"     "cavoli_100m3x"   
## [193] "gymnos_100m3"     "pnespp_100m3"     "paedol_100m3"    
## [196] "clilim_100m3"     "pnepau_100m3"     "nofish_10m2"     
## [199] "bretyr_10m2"      "cluhar_10m2"      "cycspp_10m2"     
## [202] "diaspp_10m2"      "cermad_10m2"      "benspp_10m2"     
## [205] "urospp_10m2"      "enccim_10m2"      "gadmor_10m2"     
## [208] "melaeg_10m2"      "polvir_10m2"      "meralb_10m2"     
## [211] "merbil_10m2"      "centstr_10m2"     "pomsal_10m2"     
## [214] "cynreg_10m2"      "leixan_10m2"      "menspp_10m2"     
## [217] "micund_10m2"      "tauads_10m2"      "tauoni_10m2"     
## [220] "auxspp_10m2"      "scosco_10m2"      "pepspp_10m2"     
## [223] "sebspp_10m2"      "prispp_10m2"      "myoaen_10m2"     
## [226] "myooct_10m2"      "ammspp_10m2"      "phogun_10m2"     
## [229] "ulvsub_10m2"      "anaspp_10m2"      "citarc_10m2"     
## [232] "etrspp_10m2"      "syaspp_10m2"      "botspp_10m2"     
## [235] "hipobl_10m2"      "parden_10m2"      "pseame_10m2"     
## [238] "hippla_10m2"      "limfer_10m2"      "glycyn_10m2"     
## [241] "scoaqu_10m2"      "sypspp_10m2"      "lopame_10m2"     
## [244] "nofish_100m3"     "bretyr_100m3"     "cluhar_100m3"    
## [247] "cycspp_100m3"     "diaspp_100m3"     "cermad_100m3"    
## [250] "benspp_100m3"     "urospp_100m3"     "enccim_100m3"    
## [253] "gadmor_100m3"     "melaeg_100m3"     "polvir_100m3"    
## [256] "meralb_100m3"     "merbil_100m3"     "centstr_100m3"   
## [259] "pomsal_100m3"     "cynreg_100m3"     "leixan_100m3"    
## [262] "menspp_100m3"     "micund_100m3"     "tauads_100m3"    
## [265] "tauoni_100m3"     "auxspp_100m3"     "scosco_100m3"    
## [268] "pepspp_100m3"     "sebspp_100m3"     "prispp_100m3"    
## [271] "myoaen_100m3"     "myooct_100m3"     "ammspp_100m3"    
## [274] "phogun_100m3"     "ulvsub_100m3"     "anaspp_100m3"    
## [277] "citarc_100m3"     "etrspp_100m3"     "syaspp_100m3"    
## [280] "botspp_100m3"     "hipobl_100m3"     "parden_100m3"    
## [283] "pseame_100m3"     "hippla_100m3"     "limfer_100m3"    
## [286] "glycyn_100m3"     "scoaqu_100m3"     "sypspp_100m3"    
## [289] "lopame_100m3"

A lookup of these column headings is/was here: https://www.fisheries.noaa.gov/inport/item/35054
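Because everything downstream depends on the file keeping the 2024 format, it may help to compare the column names of a new file against the expected ones before processing. The following is a minimal base R sketch, not code from the zooplanktonindex repo: the helper `compare_columns` is hypothetical, and `calfin_100m3x` is a made-up name used only to simulate a renamed column. In practice the expected names would come from `names()` of the 2024 file.

```r
# Hypothetical helper (not part of the zooplanktonindex repo): compare the
# column names of a new file against the names expected from the 2024 file.
compare_columns <- function(new_names, expected_names) {
  list(
    missing    = setdiff(expected_names, new_names),  # expected but absent
    unexpected = setdiff(new_names, expected_names)   # present but not expected
  )
}

# Toy example using a few column names from the listing above.
expected_names <- c("cruise_name", "station", "sfc_temp", "calfin_100m3")
# "calfin_100m3x" is invented here to mimic a renamed column.
new_names      <- c("cruise_name", "station", "sfc_temp", "calfin_100m3x")

col_check <- compare_columns(new_names, expected_names)
col_check$missing     # columns the processing script would not find
col_check$unexpected  # columns that are new or renamed
```

If both elements come back empty, the column names match; this says nothing about column types or units, which would need a separate check.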

The data are currently grouped for the SOE in each VAST script.

Models for the SOE included four copepod categories, plus zooplankton volume and Euphausiids.

The copepod categories are defined as:

Actual species in each:

Zooplankton data processing

Data are processed using the script in the zooplanktonindex repo data folder https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/data/VASTzoopindex_processinputs.R

Change the dataset location at line 13 of the script to point to the new dataset. I recommend posting the dataset to GitHub (if permitted) for full transparency. The dataset used for the 2024 indices was not allowed to be posted.

The processing script sums the copepod categories, brings in the euphausiid and zooplankton volume data, and produces a single dataset that is used in all VAST scripts.
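The summing step can be pictured with a small base R sketch. The grouping below is hypothetical and chosen only for illustration; the real category definitions are in VASTzoopindex_processinputs.R. Treating a missing taxon as zero (`na.rm = TRUE`) is also an assumption that may not match the real script.

```r
# Toy data: three stations, three taxa columns borrowed from the listing above.
zoop <- data.frame(
  station      = 1:3,
  calfin_100m3 = c(10, NA, 5),
  ctyp_100m3   = c(2, 3, NA),
  pseudo_100m3 = c(1, 1, 1)
)

# HYPOTHETICAL grouping for illustration only; it is not an SOE category.
example_group <- c("ctyp_100m3", "pseudo_100m3")

# Row-wise sum across the group's columns. na.rm = TRUE treats a missing
# taxon as zero, which is an assumption about the desired behavior.
zoop$example_group_100m3 <- rowSums(zoop[, example_group], na.rm = TRUE)

zoop
```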

Important: SST data

The data processing script looks for the processed OISST data used in the forage index and adds it to the zooplankton dataset where surface temperature is missing, so that those records are not dropped.
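The idea of filling missing surface temperature can be sketched as follows. This is a minimal base R illustration, not the code in the processing script: the `oisst` column and the toy values are hypothetical, and the way `sstfill` is computed here is an assumption based on the description above.

```r
# Toy data: in situ surface temperature with one missing value, plus a
# HYPOTHETICAL column of OISST values already matched to each station.
zoop <- data.frame(
  station  = 1:3,
  sfc_temp = c(12.1, NA, 15.4),
  oisst    = c(12.0, 13.2, 15.0)
)

# Use the in situ value when present and fall back to OISST otherwise,
# so stations with missing sfc_temp are not dropped.
zoop$sstfill <- ifelse(is.na(zoop$sfc_temp), zoop$oisst, zoop$sfc_temp)

zoop
```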

There is a hardcoded reference at line 288 locating the SST datasets used in generating the forage index. This will need to be changed to the new location of the processed OISST data. Coordination with the production of the forage index is recommended, because the forage index uses the same OISST data, and updating the SST data is time-consuming and should not be done twice.

The SST data are not currently used in the zooplankton model. The OISST join step could be skipped, but then the code that runs the model in each script needs to be changed to remove any reference to temperature data (comment out all sstfill in the dplyr::select statements creating the input data). If you choose to leave out SST data, this modification needs to be made in multiple places in every VAST script noted below.

Run the VAST scripts

For the SOE I suggest running the same scripts run in 2024. Model selection investigating the inclusion of spatial and spatio-temporal random effects is extremely time-consuming and unlikely to result in the selection of different configurations. Experimentation with different covariates is encouraged if time permits in the future, since there wasn’t much time for this in 2024, but that will also be a time-consuming process. This is a long way of saying don’t rerun the model selection script (VASTunivariate_zoopindex_modselection.R).

There are 6 zooplankton categories listed above, and there are 3 corresponding VAST scripts that need to be run with updated data: euphausiids, zooplankton volume, and all the copepod combinations (4 categories). Each script produces output for two seasons, “Spring” and “Fall”, for its zooplankton categories.

These are the scripts run in 2024 for the 2025 SOE:

Euphausiids: https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/VASTscripts/VASTunivariate_zoopindex_euph.R

Zooplankton volume: https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/VASTscripts/VASTunivariate_zoopindex_zoopvolume.R

Copepods: https://github.com/NOAA-EDAB/zooplanktonindex/blob/main/VASTscripts/VASTunivariate_zoopindex_copepods.R

The Euphausiids and Zooplankton volume scripts produce fits with and without the Day of Year covariate. Sometimes the covariate produced a better fitting model, sometimes not. See the previous results in https://noaa-edab.github.io/zooplanktonindex/CopeModResults.html and decide whether you want the extra overhead of running these models again. If not, take the covariate you don’t want out of the list before the loop. Annual models also may not be necessary and can be taken out of the list structures: remove the mod.season, mod.dat, and mod.obsmod components corresponding to the annual models. Or just plan a long run on a server and sort them out afterwards.
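Dropping the annual models from the list structures might look like the sketch below. The contents of `mod.season`, `mod.dat`, and `mod.obsmod` here are placeholders invented for illustration; the real objects are defined in the VAST scripts and may be structured differently, so treat this only as a picture of keeping the three structures aligned.

```r
# HYPOTHETICAL contents; the real lists are defined in the VAST scripts.
mod.season <- c("fall", "spring", "annual")
mod.dat    <- list("fall_data", "spring_data", "annual_data")
mod.obsmod <- list("fall_obsmod", "spring_obsmod", "annual_obsmod")

# Build one logical index and apply it to all three structures so that
# their components stay in matching order.
keep <- mod.season != "annual"

mod.season <- mod.season[keep]
mod.dat    <- mod.dat[keep]
mod.obsmod <- mod.obsmod[keep]

mod.season
```

Using a single index for all three objects avoids removing a component from one structure and forgetting the corresponding component in another.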

The copepod script was most recently run with the day of year covariate, and lacks the loop structure for the covariate that the Euphausiid and Zooplankton volume scripts have. I suggest checking the previous results linked above to see which copepod models fit better with this covariate; I think most did not. Your options are to add a similar structure to run models both with and without the day of year covariate, or to pick the model structure that worked best last time and run only that. As above, the annual models could also be removed if you don’t plan to use them in the SOE.

Another time-saving shortcut would be to run the models only for the SOE regions and comment out the regions for the herring spring and fall survey areas in the strata.limits = portion of the code.

If strata limits are changed in the VAST script, you also need to change them similarly in the SOEinputs function defined at the end of the page https://noaa-edab.github.io/zooplanktonindex/ZoopCOG.html

All the scripts write model output to a pyindex folder with subfolders for each model run. This structure is important to maintain if you want to use the scripts below that make the SOE input datasets.
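A quick way to confirm that the output structure is intact is to list the run subfolders. The sketch below builds a throwaway copy of the layout in a temporary directory; the subfolder names are hypothetical, and in practice `outdir` would point at the real pyindex folder.

```r
# Build a throwaway example of the folder layout in a temporary directory.
# The subfolder names are HYPOTHETICAL; real names come from the VAST scripts.
outdir <- file.path(tempdir(), "pyindex")
dir.create(file.path(outdir, "example_run_fall"),
           recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(outdir, "example_run_spring"),
           recursive = TRUE, showWarnings = FALSE)

# List one level of subfolders: one entry per model run.
run_dirs <- list.dirs(outdir, full.names = FALSE, recursive = FALSE)
run_dirs
```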

Make the SOE datasets

Once the models are run, this process is relatively well documented here: https://noaa-edab.github.io/zooplanktonindex/ZoopCOG.html

In brief, there are functions in that page that compare across model runs for converged best fit models for each zooplankton category, then create both the index and center of gravity SOE data from them.