Data collected onboard are stored in Microsoft Access files, located in the “OnBoard/access” subfolder. Ideally, a maximum of 20 hauls should be stored in each .accdb file. IMPORTANT: if data collection is done in parallel on multiple laptops, the “OnBoard” folder must be renamed with a unique identifier (e.g. OnBoard_2024_PLE). Instructions for setting up the folder on a new laptop are reported on the repository main page.
To start a new file, copy the template version, which in 2024 is called “bio_data_v2024_SOLEMON_template.accdb”, paste it in the same “OnBoard/access” (or OnBoard_2024_PLE/access) subfolder and rename it as “bio_data_v2024_SOLEMON_YEAR_N.accdb”, where YEAR is the reporting year and N is a progressive number. Example: “bio_data_v2024_SOLEMON_2024_1.accdb”. It does not matter if parts of the same haul are in different files; this is handled in post-processing. It also does not matter if Access files with the same name are located on different laptops; this too is handled in post-processing.
Each access file contains a template sheet, called “cala_template”. To start a new haul, copy the template sheet, paste it and rename it as “cala_x”, where x is the haul number. Example: cala_1; cala_7bis. It does not matter if parts of the same haul are in different files, this is handled in post-processing.
The structure of the sheets in 2024 is shown in Figure 1. In detail:
Auto-compilation applies to some columns of the Access file. This means that, when the table is processed, empty cells are assigned the first available value found in the previous rows. When collecting data, for these columns you need to specify the value only for the first observation, and then fill it in again only when it changes. These columns are:
The standard procedure applies when all individuals of a given species are collected and reported.
When reporting data for target species you should use only the first 7 columns. Of these, the columns gear and species_name auto-compile according to the first record inserted. This means that you should write in these columns only the first time you report an observation (i.e. first record of a gear, first record of a species). The column id_specimen serves to store any kind of individual ID (e.g. otolith code, genetic samples).
When reporting data for elasmobranchs you should use only the first 9 columns. Of these, the columns gear and species_name auto-compile according to the first record inserted. This means that you should write in these columns only the first time you report an observation (i.e. first record of a gear, first record of a species). The column id_specimen serves to store any kind of individual ID (e.g. otolith code, genetic samples). Refer to Figure 4 for reporting the 3 length measures.
When reporting data for other commercial species you should use only the first 4 columns. Of these, the columns gear and species_name auto-compile according to the first record inserted. This means that you should write in these columns only the first time you report an observation (i.e. first record of a gear, first record of a species).
For this species category, individual length and cumulative weight are recorded. The cumulative weight should be reported in the last record, as in Figure 5.
When reporting data for shellfish (MUREBRA, HEXATRU, OSTREDU) you should use the columns as reported in Figure 6. Of these, the columns gear and species_name auto-compile according to the first record inserted. This means that you should write in these columns only the first time you report an observation (i.e. first record of a gear, first record of a species).
NB: in case of subsamples, refer to the dedicated section
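As an illustration only (the actual fill-down happens in the R post-processing, not in this sketch), the auto-compilation rule described above amounts to a forward fill of empty cells:

```python
# Sketch of the auto-compilation (fill-down) rule: an empty cell inherits
# the first available value found in the previous rows.
def fill_down(values):
    """Replace None/empty entries with the last non-empty value seen."""
    filled, last = [], None
    for v in values:
        if v not in (None, ""):
            last = v
        filled.append(last)
    return filled

# Example: 'species_name' written only when it changes (hypothetical data)
print(fill_down(["SOLEVUL", "", "", "MULLBAR", ""]))
```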
Subsamples are taken when the number of individuals of a given species is too high to be processed. Given the onboard practice used in the SOLEMON survey, there are three cases in which subsampling occurs. Figure 7 reports the steps that are followed in each of the three cases. Each case is treated differently depending on the type of species, and the data treatment is explained in the dedicated sections.
In terms of onboard procedure, the cases generally refer to:
A typical case of subsampling a target species is AEQUOPE, which usually occurs in large quantities and for which a length-frequency distribution (LFD) is needed. In this case only a few individuals are measured to obtain the length structure, and the total number (and sometimes the total weight) must then be estimated. Individuals that are processed for length and weight can be treated as any other target species. Regarding the other information needed to raise the values, you need to create a new record for the species (and gear) to store the subsample data. There are two expected cases:
ALL individuals are collected from the haul (sorted sample), then a subsample is taken (sorted subsample) for individual processing. One record is dedicated to the subsample details. The total weight of the sorted sample (in kilograms) is reported in the kg_field1 field. The weight of the sorted subsample is reported in kg_field2. type_subsample is “C1_species”. The subsampled individuals are processed according to the standard procedure for target species, creating a new record for each specimen.
A subsample is taken from the haul as soon as it is taken (unsorted subsample, before removing anything from the haul), then ALL the individuals in the subsample are collected (sorted subsample) for individual processing. One record is dedicated to the subsample details. The weight of the haul is reported in haul_weight. The weight of the unsorted subsample is reported in kg_field1. The weight of the sorted subsample is reported in kg_field2. type_subsample is “C2_haul”. The subsampled individuals are processed according to the standard procedure for target species, creating a new record for each specimen.
A subsample is taken from the haul after partial sorting (partially sorted subsample). This procedure is typically applied to the benthos subsample, which is taken after commercial species and litter are removed. Therefore, subsamples taken with this procedure need to be raised using the standard procedure for raising discard data. Then, ALL the individuals in the subsample are collected (sorted subsample) for individual processing. One record is dedicated to the subsample details. The weight of the haul is reported in haul_weight. The weight of the unsorted subsample is reported in kg_field1. The weight of the sorted subsample is reported in kg_field2. type_subsample is “C3_benthos”. The subsampled individuals are processed according to the standard procedure for target species, creating a new record for each specimen.
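The raising arithmetic implied by the C1_species and C2_haul cases can be sketched as follows. The field names match the text above, but the functions themselves are hypothetical illustrations, not the survey's R code:

```python
# C1_species: the whole species catch was sorted, then a subsample processed.
def raise_c1(n_measured, kg_field1, kg_field2):
    """Total number = measured number * sorted sample wt / sorted subsample wt."""
    return n_measured * kg_field1 / kg_field2

# C2_haul: an unsorted subsample of the whole haul was fully sorted.
def raise_c2(n_measured, haul_weight, kg_field1):
    """Total number = measured number * haul wt / unsorted subsample wt."""
    return n_measured * haul_weight / kg_field1

print(raise_c1(50, 12.0, 3.0))    # 50 measured, 12 kg sorted sample, 3 kg subsample
print(raise_c2(50, 120.0, 12.0))  # 50 measured, 120 kg haul, 12 kg subsample
```

The same factors apply to weights when a raised total weight is needed.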
A typical case of subsampling a non-target species is MUREBRA. When it occurs in large aggregations, total number (and sometimes total weight) are estimated from subsamples. To store the information needed to raise the values, there are two expected cases:
ALL individuals are collected from the haul (sorted sample), then a subsample is taken (sorted subsample) for estimating the total number. One record is dedicated to the subsample details. The total weight of the sorted sample (in kilograms) is reported in the kg_field1 field. The weight of the sorted subsample is reported in kg_field2. The number of individuals in the sorted subsample is reported in subsample_number. type_subsample is “C1_species”.
A subsample is taken from the haul as soon as it is taken (unsorted subsample, before removing anything from the haul), then ALL the individuals in the subsample are collected (sorted subsample) for estimating the total number. One record is dedicated to the subsample details. The weight of the haul is reported in haul_weight. The weight of the unsorted subsample is reported in kg_field1. The weight of the sorted subsample is reported in kg_field2. The number of individuals in the sorted subsample is reported in subsample_number. type_subsample is “C2_haul”.
A subsample is taken from the haul after partial sorting (partially sorted subsample). This procedure is typically applied to the benthos subsample, which is taken after commercial species and litter are removed. Therefore, subsamples taken with this procedure need to be raised using the standard procedure for raising discard data. Then, ALL the individuals in the subsample are collected (sorted subsample) for estimating the total number. One record is dedicated to the subsample details. The weight of the haul is reported in haul_weight. The weight of the unsorted subsample is reported in kg_field1. The weight of the sorted subsample is reported in kg_field2. The number of individuals in the sorted subsample is reported in subsample_number. type_subsample is “C3_benthos”.
This case has been used only for MUREBRA and HEXATRU in cases of exceptional catches. ALL individuals of these two species are collected from the haul (partially sorted sample), then a subsample is taken from the partially sorted sample (partially sorted subsample). The partially sorted subsample is sorted, i.e. it is divided by species (sorted subsample). The sorted subsample of each species is subsampled again (sorted sub-subsample) for estimating the total number. One record is dedicated to the subsample details. The weight of the partially sorted sample is reported in kg_field1. The weight of the sorted subsample of each species is reported in kg_field2. The weight of the sorted sub-subsample is reported in kg_field3. The number of individuals in the sorted sub-subsample is reported in subsample_number. type_subsample is “multi”.
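One possible reading of the nested “multi” case is a two-stage raising. This sketch assumes that the weight of the partially sorted subsample equals the sum of the per-species kg_field2 weights; the function is an illustration, not the survey's R code:

```python
def raise_multi(subsample_number, kg_field1, kg_field2, kg_field3, kg_field2_total):
    """Raise a count from the sorted sub-subsample up to the whole
    partially sorted sample in two stages."""
    stage1 = kg_field2 / kg_field3        # sub-subsample -> sorted subsample
    stage2 = kg_field1 / kg_field2_total  # subsample -> partially sorted sample
    return subsample_number * stage1 * stage2

# 40 individuals counted in a 2 kg sub-subsample of an 8 kg sorted subsample,
# taken from a 100 kg partially sorted sample (20 kg subsampled in total)
print(raise_multi(40, 100.0, 8.0, 2.0, 20.0))
```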
Data are processed in R. It is good practice to process the data relatively frequently in order to create backup files and to produce some checks that can be used to adjust data.
Data stored in the .accdb file are retrieved and handled by R scripts, located in the “R” folder. The required script is workflow_access_v2024.R.
To process a single haul, run the script up to line 15 and then set the following parameters:
# set parameters
haul=22
db='test'
updateID='N'
area_sepia='D'
year=2022
area='ITA17'
When the parameters are set, the data processing is pre-defined: you just have to run it without changing the parameters. The first step is function1, whose purpose is to format the Access table according to output standards. There is no need to inspect the output of this function; just run it.
# function1 extract data from access db and format them
hauldata=function1(haul=haul,
db=db,
year=year)# extract and format data
When function1 is done, you can proceed with function2, whose purpose is to perform some checks. The checks are plots saved under the path ‘output/checks’.
# function 2: perform checks
function2(xdat=hauldata,
haul=haul)
function3's purpose is to format the data according to the trust format. Excel sheets are saved under the path ‘output/trust’.
# function 3: format data to trust format
trustdat=function3(xdat=hauldata[[1]],
haul=haul,
year = year,
weight_not_target = hauldata[[2]],
subsamples_target=hauldata[[3]],
catch_sample_disattivati = catch_sample_disattivati) # function 3
function4 creates a PDF report and saves it under the path ‘output/pdf’.
# function4: save PDF
function4(trustdat = trustdat,
year=year,
area = area,
haul=haul)
To process more than one haul, make sure to properly fill in the ‘haul_order’ Excel sheet (see the input data section). After having loaded the haul summary, just run the loop represented below.
haul_summary=read_excel("data/haul_order.xlsx")
haul_summary=haul_summary[1:5,]
for(xhaul in 1:nrow(haul_summary)){
# loop parameters
haul=haul_summary[xhaul,]$haul
db=haul_summary[xhaul,]$DB
area=haul_summary[xhaul,]$country
cat('processing haul no.', haul, '(', xhaul,'/', nrow(haul_summary),')' )
# function1 extract data from access db and format them
hauldata=function1(haul=haul,
db=db,
year=year)# extract and format data
# function 2: perform checks
function2(xdat=hauldata,
haul=haul)
# function 3: format data to trust format
trustdat=function3(xdat=hauldata[[1]],
haul=haul,
year = year,
weight_not_target = hauldata[[2]],
subsamples_target=hauldata[[3]],
catch_sample_disattivati = catch_sample_disattivati) # function 3
# function4: save PDF
function4(trustdat = trustdat,
year=year,
area=area,
haul=haul)
}
Contains the species that are targets of the survey (individual length and weight) and the molluscs for which only total weight and total number are needed.
Species | Species_code | target |
---|---|---|
Aequipecten opercularis | AEQUOPE | 1 |
Callinectes sapidus | CALLSAP | 1 |
Flexopecten glaber proteus (= Chlamys proteus) | CHLAPRO | 1 |
Chlamys varia | CHLAVAR | 1 |
Melicertus kerathurus | MELIKER | 1 |
Merluccius merluccius | MERLMER | 1 |
Mullus barbatus | MULLBAR | 1 |
Nephrops norvegicus | NEPRNOR | 1 |
Parapenaeus longirostris | PAPELON | 1 |
Pecten jacobaeus | PECTJAC | 1 |
Platichthys flesus | PLATFLE | 1 |
Psetta maxima | PSETMAX | 1 |
Raja asterias | RAJAAST | 1 |
Raja clavata | RAJACLA | 1 |
Raja miraletus | RAJAMIR | 1 |
Scophthalmus rhombus | SCOHRHO | 1 |
Scyliorhinus canicula | SCYOCAN | 1 |
Scyliorhinus stellaris | SCYOSTE | 1 |
Sepia officinalis | SEPIOFF | 1 |
Solea aegyptiaca | SOLEAEG | 1 |
Solea solea | SOLEVUL | 1 |
Squilla mantis | SQUIMAN | 1 |
Torpedo marmorata | TORPMAR | 1 |
Crassostrea gigas | CRASGIG | 2 |
Galeodea echinophora | GALEECH | 2 |
Hexaplex trunculus | HEXATRU | 2 |
Bolinus brandaris | MUREBRA | 2 |
Mytilus galloprovincialis | MYTGALL | 2 |
Natica stercusmuscarum | NATISTE | 2 |
Ostrea edulis | OSTREDU | 2 |
Rapana venosa | RAPAVEN | 2 |
This file controls the formatting of trust templates. It indicates which samples (Station, Gear and SpecCode) should be marked as “InUse” = FALSE in the catch sample files used as input data in trust.
Station | Gear | SpecCode |
---|---|---|
11 | D | AEQUOPE |
Stores the updated serial number used to identify specimens for which detailed samples (otolith, genetic, etc.) were taken. This number should refer to the last ID assigned to a specimen. The use of this file is controlled by the updateID parameter in the workflow_access_v0 file: if updateID is set to Y, the fishID file is used to assign IDs (when requested) and is then updated. The columns refer to:
type | code | fishID | haul | species |
---|---|---|---|---|
ELAS | Elas | 202 | NA | RAJAAST:RAJACLA:RAJAMIR:TORPMAR |
solea | SS | 7061 | NA | SOLEVUL:SOLEAEG |
RHO | SR | 447 | NA | SCOHRHO |
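A sketch of how the stored serial number might be consumed when updateID is Y (hypothetical helper; the real logic lives in the R workflow). New IDs continue from the stored fishID value, and the updated counter is written back to the file:

```python
def assign_ids(last_id, n_specimens, code):
    """Build new specimen IDs continuing from the stored serial number and
    return the updated last ID to write back to the fishID file."""
    ids = [f"{code}{last_id + i}" for i in range(1, n_specimens + 1)]
    return ids, last_id + n_specimens

# e.g. three new elasmobranch specimens after the stored fishID 202
ids, new_last = assign_ids(202, 3, "Elas")
print(ids, new_last)
```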
Stores the information associated with hauls. This file is used (1) when the data workflow is applied in a loop; (2) by the minilog script. The columns refer to:
day | haul | id | note | inizio | fine | valid | DB | country | peso_rapido_A | peso_rapido_D | peso_subcampione_a | peso_subcampione_D |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2022-11-24 | 31 | 1 | NA | NA | NA | 1 | 2022_1 | ITA17 | NA | NA | NA | NA |
2022-11-24 | 66 | 2 | NA | NA | NA | 1 | 2022_1 | ITA17 | NA | NA | NA | NA |
2022-11-24 | 29 | 3 | NA | NA | NA | 1 | 2022_1 | ITA17 | NA | NA | NA | NA |
Stores the length-weight parameters of target species. This file is used to reconstruct length (or weight) when it is not available in the recorded data. Example: shrimps missing part of the tail but with the head intact are suitable only for length measurement; fish that were spoiled by the gear may be suitable for weighing but not measurable for length. The columns refer to:
species_name | sex | a | b | source |
---|---|---|---|---|
SOLEVUL | NA | 0.00460 | 3.11 | benchmark_assessment |
MULLBAR | NA | 0.00871 | 3.09 | fishbase |
RAJACLA | NA | 0.00269 | 3.23 | fishbase |
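The a and b columns are the coefficients of the usual allometric relationship W = a·L^b, so a missing weight (or length) can be reconstructed as sketched below. The units are assumed to be cm and g, and the SOLEVUL values come from the table above:

```python
def weight_from_length(length, a, b):
    """Allometric length-weight relationship: W = a * L**b."""
    return a * length ** b

def length_from_weight(weight, a, b):
    """Inverse relationship: L = (W / a) ** (1 / b)."""
    return (weight / a) ** (1 / b)

# SOLEVUL parameters from the table (a = 0.00460, b = 3.11)
w = weight_from_length(25.0, 0.00460, 3.11)
print(round(w, 1))  # roughly 102 g for a 25 cm sole, if units are cm and g
print(round(length_from_weight(w, 0.00460, 3.11), 1))  # back to 25.0 cm
```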
Stores the codes of the maturity scales for target species. This file is used to format input files for trust.
SPECIES | SEX | SCALE |
---|---|---|
MELIKER | F | MEDPF |
MERLMER | F | MEDFI |
MERLMER | M | MEDFI |
This is just the TB file updated to 2021, as stored in trust. It serves to perform some checks and should not be modified. Preview not shown.
This file is downloaded from the trust database and not modified. It contains the species list. Last update xxx. If you need to modify it, please do so in trust and then download the Excel file again.
Species | Medits | Sp_Subcat | Len_Class | Notes |
---|---|---|---|---|
Aaptos aaptos | AAPTAAP | E | NA | NA |
Abra prismatica | ABRAPRI | E | NA | NA |
Abra sp | ABRASPP | E | NA | NA |