Quality assurance of our data collection

Page reviewed: 14/04/2025

To ensure the quality of the data we collect under the national Data Collection Framework (DCF), we apply several quality assurance measures. These are described on this page.

National Programs and Annual Reports

On the Swedish Agency for Marine and Water Management's website you can find Sweden's national programs for data collection and the subsequent annual reports presenting the results of that collection.

Data collection, storage, processing and quality control

  • We record field data from fisheries-dependent surveys either on paper or in E-reg.
  • We record field data from fisheries-independent surveys in E-reg and Sve-reg.

Download the E-reg quick guide (docx).

Download the Sve-reg quick guide (docx).

Data collection manuals can be provided by contacting lars.magnus.andersson@slu.se.

We carry out quality checks on data at different stages:

Data collection is carried out by the Institute of Marine Research, Department of Aquatic Resources (SLU Aqua), Swedish University of Agricultural Sciences.

Data is either registered on paper forms and entered manually into the database, or captured electronically with the applications E-reg or Sve-reg. The same checks are made whether data is registered manually or imported into the database.

The following are some of the checks performed when data is saved to the database (a sketch of such checks follows the list):

  • is there data in the haul?
  • is there length data for each individual?
  • is there a total weight for the catch?
  • is the sample weight smaller than the total weight?
  • are all mandatory fields for this sampling type filled in?
  • is there a length frequency when specimen data exist?
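
A minimal sketch in R of how such save-time checks can look. The column and field names (length_cm, total_weight_kg, sample_weight_kg, gear_code, species_code) are hypothetical placeholders; the real database applies its own schema and rules.

check_sample <- function(haul, lengths, specimens) {
  # haul is assumed to be a list with one value per field (possibly NA);
  # lengths and specimens are data frames for the haul.
  errors <- character(0)

  # is there data in the haul?
  if (nrow(lengths) == 0 && nrow(specimens) == 0)
    errors <- c(errors, "haul contains no data")

  # is there length data for each individual?
  if (nrow(specimens) > 0 && any(is.na(specimens$length_cm)))
    errors <- c(errors, "specimen without length")

  # is there a total weight for the catch?
  if (is.na(haul$total_weight_kg))
    errors <- c(errors, "missing total weight")

  # is the sample weight smaller than the total weight?
  if (!is.na(haul$sample_weight_kg) && !is.na(haul$total_weight_kg) &&
      haul$sample_weight_kg > haul$total_weight_kg)
    errors <- c(errors, "sample weight larger than total weight")

  # are all mandatory fields for this sampling type filled in?
  for (f in c("gear_code", "species_code"))          # hypothetical mandatory fields
    if (is.null(haul[[f]]) || is.na(haul[[f]]))
      errors <- c(errors, paste("missing mandatory field:", f))

  # is there a length frequency when specimen data exist?
  if (nrow(specimens) > 0 && nrow(lengths) == 0)
    errors <- c(errors, "specimens present but no length frequency")

  errors                                             # empty vector means the sample passes
}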

The database (and the electronic sampling applications) contains species codes, lists of predefined gear types, and similar reference lists.

Rough spatial checks can be made in the database using a map showing the positions of the sampling stations.
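
As an illustration only, a rough spatial screen can also be scripted before positions are inspected on the map, e.g., by flagging stations outside a plausible bounding box. The box below (roughly Swedish waters) and the column names are assumptions for the example.

stations <- data.frame(
  station = c("A1", "A2", "A3"),
  lat     = c(57.3, 58.1, 75.0),    # 75.0 is clearly outside the survey area
  lon     = c(11.5, 10.9, 11.2)
)

outside <- with(stations, lat < 54 | lat > 66 | lon < 9 | lon > 24)
stations[outside, ]                  # positions to inspect on the map
plot(stations$lon, stations$lat, xlab = "Longitude", ylab = "Latitude")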

Data is also checked with R-scripts (described in the document "Quality controls with R-scripts") before biological assessment analyses.


Data editing is the application of routines to detect missing, invalid or inconsistent entries in the data. Such routines highlight data records that are potentially in error; these are evaluated and, if possible, fixed before the data is used.

A set of checks for missing values is made with R-scripts on extracts from FD2. These include checks for, e.g., missing and duplicated trips, missing samples, missing total-weight and sample-weight values, missing length frequencies, missing lengths in length frequencies, missing specimens, and missing biological data of various types.
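
A sketch of what such missing-value screening can look like in R, on a toy extract. The column names (trip_id, haul_id, total_weight_kg, sample_weight_kg, n_lengths) are hypothetical; the production scripts work on the real FD2 extracts.

hauls <- data.frame(
  trip_id          = c(101, 101, 102, 102),
  haul_id          = c(1, 1, 1, 2),           # trip 101 has a duplicated haul record
  total_weight_kg  = c(12.4, 12.4, NA, 8.1),
  sample_weight_kg = c(2.0, 2.0, 1.5, NA),
  n_lengths        = c(35, 35, 0, 22)
)

hauls[duplicated(hauls[c("trip_id", "haul_id")]), ]                    # duplicated records
hauls[is.na(hauls$total_weight_kg) | is.na(hauls$sample_weight_kg), ]  # missing weights
hauls[hauls$n_lengths == 0, ]                                          # missing length frequencies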

After missing values have been checked, the data is checked for clear errors (e.g., obviously wrong haul coordinates, likely duplicated records) and for outliers. Outlier checks are also made with R-scripts and include screening the data for potentially erroneous entries that could strongly affect the final estimates, such as implausibly high or low landings or discards per haul, unusually high or low catch or sample weights, and atypical values in trip characteristics (e.g., an atypical number of days at sea, atypical positioning of fishing operations). Depending on the variable of interest, an outlier can be defined as an observation that is atypical relative to other collected data or relative to estimates derived from them. Examples of outliers are estimated sample weights (obtained via a length-weight relationship) that differ markedly from registered sample weights; atypical catch-fraction values (as signalled by several types of box-plot analysis, e.g., by gear); atypical lengths (as signalled by several types of box-plot analysis, e.g., by fraction, by size category, by gear); and atypical observations in scatterplots and boxplots of different biological variables plotted against each other (length, weight, age, ...).
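
One of the outlier screens described above can be sketched in R as a comparison between registered sample weights and weights estimated from a length-weight relationship. The coefficients, the 30 % threshold and the toy data are illustrative assumptions, not the values used in production.

lw_weight <- function(length_cm, a = 0.0085, b = 3.05) a * length_cm^b   # expected weight in grams

samples <- data.frame(
  sample_id           = 1:3,
  mean_length_cm      = c(24, 31, 28),
  n_measured          = c(50, 40, 45),
  registered_weight_g = c(7000, 11500, 60000)    # the last value looks suspicious
)

# crude estimate: number measured times expected weight at the mean length
samples$estimated_weight_g <- with(samples, n_measured * lw_weight(mean_length_cm))
samples$rel_diff <- with(samples,
                         (registered_weight_g - estimated_weight_g) / estimated_weight_g)
samples[abs(samples$rel_diff) > 0.3, ]            # candidates for manual review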

During estimation, temporal-consistency checks of the data (e.g., variation across quarters and years) are made.
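
A small sketch of such a temporal-consistency check in R, tabulating a hypothetical quantity (mean_cpue) by quarter and year so that implausible jumps stand out.

cpue <- data.frame(
  year      = rep(2022:2023, each = 4),
  quarter   = rep(1:4, times = 2),
  mean_cpue = c(12, 14, 13, 11, 12, 95, 14, 12)   # quarter 2 of 2023 stands out
)

tab <- xtabs(mean_cpue ~ quarter + year, data = cpue)
tab                                     # quarter-by-year overview; large jumps are followed up
boxplot(mean_cpue ~ year, data = cpue)  # quick visual comparison between years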

How the data checks are made is documented in the R-scripts, which are stored locally.

 


Example: pot fishery for Norway lobster (length, weight, sex, maturity in females, and diseases).

Data is captured in the field in an electronic protocol/application on a rugged laptop. The application is developed by the institute and is designed for different types of on-board sampling, with a separate scheme for surveys. The user chooses a sampling type for the current trip and is then guided through a defined workflow, with some flexibility to accommodate differences in work schedules on different fishing vessels.

All data from the trip is transferred to the main database on return to the institute. For safety, a backup copy of the data is made on a USB stick regularly during the trip.

When data is entered into the application, length and weight are checked against historical and recently measured length-weight relationships, with a tolerance of ±30 %, for the species in question. When an outlier is detected, the observer is asked to confirm whether the value is correct; if it is, the outlier value can still be stored.
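
A sketch of how such a ±30 % plausibility check can be expressed in R. The length-weight coefficients a and b and the example values are assumptions; the real application uses its own stored relationships per species.

check_length_weight <- function(length_cm, weight_g, a, b, tol = 0.30) {
  expected <- a * length_cm^b                       # expected weight from the relationship
  ok <- weight_g >= (1 - tol) * expected && weight_g <= (1 + tol) * expected
  if (!ok)
    message(sprintf("Weight %.0f g deviates more than %.0f%% from expected %.0f g - please confirm.",
                    weight_g, tol * 100, expected))
  ok                                                # a flagged value can still be stored if confirmed
}

check_length_weight(length_cm = 45, weight_g = 2500, a = 0.0085, b = 3.05)   # flagged for confirmation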

Measurements such as length can be made with an electronic calliper connected to the laptop by Bluetooth or USB cable.

Sample weights are checked against the length frequency of the sample, and the sample weight cannot be larger than the total weight.

The person capturing data chooses, e.g., Latin names and gear types from predefined lists in the application; only comments can be entered as free text.

At sea, the application checks that all values required for a sample type are present: a) per individual, when individuals are measured; b) per haul, when the haul is verified at the end of haul registration; and c) per trip, when the trip is verified before entering the harbour.

The position is not yet checked in the electronic protocol, but as long as there is wifi or satellite contact the position can be captured automatically on site at the push of a button. Otherwise, it must be entered manually. You can also choose an ICES rectangle.

The rectangle or position determines which sampling target applies (targets are defined beforehand), and the number of sampled specimens per species is restricted accordingly. For example, when collecting a set number of otoliths per length class, the protocol jumps to the individual sampling page automatically as long as there are still individuals to sample in that class; when a length class is full, the protocol stops jumping to the sampling page and stays on the length measurement page. This helps ensure the correct number of otoliths is sampled.
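
The per-length-class otolith logic can be sketched roughly as below. The 1 cm length classes and the quota of five otoliths per class are assumptions made for the example.

quota_per_class <- 5
otoliths_taken  <- integer(0)                 # counts per length class, filled as sampling proceeds

length_class <- function(length_cm) as.character(floor(length_cm))   # 1 cm classes

needs_individual_sampling <- function(length_cm) {
  cls <- length_class(length_cm)
  n   <- if (cls %in% names(otoliths_taken)) otoliths_taken[[cls]] else 0
  n < quota_per_class                         # TRUE: jump to the individual sampling page
}

register_otolith <- function(length_cm) {
  cls <- length_class(length_cm)
  n   <- if (cls %in% names(otoliths_taken)) otoliths_taken[[cls]] else 0
  otoliths_taken[[cls]] <<- n + 1
}

if (needs_individual_sampling(42.3)) register_otolith(42.3)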

Duplications are checked for on several occasions: when importing data from the field; ad hoc in the database (for things that cannot be checked when data is registered or when electronic data is imported); and when delivering data to ICES. The combinations compared include, but are not limited to, the following (a sketch of such a check follows the list):

  • The combination of vessel and fromdatetime must be unique.
  • The combination of fish number and catch id must be unique.
  • The combination of length group and catch id must be unique.
  • The combination of species, processing, preservation and size must be unique.
  • The combination of station, species and sub sample must be unique.
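
A sketch of how a couple of these uniqueness checks can be run in R with duplicated(); the toy data frames and column names (vessel, fromdatetime, catch_id, fish_number) stand in for the real tables.

trips <- data.frame(
  vessel       = c("SWE0001", "SWE0001", "SWE0002"),
  fromdatetime = c("2024-05-02 06:00", "2024-05-02 06:00", "2024-05-02 06:00")
)
trips[duplicated(trips[c("vessel", "fromdatetime")]), ]    # duplicated vessel + fromdatetime

fish <- data.frame(
  catch_id    = c(10, 10, 10),
  fish_number = c(1, 2, 2)
)
fish[duplicated(fish[c("catch_id", "fish_number")]), ]     # duplicated fish number within a catch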

When the data is entered into the main database FD2, additional checks are made. These checks are applied to all data inserted into the database and are described in a separate document ("Quality checks database FD2").


Data is captured in an electronic protocol/application on board the research vessel Svea. The application, Sve-reg, is developed by the institute and is designed for the different sampling schemes used on surveys. The user chooses a sampling type for the current trip and is then guided through a defined workflow, with some flexibility to accommodate differences in work schedules between surveys.

Measurements such as length can be made with an electronic calliper or electronic measuring board connected to the workstation by Bluetooth or USB cable.

Species weights and individual weights are transferred electronically from the scale to the application.

When data is entered into the application, length and weight are checked against historical and recently measured length-weight relationships, with a tolerance of ±30 %, for the species in question. Potential outliers identified by the application's internal calculations are displayed as a graph at the time of data capture, with a follow-up question asking whether the value is correct. If it is, the outlier value can still be stored.

Sample weights are checked against the length frequency of the sample, and the sample weight cannot be larger than the total weight.

The person capturing data chooses, e.g., Latin names and other parameters from predefined lists in the application; only comments can be entered as free text.

At sea, the application checks that all values required for a sample type are present: a) per individual, when individuals are measured; b) per haul, when the haul is verified at the end of haul registration; and c) per trip, when the trip is verified before it is ended.

The position of the haul is acquired from the navigation system of the ship. 

Duplications are checked for on several occasions: a) when importing data from the field; b) ad hoc in the database (for things that cannot be checked when data is registered or when electronic data is imported); and c) when delivering data to ICES.

All data from the trip is transferred to the main database FD2 on return to the institute. When data is imported into FD2, some additional checks are made; these are described in a separate document ("Quality checks database FD2").

Annual quality checks of survey data are also made when uploading to the international database DATRAS.

Species-specific data collection

Salmon and sea trout

A questionnaire (form) is sent by mail from SLU Aqua to just over 30 recipients, who fill in information on the current year's releases of salmon and sea trout, including tagged individuals. The data, one copy per river, coastal area and county, is to be returned at the latest a few weeks later. These data (stocking location, river, X and Y coordinates, county, number of stocked fish, number of which are Carlin-tagged, PIT-tagged, nose-tagged or fin-clipped, rearing site, stock/population and other information, together with the sender, place of service and telephone number) are then compiled into different files.

Thereafter, all collected data is entered into two files, one for salmon and one for sea trout. There the data are sorted by ICES sub-area, county, stock and age class (egg, fry, 1-summer parr, 1-yr parr, 1-yr smolt, 2-summer parr, 2-yr smolt), and the smolt numbers are summed under "Sum smolt". The numbers of fin-clipped, PIT-tagged and Carlin-tagged fish and the rearing site/stock are reported in the same way.

When these compilations are complete, the data is entered into a third compilation, "Salmon and sea trout smolt years xx-xx", which is divided into "Number of stocked" (showing stocked numbers in the Baltic Sea, the West Coast, Lake Vänern and Lake Vättern), "Number of stocked and number of Carlin marks", "Number of fin clipped" and "Number of Carlin tagged". All data about the stockings is entered into a database, where each row shows the stocking place and the age of the fish; if several age classes are released on one stocking occasion, each age is entered on its own row.
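
A sketch of the compilation step in R, aggregating toy stocking records by ICES sub-area, stock and age class and summing the smolt numbers. All column names, stock labels and figures are hypothetical.

stockings <- data.frame(
  ices_subarea = c("SD 31", "SD 31", "SD 30"),
  stock        = c("River A", "River A", "River B"),
  age_class    = c("1-yr smolt", "2-yr smolt", "1-summer parr"),
  n_stocked    = c(120000, 15000, 80000),
  n_carlin     = c(3000, 500, 0)
)

# numbers stocked (and Carlin-tagged) per sub-area, stock and age class
aggregate(cbind(n_stocked, n_carlin) ~ ices_subarea + stock + age_class,
          data = stockings, FUN = sum)

# total smolt per stock ("Sum smolt")
smolt <- subset(stockings, grepl("smolt", age_class))
aggregate(n_stocked ~ stock, data = smolt, FUN = sum)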

SPSS Statistics syntax file:

*** Used to visually check min/max values in the results file Sammanut (electrofishing results for distribution).
**** Changed Ansvarig to Utförare 2014-04-14.
** Run the following to check min and max values.

GET FILE="C:\SPSS\Korningar\Resultatfiler\SAMMANUT.SAV".
EXECUTE.

DESCRIPTIVES
  VARIABLES=lan xkoorvdr ykoorvdr hoh avstupp avstner xkoorlok ykoorlok
  fiskedat abbor abbormax abbormin asp benlö bjöna besim besimmax besimmin
  simpa simpamax simpamin stesi stesimax stesimin hosim braxe bäcne flone
  havne havnemax nejonöga bäcrö0 bäcrö bäcrömax bäcrömin bäcrmax0 bäcröxör
  bäcröxö0 bäcxömax bäcxömin bäcxömx0 elrit elritmax elritmin flodkräf
  flodkmin flodkmax signkräf signkmin signkmax kräfta kräftmin kräftmax färna
  gers grolö grönl gädda gäddamax gäddamin gös harr0 harr harrmax harrmin
  harrmax0 id kanrö0 kanrö lake lakemax lakemin lax0 lax laxmax laxmin laxmax0
  laxber laxpval lax0ber lax0pval laxfisk0 laxfisk laxfimax laxfimin laxfmax0
  laxör0 laxör laxörmax laxörmin mal malmax mört mörtmax mörtmin karpfisk
  nissö nors regnb0 regnb ruda rödin0 rödin rödinmax rödinmin rödimax0 sankr
  sarv sik siklö skrub rödsp småsp spigg stosp stäm sutar vimma ål ålmax ålmin
  öring0 öring örinmax0 öringmax öringmin öringber örinpval örin0ber öri0pval
  volt strstyrk pulsfrek antutfis bredd langd lokalbre area medbredd medyta
  grumligh vtnfarg maxdjup medeldju bottento vattente lufttemp beskuggn
  vedivant vedivpar lokalvar kalkdatu pastyrk1 pastyrk2 pastyrk3 kommunnr
  öringtot harrtot laxtot laxfixto laxörtot kanrötot regnbtot rödintot
  bäcrötot bäcröxöt artantal år månad
  /STATISTICS=MEAN STDDEV MIN MAX .

** Stop here and check the output.
** Then run the following and check the output.

GET FILE="C:\SPSS\Korningar\SAMMANUT.SAV".
EXECUTE.

FREQUENCIES
  VARIABLES=utförare substr1 substr2 substr3 vattenni vattenha ovegmang
  uvegmang uvegtyp narmiljo domtra1 domtra2 vandhind typavpop kalkpave
  kalkntyp paverkt1 paverkt2 paverkt3
  /ORDER= ANALYSIS .

** To here.
*** The following must be run in steps.

GET FILE= 'C:\SPSS\Korningar\SAMMANUT.SAV'.
EXECUTE.
MISSING VALUE LOKALBRE BREDD BOTTENTO (-9).
MISSING VALUE MEDELDJU MAXDJUP LAXPVAL LAX0PVAL ÖRINPVAL ÖRI0PVAL (-0.90).
IF (XKOORLOK GT 1) FELKOD=0.
EXECUTE.
IF (LOKALBRE GT BREDD) FELKOD=1.
EXECUTE.
SORT CASES BY
  felkod (D) .

** To here.
** Run.

GET FILE= 'C:\SPSS\Korningar\SAMMANUT.SAV'.
EXECUTE.
MISSING VALUE LOKALBRE BREDD BOTTENTO (-9).
MISSING VALUE MEDELDJU MAXDJUP LAXPVAL LAX0PVAL ÖRINPVAL ÖRI0PVAL (-0.90).
IF (XKOORLOK GT 1) FELKOD=0.
EXECUTE.
IF (MEDELDJU GT MAXDJUP) FELKOD=2.
EXECUTE.
SORT CASES BY
  felkod (D) .

** To here.
** Run.

MISSING VALUE LOKALBRE BREDD BOTTENTO (-9).
MISSING VALUE MEDELDJU MAXDJUP LAXPVAL LAX0PVAL ÖRINPVAL ÖRI0PVAL (-0.90).
IF (XKOORLOK GT 1) FELKOD=0.
EXECUTE.
IF (BOTTENTO EQ 0) FELKOD=3.
EXECUTE.
SORT CASES BY
  felkod (D) .
EXECUTE.

** To here.
* End of the run. The following has been checked in the first run above, Descriptives.

Eel

Contact