| Field | Type | Required | Description |
|---|---|---|---|
| taxon_code | STRING | true | Short code used in field datasheets (e.g., CH.ATR) |
| taxon_name | STRING | true | Original name assigned to code (e.g., Chromis atrilobata) |
| aphia_id | INTEGER | true | AphiaID corresponding to taxon_name (may be outdated or unaccepted) |
| rank | STRING | true | Taxonomic rank of the observation (species, genus, family) |
| status | STRING | true | Taxonomic status of original name (accepted, synonym, unresolved) |
| accepted_name | STRING | true | Accepted scientific name (Genus species) |
| accepted_aphia_id | INTEGER | true | Accepted AphiaID for current taxonomy |
| fb_spec_code | INTEGER | false | Optional FishBase SpecCode for cross-referencing |
| notes | STRING | false | Optional comments or notes (e.g., name updates, uncertainty) |
Field Codes
This table maps shorthand taxon codes used during underwater visual surveys (UVS) to authoritative taxonomic identifiers and names. It serves as a translation layer between diver-entered field codes and the canonical taxonomy in taxonomy.fish.
Each row corresponds to a unique taxon_code used in the field, whether referring to an accepted species, a synonym, or a higher-level taxon (genus or family). This mapping enables consistent, traceable integration of UVS data with modern taxonomic standards and trait metadata.
Why it matters
- Legacy harmonization: Links historical field entries to accepted names and identifiers
- Soft joins: Enables flexible, code-to-species mapping across datasets
- Traceability: Maintains original diver intent while supporting global taxonomy
This table captures both the field-level identity (what divers recorded) and the accepted scientific classification (based on WoRMS and FishBase), enabling robust reconciliation and trait-based analysis.
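As a rough illustration of the translation layer described above, the sketch below resolves diver-entered codes against a lookup keyed on taxon_code. The rows, the `resolve` helper, and the `ZZ.ZZZZ` code are hypothetical placeholders, not values from the real table. Only the field names taxon_code, taxon_name, and accepted_name come from the schema.

```python
# Illustrative lookup rows (placeholder values, NOT taken from the real table).
lookup_rows = [
    {"taxon_code": "AC.TRIS", "taxon_name": "Acanthurus tristis",
     "accepted_name": "Acanthurus tristis"},
    {"taxon_code": "LABR.SP", "taxon_name": "Labroides sp.",
     "accepted_name": "Labroides"},
]

# Hypothetical diver observations; ZZ.ZZZZ is a made-up code with no match.
observations = [
    {"taxon_code": "AC.TRIS", "count": 3},
    {"taxon_code": "LABR.SP", "count": 1},
    {"taxon_code": "ZZ.ZZZZ", "count": 2},
]


def resolve(observations, lookup_rows):
    """Left-join observations to the lookup, keeping the original code."""
    by_code = {row["taxon_code"]: row for row in lookup_rows}
    resolved = []
    for obs in observations:
        match = by_code.get(obs["taxon_code"])
        resolved.append({
            **obs,  # the diver-entered code is preserved for traceability
            "accepted_name": match["accepted_name"] if match else None,
        })
    return resolved


resolved = resolve(observations, lookup_rows)
for row in resolved:
    print(row)
```

Unmatched codes are kept with accepted_name set to None rather than dropped, so they can be reviewed instead of silently disappearing from the data.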
How are codes generated?
taxon_code is the primary join key across all UVS fish datasets in the Pristine Seas Database. The codes are short and deterministic, designed for diver entry: concise, unambiguous, and easy to write underwater. They follow a structured convention:
Format for Species-Level Codes
GEN2.SPEC4: first 2 letters of the genus + first 4 letters of the species epithet, uppercase
- Acanthurus tristis → AC.TRIS
- Apogon tricolor → AP.TRIC
- Anthias tricolor → AN.TRIC
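The default species-level rule can be sketched in a few lines of Python. The function name `species_code` is an assumption for illustration, and the sketch expects a plain two-word binomial as input.

```python
def species_code(scientific_name: str) -> str:
    """Build a default GEN2.SPEC4 code from a binomial such as 'Acanthurus tristis'."""
    genus, species = scientific_name.split()
    # First 2 letters of the genus, a dot, first 4 letters of the species, uppercase.
    return f"{genus[:2]}.{species[:4]}".upper()


for name in ["Acanthurus tristis", "Apogon tricolor", "Anthias tricolor"]:
    print(name, "->", species_code(name))
```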
Handling Duplicates
When multiple species would share the same code:
- The most common taxon keeps the default code
- The others extend the genus or species portion until the code is unique
Examples
- Apogon tristis → AP.TRIS
- Aplodactylus tristis → APL.TRIS
- Labroides bilineatus → LA.BILI
- Labroides bilinearis → LA.BILIN
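One way to reproduce these examples is sketched below. The source does not say when to extend the genus and when to extend the species, so the tie-break here is an assumption: extend the genus portion when the colliding code belongs to a different genus, otherwise extend the species portion. The function name `assign_codes` is hypothetical, and the input is assumed to be ordered from most to least common.

```python
def assign_codes(names):
    """Assign unique codes to binomials ordered from most to least common."""
    taken = {}   # code -> genus of the taxon that holds it
    result = {}
    for name in names:
        genus, species = name.split()
        g_len, s_len = 2, 4  # default GEN2.SPEC4 lengths
        code = f"{genus[:g_len]}.{species[:s_len]}".upper()
        while code in taken:
            # Assumed rule: a clash with another genus extends the genus part,
            # a clash within the same genus extends the species part.
            if taken[code] != genus and g_len < len(genus):
                g_len += 1
            elif s_len < len(species):
                s_len += 1
            else:
                raise ValueError(f"cannot build a unique code for {name}")
            code = f"{genus[:g_len]}.{species[:s_len]}".upper()
        taken[code] = genus
        result[name] = code
    return result


codes = assign_codes([
    "Apogon tristis",
    "Aplodactylus tristis",
    "Labroides bilineatus",
    "Labroides bilinearis",
])
for name, code in codes.items():
    print(name, "->", code)
```

Because earlier entries keep their default code, the order of the input list stands in for "most common first".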
Genus- and Family-Level Codes
Used when an observation is not identified to species:
- Genus → GEN4.SP (e.g., Labroides sp. → LABR.SP)
- Family → FAM4.SPP (e.g., Labridae spp. → LABR.SPP)
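A minimal sketch of the genus- and family-level rule follows. The function name `higher_rank_code` and its `rank` argument are assumptions for illustration.

```python
def higher_rank_code(taxon: str, rank: str) -> str:
    """Build a GEN4.SP (genus) or FAM4.SPP (family) code from a taxon name."""
    suffix = {"genus": "SP", "family": "SPP"}[rank]
    # First 4 letters of the genus or family name, uppercase, then the suffix.
    return f"{taxon[:4].upper()}.{suffix}"


print(higher_rank_code("Labroides", "genus"))
print(higher_rank_code("Labridae", "family"))
```

The two examples share the same four-letter prefix, so only the suffix (SP or SPP) tells the genus-level code apart from the family-level one.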
These conventions ensure clean joins, traceability, and consistent taxonomy across field data and reference tables.
Hybrids
Hybrid taxa use an extended code format: GEN2.SPxSP
- Combine the first two letters of the genus with the first two letters of each parent's species epithet, joined by a lowercase x
- Use consistent casing and separators
Examples
- Acanthurus achilles × nigricans → AC.ACxNI
- Acanthurus olivaceus × nigricans → AC.OLxNI
- Paracirrhites arcatus × bicolor → PA.ARxBI
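The hybrid examples are consistent with taking two letters from the genus and two from each parent's species epithet. The sketch below encodes that reading. The function name `hybrid_code` and its three-argument signature are assumptions, and it only covers hybrids whose parents share a genus, as in the examples.

```python
def hybrid_code(genus: str, parent_a: str, parent_b: str) -> str:
    """Build a GEN2.SPxSP code for a hybrid of two species in the same genus."""
    # Uppercase two-letter fragments, joined with a dot and a lowercase 'x'.
    return f"{genus[:2].upper()}.{parent_a[:2].upper()}x{parent_b[:2].upper()}"


print(hybrid_code("Acanthurus", "achilles", "nigricans"))
print(hybrid_code("Acanthurus", "olivaceus", "nigricans"))
print(hybrid_code("Paracirrhites", "arcatus", "bicolor"))
```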